Move lingvo patches into the manifest #409

Closed
terrykong opened this issue Dec 4, 2023 · 1 comment
Comments

@terrykong
Contributor

After #405 gets merged, we should move the lingvo patches into the manifest and use create-distribution.sh to build our flavor of lingvo needed for ARM64.

yhtang added a commit that referenced this issue Dec 18, 2023
This introduces "the manifest" (`manifest.yaml`) that describes the
complete state of the jax stack to allow for reproducible nightly builds
and reproducible presubmit CI. The manifest (and patches) are staged in
a "trial branch" each night, and if the build succeeds, we can merge the
trial branch into main (TODO: GH Issue tracking automating merge). The
presubmit CI runs on the PR's git-ref and so unless the author has
committed a custom patch/manifest, they will always be running from the
"last working state".

# Description of `manifest.yaml`
The manifest allows specifying/pinning libraries that serve different
purposes:
* git repos: libraries that need their source checked out at a
particular SHA
* pip VCS constraints: [VCS dependency
specs](https://pip.pypa.io/en/stable/topics/vcs-support/) that need to
be pinned like `fiddle @ git+https://github.com/google/fiddle`
* pip constraints: Python packages we need to constrain or pin to
address one-off fixes, e.g., addressing a CVE

Here are example entries for each of these in `manifest.yaml`.

## Git repos
```yaml
jax:
  url: https://github.com/google/jax.git
  tracking_ref: main
  latest_verified_commit: b032a0271e3e2ea8d0df64d2f3f1a1e450a38dc9  # 2023-11-15
  mode: git-clone
```
* url: the URL of the git repo, used to look up the latest git ref
* tracking_ref: the git-ref whose latest commit SHA is looked up.
Usually it’s `main`, but in some cases, like lingvo, it’s `master`
* latest_verified_commit: the commit SHA the repo is pinned to
* mode:
    * git-clone: the repo is cloned and its source checked out at the pinned SHA
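
As a rough illustration of how `url` and `tracking_ref` could be combined to find the latest commit SHA, here is a minimal Python sketch built on `git ls-remote`. It is not the code in `bump.sh`; the function names `parse_ls_remote` and `latest_commit` are hypothetical, and the sketch assumes the tracking ref is a branch.

```python
import subprocess


def parse_ls_remote(output: str, tracking_ref: str) -> str:
    """Return the SHA for tracking_ref from `git ls-remote` output.

    Each output line has the form "<sha>\t<ref>".
    """
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split("\t", 1)
        # Assumption: the tracking ref names a branch (refs/heads/<name>).
        if ref in (tracking_ref, f"refs/heads/{tracking_ref}"):
            return sha
    raise LookupError(f"ref {tracking_ref!r} not found in ls-remote output")


def latest_commit(entry: dict) -> str:
    """Query the remote in a manifest entry for its tracking ref's SHA."""
    result = subprocess.run(
        ["git", "ls-remote", entry["url"], entry["tracking_ref"]],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_ls_remote(result.stdout, entry["tracking_ref"])
```

Calling `latest_commit({"url": "https://github.com/google/jax.git", "tracking_ref": "main"})` needs network access; `parse_ls_remote` can be exercised offline on captured output.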

## Git repos with patches
```yaml
flax:
  url: https://github.com/google/flax.git
  mirror_url: https://github.com/nvjax-svc-0/flax.git
  extra_dir: null
  tracking_ref: main
  latest_verified_commit: a572f6af2fef565c0f9ba2fc12b781e9e3385140
  mode: git-clone
  patches:
    pull/3340/head: file://patches/flax/PR-3340.patch # Add Sharding Annotations to Flax Modules
```
* mirror_url: a git URL from which we can pull patches whose git-refs
match `^mirror/`
* extra_dir: an optional local dir from which to pull patches not found
in upstream or the mirror. Useful for private repos
* patches: a dictionary where keys are git-refs and values are URIs of
the local patch files. The value is needed because a git-ref can be
rebased or updated, which changes its SHA and makes a patch identified
only by its git-ref a moving target. Each `file://` URI therefore points
to a patch committed into JAX-Toolbox's version control to ensure
reproducibility
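
To make the roles of the `patches` keys and values concrete, here is a small Python sketch. It is only an illustration under stated assumptions, not the logic of `create-distribution.sh`: the helper names `patch_remote` and `local_patch_path` are hypothetical, the `file://` URI is assumed to be relative to the repository root, and the `extra_dir` fallback is left out.

```python
def patch_remote(entry: dict, ref: str) -> str:
    """Pick the remote a patch ref would be fetched from.

    Assumption: refs matching ^mirror/ come from mirror_url, and
    everything else comes from the upstream url.
    """
    if ref.startswith("mirror/"):
        return entry["mirror_url"]
    return entry["url"]


def local_patch_path(uri: str) -> str:
    """Turn a file:// URI from the manifest into a repo-relative path."""
    prefix = "file://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected a file:// URI, got {uri!r}")
    return uri[len(prefix):]


# Entry copied from the flax example above.
flax = {
    "url": "https://github.com/google/flax.git",
    "mirror_url": "https://github.com/nvjax-svc-0/flax.git",
    "patches": {"pull/3340/head": "file://patches/flax/PR-3340.patch"},
}

for ref, uri in flax["patches"].items():
    print(ref, patch_remote(flax, ref), local_patch_path(uri))
```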

## VCS Constraint
```yaml
clu:
  url: https://github.com/google/CommonLoopUtils.git 
  tracking_ref: main
  latest_verified_commit: 89c2face3474a7482358068d7a00d9bb6e4b31fe
  mode: pip-constraint
```
Entries in this mode aren't cloned (in fact, `get-source.sh` will error
if you try to clone them). Instead, `pip-finalize.sh` uses them to pin
VCS dependencies, rewriting the first spec below into the second:
```sh
clu @ git+https://github.com/google/CommonLoopUtils#egg=clu

# To

clu @ git+https://github.com/google/CommonLoopUtils@89c2face3474a7482358068d7a00d9bb6e4b31fe
```
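
The rewrite above can be sketched in a few lines of Python. This is an illustration only and not how `pip-finalize.sh` is implemented; `pin_vcs_requirement` is a hypothetical helper, and it assumes the incoming spec is not already pinned to a ref.

```python
def pin_vcs_requirement(line: str, commit: str) -> str:
    """Pin a `name @ git+URL[#egg=...]` spec to a specific commit."""
    # Drop any URL fragment such as "#egg=clu".
    base = line.split("#", 1)[0].strip()
    name, sep, url = base.partition(" @ ")
    if not sep or not url.startswith("git+"):
        raise ValueError(f"not a git VCS requirement: {line!r}")
    # Assumption: the URL carries no existing "@<ref>" suffix.
    return f"{name} @ {url}@{commit}"


pinned = pin_vcs_requirement(
    "clu @ git+https://github.com/google/CommonLoopUtils#egg=clu",
    "89c2face3474a7482358068d7a00d9bb6e4b31fe",
)
print(pinned)
```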

## Pip constraint
```yaml
pydantic:
  mode: pip-constraint
  version: X.Y.Z
```
* mode: pip-constraint
    * Adds a pip constraint during pip-compile
* version: If `${package}.version` is present, the entry is treated as a
pip constraint. This allows us to hotfix any Python package in case we
discover a bug or CVE.
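
One way to picture the `version` rule is the Python sketch below, which turns manifest entries into constraint lines. It is hypothetical: the helper name `constraint_lines`, the `==` pin operator, and the version `1.2.3` are assumptions for illustration, and the manifest is represented as an already-parsed dict.

```python
def constraint_lines(manifest: dict) -> list:
    """Build pip constraint lines for entries that carry a version."""
    lines = []
    for name, entry in sorted(manifest.items()):
        # Only pip-constraint entries with a version become constraints;
        # VCS entries (no version) are handled separately.
        if entry.get("mode") == "pip-constraint" and "version" in entry:
            lines.append(f"{name}=={entry['version']}")
    return lines


manifest = {
    "pydantic": {"mode": "pip-constraint", "version": "1.2.3"},  # made-up version
    "clu": {
        "url": "https://github.com/google/CommonLoopUtils.git",
        "mode": "pip-constraint",
    },
    "jax": {"url": "https://github.com/google/jax.git", "mode": "git-clone"},
}

print(constraint_lines(manifest))
```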

# Changes to the CI
* `Nightly JAX Build` bumps the manifest.yaml and patches and commits
them to a trial branch. If the tests pass, this trial branch should be
merged to `main`
    * introduces `bump.sh` that bumps the world state given the manifest
* No more `REPO_*` and `REF_*` build args, since they are all specified
in the `manifest.yaml`
* `get-source.sh` and `create-distribution.sh` now take the manifest
* Rename pip-tools "manifests" to `requirements-*.in` to avoid confusion
and use the same terminology as pip-tools

# Not addressed in this PR
* [ ] Automating the merging of the trial branch at the end of a
successful nightly run
* [ ] Running presubmit on the latest world state (currently it runs
only on the last-known working state)
* [ ] Move lingvo patches to the manifest (#409)
* [ ] Move xla arm neon patch to the manifest
* [ ] Add a cleanup.sh script to remove things like `~/.gitconfig`
* [ ] Have not implemented `mode: pip-constraint` since there was no dep
that needed a constraint

---------

Co-authored-by: Yu-Hang Tang <Tang.Maxin@gmail.com>
@yhtang
Collaborator

yhtang commented Jan 22, 2025

No longer relevant.

yhtang closed this as completed Jan 22, 2025