diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index f96e308e..c7d45f68 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,24 +1,86 @@
-# Contributing to vLLM Ascend backend plugin
+# Contributing to vLLM Ascend plugin

-## DCO and Signed-off-by
+## Building and testing
+It's recommended to set up a local development environment to build and test
+before you submit a PR.

-When contributing changes to this project, you must agree to the DCO. Commits must include a `Signed-off-by:` header which certifies agreement with the terms of the DCO.
+### Prepare environment and build

-Using `-s` with `git commit` will automatically add this header.
+Theoretically, the vllm-ascend build is only supported on Linux because the
+`vllm-ascend` dependency `torch_npu` only supports Linux.

-## Testing
+But you can still set up a dev env on Linux/Windows/macOS for linting and basic
+tests with the following commands:

 ```bash
+# Choose a base dir (~/vllm-project/) and set up the venv
+cd ~/vllm-project/
+python3 -m venv .venv
+source ./.venv/bin/activate
+
+# Clone the vllm code and install it
+git clone https://github.com/vllm-project/vllm.git
+cd vllm
+pip install -r requirements-build.txt
+VLLM_TARGET_DEVICE="empty" pip install .
+cd ..
+
+# Clone vllm-ascend and install the dev requirements
+git clone https://github.com/vllm-project/vllm-ascend.git
+cd vllm-ascend
 pip install -r requirements-dev.txt
-# 1. linting and formatting
+
+# Then you can run the lint and mypy checks
 bash format.sh
-# 2. Unit tests
-pytest tests/
-# 3. Commit changed files using `-s`
+
+# Build:
+# - only supported on Linux (torch_npu available)
+# pip install -e .
+# - build without deps for debugging on other OSes
+# pip install -e . --no-deps
+
+# Commit changed files using `-s`
 git commit -sm "your commit info"
 ```

+### Testing
+
+Although the vllm-ascend CI provides integration tests on [Ascend](.github/workflows/vllm_ascend_test.yaml), you can also run them
+locally. The simplest way to run these integration tests locally is through a container:
+
+```bash
+# Under an Ascend NPU environment
+git clone https://github.com/vllm-project/vllm-ascend.git
+cd vllm-ascend
+
+IMAGE=vllm-ascend-dev-image
+CONTAINER_NAME=vllm-ascend-dev
+DEVICE=/dev/davinci1
+
+# The first build will take about 10 mins (at 10 MB/s) to download the base image and packages
+docker build -t $IMAGE -f ./Dockerfile .
+# You can also specify a mirror repo by setting VLLM_REPO to speed up the clone
+# docker build -t $IMAGE -f ./Dockerfile . --build-arg VLLM_REPO=https://gitee.com/mirrors/vllm
+
+docker run --name $CONTAINER_NAME --network host --device $DEVICE \
+    --device /dev/davinci_manager --device /dev/devmm_svm \
+    --device /dev/hisi_hdc -v /usr/local/dcmi:/usr/local/dcmi \
+    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
+    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
+    -ti --rm $IMAGE bash
+
+cd vllm-ascend
+pip install -r requirements-dev.txt
+
+pytest tests/
+```
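+
+If you only want to run part of the suite while iterating, pytest can select a
+single file (the file name below is only an illustration; pick any real file
+under `tests/`):
+
+```bash
+# Run one test file with verbose output instead of the whole suite;
+# replace the path with an actual file under tests/
+pytest -sv tests/test_offline_inference.py
+```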
+
+## DCO and Signed-off-by
+
+When contributing changes to this project, you must agree to the DCO. Commits must include a `Signed-off-by:` header which certifies agreement with the terms of the DCO.
+
+Using `-s` with `git commit` will automatically add this header.
+
 ## PR Title and Classification

 Only specific types of PRs will be reviewed. The PR title is prefixed appropriately to indicate the type of change. Please use one of the following:
@@ -42,3 +104,4 @@ Only specific types of PRs will be reviewed. The PR title is prefixed appropriat
 ## Others

 You may find more information about contributing to vLLM Ascend backend plugin on [docs.vllm.ai](https://docs.vllm.ai/en/latest/contributing/overview.html).
+If you find any problem when contributing, feel free to submit a PR to improve this doc and help other developers.
diff --git a/Dockerfile b/Dockerfile
index 9fab5a1f..63c8bd68 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -21,24 +21,20 @@ FROM quay.io/ascend/cann:8.0.rc3.beta1-910b-ubuntu22.04-py3.10
 ENV DEBIAN_FRONTEND=noninteractive

 RUN apt-get update -y && \
-    apt-get install -y python3-pip git vim \
-    gcc-12 g++-12 libnuma-dev
-
-RUN update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 10 --slave /usr/bin/g++ g++ /usr/bin/g++-12
+    apt-get install -y python3-pip git vim

 WORKDIR /workspace

-COPY . /workspace/vllm_ascend/
+COPY . /workspace/vllm-ascend/

 RUN pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple

-RUN pip install cmake>=3.26 wheel packaging ninja "setuptools-scm>=8" numpy
+# Install vLLM main
+ARG VLLM_REPO=https://github.com/vllm-project/vllm.git
+RUN git clone --depth 1 $VLLM_REPO /workspace/vllm
+RUN VLLM_TARGET_DEVICE="empty" python3 -m pip install /workspace/vllm/

-# install build requirements
-RUN PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cpu" python3 -m pip install -r /workspace/vllm_ascend//vllm/requirements-build.txt
-# build vLLM with NPU backend
-RUN PIP_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cpu" VLLM_TARGET_DEVICE="cpu" python3 -m pip install /workspace/vllm_ascend/vllm/
-# install vllm_ascend
-RUN python3 -m pip install /workspace/vllm_ascend/
+# Install vllm-ascend main
+RUN python3 -m pip install /workspace/vllm-ascend/

 CMD ["/bin/bash"]
diff --git a/README.md b/README.md
index ba9aa929..c16e8372 100644
--- a/README.md
+++ b/README.md
@@ -21,11 +21,11 @@ vLLM Ascend Plugin
 ---
 ## Overview

-`vllm-ascend` is a backend plugin for running vLLM on the Ascend NPU.
+vLLM Ascend plugin (`vllm-ascend`) is a backend plugin for running vLLM on the Ascend NPU.

 This plugin is the recommended approach for supporting the Ascend backend within the vLLM community. It adheres to the principles outlined in the [[RFC]: Hardware pluggable](https://github.com/vllm-project/vllm/issues/11162), providing a hardware-pluggable interface that decouples the integration of the Ascend NPU with vLLM.

-By using `vllm-ascend`, popular open-source models, including Transformer-like, Mixture-of-Expert, Embedding, Multi-modal LLMs can run seamlessly on the Ascend NPU.
+By using the vLLM Ascend plugin, popular open-source models, including Transformer-like, Mixture-of-Experts, Embedding, and Multi-modal LLMs, can run seamlessly on the Ascend NPU.
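+
+Because the plugin is wired in through a Python entry point, you can confirm
+that an installed vLLM sees it with a one-liner (a minimal sanity check; the
+entry-point group name is our assumption based on the plugin RFC, so check
+this repo's `setup.py` for the exact name):
+
+```bash
+# Group name assumed from the hardware-pluggable RFC; verify against setup.py
+python3 -c "from importlib.metadata import entry_points; print(list(entry_points(group='vllm.platform_plugins')))"
+```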

 ## Prerequisites
 ### Support Devices
@@ -33,32 +33,29 @@ By using `vllm-ascend`, popular open-source models, including Transformer-like,
 - Atlas 800I A2 Inference series (Atlas 800I A2)

 ### Dependencies
-| Requirement | Supported version | Recommended version | Note |
-| ------------ | ------- | ----------- | ----------- |
-| Python | >= 3.9 | [3.10](https://www.python.org/downloads/) | Required for vllm |
-| CANN | >= 8.0.RC2 | [8.0.RC3](https://www.hiascend.com/developer/download/community/result?module=cann&cann=8.0.0.beta1) | Required for vllm-ascend and torch-npu |
-| torch-npu | >= 2.4.0 | [2.5.1rc1](https://gitee.com/ascend/pytorch/releases/tag/v6.0.0.alpha001-pytorch2.5.1) | Required for vllm-ascend |
-| torch | >= 2.4.0 | [2.5.1](https://github.com/pytorch/pytorch/releases/tag/v2.5.1) | Required for torch-npu and vllm required |
+| Requirement | Supported version | Recommended version | Note |
+|-------------|-------------------|-------------------------------------------|----------------------------------------|
+| vLLM | main | main | Required for vllm-ascend |
+| Python | >= 3.9 | [3.10](https://www.python.org/downloads/) | Required for vllm |
+| CANN | >= 8.0.RC2 | [8.0.RC3](https://www.hiascend.com/developer/download/community/result?module=cann&cann=8.0.0.beta1) | Required for vllm-ascend and torch-npu |
+| torch-npu | >= 2.4.0 | [2.5.1rc1](https://gitee.com/ascend/pytorch/releases/tag/v6.0.0.alpha001-pytorch2.5.1) | Required for vllm-ascend |
+| torch | >= 2.4.0 | [2.5.1](https://github.com/pytorch/pytorch/releases/tag/v2.5.1) | Required by torch-npu and vllm |

 Find more about how to setup your environment in [here](docs/environment.md).

-## Getting Started
+## Getting Started

 > [!NOTE]
-> Currently, we are actively collaborating with the vLLM community to support the Ascend backend plugin, once supported we use one line command `pip install vllm vllm-ascend` to compelete installation.
+> Currently, we are actively collaborating with the vLLM community to support the Ascend backend plugin; once it is supported, you can use the one-line command `pip install vllm vllm-ascend` to complete the installation.

 Installation from source code:

 ```bash
 # Install vllm main branch according:
 # https://docs.vllm.ai/en/latest/getting_started/installation/cpu/index.html#build-wheel-from-source
-git clone https://github.com/vllm-project/vllm.git
+git clone --depth 1 https://github.com/vllm-project/vllm.git
 cd vllm
-sudo apt-get update -y
-sudo apt-get install -y gcc-12 g++-12 libnuma-dev
-sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 10 --slave /usr/bin/g++ g++ /usr/bin/g++-12
-pip install cmake>=3.26 wheel packaging ninja "setuptools-scm>=8" numpy
-pip install -r requirements-cpu.txt
-VLLM_TARGET_DEVICE=cpu python setup.py install
+pip install -r requirements-build.txt
+VLLM_TARGET_DEVICE=empty pip install .

 # Install vllm-ascend main branch
 git clone https://github.com/vllm-project/vllm-ascend.git
@@ -66,14 +63,15 @@ cd vllm-ascend
 pip install -e .
 ```
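+
+A quick way to verify that the editable install succeeded (just an import
+check; it does not exercise the NPU):
+
+```bash
+# Should print the module name without raising ImportError
+python3 -c "import vllm_ascend; print(vllm_ascend.__name__)"
+```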

-Run the following command to start the vLLM server with the [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) model:
+Run the following command to start the vLLM server with the [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) model:

 ```bash
-vllm serve Qwen/Qwen2.5-1.5B-Instruct
+# export VLLM_USE_MODELSCOPE=true to speed up the download
+vllm serve Qwen/Qwen2.5-0.5B-Instruct
 curl http://localhost:8000/v1/models
 ```

-Find more details in the [vLLM Quickstart](https://docs.vllm.ai/en/latest/getting_started/quickstart.html).
+Please refer to the [vLLM Quickstart](https://docs.vllm.ai/en/latest/getting_started/quickstart.html) for more details.

 ## Building

@@ -89,20 +87,16 @@ pip install -e .
 ```bash
 git clone https://github.com/vllm-project/vllm-ascend.git
 cd vllm-ascend
-docker build -t vllm-ascend-dev -f ./Dockerfile .
+docker build -t vllm-ascend-dev-image -f ./Dockerfile .
 ```

-## Contributing
-
-We welcome and value any contributions and collaborations, here is a quick note before you submit a PR:
+See [Building and Testing](./CONTRIBUTING.md) for more details; it is a step-by-step guide that helps you set up the development environment, build, and test.

-```
-# Downloading and install dev requirements
-git clone https://github.com/vllm-project/vllm-ascend
-pip install -r requirements-dev.txt
+## Contributing
+We welcome and value any contributions and collaborations:
+- Please let us know if you encounter a bug by [filing an issue](https://github.com/vllm-project/vllm-ascend/issues).
+- Please see the guidance on how to contribute in [CONTRIBUTING.md](./CONTRIBUTING.md).

-# Linting and formatting
-bash format.sh
-```
+## License

-Find more details in the [CONTRIBUTING.md](./CONTRIBUTING.md).
+Apache License 2.0, as found in the [LICENSE](./LICENSE) file.
diff --git a/docs/environment.md b/docs/environment.md
index 1a1924e3..5dd70b29 100644
--- a/docs/environment.md
+++ b/docs/environment.md
@@ -30,6 +30,8 @@ docker run \
     -it quay.io/ascend/cann:8.0.rc3.beta1-910b-ubuntu22.04-py3.10 bash
 ```

+You do not need to install `torch` and `torch_npu` manually; they will be installed automatically as `vllm-ascend` dependencies.
+
 #### Manual installation

 Or follow the instructions provided in the [Ascend Installation Guide](https://ascend.github.io/docs/sources/ascend/quick_install.html) to set up the environment.
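+
+Either way, once the environment is ready you can confirm that the NPU is
+visible (a quick sanity check, assuming the Ascend driver and its tools are
+installed):
+
+```bash
+# Lists the available Ascend NPUs, similar in spirit to nvidia-smi
+npu-smi info
+```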