### What this PR does / why we need it?

This PR adds Chinese documents for vllm-ascend for Chinese-speaking developers.

### Does this PR introduce _any_ user-facing change?

Changes as follows:

- add README.zh.md
- add environment.zh.md
- add CONTRIBUTING.zh.md

### How was this patch tested?

By CI

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
Showing 4 changed files with 295 additions and 0 deletions.
CONTRIBUTING.zh.md
@@ -0,0 +1,102 @@
# Contributing to the vLLM Ascend Plugin

## Building and testing
We recommend that you build and test in a local development environment before submitting a PR.

### Environment setup and build
In theory, building vllm-ascend is only supported on Linux, because the `vllm-ascend` dependency `torch_npu` only supports Linux.

However, you can still set up a development environment on Linux/Windows/macOS for linting and basic testing, as shown in the following commands:

```bash
# Choose a base directory (~/vllm-project/) and create a Python virtual environment
cd ~/vllm-project/
python3 -m venv .venv
source ./.venv/bin/activate

# Clone and install vllm
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -r requirements-build.txt
VLLM_TARGET_DEVICE="empty" pip install .
cd ..

# Clone and install vllm-ascend
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
pip install -r requirements-dev.txt

# Run the lint and mypy checks with the following script
bash format.sh

# Build:
# - Currently a full build is only supported on Linux (a torch_npu limitation)
# pip install -e .
# - On other operating systems, skip the dependencies when building and installing
#   (build without deps for debugging on other OSes)
# pip install -e . --no-deps

# Commit your changes with `-s`
git commit -sm "your commit info"
```

### Testing
Although the vllm-ascend CI provides integration tests on [Ascend](.github/workflows/vllm_ascend_test.yaml), you can also run them locally. The easiest way to run these integration tests locally is in a container:

```bash
# On an Ascend NPU environment
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend

IMAGE=vllm-ascend-dev-image
CONTAINER_NAME=vllm-ascend-dev
DEVICE=/dev/davinci1

# The first build takes about 10 minutes (at 10MB/s) to download the base image and packages
docker build -t $IMAGE -f ./Dockerfile .
# You can also specify a mirror repo by setting VLLM_REPO to speed this up
# docker build -t $IMAGE -f ./Dockerfile . --build-arg VLLM_REPO=https://gitee.com/mirrors/vllm

docker run --name $CONTAINER_NAME --network host --device $DEVICE \
    --device /dev/davinci_manager --device /dev/devmm_svm \
    --device /dev/hisi_hdc -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
    -ti --rm $IMAGE bash

cd vllm-ascend
pip install -r requirements-dev.txt

pytest tests/
```
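
During development it is usually faster to run only part of the suite. A minimal sketch using standard pytest options (the file name and keyword below are illustrative, not actual test names from this repository):

```bash
# Run a single test file, stop at the first failure, with verbose output
pytest tests/test_offline_inference.py -x -v

# Or select tests by keyword expression
pytest tests/ -k "sampler" -v
```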

## Developer Certificate of Origin (DCO)

When contributing to this project, you must agree to the DCO. Commits must include a "Signed-off-by:" header certifying agreement with the terms of the DCO.

Using `-s` with `git commit` will add the header automatically.
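
For example, with a typical git setup, you can sign off a new commit, or add a missing sign-off to the latest one (the commit message below is only an illustration):

```bash
# git appends "Signed-off-by: Your Name <you@example.com>" based on your
# configured user.name and user.email
git commit -s -m "[Doc] Fix a typo in the README"

# Add a missing sign-off to the most recent commit
git commit --amend -s --no-edit
```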

## PR title and classification

Only specific types of PRs will be reviewed. PR titles should be prefixed appropriately to indicate the type of change. Please use one of the following:

- `[Attention]` for new features or optimizations in `attention`
- `[Communicator]` for new features or optimizations in `communicators`
- `[ModelRunner]` for new features or optimizations in the `model runner`
- `[Platform]` for new features or optimizations in `platform`
- `[Worker]` for new features or optimizations in `worker`
- `[Core]` for new features or optimizations in the core `vllm-ascend` logic (such as `platform, attention, communicators, model runner`)
- `[Kernel]` for changes affecting compute kernels and ops
- `[Bugfix]` for bug fixes
- `[Doc]` for documentation fixes and improvements
- `[Test]` for tests (such as unit tests)
- `[CI]` for build or continuous-integration improvements
- `[Misc]` for PRs whose changes fit none of the categories above; please use this prefix sparingly

> [!NOTE]
> If a PR spans more than one category, please add all relevant prefixes
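
For instance, hypothetical titles following this convention might look like:

```
[Doc] Add Chinese documents for vllm-ascend
[Core][Bugfix] Fix device assignment in the model runner
```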

## Others

You can find more information about contributing to the vLLM Ascend plugin at [<u>docs.vllm.ai</u>](https://docs.vllm.ai/en/latest/contributing/overview.html).
If you run into any problems during contribution, you are always welcome to submit a PR improving these docs to help other developers.
README.zh.md
@@ -0,0 +1,151 @@
<p align="center"> | ||
<picture> | ||
<!-- TODO: Replace tmp link to logo url after vllm-projects/vllm-ascend ready --> | ||
<source media="(prefers-color-scheme: dark)" srcset="https://github.com/user-attachments/assets/4a958093-58b5-4772-a942-638b51ced646"> | ||
<img alt="vllm-ascend" src="https://github.com/user-attachments/assets/838afe2f-9a1d-42df-9758-d79b31556de0" width=55%> | ||
</picture> | ||
</p> | ||
|
||
<h3 align="center"> | ||
vLLM Ascend Plugin | ||
</h3> | ||
|
||
<p align="center"> | ||
| <a href="https://www.hiascend.com/en/"><b>关于昇腾</b></a> | <a href="https://slack.vllm.ai"><b>开发者 Slack (#sig-ascend)</b></a> | | ||
</p> | ||
|
||
<p align="center"> | ||
<a href="README.md"><b>English</b></a> | <a><b>中文</b></a> | ||
</p> | ||

---
*Latest news* 🔥

- [2024/12] We are working with the vLLM community to support [[RFC]: Hardware pluggable](https://github.com/vllm-project/vllm/issues/11162).
---
## Overview

The vLLM Ascend plugin (`vllm-ascend`) is a backend plugin that lets vLLM run seamlessly on Ascend NPUs.

This plugin is the recommended way to support the Ascend backend in the vLLM community. It follows the principles described in [[RFC]: Hardware pluggable](https://github.com/vllm-project/vllm/issues/11162), providing vLLM support for Ascend NPUs in a decoupled way.

With the vLLM Ascend plugin, popular large language models, including Transformer-like, mixture-of-experts (MoE), embedding, and multimodal models, can run seamlessly on Ascend NPUs.

## Prerequisites
### Supported devices
- Atlas A2 training series (Atlas 800T A2, Atlas 900 A2 PoD, Atlas 200T A2 Box16, Atlas 300T A2)
- Atlas 800I A2 inference series (Atlas 800I A2)

### Dependencies
| Requirement | Supported versions | Recommended version | Note |
|-------------|--------------------|---------------------|------|
| vLLM | main | main | required by vllm-ascend |
| Python | >= 3.9 | [3.10](https://www.python.org/downloads/) | required by vllm |
| CANN | >= 8.0.RC2 | [8.0.RC3](https://www.hiascend.com/developer/download/community/result?module=cann&cann=8.0.0.beta1) | required by vllm-ascend and torch-npu |
| torch-npu | >= 2.4.0 | [2.5.1rc1](https://gitee.com/ascend/pytorch/releases/tag/v6.0.0.alpha001-pytorch2.5.1) | required by vllm-ascend |
| torch | >= 2.4.0 | [2.5.1](https://github.com/pytorch/pytorch/releases/tag/v2.5.1) | required by torch-npu and vllm |

Find more information on configuring your environment [here](docs/environment.zh.md).
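
As a quick sanity check against the table above, you can print the versions Python actually resolves (this sketch assumes the packages are already installed, and that `torch_npu` exposes a `__version__` attribute):

```bash
# Compare the installed versions against the dependency table
python3 -c "import torch; print('torch:', torch.__version__)"
python3 -c "import torch_npu; print('torch-npu:', torch_npu.__version__)"
python3 -c "import vllm; print('vllm:', vllm.__version__)"
```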

## Getting started

> [!NOTE]
> We are actively working with the vLLM community on Ascend backend plugin support. Once it lands, you will be able to install with a single command: `pip install vllm vllm-ascend`.

Install from source:
```bash
# Install the vllm main branch, following:
# https://docs.vllm.ai/en/latest/getting_started/installation/cpu/index.html#build-wheel-from-source
git clone --depth 1 https://github.com/vllm-project/vllm.git
cd vllm
pip install -r requirements-build.txt
VLLM_TARGET_DEVICE=empty pip install .

# Install the vllm-ascend main branch
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
pip install -e .
```
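
To sanity-check the result, pip's package metadata is usually enough (a quick check, assuming both installs above succeeded):

```bash
# Confirm that both packages are visible to pip
pip show vllm vllm-ascend
```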

Run the following command to start a server with the [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) model:

```bash
# Set the environment variable VLLM_USE_MODELSCOPE=true to speed up the download
vllm serve Qwen/Qwen2.5-0.5B-Instruct
curl http://localhost:8000/v1/models
```
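
Once the server is up, you can also send a request to vLLM's OpenAI-compatible completions endpoint; a minimal sketch (the prompt and sampling parameters are arbitrary):

```bash
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "Qwen/Qwen2.5-0.5B-Instruct",
        "prompt": "Hello, my name is",
        "max_tokens": 32,
        "temperature": 0
    }'
```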

See the [vLLM quickstart](https://docs.vllm.ai/en/latest/getting_started/quickstart.html) for more details.

## Building

#### Build the Python package from source

```bash
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
pip install -e .
```

#### Build the container image
```bash
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
docker build -t vllm-ascend-dev-image -f ./Dockerfile .
```

See [Building and testing](./CONTRIBUTING.zh.md) for more details; it includes step-by-step guides for setting up a development environment, building, and testing.

## Feature support matrix
| Feature | Supported | Note |
|---------|-----------|------|
| Chunked Prefill | ✗ | Planned for 2025 Q1 |
| Automatic Prefix Caching | ✅ | Improve performance in 2025 Q1 |
| LoRA | ✗ | Planned for 2025 Q1 |
| Prompt adapter | ✅ | |
| Speculative decoding | ✅ | Improve accuracy in 2025 Q1 |
| Pooling | ✗ | Planned for 2025 Q1 |
| Enc-dec | ✗ | Planned for 2025 Q1 |
| Multi Modality | ✅ (LLaVA/Qwen2-vl/Qwen2-audio/internVL) | Add more model support in 2025 Q1 |
| LogProbs | ✅ | |
| Prompt logProbs | ✅ | |
| Async output | ✅ | |
| Multi step scheduler | ✅ | |
| Best of | ✅ | |
| Beam search | ✅ | |
| Guided Decoding | ✗ | Planned for 2025 Q1 |

## Model support matrix

Some of the supported models are listed here. See [supported_models](docs/supported_models.md) for more details:
| Model | Supported | Note |
|---------|-----------|------|
| Qwen 2.5 | ✅ | |
| Mistral | | Need test |
| DeepSeek v2.5 | | Need test |
| LLama3.1/3.2 | ✅ | |
| Gemma-2 | | Need test |
| baichuan | | Need test |
| minicpm | | Need test |
| internlm | ✅ | |
| ChatGLM | ✅ | |
| InternVL 2.5 | ✅ | |
| Qwen2-VL | ✅ | |
| GLM-4v | | Need test |
| Molmo | ✅ | |
| LLaVA 1.5 | ✅ | |
| Mllama | | Need test |
| LLaVA-Next | | Need test |
| LLaVA-Next-Video | | Need test |
| Phi-3-Vision/Phi-3.5-Vision | | Need test |
| Ultravox | | Need test |
| Qwen2-Audio | ✅ | |

## Contributing
We welcome and value any kind of contribution and collaboration:
- Please let us know about any bugs you encounter by [filing an issue](https://github.com/vllm-project/vllm-ascend/issues).
- Please follow the contribution guide in [CONTRIBUTING.zh.md](./CONTRIBUTING.zh.md).

## License

Apache License 2.0, as found in the [LICENSE](./LICENSE) file.
docs/environment.zh.md
@@ -0,0 +1,38 @@
### Ascend NPU environment preparation

### Dependencies
| Requirement | Supported versions | Recommended version | Note |
|-------------|--------------------|---------------------|------|
| vLLM | main | main | required by vllm-ascend |
| Python | >= 3.9 | [3.10](https://www.python.org/downloads/) | required by vllm |
| CANN | >= 8.0.RC2 | [8.0.RC3](https://www.hiascend.com/developer/download/community/result?module=cann&cann=8.0.0.beta1) | required by vllm-ascend and torch-npu |
| torch-npu | >= 2.4.0 | [2.5.1rc1](https://gitee.com/ascend/pytorch/releases/tag/v6.0.0.alpha001-pytorch2.5.1) | required by vllm-ascend |
| torch | >= 2.4.0 | [2.5.1](https://github.com/pytorch/pytorch/releases/tag/v2.5.1) | required by torch-npu and vllm |

Below is a brief guide to installing the recommended versions of the software:

#### Containerized installation

You can directly use the [container image](https://hub.docker.com/r/ascendai/cann) with just one command:

```bash
docker run \
    --name vllm-ascend-env \
    --device /dev/davinci1 \
    --device /dev/davinci_manager \
    --device /dev/devmm_svm \
    --device /dev/hisi_hdc \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
    -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    -it quay.io/ascend/cann:8.0.rc3.beta1-910b-ubuntu22.04-py3.10 bash
```
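
Inside the container, you can check that the NPU is visible through the mounted driver (a quick verification, relying on the `npu-smi` mount above):

```bash
# List the visible NPU devices and their driver/health status
npu-smi info
```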

You do not need to install `torch` and `torch_npu` manually; they will be installed automatically as `vllm-ascend` dependencies.

#### Manual installation

You can also choose to install manually, configuring the environment by following the instructions in the [Ascend installation guide](https://ascend.github.io/docs/sources/ascend/quick_install.html).