-
Notifications
You must be signed in to change notification settings - Fork 50
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #675 from zifeng-radxa/main
add deepseek-r1 RKLLM docs
- Loading branch information
Showing
24 changed files
with
599 additions
and
261 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,100 @@ | ||
[DeepSeek-R1](https://api-docs.deepseek.com/news/news250120) 是由杭州[深度求索](https://www.deepseek.com/)公司开发, | ||
该模型完全开源了所有训练技术和模型权重,性能对齐闭源的 OpenAI-o1, | ||
deepseek 通过 DeepSeek-R1 的输出,蒸馏了 6 个小模型给开源社区,包括 Qwen2.5 和 Llama3.1。 | ||
本文档将讲述如何使用 RKLLM 将 DeepSeek-R1 蒸馏模型 [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) 大语言模型部署到 RK3588 上利用 NPU 进行硬件加速推理。 | ||
|
||
![rkllm_2.webp](/img/general-tutorial/rknn/rkllm_ds_1.webp) | ||
|
||
### 模型文件下载 | ||
|
||
:::tip | ||
瑞莎已经提供编译好的 rkllm 模型和执行文件,用户可直接下载使用,如要参考编译过程可继续参考可选部分 | ||
::: | ||
|
||
- 使用 [git LFS](https://git-lfs.com/) 从 [ModelScope](https://modelscope.cn/models/radxa/DeepSeek-R1-Distill-Qwen-1.5B_RKLLM) 下载预编译好的 rkllm | ||
|
||
```bash | ||
git clone https://www.modelscope.cn/radxa/DeepSeek-R1-Distill-Qwen-1.5B_RKLLM.git | ||
``` | ||
|
||
### (可选)模型编译 | ||
|
||
:::tip | ||
请用户根据 [RKLLM安装](./rkllm_install) 完成 PC 端和开发板端 RKLLM 工作环境的准备 | ||
::: | ||
|
||
- x86 PC 工作站中下载 [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) 权重文件, 如没安装 [git-lfs](https://git-lfs.com/),请自行安装 | ||
```bash | ||
git clone https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | ||
``` | ||
- 激活 rkllm conda 环境, 可参考[RKLLM conda 安装](rkllm_install#x86-pc-工作站) | ||
```bash | ||
conda activate rkllm | ||
``` | ||
- 更改 `rknn-llm/rkllm-toolkit/examples/test.py` 中 modelpath 模型路径, dataset路径, rkllm 导出路径 | ||
```python | ||
15 modelpath = 'Your DeepSeek-R1-Distill-Qwen-1.5B Folder Path' | ||
29 datasert = None # 默认是 "./data_quant.json", 如无可以填写 None | ||
83 ret = llm.export_rkllm("./DeepSeek-R1-Distill-Qwen-1.5B.rkllm") | ||
``` | ||
- 运行模型转换脚本 | ||
```bash | ||
cd rknn-llm/rkllm-toolkit/examples/ | ||
python3 test.py | ||
``` | ||
转换成功后可得到 DeepSeek-R1-Distill-Qwen-1.5B.rkllm 模型 | ||
|
||
### (可选)编译可执行文件 | ||
|
||
- 下载交叉编译工具链 [gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu](https://developer.arm.com/downloads/-/gnu-a/10-2-2020-11) | ||
- 修改主程序 `rknn-llm/examples/rkllm_api_demo/src/llm_demo.cpp` 代码, 这里修改 PROMPT 格式的设置,和 PROMPT 的构造 | ||
|
||
```cpp | ||
24 #define PROMPT_TEXT_PREFIX "<|im_start|>system\nYou are a helpful assistant.\n<|im_end|>\n<|im_start|>user\n" | ||
25 #define PROMPT_TEXT_POSTFIX "\n<|im_end|>\n<|im_start|>assistant\n<think>" | ||
184 text = PROMPT_TEXT_PREFIX + input_str + PROMPT_TEXT_POSTFIX; | ||
185 // text = input_str; | ||
``` | ||
|
||
:::tip | ||
为什么要修改 PROMPT_TEXT_PREFIX 和 PROMPT_TEXT_POSTFIX? 这里需要参考 [DeepSeek-R1 论文](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf)中 Table1 的说明,需按照其格式对 DeepSeek-R1 模型提问 | ||
::: | ||
![rkllm_2.webp](/img/general-tutorial/rknn/rkllm_ds_2.webp) | ||
|
||
- 修改 `rknn-llm/examples/rkllm_api_demo/build-linux.sh` 编译脚本中 `GCC_COMPILER_PATH` 路径 | ||
```bash | ||
GCC_COMPILER_PATH=gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu | ||
``` | ||
- 运行模型转换脚本 | ||
```bash | ||
cd rknn-llm/examples/rkllm_api_demo/ | ||
bash build-linux.sh | ||
``` | ||
生成的可执行文件在 `build/build_linux_aarch64_Release/llm_demo` | ||
|
||
### 板端部署 | ||
|
||
#### 终端模式 | ||
|
||
- 将转换成功后的 DeepSeek-R1-Distill-Qwen-1.5B.rkllm 模型与编译后的二进制文件 llm_demo 复制到板端 | ||
- 导入环境变量 | ||
```bash | ||
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:rknn-llm/rkllm-runtime/Linux/librkllm_api/aarch64 | ||
``` | ||
:::tip | ||
使用 ModelScope 下载的用户可直接 export 下载仓库里的 librkllmrt.so | ||
::: | ||
- 运行 llm_demo,输入 `exit` 退出 | ||
```bash | ||
export RKLLM_LOG_LEVEL=1 | ||
./llm_demo DeepSeek-R1-Distill-Qwen-1.5B.rkllm 10000 10000 | ||
``` | ||
![rkllm_2.webp](/img/general-tutorial/rknn/rkllm_ds_3.webp) | ||
|
||
### 性能分析 | ||
|
||
对于数学问题: `解方程 x+y=12, 2x+4y=34, 求x,y的值`, 在 RK3588 上达 14.93 token/s | ||
| Stage | Total Time (ms) | Tokens | Time per Token (ms) | Tokens per Second | | ||
|----------|-----------------|--------|---------------------|-------------------| | ||
| Prefill | 429.63 | 81 | 5.30 | 188.53 | | ||
| Generate | 56103.71 | 851 | 66.99 | 14.93 | |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.