Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Model] Adapt the MiniCPM-2B model #164

Merged
merged 1 commit into from
Feb 26, 2025

Conversation

kunpengW-code
Copy link

What this PR does / why we need it?

In order to adapt the MiniCPM-2B model of the vLLM framework to Ascend hardware, the following two modifications are required: 1. The qkv type conversion in the forward function is deleted to support the rope operator. 2. The fused_moe.fused_moe module is replaced to prevent errors caused by triton not supported,this modification can be deleted if triton is supported

Does this PR introduce any user-facing change?

How was this patch tested?

Copy link
Collaborator

@wangxiyuan wangxiyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image build failure is a know issue. I'll fix later.

@@ -18,4 +18,6 @@

def register():
"""Register the NPU platform."""
# To ensure that the module is correctly replaced, add it at the beginning
import vllm_ascend.patch_module # noqa: F401
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not suggested to add any imprt vllm in register fun to avoid circle import problem. You can just add this to platform.check_and_update_config, or like you did below in worker

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to add it to platform.check_and_update_config or worker at first, but it executes after importing the module we need to replace. Then I tried to add it to platform.pre_register_and_update, it executes before importing the module, but maybe other modules already reference old module objects, replacing sys.modules directly does not update their references, it doesn't work. Do you have any suggestions to solve this problem?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, I just notice this is to patch vllm.model_executor.layers.fused_moe.fused_moe. It's fine currently. But for long term support, there are two way to make it better:

  1. rewrite mincpm and use ModelRegistery to register a new one in vllm-ascend
  2. contribute to vllm to make the fused_moe code compatible with non-triton supported hardware.

@wangxiyuan
Copy link
Collaborator

Please rebase the code and fix the lint error. otherwise LGTM

Signed-off-by: Wang Kunpeng <1289706727@qq.com>
@kunpengW-code
Copy link
Author

Please rebase the code and fix the lint error. otherwise LGTM

Done. Thanks a lot

@wangxiyuan wangxiyuan merged commit ca807ce into vllm-project:v0.7.3-dev Feb 26, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants