What this PR does / why we need it?
vLLM Ascend plugin (vllm-ascend) is a backend plugin for running vLLM on the Ascend NPU.
This plugin is the recommended approach for supporting the Ascend backend within the vLLM community. It adheres to the principles outlined in the [RFC]: Hardware pluggable, providing a hardware-pluggable interface that decouples the integration of the Ascend NPU with vLLM.
This patch also includes changes to make CI work and to speed up the e2e test with caching, including:
- Use caches (pip, ms, hf) to speed up the e2e test (25 mins --> 5 mins)
- Switch the `git clone` command to `actions/checkout` to speed up checkout
- Enable `-sv` for pytest for a better info dump
- Remove network host to resolve `docker: conflicting options: cannot attach both user-defined and non-user-defined network-modes`, which is a problem on docker 1.45 but not on 1.39.
Does this PR introduce any user-facing change?
Yes, this is the initial PR.
How was this patch tested?