
Update dependency vllm to v0.7.2 [SECURITY] #28

Open
renovate[bot] wants to merge 1 commit into main from renovate/pypi-vllm-vulnerability

Conversation


renovate bot commented Jan 29, 2025

This PR contains the following updates:

| Package | Change |
| --- | --- |
| vllm | ==v0.6.6 -> ==0.7.2 |
| vllm | ==v0.6.4 -> ==0.7.2 |
| vllm | ==0.6.6 -> ==0.7.2 |
| vllm | ==0.6.4 -> ==0.7.2 |

GitHub Vulnerability Alerts

CVE-2024-8768

A flaw was found in the vLLM library. A completions API request with an empty prompt will crash the vLLM API server, resulting in a denial of service.
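
To make the report above concrete, here is a minimal sketch (not from the advisory) of the request shape it describes, against vLLM's OpenAI-compatible completions endpoint; the server address and model name are placeholders.

```python
# Hypothetical illustration of the empty-prompt completions request described above.
# Assumes a vLLM OpenAI-compatible server at localhost:8000; the model name is a placeholder.
import requests

resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={"model": "my-model", "prompt": "", "max_tokens": 1},
    timeout=10,
)
# On affected vLLM versions this empty prompt could crash the API server (denial of service);
# on fixed versions the server responds normally instead.
print(resp.status_code, resp.text)
```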

CVE-2025-24357

Description

vllm/model_executor/weight_utils.py implements hf_model_weights_iterator to load model checkpoints downloaded from Hugging Face. It calls the torch.load function with the weights_only parameter left at its default value of False. As the security warning at https://pytorch.org/docs/stable/generated/torch.load.html explains, when torch.load unpickles malicious pickle data it will execute arbitrary code.

Impact

This vulnerability can be exploited to execute arbitrary code and OS commands on the machine of a victim who fetches a pretrained model repository remotely.

Note that most models now use the safetensors format, which is not vulnerable to this issue.
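
As a hedged sketch of the mitigation pattern (not vLLM's exact code), passing weights_only=True to torch.load restricts unpickling so that a malicious checkpoint cannot execute code during load; the checkpoint path below is a placeholder.

```python
# Minimal sketch of the safer loading pattern; not vLLM's actual implementation.
import torch

checkpoint_path = "pytorch_model.bin"  # placeholder path to an untrusted checkpoint

# weights_only=True limits unpickling to tensors and primitive containers,
# so a malicious pickle payload cannot run arbitrary code during torch.load.
state_dict = torch.load(checkpoint_path, map_location="cpu", weights_only=True)
print(list(state_dict)[:5])  # peek at a few parameter names
```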

References

CVE-2025-25183

Summary

Maliciously constructed prompts can lead to hash collisions, resulting in prefix cache reuse, which can interfere with subsequent responses and cause unintended behavior.

Details

vLLM's prefix caching makes use of Python's built-in hash() function. As of Python 3.12, the behavior of hash(None) has changed to be a predictable constant value. This makes it more feasible that someone could try to exploit hash collisions.

Impact

The impact of a collision would be the reuse of a cache entry that was generated from different content. Given knowledge of the prompts in use and predictable hashing behavior, someone could intentionally populate the cache using a prompt known to collide with another prompt in use.

Solution

We address this problem by initializing hashes in vLLM with a value that is no longer constant and predictable; it will be different each time vLLM runs. This restores the behavior we had in Python versions prior to 3.12.

Using a hashing algorithm that is less prone to collision (sha256, for example) would be the best way to avoid the possibility of a collision. However, it would have an impact on both performance and memory footprint. Hash collisions may still occur, though they are no longer straightforward to predict.

To give an idea of the likelihood of a collision, for randomly generated hash values (assuming the hash generation built into Python is uniformly distributed), with a cache capacity of 50,000 messages and an average prompt length of 300, a collision will occur on average once every 1 trillion requests.
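
As a minimal sketch of the general technique described above (not vLLM's actual implementation), seeding the block-hash chain with a per-process random value keeps hashes unpredictable across runs even though Python 3.12 makes hash(None) a constant.

```python
# Sketch of per-process hash seeding for a prefix-cache block chain.
import secrets

# Drawn once per process, so hash chains differ between runs and cannot be precomputed.
_PREFIX_CACHE_SEED = secrets.randbits(64)

def block_hash(parent_hash: int, token_ids: tuple) -> int:
    """Hash a cache block from its parent block's hash and its token ids."""
    return hash((_PREFIX_CACHE_SEED, parent_hash, token_ids))

root = block_hash(_PREFIX_CACHE_SEED, (1, 2, 3))   # first block uses the seed as its parent
child = block_hash(root, (4, 5, 6))
print(root, child)
```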

References


vllm: Malicious model to RCE by torch.load in hf_model_weights_iterator

CVE-2025-24357 / GHSA-rh4j-5rhw-hr54

More information

Details

Description

vllm/model_executor/weight_utils.py implements hf_model_weights_iterator to load model checkpoints downloaded from Hugging Face. It calls the torch.load function with the weights_only parameter left at its default value of False. As the security warning at https://pytorch.org/docs/stable/generated/torch.load.html explains, when torch.load unpickles malicious pickle data it will execute arbitrary code.

Impact

This vulnerability can be exploited to execute arbitrary code and OS commands on the machine of a victim who fetches a pretrained model repository remotely.

Note that most models now use the safetensors format, which is not vulnerable to this issue.

References

Severity

  • CVSS Score: 7.5 / 10 (High)
  • Vector String: CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H

References

This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).


vLLM uses Python 3.12 built-in hash() which leads to predictable hash collisions in prefix cache

CVE-2025-25183 / GHSA-rm76-4mrf-v9r8

More information

Details

Summary

Maliciously constructed prompts can lead to hash collisions, resulting in prefix cache reuse, which can interfere with subsequent responses and cause unintended behavior.

Details

vLLM's prefix caching makes use of Python's built-in hash() function. As of Python 3.12, the behavior of hash(None) has changed to be a predictable constant value. This makes it more feasible that someone could try to exploit hash collisions.

Impact

The impact of a collision would be the reuse of a cache entry that was generated from different content. Given knowledge of the prompts in use and predictable hashing behavior, someone could intentionally populate the cache using a prompt known to collide with another prompt in use.

Solution

We address this problem by initializing hashes in vLLM with a value that is no longer constant and predictable; it will be different each time vLLM runs. This restores the behavior we had in Python versions prior to 3.12.

Using a hashing algorithm that is less prone to collision (sha256, for example) would be the best way to avoid the possibility of a collision. However, it would have an impact on both performance and memory footprint. Hash collisions may still occur, though they are no longer straightforward to predict.

To give an idea of the likelihood of a collision, for randomly generated hash values (assuming the hash generation built into Python is uniformly distributed), with a cache capacity of 50,000 messages and an average prompt length of 300, a collision will occur on average once every 1 trillion requests.

References

Severity

  • CVSS Score: 2.6 / 10 (Low)
  • Vector String: CVSS:3.1/AV:N/AC:H/PR:L/UI:R/S:U/C:N/I:L/A:N

References

This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).


Release Notes

vllm-project/vllm (vllm)

v0.7.2

Compare Source

Highlights
  • Qwen2.5-VL is now supported in vLLM. Please note that it currently requires installing the Hugging Face transformers library from source (#12604)
  • Add transformers backend support via --model-impl=transformers. This allows vLLM to be run with arbitrary Hugging Face text models (#11330, #12785, #12727); see the sketch after this list.
  • Performance enhancements to DeepSeek models:
    • Align KV cache entries to start at 256-byte boundaries, yielding a 43% throughput improvement (#12676)
    • Apply torch.compile to fused_moe/grouped_topk, yielding a 5% throughput improvement (#12637)
    • Enable MLA for DeepSeek VL2 (#12729)
    • Enable DeepSeek models on ROCm (#12662)
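
Below is a hedged sketch of using the new transformers backend from Python. It assumes the --model-impl option noted above is also exposed as the model_impl keyword of the LLM entry point; the model id is a placeholder.

```python
# Sketch only: assumes model_impl is accepted as an LLM/EngineArgs keyword,
# mirroring the --model-impl=transformers CLI flag from the release notes.
from vllm import LLM, SamplingParams

llm = LLM(
    model="my-org/my-hf-text-model",  # placeholder Hugging Face model id
    model_impl="transformers",        # use the transformers modeling code instead of vLLM's
)
outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```
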
Core Engine
  • Use VLLM_LOGITS_PROCESSOR_THREADS to speed up structured decoding in high batch size scenarios (#​12368)
Security Update
  • Improve hash collision avoidance in prefix caching (#​12621)
  • Add SPDX-License-Identifier headers to python source files (#​12628)
Other
  • Enable FusedSDPA support for Intel Gaudi (HPU) (#​12359)
What's Changed
New Contributors

Full Changelog: vllm-project/vllm@v0.7.1...v0.7.2

v0.7.1

Compare Source

Highlights

This release features MLA optimization for the DeepSeek family of models. Compared to v0.7.0, released this Monday, we offer ~3x the generation throughput, ~10x the memory capacity for tokens, and horizontal context scalability with pipeline parallelism.

V1

For the V1 architecture, we

Models
  • New Model: MiniCPM-o (text outputs only) (#​12069)
Hardware
  • Neuron: NKI-based flash-attention kernel with paged KV cache (#​11277)
  • AMD: llama 3.2 support upstreaming (#​12421)
Others
  • Support override generation config in engine arguments (#​12409)
  • Support reasoning content in API for deepseek R1 (#​12473)
What's Changed
New Contributors

Full Changelog: vllm-project/vllm@v0.7.0...v0.7.1

v0.7.0

Compare Source

Highlights

  • vLLM's V1 engine is ready for testing! This is a rewritten engine designed for performance and architectural simplicity. You can turn it on by setting the environment variable VLLM_USE_V1=1. See our blog for more details. (44 commits).
  • New methods (LLM.sleep, LLM.wake_up, LLM.collective_rpc, LLM.reset_prefix_cache) in vLLM for post-training frameworks! (#12361, #12084, #12284); see the sketch after this list.
  • torch.compile is now fully integrated in vLLM, and enabled by default in V1. You can turn it on via the -O3 engine parameter. (#11614, #12243, #12043, #12191, #11677, #12182, #12246).
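
Below is a hedged sketch of opting into the V1 engine and calling one of the new helper methods named above; the model id is a placeholder, and prefix caching is enabled explicitly so reset_prefix_cache() has something to clear.

```python
# Sketch only: enables the experimental V1 engine via the documented
# VLLM_USE_V1=1 environment variable and exercises LLM.reset_prefix_cache().
import os
os.environ["VLLM_USE_V1"] = "1"  # set before importing vllm to be safe

from vllm import LLM

llm = LLM(model="my-org/my-model", enable_prefix_caching=True)  # placeholder model id
llm.generate(["warm up the prefix cache"])
llm.reset_prefix_cache()  # drop cached prefixes, e.g. between post-training rounds
```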

This release features

  • 400 commits from 132 contributors, including 57 new contributors.
    • 28 CI and build enhancements, including testing for nightly torch (#​12270) and inclusion of genai-perf for benchmark (#​10704).
    • 58 documentation enhancements, including reorganized documentation structure (#​11645, #​11755, #​11766, #​11843, #​11896).
    • more than 161 bug fixes and miscellaneous enhancements
Features

Models

Hardware

Features

  • Distributed:
  • API Server: Jina- and Cohere-compatible Rerank API (#​12376)
  • Kernels:
    • Flash Attention 3 Support (#​12093)
    • Punica prefill kernels fusion (#​11234)
    • For Deepseek V3: optimize moe_align_block_size for cuda graph and large num_experts (#​12222)
Others
  • Benchmark: new script for CPU offloading (#​11533)
  • Security: Set weights_only=True when using torch.load() (#12366)

Configuration

📅 Schedule: Branch creation - "" in timezone America/Toronto, Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about these updates again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.


renovate bot commented Jan 29, 2025

⚠️ Artifact update problem

Renovate failed to update artifacts related to this branch. You probably do not want to merge this PR as-is.

♻ Renovate will retry this branch, including artifacts, only when one of the following happens:

  • any of the package files in this branch needs updating, or
  • the branch becomes conflicted, or
  • you click the rebase/retry checkbox if found above, or
  • you rename this PR's title to start with "rebase!" to trigger it manually

The artifact failure details are included below:

File name: model-servers/vllm/0.6.4/Pipfile.lock
Command failed: pipenv lock
Creating a virtualenv for this project
Pipfile: /tmp/renovate/repos/github/redhat-ai-dev/developer-images/model-servers/vllm/0.6.4/Pipfile
Using /opt/containerbase/tools/python/3.11.11/bin/python3 3.11.11 to create virtualenv...
created virtual environment CPython3.11.11.final.0-64 in 429ms
  creator CPython3Posix(dest=/tmp/renovate/cache/others/virtualenvs/0.6.4-d_r4MF2m, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/tmp/containerbase/cache/.local/share/virtualenv)
    added seed packages: pip==24.3.1, setuptools==75.8.0, wheel==0.45.1
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator

✔ Successfully created virtual environment!
Virtualenv location: /tmp/renovate/cache/others/virtualenvs/0.6.4-d_r4MF2m
Locking [packages] dependencies...
CRITICAL:pipenv.patched.pip._internal.resolution.resolvelib.factory:Cannot install -r /tmp/pipenv-55nn5owk-requirements/pipenv-ics_az7m-constraints.txt (line 14) and torch==2.3.0+cu121 because these package versions have conflicting dependencies.
[ResolutionFailure]:   File "/opt/containerbase/tools/pipenv/2024.4.1/3.11.11/lib/python3.11/site-packages/pipenv/resolver.py", line 451, in main
[ResolutionFailure]:       _main(
[ResolutionFailure]:   File "/opt/containerbase/tools/pipenv/2024.4.1/3.11.11/lib/python3.11/site-packages/pipenv/resolver.py", line 436, in _main
[ResolutionFailure]:       resolve_packages(
[ResolutionFailure]:   File "/opt/containerbase/tools/pipenv/2024.4.1/3.11.11/lib/python3.11/site-packages/pipenv/resolver.py", line 400, in resolve_packages
[ResolutionFailure]:       results, resolver = resolve_deps(
[ResolutionFailure]:       ^^^^^^^^^^^^^
[ResolutionFailure]:   File "/opt/containerbase/tools/pipenv/2024.4.1/3.11.11/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 967, in resolve_deps
[ResolutionFailure]:       results, hashes, internal_resolver = actually_resolve_deps(
[ResolutionFailure]:       ^^^^^^^^^^^^^^^^^^^^^^
[ResolutionFailure]:   File "/opt/containerbase/tools/pipenv/2024.4.1/3.11.11/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 735, in actually_resolve_deps
[ResolutionFailure]:       resolver.resolve()
[ResolutionFailure]:   File "/opt/containerbase/tools/pipenv/2024.4.1/3.11.11/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 460, in resolve
[ResolutionFailure]:       raise ResolutionFailure(message=e)
Your dependencies could not be resolved. You likely have a mismatch in your sub-dependencies.
You can use $ pipenv run pip install <requirement_name> to bypass this mechanism, then run $ pipenv graph to inspect the versions actually installed in the virtualenv.
Hint: try $ pipenv lock --pre if it is a pre-release dependency.
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

Your dependencies could not be resolved. You likely have a mismatch in your sub-dependencies.
You can use $ pipenv run pip install <requirement_name> to bypass this mechanism, then run $ pipenv graph to inspect the versions actually installed in the virtualenv.
Hint: try $ pipenv lock --pre if it is a pre-release dependency.
ERROR: Failed to lock Pipfile.lock!

File name: model-servers/vllm/0.6.6/Pipfile.lock
Command failed: pipenv lock
Creating a virtualenv for this project
Pipfile: /tmp/renovate/repos/github/redhat-ai-dev/developer-images/model-servers/vllm/0.6.6/Pipfile
Using /opt/containerbase/tools/python/3.11.11/bin/python3 3.11.11 to create virtualenv...
created virtual environment CPython3.11.11.final.0-64 in 148ms
  creator CPython3Posix(dest=/tmp/renovate/cache/others/virtualenvs/0.6.6-39QtryhM, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/tmp/containerbase/cache/.local/share/virtualenv)
    added seed packages: pip==24.3.1, setuptools==75.8.0, wheel==0.45.1
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator

✔ Successfully created virtual environment!
Virtualenv location: /tmp/renovate/cache/others/virtualenvs/0.6.6-39QtryhM
Locking [packages] dependencies...
CRITICAL:pipenv.patched.pip._internal.resolution.resolvelib.factory:Cannot install -r /tmp/pipenv-8dastkn7-requirements/pipenv-xwc2lplr-constraints.txt (line 21), -r /tmp/pipenv-8dastkn7-requirements/pipenv-xwc2lplr-constraints.txt (line 4) and torch==2.3.0+cu121 because these package versions have conflicting dependencies.
[ResolutionFailure]:   File "/opt/containerbase/tools/pipenv/2024.4.1/3.11.11/lib/python3.11/site-packages/pipenv/resolver.py", line 451, in main
[ResolutionFailure]:       _main(
[ResolutionFailure]:   File "/opt/containerbase/tools/pipenv/2024.4.1/3.11.11/lib/python3.11/site-packages/pipenv/resolver.py", line 436, in _main
[ResolutionFailure]:       resolve_packages(
[ResolutionFailure]:   File "/opt/containerbase/tools/pipenv/2024.4.1/3.11.11/lib/python3.11/site-packages/pipenv/resolver.py", line 400, in resolve_packages
[ResolutionFailure]:       results, resolver = resolve_deps(
[ResolutionFailure]:       ^^^^^^^^^^^^^
[ResolutionFailure]:   File "/opt/containerbase/tools/pipenv/2024.4.1/3.11.11/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 967, in resolve_deps
[ResolutionFailure]:       results, hashes, internal_resolver = actually_resolve_deps(
[ResolutionFailure]:       ^^^^^^^^^^^^^^^^^^^^^^
[ResolutionFailure]:   File "/opt/containerbase/tools/pipenv/2024.4.1/3.11.11/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 735, in actually_resolve_deps
[ResolutionFailure]:       resolver.resolve()
[ResolutionFailure]:   File "/opt/containerbase/tools/pipenv/2024.4.1/3.11.11/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 460, in resolve
[ResolutionFailure]:       raise ResolutionFailure(message=e)
Your dependencies could not be resolved. You likely have a mismatch in your sub-dependencies.
You can use $ pipenv run pip install <requirement_name> to bypass this mechanism, then run $ pipenv graph to inspect the versions actually installed in the virtualenv.
Hint: try $ pipenv lock --pre if it is a pre-release dependency.
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

Your dependencies could not be resolved. You likely have a mismatch in your sub-dependencies.
You can use $ pipenv run pip install <requirement_name> to bypass this mechanism, then run $ pipenv graph to inspect the versions actually installed in the virtualenv.
Hint: try $ pipenv lock --pre if it is a pre-release dependency.
ERROR: Failed to lock Pipfile.lock!

Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
renovate bot force-pushed the renovate/pypi-vllm-vulnerability branch from cc8ecc0 to 005f2a5 on February 8, 2025 07:14
renovate bot changed the title from "Update dependency vllm to v0.7.0 [SECURITY]" to "Update dependency vllm to v0.7.2 [SECURITY]" on Feb 8, 2025