reset default block_size from 16 to 128 (#84)
### What this PR does / why we need it?

Changed the default block_size in platform.py from 16 to 128, as Ascend
devices are better suited to a block size of 128.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

Signed-off-by: hzji210@gmail.com <hzji210@gmail.com>
ji-huazhong authored Feb 18, 2025
1 parent 7606977 commit 8ea8523
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion vllm_ascend/platform.py
@@ -96,7 +96,7 @@ def check_and_update_config(cls, vllm_config: VllmConfig) -> None:
 parallel_config.worker_cls = "vllm_ascend.worker.NPUWorker"
 cache_config = vllm_config.cache_config
 if cache_config and cache_config.block_size is None:
-cache_config.block_size = 16
+cache_config.block_size = 128

 @classmethod
 def get_attn_backend_cls(cls, selected_backend, head_size, dtype,
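
The guarded assignment in the diff only fills in a default when no block size has been set, so an explicitly configured value should be left alone. A minimal sketch of that behavior follows. `CacheConfigStub`, `DEFAULT_BLOCK_SIZE`, and `apply_default_block_size` are hypothetical names used for illustration only; they are not part of vllm_ascend or vLLM.

```python
from dataclasses import dataclass
from typing import Optional

# Value introduced by this commit (the previous default was 16).
DEFAULT_BLOCK_SIZE = 128


@dataclass
class CacheConfigStub:
    """Hypothetical stand-in for the cache config; only block_size is modeled."""
    block_size: Optional[int] = None


def apply_default_block_size(cache_config: Optional[CacheConfigStub]) -> None:
    # Same guard as in the diff: skip a missing config, and only assign
    # the default when block_size has not been set.
    if cache_config and cache_config.block_size is None:
        cache_config.block_size = DEFAULT_BLOCK_SIZE


unset = CacheConfigStub()                  # no block size configured
explicit = CacheConfigStub(block_size=16)  # block size configured explicitly

apply_default_block_size(unset)
apply_default_block_size(explicit)
apply_default_block_size(None)             # a missing config is a no-op

print(unset.block_size, explicit.block_size)  # prints "128 16"
```

Under these assumptions, only configurations that leave block_size unset pick up the new default of 128.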