Releases: arthw/llama.cpp
b3312
Merge pull request #1 from arthw/update_warp
[SYCL] Fix WARP_SIZE=16 bug of Intel GPU (#8266)
cherry-pick b549a1bbefb2f1fbb8b558bac1f2ae7967e60964
b3309
py : switch to snake_case (#8305)
* py : switch to snake_case ggml-ci
* cont ggml-ci
* cont ggml-ci
* cont : fix link
* gguf-py : use snake_case in scripts entrypoint export
* py : rename requirements for convert_legacy_llama.py Needed for scripts/check-requirements.sh
---------
Co-authored-by: Francis Couture-Harpin <git@compilade.net>
b3290
replace get_work_group_size() by local buf
b3289
skip UT for BF16
b3288
update for title
b3286
fix: add missing short command line argument -mli for multiline-input…
b3281
fix multiple GPU support, add device selection mode, update the usage guide
b3279
[SYCL] Fix win build conflict of math library (#8230)
* fix win build conflict of math library
* fix the condition: !(win32 & SYCL)
* revert warp_size=16
b3278
[SYCL] Fix the sub group size of Intel (#8106)
* use warp_size macro for all sycl kernels
* fix mask of permute_sub_group_by_xor
* fix rms_norm with correct warp number
* fix rms_norm_f32/group_norm_f32
* move norm to norm.cpp file
* fix quantize bug
* fix mmvq's batch size
b3265
fix code typo in llama-cli (#8198)