Releases · arthw/llama.cpp

13 Jul 09:34

aeaed61

b3312

Merge pull request #1 from arthw/update_warp

[SYCL] Fix WARP_SIZE=16 bug of Intel GPU (#8266) cherry-pick b549a1bbefb2f1fbb8b558bac1f2ae7967e60964

Assets 20

07 Jul 14:23

github-actions

b3309

c5009e6

py : switch to snake_case (#8305)

* py : switch to snake_case

ggml-ci

* cont

ggml-ci

* cont

ggml-ci

* cont : fix link

* gguf-py : use snake_case in scripts entrypoint export

* py : rename requirements for convert_legacy_llama.py

Needed for scripts/check-requirements.sh

---------

Co-authored-by: Francis Couture-Harpin <git@compilade.net>

Assets 20

04 Jul 04:50

github-actions

b3290

fdef7d6

replace get_work_group_size() by local buf

Assets 20

04 Jul 01:25

github-actions

b3289

2493479

skip UT for BF16

Assets 20

03 Jul 05:58

github-actions

b3288

96e3826

update for title

Assets 20

03 Jul 05:02

github-actions

b3286

85ec6c0

fix: add missing short command line argument -mli for multiline-input…

Assets 20

03 Jul 04:38

github-actions

b3281

9c59361

fix multiple GPU, add device selection mode, update the usage guide

Assets 20

02 Jul 06:22

github-actions

b3279

a9f3b10

[SYCL] Fix win build conflict of math library (#8230)

* fix win build conflict of math library

* fix the condition: !(win32 & SYCL)

* revert warp_size=16

Assets 20

02 Jul 04:52

github-actions

b3278

d08c20e

[SYCL] Fix the sub group size of Intel (#8106)

* use warp_size macro for all sycl kernels

* fix mask of permute_sub_group_by_xor

* fix rms_norm with correct warp number

* fix rms_norm_f32/group_norm_f32

* move norm to norm.cpp file

* fix quantize bug

* fix mmvq's batch size

Assets 20

29 Jun 04:53

github-actions

b3265

72272b8

fix code typo in llama-cli (#8198)

Assets 20
