Releases: ggml-org/llama.cpp

b9665

16 Jun 10:11
e3a74b2

bench : add --offline (#24511)

  • bench : add --offline

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

  • Add default

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

macOS/iOS:

Linux:

Android:

Windows:

openEuler:

  • DISABLED
  • openEuler x86 (310p)
  • openEuler x86 (910b, ACL Graph)
  • openEuler aarch64 (310p)
  • openEuler aarch64 (910b, ACL Graph)

UI:

b9664

16 Jun 08:09
ac79caa

sycl: support reordered Q4_K/Q5_K/Q6_K MoE MUL_MAT_ID (#24452)

  • sycl: support reordered Q4_K and Q5_K MoE MUL_MAT_ID

Extend reordered-weight handling to fused MoE MUL_MAT_ID for Q4_K and Q5_K expert tensors and add Q5_K reordered DMMV coverage. Unsupported 3D reorder cases now fall back instead of aborting.

  • sycl: extend MoE reorder to Q6_K mul_mat_id

b9663

16 Jun 06:12
fdd1098

[SYCL] Support OP EXPM1, support all UT cases of FLOOR, TRUNC, ROUND (#24363)

  • support OP EXPM1, support all UT cases of FLOOR, TRUNC, ROUND

  • fix conflict

  • rebase, support new UT case of repeat, concat

b9661

16 Jun 05:07
ad39cca

vulkan: add col2im_1d op (#24425)

  • vulkan: add GGML_OP_COL2IM_1D, follow-up to the CPU op

  • vulkan: col2im_1d bounded gather loop instead of full-K scan with modulo

  • vulkan: col2im_1d address review from @jeffbolznv

  • vulkan: col2im_1d return nullptr for unsupported types, address review from @0cc4m
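The "bounded gather loop instead of full-K scan with modulo" change can be illustrated with a small sketch. This is not the Vulkan shader from the PR. It is a generic single-channel col2im_1d in Python, assuming the usual definition where column entry (k, l) lands on output position l * stride + k - pad. The function names, the list-of-lists layout and the `pad` parameter are assumptions made for illustration only.

```python
def col2im_1d_full_scan(col, stride, pad=0):
    """Reference version: for each output position, scan all K kernel taps
    and use a modulo test to find the ones that contribute."""
    K, L_in = len(col), len(col[0])
    L_out = (L_in - 1) * stride + K - 2 * pad
    out = [0] * L_out
    for t in range(L_out):
        pos = t + pad
        for k in range(K):
            rem = pos - k
            if rem >= 0 and rem % stride == 0:
                l = rem // stride
                if l < L_in:
                    out[t] += col[k][l]
    return out


def col2im_1d_bounded(col, stride, pad=0):
    """Bounded gather: compute the range of input columns l that can reach
    the output position, then visit only those (no modulo, no wasted taps)."""
    K, L_in = len(col), len(col[0])
    L_out = (L_in - 1) * stride + K - 2 * pad
    out = [0] * L_out
    for t in range(L_out):
        pos = t + pad
        # Need 0 <= pos - l*stride <= K-1, i.e. ceil((pos-K+1)/stride) <= l <= pos//stride
        l_lo = max(0, -(-(pos - K + 1) // stride))
        l_hi = min(L_in - 1, pos // stride)
        for l in range(l_lo, l_hi + 1):
            out[t] += col[pos - l * stride][l]
    return out


col = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]  # K = 3 taps, L_in = 3 columns
print(col2im_1d_bounded(col, stride=2))
```

Both functions return the same result. The bounded version does roughly K / stride iterations per output element instead of K, which is the kind of saving the commit title describes.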

b9660

15 Jun 22:05
7dad2f1

chat : fix LFM2 tool-call parsing double-escaping (#24667)

  • Add escape test cases

  • chat : fix LFM2 tool-call parsing double-escaping

b9659

15 Jun 21:32
e36a602

b9658

15 Jun 20:59
38d5463

chat: include full unparsed prompt in debug message on parse error (#24650)

b9656

15 Jun 20:27
581e8ec

chat: harden peg-native tool call parsing (#24329)

  • chat: harden peg-native tool call parsing

accept an optional leading type: function field in
build_json_tools_flat_keys so openai style tool calls parse on
templates whose serialization opens on the name field.

return a clean error and log the unparsed fragment on a final peg
parse failure instead of throwing the raw parser position and input.

keep the raw arguments string in func_args_not_string when it is not
valid json instead of aborting the prompt render.

  • chat: surface peg-native parse failures

a final peg parse failure threw the raw parser position and input. log
the unparsed fragment and raise a clearer error instead, so a model
output that does not match the expected format no longer fails silently
with an empty assistant turn.

minimal change, no behavior change on successful parses.

  • chat: handle openai style tool calls in peg-native

  • nits

  • common: scope OpenAI wrapper grammar trigger via autoparser flag

  • chat: gate type:function parsing leniency on the analysis flag

Thread accept_openai_wrapper from the generator to build_json_tools_flat_keys
so the leading "type": "function" field is accepted only when openai_wrapper_trigger is set.
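The behavior described in this release can be sketched in a few lines of Python. This is an illustration only. The real change lives in llama.cpp's C++ PEG parser, and everything named here (`parse_tool_call`, `normalize_arguments`, `ToolCallParseError`, and the use of `json.loads` in place of a grammar) is hypothetical. Only the `accept_openai_wrapper` flag name is taken from the commit message. The sketch shows three ideas: a leading "type": "function" field is accepted only when the flag is set, a parse failure logs the unparsed fragment and raises a clear error, and an arguments string that is not valid JSON is kept as-is.

```python
import json
import logging

log = logging.getLogger("tool_call_sketch")


class ToolCallParseError(ValueError):
    """Hypothetical error type for output that is not a valid tool call."""


def normalize_arguments(arguments):
    # Keep the raw string when it is not valid JSON instead of failing.
    if not isinstance(arguments, str):
        return arguments
    try:
        return json.loads(arguments)
    except json.JSONDecodeError:
        return arguments


def parse_tool_call(fragment, accept_openai_wrapper=False):
    try:
        call = json.loads(fragment)
    except json.JSONDecodeError as err:
        # Log the fragment and raise a clean error rather than parser internals.
        log.debug("unparsed tool call fragment: %r", fragment)
        raise ToolCallParseError("output does not match the tool call format") from err
    if not isinstance(call, dict) or "name" not in call:
        raise ToolCallParseError("tool call must be an object with a name")
    keys = list(call)
    if keys[0] == "type":
        # The OpenAI-style wrapper field is accepted only when the flag is set.
        if not accept_openai_wrapper or call["type"] != "function":
            raise ToolCallParseError("unexpected leading type field")
    return {
        "name": call["name"],
        "arguments": normalize_arguments(call.get("arguments", {})),
    }
```

For example, `parse_tool_call('{"type": "function", "name": "f", "arguments": {}}', accept_openai_wrapper=True)` succeeds, while the same input with the flag left at its default raises `ToolCallParseError`.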

b9655

15 Jun 19:55
0ae3f45

chat: fix an "oldie but goodie" grammar generator bug that surfaced during last changes (#24653)

  • chat: fix an "oldie but goodie" grammar generator bug that surfaced during last changes

  • update erroneous case in PEG parser test

b9654

15 Jun 19:24
e3cab40

mtmd : add post-decode callback (#24645)

Assisted-by: pi:llama.cpp/Qwen3.6-27B
