Misc. bug:

### Name and Version

version: 147 (ec9e030)
built with Apple clang version 15.0.0 (clang-1500.0.40.1) for arm64-apple-darwin24.5.0



### Operating systems

_No response_

### Which llama.cpp modules do you know to be affected?

_No response_

### Command line

```shell
# -b 32 sets the batch size to 32
llama-server -m qwen2.5-vl-7b.gguf --mmproj mmproj-qwen2.5-vl-7b.gguf --host 0.0.0.0 -b 32
```

### Problem description & steps to reproduce

main: server is listening on http://0.0.0.0:8080 - starting the main loop
srv  update_slots: all slots are idle
srv  params_from_: Chat format: Content-only
slot launch_slot_: id  0 | task 0 | processing task
slot update_slots: id  0 | task 0 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 99
slot update_slots: id  0 | task 0 | kv cache rm [0, end)
slot update_slots: id  0 | task 0 | prompt processing progress, n_past = 4, n_tokens = 4, progress = 0.040404
slot update_slots: id  0 | task 0 | kv cache rm [4, end)
encoding image slice...
srv  process_chun: processing image...
image slice encoded in 24194 ms
decoding image batch 1/10, n_tokens_batch = 64
image decoded (batch 1/10) in 456 ms
decoding image batch 2/10, n_tokens_batch = 64
image decoded (batch 2/10) in 335 ms
decoding image batch 3/10, n_tokens_batch = 64
image decoded (batch 3/10) in 336 ms
decoding image batch 4/10, n_tokens_batch = 64
image decoded (batch 4/10) in 338 ms
decoding image batch 5/10, n_tokens_batch = 64
image decoded (batch 5/10) in 339 ms
decoding image batch 6/10, n_tokens_batch = 64
image decoded (batch 6/10) in 338 ms
decoding image batch 7/10, n_tokens_batch = 64
image decoded (batch 7/10) in 340 ms
decoding image batch 8/10, n_tokens_batch = 64
image decoded (batch 8/10) in 340 ms
decoding image batch 9/10, n_tokens_batch = 64
image decoded (batch 9/10) in 342 ms
decoding image batch 10/10, n_tokens_batch = 53
image decoded (batch 10/10) in 339 ms
srv  process_chun: image processed in 27698 ms
slot update_slots: id  0 | task 0 | prompt processing progress, n_past = 69, n_tokens = 64, progress = 0.696970
slot update_slots: id  0 | task 0 | kv cache rm [69, end)
slot update_slots: id  0 | task 0 | prompt processing progress, n_past = 99, n_tokens = 30, progress = 1.000000
slot update_slots: id  0 | task 0 | prompt done, n_past = 99, n_tokens = 30
slot      release: id  0 | task 0 | stop processing: n_past = 205, truncated = 0
slot print_timing: id  0 | task 0 |
prompt eval time =   28943.24 ms /    99 tokens (  292.36 ms per token,     3.42 tokens per second)
       eval time =    9509.81 ms /   107 tokens (   88.88 ms per token,    11.25 tokens per second)
      total time =   38453.05 ms /   206 tokens
srv  update_slots: all slots are idle

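**With `-b 32`, image decoding still reports `n_tokens_batch = 64` for every full batch (only the final batch, 10/10, is smaller at 53 tokens), so the batch size setting does not appear to be applied to image batches.**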
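To make the mismatch concrete, here is a minimal sketch of the arithmetic behind the log. It assumes that `-b 32` is meant to cap the image decode batches as well, which is what this report expects but has not been confirmed against the llama.cpp source. `split_into_batches` is a hypothetical helper written for illustration, not a llama.cpp function.

```python
def split_into_batches(n_tokens: int, n_batch: int) -> list[int]:
    """Split n_tokens into consecutive batches of at most n_batch tokens.

    Hypothetical helper for illustration only; not part of llama.cpp.
    """
    if n_batch <= 0:
        raise ValueError("n_batch must be positive")
    full, rest = divmod(n_tokens, n_batch)
    return [n_batch] * full + ([rest] if rest else [])


# Batch sizes taken from the log: nine batches of 64 and a final batch of 53.
observed = [64] * 9 + [53]
n_image_tokens = sum(observed)

# What the log would show if -b 32 applied to image batches (assumption).
expected = split_into_batches(n_image_tokens, 32)

print(f"image tokens decoded:   {n_image_tokens}")
print(f"observed batches:       {len(observed)} (max size {max(observed)})")
print(f"expected with n_batch=32: {len(expected)} (last batch {expected[-1]})")
```

The observed split matches a batch size of 64 exactly, which suggests the image path uses a different limit than the one passed with `-b`.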



### First Bad Commit

_No response_

### Relevant log output

```shell

```

Misc. bug: #14046

Name and Version

Operating systems

Which llama.cpp modules do you know to be affected?

Command line

Problem description & steps to reproduce

First Bad Commit

Relevant log output

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Misc. bug: #14046

Description

Name and Version

Operating systems

Which llama.cpp modules do you know to be affected?

Command line

Problem description & steps to reproduce

First Bad Commit

Relevant log output

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions