Misc. bug: #14046

Open
@BrightHai

Description

Name and Version

version: 147 (ec9e030)
built with Apple clang version 15.0.0 (clang-1500.0.40.1) for arm64-apple-darwin24.5.0

Operating systems

No response

Which llama.cpp modules do you know to be affected?

No response

Command line

llama-server -m qwen2.5-vl-7b.gguf --mmproj mmproj-qwen2.5-vl-7b.gguf --host 0.0.0.0 -b 32

The batch size is set to 32 with -b 32.

Problem description & steps to reproduce

main: server is listening on http://0.0.0.0:8080 - starting the main loop
srv update_slots: all slots are idle
srv params_from_: Chat format: Content-only
slot launch_slot_: id 0 | task 0 | processing task
slot update_slots: id 0 | task 0 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 99
slot update_slots: id 0 | task 0 | kv cache rm [0, end)
slot update_slots: id 0 | task 0 | prompt processing progress, n_past = 4, n_tokens = 4, progress = 0.040404
slot update_slots: id 0 | task 0 | kv cache rm [4, end)
encoding image slice...
srv process_chun: processing image...
image slice encoded in 24194 ms
decoding image batch 1/10, n_tokens_batch = 64
image decoded (batch 1/10) in 456 ms
decoding image batch 2/10, n_tokens_batch = 64
image decoded (batch 2/10) in 335 ms
decoding image batch 3/10, n_tokens_batch = 64
image decoded (batch 3/10) in 336 ms
decoding image batch 4/10, n_tokens_batch = 64
image decoded (batch 4/10) in 338 ms
decoding image batch 5/10, n_tokens_batch = 64
image decoded (batch 5/10) in 339 ms
decoding image batch 6/10, n_tokens_batch = 64
image decoded (batch 6/10) in 338 ms
decoding image batch 7/10, n_tokens_batch = 64
image decoded (batch 7/10) in 340 ms
decoding image batch 8/10, n_tokens_batch = 64
image decoded (batch 8/10) in 340 ms
decoding image batch 9/10, n_tokens_batch = 64
image decoded (batch 9/10) in 342 ms
decoding image batch 10/10, n_tokens_batch = 53
image decoded (batch 10/10) in 339 ms
srv process_chun: image processed in 27698 ms
slot update_slots: id 0 | task 0 | prompt processing progress, n_past = 69, n_tokens = 64, progress = 0.696970
slot update_slots: id 0 | task 0 | kv cache rm [69, end)
slot update_slots: id 0 | task 0 | prompt processing progress, n_past = 99, n_tokens = 30, progress = 1.000000
slot update_slots: id 0 | task 0 | prompt done, n_past = 99, n_tokens = 30
slot release: id 0 | task 0 | stop processing: n_past = 205, truncated = 0
slot print_timing: id 0 | task 0 |
prompt eval time = 28943.24 ms / 99 tokens ( 292.36 ms per token, 3.42 tokens per second)
eval time = 9509.81 ms / 107 tokens ( 88.88 ms per token, 11.25 tokens per second)
total time = 38453.05 ms / 206 tokens
srv update_slots: all slots are idle

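Expected: with -b 32, each image decode batch contains at most 32 tokens.
Actual: n_tokens_batch is 64 for every full image batch (the final batch, 10/10, holds the remaining 53 tokens), so the -b setting appears to be ignored when decoding the image.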
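
To make the mismatch concrete, here is a minimal Python sketch of the batch arithmetic implied by the log above. It is not llama.cpp code: `split_into_batches` is a hypothetical helper, and the total of 629 image tokens is inferred from the log (9 batches of 64 plus one of 53). It assumes that -b is meant to cap the image decode batches as well.

```python
def split_into_batches(n_tokens: int, batch_size: int) -> list[int]:
    """Return the size of each batch when n_tokens are split into
    chunks of at most batch_size (hypothetical helper, for illustration)."""
    if n_tokens < 0 or batch_size <= 0:
        raise ValueError("n_tokens must be >= 0 and batch_size must be > 0")
    full, rest = divmod(n_tokens, batch_size)
    return [batch_size] * full + ([rest] if rest else [])


# Total image tokens inferred from the log: 9 batches of 64 plus one of 53.
n_image_tokens = 9 * 64 + 53

observed = split_into_batches(n_image_tokens, 64)  # what the log shows
expected = split_into_batches(n_image_tokens, 32)  # what -b 32 would give

print(f"image tokens: {n_image_tokens}")
print(f"observed: {len(observed)} batches, last batch = {observed[-1]}")
print(f"expected: {len(expected)} batches, last batch = {expected[-1]}")
```

If -b 32 were honored for image decoding, the log would show 20 image batches (19 of 32 tokens and a final one of 21) instead of the 10 batches reported.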

First Bad Commit

No response

Relevant log output
