Drop at the start of generation #380

After generation starts, the server crashes. This only happens with Qwen3-30B-A3B, and I tried several different quants. Regular dense models work, including other dense Qwen3 models.
What could be the problem? I liked the speedup on dense models and thought MoE would fly.
But it doesn't work. It crashes without any error message; it just drops back to the command line when generation starts.

Windows 10, Microsoft Visual Studio 2022, main branch

cmake -B ./build -DGGML_CUDA=OFF -DGGML_BLAS=OFF
cmake --build ./build --config Release -j 16

./llama-server.exe -t 7 -c 4096 -m F:\llm\Qwen3-30B-A3B-Q5_K_M.gguf 
