Bug: CUDA error: an illegal memory access was encountered #425

Closed
@nux

Description

What happened?

I'm not sure whether this is a problem on my end or in ik_llama, but I get the following error as soon as prompt processing starts (model: ubergarm's deepseek-v3):

May 15 08:57:29 red llama-swap[80783]: INFO [ launch_slot_with_task] slot is processing task | tid="139638925832192" timestamp=1747317449 id_slot=0 id_task=3
May 15 08:57:29 red llama-swap[80783]: INFO [ update_slots] kv cache rm [p0, end) | tid="139638925832192" timestamp=1747317449 id_slot=0 id_task=3 p0=0
May 15 08:57:36 red kernel: NVRM: Xid (PCI:0000:01:00): 31, pid=80798, name=llama-server, Ch 00000008, intr 00000000. MMU Fault: ENGINE GRAPHICS GPC1 GPCCLIENT_T1_3 faulted @ 0x7e9f_4f200000. Fault is of type FAULT_PDE ACCESS_TYPE_VIRT_READ
May 15 08:57:36 red llama-swap[80783]: CUDA error: an illegal memory access was encountered
May 15 08:57:36 red llama-swap[80783]: current device: 0, in function ggml_backend_cuda_synchronize at /home/nux/dev/ik_llama.cpp/ggml/src/ggml-cuda.cu:3067
May 15 08:57:36 red llama-swap[80783]: cudaStreamSynchronize(cuda_ctx->stream())
May 15 08:57:36 red llama-swap[80783]: /home/nux/dev/ik_llama.cpp/ggml/src/ggml-cuda.cu:110: CUDA error
May 15 08:57:36 red kernel: llama-server[80906]: segfault at 204803fe0 ip 00007f00399189d7 sp 00007ffc4a6104f0 error 4 in libcuda.so.575.51.03[7f00395c5000+e97000] likely on CPU 11 (core 11, socket 0)
May 15 08:57:36 red kernel: Code: ef e8 9d c9 ca ff 83 3d 7e 57 2f 05 01 49 8b 1c 24 76 0a 8b 05 86 57 2f 05 85 c0 74 56 49 8b 44 24 10 41 8b 4c 24 24 48 8b 13 <8b> 00 41 39 c6 74 52 8b b3 40 40 00 00 48 89 f0 89 8c b3 44 40 00
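
For triage, the kernel `NVRM: Xid` line above carries the most specific details: the Xid code (31, which here accompanies an MMU fault), the faulting process, the faulting address, and the access type. Below is a minimal sketch for pulling those fields out of a journal line. The function name `parse_xid_line` and the regular expression are illustrative assumptions based only on the line format seen in this log; they are not part of ik_llama.cpp or any NVIDIA tool, and other driver versions may format the line differently.

```python
import re

# Pattern derived from the single Xid line in this report (assumption:
# other driver versions or fault types may use a different layout).
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+), "
    r"pid=(?P<pid>\d+), name=(?P<name>[^,]+),"
    r".*?faulted @ (?P<addr>0x[0-9a-fA-F_]+)\."
    r" Fault is of type (?P<fault>\S+) (?P<access>\S+)"
)


def parse_xid_line(line):
    """Return the Xid fields as a dict, or None if the line does not match."""
    m = XID_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["xid"] = int(fields["xid"])
    fields["pid"] = int(fields["pid"])
    # The driver prints the address with an underscore separator
    # (0x7e9f_4f200000); strip it before converting to an integer.
    fields["addr"] = int(fields["addr"].replace("_", ""), 16)
    return fields


line = (
    "May 15 08:57:36 red kernel: NVRM: Xid (PCI:0000:01:00): 31, pid=80798, "
    "name=llama-server, Ch 00000008, intr 00000000. MMU Fault: ENGINE GRAPHICS "
    "GPC1 GPCCLIENT_T1_3 faulted @ 0x7e9f_4f200000. Fault is of type FAULT_PDE "
    "ACCESS_TYPE_VIRT_READ"
)
info = parse_xid_line(line)
print(info["xid"], info["name"], hex(info["addr"]), info["fault"], info["access"])
```

Note that the CUDA error itself is reported at `cudaStreamSynchronize`, which is only where the asynchronous failure surfaced; the kernel that performed the illegal read was launched earlier on that stream.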

Name and Version

./build/bin/llama-server --version
version: 3697 (34ae71c)
built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu

What operating system are you seeing the problem on?

No response
