Environment:
- AMD CPU (t3a.2xlarge on AWS)
- TEI image: cpu-sha-bedb2e5
- Model: Qwen/Qwen3-Embedding-0.6B
Error:
Intel MKL ERROR: Parameter 8 was incorrect on entry to SGEMM
Intel MKL ERROR: Parameter 13 was incorrect on entry to SGEMM
What we've tried:
- Set LD_PRELOAD=/usr/local/libfakeintel.so (the fake Intel library is present)
- Set MKL_DEBUG_CPU_TYPE=5 (force AVX2)
- Reduced batch sizes and concurrency
- Tried the ONNX version, which has different compatibility issues
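For reproduction, the environment-variable workarounds above can be sketched as a container launch command. This is only an illustration: the helper name `build_docker_cmd`, the `ghcr.io/huggingface/text-embeddings-inference` registry path, and the 8080:80 port mapping are assumptions and are not taken from the report. Only the image tag, model ID, and the two variables come from it.

```python
import shlex


def build_docker_cmd(image, model_id, env, host_port=8080):
    """Build a `docker run` argument list (hypothetical helper, for illustration)."""
    cmd = ["docker", "run", "--rm", "-p", f"{host_port}:80"]
    # Pass each workaround as a container environment variable.
    for key, value in env.items():
        cmd += ["-e", f"{key}={value}"]
    # Arguments after the image name are forwarded to the TEI launcher.
    cmd += [image, "--model-id", model_id]
    return cmd


# Values from the report; the registry prefix is an assumption.
ENV = {
    "LD_PRELOAD": "/usr/local/libfakeintel.so",
    "MKL_DEBUG_CPU_TYPE": "5",
}
IMAGE = "ghcr.io/huggingface/text-embeddings-inference:cpu-sha-bedb2e5"

cmd = build_docker_cmd(IMAGE, "Qwen/Qwen3-Embedding-0.6B", ENV)
print(shlex.join(cmd))
```

The script only prints the command, so it can be reviewed and adjusted before being run on the affected instance.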
Root Cause:
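The SGEMM errors come from Intel MKL's argument validation, which means invalid parameters are being passed to its BLAS routines. Assuming the parameter numbers follow the standard Fortran SGEMM argument order, parameter 8 is LDA and parameter 13 is LDC, so MKL is rejecting the leading dimensions given for matrices A and C. This appears to be a bug in the Candle backend's Qwen3 CPU implementation when compiled with Intel MKL, not just an AMD CPU detection issue.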
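To make the parameter numbers concrete, here is a minimal sketch of the argument checks performed by the reference (Fortran, column-major) BLAS SGEMM. It assumes MKL applies equivalent checks and uses the same numbering. The function name `sgemm_first_bad_param` and the shapes in the usage lines are hypothetical and are not taken from the Candle code.

```python
def sgemm_first_bad_param(transa, transb, m, n, k, lda, ldb, ldc):
    """Return the 1-based index of the first invalid SGEMM argument, or 0 if all pass.

    Argument order: TRANSA(1), TRANSB(2), M(3), N(4), K(5), ALPHA(6),
    A(7), LDA(8), B(9), LDB(10), BETA(11), C(12), LDC(13).
    """
    valid_ops = {"N", "T", "C"}
    ta, tb = transa.upper(), transb.upper()
    if ta not in valid_ops:
        return 1
    if tb not in valid_ops:
        return 2
    if m < 0:
        return 3
    if n < 0:
        return 4
    if k < 0:
        return 5
    # Rows of A and B as stored, which depends on whether they are transposed.
    nrowa = m if ta == "N" else k
    nrowb = k if tb == "N" else n
    if lda < max(1, nrowa):
        return 8
    if ldb < max(1, nrowb):
        return 10
    if ldc < max(1, m):
        return 13
    return 0


# Hypothetical shapes, for illustration only.
print(sgemm_first_bad_param("N", "N", 4, 3, 2, lda=4, ldb=2, ldc=4))  # 0
print(sgemm_first_bad_param("N", "N", 4, 3, 2, lda=2, ldb=2, ldc=4))  # 8
print(sgemm_first_bad_param("N", "N", 4, 3, 2, lda=4, ldb=2, ldc=2))  # 13
```

Note that a leading dimension of 0 fails these checks even when the matching matrix dimension is 0, because each leading dimension must be at least 1.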
Potential Solutions:
- Use a TEI build compiled with OpenBLAS instead of Intel MKL
- Fix the matrix dimension bug in the Candle backend code
- Use a working custom image without MKL dependencies