Description
LocalAI version:
Environment, CPU architecture, OS, and Version:
WSL Debian on Windows 10, using the GitHub release binaries.
CPU: 12900K Intel
GPU: RTX 3090
RAM: 128 GB
Describe the bug
I can't seem to load Nous-Hermes (the 4-bit GPTQ version), which I converted manually to ggml using the llama.cpp convert script (I tested the output in other projects, where it works fine).
I also have other models that do work in LocalAI, so the problem is somehow specific to this one!
To Reproduce
My models:
> curl http://localhost:8080/v1/models -H "Content-Type: application/json"
# {"object":"list","data":[{"id":"ggml-model-q4_0.bin","object":"model"},{"id":"nous-4bit-32g.bin","object":"model"}]}
From the client:
> curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{
"model": "nous-4bit-32g.bin",
"prompt": "A long time ago in a galaxy far, far away",
"temperature": 0.7
}'
# {"error":{"code":500,"message":"could not load model - all backends returned error: 11 errors occurred:
* failed loading model
* failed loading model
* failed loading model
* failed loading model
* failed loading model
* failed loading model
* failed loading model
* failed loading model
* failed loading model
* failed loading model
* failed loading model
","type":""}}
From the server:
> ./local-ai-avx2-Linux-x86_64
Starting LocalAI using 4 threads, with models path: /mnt/c/Users/User/dev/go-skynet/models
┌───────────────────────────────────────────────────┐
│ Fiber v2.47.0 │
│ http://127.0.0.1:8080 │
│ (bound on host 0.0.0.0 and port 8080) │
│ │
│ Handlers ............ 23 Processes ........... 1 │
│ Prefork ....... Disabled PID ................ 38 │
└───────────────────────────────────────────────────┘
llama.cpp: loading model from /mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin
# hangs for a few minutes
error loading model: unexpectedly reached end of file
llama_init_from_file: failed to load model
load_gpt4all_model: error 'Success'
gpt_neox_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
gpt_neox_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
gpt_neox_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
bert_load_from_file: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
bert_load_from_file: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
bert_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
gptj_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
gptj_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
gptj_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
gpt2_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
gpt2_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
gpt2_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
dollyv2_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
dollyv2_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
dolly_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
falcon_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
falcon_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
falcon_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
mpt_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
mpt_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
mpt_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
replit_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
replit_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
replit_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
starcoder_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
starcoder_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
starcoder_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
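Since most backends reject the file with "bad magic" while llama.cpp reads past the header and then hits end of file, looking at the file header directly may help tell a format mismatch from a truncated file. Below is a minimal sketch, not part of LocalAI or llama.cpp: the function name `describe_model_file` is hypothetical, and the magic values in `KNOWN_MAGICS` are an assumption based on what I recall from ggml/llama.cpp sources of this period, so they should be checked against the version actually in use.

```python
import struct
import sys
from pathlib import Path

# Assumption: ggml-family magics, stored on disk as a little-endian uint32.
# Verify these against the ggml / llama.cpp revision you are running.
KNOWN_MAGICS = {
    0x67676D6C: "ggml (unversioned)",
    0x67676D66: "ggmf (versioned)",
    0x67676A74: "ggjt (versioned)",
}
UNVERSIONED = 0x67676D6C


def describe_model_file(path):
    """Return (magic_name, version_or_None, size_in_bytes) for a model file."""
    p = Path(path)
    size = p.stat().st_size
    with p.open("rb") as f:
        header = f.read(8)
    if len(header) < 4:
        return ("too short to contain a magic", None, size)
    (magic,) = struct.unpack("<I", header[:4])
    name = KNOWN_MAGICS.get(magic, f"unknown (0x{magic:08x})")
    version = None
    # Only the versioned formats carry a uint32 version right after the magic.
    if magic in KNOWN_MAGICS and magic != UNVERSIONED and len(header) == 8:
        (version,) = struct.unpack("<I", header[4:8])
    return (name, version, size)


if __name__ == "__main__":
    for arg in sys.argv[1:]:
        print(arg, describe_model_file(arg))
```

Running it on both `ggml-model-q4_0.bin` (which loads) and `nous-4bit-32g.bin` (which does not) and comparing the reported magic, version, and size would show whether the two files even use the same container format.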
Expected behavior
Either better error reporting or for it to work :)
Logs
Ah... I should probably have read that first 😅.
Here is the same procedure as described above, but run with the `--debug` flag.
5:06PM DBG Request received: {"model":"nous-4bit-32g.bin","file":"","language":"","response_format":"","size":"","prompt":"A long time ago in a galaxy far, far away","instruction":"","input":null,"stop":null,"messages":null,"stream":false,"echo":false,"top_p":0,"top_k":0,"temperature":0.7,"max_tokens":0,"n":0,"batch":0,"f16":false,"ignore_eos":false,"repeat_penalty":0,"n_keep":0,"mirostat_eta":0,"mirostat_tau":0,"mirostat":0,"frequency_penalty":0,"tfz":0,"seed":0,"mode":0,"step":0,"typical_p":0}
5:06PM DBG `input`: &{Model:nous-4bit-32g.bin File: Language: ResponseFormat: Size: Prompt:A long time ago in a galaxy far, far away Instruction: Input:<nil> Stop:<nil> Messages:[] Stream:false Echo:false TopP:0 TopK:0 Temperature:0.7 Maxtokens:0 N:0 Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 Seed:0 Mode:0 Step:0 TypicalP:0}
5:06PM DBG Parameter Config: &{OpenAIRequest:{Model:nous-4bit-32g.bin File: Language: ResponseFormat: Size: Prompt:<nil> Instruction: Input:<nil> Stop:<nil> Messages:[] Stream:false Echo:false TopP:0.7 TopK:80 Temperature:0.7 Maxtokens:512 N:0 Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 Seed:0 Mode:0 Step:0 TypicalP:0} Name: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:512 F16:false Threads:4 Debug:true Roles:map[] Embeddings:false Backend: TemplateConfig:{Completion: Chat: Edit:} MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:0 MMap:false MMlock:false LowVRAM:false TensorSplit: MainGPU: ImageGenerationAssets: PromptCachePath: PromptCacheAll:false PromptCacheRO:false PromptStrings:[A long time ago in a galaxy far, far away] InputStrings:[] InputToken:[]}
5:06PM DBG Loading model 'nous-4bit-32g.bin' greedly
5:06PM DBG [llama] Attempting to load
5:06PM DBG Loading model llama from nous-4bit-32g.bin
5:06PM DBG Loading model in memory from file: /mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin
llama.cpp: loading model from /mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin
error loading model: unexpectedly reached end of file
llama_init_from_file: failed to load model
5:06PM DBG [llama] Fails: failed loading model
5:06PM DBG [gpt4all] Attempting to load
5:06PM DBG Loading model gpt4all from nous-4bit-32g.bin
5:06PM DBG Loading model in memory from file: /mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin
load_gpt4all_model: error 'Success'
5:06PM DBG [gpt4all] Fails: failed loading model
5:06PM DBG [gptneox] Attempting to load
5:06PM DBG Loading model gptneox from nous-4bit-32g.bin
5:06PM DBG Loading model in memory from file: /mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin
gpt_neox_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
gpt_neox_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
gpt_neox_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
5:06PM DBG [gptneox] Fails: failed loading model
5:06PM DBG [bert-embeddings] Attempting to load
5:06PM DBG Loading model bert-embeddings from nous-4bit-32g.bin
5:06PM DBG Loading model in memory from file: /mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin
bert_load_from_file: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
bert_load_from_file: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
bert_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
5:06PM DBG [bert-embeddings] Fails: failed loading model
5:06PM DBG [gptj] Attempting to load
5:06PM DBG Loading model gptj from nous-4bit-32g.bin
5:06PM DBG Loading model in memory from file: /mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin
gptj_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
gptj_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
gptj_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
5:06PM DBG [gptj] Fails: failed loading model
5:06PM DBG [gpt2] Attempting to load
5:06PM DBG Loading model gpt2 from nous-4bit-32g.bin
5:06PM DBG Loading model in memory from file: /mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin
gpt2_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
gpt2_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
gpt2_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
5:06PM DBG [gpt2] Fails: failed loading model
5:06PM DBG [dolly] Attempting to load
5:06PM DBG Loading model dolly from nous-4bit-32g.bin
5:06PM DBG Loading model in memory from file: /mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin
dollyv2_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
dollyv2_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
dolly_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
5:06PM DBG [dolly] Fails: failed loading model
5:06PM DBG [falcon] Attempting to load
5:06PM DBG Loading model falcon from nous-4bit-32g.bin
5:06PM DBG Loading model in memory from file: /mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin
falcon_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
falcon_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
falcon_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
5:06PM DBG [falcon] Fails: failed loading model
5:06PM DBG [mpt] Attempting to load
5:06PM DBG Loading model mpt from nous-4bit-32g.bin
5:06PM DBG Loading model in memory from file: /mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin
mpt_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
mpt_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
mpt_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
5:06PM DBG [mpt] Fails: failed loading model
5:06PM DBG [replit] Attempting to load
5:06PM DBG Loading model replit from nous-4bit-32g.bin
5:06PM DBG Loading model in memory from file: /mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin
replit_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' - please wait ...
replit_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
replit_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
5:06PM DBG [replit] Fails: failed loading model
5:06PM DBG [starcoder] Attempting to load
5:06PM DBG Loading model starcoder from nous-4bit-32g.bin
5:06PM DBG Loading model in memory from file: /mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin
starcoder_model_load: loading model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
starcoder_model_load: invalid model file '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin' (bad magic)
starcoder_bootstrap: failed to load model from '/mnt/c/Users/User/dev/go-skynet/models/nous-4bit-32g.bin'
5:06PM DBG [starcoder] Fails: failed loading model
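To make the debug output above easier to scan, here is a rough sketch of the kind of condensed error report I had in mind: one line per backend with the most telling native message. It is only an illustration that parses the log format shown above; the function name `summarize` and the regular expressions are my own assumptions, not anything LocalAI provides.

```python
import re
import sys

# Patterns inferred from the debug log format shown above (an assumption).
ATTEMPT_RE = re.compile(r"DBG \[([^\]]+)\] Attempting to load")
FAIL_RE = re.compile(r"DBG \[([^\]]+)\] Fails: (.*)$")
DBG_RE = re.compile(r"^\d{1,2}:\d{2}(AM|PM) DBG ")
HINT_RE = re.compile(r"error|invalid")


def summarize(log_text):
    """Map each failed backend to one representative message.

    Preference order: the first native (non-DBG) line mentioning
    "error" or "invalid", else the first native line, else the
    reason given on the "Fails:" line itself.
    """
    summary = {}
    current, natives = None, []
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        attempt = ATTEMPT_RE.search(line)
        fail = FAIL_RE.search(line)
        if attempt:
            current, natives = attempt.group(1), []
        elif fail:
            telling = [n for n in natives if HINT_RE.search(n)]
            summary[fail.group(1)] = (telling or natives or [fail.group(2)])[0]
            current, natives = None, []
        elif current and not DBG_RE.match(line):
            # Output printed by the native backend between attempt and failure.
            natives.append(line)
    return summary


if __name__ == "__main__":
    for backend, message in summarize(sys.stdin.read()).items():
        print(f"{backend}: {message}")
```

Fed the log above on standard input, this should single out the llama backend's "unexpectedly reached end of file" message from the "bad magic" rejections reported by the other backends.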
Additional context