Hi, I'm trying to run Qwen2.5-VL-3B on my smartphone. I want to quantize the weights of the mmproj to 4 bits (e.g. Q4_0 or IQ4_NL), but only Q8_0 and F16 versions are provided on Hugging Face. Currently the Python script convert_hf_to_gguf.py cannot apply 4-bit quantization, and llama-quantize cannot be used directly either, since it does not recognize the separate CLIP architecture.
I'm wondering what the proper way to implement this is. Is there any roadmap for quantization support for ViT encoders?
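For reference, here is a rough sketch of the direction I was considering: re-quantizing the existing F16 mmproj offline with llama.cpp's gguf Python package (gguf-py). I'm not sure this is the intended path; the file names are placeholders, gguf.quants has a NumPy Q4_0 quantizer but may not cover IQ4_NL, and copying the KV metadata is only noted in a comment rather than implemented.

```python
# requantize_mmproj.py -- rough sketch, untested end to end.
# Requires the gguf Python package from llama.cpp (gguf-py).
import numpy as np
from gguf import GGUFReader, GGUFWriter, GGMLQuantizationType
from gguf.quants import quantize, dequantize

SRC = "mmproj-Qwen2.5-VL-3B-f16.gguf"   # placeholder file names
DST = "mmproj-Qwen2.5-VL-3B-q4_0.gguf"
TARGET = GGMLQuantizationType.Q4_0       # IQ4_NL may lack a NumPy quantizer

reader = GGUFReader(SRC)
writer = GGUFWriter(DST, arch="clip")    # mmproj files use the "clip" architecture

# NOTE: the original KV metadata (image size, patch size, projector type, ...)
# must be copied into the writer as well; gguf-py/scripts/gguf_new_metadata.py
# shows how to iterate reader.fields. Omitted here for brevity.

for t in reader.tensors:
    data = dequantize(t.data, t.tensor_type)  # back to float32
    # Only quantize large 2D weight matrices whose row size fits the Q4_0
    # block size (32); keep norms, biases, etc. in full precision, similar
    # to what llama-quantize does for the language-model tensors.
    if data.ndim == 2 and data.shape[-1] % 32 == 0:
        qdata = quantize(data, TARGET)
        writer.add_tensor(t.name, qdata, raw_dtype=TARGET)
    else:
        writer.add_tensor(t.name, data.astype(np.float32))

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```

Even if something like this produces a valid GGUF, I'm not certain the clip/mtmd loader accepts 4-bit tensor types at runtime, which is part of why I'm asking about the roadmap.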