Feature Request: Support Cerebras BTLM #427
BTLM is Cerebras's 3B model that matches the performance of many 7B models. It would be amazing to be able to quantize it, because it would be so fast and good to run locally. It doesn't quite fit any of the existing architectures: it is based on CerebrasGPT but also uses ALiBi.

Blog: https://www.cerebras.net/machine-learning/btlm-3b-8k-7b-performance-in-a-3-billion-parameter-model/
HuggingFace model: https://huggingface.co/cerebras/btlm-3b-8k-base
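Since ALiBi is the part that sets BTLM apart from plain CerebrasGPT, here is a minimal PyTorch sketch of the standard ALiBi formulation for reference. This is not BTLM's or ggml's actual implementation, and the helper names are made up for illustration: attention logits get a fixed, per-head linear bias instead of learned position embeddings.

```python
import math
import torch

def alibi_slopes(n_heads: int) -> torch.Tensor:
    # Per-head slopes from the ALiBi paper: for n_heads a power of two,
    # head i gets slope 2^(-8 * (i + 1) / n_heads).
    def power_of_2_slopes(n: int) -> list:
        start = 2.0 ** (-8.0 / n)
        return [start ** (i + 1) for i in range(n)]

    if math.log2(n_heads).is_integer():
        return torch.tensor(power_of_2_slopes(n_heads))
    # Otherwise interleave slopes from the two nearest powers of two.
    closest = 2 ** math.floor(math.log2(n_heads))
    slopes = power_of_2_slopes(closest)
    slopes += power_of_2_slopes(2 * closest)[0::2][: n_heads - closest]
    return torch.tensor(slopes)

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    # bias[h, i, j] = slope[h] * (j - i) for j <= i, zero above the diagonal;
    # it is added to the attention logits before the causal softmax.
    pos = torch.arange(seq_len)
    rel = (pos[None, :] - pos[:, None]).tril()  # nonpositive lower triangle
    return alibi_slopes(n_heads)[:, None, None] * rel.float()
```

Because the bias depends only on relative distance, there are no position embedding tensors to convert, which is also why ALiBi models can extrapolate to sequence lengths longer than they were trained on.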
Comments
I am trying to give it a go. I have never ported a model before, so this is new to me, but so far it looks fun. I think I have the model conversion working; HF repo (mostly based on
It would be nice if someone experienced could tell me at a high level what comes next.
[Links: MODEL | model loading C++ WIP implementation file]
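As a rough illustration of how such a port usually starts (a hedged sketch, not the actual script from this thread): enumerate the checkpoint's tensors so each name and shape can be mapped onto a ggml tensor. The model id comes from the links above; BTLM ships a custom modeling file, so loading it needs trust_remote_code=True.

```python
from transformers import AutoModelForCausalLM

# Load the HF checkpoint; BTLM uses a custom modeling_btlm.py, hence
# trust_remote_code=True.
model = AutoModelForCausalLM.from_pretrained(
    "cerebras/btlm-3b-8k-base", trust_remote_code=True
)

# List every tensor with its shape and dtype, to plan the ggml mapping.
for name, tensor in model.state_dict().items():
    print(f"{name:60s} {tuple(tensor.shape)} {tensor.dtype}")
```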
Sorry for the ping 😃 @iboB @ggerganov
I'm not familiar with "SCB" tensors - you have to check how they are used in Python and understand their purpose.
@bornjre, I think the SCB tensors come from bitsandbytes quantization.
The Python implementation of this model can be found at https://huggingface.co/cerebras/btlm-3b-8k-base/blob/main/modeling_btlm.py. The SCB tensors are a result of HuggingFace-side quantization and would be converted as for any bitsandbytes-quantized model; they can be ignored. You can see that the SCB tensors are not present in the model here:
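In other words, a conversion script can simply filter those entries out. A minimal sketch, assuming the checkpoint is iterated as a name-to-tensor dict and that the bitsandbytes scale tensors are the only ones whose names end in "SCB":

```python
def is_bitsandbytes_scale(name: str) -> bool:
    # "SCB" tensors hold bitsandbytes int8 quantization scales; they carry
    # no model weights and are skipped during conversion.
    return name.endswith("SCB")

state_dict = model.state_dict()  # `model` as in the earlier loading sketch
for name, tensor in state_dict.items():
    if is_bitsandbytes_scale(name):
        continue
    # ... write `name` / `tensor` to the ggml output as usual ...
```

Note that the base checkpoint linked above is unquantized, so it should contain no SCB tensors in the first place.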