Feature request: Support for quantization #113
One could use AQT for this, if penzai would expose the …
Agreed, this would be a very useful feature! I think it should be pretty easy to prototype something like this without directly changing Penzai's implementation, because Penzai is designed to make it easy to hot-swap model components. One implementation strategy:
If this works, it might make sense to add the … (I probably won't have much bandwidth to experiment with this myself, but contributions are welcome!)
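The hot-swap idea above could be prototyped roughly like this. Note this is a pure-Python sketch under stated assumptions: `Linear`, `QuantizedLinear`, and `quantize_model` are hypothetical stand-ins for illustration, not Penzai's real layer classes or selector API.

```python
# Sketch of the hot-swap strategy: walk a (flat) model and replace each
# full-precision layer with a quantized twin. All names here are
# hypothetical stand-ins, not Penzai's actual API.

class Linear:
    def __init__(self, weights):
        self.weights = weights  # plain list of floats, standing in for an array

    def __call__(self, x):
        return sum(w * xi for w, xi in zip(self.weights, x))


class QuantizedLinear:
    """Stores weights as int8-range integers plus one scale; dequantizes on the fly."""

    def __init__(self, linear):
        # Symmetric absmax scale; fall back to 1.0 if all weights are zero.
        scale = max(abs(w) for w in linear.weights) / 127 or 1.0
        self.scale = scale
        self.qweights = [round(w / scale) for w in linear.weights]

    def __call__(self, x):
        # Dequantize lazily during the forward pass.
        return sum(q * self.scale * xi for q, xi in zip(self.qweights, x))


def quantize_model(layers):
    # Replace every Linear with its quantized counterpart; leave other layers alone.
    return [QuantizedLinear(l) if isinstance(l, Linear) else l for l in layers]
```

In Penzai itself, the replacement step would presumably use its tree-selector machinery rather than a list comprehension, but the shape of the approach (find layers, build quantized equivalents from their weights, substitute them) should carry over.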
I had the same thought! Your naming is better, though :) I'll try to implement the AQT layers when I find the time. But I think doing so will involve a lot of code copying, which is not ideal. If …
Are you accepting contributions from the community? I would love to work on this issue.
It looks like using AQT directly is a bit trickier than I thought: AQT objects carry around calibration state, and the AQT code generally seems unfinished and abandoned. I'll see if I can implement some simple post-training quantization myself, but I can't guarantee that I'll find enough time to do so. As inspiration, I think this section of the AQT README and maybe this outdated user guide for Flax might be helpful. @demoncoder-crypto I'm also just a community contributor, so I'm sure contributions would be welcome. Let me know if you make progress on this!
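For reference, the simplest post-training scheme one might start from is symmetric absmax quantization: scale by the largest absolute weight, round to a signed integer range, and multiply back on load. This is a minimal sketch (the function names are made up for illustration, and this is not AQT's actual algorithm):

```python
# Minimal symmetric absmax post-training quantization sketch.
# quantize/dequantize are illustrative names, not from any library.

def quantize(values, bits=8):
    """Map floats to signed ints in [-(2**(bits-1) - 1), 2**(bits-1) - 1]."""
    qmax = 2 ** (bits - 1) - 1
    # One scale per tensor; fall back to 1.0 if every value is zero.
    scale = max(abs(v) for v in values) / qmax or 1.0
    return [round(v / scale) for v in values], scale


def dequantize(qvalues, scale):
    """Recover approximate floats; error is at most scale / 2 per value."""
    return [q * scale for q in qvalues]
```

Because rounding is the only lossy step, the round-trip error per weight is bounded by half the scale, which is what makes this a reasonable first baseline before trying per-channel scales or calibration-based schemes.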
@demoncoder-crypto This looks like the successor to AQT and might be interesting to look into: https://github.com/google/qwix
It'd be great if penzai supported model quantization out of the box. I know this is a lot of work to implement, but right now the lack of quantization support is the main reason why I wouldn't want to fine-tune models with penzai.