Fix export of all quantized transformer models #1654
Conversation
Great catch @eldarkurtic, this should also fix our broken tests on main. Thank you!
4de9f63 to e066230
Accidentally pushed changes from this PR (#1630) as well to enable …
Failing tests are unrelated; investigating separately, but unable to reproduce outside of GHA. Merging this PR to unblock functionality.
Our current transformers export-to-onnx pipeline doesn't work for quantized models. To reproduce: try to export any quantized model with the latest versions of sparseml and neuralmagic/transformers.
There are two reasons:
1. `resolved_archive_file` is passed as `None` instead of `[]` (see sparseml/src/sparseml/transformers/sparsification/trainer.py, line 690 in b73a173).
2. `self.model._load_pretrained_model` from transformers returns 6 items (see https://github.com/neuralmagic/transformers/blob/0798c9e3b743a7e5c552f943a1a7d52ff63bbffb/src/transformers/modeling_utils.py#L3323), while our interface only accepts 5 (sparseml/src/sparseml/transformers/sparsification/trainer.py, line 686 in b73a173).
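
For context, here is a minimal sketch (not the actual sparseml trainer code) of how such a call site can accommodate both fixes: pass `[]` rather than `None` for `resolved_archive_file`, and unpack the sixth return value that newer transformers versions return from `_load_pretrained_model`. The `reload_model_state` wrapper, its parameters, and the exact return order are assumptions for illustration and may differ across transformers versions:

```python
# Hypothetical wrapper (illustration only, not the actual sparseml trainer code).
# Assumes a transformers version whose PreTrainedModel._load_pretrained_model
# returns 6 values: (model, missing_keys, unexpected_keys, mismatched_keys,
# offload_index, error_msgs).
from typing import Dict

import torch
from transformers import PreTrainedModel


def reload_model_state(
    model: PreTrainedModel, load_path: str, state_dict: Dict[str, torch.Tensor]
) -> PreTrainedModel:
    (
        model,
        missing_keys,
        unexpected_keys,
        mismatched_keys,
        offload_index,  # 6th item returned by newer transformers; accepting it avoids the unpack error
        error_msgs,
    ) = model._load_pretrained_model(
        model=model,
        state_dict=state_dict,
        loaded_keys=list(state_dict.keys()),
        resolved_archive_file=[],  # [] rather than None, per the fix described above
        pretrained_model_name_or_path=load_path,
    )
    if error_msgs:
        raise RuntimeError(f"Error(s) while reloading model state: {error_msgs}")
    return model
```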