Fix export of all quantized transformer models #1654

eldarkurtic · 2023-07-06T20:11:10Z

Our current transformers export-to-onnx pipeline doesn't work for quantized models. To reproduce: try to export any quantized model with the latest versions of sparseml and neuralmagic/transformers.

There are two reasons:

we are passing in resolved_archive_file=[] at

sparseml/src/sparseml/transformers/sparsification/trainer.py

Line 690 in b73a173

resolved_archive_file=[],

which causes error in HF library here: https://github.com/neuralmagic/transformers/blob/0798c9e3b743a7e5c552f943a1a7d52ff63bbffb/src/transformers/modeling_utils.py#L3150 . We can fix this by passing None instead of [].
the function self.model._load_pretrained_model from transformers returns 6 items (see https://github.com/neuralmagic/transformers/blob/0798c9e3b743a7e5c552f943a1a7d52ff63bbffb/src/transformers/modeling_utils.py#L3323), and in our interface we only accept 5

sparseml/src/sparseml/transformers/sparsification/trainer.py

Line 686 in b73a173

_, missing, unexpected, _, _ = self.model._load_pretrained_model(

which also raises an issue while unpacking the returned tuple from their function.

bfineran

great catch @eldarkurtic this should also fix our broken tests on main. Thank you!

eldarkurtic · 2023-07-06T20:34:13Z

Accidentally pushed changes from this PR (#1630) as well to enable --trust_remote_code, so we can close that PR by merging this one into main.

bfineran · 2023-07-07T17:52:12Z

failing tests are unrelated - investigating separately but unable to reproduce outside of GHA. Merging this PR to unblock functionality

eldarkurtic and others added 9 commits June 16, 2023 10:05

Expose trust_remote_code flag for HF-transformers

aa08b19

Reload big model with multiple state dict files

80620f1

Add description for reload func

d19c550

Merge branch 'main' into load_truly_LLMs

44abc6b

Merge branch 'main' into load_truly_LLMs

a952e94

Merge branch 'main' into fix-HF-export

b908e3a

Merge branch 'fix-HF-export' of github.com:eldarkurtic/sparseml

2cd5a50

Merge branch 'main' of github.com:neuralmagic/sparseml

537afd2

handle new HF interface

e066230

bfineran previously approved these changes Jul 6, 2023

View reviewed changes

eldarkurtic dismissed bfineran’s stale review via 4de9f63 July 6, 2023 20:27

eldarkurtic force-pushed the fix-newHF-quant-export branch 2 times, most recently from 4de9f63 to e066230 Compare July 6, 2023 20:28

eldarkurtic requested review from bfineran, a team, KSGulin and robertgshaw2-redhat and removed request for a team July 6, 2023 20:36

bfineran approved these changes Jul 7, 2023

View reviewed changes

bfineran merged commit 4ec5133 into neuralmagic:main Jul 7, 2023

eldarkurtic mentioned this pull request Jul 7, 2023

[Required for MPT models] Expose trust_remote_code flag for HF-transformers #1630

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix export of all quantized transformer models #1654

Fix export of all quantized transformer models #1654

eldarkurtic commented Jul 6, 2023

bfineran left a comment

eldarkurtic commented Jul 6, 2023 •

edited

Loading

bfineran commented Jul 7, 2023

Fix export of all quantized transformer models #1654

Fix export of all quantized transformer models #1654

Conversation

eldarkurtic commented Jul 6, 2023

bfineran left a comment

Choose a reason for hiding this comment

eldarkurtic commented Jul 6, 2023 • edited Loading

bfineran commented Jul 7, 2023

eldarkurtic commented Jul 6, 2023 •

edited

Loading