Support Int4 ONNX Export #1670


Merged: 15 commits into main, Jul 19, 2023
Conversation

Satrat

@Satrat Satrat commented Jul 13, 2023

Int4 QAT was working out of the box, but the ONNX export step was failing because ONNX only supports 8-bit integer quantization. The fix is to trick ONNX into thinking we are exporting 8-bit quantized weights by manually widening the range of the FakeQuantize modules. The weights themselves still fall within [-8, 7], and the original range boundaries remain accessible via the FakeQuantize.quant_min and FakeQuantize.quant_max attributes (these are deprecated by PyTorch in favor of FakeQuantize.activation_post_process.quant_min and FakeQuantize.activation_post_process.quant_max).
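Why the range widening is safe can be shown with a minimal plain-Python sketch (illustrative only, not the PR's implementation; the scale and weight values are made up): fake-quantized int4 weights take integer values in [-8, 7], so clamping them to the int8 range [-128, 127] instead is a no-op.

```python
def fake_quantize(x, scale, zero_point, quant_min, quant_max):
    """Reference fake-quantize: quantize, clamp to [quant_min, quant_max],
    then dequantize. Mirrors the math of a FakeQuantize module in plain Python."""
    q = round(x / scale) + zero_point
    q = max(quant_min, min(quant_max, q))
    return (q - zero_point) * scale

scale, zero_point = 0.1, 0                # hypothetical int4 quantization params
weights = [-0.8, -0.3, 0.0, 0.4, 0.7]     # all representable in the int4 range

for w in weights:
    int4_val = fake_quantize(w, scale, zero_point, -8, 7)        # original range
    widened = fake_quantize(w, scale, zero_point, -128, 127)     # "int8" range
    # For values already representable in int4, the wider clamp changes nothing,
    # which is why the exported weights are unaffected.
    assert int4_val == widened
```

Note this only holds because QAT has already constrained the weights to the int4 range; a value outside it would no longer be clamped after widening.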

Testing

Run tests/sparseml/pytorch/test_torch_to_onnx_exporter/test_export_4bit_model for example usage, or just run TorchToONNX.export() on any Int4 QAT model:

model = ...           # any Int4 QAT model
sample_batch = ...    # sample input batch used to trace the model
exporter = TorchToONNX(sample_batch)
exporter.export(model, "model.onnx")

Old Behavior

Export fails with error:

"name": "SymbolicValueError",
"message": "For (quant_min, quant_max), ONNX allows only (0, 127), (0, 255) and (-128, 127). Got (-8, 7)"
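The failure comes from an allow-list of (quant_min, quant_max) pairs applied during export. A hypothetical re-creation of that check (illustrative only; the actual check lives in PyTorch's ONNX symbolic functions):

```python
# Hypothetical stand-in for the range validation behind the
# SymbolicValueError above; not PyTorch's actual code.
ALLOWED_RANGES = {(0, 127), (0, 255), (-128, 127)}

def check_quant_range(quant_min, quant_max):
    if (quant_min, quant_max) not in ALLOWED_RANGES:
        raise ValueError(
            "For (quant_min, quant_max), ONNX allows only (0, 127), "
            f"(0, 255) and (-128, 127). Got ({quant_min}, {quant_max})"
        )

check_quant_range(-128, 127)   # passes: the widened range is accepted
try:
    check_quant_range(-8, 7)   # the raw int4 range is rejected
except ValueError as err:
    print(err)
```

This is why widening the FakeQuantize range to (-128, 127) is enough to make export succeed.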

New Behavior

Export completes without errors

@Satrat Satrat changed the title Support Int4 ONNX Export [WIP] Support Int4 ONNX Export Jul 13, 2023
@Satrat Satrat changed the title [WIP] Support Int4 ONNX Export Support Int4 ONNX Export Jul 13, 2023
@Satrat Satrat requested a review from bfineran July 17, 2023 18:57

@rahul-tuli rahul-tuli left a comment


Nice diff! LGTM pending comments

@Satrat Satrat requested a review from rahul-tuli July 19, 2023 14:33
@Satrat Satrat merged commit ab9a168 into main Jul 19, 2023
@Satrat Satrat deleted the int4-onnx-export branch July 19, 2023 19:13
4 participants