[Export][Transformers] Implementation of correctness validation #1935

dbogunowicz · 2024-01-03T11:37:08Z

Feature description

Implements the validate_correctness function, which asserts that, given the same input, the outputs of the torch and ONNX models agree. Agreement is checked by comparing the top-k predictions against the ground truth.
This check may fail for quantized models, because the torch and ONNX quantization ops round differently.

Testing

Tests are in place for transformers and image-classification models.
Note: tests for "generative transformers" will be added once the parallel PR #1938 is approved.

Example

For LLMs:

from sparseml.export.export import export
from huggingface_hub import snapshot_download

hf_model = "roneneldan/TinyStories-1M"
source_path = snapshot_download(hf_model)
target_path = "."
export(
    source_path=source_path,
    target_path=target_path,
    task='text-generation',
    num_export_samples=2,
    validate_correctness=True,
    data_args=dict(
        dataset_name="wikitext", dataset_config_name="wikitext-2-raw-v1"
    ),
)

Fetching 10 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 29.52it/s]
2024-01-05 20:00:13 sparseml.pytorch.image_classification.utils.helpers WARNING  Model: /home/damian/.cache/huggingface/hub/models--roneneldan--TinyStories-1M/snapshots/8cd14d5339178f1b285f55baee14a0deff7103ac/model.pth not an image classification model: [Errno 2] No such file or directory: '/home/damian/.cache/huggingface/hub/models--roneneldan--TinyStories-1M/snapshots/8cd14d5339178f1b285f55baee14a0deff7103ac/model.pth'
2024-01-05 20:00:13 sparseml.export.export INFO     Starting export for transformers model...
2024-01-05 20:00:13 sparseml.export.export INFO     Creating model for the export...
2024-01-05 20:00:13 sparseml.transformers.integration_helper_functions WARNING  trust_remote_code is set to False. It is possible, that the model will not be loaded correctly.
2024-01-05 20:00:14 sparseml.pytorch.model_load.helpers INFO     Loaded model from /home/damian/.cache/huggingface/hub/models--roneneldan--TinyStories-1M/snapshots/8cd14d5339178f1b285f55baee14a0deff7103ac with 3745984 total params. Of those there are 3609664 prunable params which have 0.0 avg sparsity.
2024-01-05 20:00:15 sparseml.pytorch.model_load.helpers INFO     dense model detected, all sparsification info: {"params_summary": {"total": 3745984, "sparse": 0, "sparsity_percent": 0.0, "prunable": 3609664, "prunable_sparse": 0, "prunable_sparsity_percent": 0.0, "quantizable": 3612736, "quantized": 0, "quantized_percent": 0.0}, "params_info": {"transformer.h.0.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.0.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.0.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.0.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.0.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.0.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.1.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.1.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.1.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.1.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.1.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.1.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.2.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.2.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.2.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.2.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.2.mlp.c_fc.weight": {"numel": 16384, 
"sparsity": 0.0, "quantized": false}, "transformer.h.2.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.3.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.3.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.3.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.3.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.3.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.3.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.4.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.4.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.4.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.4.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.4.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.4.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.5.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.5.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.5.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.5.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.5.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.5.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.6.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": 
false}, "transformer.h.6.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.6.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.6.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.6.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.6.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.7.attn.attention.k_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.7.attn.attention.v_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.7.attn.attention.q_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.7.attn.attention.out_proj.weight": {"numel": 4096, "sparsity": 0.0, "quantized": false}, "transformer.h.7.mlp.c_fc.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "transformer.h.7.mlp.c_proj.weight": {"numel": 16384, "sparsity": 0.0, "quantized": false}, "lm_head.weight": {"numel": 3216448, "sparsity": 0.0, "quantized": false}}}
Running tokenizer on dataset: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3760/3760 [00:00<00:00, 4770.75 examples/s]
Adding labels: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3760/3760 [00:00<00:00, 5633.39 examples/s]
2024-01-05 20:00:17 sparseml.core.logger INFO     Logging all SparseML modifier-level logs to sparse_logs/05-01-2024_20.00.17.log
No recipes were applied for /home/damian/.cache/huggingface/hub/models--roneneldan--TinyStories-1M/snapshots/8cd14d5339178f1b285f55baee14a0deff7103ac, check to make sure recipe(s) are stored in the model_path
2024-01-05 20:00:17 sparseml.export.export INFO     Created additional items that will be used for the export: ['trainer', 'tokenizer', 'input_names']
2024-01-05 20:00:17 sparseml.export.export INFO     Exporting model.onnx to ....
/nm/drive0/damian/sparseml/src/sparseml/pytorch/torch_to_onnx_exporter.py:132: UserWarning: Sample inputs passed into the ONNX exporter should be in the same order defined in the model forward function. Consider using OrderedDict for this purpose.
  warnings.warn(
/nm/drive0/damian/sparseml/venv/lib/python3.10/site-packages/transformers/models/gpt_neo/modeling_gpt_neo.py:557: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if batch_size <= 0:
/nm/drive0/damian/sparseml/venv/lib/python3.10/site-packages/transformers/models/gpt_neo/modeling_gpt_neo.py:196: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  mask_value = torch.tensor(mask_value, dtype=attn_weights.dtype).to(attn_weights.device)
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [FoldIdentityInitializers] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [FlattenQParams] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [UnwrapBatchNorms] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [DeleteTrivialOnnxAdds] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [ConstantsToInitializers] Transformed 411 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [FoldIdentityInitializers] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [InitializersToUint8] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [FlattenQParams] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [FoldConvDivBn] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [DeleteRepeatedQdq] Transformed 0 matches
2024-01-05 20:00:36 sparseml.exporters.transforms.onnx_transform INFO     [QuantizeQATEmbedding] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [PropagateEmbeddingQuantization] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [PropagateDequantThroughSplit] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [MatMulAddToMatMulIntegerAddCastMul] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [MatMulToMatMulIntegerCastMul] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [FoldReLUQuants] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [ConvToConvIntegerAddCastMul] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [GemmToQLinearMatMul] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [GemmToMatMulIntegerAddCastMul] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [QuantizeResiduals] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [RemoveDuplicateQConvWeights] Transformed 0 matches
2024-01-05 20:00:37 sparseml.exporters.transforms.onnx_transform INFO     [RemoveDuplicateQuantizeOps] Transformed 0 matches
2024-01-05 20:00:37 sparseml.export.export INFO     Successfully exported model.onnx to ./model.onnx...
2024-01-05 20:00:37 sparseml.export.export INFO     Exporting 2 samples...
2it [00:12,  6.38s/it]
2024-01-05 20:00:50 sparseml.export.export_data INFO     Exporting sample-inputs to ....
2024-01-05 20:00:50 sparseml.export.export_data INFO     Successfully exported sample-inputs to .!
2024-01-05 20:00:50 sparseml.export.export_data INFO     Exporting sample-outputs to ....
2024-01-05 20:00:57 sparseml.export.export_data INFO     Successfully exported sample-outputs to .!
2024-01-05 20:00:57 sparseml.export.export INFO     Creating deployment folder deployment at directory: ....
2024-01-05 20:00:57 sparseml.export.helpers WARNING  Optional file tokenizer.model not found in source path /home/damian/.cache/huggingface/hub/models--roneneldan--TinyStories-1M/snapshots/8cd14d5339178f1b285f55baee14a0deff7103ac
2024-01-05 20:00:57 sparseml.export.export INFO     Validating model structure...
2024-01-05 20:00:57 sparseml.export.validators WARNING  File ./deployment/tokenizer.model is missing.
2024-01-05 20:00:57 sparseml.export.validators WARNING  File ./sample-labels is missing.
2024-01-05 20:00:57 sparseml.export.export INFO     Validating model correctness...
2024-01-05 20:01:11 sparseml.export.validators INFO     Successfully validated the exported model on all 2 samples.
2024-01-05 20:01:11 sparseml.export.export INFO     Applying optimizations: all to the exported model...
2024-01-05 20:01:11 sparseml.export.helpers INFO     Attempting to apply optimization: kv_cache_injection... 
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.configs INFO     Loaded config file deployment/config.json for model: gpt_neo
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.configs INFO     Properly configured arguments for KV Cache Transformation
2024-01-05 20:01:11 sparseml.exporters.transforms.onnx_transform INFO     [CacheKeysAndValues] Transformed 16 matches
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.transforms_base INFO     Inserted positions input to the ONNX model
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.transforms_base INFO     Inserted causal_mask input to the ONNX model
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.transforms_base INFO     Successfully swapped 1 nodes for input 'positions'
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.transforms_base INFO     Successfully swapped 8 nodes for input 'causal_mask'
2024-01-05 20:01:11 sparseml.exporters.transforms.kv_cache.transforms_codegen INFO     Successfully adjusted the causal_mask input
2024-01-05 20:01:11 sparseml.exporters.transforms.onnx_transform INFO     [AdditionalTransformsCodeGen] Transformed 10 matches
2024-01-05 20:01:11 sparseml.export.helpers INFO     Optimization: kv_cache_injection has been successfully applied to the ONNX model: ./deployment/model.onnx
2024-01-05 20:01:11 sparseml.export.export INFO     Successfully exported model from:
.
to
./deployment
for integration: transformers

dbogunowicz · 2024-01-03T11:41:31Z

src/sparseml/export/export.py

@@ -45,7 +45,6 @@ def export(
    opset: int = TORCH_DEFAULT_ONNX_OPSET,
    single_graph_file: bool = True,
    num_export_samples: int = 0,
-    batch_size: int = 1,


Removing the batch_size argument from the export.
It has no effect on the model export.
It also has no effect on the sample export: by convention, all our sample inputs/outputs/labels are stored as "batchless" arrays, e.g. inp-0000.npz has shape (3, 244, 244).
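To illustrate the "batchless" convention, here is a minimal sketch of how a batch could be split into per-sample files. It is only an illustration, not the exporter's code: the temporary directory, the use of numpy.savez, and the default key `arr_0` are assumptions; only the file name pattern and the shape come from the comment above.

```python
import os
import tempfile

import numpy

# A batch of 2 images as a model would receive them: (batch, channels, height, width)
batch = numpy.zeros((2, 3, 244, 244), dtype=numpy.float32)

with tempfile.TemporaryDirectory() as tmp_dir:
    for index, sample in enumerate(batch):
        # Iterating over the first axis drops the batch dimension,
        # so each file holds one "batchless" array.
        numpy.savez(os.path.join(tmp_dir, f"inp-{index:04d}.npz"), sample)

    # Positional arrays passed to numpy.savez are stored under "arr_0", "arr_1", ...
    with numpy.load(os.path.join(tmp_dir, "inp-0000.npz")) as loaded:
        stored_shape = loaded["arr_0"].shape

print(stored_shape)  # (3, 244, 244)
```

Because every file holds exactly one sample, the batch size used while generating the samples leaves no trace in the stored arrays.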

…port' into feature/damian/validate_correctness_finish

src/sparseml/transformers/utils/initializers.py

dbogunowicz · 2024-01-04T18:21:01Z

src/sparseml/export/validators.py

+    top_k_ground_truth = numpy.argsort(ground_truth.flatten())[-k:]
+    return numpy.all(top_k_prediction == top_k_ground_truth)
+
+
 def validate_correctness(


@bfineran this could be moved to integration_helper_functions in the future, but top_k_match feels like the right validation metric for all our use cases so far (to the best of my knowledge).
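For readers skimming the thread, the top-k comparison can be sketched as a standalone function. This is reconstructed from the two diff lines above and is not the PR's full implementation: the signature, the default value of `k`, and the way `top_k_prediction` is computed are assumptions.

```python
import numpy


def top_k_match(
    ground_truth: numpy.ndarray, prediction: numpy.ndarray, k: int = 2
) -> bool:
    """Return True if both arrays rank the same k indices highest, in the same order."""
    # argsort is ascending, so the last k entries index the k largest values
    top_k_prediction = numpy.argsort(prediction.flatten())[-k:]
    top_k_ground_truth = numpy.argsort(ground_truth.flatten())[-k:]
    return bool(numpy.all(top_k_prediction == top_k_ground_truth))


torch_logits = numpy.array([0.1, 2.5, 0.3, 1.7])
onnx_logits = numpy.array([0.1, 2.4999, 0.3001, 1.7002])  # tiny numerical drift
swapped_logits = numpy.array([0.1, 1.7, 0.3, 2.5])  # the two largest entries trade places

print(top_k_match(torch_logits, onnx_logits))     # True
print(top_k_match(torch_logits, swapped_logits))  # False
```

Note that the comparison is order-sensitive: two outputs with the same top-k set but a different ranking inside it do not match, which is one way small rounding differences can make the check fail.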

Satrat · 2024-01-04T18:36:49Z

tests/sparseml/export/transformers/test_transformers.py

+    def test_export_validate_correctness(self, caplog, setup):
+        if self.is_model_quantized:
+            pytest.skip(
+                "Skipping since quantized models may not pass this test "
+                "due to differences in rounding between quant ops in PyTorch and ONNX"
+            )


Is there an expected error range here that we could check for rather than skipping entirely?
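One way to picture this suggestion is an element-wise tolerance check in place of the skip. The sketch below is hypothetical: the function name, the tolerance value, and the sample numbers are made up for illustration and are not taken from the PR or from measured quantization error.

```python
import numpy


def outputs_within_tolerance(ground_truth, prediction, atol=1e-2):
    """Hypothetical check: True if every element differs by at most `atol`."""
    max_abs_error = float(numpy.max(numpy.abs(ground_truth - prediction)))
    return max_abs_error <= atol, max_abs_error


reference = numpy.array([0.10, 2.50, 0.30, 1.70])
quantized = numpy.array([0.11, 2.49, 0.30, 1.71])  # illustrative values only

loose_ok, max_error = outputs_within_tolerance(reference, quantized, atol=0.05)
strict_ok, _ = outputs_within_tolerance(reference, quantized, atol=0.001)
```

With these made-up values the loose tolerance passes and the strict one fails; a usable threshold for quantized models would have to be chosen from observed error ranges.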

…o feature/damian/validate_correctness_finish

bfineran · 2024-01-05T19:52:18Z

src/sparseml/export/export_data.py

-        outputs = outputs[0]
+        # outputs_ contains (logits, scores)
+        outputs = OrderedDict(logits=outputs[0], scores=outputs[1])
+    if len(inputs.size()) == 4:


Let's add a comment that this is IC (image classification) specific.

* add support for past_key_values in sample-outputs
* [Export][Transformers] Implementation of correctness validation (#1935)
* fix tests with help from sara
* Update src/sparseml/transformers/utils/initializers.py
* swap sparsezoo validator for custom one (top k match)
* add more informative error message
* add correctness validation for LLMs
* remove past_key_values from outputs
* remove past_key_values from outputs (2)
* small note comment for the future

* initial commit * respond to PR comments * [Export Refactor][Image Classification] `create_model` function (#1878) * initial commit * looking good, time to cleanup * Delete src/sparseml/export/helpers.py * Delete tests/sparseml/export/test_helpers.py * ready for review * improve design * tests pass * reuse _validate_dataset_num_classes * [Export Refactor][Image Classification] `create_dummy_input` function (#1880) * initial commit * looking good, time to cleanup * Delete src/sparseml/export/helpers.py * Delete tests/sparseml/export/test_helpers.py * ready for review * improve design * tests pass * reuse _validate_dataset_num_classes * initial commit * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * ready for review * Update src/sparseml/export/export.py * Update src/sparseml/integration_helper_functions.py * [Export Refactor][Image Classification] `export_model` function (#1883) * initial commit * looking good, time to cleanup * Delete src/sparseml/export/helpers.py * Delete tests/sparseml/export/test_helpers.py * ready for review * improve design * tests pass * reuse _validate_dataset_num_classes * initial commit * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * ready for review * Update src/sparseml/export/export.py * Update src/sparseml/integration_helper_functions.py * initial commit * fixes * ready for review * nit * add return * make export function more general * [Export Refactor][Image Classification] `apply_optimizations` function (#1884) * initial commit * looking good, time to cleanup * Delete src/sparseml/export/helpers.py * Delete tests/sparseml/export/test_helpers.py * ready for review * improve design * tests pass * reuse _validate_dataset_num_classes * initial commit * Update 
src/sparseml/pytorch/image_classification/integration_helper_functions.py * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * ready for review * Update src/sparseml/export/export.py * Update src/sparseml/integration_helper_functions.py * initial commit * fixes * ready for review * nit * add return * initial commit * [Export Refactor][Image Classification] `export_sample_inputs_outputs` function (#1888) * initial commit * looking good, time to cleanup * Delete src/sparseml/export/helpers.py * Delete tests/sparseml/export/test_helpers.py * ready for review * improve design * tests pass * reuse _validate_dataset_num_classes * initial commit * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * ready for review * Update src/sparseml/export/export.py * Update src/sparseml/integration_helper_functions.py * initial commit * fixes * ready for review * nit * add return * initial commit * initial commit * PR comments * beautification * remove duplicated function * [Export Refactor][Image Classification] `create_deployment_folder` function (#1889) * initial commit * looking good, time to cleanup * Delete src/sparseml/export/helpers.py * Delete tests/sparseml/export/test_helpers.py * ready for review * improve design * tests pass * reuse _validate_dataset_num_classes * initial commit * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * ready for review * Update src/sparseml/export/export.py * Update src/sparseml/integration_helper_functions.py * initial commit * fixes * ready for review * nit * add return * initial commit * initial commit * initial commit * fix rebase, tests_work * ready to push * [Export Refactor][Image Classification] `validate_correctness` function (#1890) * initial commit * looking good, time to 
cleanup * Delete src/sparseml/export/helpers.py * Delete tests/sparseml/export/test_helpers.py * ready for review * improve design * tests pass * reuse _validate_dataset_num_classes * initial commit * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * ready for review * Update src/sparseml/export/export.py * Update src/sparseml/integration_helper_functions.py * initial commit * fixes * ready for review * nit * add return * initial commit * initial commit * initial commit * initial commit * Delete tests/sparseml/test_integration_helper_functions.py * ready to merge * [Export Refactor] End to end testing (#1898) * initial commit * looking good, time to cleanup * Delete src/sparseml/export/helpers.py * Delete tests/sparseml/export/test_helpers.py * ready for review * improve design * tests pass * reuse _validate_dataset_num_classes * initial commit * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * ready for review * Update src/sparseml/export/export.py * Update src/sparseml/integration_helper_functions.py * initial commit * fixes * ready for review * nit * add return * initial commit * initial commit * initial commit * initial commit * Delete tests/sparseml/test_integration_helper_functions.py * ready to merge * add structure validator * ready for review * Delete tests/sparseml/export/model.onnx * Delete tests/sparseml/export/image_classification/model.onnx * Delete tests/sparseml/export/image_classification/conftest.py * PR comments * remove onnx * [Export Refactor] Prepare the module to be more general (before including `transformers`) (#1908) * adapt the export script to handle transformers * Update src/sparseml/pytorch/image_classification/integration_helper_functions.py * Delete tests/sparseml/export/transformers/__init__.py * Delete 
tests/sparseml/export/transformers/test_generative_transformers.py * Delete tests/sparseml/export/transformers/test_transformers.py * Update src/sparseml/export/export.py Co-authored-by: Benjamin Fineran <[email protected]> * addressing review comments * [Export Refactor] Export `transformers` (#1909) * cleanup * Delete src/sparseml/transformers/integration_helper_functions_generative.py * Delete src/sparseml/transformers/utils/optimizations.py * Delete tests/sparseml/export/transformers/test_generative_transformers.py * Delete tests/sparseml/transformers/test_integration_helper_functions_generative.py * addressing PR reviews * [Export Refactor] Export generative transformers (#1910) * make tests green, remove using task to resolve the integration type * fix all the tests after the merge, make integration resolution independent of the task name * fold generative transformers into transformer helper functions * complete tests for export_data.py * Update src/sparseml/export/export.py * add tests that confirm that kv cache injection has been added * move applying optimizations into integration helper functions --------- Co-authored-by: Benjamin Fineran <[email protected]> * [Export Refactor][Transformers] Enable loading SparseModels (#1921) * initial commit * addressing review comments * Fix the tests * fix tests with help from sara * [Export][Transformers] Enable loading `text-generation` datasets (#1938) * add support for past_key_values in sample-outputs * [Export][Transformers] Implementation of correctness validation (#1935) * fix tests with help from sara * Update src/sparseml/transformers/utils/initializers.py * swap sparsezoo validator for custom one (top k match) * add more informative error message * add correctness validation for LLMs * remove past_key_values from outputs * remove past_key_values from outputs (2) * small note comment for the future * tests fixed * fix test * [Export refactor] final manual testing fixes (#1948) * [Export refactor] final manual 
testing fixes * review --------- Co-authored-by: Benjamin Fineran <[email protected]>

dbogunowicz changed the base branch from main to feature/damian/feature_branch_export January 3, 2024 11:37

dbogunowicz commented Jan 3, 2024

fix tests with help from sara

5fe442c

dbogunowicz force-pushed the feature/damian/validate_correctness_finish branch from 3936104 to 5fe442c Compare January 3, 2024 12:52

Merge remote-tracking branch 'origin/feature/damian/feature_branch_export' into feature/damian/validate_correctness_finish

dff8af1

dbogunowicz commented Jan 3, 2024

src/sparseml/transformers/utils/initializers.py (outdated, resolved)

dbogunowicz and others added 2 commits January 3, 2024 13:55

Update src/sparseml/transformers/utils/initializers.py

8623042

swap sparsezoo validator for custom one (top k match)

eb4c1c7
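
To make the "top k match" idea in the commit above concrete, here is a minimal sketch of such a check in plain Python. It is not the code from this PR: the names `top_k_indices` and `top_k_match`, the default `k=2`, and the sample values are all assumptions chosen for illustration. The check passes when the torch and ONNX outputs rank the same `k` entries highest, regardless of small numeric differences.

```python
from typing import Sequence, Set


def top_k_indices(scores: Sequence[float], k: int) -> Set[int]:
    """Return the indices of the k largest scores (hypothetical helper)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def top_k_match(
    reference: Sequence[float], candidate: Sequence[float], k: int = 2
) -> bool:
    """True when both outputs place the same k entries on top (order ignored).

    `reference` stands in for the torch output and `candidate` for the ONNX
    output; both are assumed to be flat sequences of logits of equal length.
    """
    if len(reference) != len(candidate):
        raise ValueError(
            f"Output lengths differ: {len(reference)} vs {len(candidate)}"
        )
    return top_k_indices(reference, k) == top_k_indices(candidate, k)


# Made-up logits: small numeric drift keeps the top-2 set unchanged ...
torch_out = [0.10, 2.31, 1.95, -0.40]
onnx_out = [0.12, 2.29, 1.97, -0.38]
close_match = top_k_match(torch_out, onnx_out, k=2)

# ... while a larger shift (as rounding differences in quantized ops
# might cause) can change which entries are on top.
shifted_out = [0.10, 1.90, 1.95, 2.00]
shifted_match = top_k_match(torch_out, shifted_out, k=2)
```

Comparing sets of top-k indices rather than raw values tolerates small rounding differences, which is presumably why the PR description notes that quantized models may still fail the check.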

dbogunowicz marked this pull request as ready for review January 4, 2024 11:21

dbogunowicz changed the title ~~Feature/damian/validate correctness finish~~ [Export][Transformers] Implementation of correctness validation Jan 4, 2024

add more informative error message

1640d5f

dbogunowicz commented Jan 4, 2024

dbogunowicz requested review from bfineran and Satrat January 4, 2024 18:22

Satrat reviewed Jan 4, 2024

Satrat mentioned this pull request Jan 4, 2024

[Export][Transformers] Enable loading text-generation datasets #1938

Merged

Merge remote-tracking branch 'origin/feature/damian/samples_llms' into feature/damian/validate_correctness_finish

ed60acf

dbogunowicz changed the base branch from feature/damian/feature_branch_export to feature/damian/samples_llms January 5, 2024 16:39

dbogunowicz added 3 commits January 5, 2024 17:46

add correctness validation for LLMs

f94fa53

remove past_key_values from outputs

4fbeb27

remove past_key_values from outputs (2)

d79db1d

bfineran approved these changes Jan 5, 2024

small note comment for the future

d5ccfc8

dbogunowicz merged commit e0c1068 into feature/damian/samples_llms Jan 5, 2024

dbogunowicz deleted the feature/damian/validate_correctness_finish branch January 5, 2024 20:12
