Commit dd37d16

Author: Sara Adkins
Merge branch 'sparsegpt_quant_child' of github.com:neuralmagic/sparseml into sparsegpt_quant_child
2 parents: d651ca0 + ae9cc7c

File tree: 12 files changed, +785 -50 lines


LICENSE-ULTRALYTICS (new file, +104 lines)
ULTRALYTICS ENTERPRISE SOFTWARE LICENSE AGREEMENT
v0.6.1 - Updated 21 February 2023

This Enterprise Software License Agreement (the “Agreement”) is made between Neuralmagic, Inc. (the “Client” or “Licensee”) and Ultralytics Inc. (the “Company”), a Delaware corporation with offices at 3616 Barham BLVD X311, Los Angeles CA 90068 United States, (collectively the “Parties”) as of March 1st, 2023 (the “Effective Date”).

1. Definitions. As used in this Agreement, the following terms have the following specific meanings:
   1. Documentation: the documentation for the Software supplied by Company to assist its customers in the use of the Software.
   2. Licensee: (a) the company or other legal entity on behalf of which this Agreement is signed, if the Agreement is signed on behalf of such an entity (e.g., by an employee, independent contractor, or other authorized representative), or (b) if there is no such entity, the individual who signs this Agreement. For clarification, “Licensee” refers only to a single, specifically identified legal entity or individual, and does not include any subsidiary or affiliate of any such legal entity or individual or any other related person.
   3. License Term: the period of time in which Licensee shall be entitled to use the Software and Documentation.
   4. Services: the Software Updates and Support and any Consulting Services provided by the Company pursuant to this Agreement.
   5. Software: all Ultralytics YOLO source code, trained models, project files and scripts maintained at https://github.com/ultralytics/yolov3, https://github.com/ultralytics/yolov5 and https://github.com/ultralytics/ultralytics provided by Company to Licensee hereunder.

2. Right to Use Software. Company grants Licensee a non-exclusive, non-transferable, worldwide license to use the Software, as well as the accompanying Documentation.
   1. Licensee will hold the right to use the Software that Company has developed for any purpose, including commercial and for-profit purposes.
   2. Licensee's subcontractors, consultants, and vendors will also hold the right to use the Software for purposes of developing or deploying Licensee's products or services.
   3. Licensee will have the ability to make and distribute to its customers and end users an unlimited number of commercial, for-profit products containing the Software mentioned above.
   4. Licensee, and any third party that receives an authorized distribution under Section 2.3, will retain perpetual license rights to Software versions and updates released during the License Term.
   5. Licensee, and any third party that receives an authorized distribution under Section 2.3, shall own all rights, title and interest in and to any Ultralytics YOLO models that they train with the Software. Company shall have no rights in or to such Ultralytics YOLO models.

3. Restrictions on Use of Software. Except as expressly permitted in this Agreement, Licensee shall not, and shall not permit any third party to:
   1. Sublicense, resell, or otherwise transfer the license or any portion thereof to any third party, including but not limited to any subsidiaries or affiliates of Licensee.
   2. Alter or remove any notices in the Software or within the Documentation included with said Software. All Software included in this source code license agreement as well as all Documentation included with said Software is provided in an “as is” condition.

4. Software Updates and Support. Company will provide Licensee access to the Software Updates and Support included during the License Term at no additional fee.
   1. Updates. Any relevant Updates during the License Term.
   2. Support. Communication tools to enable Licensee to communicate efficiently with the Company during the License Term.
   3. Consulting. Company may provide Consulting Services to Licensee if requested under a separate agreement. Such services are made available at Company's standard time and material charges.

5. Payment Fees. In consideration of the license granted by the Company under this Agreement, Licensee agrees to pay Company a basic fee of [CONFIDENTIAL], plus any applicable taxes, for the Software provided under this Agreement. Payment shall be due within thirty (30) days, after which a late fee of one and a half percent (1.5%) is applied.

6. Term and Termination. This Agreement will begin on the Effective Date for a period of one (1) year (the “License Term”), and will be automatically renewed for one (1) year terms at the then-current fees and your credit card account (or other payment method account) will be charged without further authorization from you, absent sixty (60) day written notice of non-renewal prior to the end of the current License Term.

7. Ownership. Ownership of the Software and Documentation, including any copies or modifications of the Software or Documentation (in whole or in part), and all related copyright, patent, trade secret and other proprietary rights, are and will remain the exclusive property of Company and/or its licensors. Company reserves all rights not expressly granted by it to Licensee under this Agreement. There are no implied rights.

8. Confidentiality. Company agrees to protect Licensee's confidential information using no less than reasonable care and to avoid disclosure of any confidential information. To the extent Company is required by law to disclose Licensee's confidential information, Company may make such disclosure, provided Company promptly notifies Licensee of such requirement prior to disclosure (to the extent permitted by law), and reasonably cooperates, at Company's expense, regarding Company's efforts to avoid and limit disclosure. Upon the reasonable request of Licensee, Company will either return, delete, or destroy all confidential information of Licensee and certify the same.

9. Limitation of Liability. Excluding a breach of this Agreement, neither Party shall be liable to the other Party for any indirect, special or consequential damages or lost profits arising out of or related to this Agreement. Each Party’s total, aggregate liability to the other Party arising out of or in connection with this Agreement, whether in contract, tort (including negligence), equity or other legal ground, shall not exceed the fees paid or payable by Licensee.

10. Indemnification and Legal Compliance. Company represents and warrants that, to the best of its knowledge, the Software and Documentation provided to Licensee hereunder do not infringe any intellectual property rights or other rights of any third party. In the event of any claim, suit, or proceeding against Licensee arising out of an alleged infringement of any intellectual property rights or other rights of any third party, Company shall, at its sole expense, defend or settle such claim, suit, or proceeding and indemnify and hold harmless Licensee against any damages and costs awarded therein, provided that Licensee promptly notifies Company in writing of any such claim, suit, or proceeding and provides reasonable cooperation in the defense thereof. This indemnity shall not apply to any claim to the extent it arises from modifications made to the Software or Documentation by Licensee or a third party.

Both Parties agree to defend, indemnify, and hold harmless the other party from and against any and all damages, injunctive relief, claims, judgments, liabilities, fines, costs, expenses, penalties, or losses arising out of any third-party claim or action arising out of any breach by that party of any provision of this Agreement. This indemnification obligation shall survive the termination or expiration of this Agreement.

11. Assignment. Neither Party may assign this Agreement, or the rights and obligations herein, to any third party without prior written approval from Company. In the event of assignment this Agreement is binding on the parties’ respective successors and assigns. Notwithstanding the foregoing, each Party is permitted to assign without consent or notice obligations to any affiliate, or in the event of the sale or merger of all or substantially all of its assets.

12. Severability. If any term, clause or provision herein is held invalid or unenforceable by a court of competent jurisdiction, such invalidity shall not affect the validity or operation of any other term, clause or provision.

13. Status. The parties’ status hereunder is that of independent contractors and not an employee or agent of the other. Each Party is an independent business and responsible for its own costs and expenses, including those relating to federal, state, and local income taxes, unemployment taxes and workers’ compensation, liability insurance, and including the filing of all returns and reports and the payment of all assessments, taxes and other sums required of its business.

14. Waiver. The waiver by either Party of any breach or failure to enforce any of the terms and conditions of this Agreement at any time shall not in any way affect, limit, or waive such Party’s right thereafter to enforce and compel strict compliance with every term and condition of this Agreement.

15. Governing Law. This Agreement shall be governed by and construed in accordance with the substantive laws of the State of Delaware in the United States without regard to conflict of laws, and all disputes arising under or relating to this Agreement shall be brought and resolved solely and exclusively in the State Court located in Delaware.

16. No Limitation. At any time both Parties may contract without limitation with other entities that provide services similar to those to be provided by Company to Licensee.

17. Final Agreement. This Agreement terminates and supersedes all prior understandings or agreements on the subject matter hereof. This Agreement may be modified only by a further writing that is duly executed by both parties.

Any changes to this Agreement will be by mutual agreement.

IN WITNESS whereof, the Parties have caused this Agreement to be executed by their duly authorized representatives as of the Effective Date.

Executed by Ultralytics Inc. and Neuralmagic, Inc.

NOTICE (+2 -4)

@@ -30,13 +30,11 @@ PyTorch License https://github.com/pytorch/pytorch/blob/master/LICENSE

 PyTorch torchvision and models License https://github.com/pytorch/vision/blob/master/LICENSE

-Pytorch Image Models (timm) License https://github.com/rwightman/pytorch-image-models/blob/master/LICENSE
+PyTorch Image Models (timm) License https://github.com/rwightman/pytorch-image-models/blob/master/LICENSE

 TensorFlow License https://github.com/tensorflow/tensorflow/blob/master/LICENSE

-Ultralytics YOLOv3 License https://github.com/ultralytics/yolov3/blob/master/LICENSE
-
-Ultralytics YOLOv5 License https://github.com/ultralytics/yolov5/blob/master/LICENSE
+Ultralytics YOLOv* License https://github.com/neuralmagic/sparseml/blob/master/LICENSE-ULTRALYTICS

 YOLACT License https://github.com/dbolya/yolact/blob/master/LICENSE

README.md (+2 -2)

@@ -62,7 +62,7 @@ SparseML is an open-source model optimization toolkit that enables you to create
 ## Workflows

 SparseML enables you to create a sparse model trained on your dataset in two ways:
-- **Sparse Transfer Learning** enables you to fine-tune a pre-sparsified model from [SparseZoo](https://sparsezoo.neuralmagic.com/) (an open-source repository of sparse models such as BERT, YOLOv5, and ResNet-50) onto your dataset, while maintaining sparsity. This pathway works just like typical fine-tuning you are used to in training CV and NLP models, and is strongly preferred for if your model architecture is availble in SparseZoo.
+- **Sparse Transfer Learning** enables you to fine-tune a pre-sparsified model from [SparseZoo](https://sparsezoo.neuralmagic.com/) (an open-source repository of sparse models such as BERT, YOLOv5, and ResNet-50) onto your dataset, while maintaining sparsity. This pathway works just like typical fine-tuning you are used to in training CV and NLP models, and is strongly preferred for if your model architecture is available in SparseZoo.

 - **Sparsification from Scratch** enables you to apply state-of-the-art pruning (like gradual magnitude pruning or OBS pruning) and quantization (like quantization aware training) algorithms to arbitrary PyTorch and Hugging Face models. This pathway requires more experimentation, but allows you to create a sparse version of any model.

@@ -134,7 +134,7 @@ To enable flexibility, ease of use, and repeatability, SparseML uses a declarati

 ### Python API

-Because of the declarative, recipe-based approach, you can add SparseML to your existing PyTorch traing pipelines. The `ScheduleModifierManager` class is responsible for parsing the YAML `recipes` and overriding standard PyTorch model and optimizer objects, encoding the logic of the sparsity algorithms from the recipe. Once you call `manager.modify`, you can then use the model and optimizer as usual, as SparseML abstracts away the complexity of the sparsification algorithms.
+Because of the declarative, recipe-based approach, you can add SparseML to your existing PyTorch training pipelines. The `ScheduleModifierManager` class is responsible for parsing the YAML `recipes` and overriding standard PyTorch model and optimizer objects, encoding the logic of the sparsity algorithms from the recipe. Once you call `manager.modify`, you can then use the model and optimizer as usual, as SparseML abstracts away the complexity of the sparsification algorithms.

 The workflow looks like this:
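The workflow snippet itself is truncated in this diff. As a rough sketch of the recipe-driven API the paragraph describes (the class is exported as `ScheduledModifierManager` in SparseML's PyTorch integration; the recipe path, model, and training loop below are placeholders):

import torch
from sparseml.pytorch.optim import ScheduledModifierManager

model = torch.nn.Linear(128, 10)  # placeholder module
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# parse the YAML recipe and wrap the optimizer so the sparsity
# modifiers run as training steps advance
manager = ScheduledModifierManager.from_yaml("recipe.yaml")
optimizer = manager.modify(model, optimizer, steps_per_epoch=100)

# ... train with model and optimizer as usual ...

manager.finalize(model)  # remove hooks once training completes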

integrations/torchvision/modifiers_refactor_example/e2e_test.py (+3 -3)

@@ -24,7 +24,7 @@ def main():
     from torch.utils.data import DataLoader
     from torchvision import transforms

-    import sparseml.core.session as sml
+    import sparseml.core.session as session_manager
     from sparseml.core.event import EventType
     from sparseml.core.framework import Framework
     from sparseml.pytorch.utils import (
@@ -40,8 +40,8 @@ def main():
     device = "cuda:0"

     # set up SparseML session
-    sml.create_session()
-    session = sml.active_session()
+    session_manager.create_session()
+    session = session_manager.active_session()

     # download model
     model = torchvision.models.mobilenet_v2(
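The rename from `sml` to `session_manager` makes the module's role explicit. Condensed from the diff above, the session bootstrap reduces to the following (the rest of the example, such as recipe application and event callbacks, is omitted here):

import sparseml.core.session as session_manager

# initialize a fresh global SparseML session, then grab a handle to it
session_manager.create_session()
session = session_manager.active_session()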

setup.py (+5 -2)

@@ -187,8 +187,11 @@ def _setup_entry_points() -> Dict:
         ]
     )

-    entry_points["console_scripts"].append(
-        "sparseml.transformers.export_onnx=sparseml.transformers.export:main"
+    entry_points["console_scripts"].extend(
+        [
+            "sparseml.transformers.export_onnx=sparseml.transformers.export:main",
+            "sparseml.transformers.export_onnx_refactor=sparseml.transformers.sparsification.obcq.export:main",  # noqa 501
+        ]
     )

     # image classification integration
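Switching from `append` to `extend` registers two console scripts in one call; on install, setuptools turns each `name=module:function` string into an executable command. A hypothetical way to confirm the registration after a `pip install`, using only the standard library (the `group` keyword requires Python 3.10+):

from importlib.metadata import entry_points

# list the sparseml export commands registered under console_scripts
for ep in entry_points(group="console_scripts"):
    if ep.name.startswith("sparseml.transformers.export_onnx"):
        print(ep.name, "->", ep.value)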

src/sparseml/modifiers/obcq/utils/helpers.py (+2 -2)

@@ -158,10 +158,10 @@ def ppl_eval_general(

     vocabulary_size = logits[0].shape[-1]
     logits = [logit[:, :-1, :].view(-1, vocabulary_size) for logit in logits]
-    logits = torch.concatenate(logits, dim=0).contiguous().to(torch.float32)
+    logits = torch.cat(logits, dim=0).contiguous().to(torch.float32)

     labels = [sample[:, 1:].view(-1) for sample in samples]
-    labels = torch.concatenate(labels, dim=0).to(dev)
+    labels = torch.cat(labels, dim=0).to(dev)
     neg_log_likelihood += torch.nn.functional.cross_entropy(
         logits,
         labels,
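`torch.concatenate` is a NumPy-compatibility alias that only exists in relatively recent PyTorch releases, while `torch.cat` has been available since the earliest versions, so this swap (here and in base_llm.py below) widens version compatibility without changing behavior. A minimal illustration:

import torch

# joining per-batch logits along the sample dimension, as in ppl_eval_general
chunks = [torch.randn(2, 4) for _ in range(3)]
merged = torch.cat(chunks, dim=0)
print(merged.shape)  # torch.Size([6, 4])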

src/sparseml/modifiers/quantization/base.py (+1 -4)

@@ -14,7 +14,7 @@

 from typing import Any, Dict, List, Optional

-from sparseml.core import Event, Modifier, State
+from sparseml.core import Event, Modifier


 __all__ = ["QuantizationModifier"]
@@ -136,6 +136,3 @@ def check_should_disable_observer(self, event: Event) -> bool:
         if event.current_index >= disable_epoch:
             return True
         return False
-
-    def on_initialize_structure(self, state: State, **kwargs):
-        pass  # nothing needed for this modifier

src/sparseml/modifiers/quantization/pytorch.py (+27 -20)

@@ -71,6 +71,11 @@ def __init__(self, **kwargs):
             self.scheme_overrides, self.scheme
         )

+    def on_initialize_structure(self, state: State, **kwargs):
+        module = state.model.model
+        self._enable_module_qat(module)
+        state.model.model.apply(torch.quantization.disable_observer)
+
     def on_initialize(self, state: State, **kwargs) -> bool:
         raise_if_torch_quantization_not_available()
         if self.end and self.end != -1:
@@ -84,6 +89,7 @@ def on_initialize(self, state: State, **kwargs) -> bool:

         if self.calculate_start() == -1:  # one-shot
             self._enable_module_qat(module)
+            self._calibrate_if_possible(module)
             self._disable_quantization_observer(module)

         return True
@@ -122,30 +128,31 @@ def _disable_quantization_observer(self, model: Module):
         self.quantization_observer_disabled_ = True

     def _enable_module_qat(self, module: Module):
-        # fuse conv-bn-relu blocks prior to quantization emulation
-        self._fuse(module)
-
-        # add quantization_schemes to target submodules
-        set_quantization_schemes(
-            module,
-            scheme=self.scheme,
-            scheme_overrides=self.scheme_overrides,
-            ignore=self.ignore,
-            strict=self.strict,
-        )
+        module.apply(torch.quantization.enable_observer)

-        # fix for freezing batchnorm statistics when not fusing BN with convs.
-        # pytorch only supports freezing batchnorm statistics for fused modules.
-        # this fix wraps BN modules adding with a new module class that supports
-        # methods related to freezing/unfreezing BN statistics.
-        configure_module_bn_wrappers(module)
+        if not self.qat_enabled_:
+            # fuse conv-bn-relu blocks prior to quantization emulation
+            self._fuse(module)
+
+            # add quantization_schemes to target submodules
+            set_quantization_schemes(
+                module,
+                scheme=self.scheme,
+                scheme_overrides=self.scheme_overrides,
+                ignore=self.ignore,
+                strict=self.strict,
+            )

-        # convert target qconfig layers to QAT modules with FakeQuantize
-        convert_module_qat_from_schemes(module)
+            # fix for freezing batchnorm statistics when not fusing BN with convs.
+            # pytorch only supports freezing batchnorm statistics for fused modules.
+            # this fix wraps BN modules adding with a new module class that supports
+            # methods related to freezing/unfreezing BN statistics.
+            configure_module_bn_wrappers(module)

-        self.qat_enabled_ = True
+            # convert target qconfig layers to QAT modules with FakeQuantize
+            convert_module_qat_from_schemes(module)

-        self._calibrate_if_possible(module)
+            self.qat_enabled_ = True

     def _fuse(self, module: Module):
         if self.model_fuse_fn_name in [None, "conv_bn_relus"]:
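This restructuring makes `_enable_module_qat` effectively idempotent: the fuse/scheme/convert preparation now runs only once, guarded by `self.qat_enabled_`, and repeat calls merely re-enable observers. That lets the new `on_initialize_structure` rebuild the QAT module structure (for example, when restoring from a checkpoint) and then immediately freeze observer statistics. A self-contained sketch of the same enable/calibrate/disable observer pattern using stock `torch.quantization` utilities (the toy model and calibration data are placeholders):

import torch
import torch.nn as nn

# toy model prepared for quantization-aware training
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
torch.quantization.prepare_qat(model, inplace=True)

model.apply(torch.quantization.enable_observer)    # collect calibration stats
model(torch.randn(1, 3, 32, 32))                   # calibration forward pass
model.apply(torch.quantization.disable_observer)   # freeze observed ranges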

src/sparseml/pytorch/torchvision/train.py (+1 -9)

@@ -400,7 +400,7 @@ def collate_fn(batch):
     )

     _LOGGER.info("Creating model")
-    local_rank = args.local_rank if args.distributed else None
+    local_rank = int(os.environ["LOCAL_RANK"]) if args.distributed else None
     model, arch_key, maybe_dp_device = _create_model(
         arch_key=args.arch_key,
         local_rank=local_rank,
@@ -1256,14 +1256,6 @@ def new_func(*args, **kwargs):
         "Note: Will use ImageNet values if not specified."
     ),
 )
-@click.option(
-    "--local_rank",
-    "--local-rank",
-    type=int,
-    default=None,
-    help="Local rank for distributed training",
-    hidden=True,  # should not be modified by user
-)
 @click.pass_context
 def cli(ctx, **kwargs):
     """

src/sparseml/transformers/data/base_llm.py (+1 -1)

@@ -100,7 +100,7 @@ def _add_end_token(self, tokenized_sample):
         if len(tokenized_sample) == self._seqlen:
             tokenized_sample[-1] = self.tokenizer.eos_token_id
         else:
-            tokenized_sample = torch.concatenate(
+            tokenized_sample = torch.cat(
                 (
                     tokenized_sample,
                     torch.tensor((self.tokenizer.eos_token_id,)),
