
Replace module hooking with tree-defined targeting #1527


Merged: 5 commits into main on Apr 11, 2025

Conversation

@Qubitium (Collaborator) commented Apr 10, 2025

Even though the existing code does not error, we should never hook/wrap a module that will never participate in quantization. Doing so makes no sense. The full fix is much more complicated and will likely require 2-3 PRs of refactoring.

This PR adds optional pinpoint module targeting via a tree defined in base.layers_modules_tree. This will play nicely with more generic multi-modal support, since mm models essentially embed up to 3 separate models in 1; the existing static, non-tree-based config would be harder to extend to support that.

Tree syntax

```python
# Full tree of quantizable modules.
# `#` matches any number: useful for layer and MoE indexing.
# List[str]: serially linked nodes; a linear chain of modules with no divergence.
# Dict[str, List | Dict | Tuple]: a diverging node, where one path splits into multiple sub-paths.
# Tuple[str]: a leaf node; the strings name the final targeted modules.
layers_modules_tree = [
    "model",
    "layers",
    "#",
    {
        "self_attn": ("k_proj", "v_proj", "q_proj", "o_proj"),
        "mlp": ("up_proj", "gate_proj", "down_proj"),
    }
]
```
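For illustration, here is a minimal sketch of how such a tree could be expanded into fully qualified module paths. This is not the PR's actual implementation; the `expand_tree` helper and the fixed `num_layers` are assumptions for this example.

```python
# Illustrative sketch only: expand a layers_modules_tree into dotted module paths.
def expand_tree(tree, prefix="", num_layers=2):
    if not tree:
        return [prefix.rstrip(".")]
    head, rest = tree[0], tree[1:]
    if isinstance(head, str):
        # "#" matches any numeric index (layer or MoE expert index).
        keys = [str(i) for i in range(num_layers)] if head == "#" else [head]
        return [p for k in keys for p in expand_tree(rest, f"{prefix}{k}.", num_layers)]
    if isinstance(head, dict):
        # Diverging node: the path splits into one branch per key.
        paths = []
        for key, sub in head.items():
            branch = sub if isinstance(sub, list) else [sub]
            paths += expand_tree(branch + rest, f"{prefix}{key}.", num_layers)
        return paths
    if isinstance(head, tuple):
        # Leaf node: the final targeted modules.
        return [f"{prefix}{name}" for name in head]
    raise TypeError(f"unsupported node type: {type(head)}")

tree = [
    "model",
    "layers",
    "#",
    {
        "self_attn": ("k_proj", "v_proj", "q_proj", "o_proj"),
        "mlp": ("up_proj", "gate_proj", "down_proj"),
    },
]

for path in expand_tree(tree):
    print(path)  # e.g. model.layers.0.self_attn.k_proj
```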

@Qubitium (Collaborator, Author) commented:

@Cecilwang While testing conv1d/conv2d support I found deep issues/oversights in the current hook-targeting mechanism, so this PR kind of spiraled out of control. But the code relevant to you is minimal: you just need to add the layers_modules_tree definition to your model for pinpoint control of hooks, and check whether the hooked_linear.py changes are usable for your model. I could not find a model that has nn.Conv1d or nn.Conv2d and is actually quantized. Let me know if you know of any, other than Mamba, which I don't think we support yet.
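For a concrete picture, here is a hedged sketch of what attaching the tree to a model definition might look like; the base-class name `BaseGPTQModel` is an assumption for illustration, so check the repo for the actual class.

```python
# Hypothetical sketch of a model definition carrying the tree; the base-class
# name `BaseGPTQModel` is assumed here and may differ in the repo.
class LlamaGPTQ(BaseGPTQModel):
    # Only modules reachable through this tree are hooked/wrapped for
    # quantization; everything else is left untouched.
    layers_modules_tree = [
        "model",
        "layers",
        "#",
        {
            "self_attn": ("k_proj", "v_proj", "q_proj", "o_proj"),
            "mlp": ("up_proj", "gate_proj", "down_proj"),
        },
    ]
```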

@Qubitium changed the title from "Refactor replace_linear with replace_module but using tree guide" to "Refactor replace_linear with tree-defined targeting" on Apr 10, 2025
@Qubitium changed the title from "Refactor replace_linear with tree-defined targeting" to "Replace module hooking with tree-defined targeting" on Apr 10, 2025
Signed-off-by: Qubitium <[email protected]>
@Qubitium merged commit 444b277 into main on Apr 11, 2025
40 of 111 checks passed
@Qubitium deleted the fix-hooked-conv12D branch on April 11, 2025 at 01:34