feat/loraOp #3455

Merged · 3 commits merged into NVIDIA:main on Apr 17, 2025

Conversation

@danielafrimi (Collaborator) commented Apr 10, 2025

  1. A new C++ file (cpp/tensorrt_llm/thop/loraOp.cpp) was added.
    This file defines a PyTorch custom operator named trtllm::lora_grouped_gemm.

  2. The forward method of the LoraLayer class was modified. It now calls the newly created custom operator, torch.ops.trtllm.lora_grouped_gemm (defined in the file above); a calling sketch follows this list.

  3. Tests:

  • MLP with LoRA (loraOp)
  • Attention with LoRA (loraOp): since the PyTorch flow assumes the input is not padded, the two attention layers are compared with no padding (this required a small change to loraOp).
  • Comparison of loraPlugin vs. loraOp outputs (a hedged test sketch is included after the calling sketch below).
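
A minimal sketch of the Python-side call, for orientation only: the exact schema of trtllm::lora_grouped_gemm is defined in cpp/tensorrt_llm/thop/loraOp.cpp and is not reproduced in this thread, so the argument names below are hypothetical.

```python
import torch

# Assumption: the shared library that registers the `trtllm` op namespace has
# already been loaded, e.g. by the package's import machinery or via
# torch.ops.load_library("<path to the built .so>").

def lora_forward_sketch(x: torch.Tensor, lora_params: dict) -> torch.Tensor:
    # Dispatch all per-request LoRA adapters through a single grouped GEMM
    # instead of looping over adapters in Python. Every keyword below is
    # illustrative; the real schema lives in loraOp.cpp.
    return torch.ops.trtllm.lora_grouped_gemm(
        x,                           # unpadded activations (the PyTorch flow assumes no padding)
        lora_params["weights_in"],   # hypothetical: per-request LoRA A matrices
        lora_params["weights_out"],  # hypothetical: per-request LoRA B matrices
        lora_params["ranks"],        # hypothetical: adapter rank per request
    )
```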

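And a hedged sketch of the loraPlugin-vs-loraOp comparison, assuming two layer objects that expose the same forward interface; the names layer_plugin and layer_op are illustrative, not the actual test fixtures.

```python
import torch

def assert_lora_paths_match(layer_plugin, layer_op,
                            seq_len: int = 8, hidden_size: int = 64):
    # The PyTorch flow assumes unpadded input, so the comparison feeds one
    # contiguous (non-padded) token tensor to both implementations.
    x = torch.randn(seq_len, hidden_size, device="cuda", dtype=torch.float16)
    with torch.no_grad():
        ref = layer_plugin(x)  # TensorRT loraPlugin path
        out = layer_op(x)      # new loraOp custom-operator path
    # Loose fp16 tolerances, since the two paths run different kernels.
    torch.testing.assert_close(out, ref, rtol=1e-2, atol=1e-2)
```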
@danielafrimi (Collaborator, Author)

/bot run

1 similar comment
@danielafrimi (Collaborator, Author)

/bot run

@tensorrt-cicd (Collaborator)

PR_Github #2136 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #2137 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #2137 [ run ] completed with state ABORTED

@tensorrt-cicd (Collaborator)

PR_Github #2136 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1554 completed with status: 'FAILURE'

@danielafrimi force-pushed the lora_op_pytorch_flow branch from e8a5d2e to 6da8f63 on April 14, 2025 12:28
@danielafrimi (Collaborator, Author)

/bot run

@danielafrimi (Collaborator, Author)

@Naveassaf @shaharmor98 @byshiue

The PR is open for feedback.

@tensorrt-cicd (Collaborator)

PR_Github #2192 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #2192 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1584 completed with status: 'FAILURE'

@danielafrimi force-pushed the lora_op_pytorch_flow branch from bd29792 to 2939140 on April 15, 2025 14:25
Commit messages included in this force-push (each signed off by: Ubuntu <[email protected]>):

• …ugin
• att ragged works without lora
• att lora works with ragged tensor - remove padding is True
• att works with lora bs1
• lora op
• lora op
• wip on mlp vanilla test + minor change in loraOp.cpp
• mlp test is flaky
• wip
• remove mlp since it's not stable
@danielafrimi force-pushed the lora_op_pytorch_flow branch from 2939140 to 490ae2e on April 16, 2025 10:56
@danielafrimi (Collaborator, Author)

/bot run

@tensorrt-cicd (Collaborator)

PR_Github #2465 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #2465 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1768 completed with status: 'SUCCESS'

@Naveassaf self-requested a review on April 17, 2025 04:18
@Naveassaf (Collaborator)

Spoke about this PR over Slack. It adds incomplete functionality along with passing, non-flaky tests.

The remaining updates are already WIP and will be submitted in a separate PR.

@Naveassaf (Collaborator)

/bot reuse-pipeline

@Naveassaf enabled auto-merge (squash) on April 17, 2025 04:25
@tensorrt-cicd (Collaborator)

PR_Github #2575 [ reuse-pipeline ] triggered by Bot

@Naveassaf (Collaborator)

/bot reuse-pipeline

@tensorrt-cicd (Collaborator)

PR_Github #2575 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #2465 for commit f236ee7

@tensorrt-cicd (Collaborator)

PR_Github #2577 [ reuse-pipeline ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #2577 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #2465 for commit 5cca096

@Naveassaf merged commit 0f084d9 into NVIDIA:main on Apr 17, 2025
3 checks passed