feat/loraOp #3455

Merged · 3 commits merged into NVIDIA:main on Apr 17, 2025

Conversation

@danielafrimi (Collaborator) commented Apr 10, 2025

  1. A new C++ file (cpp/tensorrt_llm/thop/loraOp.cpp) was added.
    This file defines a PyTorch custom operator named trtllm::lora_grouped_gemm.

  2. The forward method of the LoraLayer class was modified. It now calls the newly created custom operator, torch.ops.trtllm.lora_grouped_gemm (defined in the file above); a calling sketch follows this list.

  3. Tests:

  • MLP with LoRA (loraOp)
  • Attention with LoRA (loraOp): since the PyTorch flow assumes the input is not padded, the two attention layers are compared with no padding (this required a small change to loraOp).
  • Comparison of loraPlugin vs. loraOp outputs (a hedged test sketch is included after the calling sketch below).
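
A minimal sketch of the Python-side call, for orientation only: the exact schema of trtllm::lora_grouped_gemm is defined in cpp/tensorrt_llm/thop/loraOp.cpp and is not reproduced in this thread, so the argument names below are hypothetical.

```python
import torch

# Assumption: the shared library that registers the `trtllm` op namespace has
# already been loaded, e.g. by the package's import machinery or via
# torch.ops.load_library("<path to the built .so>").

def lora_forward_sketch(x: torch.Tensor, lora_params: dict) -> torch.Tensor:
    # Dispatch all per-request LoRA adapters through a single grouped GEMM
    # instead of looping over adapters in Python. Every keyword below is
    # illustrative; the real schema lives in loraOp.cpp.
    return torch.ops.trtllm.lora_grouped_gemm(
        x,                           # unpadded activations (the PyTorch flow assumes no padding)
        lora_params["weights_in"],   # hypothetical: per-request LoRA A matrices
        lora_params["weights_out"],  # hypothetical: per-request LoRA B matrices
        lora_params["ranks"],        # hypothetical: adapter rank per request
    )
```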

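And a hedged sketch of the loraPlugin-vs-loraOp comparison, assuming two layer objects that expose the same forward interface; the names layer_plugin and layer_op are illustrative, not the actual test fixtures.

```python
import torch

def assert_lora_paths_match(layer_plugin, layer_op,
                            seq_len: int = 8, hidden_size: int = 64):
    # The PyTorch flow assumes unpadded input, so the comparison feeds one
    # contiguous (non-padded) token tensor to both implementations.
    x = torch.randn(seq_len, hidden_size, device="cuda", dtype=torch.float16)
    with torch.no_grad():
        ref = layer_plugin(x)  # TensorRT loraPlugin path
        out = layer_op(x)      # new loraOp custom-operator path
    # Loose fp16 tolerances, since the two paths run different kernels.
    torch.testing.assert_close(out, ref, rtol=1e-2, atol=1e-2)
```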
@danielafrimi (Collaborator, Author)

/bot run

1 similar comment
@danielafrimi (Collaborator, Author)

/bot run

@tensorrt-cicd (Collaborator)

PR_Github #2136 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #2137 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #2137 [ run ] completed with state ABORTED

@tensorrt-cicd (Collaborator)

PR_Github #2136 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1554 completed with status: 'FAILURE'

@danielafrimi force-pushed the lora_op_pytorch_flow branch from e8a5d2e to 6da8f63 on April 14, 2025 12:28
@danielafrimi (Collaborator, Author)

/bot run

@danielafrimi (Collaborator, Author)

@Naveassaf @shaharmor98 @byshiue

The PR is open for feedback.

@tensorrt-cicd (Collaborator)

PR_Github #2192 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #2192 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1584 completed with status: 'FAILURE'

@danielafrimi force-pushed the lora_op_pytorch_flow branch from bd29792 to 2939140 on April 15, 2025 14:25
Commit messages included in this force-push (each signed off by: Ubuntu <[email protected]>):

• …ugin
• att ragged works without lora
• att lora works with ragged tensor - remove padding is True
• att works with lora bs1
• lora op
• lora op
• wip on mlp vanilla test + minor change in loraOp.cpp
• mlp test is flaky
• wip
• remove mlp since it's not stable
@danielafrimi force-pushed the lora_op_pytorch_flow branch from 2939140 to 490ae2e on April 16, 2025 10:56
@danielafrimi (Collaborator, Author)

/bot run

@tensorrt-cicd (Collaborator)

PR_Github #2465 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #2465 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1768 completed with status: 'SUCCESS'

@Naveassaf self-requested a review on April 17, 2025 04:18
@Naveassaf (Collaborator)

Spoke about this PR over Slack. It adds incomplete functionality along with passing, non-flaky tests.

The remaining updates are already WIP and will be submitted in a separate PR.

@Naveassaf (Collaborator)

/bot reuse-pipeline

@Naveassaf enabled auto-merge (squash) on April 17, 2025 04:25
@tensorrt-cicd (Collaborator)

PR_Github #2575 [ reuse-pipeline ] triggered by Bot

@Naveassaf (Collaborator)

/bot reuse-pipeline

@tensorrt-cicd (Collaborator)

PR_Github #2575 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #2465 for commit f236ee7

@tensorrt-cicd (Collaborator)

PR_Github #2577 [ reuse-pipeline ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #2577 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #2465 for commit 5cca096

@Naveassaf merged commit 0f084d9 into NVIDIA:main on Apr 17, 2025
3 checks passed