feat/loraOp #3455
Conversation
Force-pushed from e8a5d2e to 6da8f63
@Naveassaf @shaharmor98 @byshiue PR is open for feedback
Review thread on tests/unittest/_torch/modules/tests_lora_modules/test_lora_mlp_pytorch_flow_vs_trt.py (outdated, resolved)
Force-pushed from bd29792 to 2939140
Squashed commits (each Signed-off-by: Ubuntu <[email protected]>):
- …ugin
- att ragged works without lora
- att lora works with ragged tensor - remove padding it True
- att works with lora bs1
- lora op
- lora op
- wip on mlp vanilla test + minor change in the loraOp.app
- mlp test is flaky
- wip
- remove mlp since its not stable
Force-pushed from 2939140 to 490ae2e
Discussed this PR over Slack. It adds incomplete functionality along with passing, non-flaky tests. Updates to it are already WIP and will be submitted in a separate PR.
A new C++ file (cpp/tensorrt_llm/thop/loraOp.cpp) was added. This file defines a PyTorch custom operator named trtllm::lora_grouped_gemm.
The forward method of the LoraLayer class was modified: it now calls the newly created custom operator torch.ops.trtllm.lora_grouped_gemm (defined in the file above).
Tests: loraPlugin vs loraOp
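The description names the operator but not the math it batches. As a purely illustrative sketch (the helper names, shapes, and scaling are assumptions; the real trtllm::lora_grouped_gemm is a CUDA kernel operating on torch tensors), standard LoRA adds a low-rank update scale * (x @ A) @ B to a layer's output, and a "grouped GEMM" runs one such pair of small matrix products per request group in a single launch:

```python
# Hypothetical, dependency-free sketch of the math a grouped LoRA GEMM
# performs; it is NOT the TensorRT-LLM implementation.

def matmul(a, b):
    """Plain row-major matrix multiply over nested lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(cols)] for i in range(rows)]

def lora_delta(x, A, B, scale):
    """LoRA low-rank update: scale * (x @ A) @ B.
    x: [tokens, hidden], A: [hidden, rank], B: [rank, out]."""
    return [[scale * v for v in row] for row in matmul(matmul(x, A), B)]

def grouped_lora(groups):
    """Apply a separate (A, B) adapter pair per request group, as a
    grouped GEMM would, returning the per-group output deltas."""
    return [lora_delta(x, A, B, s) for (x, A, B, s) in groups]

# One group: 1 token, hidden=2, rank=1, out=2, scale=0.5
deltas = grouped_lora([([[1, 2]], [[1], [1]], [[2, 3]], 0.5)])
print(deltas)  # → [[[3.0, 4.5]]]
```

A grouped kernel exists because each adapter's A and B are tiny (rank is small), so launching one GEMM per request would be launch-bound; batching them amortizes that cost.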