feat: use NVRTC for DeepGEMM JIT compilation #3239

Merged
juney-nvidia merged 6 commits into NVIDIA:main from user/zihuaw/deep_gemm_nvrtc
Apr 7, 2025

Conversation

lucifer1004
Collaborator

No description provided.

@lucifer1004 lucifer1004 force-pushed the user/zihuaw/deep_gemm_nvrtc branch from a411cdb to 7dafe95 Compare April 2, 2025 11:17
@lucifer1004
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #1012 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1012 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #779 completed with status: 'SUCCESS'

@lucifer1004 lucifer1004 force-pushed the user/zihuaw/deep_gemm_nvrtc branch from 7dafe95 to 7ec7850 Compare April 3, 2025 04:27
@lucifer1004
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #1068 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1068 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #819 completed with status: 'SUCCESS'

@lucifer1004 lucifer1004 force-pushed the user/zihuaw/deep_gemm_nvrtc branch from 26926bc to 3fa9f31 Compare April 3, 2025 09:48
@lucifer1004
Collaborator Author

/bot run

@lucifer1004 lucifer1004 requested a review from litaotju April 3, 2025 09:49
@tensorrt-cicd
Collaborator

PR_Github #1103 [ run ] triggered by Bot

@litaotju
Collaborator

litaotju commented Apr 3, 2025

I am approving this. The overall code LGTM.

Whether DeepGEMM should be on by default for the DS model should be settled before merging.
@NVGaryJi @juney-nvidia do you have an opinion on this?
#3239 (comment)

@tensorrt-cicd
Collaborator

PR_Github #1103 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #840 completed with status: 'FAILURE'

@lucifer1004
Collaborator Author

/bot run --reuse-test

@tensorrt-cicd
Collaborator

PR_Github #1118 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1118 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #847 completed with status: 'FAILURE'

@lucifer1004
Collaborator Author

/bot run --reuse-test

@lucifer1004
Collaborator Author

/bot run --reuse-test

@tensorrt-cicd
Collaborator

PR_Github #1134 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1134 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #854 completed with status: 'FAILURE'

@lucifer1004
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #1149 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1149 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #865 completed with status: 'FAILURE'

@lucifer1004
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #1169 [ run ] triggered by Bot

@lucifer1004 lucifer1004 force-pushed the user/zihuaw/deep_gemm_nvrtc branch from 8b9c133 to 1c39a52 Compare April 6, 2025 21:36
@lucifer1004
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #1238 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1238 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #931 completed with status: 'FAILURE'

@lucifer1004 lucifer1004 force-pushed the user/zihuaw/deep_gemm_nvrtc branch from 1c39a52 to 08ff654 Compare April 7, 2025 07:40
@lucifer1004
Collaborator Author

/bot run --stage-list "DGX_H100-4_GPUs-PyTorch-1, DGX_H100-4_GPUs-PyTorch-2"

@tensorrt-cicd
Collaborator

PR_Github #1291 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1291 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #973 (Partly Tested) completed with status: 'SUCCESS'

@lucifer1004 lucifer1004 force-pushed the user/zihuaw/deep_gemm_nvrtc branch from 08ff654 to a04862b Compare April 7, 2025 11:42
@lucifer1004
Collaborator Author

/bot skip --comment "L0_MergeRequest_PR/931 + L0_MergeRequest_PR/973"

@tensorrt-cicd
Collaborator

PR_Github #1318 [ skip ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1318 [ skip ] completed with state SUCCESS
Skipping testing for commit a04862b

@juney-nvidia juney-nvidia force-pushed the user/zihuaw/deep_gemm_nvrtc branch from a04862b to 8c51c85 Compare April 7, 2025 12:17
@juney-nvidia
Collaborator

/bot reuse-pipeline

@juney-nvidia juney-nvidia enabled auto-merge (squash) April 7, 2025 12:21
@tensorrt-cicd
Collaborator

PR_Github #1322 [ reuse-pipeline ] triggered by Bot

@juney-nvidia
Collaborator

I am approving this. The overall code LGTM.

Whether DeepGEMM should be on by default for the DS model should be settled before merging. @NVGaryJi @juney-nvidia do you have an opinion on this? #3239 (comment)

Thanks, Tao. I am okay with merging this MR first and then opening another MR to enable DeepGEMM by default, which should reduce the overhead for Zihua a little bit :)

June

@tensorrt-cicd
Collaborator

PR_Github #1322 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #1291 (Partly Tested) for commit 8c51c85

@juney-nvidia juney-nvidia merged commit 3767310 into NVIDIA:main Apr 7, 2025
2 checks passed
sarattha pushed a commit to sarattha/TensorRT-LLM that referenced this pull request Apr 9, 2025
* feat: use NVRTC for DeepGEMM JIT compilation

Signed-off-by: Zihua Wu

* fix: add license

Signed-off-by: Zihua Wu

* feat: store NVRTC JIT results in memory by default

Signed-off-by: Zihua Wu

* feat: refinement

Signed-off-by: Zihua Wu

* feat: refinement

Signed-off-by: Zihua Wu

* test: set timeout to 7200

Signed-off-by: Zihua Wu

---------

Signed-off-by: Zihua Wu
Signed-off-by: sarattha <[email protected]>
tomeras91 pushed a commit to tomeras91/TensorRT-LLM that referenced this pull request Apr 9, 2025
* feat: use NVRTC for DeepGEMM JIT compilation

Signed-off-by: Zihua Wu 

* fix: add license

Signed-off-by: Zihua Wu

* feat: store NVRTC JIT results in memory by default

Signed-off-by: Zihua Wu


* feat: refinement

Signed-off-by: Zihua Wu

* feat: refinement

Signed-off-by: Zihua Wu

* test: set timeout to 7200

Signed-off-by: Zihua Wu

---------

Signed-off-by: Zihua Wu

5 participants