feat: use NVRTC for DeepGEMM JIT compilation #3239
a411cdb
to
7dafe95
Compare
/bot run |
PR_Github #1012 [ run ] triggered by Bot |
PR_Github #1012 [ run ] completed with state |
7dafe95
to
7ec7850
Compare
/bot run |
PR_Github #1068 [ run ] triggered by Bot |
PR_Github #1068 [ run ] completed with state |
26926bc
to
3fa9f31
Compare
/bot run |
PR_Github #1103 [ run ] triggered by Bot |
I am approving this. The overall code LGTM. Whether DeepGEMM is on by default for the DS model should be settled before merging.
PR_Github #1103 [ run ] completed with state |
/bot run --reuse-test |
PR_Github #1118 [ run ] triggered by Bot |
PR_Github #1118 [ run ] completed with state |
/bot run --reuse-test |
/bot run --reuse-test |
PR_Github #1134 [ run ] triggered by Bot |
PR_Github #1134 [ run ] completed with state |
/bot run |
PR_Github #1149 [ run ] triggered by Bot |
PR_Github #1149 [ run ] completed with state |
/bot run |
PR_Github #1169 [ run ] triggered by Bot |
8b9c133
to
1c39a52
Compare
/bot run |
PR_Github #1238 [ run ] triggered by Bot |
PR_Github #1238 [ run ] completed with state |
1c39a52
to
08ff654
Compare
/bot run --stage-list "DGX_H100-4_GPUs-PyTorch-1, DGX_H100-4_GPUs-PyTorch-2" |
PR_Github #1291 [ run ] triggered by Bot |
PR_Github #1291 [ run ] completed with state |
08ff654
to
a04862b
Compare
/bot skip --comment "L0_MergeRequest_PR/931 + L0_MergeRequest_PR/973" |
PR_Github #1318 [ skip ] triggered by Bot |
PR_Github #1318 [ skip ] completed with state |
Signed-off-by: Zihua Wu <[email protected]>
a04862b
to
8c51c85
Compare
/bot reuse-pipeline |
PR_Github #1322 [ reuse-pipeline ] triggered by Bot |
Thanks, Tao. I am okay with merging this MR first and then enabling DeepGEMM by default in a separate MR. That should spread out the overhead for Zihua a little bit :) June
PR_Github #1322 [ reuse-pipeline ] completed with state |
* feat: use NVRTC for DeepGEMM JIT compilation
  Signed-off-by: Zihua Wu
* fix: add license
  Signed-off-by: Zihua Wu
* feat: store NVRTC JIT results in memory by default
  Signed-off-by: Zihua Wu
* feat: refinement
  Signed-off-by: Zihua Wu
* feat: refinement
  Signed-off-by: Zihua Wu
* test: set timeout to 7200
  Signed-off-by: Zihua Wu

---------

Signed-off-by: Zihua Wu
Signed-off-by: sarattha <[email protected]>
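For readers unfamiliar with the "store NVRTC JIT results in memory by default" commit, here is a minimal sketch of what an in-memory JIT cache can look like. It is not the code from this PR. The class `InMemoryJitCache`, the function `fake_compile`, and the key scheme are all hypothetical, and `fake_compile` only stands in for a real NVRTC compile call so the example runs without a GPU.

```python
import hashlib
from typing import Callable, Dict, Iterable, Tuple


class InMemoryJitCache:
    """Memoizes compiled kernels by (source, flags) for the life of the process.

    Hypothetical illustration only; the PR's actual cache may differ.
    """

    def __init__(self, compile_fn: Callable[[str, Tuple[str, ...]], bytes]) -> None:
        self._compile_fn = compile_fn
        self._cache: Dict[str, bytes] = {}
        self.compile_count = 0  # number of real compilations performed

    @staticmethod
    def _key(source: str, flags: Tuple[str, ...]) -> str:
        # Hash the source and every flag so that any change forces a recompile.
        digest = hashlib.sha256()
        digest.update(source.encode("utf-8"))
        for flag in flags:
            digest.update(b"\0")
            digest.update(flag.encode("utf-8"))
        return digest.hexdigest()

    def get_or_compile(self, source: str, flags: Iterable[str] = ()) -> bytes:
        flag_tuple = tuple(flags)
        key = self._key(source, flag_tuple)
        if key not in self._cache:
            # Cache miss: compile once and keep the result in memory.
            self._cache[key] = self._compile_fn(source, flag_tuple)
            self.compile_count += 1
        return self._cache[key]


def fake_compile(source: str, flags: Tuple[str, ...]) -> bytes:
    # Assumption: stand-in for an NVRTC compile step; returns dummy bytes.
    return ("compiled:" + source + "|" + " ".join(flags)).encode("utf-8")


cache = InMemoryJitCache(fake_compile)
first = cache.get_or_compile("__global__ void k() {}", ["-O3"])
second = cache.get_or_compile("__global__ void k() {}", ["-O3"])
```

In this sketch the second call returns the cached object without invoking the compile function again. Keeping results in memory avoids disk I/O, but the cache is lost when the process exits.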
No description provided.