feat: Add FP8 support for SM 120 #3248

Merged 6 commits on Apr 14, 2025

Conversation

pamelap-nvidia
Collaborator

  • Added FP8 support for SM120
  • Avoided SM120 for FP4 in a few places
  • Cubins were already updated in previous MRs.
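
The kind of gating described in the bullets above can be sketched roughly as follows. This is a minimal illustration only, not the code from this PR: the function names and the exact sets of SM versions are assumptions chosen to mirror the description (FP8 allowed on SM120, FP4 not routed to SM120).

```python
# Hypothetical sketch of compute-capability gating for quantization modes.
# All names and SM sets below are illustrative assumptions, not the
# actual TensorRT-LLM implementation.

# Assumed: SM versions on which the FP8 path is enabled (SM120 newly added).
FP8_SM_VERSIONS = frozenset({89, 90, 100, 120})
# Assumed: SM versions on which the FP4 path is enabled (SM120 deliberately left out).
FP4_SM_VERSIONS = frozenset({100})


def sm_version(major: int, minor: int) -> int:
    """Combine a (major, minor) compute capability into one integer, e.g. (12, 0) -> 120."""
    return major * 10 + minor


def supports_fp8(sm: int) -> bool:
    """Return True if the FP8 path is enabled for this SM version in this sketch."""
    return sm in FP8_SM_VERSIONS


def supports_fp4(sm: int) -> bool:
    """Return True if the FP4 path is enabled for this SM version in this sketch."""
    return sm in FP4_SM_VERSIONS


def check_quant_mode(sm: int, mode: str) -> None:
    """Raise ValueError if the requested quantization mode is not enabled for this SM."""
    checks = {"fp8": supports_fp8, "fp4": supports_fp4}
    if mode not in checks:
        raise ValueError(f"unknown quantization mode: {mode}")
    if not checks[mode](sm):
        raise ValueError(f"{mode} is not enabled for SM{sm} in this sketch")
```

With these assumed sets, `check_quant_mode(sm_version(12, 0), "fp8")` passes, while requesting `"fp4"` for the same device raises a `ValueError`.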

@pamelap-nvidia
Collaborator Author

/bot run

juney-nvidia changed the title from "Add FP8 support for SM 120" to "feat: Add FP8 support for SM 120" on Apr 3, 2025
pamelap-nvidia force-pushed the gb20x_fp8 branch 2 times, most recently from edb80f7 to 97e6fb3, on April 3, 2025 14:35
@pamelap-nvidia
Collaborator Author

/bot run

1 similar comment
@schetlur-nv
Collaborator

/bot run

@tensorrt-cicd
Collaborator

PR_Github #1143 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1143 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #860 completed with status: 'FAILURE'

@yibinl-nvidia
Collaborator

/bot run

@tensorrt-cicd
Collaborator

PR_Github #1144 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1144 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #861 completed with status: 'FAILURE'

@yibinl-nvidia
Collaborator

/bot run

@tensorrt-cicd
Collaborator

PR_Github #1152 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1152 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #867 completed with status: 'SUCCESS'

@pamelap-nvidia
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #1681 [ run ] triggered by Bot

pamelap-nvidia requested a review from chzblych on April 10, 2025 02:52
@tensorrt-cicd
Collaborator

PR_Github #1681 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1257 completed with status: 'FAILURE'

@pamelap-nvidia
Collaborator Author

/bot run --stage-list "DGX_H100-4_GPUs-PyTorch-1, DGX_H100-4_GPUs-PyTorch-2, DGX_H100-4_GPUs-TensorRT-2"

@tensorrt-cicd
Collaborator

PR_Github #1803 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1803 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #1336 (Partly Tested) completed with status: 'FAILURE'

@chzblych
Collaborator

@pamelap-nvidia FYI - @EmmaQiaoCh is also adding the automation testing for gb20x.

chzblych requested a review from EmmaQiaoCh on April 11, 2025 09:53
Signed-off-by: Pamela Peng <[email protected]>
Signed-off-by: Pamela Peng <[email protected]>
Signed-off-by: Pamela Peng <[email protected]>
Signed-off-by: Pamela Peng <[email protected]>
@pamelap-nvidia
Collaborator Author

/bot run --stage-list "DGX_H100-4_GPUs-PyTorch-1, DGX_H100-4_GPUs-PyTorch-2, DGX_H100-4_GPUs-TensorRT-2"

@tensorrt-cicd
Collaborator

PR_Github #1951 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1951 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1436 (Partly Tested) completed with status: 'SUCCESS'

@EmmaQiaoCh
Collaborator

Hi Pamela, will you also add the test list yml file for L0 under 'tests/integration/test_lists/test-db'? Or you can tell me the tests that you want to run on SM120, I also need to change some CI scripts:)
Thanks~

@pamelap-nvidia
Collaborator Author

> Hi Pamela, will you also add the test list yml file for L0 under 'tests/integration/test_lists/test-db'? Or you can tell me the tests that you want to run on SM120, I also need to change some CI scripts:) Thanks~

Resolved offline.

schetlur-nv requested a review from brb-nv on April 14, 2025 17:36
@brb-nv
Collaborator

brb-nv commented Apr 14, 2025

Minor comments. Changes look good to me.

schetlur-nv merged commit 6cdfc54 into NVIDIA:main on Apr 14, 2025
2 checks passed
@schetlur-nv
Collaborator

Bypassing some checks to merge since comments addressed after PR_Github #1951 were very minor.

vegaluisjose pushed a commit to vegaluisjose/TensorRT-LLM that referenced this pull request Apr 15, 2025
* Allow FP8 on SM120

Signed-off-by: Pamela Peng <[email protected]>

* fix sm121

Signed-off-by: Pamela Peng <[email protected]>

* fix

Signed-off-by: Pamela Peng <[email protected]>

* fix pre-commit

Signed-off-by: Pamela Peng <[email protected]>

* review update

Signed-off-by: Pamela Peng <[email protected]>

---------

Signed-off-by: Pamela Peng <[email protected]>
Co-authored-by: Sharan Chetlur <[email protected]>
Signed-off-by: Luis Vega <[email protected]>
7 participants