
chore: Unify Python NVTX call #3450


Merged: 5 commits, Apr 15, 2025

Conversation

@kaiyux (Member) commented Apr 10, 2025

  • Move and unify NVTX implementation to tensorrt_llm/_utils.py
  • Replace the original LLM API NVTX with nvtx_range_debug call
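
The helper names `nvtx_range` and `nvtx_range_debug` come from the PR description; the sketch below shows one plausible shape for such a unified wrapper. The actual implementation in `tensorrt_llm/_utils.py` may differ, and the `TLLM_NVTX_DEBUG` environment switch is a hypothetical example of how debug-level markers could be gated.

```python
# Minimal sketch of a unified NVTX helper (assumptions: names from the PR
# description; the TLLM_NVTX_DEBUG switch is hypothetical, not from the PR).
import os
from contextlib import contextmanager

try:
    import torch
    # Host-side NVTX markers via PyTorch; disabled on CPU-only builds.
    _nvtx = torch.cuda.nvtx if torch.cuda.is_available() else None
except ImportError:
    # Profiling becomes a no-op when torch is unavailable.
    _nvtx = None


@contextmanager
def nvtx_range(name: str):
    """Always-on NVTX range around a code region."""
    if _nvtx is not None:
        _nvtx.range_push(name)
    try:
        yield
    finally:
        if _nvtx is not None:
            _nvtx.range_pop()


# Debug-level markers are cheap no-ops unless explicitly enabled.
_NVTX_DEBUG = os.environ.get("TLLM_NVTX_DEBUG", "0") == "1"


@contextmanager
def nvtx_range_debug(name: str):
    """Fine-grained NVTX range, emitted only when debug profiling is on."""
    if _NVTX_DEBUG and _nvtx is not None:
        _nvtx.range_push(name)
    try:
        yield
    finally:
        if _NVTX_DEBUG and _nvtx is not None:
            _nvtx.range_pop()
```

With this split, hot-path annotations can use `nvtx_range_debug` so they cost nothing in normal runs, while coarse ranges stay always-on for nsys traces.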

@Copilot (Copilot AI) left a comment


Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

@kaiyux (Member, Author) commented Apr 10, 2025

/bot run

@tensorrt-cicd (Collaborator)

PR_Github #1775 [ run ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #1775 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1316 completed with status: 'FAILURE'

@Superjomn (Collaborator) commented Apr 11, 2025

We mentioned having NVTX levels in an earlier discussion about the tradeoff between detail and performance with nsys. I noticed this MR introduces two levels, nvtx_range and nvtx_range_debug. Are these two levels sufficient, or will we introduce more levels or per-module switching in the future? @kaiyux @FrankD412

I also noticed this PR moves nvtx_range to ._utils.py. Will you unify the torch.cuda.nvtx_range usages in other places, such as pyexecutor? @kaiyux

@kaiyux (Member, Author) commented Apr 11, 2025

@Superjomn Thanks for the comments.

> are these two levels sufficient? Or will we introduce more or per-module switching in the future?

I can't think of a scenario where we would need more levels; we already have layerwise NVTX markers as a feature.

> And I noticed this PR move nvtx_range to ._utils.py, will you unify the torch.cuda.nvtx_range usages in other places like pyexecutor?

It should already be unified in this PR.

@Superjomn (Collaborator)


Got it; that is clear. Thanks for your reply.

@kaiyux (Member, Author) commented Apr 11, 2025

/bot run

@kaiyux force-pushed the user/kaiyu/unify_nvtx branch from eaa4ea3 to 955ddf7 on April 11, 2025 at 02:55
@kaiyux (Member, Author) commented Apr 11, 2025

/bot run

@tensorrt-cicd (Collaborator)

PR_Github #1851 [ run ] triggered by Bot

@FrankD412 (Collaborator)

> We mentioned having nvtx levels in an earlier discussion for tradeoff between details and performance with nsys, I noticed this MR introduces two levels, nvtx_range and nvtx_range_debug, are these two levels sufficient? Or will we introduce more or per-module switching in the future?

I think two levels is fine. When we profile, we want as much information as possible, so as long as it's documented how to enable that, we're happy on our end.

@tensorrt-cicd (Collaborator)

PR_Github #1851 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1370 completed with status: 'SUCCESS'

@Superjomn (Collaborator) left a comment


LGTM

kaiyux added 5 commits April 15, 2025 23:12
Signed-off-by: Kaiyu Xie <[email protected]>
Signed-off-by: Kaiyu Xie <[email protected]>
Signed-off-by: Kaiyu Xie <[email protected]>
@kaiyux force-pushed the user/kaiyu/unify_nvtx branch from 9eb5c50 to 6379c3e on April 15, 2025 at 15:12
@kaiyux (Member, Author) commented Apr 15, 2025

/bot reuse-pipeline

@kaiyux kaiyux enabled auto-merge (squash) April 15, 2025 15:13
@tensorrt-cicd (Collaborator)

PR_Github #2347 [ reuse-pipeline ] triggered by Bot

@tensorrt-cicd (Collaborator)

PR_Github #2347 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #1851 for commit 6379c3e

@kaiyux kaiyux merged commit e037d3e into NVIDIA:main Apr 15, 2025
3 checks passed
vegaluisjose pushed a commit to vegaluisjose/TensorRT-LLM that referenced this pull request Apr 15, 2025
Signed-off-by: Kaiyu Xie <[email protected]>
Signed-off-by: Luis Vega <[email protected]>
4 participants