Doc: update steps of using Draft-Target-Model (DTM) in the documents. #3366

wili-65535 · 2025-04-08T08:58:49Z

This PR updates the steps for using Draft-Target-Model (DTM) speculative decoding in TensorRT-LLM and TensorRT-LLM-backend.

Based on TRTLLM-backend: commit 86542b637bbccdd708ab892337c8ad3a95932131 (tag: v0.18.0) and docker image nvcr.io/nvidia/tritonserver:25.03-trtllm-python-py3.

Signed-off-by: wili-65535 <[email protected]>

tensorrt-cicd · 2025-04-09T04:58:24Z

PR_Github #1554 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #1161 completed with status: 'FAILURE'

wili-65535 · 2025-04-09T05:05:54Z

/bot run

tensorrt-cicd · 2025-04-09T05:11:24Z

PR_Github #1559 [ run ] triggered by Bot

kaiyux · 2025-04-09T05:43:50Z

@lfr-0531 @yweng0828 Can you help review this PR?

tensorrt-cicd · 2025-04-09T08:10:47Z

PR_Github #1559 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1165 completed with status: 'SUCCESS'

lfr-0531

LGTM~

yweng0828 · 2025-04-09T09:05:13Z

Hi @wili-65535 , thanks for the update. The documentation is much better organized now. : )

Copilot

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

examples/draft_target_model/README.md:89

The configuration explanation above uses '[4,[0],[1],False]' as an example, but the code sample uses '[4,[0],[1],True]'. Please update one of them to ensure consistency.

    --draft_target_model_config="[4,[0],[1],True]" \
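A mismatch like this can be caught mechanically. The following is a minimal sketch, not part of the PR: it assumes the value passed to `--draft_target_model_config` is a Python-style list literal with four positional fields, as both strings above suggest, and the helper name `parse_dtm_config` is hypothetical.

```python
import ast


def parse_dtm_config(text):
    """Parse a string such as "[4,[0],[1],False]" into its four positional fields.

    Hypothetical helper: the four-field shape is assumed from the two
    example strings quoted in the review comment, not from the library.
    """
    value = ast.literal_eval(text)  # safely evaluates literals only
    if not (isinstance(value, list) and len(value) == 4):
        raise ValueError(f"expected a list of 4 fields, got: {text!r}")
    return tuple(value)


# The value used in the explanation and the value used in the code sample.
doc_example = parse_dtm_config("[4,[0],[1],False]")
code_example = parse_dtm_config("[4,[0],[1],True]")

# Positions (0-based) at which the two configurations disagree.
mismatches = [
    i for i, (a, b) in enumerate(zip(doc_example, code_example)) if a != b
]
print(mismatches)  # prints [3]
```

Only the last field differs between the two strings, which is the inconsistency the comment points out.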

docs/source/advanced/speculative-decoding.md

examples/draft_target_model/README.md

kaiyux · 2025-04-09T09:22:41Z

/bot reuse-pipeline

wili-65535 · 2025-04-09T09:23:55Z

Thank you @lfr-0531, @yweng0828 and @kaiyux! I have fixed the issues Copilot raised, so could we proceed with the merge?

kaiyux · 2025-04-09T09:25:34Z

> Thank you @lfr-0531, @yweng0828 and @kaiyux! I have fixed the issues Copilot raised, so could we proceed with the merge?

I've already set auto-merge. Thanks.

tensorrt-cicd · 2025-04-09T09:29:04Z

PR_Github #1590 [ reuse-pipeline ] triggered by Bot

tensorrt-cicd · 2025-04-09T09:35:00Z

PR_Github #1590 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #1559 for commit ec270fc

wili-65535 force-pushed the doc/Draft-Target-Model branch 2 times, most recently from 29bb508 to 9d67cb7 Compare April 9, 2025 00:24

wili-65535 added 4 commits April 9, 2025 12:40

doc/Draft-Target-Model: v1.0

901ff79

Signed-off-by: wili-65535 <[email protected]>

doc/Draft-Target-Model: v1.1

63e8fd6

Signed-off-by: wili-65535 <[email protected]>

doc/Draft-Target-Model: v1.2

69b5a96

Signed-off-by: wili-65535 <[email protected]>

doc/Draft-Target-Model: v1.3

f3624c7

Signed-off-by: wili-65535 <[email protected]>

wili-65535 force-pushed the doc/Draft-Target-Model branch from 9d67cb7 to f3624c7 Compare April 9, 2025 04:40

doc/Draft-Target-Model: v1.4

4b04288

Signed-off-by: wili-65535 <[email protected]>

NVIDIA deleted a comment from tensorrt-cicd Apr 9, 2025

kaiyux requested review from lfr-0531 and yweng0828 April 9, 2025 05:43

lfr-0531 approved these changes Apr 9, 2025

yweng0828 approved these changes Apr 9, 2025

kaiyux requested a review from Copilot April 9, 2025 09:11

Copilot AI reviewed Apr 9, 2025

docs/source/advanced/speculative-decoding.md (outdated, resolved)

examples/draft_target_model/README.md (outdated, resolved)

wili-65535 added 2 commits April 9, 2025 17:18

doc/Draft-Target-Model: v1.5

f712c79

Signed-off-by: wili-65535 <[email protected]>

doc/Draft-Target-Model: v1.6

ec270fc

Signed-off-by: wili-65535 <[email protected]>

kaiyux enabled auto-merge (squash) April 9, 2025 09:23

kaiyux merged commit 6f1b2cd into NVIDIA:main Apr 9, 2025
2 checks passed

wili-65535 deleted the doc/Draft-Target-Model branch April 9, 2025 09:59

This was referenced Apr 17, 2025

doc/draft-target-model-V2 #3655

Closed

[doc] Better document for Draft-Target-Model (DTM) speculative decoding #3797

Merged