Doc: update steps of using Draft-Target-Model (DTM) in the documents. #3366
29bb508 to 9d67cb7
Signed-off-by: wili-65535 <[email protected]>
9d67cb7 to f3624c7
PR_Github #1554 [ run ] completed with state
Signed-off-by: wili-65535 <[email protected]>
/bot run
PR_Github #1559 [ run ] triggered by Bot
@lfr-0531 @yweng0828 Can you help review this PR?
PR_Github #1559 [ run ] completed with state
LGTM~
Hi @wili-65535, thanks for the update. The documentation is much better organized now. :)
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
examples/draft_target_model/README.md:89
- The configuration explanation above uses '[4,[0],[1],False]' as an example, but the code sample uses '[4,[0],[1],True]'. Please update one of them to ensure consistency.
--draft_target_model_config="[4,[0],[1],True]" \
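A mismatch like the one flagged above can be caught mechanically. The following is a minimal sketch, assuming only that the value passed to `--draft_target_model_config` is a Python-style list literal; the helper names `parse_dtm_config` and `differing_positions` are hypothetical and are not part of TensorRT-LLM, and the sketch makes no claim about what each field means.

```python
import ast


def parse_dtm_config(text):
    """Parse a bracketed config literal such as "[4,[0],[1],True]" into a list.

    Hypothetical helper for illustration; it only checks that the text is a
    list literal and does not interpret the individual fields.
    """
    value = ast.literal_eval(text)
    if not isinstance(value, list):
        raise ValueError(f"expected a list literal, got {type(value).__name__}")
    return value


def differing_positions(documented, sample):
    """Return the indices at which two config literals disagree."""
    left = parse_dtm_config(documented)
    right = parse_dtm_config(sample)
    if len(left) != len(right):
        raise ValueError("config literals have different lengths")
    return [i for i, (a, b) in enumerate(zip(left, right)) if a != b]


if __name__ == "__main__":
    # The two literals quoted in the review comment.
    print(differing_positions("[4,[0],[1],False]", "[4,[0],[1],True]"))
```

Run against the two literals quoted in the review comment, the check reports the single position where the explanation and the code sample disagree.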
Signed-off-by: wili-65535 <[email protected]>
/bot reuse-pipeline
Thank you @lfr-0531, @yweng0828, and @kaiyux! I have addressed the comments Copilot raised, so could we proceed with the merge?
I've already set auto-merge. Thanks.
PR_Github #1590 [ reuse-pipeline ] triggered by Bot
PR_Github #1590 [ reuse-pipeline ] completed with state
In this PR, we update the steps for using Draft-Target-Model (DTM) speculative decoding in TensorRT-LLM and TensorRT-LLM-backend.
Based on TRTLLM-backend commit 86542b637bbccdd708ab892337c8ad3a95932131 (tag: v0.18.0) and Docker image nvcr.io/nvidia/tritonserver:25.03-trtllm-python-py3.