fix: Fixing issue with first gen token being returned twice in streaming #3427

Merged
pcastonguay merged 9 commits into NVIDIA:main from first_double_token_fix on Apr 14, 2025

Conversation

pcastonguay
Collaborator

Better fix for first gen token being returned twice in streaming mode.

@pcastonguay pcastonguay requested a review from Shunkangz April 9, 2025 18:58
@pcastonguay pcastonguay force-pushed the first_double_token_fix branch from dfaebf0 to 384d556 Compare April 9, 2025 19:04
@pcastonguay
Collaborator Author

/bot run --disable-fail-fast

@pcastonguay pcastonguay requested a review from Tabrizian April 9, 2025 19:19
@tensorrt-cicd
Collaborator

PR_Github #1641 [ run ] triggered by Bot

Member

@Tabrizian Tabrizian left a comment

Thanks Patrice!

@Shunkangz
Collaborator

LGTM. Thank you!

@tensorrt-cicd
Collaborator

PR_Github #1641 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1228 completed with status: 'FAILURE'

@pcastonguay pcastonguay force-pushed the first_double_token_fix branch from df06fa2 to d90b2b3 Compare April 10, 2025 12:13
@pcastonguay
Collaborator Author

/bot run --add-multi-gpu-test

@tensorrt-cicd
Collaborator

PR_Github #1773 [ run ] triggered by Bot

@pcastonguay pcastonguay force-pushed the first_double_token_fix branch from d90b2b3 to 45ac435 Compare April 11, 2025 01:09
@pcastonguay
Collaborator Author

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #1827 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1773 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1315 completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #1827 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #1351 (Partly Tested) completed with status: 'FAILURE'

@Shunkangz
Collaborator

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #1839 [ run ] triggered by Bot

@QiJune
Collaborator

QiJune commented Apr 11, 2025

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #1839 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #1361 (Partly Tested) completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #1865 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1865 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1382 (Partly Tested) completed with status: 'FAILURE'

@QiJune
Collaborator

QiJune commented Apr 12, 2025

/bot run --only-multi-gpu-test

@pcastonguay
Collaborator Author

/bot run --add-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #2050 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2050 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1509 completed with status: 'FAILURE'

@pcastonguay
Collaborator Author

/bot run --stage-list "L40S-TensorRT-3"

@pcastonguay
Collaborator Author

/bot run --only-multi-gpu-test

@tensorrt-cicd
Collaborator

PR_Github #2059 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2060 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2059 [ run ] completed with state ABORTED

@pcastonguay
Collaborator Author

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #2061 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2060 [ run ] completed with state ABORTED

@tensorrt-cicd
Collaborator

PR_Github #2061 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1518 (Partly Tested) completed with status: 'SUCCESS'

@pcastonguay
Collaborator Author

/bot run --stage-list "L40S-TensorRT-3"

@tensorrt-cicd
Collaborator

PR_Github #2070 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2070 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1525 (Partly Tested) completed with status: 'SUCCESS'

@pcastonguay
Collaborator Author

/bot reuse-pipeline

@tensorrt-cicd
Collaborator

PR_Github #2084 [ reuse-pipeline ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2084 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #2070 (Partly Tested) for commit febb026

@pcastonguay
Collaborator Author

/bot skip --comment "ran all tests previously"

@tensorrt-cicd
Collaborator

PR_Github #2087 [ skip ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2087 [ skip ] completed with state SUCCESS
Skipping testing for commit 57726cd

@pcastonguay
Collaborator Author

/bot skip --comment "Ran all tests previously"

@tensorrt-cicd
Collaborator

PR_Github #2092 [ skip ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #2092 [ skip ] completed with state SUCCESS
Skipping testing for commit b2c0059

@pcastonguay pcastonguay merged commit fe6f14b into NVIDIA:main Apr 14, 2025
3 checks passed
5 participants