Skip to content

Clean up modeling_deepseek.py #3640

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 19, 2025
Merged

Clean up modeling_deepseek.py #3640

merged 1 commit into from
Apr 19, 2025

Conversation

hlu1
Copy link
Collaborator

@hlu1 hlu1 commented Apr 17, 2025

  • Rename some variables to make the code more readable
  • Remove unnecessary if/else branches
  • Simplify some of the logics in allreduce fusion

The modeling_deepseek.py files looks mostly clean after this round of cleaning. I kept the allreduce part the same because we need the allreduce op unification PR to land first.

For accuracy test, I ran the examples/pytorch/quickstart_advanced.py script with the R1 model with different configs.

@hlu1
Copy link
Collaborator Author

hlu1 commented Apr 17, 2025

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Collaborator

PR_Github #2533 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #2533 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1815 completed with status: 'FAILURE'

@hlu1 hlu1 force-pushed the deepseek_cleanup branch from 57f9a2e to 892d4ff Compare April 17, 2025 06:42
@hlu1
Copy link
Collaborator Author

hlu1 commented Apr 17, 2025

/bot run --disable-fail-fast

@hlu1 hlu1 force-pushed the deepseek_cleanup branch from 892d4ff to e192b6e Compare April 17, 2025 06:48
@tensorrt-cicd
Copy link
Collaborator

PR_Github #2596 [ run ] triggered by Bot

@hlu1
Copy link
Collaborator Author

hlu1 commented Apr 17, 2025

/bot kill

@tensorrt-cicd
Copy link
Collaborator

PR_Github #2601 [ kill ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #2596 [ run ] completed with state ABORTED

@tensorrt-cicd
Copy link
Collaborator

PR_Github #2601 [ kill ] completed with state SUCCESS
Successfully killed previous jobs for commit e192b6e

@hlu1 hlu1 force-pushed the deepseek_cleanup branch from e192b6e to 693a21b Compare April 17, 2025 07:12
@hlu1
Copy link
Collaborator Author

hlu1 commented Apr 17, 2025

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Collaborator

PR_Github #2605 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #2605 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1857 completed with status: 'FAILURE'

@hlu1 hlu1 force-pushed the deepseek_cleanup branch from 693a21b to 593cb9f Compare April 17, 2025 18:42
@hlu1
Copy link
Collaborator Author

hlu1 commented Apr 17, 2025

/bot run --disable-fail-fast

@tensorrt-cicd
Copy link
Collaborator

PR_Github #2672 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #2672 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1903 completed with status: 'SUCCESS'

Copy link
Collaborator

@QiJune QiJune left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hlu1 hlu1 force-pushed the deepseek_cleanup branch from 593cb9f to e4b933b Compare April 18, 2025 23:44
@hlu1 hlu1 merged commit c861b6c into NVIDIA:main Apr 19, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants