NVIDIA / Megatron-LM Public

Pull requests: NVIDIA/Megatron-LM

200 Open 297 Closed

Pull requests list

add fused_topk_softmax_without_capacity for topk router fusion

#1632 opened Jun 13, 2025 by AshOfCat

Fix log-timer-to-tensorboard on logging

#1631 opened Jun 13, 2025 by wplf

Fix typos: vritual → virtual and decoeder → decoder

#1626 opened Jun 11, 2025 by EricLabile

Fix: Apply q_layernorm consistently in MLA LoRA path

#1624 opened Jun 11, 2025 by Flink-ddd

fix: when using moe parallel folding feature and set etp > 1 && ep == 1, the grad sync is incorrect and the loss curve is bad

#1622 opened Jun 10, 2025 by Louis-J

Set weights_only=False in optimizer

#1618 opened Jun 9, 2025 by zhic-mt

Fix mrope with context parallel

#1612 opened Jun 6, 2025 by liu-zichen

use a cpu set to cache cuda tensor finished_request_ids

#1610 opened Jun 5, 2025 by ladyrick

Add DistTrain, Allow Encoder to Have Different DP Size

#1605 opened May 30, 2025 by zidanehuang001

add node_rank argument for example scripts

#1604 opened May 30, 2025 by xylllllllll

CLIPViTModel support SP and CP

#1600 opened May 28, 2025 by Thaurun

Support Multiple Input Formats for checkpoint

#1599 opened May 28, 2025 by Thaurun

bugfix: cross_entropy inplace operations may cause backward error

#1594 opened May 24, 2025 by ChangWeiming

support for qwen2.5vl window attention

#1591 opened May 22, 2025 by Agoniii

fix bug: the loss of aux_loss and mtp will be tracked twice

#1585 opened May 18, 2025 by hyleepp

[fix] Fix get_transformer_layer_offset incorrect when virtual pipeline is enabled and num_layers_in_last_pipeline_stage is set but num_layers_in_first_pipeline_stage is not

#1583 opened May 17, 2025 by yqy3214

[Draft] FP8 param support for MXFP8

#1581 opened May 14, 2025 by WanZzzzzz • Draft

use multiple yaml files to avoid passing annoying model configs from cmd lines

#1579 opened May 14, 2025 by nrailg

The phrase "need to want to" is grammatically incorrect

#1574 opened May 13, 2025 by A-transformer

Fix incorrect softmax_factor calculation in MLA

#1562 opened May 2, 2025 by HowardZorn

[bugfix] fix the bug that loss: 0 will not be printed

#1555 opened Apr 28, 2025 by leisuzz

Fix: training arguments print format

#1552 opened Apr 24, 2025 by vicoooo26

param_copy_back_gpu_hook should sync to h2d stream

#1543 opened Apr 16, 2025 by ariverhorse

Fix parameter error in text_generation_server.py file

#1542 opened Apr 16, 2025 by xichengpro

lora offload

#1540 opened Apr 15, 2025 by sanandaraj5597
