Pull requests: NVIDIA/Megatron-LM

Pull requests list

Fix log-timer-to-tensorboard on logging
#1631 opened Jun 13, 2025 by wplf
Set weights_only=False in optimizer
#1618 opened Jun 9, 2025 by zhic-mt
Fix mrope with context parallel
#1612 opened Jun 6, 2025 by liu-zichen
use a cpu set to cache cuda tensor finished_request_ids
#1610 opened Jun 5, 2025 by ladyrick
add node_rank argument for example scripts
#1604 opened May 30, 2025 by xylllllllll
CLIPViTModel support SP and CP
#1600 opened May 28, 2025 by Thaurun
Support Multiple Input Formats for checkpoint
#1599 opened May 28, 2025 by Thaurun
support for qwen2.5vl window attention
#1591 opened May 22, 2025 by Agoniii
[Draft] FP8 param support for MXFP8
#1581 opened May 14, 2025 by WanZzzzzz Draft
Fix incorrect softmax_factor calculation in MLA
#1562 opened May 2, 2025 by HowardZorn
[bugfix] fix the bug that loss: 0 will not be printed
#1555 opened Apr 28, 2025 by leisuzz
Fix: training arguments print format
#1552 opened Apr 24, 2025 by vicoooo26
lora offload
#1540 opened Apr 15, 2025 by sanandaraj5597