fix: Fix disagg MTP with overlap #3406

Merged
Tabrizian merged 2 commits into NVIDIA:main on Apr 12, 2025

Conversation

Member

@Tabrizian Tabrizian commented Apr 9, 2025

Sample output before fix:

[
  " Berlin\n\n\n\nWhat isWhat is the capital the capital of France of France? France?\n\nWhat Paris\n\n is theWhat is is the capital the the of of capital of the capital the United of the Kingdom? United Kingdom London? United United Kingdom? Kingdom? Kingdom is is the capital the capital of the of the capital of of of the the the the the capital capital of the of of of the of the the the the of the the of the the the the the the the the the the",
  " What is is the difference the difference between the between special general and and special relativity special relativity? relativity? What? is the What is the between difference the two the special special special and general and relativity and relativity general? general relativity? relativity? relativity What is? What the difference difference the the between between difference difference two between between special special the the special special special relativity general relativity relativity?? relativity\n\n? relativityExplain? is is the theory the theory the relativity relativity.. of relativity What",
  "\n\nAsAsynyncio iscio is a library a library that allows that writing allows allows single single-th-threadedreaded single-th threadreadeded,, thread\n\nedWhat, are and and and benefits benefits and of using of benefits using of using using using as asynynciocy in in in in in in Python Python Python?\n\n?\n\nAsAsynynynyncyncy iscy is a is a is a library a library library that library that that that that",
  " Include the the reactants basic and products equation, and and the the role light of light and-independent light reactions-independent in the reactions in role the chloroplast of thes and Calvin dark cycle. reactions. cycle Describe. Include the process the reactants of cellular cellular respiration. respiration. basic structure structure, structure, products, products, products, equation, equation, equation, equation, and, the the and the the the the role role role role role role role role role of of of of of of of of",
  " - New NewBlock Yorkchain\n\n\n\nBlockHowchain is does a a blockchain blockchain blockchain work work?? work\n\n? -How New York does aNew blockchain work blockchain? - work?How -How does a does a blockchain work blockchain work? work? -\n\n NewHow does York\n\n a blockchain a work? blockchain work work? -? - New York New does does York a blockchain a a a blockchain blockchain blockchain does work work?? work work?? -\n\n NewHow does"
]

Sample output after fix:

[
  " Berlin\n\nWhat is the capital of Germany. Berlin is a city and one of the 16 states of Germany. With a population of 3.4 million people, Berlin is Germany's largest city. It is the second most populous city proper and the seventh most populous urban area in the European Union. The city is one of Europe's major centres of culture, politics, media, and science. Its economy is based on high-tech firms and the service sector, encompassing a diverse range of creative",
  " What is the theory of relativity is and how it works. The theory of relativity is a theory that was created by Albert Einstein. The theory of relativity is a theory that was created by Albert Einstein. The theory of relativity is a theory that was created by Albert Einstein. The theory of relativity is a theory that was created by Albert Einstein. The theory of relativity is a theory that was created by Albert Einstein. The theory of relativity is a theory that was created by Albert Einstein. The theory of relativity",
  "\n\nAsyncio is a powerful library in Python that allows for asynchronous programming. It provides a way to write concurrent code that can handle multiple tasks simultaneously, without the need for threads or processes. This can lead to significant performance improvements, especially in I/O-bound applications.\n\nOne of the main benefits of using asyncio is that it allows for non-blocking I/O operations. This means that while one task is waiting for an I/O operation to complete, other tasks can continue to run.",
  " Include the reactants and products of the reaction, where the reaction occurs, and the specific organelles in the cell where the reaction occurs. (4 points) \n\n | Photosynthesis is the process by which plants, algae, and some bacteria convert sunlight, carbon dioxide, and water into glucose and oxygen. The reaction occurs in the chloroplasts of plant cells. \n\nPhotosynthesis is the process by which plants, algae, and some bacteria convert sunlight, carbon dioxide, and water into glucose and oxygen. The",
  " - Simply put, a blockchain is a shared database or ledger. Pieces of data are stored in data structures known as blocks, and each network node has a replica of the entire database. Security is ensured since the majority will not accept this change if somebody tries to try to edit or delete an entry in one copy of the ledger. The data blocks are linked together using cryptography, forming a chain of blocks known as the blockchain. The blockchain is a distributed database that is shared among the nodes of a"
]

@Tabrizian
Member Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #1579 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1579 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #1181 completed with status: 'FAILURE'

@Tabrizian Tabrizian force-pushed the user/imant/fixDisaggMTPOverlap branch from 50bcc9e to 0f484a0 Compare April 9, 2025 08:39
@Tabrizian
Member Author

/bot run --disable-fail-fast

@Tabrizian Tabrizian force-pushed the user/imant/fixDisaggMTPOverlap branch from 0f484a0 to f41f58d Compare April 9, 2025 08:41
@Tabrizian
Member Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #1586 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1588 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1588 [ run ] completed with state ABORTED

@tensorrt-cicd
Collaborator

PR_Github #1586 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1188 completed with status: 'FAILURE'

@Tabrizian Tabrizian force-pushed the user/imant/fixDisaggMTPOverlap branch from f41f58d to 8cf2b7c Compare April 9, 2025 20:56
@Tabrizian Tabrizian force-pushed the user/imant/fixDisaggMTPOverlap branch from 8cf2b7c to a7712f6 Compare April 9, 2025 21:04
@Tabrizian
Member Author

/bot run --add-multi-gpu-tests --disable-fail-fast

@Tabrizian Tabrizian requested a review from pcastonguay April 9, 2025 21:07
@Tabrizian Tabrizian force-pushed the user/imant/fixDisaggMTPOverlap branch from a7712f6 to f9c8bfd Compare April 9, 2025 21:09
@Tabrizian
Member Author

/bot run --add-multi-gpu-tests --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #1647 Bot args parsing error!

@tensorrt-cicd
Collaborator

PR_Github #1648 Bot args parsing error!

@Tabrizian
Member Author

/bot run --add-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #1650 [ run ] triggered by Bot

@Tabrizian
Member Author

/bot run --only-multi-gpu-test

@Tabrizian Tabrizian force-pushed the user/imant/fixDisaggMTPOverlap branch from f9c8bfd to 445b7cb Compare April 10, 2025 00:28
@tensorrt-cicd
Collaborator

PR_Github #1662 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1650 [ run ] completed with state ABORTED
/LLM/main/L0_MergeRequest_PR pipeline #1233 completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #1662 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1242 (Partly Tested) completed with status: 'FAILURE'

@Tabrizian Tabrizian force-pushed the user/imant/fixDisaggMTPOverlap branch from 445b7cb to 1a4884f Compare April 10, 2025 16:47
@Tabrizian
Member Author

/bot run --only-multi-gpu-test

@tensorrt-cicd
Collaborator

PR_Github #1812 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1812 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #1344 (Partly Tested) completed with status: 'FAILURE'

@Tabrizian Tabrizian force-pushed the user/imant/fixDisaggMTPOverlap branch from 1a4884f to f4474c6 Compare April 11, 2025 17:00
@Tabrizian
Member Author

/bot run --only-multi-gpu-test

@tensorrt-cicd
Collaborator

PR_Github #1958 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1958 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #1441 (Partly Tested) completed with status: 'SUCCESS'

Signed-off-by: Iman Tabrizian <[email protected]>
Signed-off-by: Iman Tabrizian <[email protected]>
@Tabrizian Tabrizian force-pushed the user/imant/fixDisaggMTPOverlap branch from f4474c6 to f925691 Compare April 12, 2025 04:14
@Tabrizian
Member Author

/bot reuse-pipeline

@Tabrizian Tabrizian enabled auto-merge (squash) April 12, 2025 04:15
@tensorrt-cicd
Collaborator

PR_Github #1992 [ reuse-pipeline ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #1992 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #1958 (Partly Tested) for commit f925691

@Tabrizian Tabrizian merged commit 3041bbd into NVIDIA:main Apr 12, 2025
3 checks passed
3 participants