fix: mllama e2e pytorch flow fix #3397
Conversation
/bot run
PR_Github #1553 [ run ] triggered by Bot
PR_Github #1553 [ run ] completed with state
Force-pushed from af50079 to 9a3c14a
/bot run
PR_Github #1560 [ run ] triggered by Bot
PR_Github #1560 [ run ] completed with state
Signed-off-by: yechank <[email protected]>
Force-pushed from 9a3c14a to f8e810c
/bot reuse-pipeline
PR_Github #1911 [ reuse-pipeline ] triggered by Bot
PR_Github #1911 [ reuse-pipeline ] completed with state
This PR is a bugfix for the e2e PyTorch flow of mllama.

Running

python3 quickstart_advanced.py --model_dir meta-llama/Llama-3.2-11B-Vision --enable_chunked_prefill --enable_overlap_scheduler

fails with

AttributeError: 'MllamaForConditionalGeneration' object has no attribute 'infer_max_seq_len'

MllamaForCausalLM is not based on DecoderModelForCausalLM, which causes the AttributeError for infer_max_seq_len. This PR copies infer_max_seq_len from DecoderModelForCausalLM.
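A minimal, self-contained sketch of the pattern behind the fix is shown below. The class and attribute names (DummyConfig, max_position_embeddings, the *Sketch classes) are illustrative assumptions, not the actual TensorRT-LLM code; the real infer_max_seq_len body copied from DecoderModelForCausalLM may differ.

```python
# Sketch only: illustrates copying infer_max_seq_len onto a model class
# that does not inherit it, so callers no longer hit AttributeError.
from dataclasses import dataclass


@dataclass
class DummyConfig:
    # Assumed stand-in for the model config; the real config fields may differ.
    max_position_embeddings: int = 131072


class DecoderModelForCausalLMSketch:
    """Stands in for DecoderModelForCausalLM, which provides infer_max_seq_len."""

    def __init__(self, config: DummyConfig):
        self.config = config

    def infer_max_seq_len(self) -> int:
        # Derive the maximum sequence length from the model config.
        return self.config.max_position_embeddings


class MllamaForCausalLMSketch:
    """Stands in for MllamaForCausalLM, which is not based on the class above."""

    def __init__(self, config: DummyConfig):
        self.config = config

    # The fix: copy infer_max_seq_len so code paths that expect it
    # (e.g. the e2e PyTorch flow) can call it on this class too.
    def infer_max_seq_len(self) -> int:
        return self.config.max_position_embeddings


if __name__ == "__main__":
    model = MllamaForCausalLMSketch(DummyConfig())
    print(model.infer_max_seq_len())  # prints 131072 instead of raising AttributeError
```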