fix: mllama e2e pytorch flow fix #3397
Conversation
/bot run
PR_Github #1553 [ run ] triggered by Bot
PR_Github #1553 [ run ] completed with state
Force-pushed from af50079 to 9a3c14a
/bot run
PR_Github #1560 [ run ] triggered by Bot
PR_Github #1560 [ run ] completed with state
Signed-off-by: yechank <[email protected]>
Force-pushed from 9a3c14a to f8e810c
/bot reuse-pipeline
PR_Github #1911 [ reuse-pipeline ] triggered by Bot
PR_Github #1911 [ reuse-pipeline ] completed with state
This PR is a bugfix for the e2e PyTorch flow of mllama.

Running

python3 quickstart_advanced.py --model_dir meta-llama/Llama-3.2-11B-Vision --enable_chunked_prefill --enable_overlap_scheduler

fails with

AttributeError: 'MllamaForConditionalGeneration' object has no attribute 'infer_max_seq_len'

MllamaForCausalLM is not based on DecoderModelForCausalLM, which causes the AttributeError for infer_max_seq_len. This PR copies infer_max_seq_len from DecoderModelForCausalLM.
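A minimal, self-contained sketch of the pattern behind the fix is shown below. The class and attribute names (DummyConfig, max_position_embeddings, the *Sketch classes) are illustrative assumptions, not the actual TensorRT-LLM code; the real infer_max_seq_len body copied from DecoderModelForCausalLM may differ.

```python
# Sketch only: illustrates copying infer_max_seq_len onto a model class
# that does not inherit it, so callers no longer hit AttributeError.
from dataclasses import dataclass


@dataclass
class DummyConfig:
    # Assumed stand-in for the model config; the real config fields may differ.
    max_position_embeddings: int = 131072


class DecoderModelForCausalLMSketch:
    """Stands in for DecoderModelForCausalLM, which provides infer_max_seq_len."""

    def __init__(self, config: DummyConfig):
        self.config = config

    def infer_max_seq_len(self) -> int:
        # Derive the maximum sequence length from the model config.
        return self.config.max_position_embeddings


class MllamaForCausalLMSketch:
    """Stands in for MllamaForCausalLM, which is not based on the class above."""

    def __init__(self, config: DummyConfig):
        self.config = config

    # The fix: copy infer_max_seq_len so code paths that expect it
    # (e.g. the e2e PyTorch flow) can call it on this class too.
    def infer_max_seq_len(self) -> int:
        return self.config.max_position_embeddings


if __name__ == "__main__":
    model = MllamaForCausalLMSketch(DummyConfig())
    print(model.infer_max_seq_len())  # prints 131072 instead of raising AttributeError
```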