
The V100 does not support FlashAttention; how can FlashAttention be disabled? #25


Open
mishaogui opened this issue Feb 18, 2025 · 2 comments

Comments

@mishaogui

[Image attachment]

@mishaogui (Author)

Command used to run demo.py:

python demo/demo.py ./images --model_path ./Sa2VA-4B --work-dir ./OUTPUT_DIR --text "Please describe the video content."

@Subcode commented Mar 31, 2025

import torch
from transformers import AutoModel

# Load the model with FlashAttention disabled (use_flash_attn=False)
# so it runs on GPUs such as the V100 that lack FlashAttention support.
model = AutoModel.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    use_flash_attn=False,
    trust_remote_code=True,
).eval().cuda()

This snippet is from projects/llava_sam2/gradio/app.py.

For demo.py, add use_flash_attn=False to the AutoModelForCausalLM.from_pretrained() call, as in the sketch below.
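
A minimal sketch of what the loading call in demo.py might look like after that change. The exact surrounding code in demo/demo.py may differ; the model path and dtype here are assumptions carried over from the command and the app.py snippet above.

import torch
from transformers import AutoModelForCausalLM

# Sketch: load Sa2VA with FlashAttention disabled so it can run on a V100.
model = AutoModelForCausalLM.from_pretrained(
    "./Sa2VA-4B",            # model path assumed from the command above
    torch_dtype=torch.bfloat16,  # mirrors the app.py snippet; float16 may suit the V100 better
    low_cpu_mem_usage=True,
    use_flash_attn=False,    # disable FlashAttention
    trust_remote_code=True,
).eval().cuda()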
