feat: flash attention support for hexagon-npu #16

Merged
merged 22 commits into main on Jun 22, 2025

Conversation

@chraac (Owner) commented on Jun 14, 2025:

Related to chraac/llama.cpp#45

@chraac requested a review from Copilot on Jun 14, 2025 at 15:02
@chraac self-assigned this on Jun 14, 2025
@chraac added the enhancement (New feature or request) label on Jun 14, 2025
Copilot: this comment was marked as outdated.

@chraac requested a review from Copilot on Jun 18, 2025 at 02:43

Copilot: this comment was marked as outdated.

@chraac requested a review from Copilot on Jun 18, 2025 at 02:56

Copilot AI (Contributor) left a comment:

Pull Request Overview

Adds Flash Attention support for the Hexagon NPU backend by enhancing log parsing, test coverage, CLI scripts, and documentation.

  • Extend log_parser and its tests to recognize FLASH_ATTN_EXT operations
  • Introduce --flash-attn flags in the device-run and benchmark scripts (bash and PowerShell; see the example invocations after this list)
  • Update build docs with Hexagon SDK requirements and add a Python test workflow
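
A hedged usage sketch: the script names and flag spellings come from the per-file summary below, while any other arguments the scripts require are omitted here and may differ in practice:

    # bash: short or long form (per scripts/run_device_model.sh)
    ./scripts/run_device_model.sh -f
    ./scripts/batch_run_benchmarks_and_save_log.sh --flash-attn

    # PowerShell: the equivalent switch parameter
    pwsh ./scripts/run_device_model.ps1 -FlashAttention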

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Summary per file:

  • scripts/tests/test_log_parser.py: expanded unit tests to cover FLASH_ATTN_EXT parsing
  • scripts/run_device_model.sh: added -f/--flash-attn argument and forwarding in the bash script
  • scripts/run_device_model.ps1: added -f/--FlashAttention switch and argument assembly in the PowerShell script
  • scripts/log_parser.py: implemented parsing and item subclassing for FLASH_ATTN_EXT
  • scripts/batch_run_benchmarks_and_save_log.sh: added --flash-attn support to the bash benchmark runner
  • scripts/batch_run_benchmarks_and_save_log.ps1: added -f/--FlashAttention support to the PowerShell benchmark runner
  • llama.cpp: updated submodule commit reference
  • docs/how-to-build.md: added instructions to install the Hexagon SDK
  • .github/workflows/python_tests.yml: new workflow to run Python unit tests for the log parser

Comments suppressed due to low confidence (3)

scripts/tests/test_log_parser.py:8

  • [nitpick] The test name suggests checking a method called get_name, but it actually verifies dtype, shape, and permute behavior. Consider renaming to test_op_tensor_initialization_sets_dtype_shape_and_permute for clarity.
    def test_given_op_tensor_when_get_name_then_return_correct_name(self):

.github/workflows/python_tests.yml:41

  • This step is labeled Test with pytest but never invokes pytest, and python3 -m test_log_parser is not a unittest invocation. Either switch to a pytest invocation or use python -m unittest test_log_parser -v to run the tests as written.
        python3 -m test_log_parser -v
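
A sketch of both options; the test path is taken from the file summary above, and the working directory in the actual workflow step may differ:

        # Option 1: match the step label and run pytest
        python3 -m pytest scripts/tests/test_log_parser.py -v

        # Option 2: keep the standard-library runner, invoked through unittest
        python3 -m unittest test_log_parser -v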

scripts/log_parser.py:131

  • [nitpick] The return type hint dict[str:(...)] is not a valid annotation: it subscripts dict with a slice, where a comma-separated key and value type are required. Consider dict[str, OpTensor.DataType | int | float | bool | list], or the -> Dict[str, Any] pattern, to satisfy type checkers.
    def __parse_prop(prop: str) -> dict[str:(OpTensor.DataType | int | float | bool | list)]:
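
A minimal sketch of the corrected annotation, assuming the code targets Python 3.10+ (the quoted signature already uses the X | Y union syntax); OpTensor is the class from scripts/log_parser.py:

    # Corrected: a comma separates the key and value types; the original
    # subscripted dict with a slice (str:...), which type checkers reject.
    def __parse_prop(prop: str) -> dict[str, OpTensor.DataType | int | float | bool | list]:
        ...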

@@ -37,6 +40,10 @@ if ($Verbose) {
    $extraArgs = "-v"
}

if ($FlashAttention) {

Copilot AI commented on Jun 18, 2025:

The variable $extraArgs is only set inside the if ($Verbose) block and may be undefined here. Initialize $extraArgs = "" before any conditionals to prevent null or concatenation errors.
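
A sketch of the suggested fix against the quoted hunk; the --flash-attn value appended for $FlashAttention is an assumption modeled on the bash scripts' flag, not the script's confirmed behavior:

    # Initialize up front so later string operations never see $null.
    $extraArgs = ""

    if ($Verbose) {
        $extraArgs = "-v"
    }

    if ($FlashAttention) {
        # Hypothetical forwarding; the actual argument assembly may differ.
        $extraArgs = "$extraArgs --flash-attn".Trim()
    }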


@chraac merged commit e675c6f into main on Jun 22, 2025
5 checks passed