
(Batched) Matrix Multiplication and Fused Operations #14394

Answered by slaren
taronaeo asked this question in Q&A
  1. Batched matrix multiplication is supported by using dimensions 3 and 4 of the operand tensors; in practice this is mostly used for grouped-query attention (GQA).
  2. Operator fusion is not currently done by any of the backends, although it is being worked on in #14366.
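As an illustration of the batched-matmul semantics described in point 1 (not the ggml API itself), here is a hedged NumPy sketch. The shapes and the head counts are made-up examples; the last two axes play the role of the matrix dimensions, and the extra leading axes act as the batch dimensions, analogous to how ggml uses dims 3 and 4 of its operand tensors. The GQA part mimics sharing a smaller set of key/value heads across groups of query heads.

```python
import numpy as np

# Hypothetical shapes: the last two axes are the matrix dimensions,
# the leading axes are batch dimensions (one matmul per batch entry).
n_batch, n_head, m, k, n = 2, 4, 8, 16, 8

a = np.random.rand(n_batch, n_head, m, k)
b = np.random.rand(n_batch, n_head, k, n)

# One matrix multiplication per (batch, head) pair in a single call.
c = np.matmul(a, b)
assert c.shape == (n_batch, n_head, m, n)

# GQA-style sharing: fewer key/value heads than query heads, with each
# KV head reused by a whole group of query heads.
n_kv_head = 2  # assumed group size: n_head // n_kv_head queries per KV head
kv = np.random.rand(n_batch, n_kv_head, k, n)
kv_expanded = np.repeat(kv, n_head // n_kv_head, axis=1)
c_gqa = np.matmul(a, kv_expanded)
assert c_gqa.shape == (n_batch, n_head, m, n)
```

The point of the batched form is that a single call covers every (batch, head) pair, rather than looping over 2-D matmuls one at a time.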

What kind of setup and teardown are you talking about? If you have to copy the whole tensor data on each operation, that may explain the performance you are seeing.

Answer selected by taronaeo