[X86][test] Remove useless pattern for VDPBF16PSZmb and add a test for broadcast folding #80629

Merged: 2 commits into llvm:main, Feb 5, 2024

KanRobert
Contributor

llvm-issue: #68810

@llvmbot
Member

llvmbot commented Feb 5, 2024

@llvm/pr-subscribers-backend-x86

Author: Shengchen Kan (KanRobert)

Changes

llvm-issue: #68810


Full diff: https://github.com/llvm/llvm-project/pull/80629.diff

2 Files Affected:

  • (modified) llvm/lib/Target/X86/X86InstrAVX512.td (+1-2)
  • (added) llvm/test/CodeGen/X86/fold-broadcast.ll (+18)
diff --git a/llvm/lib/Target/X86/X86InstrAVX512.td b/llvm/lib/Target/X86/X86InstrAVX512.td
index b588f660e2744..70d9b437b5fa9 100644
--- a/llvm/lib/Target/X86/X86InstrAVX512.td
+++ b/llvm/lib/Target/X86/X86InstrAVX512.td
@@ -12721,8 +12721,7 @@ multiclass avx512_dpbf16ps_rm<bits<8> opc, string OpcodeStr, SDNode OpNode,
                   OpcodeStr,
                   !strconcat("${src3}", _.BroadcastStr,", $src2"),
                   !strconcat("$src2, ${src3}", _.BroadcastStr),
-                  (_.VT (OpNode _.RC:$src1, src_v.RC:$src2,
-                  (src_v.VT (src_v.BroadcastLdFrag addr:$src3))))>,
+                  (null_frag)>,
                   EVEX_B, EVEX, VVVV, Sched<[sched.Folded, sched.ReadAfterFold]>;
 
 }
diff --git a/llvm/test/CodeGen/X86/fold-broadcast.ll b/llvm/test/CodeGen/X86/fold-broadcast.ll
new file mode 100644
index 0000000000000..02c26487136ed
--- /dev/null
+++ b/llvm/test/CodeGen/X86/fold-broadcast.ll
@@ -0,0 +1,18 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+; RUN: llc -mtriple=x86_64-unknown-unknown -mattr=+avx512bf16 < %s | FileCheck %s
+
+define <16 x float> @mm512_dpbf16_ps_broadcast_rhs(<16 x float> noundef %acc, <32 x bfloat> noundef %lhs, ptr nocapture noundef readonly %rhs) {
+; CHECK-LABEL: mm512_dpbf16_ps_broadcast_rhs:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    vdpbf16ps (%rdi){1to16}, %zmm1, %zmm0
+; CHECK-NEXT:    retq
+entry:
+  %0 = load float, ptr %rhs, align 4
+  %vecinit.i = insertelement <16 x float> poison, float %0, i64 0
+  %vecinit15.i = shufflevector <16 x float> %vecinit.i, <16 x float> poison, <16 x i32> zeroinitializer
+  %1 = bitcast <16 x float> %vecinit15.i to <32 x bfloat>
+  %2 = tail call <16 x float> @llvm.x86.avx512bf16.dpbf16ps.512(<16 x float> %acc, <32 x bfloat> %lhs, <32 x bfloat> %1)
+  ret <16 x float> %2
+}
+
+declare <16 x float> @llvm.x86.avx512bf16.dpbf16ps.512(<16 x float>, <32 x bfloat>, <32 x bfloat>)
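
The test above loads a single `float`, splats it across 16 lanes, and only then bitcasts the result to `<32 x bfloat>`, while the expected `{1to16}` operand is a broadcast of one 32-bit element. As an illustration only (not part of the patch), the Python sketch below models that byte-level view: repeating a 32-bit word 16 times and reinterpreting it as 32 16-bit lanes yields an alternating pair of halves. The helper name `splat_word_as_halves` is hypothetical, and the sketch does not model the instruction's arithmetic.

```python
import struct


def splat_word_as_halves(word, lanes32=16):
    """Repeat a 4-byte word `lanes32` times, then view the bytes as 16-bit lanes.

    Mirrors the IR in the test: splat a float across 16 lanes, then bitcast
    the 64-byte vector to 32 two-byte elements. Purely illustrative.
    """
    if len(word) != 4:
        raise ValueError("expected exactly one 32-bit element")
    vec = word * lanes32  # 16 copies of the same 32-bit element
    # Reinterpret the same bytes as consecutive 16-bit lanes (the bitcast).
    return [vec[i:i + 2] for i in range(0, len(vec), 2)]


# Any 32-bit pattern works; a little-endian float is used to match the test.
word = struct.pack("<f", 1.5)
lanes = splat_word_as_halves(word)
```

Every even-numbered 16-bit lane holds the low half of the loaded word and every odd-numbered lane holds the high half, so a 32-bit broadcast from memory reproduces the splatted `<32 x bfloat>` operand.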

@bjacob
Contributor

bjacob commented Feb 5, 2024

I'll let the maintainers do the actual review; I just want to say thanks and confirm that yes, the added test does correspond to what #68810 is about. Thanks!

Contributor

@phoebewang phoebewang left a comment

LGTM, thanks!

@KanRobert KanRobert merged commit 115c0c6 into llvm:main Feb 5, 2024
agozillon pushed a commit to agozillon/llvm-project that referenced this pull request Feb 5, 2024