[AMDGPU][SDAG] Test ISD::PTRADD handling in VOP3 patterns #143880

Open
wants to merge 1 commit into
base: users/ritter-x2a/06-11-_amdgpu_sdag_add_target-specific_isd_ptradd_combines

ritter-x2a
Member

Pre-committing tests to show improvements in a follow-up PR.

@llvmbot
Member

llvmbot commented Jun 12, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Fabian Ritter (ritter-x2a)

Changes

Pre-committing tests to show improvements in a follow-up PR.


Full diff: https://github.com/llvm/llvm-project/pull/143880.diff

1 Files Affected:

  • (modified) llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll (+45)
diff --git a/llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll b/llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll
index c00bccdbce6b7..d48bfe0bb7f21 100644
--- a/llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll
+++ b/llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll
@@ -263,3 +263,48 @@ define amdgpu_kernel void @fold_mad64(ptr addrspace(1) %p) {
   store float 1.0, ptr addrspace(1) %p1
   ret void
 }
+
+; Use non-zero shift amounts in v_lshl_add_u64.
+define ptr @select_v_lshl_add_u64(ptr %base, i64 %voffset) {
+; GFX942_PTRADD-LABEL: select_v_lshl_add_u64:
+; GFX942_PTRADD:       ; %bb.0:
+; GFX942_PTRADD-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942_PTRADD-NEXT:    v_lshlrev_b64 v[2:3], 3, v[2:3]
+; GFX942_PTRADD-NEXT:    v_lshl_add_u64 v[0:1], v[0:1], 0, v[2:3]
+; GFX942_PTRADD-NEXT:    s_setpc_b64 s[30:31]
+;
+; GFX942_LEGACY-LABEL: select_v_lshl_add_u64:
+; GFX942_LEGACY:       ; %bb.0:
+; GFX942_LEGACY-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942_LEGACY-NEXT:    v_lshl_add_u64 v[0:1], v[2:3], 3, v[0:1]
+; GFX942_LEGACY-NEXT:    s_setpc_b64 s[30:31]
+  %gep = getelementptr inbounds i64, ptr %base, i64 %voffset
+  ret ptr %gep
+}
+
+; Fold mul and add into v_mad, even if amdgpu-codegenprepare-mul24 turned the
+; mul into a mul24.
+define ptr @fold_mul24_into_mad(ptr %base, i64 %a, i64 %b) {
+; GFX942_PTRADD-LABEL: fold_mul24_into_mad:
+; GFX942_PTRADD:       ; %bb.0:
+; GFX942_PTRADD-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942_PTRADD-NEXT:    v_and_b32_e32 v2, 0xfffff, v2
+; GFX942_PTRADD-NEXT:    v_and_b32_e32 v4, 0xfffff, v4
+; GFX942_PTRADD-NEXT:    v_mul_hi_u32_u24_e32 v3, v2, v4
+; GFX942_PTRADD-NEXT:    v_mul_u32_u24_e32 v2, v2, v4
+; GFX942_PTRADD-NEXT:    v_lshl_add_u64 v[0:1], v[0:1], 0, v[2:3]
+; GFX942_PTRADD-NEXT:    s_setpc_b64 s[30:31]
+;
+; GFX942_LEGACY-LABEL: fold_mul24_into_mad:
+; GFX942_LEGACY:       ; %bb.0:
+; GFX942_LEGACY-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942_LEGACY-NEXT:    v_and_b32_e32 v2, 0xfffff, v2
+; GFX942_LEGACY-NEXT:    v_and_b32_e32 v3, 0xfffff, v4
+; GFX942_LEGACY-NEXT:    v_mad_u64_u32 v[0:1], s[0:1], v2, v3, v[0:1]
+; GFX942_LEGACY-NEXT:    s_setpc_b64 s[30:31]
+  %a_masked = and i64 %a, u0xfffff
+  %b_masked = and i64 %b, u0xfffff
+  %mul = mul i64 %a_masked, %b_masked
+  %gep = getelementptr inbounds i8, ptr %base, i64 %mul
+  ret ptr %gep
+}
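
For readers comparing the two sets of check lines above, the following is a minimal sketch, not part of the PR, that models the arithmetic of the instructions involved and checks that the GFX942_PTRADD and GFX942_LEGACY sequences compute the same pointer. The Python function names mirror the instruction mnemonics but are hypothetical helpers, and the modeled semantics are a simplified reading of the ISA (64-bit wrap-around, 24-bit operand truncation for the `u24` multiplies), stated here as an assumption.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1
MASK24 = (1 << 24) - 1


# Simplified models of the instructions in the check lines (assumed semantics).
def v_lshl_add_u64(src0, shift, src2):
    """(src0 << shift) + src2, truncated to 64 bits."""
    return ((src0 << shift) + src2) & MASK64


def v_lshlrev_b64(shift, src):
    """src << shift, truncated to 64 bits (shift amount is the first operand)."""
    return (src << shift) & MASK64


def v_mad_u64_u32(a, b, c):
    """32-bit x 32-bit unsigned multiply plus a 64-bit addend."""
    return ((a & MASK32) * (b & MASK32) + c) & MASK64


def v_mul_u32_u24(a, b):
    """Low 32 bits of the 24-bit x 24-bit unsigned product."""
    return ((a & MASK24) * (b & MASK24)) & MASK32


def v_mul_hi_u32_u24(a, b):
    """Bits above bit 31 of the 24-bit x 24-bit unsigned product."""
    return (((a & MASK24) * (b & MASK24)) >> 32) & MASK32


# select_v_lshl_add_u64: getelementptr i64, i.e. base + (voffset << 3).
def select_ptradd(base, voffset):
    scaled = v_lshlrev_b64(3, voffset)        # separate shift
    return v_lshl_add_u64(base, 0, scaled)    # add with shift amount 0


def select_legacy(base, voffset):
    return v_lshl_add_u64(voffset, 3, base)   # shift folded into the add


# fold_mul24_into_mad: base + (a & 0xfffff) * (b & 0xfffff).
def fold_ptradd(base, a, b):
    a_lo = a & 0xFFFFF
    b_lo = b & 0xFFFFF
    hi = v_mul_hi_u32_u24(a_lo, b_lo)
    lo = v_mul_u32_u24(a_lo, b_lo)
    return v_lshl_add_u64(base, 0, (hi << 32) | lo)


def fold_legacy(base, a, b):
    return v_mad_u64_u32(a & 0xFFFFF, b & 0xFFFFF, base)


if __name__ == "__main__":
    for base, x, y in [(0x1000, 5, 7), (MASK64, 1, 3), (0x2000, 0x12345678, 0xABCDEF)]:
        assert select_ptradd(base, x) == select_legacy(base, x)
        assert fold_ptradd(base, x, y) == fold_legacy(base, x, y)
```

Under this model both variants are functionally equivalent. The difference the tests record is instruction count: with ISD::PTRADD the shift and the multiply are currently emitted as separate instructions instead of being folded into `v_lshl_add_u64` and `v_mad_u64_u32`.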

@ritter-x2a ritter-x2a requested review from arsenm and shiltian June 12, 2025 12:08
@ritter-x2a ritter-x2a marked this pull request as ready for review June 12, 2025 12:10
@ritter-x2a ritter-x2a force-pushed the users/ritter-x2a/06-11-_amdgpu_sdag_add_target-specific_isd_ptradd_combines branch from 11bd2c5 to a3d204e Compare June 13, 2025 09:02
@ritter-x2a ritter-x2a force-pushed the users/ritter-x2a/06-12-_amdgpu_sdag_test_isd_ptradd_handling_in_vop3_patterns branch 2 times, most recently from 99d65b3 to 78abf55 Compare June 13, 2025 12:06
@ritter-x2a ritter-x2a force-pushed the users/ritter-x2a/06-11-_amdgpu_sdag_add_target-specific_isd_ptradd_combines branch 2 times, most recently from 50de6e0 to 7eb2283 Compare June 13, 2025 12:12
@ritter-x2a ritter-x2a force-pushed the users/ritter-x2a/06-12-_amdgpu_sdag_test_isd_ptradd_handling_in_vop3_patterns branch from 78abf55 to 69fddba Compare June 13, 2025 12:12
@ritter-x2a ritter-x2a force-pushed the users/ritter-x2a/06-11-_amdgpu_sdag_add_target-specific_isd_ptradd_combines branch from 7eb2283 to 88860bc Compare June 13, 2025 13:28
@ritter-x2a ritter-x2a force-pushed the users/ritter-x2a/06-12-_amdgpu_sdag_test_isd_ptradd_handling_in_vop3_patterns branch from 69fddba to b1a78b2 Compare June 13, 2025 13:28
@ritter-x2a ritter-x2a force-pushed the users/ritter-x2a/06-11-_amdgpu_sdag_add_target-specific_isd_ptradd_combines branch from 88860bc to 10494be Compare June 13, 2025 14:05
Pre-committing tests to show improvements in a follow-up PR.
@ritter-x2a ritter-x2a force-pushed the users/ritter-x2a/06-12-_amdgpu_sdag_test_isd_ptradd_handling_in_vop3_patterns branch from b1a78b2 to 3f69917 Compare June 13, 2025 14:05
3 participants