Skip to content

[AMDGPU] Fix getAsmVOP3Base call agruments. #144572

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

rampitec
Copy link
Collaborator

#143465 has removed
getAsmVOP3OpSel and uses getAsmVOP3Base instead, but original
call to getAsmVOP3OpSel was using HasSrcFloatMods and the
call to getAsmVOP3Base uses HasSrc
Mods. This does not play
well with opsel. An opsel instruction has modifiers in dag but
shall not have them in the asm string.

Copy link
Collaborator Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@rampitec rampitec marked this pull request as ready for review June 17, 2025 17:48
@llvmbot
Copy link
Member

llvmbot commented Jun 17, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

#143465 has removed
getAsmVOP3OpSel and uses getAsmVOP3Base instead, but original
call to getAsmVOP3OpSel was using HasSrcFloatMods and the
call to getAsmVOP3Base uses HasSrc
Mods. This does not play
well with opsel. An opsel instruction has modifiers in dag but
shall not have them in the asm string.


Full diff: https://github.com/llvm/llvm-project/pull/144572.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+2-2)
  • (modified) llvm/lib/Target/AMDGPU/VOP3Instructions.td (+8-2)
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.td b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
index e74ccbee975ab..1350afad52781 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.td
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
@@ -2648,8 +2648,8 @@ class VOPProfile <list<ValueType> _ArgVT, bit _EnableClamp = 0> {
   // the asm operand name via this HasModifiers flag
   field string AsmDPP8 = getAsmDPP8<HasDst, NumSrcArgs, 0 /*HasModifiers*/, DstVT>.ret;
   field string AsmVOP3Base = getAsmVOP3Base<NumSrcArgs, HasDst, HasClamp,
-   HasOpSel, HasOMod, IsVOP3P, HasNeg, HasSrc0Mods, HasSrc1Mods,
-   HasSrc2Mods, DstVT, HasFP8ByteSel, HasBitOp3>.ret;
+   HasOpSel, HasOMod, IsVOP3P, HasNeg, HasSrc0FloatMods, HasSrc1FloatMods,
+   HasSrc2FloatMods, DstVT, HasFP8ByteSel, HasBitOp3>.ret;
   field string Asm64 = AsmVOP3Base;
   field string AsmVOP3P = getAsmVOP3P<HasDst, NumSrcArgs, HasNeg, HasClamp, HasOpSel>.ret;
   field string AsmVOP3OpSel = AsmVOP3Base;
diff --git a/llvm/lib/Target/AMDGPU/VOP3Instructions.td b/llvm/lib/Target/AMDGPU/VOP3Instructions.td
index f372101cb7b77..2dbc119f65cda 100644
--- a/llvm/lib/Target/AMDGPU/VOP3Instructions.td
+++ b/llvm/lib/Target/AMDGPU/VOP3Instructions.td
@@ -1126,6 +1126,9 @@ class VOP3_CVT_SCALEF32_PK_F864_Profile<VOPProfile P> : VOP3_Profile<P> {
   let HasModifiers = 0;
   let HasSrc0IntMods = 0;
   let HasSrc1IntMods = 0;
+  let HasSrc0FloatMods = 0;
+  let HasSrc1FloatMods = 0;
+  let HasSrc2FloatMods = 0;
   let HasOMod = 0;
   let HasOpSel = 0;
   let HasClamp = 0;
@@ -1562,9 +1565,12 @@ let SubtargetPredicate = HasPseudoScalarTrans in {
   def : PseudoScalarPatF16<any_amdgcn_sqrt, V_S_SQRT_F16_e64>;
 }
 
+let HasModifiers = 1 in
+def ASHR_PK_I8_Profile : VOP3_Profile<VOP_I16_I32_I32_I32, VOP3_OPSEL_ONLY>;
+
 let SubtargetPredicate = HasAshrPkInsts, isReMaterializable = 1 in {
-  defm V_ASHR_PK_I8_I32 : VOP3Inst<"v_ashr_pk_i8_i32", VOP3_Profile<VOP_I16_I32_I32_I32, VOP3_OPSEL_ONLY>, int_amdgcn_ashr_pk_i8_i32>;
-  defm V_ASHR_PK_U8_I32 : VOP3Inst<"v_ashr_pk_u8_i32", VOP3_Profile<VOP_I16_I32_I32_I32, VOP3_OPSEL_ONLY>, int_amdgcn_ashr_pk_u8_i32>;
+  defm V_ASHR_PK_I8_I32 : VOP3Inst<"v_ashr_pk_i8_i32", ASHR_PK_I8_Profile, int_amdgcn_ashr_pk_i8_i32>;
+  defm V_ASHR_PK_U8_I32 : VOP3Inst<"v_ashr_pk_u8_i32", ASHR_PK_I8_Profile, int_amdgcn_ashr_pk_u8_i32>;
 } // End SubtargetPredicate = HasAshrPkInsts, isReMaterializable = 1
 
 class AshrPkI8Pat<VOP3_Pseudo inst, int lo, int hi>: GCNPat<

#143465 has removed
getAsmVOP3OpSel and uses getAsmVOP3Base instead, but original
call to getAsmVOP3OpSel was using HasSrc*FloatMods and the
call to getAsmVOP3Base uses HasSrc*Mods. This does not play
well with opsel. An opsel instruction has modifiers in dag but
shall not have them in the asm string.
@rampitec rampitec force-pushed the users/rampitec/06-17-_amdgpu_fix_getasmvop3base_call_agruments branch 2 times, most recently from 3187905 to 59cf075 Compare June 17, 2025 19:55
Copy link
Contributor

@Sisyph Sisyph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@broxigarchen
Copy link
Contributor

I thought HasSrcMods is for mod in asm string, but it's actually for dag? If this is the case, it surpises me that none of the vop3 insts hit this issue in upstream

@rampitec
Copy link
Collaborator Author

I thought HasSrcMods is for mod in asm string, but it's actually for dag? If this is the case, it surpises me that none of the vop3 insts hit this issue in upstream

Yes, somehow we got lucky.

@rampitec rampitec merged commit 8dcf4ba into main Jun 17, 2025
7 checks passed
@rampitec rampitec deleted the users/rampitec/06-17-_amdgpu_fix_getasmvop3base_call_agruments branch June 17, 2025 20:30
@shiltian
Copy link
Contributor

I thought HasSrcMods is for mod in asm string, but it's actually for dag? If this is the case, it surpises me that none of the vop3 insts hit this issue in upstream

I thought the same thing as well…I think I did run into some issues with src mods in some downstream effort.

@broxigarchen
Copy link
Contributor

I thought HasSrcMods is for mod in asm string, but it's actually for dag? If this is the case, it surpises me that none of the vop3 insts hit this issue in upstream

Yes, somehow we got lucky.

Thanks. Now I see what happened

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants