
[AMDGPU][MC][GFX11] Always output wait_vdst and wait_exp #66610


Merged: 1 commit merged into llvm:main on Sep 22, 2023

Conversation

perlfu
Contributor

@perlfu perlfu commented Sep 18, 2023

Always output values of wait_vdst and wait_exp in assembly even when they are zero.

While we normally avoid outputting default/zero parameters in assembly, the values of these parameters still imply wait behaviour when zero. Outputting zero values makes the intent more obvious to human readers, and avoids any future ambiguity if we choose to change the defaults to something other than zero.

Fixes #66383
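
As a rough illustration of the behaviour change described above, the following standalone sketch models the printer before and after the patch. The names `printWaitOld` and `printWaitNew` are hypothetical and are not part of LLVM; the real code writes to a `raw_ostream` and delegates to `printU4ImmDecOperand`. The 4-bit mask here is an assumption suggested by the `U4` in that helper's name.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical model of the pre-patch behaviour: the operand is
// omitted from the printed assembly whenever its value is zero.
std::string printWaitOld(const std::string &Name, uint8_t Imm) {
  std::ostringstream O;
  if (Imm != 0)
    O << ' ' << Name << ':' << static_cast<unsigned>(Imm & 0xF);
  return O.str();
}

// Hypothetical model of the post-patch behaviour: the operand is
// always printed, including an explicit zero.
std::string printWaitNew(const std::string &Name, uint8_t Imm) {
  std::ostringstream O;
  O << ' ' << Name << ':' << static_cast<unsigned>(Imm & 0xF);
  return O.str();
}
```

Under this model, `printWaitOld("wait_vdst", 0)` yields an empty string while `printWaitNew("wait_vdst", 0)` yields ` wait_vdst:0`; the two agree for every non-zero value.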

@llvmbot llvmbot added backend:AMDGPU mc Machine (object) code labels Sep 18, 2023
@llvmbot
Member

llvmbot commented Sep 18, 2023

@llvm/pr-subscribers-mc

@llvm/pr-subscribers-backend-amdgpu

Changes

Always output values of wait_vdst and wait_exp in assembly even when they are zero.

While we normally avoid outputting default/zero parameters in assembly, the values of these parameters still imply wait behaviour when zero. Outputting zero values makes the intent more obvious to human readers, and avoids any future ambiguity if we choose to change the defaults to something other than zero.

Fixes #66383

Patch is 36.70 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/66610.diff

5 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+4-10)
  • (modified) llvm/test/MC/AMDGPU/gfx11_asm_ldsdir.s (+2-2)
  • (modified) llvm/test/MC/AMDGPU/gfx11_asm_vinterp.s (+66-66)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_ldsdir.txt (+4-4)
  • (modified) llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vinterp.txt (+57-57)
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
index bdfffc475c90ae3..2a80296688744a2 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
@@ -532,21 +532,15 @@ void AMDGPUInstPrinter::printDefaultVccOperand(bool FirstOperand,
 void AMDGPUInstPrinter::printWaitVDST(const MCInst *MI, unsigned OpNo,
                                       const MCSubtargetInfo &STI,
                                       raw_ostream &O) {
-  uint8_t Imm = MI->getOperand(OpNo).getImm();
-  if (Imm != 0) {
-    O << " wait_vdst:";
-    printU4ImmDecOperand(MI, OpNo, O);
-  }
+  O << " wait_vdst:";
+  printU4ImmDecOperand(MI, OpNo, O);
 }
 
 void AMDGPUInstPrinter::printWaitEXP(const MCInst *MI, unsigned OpNo,
                                     const MCSubtargetInfo &STI,
                                     raw_ostream &O) {
-  uint8_t Imm = MI->getOperand(OpNo).getImm();
-  if (Imm != 0) {
-    O << " wait_exp:";
-    printU4ImmDecOperand(MI, OpNo, O);
-  }
+  O << " wait_exp:";
+  printU4ImmDecOperand(MI, OpNo, O);
 }
 
 bool AMDGPUInstPrinter::needsImpliedVcc(const MCInstrDesc &Desc,
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_ldsdir.s b/llvm/test/MC/AMDGPU/gfx11_asm_ldsdir.s
index 8a8daab9a3a7e03..9b1ba24053816cc 100644
--- a/llvm/test/MC/AMDGPU/gfx11_asm_ldsdir.s
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_ldsdir.s
@@ -46,10 +46,10 @@ lds_direct_load v15 wait_vdst:1
 // GFX11: lds_direct_load v15 wait_vdst:1  ; encoding: [0x0f,0x00,0x11,0xce]
 
 lds_direct_load v16 wait_vdst:0
-// GFX11: lds_direct_load v16  ; encoding: [0x10,0x00,0x10,0xce]
+// GFX11: lds_direct_load v16 wait_vdst:0  ; encoding: [0x10,0x00,0x10,0xce]
 
 lds_direct_load v17
-// GFX11: lds_direct_load v17  ; encoding: [0x11,0x00,0x10,0xce]
+// GFX11: lds_direct_load v17 wait_vdst:0 ; encoding: [0x11,0x00,0x10,0xce]
 
 lds_param_load v1, attr0.x wait_vdst:15
 // GFX11: lds_param_load v1, attr0.x wait_vdst:15   ; encoding: [0x01,0x00,0x0f,0xce]
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vinterp.s b/llvm/test/MC/AMDGPU/gfx11_asm_vinterp.s
index 0a3396b454b9c0a..e2e53776783f30a 100644
--- a/llvm/test/MC/AMDGPU/gfx11_asm_vinterp.s
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_vinterp.s
@@ -1,31 +1,31 @@
 // RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -show-encoding %s | FileCheck -check-prefix=GFX11 %s
 
 v_interp_p10_f32 v0, v1, v2, v3
-// GFX11: v_interp_p10_f32 v0, v1, v2, v3  ; encoding: [0x00,0x00,0x00,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_f32 v0, v1, v2, v3 wait_exp:0  ; encoding: [0x00,0x00,0x00,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f32 v1, v10, v20, v30
-// GFX11: v_interp_p10_f32 v1, v10, v20, v30  ; encoding: [0x01,0x00,0x00,0xcd,0x0a,0x29,0x7a,0x04]
+// GFX11: v_interp_p10_f32 v1, v10, v20, v30 wait_exp:0  ; encoding: [0x01,0x00,0x00,0xcd,0x0a,0x29,0x7a,0x04]
 
 v_interp_p10_f32 v2, v11, v21, v31
-// GFX11: v_interp_p10_f32 v2, v11, v21, v31  ; encoding: [0x02,0x00,0x00,0xcd,0x0b,0x2b,0x7e,0x04]
+// GFX11: v_interp_p10_f32 v2, v11, v21, v31 wait_exp:0  ; encoding: [0x02,0x00,0x00,0xcd,0x0b,0x2b,0x7e,0x04]
 
 v_interp_p10_f32 v3, v12, v22, v32
-// GFX11: v_interp_p10_f32 v3, v12, v22, v32  ; encoding: [0x03,0x00,0x00,0xcd,0x0c,0x2d,0x82,0x04]
+// GFX11: v_interp_p10_f32 v3, v12, v22, v32 wait_exp:0 ; encoding: [0x03,0x00,0x00,0xcd,0x0c,0x2d,0x82,0x04]
 
 v_interp_p10_f32 v0, v1, v2, v3 clamp
-// GFX11: v_interp_p10_f32 v0, v1, v2, v3 clamp  ; encoding: [0x00,0x80,0x00,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_f32 v0, v1, v2, v3 clamp wait_exp:0 ; encoding: [0x00,0x80,0x00,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f32 v0, -v1, v2, v3
-// GFX11: v_interp_p10_f32 v0, -v1, v2, v3  ; encoding: [0x00,0x00,0x00,0xcd,0x01,0x05,0x0e,0x24]
+// GFX11: v_interp_p10_f32 v0, -v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x00,0xcd,0x01,0x05,0x0e,0x24]
 
 v_interp_p10_f32 v0, v1, -v2, v3
-// GFX11: v_interp_p10_f32 v0, v1, -v2, v3  ; encoding: [0x00,0x00,0x00,0xcd,0x01,0x05,0x0e,0x44]
+// GFX11: v_interp_p10_f32 v0, v1, -v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x00,0xcd,0x01,0x05,0x0e,0x44]
 
 v_interp_p10_f32 v0, v1, v2, -v3
-// GFX11: v_interp_p10_f32 v0, v1, v2, -v3  ; encoding: [0x00,0x00,0x00,0xcd,0x01,0x05,0x0e,0x84]
+// GFX11: v_interp_p10_f32 v0, v1, v2, -v3 wait_exp:0 ; encoding: [0x00,0x00,0x00,0xcd,0x01,0x05,0x0e,0x84]
 
 v_interp_p10_f32 v0, v1, v2, v3 wait_exp:0
-// GFX11: v_interp_p10_f32 v0, v1, v2, v3  ; encoding: [0x00,0x00,0x00,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_f32 v0, v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x00,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f32 v0, v1, v2, v3 wait_exp:1
 // GFX11: v_interp_p10_f32 v0, v1, v2, v3 wait_exp:1 ; encoding: [0x00,0x01,0x00,0xcd,0x01,0x05,0x0e,0x04]
@@ -37,31 +37,31 @@ v_interp_p10_f32 v0, v1, v2, v3 clamp wait_exp:7
 // GFX11: v_interp_p10_f32 v0, v1, v2, v3 clamp wait_exp:7 ; encoding: [0x00,0x87,0x00,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f32 v0, v1, v2, v3
-// GFX11: v_interp_p2_f32 v0, v1, v2, v3  ; encoding: [0x00,0x00,0x01,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p2_f32 v0, v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x01,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f32 v1, v10, v20, v30
-// GFX11: v_interp_p2_f32 v1, v10, v20, v30  ; encoding: [0x01,0x00,0x01,0xcd,0x0a,0x29,0x7a,0x04]
+// GFX11: v_interp_p2_f32 v1, v10, v20, v30 wait_exp:0 ; encoding: [0x01,0x00,0x01,0xcd,0x0a,0x29,0x7a,0x04]
 
 v_interp_p2_f32 v2, v11, v21, v31
-// GFX11: v_interp_p2_f32 v2, v11, v21, v31  ; encoding: [0x02,0x00,0x01,0xcd,0x0b,0x2b,0x7e,0x04]
+// GFX11: v_interp_p2_f32 v2, v11, v21, v31 wait_exp:0 ; encoding: [0x02,0x00,0x01,0xcd,0x0b,0x2b,0x7e,0x04]
 
 v_interp_p2_f32 v3, v12, v22, v32
-// GFX11: v_interp_p2_f32 v3, v12, v22, v32  ; encoding: [0x03,0x00,0x01,0xcd,0x0c,0x2d,0x82,0x04]
+// GFX11: v_interp_p2_f32 v3, v12, v22, v32 wait_exp:0 ; encoding: [0x03,0x00,0x01,0xcd,0x0c,0x2d,0x82,0x04]
 
 v_interp_p2_f32 v0, v1, v2, v3 clamp
-// GFX11: v_interp_p2_f32 v0, v1, v2, v3 clamp  ; encoding: [0x00,0x80,0x01,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p2_f32 v0, v1, v2, v3 clamp wait_exp:0 ; encoding: [0x00,0x80,0x01,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f32 v0, -v1, v2, v3
-// GFX11: v_interp_p2_f32 v0, -v1, v2, v3  ; encoding: [0x00,0x00,0x01,0xcd,0x01,0x05,0x0e,0x24]
+// GFX11: v_interp_p2_f32 v0, -v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x01,0xcd,0x01,0x05,0x0e,0x24]
 
 v_interp_p2_f32 v0, v1, -v2, v3
-// GFX11: v_interp_p2_f32 v0, v1, -v2, v3  ; encoding: [0x00,0x00,0x01,0xcd,0x01,0x05,0x0e,0x44]
+// GFX11: v_interp_p2_f32 v0, v1, -v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x01,0xcd,0x01,0x05,0x0e,0x44]
 
 v_interp_p2_f32 v0, v1, v2, -v3
-// GFX11: v_interp_p2_f32 v0, v1, v2, -v3  ; encoding: [0x00,0x00,0x01,0xcd,0x01,0x05,0x0e,0x84]
+// GFX11: v_interp_p2_f32 v0, v1, v2, -v3 wait_exp:0 ; encoding: [0x00,0x00,0x01,0xcd,0x01,0x05,0x0e,0x84]
 
 v_interp_p2_f32 v0, v1, v2, v3 wait_exp:0
-// GFX11: v_interp_p2_f32 v0, v1, v2, v3  ; encoding: [0x00,0x00,0x01,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p2_f32 v0, v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x01,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f32 v0, v1, v2, v3 wait_exp:1
 // GFX11: v_interp_p2_f32 v0, v1, v2, v3 wait_exp:1 ; encoding: [0x00,0x01,0x01,0xcd,0x01,0x05,0x0e,0x04]
@@ -73,22 +73,22 @@ v_interp_p2_f32 v0, v1, v2, v3 clamp wait_exp:7
 // GFX11: v_interp_p2_f32 v0, v1, v2, v3 clamp wait_exp:7 ; encoding: [0x00,0x87,0x01,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f16_f32 v0, v1, v2, v3
-// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3  ; encoding: [0x00,0x00,0x02,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x02,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f16_f32 v0, -v1, v2, v3
-// GFX11: v_interp_p10_f16_f32 v0, -v1, v2, v3  ; encoding: [0x00,0x00,0x02,0xcd,0x01,0x05,0x0e,0x24]
+// GFX11: v_interp_p10_f16_f32 v0, -v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x02,0xcd,0x01,0x05,0x0e,0x24]
 
 v_interp_p10_f16_f32 v0, v1, -v2, v3
-// GFX11: v_interp_p10_f16_f32 v0, v1, -v2, v3  ; encoding: [0x00,0x00,0x02,0xcd,0x01,0x05,0x0e,0x44]
+// GFX11: v_interp_p10_f16_f32 v0, v1, -v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x02,0xcd,0x01,0x05,0x0e,0x44]
 
 v_interp_p10_f16_f32 v0, v1, v2, -v3
-// GFX11: v_interp_p10_f16_f32 v0, v1, v2, -v3  ; encoding: [0x00,0x00,0x02,0xcd,0x01,0x05,0x0e,0x84]
+// GFX11: v_interp_p10_f16_f32 v0, v1, v2, -v3 wait_exp:0 ; encoding: [0x00,0x00,0x02,0xcd,0x01,0x05,0x0e,0x84]
 
 v_interp_p10_f16_f32 v0, v1, v2, v3 clamp
-// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 clamp ; encoding: [0x00,0x80,0x02,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 clamp wait_exp:0 ; encoding: [0x00,0x80,0x02,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f16_f32 v0, v1, v2, v3 wait_exp:0
-// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3  ; encoding: [0x00,0x00,0x02,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x02,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f16_f32 v0, v1, v2, v3 wait_exp:1
 // GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 wait_exp:1 ; encoding: [0x00,0x01,0x02,0xcd,0x01,0x05,0x0e,0x04]
@@ -97,22 +97,22 @@ v_interp_p10_f16_f32 v0, v1, v2, v3 wait_exp:7
 // GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 wait_exp:7 ; encoding: [0x00,0x07,0x02,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[0,0,0,0]
-// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3  ; encoding: [0x00,0x00,0x02,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x02,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,0]
-// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,0] ; encoding: [0x00,0x08,0x02,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,0] wait_exp:0 ; encoding: [0x00,0x08,0x02,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[0,1,0,0]
-// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[0,1,0,0] ; encoding: [0x00,0x10,0x02,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[0,1,0,0] wait_exp:0 ; encoding: [0x00,0x10,0x02,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[0,0,1,0]
-// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[0,0,1,0] ; encoding: [0x00,0x20,0x02,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[0,0,1,0] wait_exp:0 ; encoding: [0x00,0x20,0x02,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[0,0,0,1]
-// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[0,0,0,1] ; encoding: [0x00,0x40,0x02,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[0,0,0,1] wait_exp:0 ; encoding: [0x00,0x40,0x02,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[1,1,1,1]
-// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[1,1,1,1] ; encoding: [0x00,0x78,0x02,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[1,1,1,1] wait_exp:0 ; encoding: [0x00,0x78,0x02,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,1] wait_exp:5
 // GFX11: v_interp_p10_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,1] wait_exp:5 ; encoding: [0x00,0x4d,0x02,0xcd,0x01,0x05,0x0e,0x04]
@@ -124,22 +124,22 @@ v_interp_p10_f16_f32 v0, -v1, -v2, -v3 clamp op_sel:[1,0,0,1] wait_exp:5
 // GFX11: v_interp_p10_f16_f32 v0, -v1, -v2, -v3 clamp op_sel:[1,0,0,1] wait_exp:5 ; encoding: [0x00,0xcd,0x02,0xcd,0x01,0x05,0x0e,0xe4]
 
 v_interp_p2_f16_f32 v0, v1, v2, v3
-// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3  ; encoding: [0x00,0x00,0x03,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x03,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f16_f32 v0, -v1, v2, v3
-// GFX11: v_interp_p2_f16_f32 v0, -v1, v2, v3  ; encoding: [0x00,0x00,0x03,0xcd,0x01,0x05,0x0e,0x24]
+// GFX11: v_interp_p2_f16_f32 v0, -v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x03,0xcd,0x01,0x05,0x0e,0x24]
 
 v_interp_p2_f16_f32 v0, v1, -v2, v3
-// GFX11: v_interp_p2_f16_f32 v0, v1, -v2, v3  ; encoding: [0x00,0x00,0x03,0xcd,0x01,0x05,0x0e,0x44]
+// GFX11: v_interp_p2_f16_f32 v0, v1, -v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x03,0xcd,0x01,0x05,0x0e,0x44]
 
 v_interp_p2_f16_f32 v0, v1, v2, -v3
-// GFX11: v_interp_p2_f16_f32 v0, v1, v2, -v3  ; encoding: [0x00,0x00,0x03,0xcd,0x01,0x05,0x0e,0x84]
+// GFX11: v_interp_p2_f16_f32 v0, v1, v2, -v3 wait_exp:0 ; encoding: [0x00,0x00,0x03,0xcd,0x01,0x05,0x0e,0x84]
 
 v_interp_p2_f16_f32 v0, v1, v2, v3 clamp
-// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 clamp ; encoding: [0x00,0x80,0x03,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 clamp wait_exp:0 ; encoding: [0x00,0x80,0x03,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f16_f32 v0, v1, v2, v3 wait_exp:0
-// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3  ; encoding: [0x00,0x00,0x03,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x03,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f16_f32 v0, v1, v2, v3 wait_exp:1
 // GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 wait_exp:1 ; encoding: [0x00,0x01,0x03,0xcd,0x01,0x05,0x0e,0x04]
@@ -148,22 +148,22 @@ v_interp_p2_f16_f32 v0, v1, v2, v3 wait_exp:7
 // GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 wait_exp:7 ; encoding: [0x00,0x07,0x03,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[0,0,0,0]
-// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3  ; encoding: [0x00,0x00,0x03,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x03,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,0]
-// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,0] ; encoding: [0x00,0x08,0x03,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,0] wait_exp:0 ; encoding: [0x00,0x08,0x03,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[0,1,0,0]
-// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[0,1,0,0] ; encoding: [0x00,0x10,0x03,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[0,1,0,0] wait_exp:0 ; encoding: [0x00,0x10,0x03,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[0,0,1,0]
-// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[0,0,1,0] ; encoding: [0x00,0x20,0x03,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[0,0,1,0] wait_exp:0 ; encoding: [0x00,0x20,0x03,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[0,0,0,1]
-// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[0,0,0,1] ; encoding: [0x00,0x40,0x03,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[0,0,0,1] wait_exp:0 ; encoding: [0x00,0x40,0x03,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[1,1,1,1]
-// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[1,1,1,1] ; encoding: [0x00,0x78,0x03,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[1,1,1,1] wait_exp:0 ; encoding: [0x00,0x78,0x03,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,1] wait_exp:5
 // GFX11: v_interp_p2_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,1] wait_exp:5 ; encoding: [0x00,0x4d,0x03,0xcd,0x01,0x05,0x0e,0x04]
@@ -175,22 +175,22 @@ v_interp_p2_f16_f32 v0, -v1, -v2, -v3 clamp op_sel:[1,0,0,1] wait_exp:5
 // GFX11: v_interp_p2_f16_f32 v0, -v1, -v2, -v3 clamp op_sel:[1,0,0,1] wait_exp:5 ; encoding: [0x00,0xcd,0x03,0xcd,0x01,0x05,0x0e,0xe4]
 
 v_interp_p10_rtz_f16_f32 v0, v1, v2, v3
-// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 ; encoding: [0x00,0x00,0x04,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x04,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_rtz_f16_f32 v0, -v1, v2, v3
-// GFX11: v_interp_p10_rtz_f16_f32 v0, -v1, v2, v3 ; encoding: [0x00,0x00,0x04,0xcd,0x01,0x05,0x0e,0x24]
+// GFX11: v_interp_p10_rtz_f16_f32 v0, -v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x04,0xcd,0x01,0x05,0x0e,0x24]
 
 v_interp_p10_rtz_f16_f32 v0, v1, -v2, v3
-// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, -v2, v3 ; encoding: [0x00,0x00,0x04,0xcd,0x01,0x05,0x0e,0x44]
+// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, -v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x04,0xcd,0x01,0x05,0x0e,0x44]
 
 v_interp_p10_rtz_f16_f32 v0, v1, v2, -v3
-// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, -v3 ; encoding: [0x00,0x00,0x04,0xcd,0x01,0x05,0x0e,0x84]
+// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, -v3 wait_exp:0 ; encoding: [0x00,0x00,0x04,0xcd,0x01,0x05,0x0e,0x84]
 
 v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 clamp
-// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 clamp ; encoding: [0x00,0x80,0x04,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 clamp wait_exp:0 ; encoding: [0x00,0x80,0x04,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 wait_exp:0
-// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 ; encoding: [0x00,0x00,0x04,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x04,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 wait_exp:1
 // GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 wait_exp:1 ; encoding: [0x00,0x01,0x04,0xcd,0x01,0x05,0x0e,0x04]
@@ -199,22 +199,22 @@ v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 wait_exp:7
 // GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 wait_exp:7 ; encoding: [0x00,0x07,0x04,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[0,0,0,0]
-// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 ; encoding: [0x00,0x00,0x04,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x04,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,0]
-// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,0] ; encoding: [0x00,0x08,0x04,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,0] wait_exp:0 ; encoding: [0x00,0x08,0x04,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[0,1,0,0]
-// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[0,1,0,0] ; encoding: [0x00,0x10,0x04,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[0,1,0,0] wait_exp:0 ; encoding: [0x00,0x10,0x04,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[0,0,1,0]
-// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[0,0,1,0] ; encoding: [0x00,0x20,0x04,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[0,0,1,0] wait_exp:0 ; encoding: [0x00,0x20,0x04,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[0,0,0,1]
-// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[0,0,0,1] ; encoding: [0x00,0x40,0x04,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[0,0,0,1] wait_exp:0 ; encoding: [0x00,0x40,0x04,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[1,1,1,1]
-// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[1,1,1,1] ; encoding: [0x00,0x78,0x04,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[1,1,1,1] wait_exp:0 ; encoding: [0x00,0x78,0x04,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,1] wait_exp:5
 // GFX11: v_interp_p10_rtz_f16_f32 v0, v1, v2, v3 op_sel:[1,0,0,1] wait_exp:5 ; encoding: [0x00,0x4d,0x04,0xcd,0x01,0x05,0x0e,0x04]
@@ -226,22 +226,22 @@ v_interp_p10_rtz_f16_f32 v0, -v1, -v2, -v3 clamp op_sel:[1,0,0,1] wait_exp:5
 // GFX11: v_interp_p10_rtz_f16_f32 v0, -v1, -v2, -v3 clamp op_sel:[1,0,0,1] wait_exp:5 ; encoding: [0x00,0xcd,0x04,0xcd,0x01,0x05,0x0e,0xe4]
 
 v_interp_p2_rtz_f16_f32 v0, v1, v2, v3
-// GFX11: v_interp_p2_rtz_f16_f32 v0, v1, v2, v3  ; encoding: [0x00,0x00,0x05,0xcd,0x01,0x05,0x0e,0x04]
+// GFX11: v_interp_p2_rtz_f16_f32 v0, v1, v2, v3 wait_exp:0 ; encoding: [0x00,0x00,0x05,0xcd,0x01,0x05,0x0e,0x04]
 
 v_interp_p2_rtz_f16_f32 v0, -v1, v2, v3
-// GFX11: v_interp_p2_rtz_f16_f32 v0, -v1, v2, v3 ; encoding: [0x00,0x00,0x05,0xcd,0x01,0x05,0x0e,0x24]
+// GFX11: v_interp_p2_rtz_f16_f32 v0, -v1, v2, v3 wait_exp:0 ...
[truncated]
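
The VINTERP test vectors above suggest how `wait_exp`, `op_sel` and `clamp` share the second encoding byte. The sketch below is inferred purely from those expected encodings, not from the ISA manual, and `packVinterpByte1` is a hypothetical helper name. It shows that `wait_exp:0` contributes no bits, which is why printing it explicitly leaves every encoding in the tests unchanged.

```cpp
#include <cstdint>

// Layout inferred from the GFX11 test vectors in this patch (an assumption):
//   bits 2:0  wait_exp
//   bits 6:3  op_sel[0..3] (OpSel bit i corresponds to op_sel[i])
//   bit  7    clamp
constexpr uint8_t packVinterpByte1(unsigned WaitExp, unsigned OpSel,
                                   bool Clamp) {
  return static_cast<uint8_t>((WaitExp & 0x7u) | ((OpSel & 0xFu) << 3) |
                              (Clamp ? 0x80u : 0u));
}

// Cross-checks against second bytes that appear in the tests above.
static_assert(packVinterpByte1(0, 0x0, false) == 0x00, "no modifiers");
static_assert(packVinterpByte1(1, 0x0, false) == 0x01, "wait_exp:1");
static_assert(packVinterpByte1(7, 0x0, true) == 0x87, "clamp wait_exp:7");
static_assert(packVinterpByte1(0, 0x1, false) == 0x08, "op_sel:[1,0,0,0]");
static_assert(packVinterpByte1(0, 0xF, false) == 0x78, "op_sel:[1,1,1,1]");
static_assert(packVinterpByte1(5, 0x9, false) == 0x4d,
              "op_sel:[1,0,0,1] wait_exp:5");
static_assert(packVinterpByte1(5, 0x9, true) == 0xcd,
              "clamp op_sel:[1,0,0,1] wait_exp:5");
```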

@dstutt
Collaborator

dstutt commented Sep 18, 2023

This makes sense to me.

Contributor

@jayfoad jayfoad left a comment


LGTM. I don't think the SP3 disassembler does this, but it'll still assemble the same so that doesn't matter.
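
The point that both spellings assemble the same can be pictured with a toy model of optional-operand parsing. This is only a sketch, not LLVM's AsmParser: `parseOptionalOperand` is a hypothetical helper, and it assumes an omitted operand defaults to zero, as the `lds_direct_load v17` test in this patch indicates.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the decimal value following "Name:" in an
// assembly line, or 0 when the operand is absent (the assumed default).
unsigned parseOptionalOperand(const std::string &Asm,
                              const std::string &Name) {
  const std::string Key = Name + ":";
  const std::size_t Pos = Asm.find(Key);
  if (Pos == std::string::npos)
    return 0; // Omitted operand: fall back to the default of zero.
  // std::stoul stops at the first non-digit character after the value.
  return static_cast<unsigned>(std::stoul(Asm.substr(Pos + Key.size())));
}
```

With this model, `v_interp_p10_f32 v0, v1, v2, v3` and `v_interp_p10_f32 v0, v1, v2, v3 wait_exp:0` both produce a `wait_exp` value of 0, so a disassembler that omits the zero and one that prints it round-trip to the same encoding.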

Contributor

@Sisyph Sisyph left a comment


LGTM. It seems more clear.

@perlfu perlfu merged commit 6ebc179 into llvm:main Sep 22, 2023
@perlfu perlfu deleted the always-output-wait-vdstexp branch September 22, 2023 00:25
Labels: backend:AMDGPU mc Machine (object) code

Linked issue: [AMDGPU] Print wait_exp:0 for VINTERP and wait_vdst:0 for LDSDIR

5 participants