Skip to content

Add JIT/VM support for the approved but NYI xarch isas #116230

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jun 7, 2025

Conversation

tannergooding
Copy link
Member

@tannergooding tannergooding commented Jun 2, 2025

This adds the JIT/VM support for the various xarch ISAs that have been approved but NYI. This does not actually add the intrinsic IDs or managed support required for their use.

This is being put up to avoid needing to churn the JIT/EE version guid multiple times and induce many conflicts for the pending work.

@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 2, 2025
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@tannergooding tannergooding marked this pull request as ready for review June 3, 2025 17:16
@Copilot Copilot AI review requested due to automatic review settings June 3, 2025 17:16
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for several new xarch instruction set extensions in both the JIT and VM. Key changes include updates to global CPU feature variables to 64‑bit types, modifications to intrinsic and configuration handling (including new configuration flags), and adjustments to emitter and instruction encoding functions to support the new ISAs.

Reviewed Changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/coreclr/nativeaot/Runtime/startup.cpp Updated global CPU feature variables to use int64_t
src/coreclr/jit/lowerxarch.cpp Updated isEmbeddedBroadcastCompatibleHWIntrinsic calls to pass a Compiler* parameter
src/coreclr/jit/gentree.{h,cpp} Updated isEmbeddedBroadcastCompatibleHWIntrinsic signature to accept a Compiler* parameter
src/coreclr/jit/emitxarch.{h,cpp} Refactored intrinsic encoding functions (merging several checks into Is3OpRmwInstruction)
src/coreclr/jit/compiler.cpp Remapped JIT configuration flags to new ISA support and updated default values
src/coreclr/jit/emit.h Adjusted emitter bitfield sizes to support an increased instruction count
src/coreclr/inc/{jitconfigvalues.h,clrconfigvalues.h} Added new configuration entries for the newly supported ISAs
Other files Various adjustments in ISA lookup, opcode updates, and version GUID changes to support additional ISAs
Comments suppressed due to low confidence (6)

src/coreclr/jit/compiler.cpp:6089

  • The remapping of JitConfig flags (e.g., using EnableAES, EnableAPX, etc.) to add corresponding instruction sets requires careful verification. Please confirm that the new mapping aligns with the intended ISA support and that default values are correct.
if (JitConfig.EnableAES() != 0) { instructionSetFlags.AddInstructionSet(InstructionSet_AES); }

src/coreclr/nativeaot/Runtime/startup.cpp:53

  • Changing g_cpuFeatures and g_requiredCpuFeatures to int64_t from int requires ensuring that all consumers of these globals can handle 64-bit values. Please verify that this change is consistent with their usage across the codebase.
EXTERN_C int64_t g_cpuFeatures;

src/coreclr/jit/lowerxarch.cpp:9189

  • The updated call to isEmbeddedBroadcastCompatibleHWIntrinsic now passes a Compiler pointer; please ensure that all call sites and the function implementation have been updated consistently.
if (parentNode->isEmbeddedBroadcastCompatibleHWIntrinsic(comp))

src/coreclr/jit/gentree.h:1504

  • Updating isEmbeddedBroadcastCompatibleHWIntrinsic to accept a Compiler pointer is a significant API change; please verify that all its usages (including in gentree.cpp) reflect this new signature.
bool isEmbeddedBroadcastCompatibleHWIntrinsic(Compiler* comp) const;

src/coreclr/jit/emitxarch.cpp:93

  • Consolidating checks for FMA, PermuteVar2x, and AVXVNNI into Is3OpRmwInstruction simplifies the code; please ensure that the new unified check covers all 3-operand RMW intrinsic scenarios with adequate tests.
bool emitter::Is3OpRmwInstruction(instruction ins)

src/coreclr/jit/emit.h:666

  • The update in emitter bitfield sizes to accommodate a larger INS_count necessitates verification that serialization and architecture‐specific packing remain correct.
static_assert_no_msg(INS_count <= 2048);

@tannergooding tannergooding requested a review from EgorBo June 3, 2025 17:27
@tannergooding
Copy link
Member Author

CC. @dotnet/jit-contrib, @EgorBo for JIT side changes

@MichalStrehovsky, @jkotas for the VM/NAOT changes.

Comment on lines +209 to +210
optimisticInstructionSetSupportBuilder.AddSupportedInstructionSet("lzcnt");
optimisticInstructionSetSupportBuilder.AddSupportedInstructionSet("movbe");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These won't opportunistically lightup for baseline anymore, but that shouldn't be an issue.

These instructions are only available on AVX+ hardware. movbe was an internal instruction set and so it never got opportunistically emitted for NAOT anyways. While lzcnt has a baseline alternative via bsr and the cost of the lzcnt check and branch is more expensive than the check of value == 0 check to handle the bsr edge case.

{ ("skylake", TargetArchitecture.X86), "x86-x64-v3" },
{ ("x86-x64-v4", TargetArchitecture.X64), "x86-x64-v3 avx512" },
{ ("x86-x64-v4", TargetArchitecture.X86), "x86-x64-v3 avx512" },
{ ("x86-64", TargetArchitecture.X64), "base" },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can these changes break performance for anyone?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone with a machine that doesn’t have SSE4.2 (shipped 2008) could see a performance regression

However, such users were already getting their R2R images thrown out and we’ve not really had SSE3/SSSE3/SSE4.1 specific code paths for a while, as the test complexity matrix hadn’t been worth supporting. Just some minor cases where we didn’t need to check for something higher.

The last pre-SSE4.2 processors went out of support around 2013. Windows 11 requires SSE4.1 and 24H2 requires SSE4.2

No major cloud providers have such old hardware and client statistics, such as Steam Hardware survey, report such devices being around 0.22%

So the belief is this reduction is also worthwhile.

For other ISA reductions, it’s simply combining sets that have always shipped together in the wild. That is, there is no hardware that has FMA without also having AVX2, BMI1, BMI2, F16C, etc. They have always been provided together as a set

else
if (nestedTypeName == "VL")
{ return InstructionSet.X64_AVX512v3; }
else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like a good candidate for switch pattern matching syntax? 🙂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is generated from instructiondesc.txt and the corresponding gen.bat

I agree there is a lot of potential cleanup possible there, but it’s out of scope for this PR since it’s automated

@tannergooding tannergooding merged commit aad452c into dotnet:main Jun 7, 2025
164 checks passed
@tannergooding tannergooding deleted the fix-87097 branch June 7, 2025 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants