-
Notifications
You must be signed in to change notification settings - Fork 5k
Add JIT/VM support for the approved but NYI xarch isas #116230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
e969e2b
to
e8c66d7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for several new xarch instruction set extensions in both the JIT and VM. Key changes include updates to global CPU feature variables to 64‑bit types, modifications to intrinsic and configuration handling (including new configuration flags), and adjustments to emitter and instruction encoding functions to support the new ISAs.
Reviewed Changes
Copilot reviewed 38 out of 38 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
src/coreclr/nativeaot/Runtime/startup.cpp | Updated global CPU feature variables to use int64_t |
src/coreclr/jit/lowerxarch.cpp | Updated isEmbeddedBroadcastCompatibleHWIntrinsic calls to pass a Compiler* parameter |
src/coreclr/jit/gentree.{h,cpp} | Updated isEmbeddedBroadcastCompatibleHWIntrinsic signature to accept a Compiler* parameter |
src/coreclr/jit/emitxarch.{h,cpp} | Refactored intrinsic encoding functions (merging several checks into Is3OpRmwInstruction) |
src/coreclr/jit/compiler.cpp | Remapped JIT configuration flags to new ISA support and updated default values |
src/coreclr/jit/emit.h | Adjusted emitter bitfield sizes to support an increased instruction count |
src/coreclr/inc/{jitconfigvalues.h,clrconfigvalues.h} | Added new configuration entries for the newly supported ISAs |
Other files | Various adjustments in ISA lookup, opcode updates, and version GUID changes to support additional ISAs |
Comments suppressed due to low confidence (6)
src/coreclr/jit/compiler.cpp:6089
- The remapping of JitConfig flags (e.g., using EnableAES, EnableAPX, etc.) to add corresponding instruction sets requires careful verification. Please confirm that the new mapping aligns with the intended ISA support and that default values are correct.
if (JitConfig.EnableAES() != 0) { instructionSetFlags.AddInstructionSet(InstructionSet_AES); }
src/coreclr/nativeaot/Runtime/startup.cpp:53
- Changing g_cpuFeatures and g_requiredCpuFeatures to int64_t from int requires ensuring that all consumers of these globals can handle 64-bit values. Please verify that this change is consistent with their usage across the codebase.
EXTERN_C int64_t g_cpuFeatures;
src/coreclr/jit/lowerxarch.cpp:9189
- The updated call to isEmbeddedBroadcastCompatibleHWIntrinsic now passes a Compiler pointer; please ensure that all call sites and the function implementation have been updated consistently.
if (parentNode->isEmbeddedBroadcastCompatibleHWIntrinsic(comp))
src/coreclr/jit/gentree.h:1504
- Updating isEmbeddedBroadcastCompatibleHWIntrinsic to accept a Compiler pointer is a significant API change; please verify that all its usages (including in gentree.cpp) reflect this new signature.
bool isEmbeddedBroadcastCompatibleHWIntrinsic(Compiler* comp) const;
src/coreclr/jit/emitxarch.cpp:93
- Consolidating checks for FMA, PermuteVar2x, and AVXVNNI into Is3OpRmwInstruction simplifies the code; please ensure that the new unified check covers all 3-operand RMW intrinsic scenarios with adequate tests.
bool emitter::Is3OpRmwInstruction(instruction ins)
src/coreclr/jit/emit.h:666
- The update in emitter bitfield sizes to accommodate a larger INS_count necessitates verification that serialization and architecture‐specific packing remain correct.
static_assert_no_msg(INS_count <= 2048);
src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/HardwareIntrinsicHelpers.Aot.cs
Outdated
Show resolved
Hide resolved
CC. @dotnet/jit-contrib, @EgorBo for JIT side changes @MichalStrehovsky, @jkotas for the VM/NAOT changes. |
optimisticInstructionSetSupportBuilder.AddSupportedInstructionSet("lzcnt"); | ||
optimisticInstructionSetSupportBuilder.AddSupportedInstructionSet("movbe"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These won't opportunistically lightup for baseline
anymore, but that shouldn't be an issue.
These instructions are only available on AVX+ hardware. movbe
was an internal instruction set and so it never got opportunistically emitted for NAOT anyways. While lzcnt
has a baseline alternative via bsr
and the cost of the lzcnt
check and branch is more expensive than the check of value == 0
check to handle the bsr
edge case.
{ ("skylake", TargetArchitecture.X86), "x86-x64-v3" }, | ||
{ ("x86-x64-v4", TargetArchitecture.X64), "x86-x64-v3 avx512" }, | ||
{ ("x86-x64-v4", TargetArchitecture.X86), "x86-x64-v3 avx512" }, | ||
{ ("x86-64", TargetArchitecture.X64), "base" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can these changes break performance for anyone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Someone with a machine that doesn’t have SSE4.2 (shipped 2008) could see a performance regression
However, such users were already getting their R2R images thrown out and we’ve not really had SSE3/SSSE3/SSE4.1 specific code paths for a while, as the test complexity matrix hadn’t been worth supporting. Just some minor cases where we didn’t need to check for something higher.
The last pre-SSE4.2 processors went out of support around 2013. Windows 11 requires SSE4.1 and 24H2 requires SSE4.2
No major cloud providers have such old hardware and client statistics, such as Steam Hardware survey, report such devices being around 0.22%
So the belief is this reduction is also worthwhile.
For other ISA reductions, it’s simply combining sets that have always shipped together in the wild. That is, there is no hardware that has FMA without also having AVX2, BMI1, BMI2, F16C, etc. They have always been provided together as a set
else | ||
if (nestedTypeName == "VL") | ||
{ return InstructionSet.X64_AVX512v3; } | ||
else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks like a good candidate for switch pattern matching syntax? 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is generated from instructiondesc.txt and the corresponding gen.bat
I agree there is a lot of potential cleanup possible there, but it’s out of scope for this PR since it’s automated
This adds the JIT/VM support for the various xarch ISAs that have been approved but NYI. This does not actually add the intrinsic IDs or managed support required for their use.
This is being put up to avoid needing to churn the JIT/EE version guid multiple times and induce many conflicts for the pending work.