-
Notifications
You must be signed in to change notification settings - Fork 5k
Emit cmp (extended register)
on ARM64 to simplify cast-then-compare expressions
#112411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… expressions Working towards dotnet#68028
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
src/coreclr/jit/lowerarmarch.cpp
Outdated
{ | ||
return true; | ||
} | ||
castOp->ClearContained(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some assertions hit related to register allocation if we don't clear it here. The problem is with chains such as:
ADD(
op1,
CAST(
LCL_VAR
)
)
Usually the LCL_VAR
node will be contained by the CAST
node, this allows a later optimization using LDRB/H
to perform the cast with load. The register will be allocated to the CAST
node which is then used by the ADD
. If we contain the CAST
and the LCL_VAR
spills, then both the CAST
and LCL_VAR
nodes won't have a register allocated when we try to generate the ADD
. If I understand correctly, it's not possible to tell whether a LCL_VAR
will be spilled until after register allocation, but you can force the casted node into a register and contain the CAST
node instead.
In most cases the casted node is a register value and the containment from CAST
-> LCL_VAR
had no effect, so this will result in removing the cast instruction. However, if the casted node is spilled then you would see LDR
used (no cast), and the ADD
node will use the cast extension, which is the behavior I'm preserving here. I've moved these containment settings from ContainCheckBinary
to here because they were causing side effects with the comparison nodes.
This problem also applies to the comparison nodes, which is why I clear the RegOptional
flag on some nodes.
looks like there is a test failure . Can you please double check if it is caused with this PR? |
It looks like this failed on "Send to Helix", is this a network error? I can't see any failing build or test output. |
You can see the test output by going to Tests -> System.Runtime.Intrinsics.Tests Work item -> Artifacts -> Console Logs. That takes you to this link: https://helixr1107v0xdcypoyl9e7f.blob.core.windows.net/dotnet-runtime-refs-pull-112411-merge-30e75b278b4d4c6eae/System.Runtime.Intrinsics.Tests/1/console.cb1ccbdb.log?helixlogtype=result In this case the failure looks like known failure #111922. |
src/coreclr/jit/lowerarmarch.cpp
Outdated
@@ -453,10 +455,34 @@ bool Lowering::IsContainableUnaryOrBinaryOp(GenTree* parentNode, GenTree* childN | |||
|
|||
if (parentNode->OperIs(GT_CMP)) | |||
{ | |||
if (IsInvariantInRange(childNode, parentNode)) | |||
castOp->ClearContained(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks odd to me that a function called IsContainableUnaryOrBinaryOp
mutates random trees. Can this be done by the caller instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Working towards #68028
Matching on
GT_EQ/NE/...
nodes containingGT_CAST
.Example: