JIT: Re-introduce late fgOptimizeBranch pass #113491

Merged
merged 2 commits into from
Mar 19, 2025

@amanasifkhalid amanasifkhalid commented Mar 13, 2025

Part of #107749. fgReorderBlocks runs fgOptimizeBranch when it decides not to reorder a particular block. Turning off the old layout in favor of RPO layout in .NET 9 had the unintended consequence of also disabling the later fgOptimizeBranch pass the JIT used to do. After finding cases in #113108 (comment) that benefit from cloning and hoisting loop tests, I decided to reintroduce this late pass. These regressions also highlight that fgOptimizeBranch can create compaction opportunities inside loop bodies that we don't currently take advantage of; I've added a check to handle this.
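To illustrate what "cloning and hoisting loop tests" means here, the following is a minimal sketch on a toy flow graph. It is not the JIT's `BasicBlock` or the real `fgOptimizeBranch` logic: the `Block` struct, `cloneTestIntoJump`, and `makeWhileLoop` are hypothetical names, and the sketch assumes the destination block contains nothing but its test and ignores any cost model.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy flow-graph model. All names here are hypothetical, not JIT types.
enum class Kind { Always, Cond, Return };

struct Block {
    Kind        kind;
    int         target;      // Always: jump target. Cond: successor when the test is true.
    int         falseTarget; // Cond only: successor when the test is false (-1 otherwise).
    std::string test;        // Cond only: text of the test expression.
};

// If block `b` jumps unconditionally to a block that only evaluates a test,
// copy that test into `b` so `b` branches straight to the test's successors.
// Returns true if the graph was changed.
bool cloneTestIntoJump(std::vector<Block>& blocks, int b)
{
    Block& jump = blocks[b];
    if (jump.kind != Kind::Always)
        return false;

    const Block dest = blocks[jump.target]; // copy, so rewriting `jump` is safe
    if (dest.kind != Kind::Cond)
        return false;

    jump.kind        = Kind::Cond;
    jump.test        = dest.test;
    jump.target      = dest.target;
    jump.falseTarget = dest.falseTarget;
    return true;
}

// while (i < n) { body }  laid out as: entry -> test, body -> test.
std::vector<Block> makeWhileLoop()
{
    return {
        {Kind::Always, 2, -1, ""},     // 0: entry, jumps to the loop test
        {Kind::Always, 2, -1, ""},     // 1: loop body, jumps back to the test
        {Kind::Cond, 1, 3, "i < n"},   // 2: loop test
        {Kind::Return, -1, -1, ""},    // 3: exit
    };
}
```

Applied to block 0, the entry block evaluates `i < n` itself and either enters the body or skips the loop, which removes the branch-test-branch shape on loop entry. Applied to block 1 as well, the same transformation leaves block 2 unreachable.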

@Copilot Copilot AI review requested due to automatic review settings March 13, 2025 20:04

@Copilot Copilot AI left a comment

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 13, 2025

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@amanasifkhalid

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. Diffs show size regressions as expected, though they are dominated by libraries_tests. I think fgOptimizeBranch could use some more work on a few fronts:

  • It has some fallthrough restrictions left over from BBJ_NONE removal and the BBJ_COND refactor. These are likely holding back some important transformations, but my attempts to lift them created lots of improper loop headers in the process by creating flow into loops that bypasses the headers' tests. I don't think this should be a problem late in the JIT frontend (maybe it'll mess with the loop-aware RPO computation if we no longer recognize a loop?), but we still run fgOptimizeBranch in optOptimizeFlow, before we find and optimize loops. At the moment, I don't want to pessimize those phases.
  • I'm not sure how confident we are about its current cost model for deciding whether to clone, and lifting the above restrictions would unlock much more cloning. We could control for this somewhat with simple heuristics we currently aren't doing, like skipping cold blocks. Any differences between this cost model and loop inversion's seem like a bit of an oversight, since they're doing very similar transformations.
  • I don't think we're gaining anything by running fgOptimizeBranch early. As far as I can tell, fgOptimizeBranch mainly helps layout avoid sticky situations with branch-test-branch shapes, but doesn't do much elsewhere. Plus, RBO seems to unlock more transformations for it, so we see plenty of instances where the later pass performs cloning that the earlier one cannot. I'd appreciate a correction if I'm missing something here.
  • I suspect the early pass might already be harming our loop recognition potential. There isn't anything stopping fgOptimizeBranch from creating weird flow into loops, unless the fallthrough invariants kick in by chance. I'll experiment with removing the early pass and see how this changes loop recognition to find some motivating examples.
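As a rough illustration of the "skip cold blocks" heuristic suggested above, a cloning decision could be gated as sketched below. This is an assumption-laden sketch only: `CloneCandidate`, `shouldCloneTest`, and the budget parameter are hypothetical and do not reflect the JIT's actual cost model or loop inversion's.

```cpp
#include <cassert>

// Hypothetical inputs to a cloning decision; not the JIT's data structures.
struct CloneCandidate {
    unsigned testCost;   // estimated cost of duplicating the test
    bool     jumpIsCold; // the jumping block is expected to run rarely
};

// Clone only when the jumping block is not cold and the duplicated test
// fits within the given budget.
bool shouldCloneTest(const CloneCandidate& c, unsigned budget)
{
    if (c.jumpIsCold)
        return false; // growing code on a cold path is unlikely to pay off

    return c.testCost <= budget;
}
```

Sharing a single predicate of this kind between the branch optimization and loop inversion would be one way to keep the two cost models from drifting apart.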

Most of the above is easy to implement, but it will incur plenty of churn. This PR should at least reverse some long-standing regressions that benefitted from late branch opts, but I think there's more we can claw back here -- not sure if we want to pursue this now or later, though.

@amanasifkhalid

/ba-g blocked by build timeouts
