Skip to content

Cleanup HMF definitions #116085

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 4, 2025
Merged

Cleanup HMF definitions #116085

merged 3 commits into from
Jun 4, 2025

Conversation

am11
Copy link
Member

@am11 am11 commented May 29, 2025

Fixes #95695.

Depends on #115339.

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label May 29, 2025
@@ -340,7 +340,7 @@ Here is an annotated list of the stubs implemented for Unix on Arm64.
calls. Necessary for all applications as this is how the main method is
called.

2. `LazyMachStateCaptureState`/`HelperMethodFrameRestoreState` – Needed to
2. `LazyMachStateCaptureState` – Needed to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds weird. I would expect these to go together so LazyMachStateCaptureState should be dead code too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted.

src/coreclr/debug/shared/dbgtransportsession.cpp:                if (!PAL_VirtualUnwind(&frameContext, NULL))
src/coreclr/pal/src/exception/seh-unwind.cpp:    PAL_VirtualUnwind(contextRecord, NULL);
src/coreclr/vm/eetwain.cpp:            PAL_VirtualUnwind(pContext, NULL);
src/coreclr/vm/exceptionhandling.cpp:                PAL_VirtualUnwind(ex->GetContextRecord(), NULL);
src/coreclr/vm/frames.cpp:        PAL_VirtualUnwind(pRD->pCurrentContext, NULL);
src/coreclr/vm/stackwalk.cpp:        BOOL success = PAL_VirtualUnwind(pContext, NULL);

If / when we get rid of these (libunwind calls), that would be quite satisfying. 😁 This PR is removing six calls and I think one of the remaining call will be removed when we cleanup funclet definitions for .NET 11.

Copy link
Member

@filipnavara filipnavara May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...when we cleanup funclet definitions for .NET 11.

Which one you think? x86 generally didn't have an unwinding story on Windows, so I doubt it's used on linux-x86 either.

src/coreclr/vm/exceptionhandling.cpp: PAL_VirtualUnwind(ex->GetContextRecord(), NULL);
src/coreclr/vm/frames.cpp: PAL_VirtualUnwind(pRD->pCurrentContext, NULL);

I was contemplating to get the IL_*Throw all on the same plan with the new PUSH_COOP_PINVOKE_FRAME macros similar to what x86 does. This would probably be a bit faster, more unified, and it would remove this one call.

I'm not sure we can remove all the usages but it's certainly getting to a point where there's just enough usages to answer the question of what would it take.

Copy link
Member Author

@am11 am11 May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which one you think?

I will double check, but it was a rabbit hole which lead to #ifndef FEATURE_EH_FUNCLETS block when I chased down the nth caller. 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@am11, removing call in dbgtransportsession.cpp should be part of this PR. So far as I am aware, the PAL_VirtualUnwind used there is only used for unwinding a frame for the purposes of filling in the HMF while in the debugger, and this PR is all about removing the HMF and its tendrils.

@filipnavara I would love to see the PAL_VirtualUnwind removed from the IL_*Throw path. The PUSH_COOP_PINVOKE_FRAME logic is definitely faster than invoking libunwind. Especially as in some situations I've seen PAL_VirtualUnwind take a global lock during unw_step on Linux platforms, and there are some cases where a lock can also be taken on Windows, which can have unfortunate performance concerns in heavily multithreaded exception storms. However, that certainly doesn't belong in THIS pr. This is big enough as it is.

With regards to getting rid of the rest of the PAL_VirtualUnwind logic. I think another big rock is the handling of exceptions thrown from within QCalls. I believe the long term path there that would likely work, is to have a scheme where a managed exception could be registered to be thrown as part of returning to cooperative mode in the managed code. This would mirror the JNI model of how exceptions can be thrown from native code via a deferred mechanism, and allow us to not require unwinding from the QCall itself. What we would need to do is change the END_QCALL macro to detect the C++ exception, and instead of dispatching a managed exception, create a managed exception to be dispatched once the QCall finishes. We would also need some set of tweaks so that we return from the END_QCALL macro instead of actually expecting the rest of the code in the QCall to execute, but that should be fairly simple. We may have some prior art around this in our handling of ObjectiveC/Swift exception handling, or perhaps the logic we have there could transition this to the deferred mechanism I describe here. @janvorli /@jkotas may have opinions here.
 
Windows may always require the virtual unwind behavior, as we support throwing SEH exceptions from native into managed code and catching them there, but there isn't any real reason to allow that sort of behavior on Unix platforms. I haven't dug enough into all of this to know. But it may be that we are able to move many of these calls to PAL_VirtualUnwind into something which only does the VirtualUnwind if on Windows, and on Linux we simple #ifdef the logic away. And for that, I would support simply calling the Windows api directly instead of using a PAL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into the IL_*Throw path after the allocation helper PR is concluded. I have a prototype stashed somewhere but it needs to be updated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the long term path there that would likely work, is to have a scheme where a managed exception could be registered to be thrown as part of returning to cooperative mode in the managed code.

I think it may be possible to trigger the same code path as GC poll at the end of QCall, check for pending exception, and tailcall IL_Throw (or an equivalent helper with COOP frame and TransitionBlock). Don't really want to delve too deep into that right now but happy to explore it at some later point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the insights @davidwrighton!

removing call in dbgtransportsession.cpp should be part of this PR

I will take a look.

Another use of libunwind is around remote unwind (inspecting linux coredump on macOS or vice-versa). When we will have the ability to find the first managed frame for a given IP/SP etc., then it might be possible to re-implement that part without libunwind as well (e.g. by searching for some sort of sentinel in epilog emitted in managed methods of interests).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@filipnavara, yeah that's the sort of thing I was thinking about. My guess is that there is also something inscrutable in how hardware exceptions are handled that need the PAL_VirtualUnwind today, but I suspect that is avoidable with some clever tweaks. We're approaching the point at which we need to be more cautious about such changes as we prepare to ship .NET 10, but I like this general direction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may have some prior art around this in our handling of ObjectiveC/Swift exception handling

We have a special support for unmanaged -> managed Objective calls. I do not think we have any special support for ObjectiveC managed -> unmanaged calls. On unmanaged size, ObjectiveC returns sets a thread local. On the managed side, the code reads the thread local and throws exception as necessary. Or something equivalent.

We can do the same for QCalls. Add a return LibraryImport marshaller and annotated all QCalls to use it by running regex over CoreLib, and modify QCall macros as needed.

a managed exception could be registered to be thrown as part of returning to cooperative mode in the managed code.

This is micro-optimization of the above scheme using runtime magic. I am not convinced that it is worth it. If we can convince ourselves that this micro-optimization is worth it, we should make it a public interop feature.

@filipnavara
Copy link
Member

Fun with the ASM offsets:

  D:\a\_work\1\s\src\coreclr\vm\amd64\asmconstants.h(265): error C2338: static_assert failed: 'OFFSETOF__MachState__m_Ptrs == offsetof(MachState, m_Ptrs)'
  D:\a\_work\1\s\src\coreclr\vm\amd64\asmconstants.h(267): error C2338: static_assert failed: 'OFFSETOF__MachState___pRetAddr == offsetof(MachState, _pRetAddr)'

@am11
Copy link
Member Author

am11 commented May 29, 2025

It was a dead code after removing LazyMachCode. Removed the rest in aa86cf0.

Comment on lines -1972 to -1975
if (!PAL_VirtualUnwind(&frameContext, NULL))
{
HANDLE_TRANSIENT_ERROR();
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidwrighton, should we delete the entire case MT_VirtualUnwind: or just this call?

Copy link
Member

@filipnavara filipnavara Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't make sense to keep MT_VirtualUnwind around without the PAL_VirtualUnwind call. Presumably it would make sense to just return E_NOTIMPL from ShimRemoteDataTarget::VirtualUnwind and drop the whole code path down from there...

(I am not an expert, so perhaps wait for a second opinion.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed 2c15f9e. I can revert in case it's undesired. :)

@am11 am11 force-pushed the feature/hmf-cleanups branch from 2c15f9e to 54a677f Compare June 3, 2025 20:03
@am11 am11 marked this pull request as ready for review June 3, 2025 20:03
Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@jkotas
Copy link
Member

jkotas commented Jun 4, 2025

/ba-g unrelated infrastructure timeout

@jkotas jkotas merged commit 0b42ec7 into dotnet:main Jun 4, 2025
150 of 153 checks passed
@jkotas jkotas mentioned this pull request Jun 4, 2025
@am11 am11 deleted the feature/hmf-cleanups branch June 10, 2025 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-VM-coreclr community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert all FCalls with HELPER_METHOD_FRAME to QCalls
6 participants