Skip to content

Issue was that if a BGC thread handles a mark stack overflow, #74571

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

PeterSolMS
Copy link
Contributor

but runs into yet another mark stack overflow on another heap, we set a flag on the region, and the containing heap. However, the BGC thread handling the other heap may have already decided to move on, and may thus not see the flag.

Fix is to set the flag on the heap doing the scan rather than the heap containing the object causing the mark stack stack overflow. The thread handling that heap will indeed recheck the flag and rescan if necessary. This necessitates another change because in the concurrent case, we need each BGC thread to enter mark stack overflow scanning if there was a mark stack overflow on its heap. So we need to propagate the per-heap flag to all the heaps.

Fixed another issue for regions where the small_object_segments local variable in background_process_mark_overflow_internal would be set incorrectly in the non-concurrent case. It would be set to FALSE as soon as all the regions for gen 0 are processed.

…s into yet another mark stack overflow on another heap, we set a flag on the region, and the containing heap. However, the BGC handling the other heap may have already decided to move on, and may thus not see the flag.

Fix is to set the flag on the heap doing the scan rather than the heap containing the object causing the mark stack stack overflow. The thread handling that heap will indeed recheck the flag and rescan if necessary. This necessitates another change because in the concurrent case, we need each BGC thread to enter mark stack overflow scanning if there was a mark stack overflow on its heap. So we need to propagate the per-heap flag to all the heaps.

Fixed another issue for regions where the small_object_segments local variable in background_process_mark_overflow_internal would be set incorrectly in the non-concurrent case. It would be set to FALSE as soon as all the regions for gen 0 are processed.
@ghost ghost added the area-GC-coreclr label Aug 25, 2022
@ghost ghost assigned PeterSolMS Aug 25, 2022
@ghost
Copy link

ghost commented Aug 25, 2022

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Issue Details

but runs into yet another mark stack overflow on another heap, we set a flag on the region, and the containing heap. However, the BGC thread handling the other heap may have already decided to move on, and may thus not see the flag.

Fix is to set the flag on the heap doing the scan rather than the heap containing the object causing the mark stack stack overflow. The thread handling that heap will indeed recheck the flag and rescan if necessary. This necessitates another change because in the concurrent case, we need each BGC thread to enter mark stack overflow scanning if there was a mark stack overflow on its heap. So we need to propagate the per-heap flag to all the heaps.

Fixed another issue for regions where the small_object_segments local variable in background_process_mark_overflow_internal would be set incorrectly in the non-concurrent case. It would be set to FALSE as soon as all the regions for gen 0 are processed.

Author: PeterSolMS
Assignees: -
Labels:

area-GC-coreclr

Milestone: -

#ifdef USE_REGIONS
// in the regions case, compute the OR of all the per-heap flags
if (g_heaps[i]->background_overflow_p)
all_heaps_background_overflow_p = TRUE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any perf impact of setting this across all heaps?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the perf impact is to do some work balancing between all Server BGC threads processing OF. for heaps that didn't actually have any OF it would quickly filter by its regions' OF flag.

@PeterSolMS PeterSolMS merged commit 62cb5f1 into dotnet:main Aug 26, 2022
@PeterSolMS
Copy link
Contributor Author

/backport to release/7.0

@github-actions
Copy link
Contributor

Started backporting to release/7.0: https://github.com/dotnet/runtime/actions/runs/2956839509

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants