ReconnectNodeRemover.setPathInformation() may cause OOM #18658

Closed
OlegMazurov opened this issue Apr 3, 2025 · 1 comment · Fixed by #18708
Labels: Performance, Platform Data Structures, Platform Reconnect, Platform Virtual Map, Platform

Comments

@OlegMazurov
Contributor

The following log message was observed during a failed reconnect in longevity testing:

2025-XX-XX XX:XX:XX XXX     INFO  RECONNECT        <<work group learning-synchronizer: learner-task #2>> ReconnectNodeRemover: allNodesReceived(): newLastLeafPath = 761197512, oldLastLeafPath = 305093118

From this we can infer newFirstLeafPath = newLastLeafPath / 2 = 380598756, which is greater than oldLastLeafPath. Every old leaf path therefore lies below the new leaf range, so by the logic of ReconnectNodeRemover.setPathInformation() all old leaf records are added to leavesToDelete. Accumulating and then flushing that many records can put heavy pressure on the heap and GC and may cause an OOM.
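The arithmetic can be checked with a short sketch. It assumes the relationship firstLeafPath = lastLeafPath / 2 used in the inference above, and it applies the same relationship to the old tree to estimate how many old leaves are affected. The class and method names are hypothetical and are not part of the platform code.

```java
// Hypothetical helper for checking the path arithmetic from the log message.
final class LeafPathCheck {

    // Assumption: in the virtual tree, firstLeafPath == lastLeafPath / 2.
    static long inferFirstLeafPath(long lastLeafPath) {
        return lastLeafPath / 2;
    }

    // Number of leaves in the inclusive range [firstLeafPath, lastLeafPath].
    static long leafCount(long lastLeafPath) {
        return lastLeafPath - inferFirstLeafPath(lastLeafPath) + 1;
    }

    public static void main(String[] args) {
        long newLastLeafPath = 761_197_512L; // from the log message
        long oldLastLeafPath = 305_093_118L; // from the log message

        long newFirstLeafPath = inferFirstLeafPath(newLastLeafPath);
        System.out.println(newFirstLeafPath);                   // 380598756
        System.out.println(newFirstLeafPath > oldLastLeafPath); // true
        System.out.println(leafCount(oldLastLeafPath));         // 152546560
    }
}
```

Under these assumptions, roughly 152.5 million old leaf records would be candidates for leavesToDelete.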
Some provision should be made to flush leavesToDelete partially whenever its size threatens to exhaust the heap.
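One possible shape for such a provision is a size-bounded buffer that hands off a batch as soon as a threshold is reached, instead of holding every pending deletion until the end. The sketch below is only an illustration of that idea: BoundedDeleteBuffer, maxBuffered, and the flusher callback are hypothetical names, leaves are represented by their paths only, and this is not necessarily how the fix in #18708 works.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: buffers leaf paths to delete and flushes them in bounded batches.
final class BoundedDeleteBuffer {
    private final int maxBuffered;              // flush threshold (assumed tunable)
    private final Consumer<List<Long>> flusher; // stand-in for the real flush to the data source
    private List<Long> pending = new ArrayList<>();
    private int flushCount = 0;

    BoundedDeleteBuffer(int maxBuffered, Consumer<List<Long>> flusher) {
        if (maxBuffered <= 0) {
            throw new IllegalArgumentException("maxBuffered must be positive");
        }
        this.maxBuffered = maxBuffered;
        this.flusher = flusher;
    }

    // Record a leaf path for deletion; flush as soon as the buffer is full.
    void add(long leafPath) {
        pending.add(leafPath);
        if (pending.size() >= maxBuffered) {
            flush();
        }
    }

    // Hand the current batch to the flusher and start a fresh buffer,
    // so the flushed batch becomes eligible for garbage collection.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<Long> batch = pending;
        pending = new ArrayList<>();
        flusher.accept(batch);
        flushCount++;
    }

    int flushCount() {
        return flushCount;
    }

    int pendingSize() {
        return pending.size();
    }
}
```

With this shape, heap usage for pending deletions is bounded by maxBuffered entries rather than by the total number of old leaves; a final flush() call after the last add() drains whatever remains.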

@OlegMazurov added the Performance label Apr 3, 2025
@OlegMazurov added this to the v0.60 milestone Apr 3, 2025
@OlegMazurov
Contributor Author

Related issue: #18626 (Memory leak in HalfDiskHashMap.endWriting())

@artemananiev self-assigned this Apr 3, 2025
@artemananiev modified the milestones: v0.60, v0.62 Apr 3, 2025
@artemananiev moved this to 🛠 In Progress in Foundation Team Apr 4, 2025
@artemananiev moved this from 🛠 In Progress to ✅ Done in Foundation Team Apr 9, 2025
@artemananiev moved this from ✅ Done to 🛠 In Progress in Foundation Team Apr 9, 2025
@artemananiev moved this from 🛠 In Progress to 👀 In Review in Foundation Team Apr 9, 2025
@artemananiev moved this from 👀 In Review to ✅ Done in Foundation Team Apr 9, 2025
joshmarinacci pushed a commit to joshmarinacci/hiero-consensus-node that referenced this issue Apr 10, 2025
Projects
Status: ✅ Done
2 participants