Spark: Avoid closing deserialized copies of shared resources like FileIO #12868
Conversation
👍
```java
LOG.info("Releasing resources");
io().close();
LOG.info("Executor-side cleanup: closing deserialized table resources");
```
Is there a way to ensure that the IO, and hence the pool, is eventually closed?
Thanks for the review! Based on code inspection and some local debugging, Iceberg doesn't explicitly call close() on most FileIO instances (i.e., the regular, non-deserialized ones). The exception is S3FileIO, which overrides finalize() to invoke close() during garbage collection, and this applies to deserialized copies as well.

Given that the underlying connection pool may be shared, we might want to consider removing finalize() from S3FileIO to avoid unintended side effects during garbage collection. cc: @rdblue @aokolnychyi
Side note: finalize() was deprecated in Java 9 due to potential performance issues, deadlocks, and unpredictable behavior during GC.
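To make the hazard concrete, here is a minimal sketch of the failure mode being discussed. The class and method names (SharedPool, PooledFileIO) are hypothetical stand-ins for illustration, not Iceberg's actual S3FileIO implementation; the finalizer is invoked directly to keep the demo deterministic, whereas the JVM would call it at an unpredictable point during GC:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class FinalizeHazard {

    // Stands in for a connection pool (e.g. an HTTP connection manager)
    // shared across multiple FileIO copies.
    static class SharedPool {
        private final AtomicBoolean open = new AtomicBoolean(true);

        void shutdown() {
            open.set(false);
        }

        boolean isOpen() {
            return open.get();
        }
    }

    static class PooledFileIO {
        private final SharedPool pool;

        PooledFileIO(SharedPool pool) {
            this.pool = pool;
        }

        void close() {
            // Shuts down the pool even though other copies may still use it.
            pool.shutdown();
        }

        // S3FileIO-style finalizer: cleanup tied to garbage collection.
        @Override
        protected void finalize() {
            close();
        }
    }

    public static void main(String[] args) {
        SharedPool pool = new SharedPool();
        PooledFileIO original = new PooledFileIO(pool);
        PooledFileIO deserializedCopy = new PooledFileIO(pool); // same pool

        // Simulate the GC finalizing a short-lived deserialized copy.
        deserializedCopy.finalize();

        // The original instance is now broken: its shared pool is gone.
        System.out.println("pool open after copy finalized: " + pool.isOpen());
    }
}
```

Because finalization timing is driven by the garbage collector, the original instance can lose its pool at any moment after a copy becomes unreachable, which is why tying shared-resource cleanup to finalize() is fragile.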
Thanks @xiaoxuandev, this appears to work similarly to the PR that I opened to fix this issue here: #12129, but I was unable to come up with tests there. Would love to get this in.
This change prevents calling `close()` on `FileIO` during the cleanup of Spark's broadcast variable when `memoryStore.remove(blockId)` is called. Closing `FileIO` can unintentionally shut down shared resources, such as a shared connection pool, when `S3FileIO` is backed by the Apache HttpClient. Calling `close()` will trigger the shutdown of the `HttpClientConnectionManager`, leading to request failures if other instances are still in use.

Fixes: #12858, #12046
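The broadcast scenario in the description can be sketched as follows. This is a simplified, hypothetical model (the names BroadcastCopyDemo and SerializableFileIO are illustrative, not Iceberg or Spark APIs): a table's FileIO is serialized on the driver and deserialized on an executor, and closing the deserialized copy tears down a process-wide pool that other instances still depend on:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicBoolean;

public class BroadcastCopyDemo {

    // A process-wide pool, standing in for the Apache HttpClient connection
    // manager that S3FileIO instances can share.
    static final AtomicBoolean POOL_OPEN = new AtomicBoolean(true);

    static class SerializableFileIO implements Serializable {
        void close() {
            // Shuts down the shared pool for every instance in the process.
            POOL_OPEN.set(false);
        }
    }

    // Mimics broadcast: serialize on the driver, deserialize on an executor.
    static SerializableFileIO roundTrip(SerializableFileIO io) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(io);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (SerializableFileIO) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SerializableFileIO original = new SerializableFileIO();
        SerializableFileIO copy = roundTrip(original);

        // Broadcast cleanup closing the deserialized copy...
        copy.close();

        // ...takes down the pool the original still depends on.
        System.out.println("shared pool open: " + POOL_OPEN.get());
    }
}
```

Skipping `close()` on the deserialized copy, as this change does, leaves the shared pool intact for the instances that still need it.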