-
Notifications
You must be signed in to change notification settings - Fork 3.6k
[fix] Don't use ForkJoinPool when using CompletableFuture.thenXXXAsync #24383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
@@ -307,11 +307,13 @@ protected void internalRevokePermissionsOnTopic(AsyncResponse asyncResponse, Str | |||
for (int i = 0; i < numPartitions; i++) { | |||
TopicName topicNamePartition = topicName.getPartition(i); | |||
future = future.thenComposeAsync(unused -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
future = future.thenComposeAsync(unused -> | |
future = future.thenCompose(unused -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can only change this.
@@ -307,11 +307,13 @@ protected void internalRevokePermissionsOnTopic(AsyncResponse asyncResponse, Str | |||
for (int i = 0; i < numPartitions; i++) { | |||
TopicName topicNamePartition = topicName.getPartition(i); | |||
future = future.thenComposeAsync(unused -> | |||
getAuthorizationService().revokePermissionAsync(topicNamePartition, role)); | |||
getAuthorizationService().revokePermissionAsync(topicNamePartition, role), | |||
pulsar().getOrderedExecutor().chooseThread(role)); | |||
} | |||
} | |||
return future.thenComposeAsync(unused -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return future.thenComposeAsync(unused -> | |
return future.thenCompose(unused -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes that should be avoided is to replace then*Async
calls with then*
calls. The reason for this is that in many cases there's a clear reason why then*Async
has been used. By default, CompletableFutures get completed on the thread that completes it. When metadata service is where it's completed, the completion chain will be executed on the metadata service thread.
I'd rather see the then*Async
calls use a thread pool that has more than 1 thread. There's a high risk for dead locks if changes are made to use a single threaded executor when most environments run with at least 2 cores.
ForkJoinPool has some logic to add more threads when it detects that another thread is blocked. That's also why it's used by default for CompletableFuture.then*Async
methods since it will avoid deadlocks this way.
As long as the mutations don't happen after an object has been passed through a completable future, the receiving thread will observe the same state of the object. A lot of Pulsar's "thread safety" relies on this at the moment. |
+1. Generally, return provider.isSuperUser(role, authenticationData, conf).thenCompose(isSuperUser -> {
if (isSuperUser) {
return CompletableFuture.completedFuture(true);
} else {
return provider.canLookupAsync(topicName, role, authenticationData);
}
}); The implementation of Regarding other places that replace the
Executors managed by Pulsar are not guaranteed to be better than
It's better to give a concrete example to show how could it result a thread safety issue. Make sure you know the happens-before relationship in Java. |
When the ForkJoinPool becomes blocked or runs into deadlock-like behavior, it can stall all dependent tasks. New tasks get stuck in the work queue, leading to widespread performance degradation. Additionally, ForkJoinPool is designed for workloads that can be recursively split into independent subtasks. If the task isn't inherently decomposable, using ForkJoinPool won't improve performance and may introduce unnecessary overhead. I think it's reasonable to replace the ForkJoinPool, but going with a single-threaded pool is risky. |
It's only one typical case. From the official document:
Most tasks from Pulsar are such small tasks in a chain, which are usually starting an asynchronous task. var future = runTask1().thenCompose(result1 -> runTask2(result1)).thenApply(result2 -> runTask3(result2);
future.whenComplete((result, e) -> {
if (e == null) {
future2.complete(result);
} else {
future2.completeExceptionally(e);
}
}); Then For a specific example, the
private void init() {
pendingAckStoreProvider.checkInitializedBefore(persistentSubscription)
.thenAcceptAsync(init -> {
if (init) {
initPendingAckStore();
} else {
completeHandleFuture();
}
}, internalPinnedExecutor) The caller side should be responsible to switch the thread to continue, not from the internal code. |
1、Although, as stated in the official documentation, ForkJoinPool provides benefits such as thread reuse and resource efficiency (e.g., automatic thread reclamation), in certain scenarios, it may actually lead to performance degradation. 2、We cannot guarantee that all tasks submitted to the ForkJoinPool are lightweight and non-blocking. 3、Using a dedicated thread pool results in clearer code structure, allowing developers to easily identify which thread pool is executing the current task.This not only improves code maintainability but also facilitates debugging and performance tuning. 4、When the ForkJoinPool encounters blocking issues, it becomes difficult to diagnose. |
By default, That said, in Pulsar we typically avoid switching threads unless it's explicitly necessary. We prefer to execute continuation logic on the callback thread, which helps reduce context switching and improves CPU efficiency. However, this also means we need to be mindful of how much work we're doing in these continuations, especially if the callback thread is part of a limited thread pool. Using |
The issue is, where should a Pulsar-managed executor be used is very ambiguous. I agree that there is no silver bullet, so we must be careful for which thread pool to use. This PR title is misleading that it thinks the built-in common It's reasonable to switch |
Therefore, I'd like to talk about the use of Take the case I've mentioned before: Line 171 in 3bd70fd
We should replace However, it's hard to know which thread will complete the future of Line 244 in 3bd70fd
to BTW, I see Pulsar's default executor is used here: Line 391 in 3bd70fd
Unfortunately, it's really ambiguous that where should |
Motivation
Currently, in some cases when don't pass an executor to
CompletableFuture
.thenComposeAsync/thenApplyAsync/thenAcceptAsync
, which will useForkJoinPool.commonPool
.Which maybe lead to 3 problems:
ForkJoinPool.commonPool
is managed by JVM, it is out of Pulsar manage.ForkJoinPool.commonPool
isMath.max(1, Runtime.getRuntime().availableProcessors() - 1)
, if the broker is in heavy workload, it's not enough.thenComposeAsync/thenApplyAsync/thenAcceptAsync
, it maybe lead to potential thread safety issue.Modifications
Verifying this change
(Please pick either of the following options)
This change is a trivial rework / code cleanup without any test coverage.
(or)
This change is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
doc
doc-required
doc-not-needed
doc-complete