-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Don't execute the proxy queue while adding to it from same thread. #24565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Still working on a more reliable test, but I wanted to try this out... |
system/lib/pthread/proxying.c
Outdated
// lock if the current thread is not adding to the queue. We then hope the | ||
// queue is executed later when it is unlocked. | ||
// XXX: This could leave to starvation if we never process the queue. | ||
if (q->active_thread == pthread_self()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about if (q->active_thread && q->active_thread == pthread_self()) {
to avoid calling pthread_self at all in the non-contended case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could combine with the above executing_system_queue
check?
At least we could use the same technique of using a TLS variable instead of calling pthread_self? I guess that only works if this issue only applies to the system queue (which I think it does).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least maybe wrap this whole thing in the is_system_queue
check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a few different ways to do this and simplify it. I just wanted to try this out with minimal changes first to make sure it's going down the right path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to reusing the existing thread-local variable once this is confirmed to be the right direction. It can be hoisted to the top level for use from multiple functions if necessary.
test_pthread_cancel was intermittently failing with a deadlock. From the backtrace, I found we were adding to the queue which can sometimes call malloc->emscripten_yield->emscripten_proxy_execute_queue which also tries to lock the same queue on the same thread.
9bf0082
to
fcffdb1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
Hopefully we can also figure out some way to improve the testing ?
@@ -35,6 +35,8 @@ static em_proxying_queue system_proxying_queue = { | |||
.capacity = 0, | |||
}; | |||
|
|||
static _Thread_local int system_queue_in_use = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use bool
here along with true/false
but maybe thats a bigger/unnecessary change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For whatever reason I decided to use int
instead of bool
when I wrote all this, so this is at least consistent with the rest of the file.
"a.html": 17199, | ||
"a.html.gz": 7525, | ||
"total": 17199, | ||
"total_gz": 7525 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this is a codesize win!?
Is it perhaps move the TLS init code into the global scope? If so that is kind of crazy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange it looks like the code size test is still failing. Maybe this isn't actually a code size win. Will re-try again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me. Function-scope TLS has to do all sorts of implicitly thread-safe lazy initialization that global-scope TLS does not have to do.
This is an automatic change generated by tools/maint/rebaseline_tests.py. The following (4) test expectation files were updated by running the tests with `--rebaseline`: ``` test/other/codesize/test_codesize_minimal_pthreads.funcs updated other/codesize/test_codesize_minimal_pthreads.size: 19396 => 19446 [+50 bytes / +0.26%] test/other/codesize/test_codesize_minimal_pthreads_memgrowth.funcs updated other/codesize/test_codesize_minimal_pthreads_memgrowth.size: 19397 => 19447 [+50 bytes / +0.26%] Average change: +0.26% (+0.26% - +0.26%) ```
fcffdb1
to
ff117cf
Compare
@@ -35,6 +35,8 @@ static em_proxying_queue system_proxying_queue = { | |||
.capacity = 0, | |||
}; | |||
|
|||
static _Thread_local int system_queue_in_use = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For whatever reason I decided to use int
instead of bool
when I wrote all this, so this is at least consistent with the rest of the file.
// potentially try to execute the queue during the add (from | ||
// emscripten_yield). This will deadlock the thread, so only try to take the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// potentially try to execute the queue during the add (from | |
// emscripten_yield). This will deadlock the thread, so only try to take the | |
// potentially try to execute the queue during the add (from | |
// emscripten_yield when malloc takes a lock). This will deadlock the thread, so only try to take the |
"a.html": 17199, | ||
"a.html.gz": 7525, | ||
"total": 17199, | ||
"total_gz": 7525 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me. Function-scope TLS has to do all sorts of implicitly thread-safe lazy initialization that global-scope TLS does not have to do.
@@ -72,6 +72,7 @@ $emscripten_futex_wake | |||
$emscripten_stack_get_current | |||
$emscripten_stack_set_limits | |||
$free_ctx | |||
$get_or_add_tasks_for_thread |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder what happened here.
Solution LGTM, but I forgot about the testing. |
Changing to using a thread local causes failures in |
Also use `bool` rather than `int` to better convey intent. Split out from emscripten-core#24565
Also use `bool` rather than `int` to better convey intent. Split out from emscripten-core#24565
test_pthread_cancel was intermittently failing with a deadlock. From the backtrace, I found we were adding to the queue which can sometimes call malloc->emscripten_yield->emscripten_proxy_execute_queue which also tries to lock the same queue on the same thread.