Rewrite the old proxying API in terms of the new API #15880
Conversation
Implement a new proxying API that is meant to be suitable both for proxying in syscall implementations and for proxying arbitrary user work. Since the system proxying queue is processed from so many locations, it is dangerous to use it for arbitrary work that might take a lock or otherwise block, and work sent to it has to be structured more like native signal handlers. To avoid this limitation, the new API allows users to create their own proxying queues that are processed only when the target thread returns to the JS event loop or when the user explicitly requests processing.

In contrast to the existing proxying API, this new API:

- Never drops tasks (except in the case of allocation failure). It grows the task queues as necessary instead.
- Does not attempt to dynamically type or dispatch queued functions, but rather uses statically typed function pointers that take `void*` arguments. This simplifies both the API and the implementation. Packing of varargs into dynamically typed function wrappers could easily be layered on top of this API.
- Is less redundant. There is only one way to proxy work synchronously or asynchronously to any thread.
- Is more general. It allows waiting for a task to be explicitly signaled as done, in addition to waiting for the proxied function to return.
- Uses arrays instead of linked lists for better data locality.
- Has a more uniform naming convention.

A follow-up PR will reimplement the existing proxying API in terms of this new API.
Rewrite the old threading.h proxying API used for internal system call implementations in terms of the new proxying API introduced in #15737.
system/lib/pthread/library_pthread.c
Outdated
    args->target_thread = target_thread;
    args->q = q;
    emscripten_set_timeout(dispatch_to_thread_helper, 0, args);
    emscripten_set_timeout(_do_call, 0, args);
Won't `args` (the va_list) be invalid once the timeout fires? i.e. isn't this going to be a use-after-free?
Ah yes, and `_do_call` expects an `em_queued_call*`, not a `va_list`. `args` should be `q` here.
postMessage({'cmd' : 'processQueuedMainThreadWork'});
_emscripten_notify_proxying_queue: function(targetThreadId, currThreadId, mainThreadId, queue) {
  if (targetThreadId == currThreadId) {
    setTimeout(function() { _emscripten_proxy_execute_queue(queue); });
Why turn this into a setTimeout?
Actually, this new `_emscripten_notify_proxying_queue` is a duplicate of the one that already exists just below. I'm not sure how I ended up with the duplicate function, but I'll remove it. The diff here should simply be a removal of the old `_emscripten_notify_thread_queue`.
The reason this is a `setTimeout` is so that we process the queue after returning to the event loop, similar to how queues on other threads will only be processed once they return to their event loops.
Interesting.. I didn't see that the condition here changed from `(targetThreadId == mainThreadId)` to `(targetThreadId == currThreadId)`.

How was the `(targetThreadId == currThreadId)` case handled in the old code? It looks like it was not handled specially and
The old code eagerly executed the proxied work if the target thread was the same as the current thread, so `_emscripten_notify_thread_queue` was never called in that case. The new API is more flexible and does not force eager evaluation in that case, so the new JS code has to handle it.
I'm curious to see how the pthread code size tests are affected. Can you run
The only difference shown by running that command before and after is on tests/other/metadce/minimal_main_Oz_USE_PTHREADS_PROXY_TO_PTHREAD. The JS size goes from 32374 to 32392 (+18 bytes, +0.06%). The wasm size goes from 17053 to 18539 (+1486 bytes, +8.7%).
@sbc100 any other comments before we land this?
The 8.7% rise in the size of the basic pthread runtime is a little worrying... Admittedly we have not been very focused on reducing this, but I would like that to change. Do you have a sense of where that might be coming from?
system/lib/pthread/library_pthread.c
Outdated
@@ -644,7 +505,18 @@ void emscripten_main_thread_process_queued_calls() {
  if (!_emscripten_allow_main_runtime_queued_calls)
    return;

  // Recursion guard to avoid infinite recursion when we arrive here from the
  // pthread_lock calls inside `emscripten_proxy_execute_queue`. This isn't
  // caught by the queue's own recursion guard because the lock has to be
"This isn't caught by the queue's own recursion guard because the lock has to be
acquired before that recursion guard can be checked"
This seems a little worrying, can we so something special for this internal lock to avoid the recursion.
Protecting against it here seems like treating the symptom rather than the cause.. for example what happens if a user calls emscripten_current_thread_process_queued_calls
directly on the main thread instead of going via this wrapper?
I moved the recursion guard into `emscripten_proxy_execute_queue`, so at least it's more robust now and avoids the problem you identified.
if (Module['_pthread_self']()) { // If this thread is actually running?
  Module['_emscripten_current_thread_process_queued_calls']();
  Module['_emscripten_proxy_execute_queue'](e.data.queue);
Do we need to pass the queue pointer around or can we just assume it's the system queue here?
Yes, we need to pass the queue pointer around because this is the same code path used for both the system queue and user queues. Actually, I see that this ended up duplicated somehow as well. Will fix.
// TODO: Must post message to main Emscripten thread in PROXY_TO_WORKER mode.
_emscripten_main_thread_process_queued_calls();
_emscripten_proxy_execute_queue(d['queue']);
Could we save a little JS code size here by passing zero args and having `_emscripten_proxy_execute_queue` assume that NULL == system queue? That would save the extra postMessage packing/unpacking too.
Yes, potentially, but I would prefer to make it a clear-cut error for NULL to ever be passed to `emscripten_proxy_execute_queue`. Better for users to get an assertion failure (in debug builds) when they do that by accident than for the system queue to be executed.
I assume this is basically just because the new queue implementation is larger than the old queue implementation, and I assume that is because the new queues are more general than the old queues, for example being able to dynamically grow. Using vectors rather than linked lists for the top-level set of task queues might also increase code size.

Edit: Using the pthread_mutex and pthread_cond APIs probably also has some overhead over just using emscripten_futex_wait.