-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: Ensure executeRequest
waits for standard output to complete when using comlink
#144
Conversation
@@ -198,6 +198,9 @@ export class PyodideKernel extends BaseKernel implements IKernel { | |||
); | |||
break; | |||
} | |||
case 'execute_return': { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if such message should be named something else, or prefixed with an underscore or similar? Otherwise it looks like it may be an official message part of the Jupyter protocol, while it isn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, jpk:
would be solid. It's not really private, and a downstream may well want to react to it. We'd pay a minimal tax to do the .split
but adding more fields at this time seems even worse and more breaking.
Most of these should be hanging off a const
or namespace
somewhere (but not enum
: those can break in round-tripped JSON settings).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, should we rename the message to jpk:execute_return
after all??
Do you think this also solves the ipywidget problem, which is also related to the commit introducing coimlink: |
Not sure yet, but probably worth trying with this link (preview for the PR): https://jupyterlite-pyodide-kernel--144.org.readthedocs.build/en/144/_static/lab/index.html |
Nope, it still persists. So it must be another race condition introduced by comlink (or something else in the comlink commit). |
@@ -244,6 +247,8 @@ export class PyodideKernel extends BaseKernel implements IKernel { | |||
await this.ready; | |||
const result = await this._remoteKernel.execute(content, this.parent); | |||
result.execution_count = this.executionCount; | |||
await this._executed.promise; | |||
this._executed = new PromiseDelegate<void>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a stupid question. There is always one PromiseDelegate. The execute request starts before interacting with the promise_delegate. If two execution requests are started almost at the same time, what identifies which promise to resolve once the message arrives?
Another question: Does it make sense to add this message mechanism? I think the diagnosis is that Comlink sometimes does not accurately handle calls to async functions in the worker. But this means that potentially, all async functions are affected by the same issue. So, it may be better to find the root cause of why Comlink is failing. It is not that simple and uses UUIDS for every message, but I am also not sure if it can handle async. |
The real problem is that comlink changes the ordering of the messages from "this._sendWorkerMessage" and the return of the worker's async methods. So, the _sendWorkerMessages arrive after the return from the async method that sets the parent header with the msg_id. |
Ok, I have found the underlying problem.
ends up in comlink: https://github.com/GoogleChromeLabs/comlink/blob/fd4b52666b1ec62784f8b45cb1108c7e40bc481d/src/comlink.ts#L213 creating a separate MessageChannel for the callback used in _sendWorkerMessage. This explains all the troubles, as the call to the worker is done via the normal worker's message ports, but the _sendWorkerMessage uses a separate Message Channel. The UAs do not guarantee that two separate channels are executed in the same order, but ordering is crucial for the Jupyter protocol. |
Many thanks @martenrichter for the thorough investigation!
Right, if there is a way to do that with comlink and stick to the comlink API if possible, it would be great. Otherwise switching back to raw |
I have posted a PR for 3.). My follow-up investigation showed that using native comlink is only possible if I submit an upstream patch, which will be complex. I do not know how welcome this will be from the comlink team side. |
Description of Issue
When using
comlink
,executeRequest
function sometimes finishes before the standard output is fully completed.This behavior appears to be caused by the lack of synchronization between callback handling and the simple function logic in
comlink
.How to Reproduce the Issue
npm run quickstart
,jlpm build
,jlpm docs:lite
with"exposeAppInBrowser": true
andjlpm serve
Changes Made
I identified the issue as being caused by an inconsistency in the execution order between callback handling and the simple function logic when using
comlink
.To resolve this:
I implemented a mechanism where the worker sends a message through a callback at the end of the process .
The kernel now waits for this message before concluding the
executeRequest
function.This ensures proper synchronization and prevents premature termination of
executeRequest
.