Description
Repro:
- Launch a program (an example is this) in headless mode with Delve in --continue and --accept-multiclient mode.
- Attach to it from Theia with VSCode Go Extension.
- Set a breakpoint and notice that it didn't go through.
- Disconnecting also did not work.
Why this happens
Important Caveats
- Internally we are using a field called continueRequestRunning to keep track of whether Delve is in a running state or not.
- If Delve is in a running state and we issue a synchronous (blocking) call to it, any subsequent calls (both synchronous and asynchronous) will not get through until Delve changes to a halted state.
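The second caveat can be illustrated with a toy model. This is only a sketch: FakeDelve, its call() method, and settlesWithin() are hypothetical names invented for illustration, not part of Delve or the VSCode Go extension. It assumes a connection that serves calls strictly in order, which is enough to show why a Halt request queued behind a blocking call never runs.

```typescript
// Toy model (hypothetical, not Delve's real API): a connection that serves
// calls strictly in order, where a blocking call issued while the target is
// running does not return until the target halts.
class FakeDelve {
  private running = true; // as if started with --continue
  private queue: Promise<unknown> = Promise.resolve();
  private onHalt: Array<() => void> = [];

  // Every call waits for the previous call to finish before it is handled.
  call(name: string, blocking: boolean): Promise<string> {
    const result = this.queue.then(() => this.handle(name, blocking));
    this.queue = result.catch(() => undefined);
    return result;
  }

  private handle(name: string, blocking: boolean): Promise<string> {
    if (name === "Halt") {
      this.running = false;
      this.onHalt.splice(0).forEach((wake) => wake());
      return Promise.resolve("halted");
    }
    if (blocking && this.running) {
      // Stays pending until a Halt is handled.
      return new Promise((resolve) => {
        this.onHalt.push(() => resolve(`${name}: done`));
      });
    }
    return Promise.resolve(`${name}: done`);
  }
}

// Resolves to true if p settles within ms milliseconds, false otherwise.
function settlesWithin<T>(p: Promise<T>, ms: number): Promise<boolean> {
  const timeout = new Promise<boolean>((resolve) => {
    setTimeout(() => resolve(false), ms);
  });
  return Promise.race([p.then(() => true), timeout]);
}

// Issue a blocking call while running, then try to halt.
async function demo(): Promise<boolean[]> {
  const delve = new FakeDelve();
  const list = delve.call("ListGoroutines", true);
  const halt = delve.call("Halt", false); // queued behind ListGoroutines
  return Promise.all([settlesWithin(list, 50), settlesWithin(halt, 50)]);
}
```

In this model neither call settles: ListGoroutines waits for a halt, and the Halt that would release it is stuck in the queue behind it.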
Theia's and Delve's Sequence of Events
To understand the root cause, we need to go through the sequence of events from both Theia's and Delve's perspectives. Note the step marked BLOCKING, as that is crucial to understanding the problem. I also skipped some initial calls made when initializing Delve (getVersion). Special thanks to @polinasok for helping debug the issue and for the very helpful and detailed discussion of different scenarios.
Initializing Sequence (Not important)
Theia -> DAP: Sends InitializeRequest to VSCode Go's Debug Adapter Protocol (DAP) to initialize.
Theia <- DAP: Receives a successful response from DAP.
Theia -> DAP: Sends SetBreakpointsRequest to DAP to set breakpoints if there are any (also sends setFunctionBreakpoints and setExceptionBreakpoints if applicable). // <==== this does make a difference - please see explanation below
Theia <- DAP: Receives successful response(s) for those events from DAP.
Post-Initializing Sequence
Theia -> DAP: Sends configurationDoneRequest to DAP to indicate the end of the configuration.
DAP -> Delve: Sends asynchronous call to get Delve's state.
DAP <- Delve: Receives Delve's state and sees that it is running because we started Delve with the --continue switch. Since this is the case, the code doesn't call the this.continue() function, which means this.continueRequestRunning is still set to false.
Theia <- DAP: Receives response from DAP for the configurationDoneRequest.
Theia -> DAP: Sends ThreadsRequest to DAP.
DAP -> Delve: Since DAP sees that this.continueRequestRunning is still false, DAP thinks that Delve is not in a running state and so it sends a non-asynchronous (BLOCKING) call ListGoroutines to Delve. Since Delve is in a running state, it won't return this call until it changes its state to Halted. However, since ListGoroutines is a BLOCKING call, no other requests will get through, including the request to Halt Delve.
Theia -> DAP: Sends setBreakpointsRequest to DAP, which will not go through.
As soon as we fix ThreadsRequest() to bypass the blocking call, the same issue occurs with the next call in the request waterfall, StackTraceRequest().
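The sequence above boils down to a stale flag. The following is a simplified sketch, not the adapter's actual code: AdapterModel and threadsRequestPath() are hypothetical names, and the assumption that the adapter answers with a dummy thread when the flag is true is for illustration only.

```typescript
// Simplified model of the flag logic described above (hypothetical class).
type Path = "dummy thread" | "blocking ListGoroutines";

class AdapterModel {
  continueRequestRunning = false;
  private delveRunning: boolean;

  constructor(delveRunning: boolean) {
    this.delveRunning = delveRunning;
  }

  // configurationDone only calls continue() when Delve is not already running.
  configurationDone(): void {
    if (!this.delveRunning) {
      this.continue();
    }
  }

  // continue() is the only place in this model where the flag is set.
  continue(): void {
    this.continueRequestRunning = true;
    this.delveRunning = true;
  }

  // The threads request trusts the flag, not Delve's real state.
  threadsRequestPath(): Path {
    return this.continueRequestRunning
      ? "dummy thread"
      : "blocking ListGoroutines";
  }
}

// Delve started with --continue: it is running before the adapter attaches.
const attached = new AdapterModel(true);
attached.configurationDone();
// The flag is still false, so the blocking path is chosen while Delve runs.
const chosen: Path = attached.threadsRequestPath();
```

When Delve is not already running, configurationDone() goes through continue(), the flag is set, and the blocking path is avoided. The flag is only wrong when something other than the adapter put Delve into the running state.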
Remedy
- Instead of relying on the internal tracking this.continueRequestRunning, we should issue a non-blocking async call to get Delve's state and only fall back to this.continueRequestRunning if the call fails.
- For synchronous (blocking) calls in ThreadsRequest and StackTraceRequest, we need to make sure Delve is in a halted state. Otherwise, we are not stopped at a breakpoint and should send back the dummy thread or empty response.
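The two remedy points could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual patch: isDebuggeeRunning, the getState and listGoroutines callbacks, and the dummy thread's id and name are hypothetical, and the Running field is assumed to be present on the state object that Delve returns.

```typescript
// Assumed shape of the state returned by Delve (only the field used here).
interface DebuggerState {
  Running: boolean;
}

interface Thread {
  id: number;
  name: string;
}

// Ask Delve for its state with a non-blocking call; fall back to the
// internally tracked flag only if that call fails.
async function isDebuggeeRunning(
  getState: () => Promise<DebuggerState>,
  continueRequestRunning: boolean
): Promise<boolean> {
  try {
    const state = await getState();
    return state.Running;
  } catch {
    return continueRequestRunning;
  }
}

// Only issue the blocking ListGoroutines call when Delve is halted.
async function threadsRequest(
  getState: () => Promise<DebuggerState>,
  continueRequestRunning: boolean,
  listGoroutines: () => Promise<Thread[]>
): Promise<Thread[]> {
  if (await isDebuggeeRunning(getState, continueRequestRunning)) {
    // Not stopped at a breakpoint: answer with a placeholder thread.
    return [{ id: 1, name: "Dummy" }];
  }
  return listGoroutines();
}
```

A stack trace handler would follow the same pattern, returning an empty list of frames instead of the dummy thread when Delve is running.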