Skip to content

Commit f61aaa6

Browse files
Bryan C. Millscellularmitosis
Bryan C. Mills
authored andcommitted
context: fix synchronization in ExampleAfterFunc_cond
Condition variables are subtle and error-prone, and this example demonstrates exactly the sorts of problems that they introduce. Unfortunately, we're stuck with them for the foreseeable future. As previously implemented, this example was racy: since the callback passed to context.AfterFunc did not lock the mutex before calling Broadcast, it was possible for the Broadcast to occur before the goroutine was parked in the call to Wait, causing in a missed wakeup resulting in deadlock. The example also had a more insidious problem: it was not safe for multiple goroutines to call waitOnCond concurrently, but the whole point of using a sync.Cond is generally to synchronize concurrent goroutines. waitOnCond must use Broadcast to ensure that it wakes up the target goroutine, but the use of Broadcast in this way would produce spurious wakeups for all of the other goroutines waiting on the same condition variable. Since waitOnCond did not recheck the condition in a loop, those spurious wakeups would cause waitOnCond to spuriously return even if its own ctx was not yet done. Fixing the aforementioned bugs exposes a final problem, inherent to the use of condition variables in this way. This one is a performance problem: for N concurrent calls to waitOnCond, the resulting CPU cost is at least O(N²). This problem cannot be addressed without either reintroducing one of the above bugs or abandoning sync.Cond in the example entirely. Given that this example was already published in Go 1.21, I worry that Go users may think that it is appropriate to use a sync.Cond in conjunction with context.AfterFunc, so I have chosen to retain the Cond-based example and document its pitfalls instead of removing or replacing it entirely. I described this class of bugs and performance issues — and suggested some channel-based alternatives — in my GopherCon 2018 talk, “Rethinking Classical Concurrency Patterns”. The section on condition variables starts on slide 37. (https://youtu.be/5zXAHh5tJqQ?t=679) Fixes golang#62180. For golang#20491. Change-Id: If987cd9d112997c56171a7ef4fccadb360bb79bc Reviewed-on: https://go-review.googlesource.com/c/go/+/521596 Reviewed-by: Cuong Manh Le <[email protected]> Auto-Submit: Bryan Mills <[email protected]> Reviewed-by: Matthew Dempsky <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Run-TryBot: Bryan Mills <[email protected]>
1 parent 7f6aecc commit f61aaa6

File tree

1 file changed

+50
-11
lines changed

1 file changed

+50
-11
lines changed

src/context/example_test.go

+50-11
Original file line numberDiff line numberDiff line change
@@ -125,25 +125,64 @@ func ExampleWithValue() {
125125
// This example uses AfterFunc to define a function which waits on a sync.Cond,
126126
// stopping the wait when a context is canceled.
127127
func ExampleAfterFunc_cond() {
128-
waitOnCond := func(ctx context.Context, cond *sync.Cond) error {
129-
stopf := context.AfterFunc(ctx, cond.Broadcast)
128+
waitOnCond := func(ctx context.Context, cond *sync.Cond, conditionMet func() bool) error {
129+
stopf := context.AfterFunc(ctx, func() {
130+
// We need to acquire cond.L here to be sure that the Broadcast
131+
// below won't occur before the call to Wait, which would result
132+
// in a missed signal (and deadlock).
133+
cond.L.Lock()
134+
defer cond.L.Unlock()
135+
136+
// If multiple goroutines are waiting on cond simultaneously,
137+
// we need to make sure we wake up exactly this one.
138+
// That means that we need to Broadcast to all of the goroutines,
139+
// which will wake them all up.
140+
//
141+
// If there are N concurrent calls to waitOnCond, each of the goroutines
142+
// will spuriously wake up O(N) other goroutines that aren't ready yet,
143+
// so this will cause the overall CPU cost to be O(N²).
144+
cond.Broadcast()
145+
})
130146
defer stopf()
131-
cond.Wait()
132-
return ctx.Err()
147+
148+
// Since the wakeups are using Broadcast instead of Signal, this call to
149+
// Wait may unblock due to some other goroutine's context becoming done,
150+
// so to be sure that ctx is actually done we need to check it in a loop.
151+
for !conditionMet() {
152+
cond.Wait()
153+
if ctx.Err() != nil {
154+
return ctx.Err()
155+
}
156+
}
157+
158+
return nil
133159
}
134160

135-
ctx, cancel := context.WithTimeout(context.Background(), 1*time.Millisecond)
136-
defer cancel()
161+
cond := sync.NewCond(new(sync.Mutex))
162+
163+
var wg sync.WaitGroup
164+
for i := 0; i < 4; i++ {
165+
wg.Add(1)
166+
go func() {
167+
defer wg.Done()
137168

138-
var mu sync.Mutex
139-
cond := sync.NewCond(&mu)
169+
ctx, cancel := context.WithTimeout(context.Background(), 1*time.Millisecond)
170+
defer cancel()
140171

141-
mu.Lock()
142-
err := waitOnCond(ctx, cond)
143-
fmt.Println(err)
172+
cond.L.Lock()
173+
defer cond.L.Unlock()
174+
175+
err := waitOnCond(ctx, cond, func() bool { return false })
176+
fmt.Println(err)
177+
}()
178+
}
179+
wg.Wait()
144180

145181
// Output:
146182
// context deadline exceeded
183+
// context deadline exceeded
184+
// context deadline exceeded
185+
// context deadline exceeded
147186
}
148187

149188
// This example uses AfterFunc to define a function which reads from a net.Conn,

0 commit comments

Comments
 (0)