Skip to content

Commit 76d39ae

Browse files
rscgopherbot
authored andcommitted
cmd/link, runtime: Apple libc atfork workaround take 3
CL 451735 worked around bugs in Apple's atfork handlers by calling notify_is_valid_token and xpc_atfork_child at startup, so that init code that wouldn't be safe in the child process would be warmed up in the parent process instead, but xpc_atfork_child broke use of the xpc library in Go programs, and xpc is internally used by various macOS frameworks (#57263). CL 459175 reverted that change, and then CL 459176 tried a new approach: use __fork, which doesn't call any of the atfork handlers at all. That worked, but an Apple engineer reviewing the change in private email suggests that since __fork is not public API, it should be avoided. The same engineer (with access to the source code for the xpc library) suggests that the breakage in #57263 is caused by xpc_atfork_child marking the library as unusable, expecting an imminent call to exec, and that calling xpc_date_create_from_current instead would do the necessary initialization without marking xpc as unusable. CL 460475 reverted that change, to prepare for this one. This CL goes back to the original “call functions to warm things up” approach, replacing xpc_atfork_child with xpc_date_create_from_current. The CL also updates cmd/link to use OS and SDK version 10.13.0 for x86 macOS binaries, up from 10.9.0, also suggested by the Apple engineer. Combined with the two warmup calls, this makes the fork hangs go away. The minimum macOS version has been 10.13 High Sierra since Go 1.17, so there should be no problem with writing that in the binaries too. Fixes #33565. Fixes #56784. Fixes #57263. Fixes #57577. Change-Id: I20769d9daa1fe9ea930f8009481335f8a14dc21b Reviewed-on: https://go-review.googlesource.com/c/go/+/460476 Auto-Submit: Russ Cox <[email protected]> Run-TryBot: Russ Cox <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Bryan Mills <[email protected]> Reviewed-by: Cherry Mui <[email protected]>
1 parent 0a0de0f commit 76d39ae

File tree

5 files changed

+64
-2
lines changed

5 files changed

+64
-2
lines changed

src/cmd/link/internal/ld/macho.go

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -479,8 +479,11 @@ func (ctxt *Link) domacho() {
479479
var version uint32
480480
switch ctxt.Arch.Family {
481481
case sys.AMD64:
482-
// The version must be at least 10.9; see golang.org/issues/30488.
483-
version = 10<<16 | 9<<8 | 0<<0 // 10.9.0
482+
// This must be fairly recent for Apple signing (go.dev/issue/30488).
483+
// Having too old a version here was also implicated in some problems
484+
// calling into macOS libraries (go.dev/issue/56784).
485+
// In general this can be the most recent supported macOS version.
486+
version = 10<<16 | 13<<8 | 0<<0 // 10.13.0
484487
case sys.ARM64:
485488
version = 11<<16 | 0<<8 | 0<<0 // 11.0.0
486489
}

src/runtime/os_darwin.go

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -136,6 +136,8 @@ func osinit() {
136136

137137
ncpu = getncpu()
138138
physPageSize = getPageSize()
139+
140+
osinit_hack()
139141
}
140142

141143
func sysctlbynameInt32(name []byte) (int32, int32) {

src/runtime/sys_darwin.go

Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -179,6 +179,45 @@ func pthread_kill(t pthread, sig uint32) {
179179
}
180180
func pthread_kill_trampoline()
181181

182+
// osinit_hack is a clumsy hack to work around Apple libc bugs
183+
// causing fork+exec to hang in the child process intermittently.
184+
// See go.dev/issue/33565 and go.dev/issue/56784 for a few reports.
185+
//
186+
// The stacks obtained from the hung child processes are in
187+
// libSystem_atfork_child, which is supposed to reinitialize various
188+
// parts of the C library in the new process.
189+
//
190+
// One common stack dies in _notify_fork_child calling _notify_globals
191+
// (inlined) calling _os_alloc_once, because _os_alloc_once detects that
192+
// the once lock is held by the parent process and then calls
193+
// _os_once_gate_corruption_abort. The allocation is setting up the
194+
// globals for the notification subsystem. See the source code at [1].
195+
// To work around this, we can allocate the globals earlier in the Go
196+
// program's lifetime, before any execs are involved, by calling any
197+
// notify routine that is exported, calls _notify_globals, and doesn't do
198+
// anything too expensive otherwise. notify_is_valid_token(0) fits the bill.
199+
//
200+
// The other common stack dies in xpc_atfork_child calling
201+
// _objc_msgSend_uncached which ends up in
202+
// WAITING_FOR_ANOTHER_THREAD_TO_FINISH_CALLING_+initialize. Of course,
203+
// whatever thread the child is waiting for is in the parent process and
204+
// is not going to finish anything in the child process. There is no
205+
// public source code for these routines, so it is unclear exactly what
206+
// the problem is. An Apple engineer suggests using xpc_date_create_from_current,
207+
// which empirically does fix the problem.
208+
//
209+
// So osinit_hack_trampoline (in sys_darwin_$GOARCH.s) calls
210+
// notify_is_valid_token(0) and xpc_date_create_from_current(), which makes the
211+
// fork+exec hangs stop happening. If Apple fixes the libc bug in
212+
// some future version of macOS, then we can remove this awful code.
213+
//
214+
//go:nosplit
215+
func osinit_hack() {
216+
libcCall(unsafe.Pointer(abi.FuncPCABI0(osinit_hack_trampoline)), nil)
217+
return
218+
}
219+
func osinit_hack_trampoline()
220+
182221
// mmap is used to do low-level memory allocation via mmap. Don't allow stack
183222
// splits, since this function (used by sysAlloc) is called in a lot of low-level
184223
// parts of the runtime and callers often assume it won't acquire any locks.
@@ -548,3 +587,6 @@ func setNonblock(fd int32) {
548587
//go:cgo_import_dynamic libc_pthread_cond_wait pthread_cond_wait "/usr/lib/libSystem.B.dylib"
549588
//go:cgo_import_dynamic libc_pthread_cond_timedwait_relative_np pthread_cond_timedwait_relative_np "/usr/lib/libSystem.B.dylib"
550589
//go:cgo_import_dynamic libc_pthread_cond_signal pthread_cond_signal "/usr/lib/libSystem.B.dylib"
590+
591+
//go:cgo_import_dynamic libc_notify_is_valid_token notify_is_valid_token "/usr/lib/libSystem.B.dylib"
592+
//go:cgo_import_dynamic libc_xpc_date_create_from_current xpc_date_create_from_current "/usr/lib/libSystem.B.dylib"

src/runtime/sys_darwin_amd64.s

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -597,6 +597,15 @@ TEXT runtime·pthread_kill_trampoline(SB),NOSPLIT,$0
597597
POPQ BP
598598
RET
599599

600+
TEXT runtime·osinit_hack_trampoline(SB),NOSPLIT,$0
601+
PUSHQ BP
602+
MOVQ SP, BP
603+
MOVQ $0, DI // arg 1 val
604+
CALL libc_notify_is_valid_token(SB)
605+
CALL libc_xpc_date_create_from_current(SB)
606+
POPQ BP
607+
RET
608+
600609
// syscall calls a function in libc on behalf of the syscall package.
601610
// syscall takes a pointer to a struct like:
602611
// struct {

src/runtime/sys_darwin_arm64.s

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -458,6 +458,12 @@ TEXT runtime·pthread_setspecific_trampoline(SB),NOSPLIT,$0
458458
BL libc_pthread_setspecific(SB)
459459
RET
460460

461+
TEXT runtime·osinit_hack_trampoline(SB),NOSPLIT,$0
462+
MOVD $0, R0 // arg 1 val
463+
BL libc_notify_is_valid_token(SB)
464+
BL libc_xpc_date_create_from_current(SB)
465+
RET
466+
461467
// syscall calls a function in libc on behalf of the syscall package.
462468
// syscall takes a pointer to a struct like:
463469
// struct {

0 commit comments

Comments
 (0)