Description
The FreeBSD buildbots have been hitting `UNHANDLED TASK ERROR: EOFError: read end of file` while running the test suite on FreeBSD 12, and I can reproduce it locally. The underlying failure is actually a segfault, but you wouldn't know that unless you check `/var/log/messages` and/or notice the core dump file left behind.
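For anyone else trying to confirm that the `EOFError` is really a crashed process, here is a minimal sketch that scans syslog text for the kernel's "exited on signal" lines. The message shape in the regex is an assumption about how FreeBSD logs these (with an optional `jid` field), and `find_crashes` is a hypothetical helper name, not something from Julia or FreeBSD.

```python
import re

# Assumed shape of the FreeBSD kernel message, e.g.
#   "... kernel: pid 1234 (julia), jid 0, uid 1001: exited on signal 11 (core dumped)"
# The "jid N," field is treated as optional because not every release prints it.
CRASH_RE = re.compile(
    r"pid (?P<pid>\d+) \((?P<name>[^)]+)\),(?: jid \d+,)? uid \d+: "
    r"exited on signal (?P<signal>\d+)(?P<core> \(core dumped\))?"
)


def find_crashes(log_text, process_name=None):
    """Return one dict per 'exited on signal' line found in log_text.

    If process_name is given, only entries whose logged command name
    matches it exactly are returned.
    """
    crashes = []
    for line in log_text.splitlines():
        m = CRASH_RE.search(line)
        if m is None:
            continue  # not a crash report line
        if process_name is not None and m.group("name") != process_name:
            continue
        crashes.append(
            {
                "pid": int(m.group("pid")),
                "name": m.group("name"),
                "signal": int(m.group("signal")),
                "core_dumped": m.group("core") is not None,
            }
        )
    return crashes
```

To use it, read `/var/log/messages` into a string and pass it to `find_crashes`. An entry with `signal` equal to 11 is a segfault. Note that the logged command name may differ for a debug build, so filtering by `process_name` is optional.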
Running the tests with a debug build and examining the backtrace of the coredump with GDB, I get:
#0 0x00000008006e44cf in _ULx86_64_dwarf_search_unwind_table () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#1 0x00000008006d80f5 in _ULx86_64_Iextract_dynamic_proc_info () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#2 0x00000008006d821c in local_find_proc_info () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#3 0x00000008006d8167 in _ULx86_64_Ifind_dynamic_proc_info () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#4 0x00000008006e11f4 in fetch_proc_info () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#5 0x00000008006e092c in find_reg_state () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#6 0x00000008006e07ef in _ULx86_64_dwarf_step () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#7 0x00000008006da949 in _ULx86_64_step () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#8 0x000000080154cd34 in jl_unw_step (cursor=0x8598a52e0, from_signal_handler=0, ip=0x8598a5258, sp=0x8598a5250) at stackwalk.c:545
#9 0x000000080154afde in jl_unw_stepn (cursor=0x8598a52e0, bt_data=0x80616f940, bt_size=0x8598a52d8, sp=0x0, maxsize=80000, skip=0, ppgcstack=0x8598a56d8,
from_signal_handler=0) at stackwalk.c:99
#10 0x000000080154b2dd in rec_backtrace (bt_data=0x80616f940, maxsize=80000, skip=2) at stackwalk.c:214
#11 0x000000080150cb9c in record_backtrace (ptls=0x801beb980, skip=1) at task.c:309
#12 0x000000080150cb0c in jl_throw (e=0x806cb2b40) at task.c:605
#13 0x0000000879a96d37 in chkfullrank () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:698
#14 #cholesky!#147 () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:308
#15 cholesky!##kw () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:306
#16 julia_#cholesky!#149_12913 (tol=0, check=1 '\001', A=<error reading variable: Cannot access memory at address 0x0>)
at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:334
#17 0x0000000879a96f27 in cholesky!##kw () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:327
#18 julia_#cholesky#152_12910 (tol=0, check=1 '\001', A=<error reading variable: Cannot access memory at address 0x0>)
at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:459
#19 0x0000000879a97134 in julia_cholesky_12907 (A=<error reading variable: Cannot access memory at address 0x877dd0070>)
at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:459
#20 0x0000000879a97215 in jfptr_cholesky_12908 ()
#21 0x00000008014e52dd in _jl_invoke (F=0x80e5a1dd0 <jl_system_image_data+43801616>, args=0x8598a5f98, nargs=2, mfunc=0x873092900, world=31263) at gf.c:2245
#22 0x00000008014e5383 in jl_apply_generic (F=0x80e5a1dd0 <jl_system_image_data+43801616>, args=0x8598a5f98, nargs=2) at gf.c:2427
#23 0x0000000801507040 in jl_apply (args=0x8598a5f90, nargs=3) at ./julia.h:1771
#24 0x0000000801506d01 in do_call (args=0x85c891a78, nargs=3, s=0x8598a6b30) at interpreter.c:125
#25 0x000000080150552b in eval_value (e=0x8090995b0, s=0x8598a6b30) at interpreter.c:214
#26 0x000000080150665d in eval_stmt_value (stmt=0x8090995b0, s=0x8598a6b30) at interpreter.c:165
#27 0x00000008015046d4 in eval_body (stmts=0x877f29c40, s=0x8598a6b30, ip=252, toplevel=1) at interpreter.c:579
#28 0x0000000801504190 in eval_body (stmts=0x877f29c40, s=0x8598a6b30, ip=249, toplevel=1) at interpreter.c:512
#29 0x0000000801504190 in eval_body (stmts=0x877f29c40, s=0x8598a6b30, ip=62, toplevel=1) at interpreter.c:512
#30 0x0000000801504190 in eval_body (stmts=0x877f29c40, s=0x8598a6b30, ip=10, toplevel=1) at interpreter.c:512
#31 0x0000000801504e34 in jl_interpret_toplevel_thunk (m=0x807eb9600, src=0x809bb3210) at interpreter.c:727
#32 0x000000080152d9ff in jl_toplevel_eval_flex (m=0x807eb9600, e=0x809085c70, fast=1, expanded=1) at toplevel.c:885
#33 0x000000080152e449 in jl_eval_module_expr (parent_module=0x807ebba90, ex=0x807b4e0d0) at toplevel.c:196
#34 0x000000080152c8d3 in jl_toplevel_eval_flex (m=0x807ebba90, e=0x807b4e0d0, fast=1, expanded=0) at toplevel.c:673
#35 0x000000080152d52e in jl_toplevel_eval_flex (m=0x807ebba90, e=0x807c7ded0, fast=1, expanded=0) at toplevel.c:830
#36 0x000000080152f054 in jl_toplevel_eval (m=0x807ebba90, v=0x807c7ded0) at toplevel.c:894
#37 0x000000080152f2ef in jl_toplevel_eval_in (m=0x807ebba90, ex=0x807c7ded0) at toplevel.c:944
#38 0x000000080b6ba028 in eval () at boot.jl:373
#39 japi1_include_string_39775 (mapexpr=..., mod=0x80639d390, code=0x59ac, filename=0x58) at loading.jl:1207
#40 0x00000008014dbe57 in jl_fptr_args (f=0x80ecbb1e0 <jl_system_image_data+51245088>, args=0x8598a8c20, nargs=4,
m=0x80ecbb4f0 <jl_system_image_data+51245872>) at gf.c:2014
#41 0x00000008014e51f5 in _jl_invoke (F=0x80ecbb1e0 <jl_system_image_data+51245088>, args=0x8598a8c20, nargs=4,
mfunc=0x80bf62d10 <jl_system_image_data+3697488>, world=31247) at gf.c:2226
#42 0x00000008014e5383 in jl_apply_generic (F=0x80ecbb1e0 <jl_system_image_data+51245088>, args=0x8598a8c20, nargs=4) at gf.c:2427
#43 0x000000080b88709f in japi1__include_32638 (mapexpr=0x80ca386b0 <jl_system_image_data+15058160>, mod=0x80639d390, _path=0x58) at loading.jl:1264
#44 0x000000081547ffa6 in include () at Base.jl:420
#45 macro expansion () at /usr/home/julia/Desktop/julia/test/testdefs.jl:24
#46 macro expansion () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/Test/src/Test.jl:1283
#47 macro expansion () at /usr/home/julia/Desktop/julia/test/testdefs.jl:23
#48 macro expansion () at timing.jl:368
#49 julia_#runtests#1_917 (seed=107125921925593505577364040855661097293, name=<error reading variable: Cannot access memory at address 0x0>,
path=<error reading variable: Cannot access memory at address 0x0>, isolate=1 '\001') at /usr/home/julia/Desktop/julia/test/testdefs.jl:21
#50 0x0000000815480ab7 in runtests##kw () at /usr/home/julia/Desktop/julia/test/testdefs.jl:6
#51 julia_runtests##kw_914 (name=<error reading variable: Cannot access memory at address 0x80>,
path=<error reading variable: Cannot access memory at address 0x877dd0070>) at /usr/home/julia/Desktop/julia/test/testdefs.jl:6
#52 0x0000000815480b1a in jfptr_runtests##kw_915 ()
#53 0x00000008014e52dd in _jl_invoke (F=0x80639a668, args=0x8598a9838, nargs=4, mfunc=0x806ffe2c0, world=31247) at gf.c:2245
#54 0x00000008014e5383 in jl_apply_generic (F=0x80639a668, args=0x8598a9838, nargs=4) at gf.c:2427
#55 0x00000008014f7030 in jl_apply (args=0x8598a9830, nargs=5) at ./julia.h:1771
#56 0x00000008014f6dce in do_apply (args=0x8598a9a68, nargs=3, iterate=0x80ec994c0 <jl_system_image_data+51106560>) at builtins.c:713
#57 0x00000008014f5fdf in jl_f__apply_iterate (F=0x0, args=0x8598a9a60, nargs=4) at builtins.c:721
#58 0x0000000815475586 in julia_#106_765 () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:278
#59 0x0000000815475812 in julia_run_work_thunk_762 (thunk=..., print_error=0 '\000')
at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:63
#60 0x000000081547599f in macro expansion () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:278
#61 julia_#105_759 () at task.jl:411
#62 0x0000000815475c60 in jfptr_#105_760 ()
#63 0x00000008014e51f5 in _jl_invoke (F=0x806fdcd80, args=0x807cfef08, nargs=0, mfunc=0x8079d6b80, world=31247) at gf.c:2226
#64 0x00000008014e5383 in jl_apply_generic (F=0x806fdcd80, args=0x807cfef08, nargs=0) at gf.c:2427
#65 0x000000080150bdd0 in jl_apply (args=0x807cfef00, nargs=1) at ./julia.h:1771
#66 0x000000080150dc2f in start_task () at task.c:881
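In the trace above, frames #0 through #7 are all inside `libunwind.so.8`, and the first frame outside it is `jl_unw_step`, reached from `jl_throw` via `record_backtrace`. For longer or repeated core dumps, that boundary can be located mechanically. The following is a rough sketch that assumes the usual GDB `bt` line layout shown above. `parse_frames` and `first_caller_outside` are hypothetical helper names.

```python
import re

# "#8 0x... in jl_unw_step (cursor=...) at stackwalk.c:545"
# "#45 macro expansion () at .../testdefs.jl:24"   (no address, name with a space)
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(.+?) \(")
# Trailing "from /path/to/lib.so" that GDB prints for frames without source info.
LIB_RE = re.compile(r"\bfrom (\S+)\s*$")


def parse_frames(bt_text):
    """Parse GDB backtrace text into (frame_number, function, library_or_None)."""
    frames = []
    for line in bt_text.splitlines():
        line = line.strip()
        m = FRAME_RE.match(line)
        if m is None:
            continue  # wrapped continuation of the previous frame, or other text
        lib = LIB_RE.search(line)
        frames.append((int(m.group(1)), m.group(2), lib.group(1) if lib else None))
    return frames


def first_caller_outside(frames, lib_name):
    """Return (number, function) of the innermost frame not located in lib_name."""
    for num, func, lib in frames:
        if lib is None or lib_name not in lib:
            return num, func
    return None
```

Calling `first_caller_outside(parse_frames(bt_text), "libunwind")` on the backtrace above should point at frame #8, which is where the Julia runtime hands control to the unwinder.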
@vchuravy looked at this a bit and suspected a crash in the unwinder while it reads process data, caused by a bug in libunwind and/or buggy DWARF emission, possibly involving asynchronous unwind tables. He recommended the following:
- Build Julia with `-fno-asynchronous-unwind-tables`. I tried that but the crash persisted.
- Swap nongnu libunwind for LLVM libunwind. Work in progress in "WIP: Use LLVM libunwind on FreeBSD" (#41955).
- Upgrade LLVM libunwind from 11.0.1. Work in progress in "Add LLVM libunwind v12.0.1" (JuliaPackaging/Yggdrasil#3504).
He also noted the following upstream bug reports, which may be a useful breadcrumb (for folks who understand these things 😅)