Description
Consider this program:
% cat repro.mm
#include <string>
[[gnu::noinline]] void use_string(const std::string&) {}
typedef void(^B)();
[[gnu::noinline]] void use_block(B&& b) {}
int main() {
std::string request_id("123");
use_block(^() {
use_string(request_id);
});
}
I don't have a completely standalone repro yet, but based on my current understanding, it's possible to make a repro that uses just the llvm-project repo. For my current repro, I'm using Chromium's toolchain like so:
% third_party/llvm-build/Release+Asserts/bin/clang repro.mm -isystem third_party/libc++/src/include/ -isystem buildtools/third_party/libc++/ -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_NONE -nostdinc++ -fobjc-arc -fno-exceptions -fno-rtti -fblocks -fsanitize=address out/gn/obj/buildtools/third_party/libc++/libc++/*.o out/gn/obj/buildtools/third_party/libc++abi/libc++abi/*.o -I. -isysroot $(xcrun -show-sdk-path)
This triggers an asan report from the blocks runtime:
% MallocNanoZone=0 ./a.out
=================================================================
==8812==ERROR: AddressSanitizer: container-overflow on address 0x00016bd6f3e4 at pc 0x0001049b74dc bp 0x00016bd6f2d0 sp 0x00016bd6ea80
READ of size 56 at 0x00016bd6f3e4 thread T0
#0 0x1049b74d8 (libclang_rt.asan_osx_dynamic.dylib:arm64+0x4f4d8)
#1 0x194932b48 in _Block_copy+0x58 (libsystem_blocks.dylib:arm64+0x1b48)
#2 0x630f80010409287c (<unknown module>)
#3 0x1948ae0dc (<unknown module>)
#4 0x86397ffffffffffc (<unknown module>)
Here's why:
When creating a block closure, clang will copy C++ objects into the closure using their copy constructor, see clang/lib/CGBlocks.cpp (so far so good).
clang will create calls to objc_retainBlock
under various conditions, see clang/lib/CodeGen/CGObjC.cpp. One such condition is passing a block to a function taking an rvalue reference when ARC is enabled (but there are other conditions).
objc_retainBlock
just calls _Block_copy
(https://github.com/opensource-apple/objc4/blob/cd5e62a5597ea7a31dccef089317abb3a661c154/runtime/NSObject.mm#L221>
If the block contains C++ objects, bit 25 in the block descriptor is set. This bit is called BLOCK_HAS_COPY_DISPOSE
in both clang and BlocksRuntime. CGBlocks.cpp generates a copy helper (GenerateCopyHelperFunction
) that the block points to. For C++ objects in the closure, it again calls copy ctors.
_Block_copy
is defined in compiler-rt/lib/BlocksRuntime/runtime.c. It's part of the system dyld closure which ships in macOS. It does:
...
// Its a stack block. Make a copy.
if (!isGC) {
struct Block_layout *result = malloc(aBlock->descriptor->size);
if (!result) return (void *)0;
memmove(result, aBlock, aBlock->descriptor->size); // bitcopy first
// reset refcount
result->flags &= ~(BLOCK_REFCOUNT_MASK); // XXX not needed
result->flags |= BLOCK_NEEDS_FREE | 1;
result->isa = _NSConcreteMallocBlock;
if (result->flags & BLOCK_HAS_COPY_DISPOSE) {
//printf("calling block copy helper %p(%p, %p)...\n", aBlock->descriptor->copy, result, aBlock);
(*aBlock->descriptor->copy)(result, aBlock); // do fixup
}
return result;
}
...
That is, it first makes a copy of the block's closure variables using memmove()
and then calls the copy helper to fix things up.
1a96179 enabled asan instrumentation for the short string inline storage. In the repro, the string is set to "123"
, meaning some of its short string storage isn't initialized. That makes that memmove()
in the block runtime fail.
Given that this call is in a macOS system library, fixing this will probably be involved.
(This of course only happens if _LIBCPP_INSTRUMENTED_WITH_ASAN
is enabled.)
clang/docs/Block-ABI-Apple.rst describes the whole blocks ABI fairly well.