Cannot find symbol for section 2: .text. #981

Open
E5ten opened this issue Apr 8, 2020 · 25 comments
Labels
[ARCH] x86_64 This bug impacts ARCH=x86_64 [BUG] linux A bug that should be fixed in the mainline kernel. [FIXED][LINUX] 5.10 This bug was fixed in Linux 5.10 [TOOL] integrated-as The issue is relevant to LLVM integrated assembler [WORKAROUND] Applied This bug has an applied workaround

Comments

@E5ten

E5ten commented Apr 8, 2020

Using AS=clang to build with integrated-as, on x86_64, when scripts/recordmcount is run on certain objects (for me it happens with init/initramfs.o and kernel/elfcore.o at least) I get the error in the title.

@E5ten E5ten added the [TOOL] integrated-as The issue is relevant to LLVM integrated assembler label Apr 8, 2020
@nathanchance
Member

Steps to reproduce (from #986):

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

$ cd linux

$ curl -LSs https://gist.github.com/nathanchance/171b7d672e311b56b4329821b8a43acd/raw/9a1dbb1f11552d0b6efec48ac29505dd0c768d1b/20200401_jpoimboe_objtool_fixes.mbx | git apply -3v

$ curl -LSs https://lore.kernel.org/lkml/[email protected]/raw | git apply -3v

$ ./scripts/config --file arch/x86/configs/x86_64_defconfig -e FUNCTION_TRACER

$ make -j$(nproc) -s LLVM=1 LLVM_IAS=1 O=out/x86_64 distclean defconfig bzImage

@E5ten
Author

E5ten commented Jun 13, 2020

I did an integrated-as build and specifically added CFLAGS_.o += -no-integrated-as to the relevant Makefiles for init/initramfs.o and kernel/elfcore.o, and got through the rest of the build. So, at least for my configuration, those are the only two objects affected by this issue.

@E5ten
Author

E5ten commented Jun 13, 2020

I assume something like this also needs to be done for recordmcount to fix this?
https://lore.kernel.org/lkml/9a9cae7fcf628843aabe5a086b1a3c5bf50f42e8.1585761021.git.jpoimboe@redhat.com/

@dileks
Collaborator

dileks commented Jun 14, 2020

Just to clarify:
You use here LLVM_IAS=1 together with LLVM=1.

@E5ten
Author

E5ten commented Jun 15, 2020

yeah.

@dileks
Collaborator

dileks commented Jun 15, 2020

@E5ten

I switched over to use LLVM_IAS=1 together with LLVM=1.

@samitolvanen
Member

samitolvanen commented Oct 27, 2020

I also ran into this with LLVM_IAS=1 when building x86_64 defconfig with dynamic ftrace. Testing Peter's objtool mcount patch, I noticed that objtool segfaults for several object files because the files are missing STT_SECTION symbols for some of the sections.

A random example, compiled with LLVM_IAS=1:

$ readelf --sections arch/x86/mm/hugetlbpage.o | grep PROGBITS
  [ 2] .text             PROGBITS         0000000000000000  00000240
  [ 4] .altinstructions  PROGBITS         0000000000000000  000007c8
  [ 6] .altinstr_re[...] PROGBITS         0000000000000000  00000890
  [ 8] .altinstr_aux     PROGBITS         0000000000000000  000008d0
  [10] .init.text        PROGBITS         0000000000000000  00000988
...
$ readelf --symbols arch/x86/mm/hugetlbpage.o | grep SECTION
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2 
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    8 

Objtool fails here because .init.text doesn't have a corresponding STT_SECTION symbol. Without IAS, the symbol is generated:

$ readelf --sections arch/x86/mm/hugetlbpage.o | grep PROGBITS
  [ 1] .text             PROGBITS         0000000000000000  00000040
  [ 3] .data             PROGBITS         0000000000000000  000005c8
  [ 5] .altinstructions  PROGBITS         0000000000000000  000005c8
  [ 7] .altinstr_re[...] PROGBITS         0000000000000000  00000690
  [ 9] .altinstr_aux     PROGBITS         0000000000000000  000006d0
  [11] .init.text        PROGBITS         0000000000000000  00000788
...
$ readelf --symbols arch/x86/mm/hugetlbpage.o | grep SECTION
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4 
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    7 
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    9 
     9: 0000000000000000     0 SECTION LOCAL  DEFAULT   11 
... 

Edit: OK, my issue looks similar to issue #669, just in a different part of objtool. Specifically, the new static call processing code and the proposed mcount patch both depend on section symbols, so if either of these occurs in a section for which a symbol is missing, objtool is going to segfault. This doesn't appear to be a problem with static calls right now (or we would have noticed it), but the mcount patch triggers it quite often. I fixed this in commit 54d837e for now.
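
The comparison above can be automated. The following is a minimal sketch, not part of objtool or recordmcount: a hypothetical helper, `sections_without_symbol`, takes the text printed by the two readelf invocations shown in this comment and lists the PROGBITS sections that no STT_SECTION symbol refers to. The regular expressions assume the readelf column layout seen in the output above, and the sample data is the (truncated) LLVM_IAS=1 excerpt from this comment.

```python
import re

# Matches lines such as "  [10] .init.text        PROGBITS ..." from
# `readelf --sections`; captures the section index and name.
SECTION_RE = re.compile(r"^\s*\[\s*(\d+)\]\s+(\S+)\s+PROGBITS")
# Matches SECTION-type lines from `readelf --symbols`; the captured
# group is the Ndx column, i.e. the section the symbol belongs to.
SYMBOL_RE = re.compile(
    r"^\s*\d+:\s+[0-9a-fA-F]+\s+\d+\s+SECTION\s+\S+\s+\S+\s+(\d+)"
)


def sections_without_symbol(sections_text, symbols_text):
    """Return (index, name) for PROGBITS sections lacking a section symbol."""
    sections = {}
    for line in sections_text.splitlines():
        match = SECTION_RE.match(line)
        if match:
            sections[int(match.group(1))] = match.group(2)

    covered = set()
    for line in symbols_text.splitlines():
        match = SYMBOL_RE.match(line)
        if match:
            covered.add(int(match.group(1)))

    return [(idx, name) for idx, name in sorted(sections.items())
            if idx not in covered]


# Excerpt copied from the LLVM_IAS=1 output above (the section list is
# truncated in the thread, so only the sections shown are checked).
IAS_SECTIONS = """
  [ 2] .text             PROGBITS         0000000000000000  00000240
  [ 4] .altinstructions  PROGBITS         0000000000000000  000007c8
  [ 6] .altinstr_re[...] PROGBITS         0000000000000000  00000890
  [ 8] .altinstr_aux     PROGBITS         0000000000000000  000008d0
  [10] .init.text        PROGBITS         0000000000000000  00000988
"""
IAS_SYMBOLS = """
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
"""

for index, name in sections_without_symbol(IAS_SECTIONS, IAS_SYMBOLS):
    print(f"section {index} ({name}) has no STT_SECTION symbol")
```

For the excerpt above, this reports sections 4 (.altinstructions) and 10 (.init.text). Feeding it the output of the non-IAS build should report nothing for the sections shown.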

@nickdesaulniers nickdesaulniers added the [BUG] Untriaged Something isn't working label Nov 16, 2020
@nickdesaulniers
Member

It sounds like CrOS is hitting this now trying to move to LLVM_IAS=1: https://bugs.chromium.org/p/chromium/issues/detail?id=1148073 cc @jcai19

@nickdesaulniers
Member

With defconfig+FUNCTION_TRACER, I see this in:

init/initramfs.o
kernel/elfcore.o

Sami, I think 54d837e no longer applies on linux-next?

@samitolvanen
Member

> Sami, I think 54d837e no longer applies on linux-next?

That's because it only fixes the mcount pass (commit 0271fa5), which isn't upstream yet. You probably need an identical fix for the static call pass instead, assuming that's where it crashes.

@jcai19
Member

jcai19 commented Nov 20, 2020

> Sami, I think 54d837e no longer applies on linux-next?
>
> That's because it only fixes the mcount pass (commit 0271fa5), which isn't upstream yet.

May I know what dependencies are needed to backport 0271fa5 and 54d837e to 5.4? While trying to test them on 5.4, I realized there were many dependencies I needed to cherry-pick or backport in order to apply these two patches cleanly. For example, 0271fa5 seems to be based on upstream commit 0f1441b44e823a74f3f3780902a113e07c73fbf6, which is not in 5.4 yet, but I could not cherry-pick it cleanly into the stable/linux-5.4.y branch because its dependencies were also missing.

> You probably need an identical fix for the static call pass instead, assuming that's where it crashes.

Just to be clear, does that mean 0271fa5 and 54d837e are not enough to fix this issue? Thanks.

@samitolvanen
Member

> Just to be clear, does that mean 0271fa5 and 54d837e are not enough to fix this issue? Thanks.

After actually looking at the CrOS bug, I'm guessing it's the same as the original recordmcount issue, and these objtool patches are not going to help here. Both issues have the same root cause (Clang not always generating section symbols), but you'll need to fix this in recordmcount instead.

@nickdesaulniers
Member

I think @arndb just sent patches for this that got picked up by akpm: https://lore.kernel.org/lkml/[email protected]/

@nickdesaulniers nickdesaulniers added [BUG] linux A bug that should be fixed in the mainline kernel. [PATCH] Accepted A submitted patch has been accepted upstream and removed [BUG] Untriaged Something isn't working labels Dec 5, 2020
@arndb

arndb commented Dec 5, 2020

The patches I sent just work around the problem by avoiding the weak functions in those files; the bug is still there and could show up any time another file has only weak functions in the .text section.
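
Why a text section with only weak functions is a problem can be illustrated with a small model. This is a sketch of the lookup as this thread describes it (recordmcount needs a non-weak symbol in the section to anchor its records against), not a copy of the code in scripts/recordmcount. The `Symbol` type, the `find_base_symbol` function, the section index, and the sample symbol tables are all hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Symbol(NamedTuple):
    name: str   # empty for a section symbol
    bind: str   # "LOCAL", "GLOBAL", or "WEAK"
    shndx: int  # index of the section the symbol is defined in


def find_base_symbol(symbols: Sequence[Symbol],
                     text_shndx: int) -> Optional[Symbol]:
    """Return the first non-weak symbol defined in section text_shndx.

    Weak symbols are skipped: a weak definition may be replaced by a
    strong one from another object at link time, so an offset relative
    to it would not be a stable anchor.
    """
    for sym in symbols:
        if sym.shndx == text_shndx and sym.bind in ("LOCAL", "GLOBAL"):
            return sym
    return None


TEXT = 2  # hypothetical section index of .text

# Assembler emitted a section symbol (LOCAL binding): lookup succeeds.
with_section_symbol = [
    Symbol("", "LOCAL", TEXT),
    Symbol("weak_hook", "WEAK", TEXT),
]
# No section symbol and only a weak function in .text: lookup fails.
weak_only = [Symbol("weak_hook", "WEAK", TEXT)]

for label, table in (("with section symbol", with_section_symbol),
                     ("weak only", weak_only)):
    base = find_base_symbol(table, TEXT)
    if base is None:
        print(f"{label}: Cannot find symbol for section {TEXT}: .text.")
    else:
        print(f"{label}: base symbol found (binding {base.bind})")
```

In this model, adding any local or global function to the section makes the lookup succeed again, which is consistent with the workaround of avoiding weak functions in the affected files.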

@E5ten
Author

E5ten commented Dec 6, 2020

With these patches I was able to build and boot an x86_64 kernel with LLVM=1 and LLVM_IAS=1

@nickdesaulniers nickdesaulniers added the [WORKAROUND] Applied This bug has an applied workaround label Jan 1, 2021
@dileks
Collaborator

dileks commented Jan 1, 2021

Both patches are in Linux v5.10, and the linux-stable trees have recently started carrying them.

$ git log --oneline | grep 'initramfs: fix clang build failure'
55d5b7dd6451 initramfs: fix clang build failure
$ git describe --contains 55d5b7dd6451
v5.10~14^2~3

$ git log --oneline | grep 'elfcore: fix building with clang'
6e7b64b9dd6d elfcore: fix building with clang
$ git describe --contains 6e7b64b9dd6d
v5.10~14^2~2

@dileks dileks added [FIXED][LINUX] 5.10 This bug was fixed in Linux 5.10 and removed [PATCH] Accepted A submitted patch has been accepted upstream labels Jan 1, 2021
@nathanchance
Member

Looks like the PowerPC folks are getting bit by this too:

linuxppc/issues#388

https://lore.kernel.org/r/cd0f6bdfdf1ee096fb2c07e7b38940921b8e9118.1637764848.git.christophe.leroy@csgroup.eu/

@emojifreak reported issues with ARCH=mips allmodconfig + CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y:

$ echo "CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y
CONFIG_MIPS32_O32=n" >>kernel/configs/repro.config

$ make -skj"$(nproc)" ARCH=mips LLVM=1 distclean allmodconfig repro.config init/calibrate.o
...
Cannot find symbol for section 8: .text.calibrate_delay_is_known.
init/calibrate.o: failed
...

KCOV helps reproduce it but I doubt it is strictly related to the issue. cvise spits out:

$ cat calibrate.i
long __attribute__((weak)) calibrate_delay_is_known() { return 0; }

$ clang --target=mipsel-linux-gnu -fsanitize-coverage=trace-pc -ffunction-sections -c calibrate.i

$ ./recordmcount calibrate.o
Cannot find symbol for section 4: .text.calibrate_delay_is_known.
calibrate.o: failed

$ llvm-objdump -x calibrate.o

calibrate.o:    file format elf32-mips
architecture: mipsel
start address: 0x00000000

Program Header:

Dynamic Section:

Sections:
Idx Name                               Size     VMA      Type
  0                                    00000000 00000000
  1 .strtab                            000000c0 00000000
  2 .text                              00000000 00000000 TEXT
  3 .mdebug.abi32                      00000000 00000000
  4 .text.calibrate_delay_is_known     00000034 00000000 TEXT
  5 .rel.text.calibrate_delay_is_known 00000008 00000000
  6 .pdr                               00000020 00000000
  7 .rel.pdr                           00000008 00000000
  8 .comment                           00000016 00000000
  9 .note.GNU-stack                    00000000 00000000
 10 .data                              00000000 00000000 DATA
 11 .bss                               00000000 00000000 BSS
 12 .reginfo                           00000018 00000000
 13 .MIPS.abiflags                     00000018 00000000
 14 .llvm_addrsig                      00000001 00000000
 15 .symtab                            00000040 00000000

SYMBOL TABLE:
00000000 l    df *ABS*  00000000 calibrate.i
00000000  w    F .text.calibrate_delay_is_known 00000034 calibrate_delay_is_known
00000000         *UND*  00000000 __sanitizer_cov_trace_pc

RELOCATION RECORDS FOR [.text.calibrate_delay_is_known]:
OFFSET   TYPE                     VALUE
00000010 R_MIPS_26                __sanitizer_cov_trace_pc

RELOCATION RECORDS FOR [.pdr]:
OFFSET   TYPE                     VALUE
00000000 R_MIPS_32                calibrate_delay_is_known

Without -fsanitize-coverage=trace-pc:

$ clang --target=mipsel-linux-gnu -ffunction-sections -c calibrate.i

$ ./recordmcount calibrate.o

$ llvm-objdump -x calibrate.o

calibrate.o:    file format elf32-mips
architecture: mipsel
start address: 0x00000000

Program Header:

Dynamic Section:

Sections:
Idx Name                           Size     VMA      Type
  0                                00000000 00000000
  1 .strtab                        000000a3 00000000
  2 .text                          00000000 00000000 TEXT
  3 .mdebug.abi32                  00000000 00000000
  4 .text.calibrate_delay_is_known 0000002c 00000000 TEXT
  5 .pdr                           00000020 00000000
  6 .rel.pdr                       00000008 00000000
  7 .comment                       00000016 00000000
  8 .note.GNU-stack                00000000 00000000
  9 .data                          00000000 00000000 DATA
 10 .bss                           00000000 00000000 BSS
 11 .reginfo                       00000018 00000000
 12 .MIPS.abiflags                 00000018 00000000
 13 .llvm_addrsig                  00000000 00000000
 14 .symtab                        00000030 00000000

SYMBOL TABLE:
00000000 l    df *ABS*  00000000 calibrate.i
00000000  w    F .text.calibrate_delay_is_known 0000002c calibrate_delay_is_known

RELOCATION RECORDS FOR [.pdr]:
OFFSET   TYPE                     VALUE
00000000 R_MIPS_32                calibrate_delay_is_known

rnav added a commit to rnav/linux that referenced this issue Apr 25, 2022
…elocations[_add]

kexec_load_purgatory() can fail for many reasons - there is no need to
print an error when encountering unsupported relocations.

This solves a build issue on powerpc with binutils v2.36 and newer [1],
and likely also with the clang integrated assembler [2].

Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
symbols") [3], binutils started dropping section symbols that it thought
were unused.  This isn't an issue in general, but with kexec_file.c, gcc
is placing kexec_arch_apply_relocations[_add] into a separate
.text.unlikely section and the section symbol ".text.unlikely" is being
dropped. Due to this, recordmcount is unable to find a non-weak symbol
in .text.unlikely to generate a relocation record against. Dropping
pr_err() calls results in these functions being left in .text section,
enabling recordmcount to emit a proper relocation record.

[1] linuxppc/issues#388
[2] ClangBuiltLinux#981
[3] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1

Signed-off-by: Naveen N. Rao <[email protected]>
@nathanchance
Member

There is a new instance of this problem after commit dbe69b299884 ("bpf: Fix dispatcher patchable function entry to 5 bytes nop") for certain configurations:

$ make -skj"$(nproc)" ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu- LLVM=1 mrproper powernv_defconfig all
Cannot find symbol for section 4: .init.text.
kernel/bpf/dispatcher.o: failed

@nickdesaulniers
Member

linuxppc/issues#388 alludes to this issue. Looks like binutils reverted dropping section symbols just for ppc: bminor/binutils-gdb@c09c8b4 cc @MaskRay

@nathanchance
Member

That's annoying :/ For what it's worth, I have seen that error on i386 as well, so it is not just powerpc that is affected by this.

I think recordmcount is only run for ftrace so maybe a diff like this would help out?

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index e9e95c790b8e..233836893fd8 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -744,6 +744,7 @@ config FTRACE_MCOUNT_USE_RECORDMCOUNT
        depends on !FTRACE_MCOUNT_USE_CC
        depends on !FTRACE_MCOUNT_USE_OBJTOOL
        depends on FTRACE_MCOUNT_RECORD
+       depends on !AS_IS_LLVM

 config TRACING_MAP
        bool

@nathanchance
Member

While that diff stops the build error because it disables the use of recordmcount, it does not prevent ftrace from being selected altogether, which may lead to further reports of ftrace not working, despite being selected. We might be able to fix that error in a similar manner as Arnd's previous patches but I am not sure how to go about that...

@nathanchance
Member

> I am not sure how to go about that...

More specifically, I only tried removing __init from bpf_arch_init_dispatcher_early() in kernel/bpf/dispatcher.c but that is not enough since the declaration in include/linux/bpf.h wins. We cannot remove __init altogether as the x86 version of bpf_arch_init_dispatcher_early() calls text_poke_early(), which is marked __init_or_module, which expands to nothing if CONFIG_MODULES is enabled or __init if not. With that in mind, the following diff resolves the failure that I note above for that specific configuration; so far, I have only seen that failure in three different configurations. It will still be reproducible with CONFIG_MODULES disabled but that is probably okay for now. I can send this as a formal patch on Monday if it seems reasonable.

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 00127abd89ee..4145939bbb6a 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -389,7 +389,7 @@ static int __bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
        return ret;
 }

-int __init bpf_arch_init_dispatcher_early(void *ip)
+int __init_or_module bpf_arch_init_dispatcher_early(void *ip)
 {
        const u8 *nop_insn = x86_nops[5];

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 0566705c1d4e..4aa7bde406f5 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -971,7 +971,7 @@ struct bpf_trampoline *bpf_trampoline_get(u64 key,
                                          struct bpf_attach_target_info *tgt_info);
 void bpf_trampoline_put(struct bpf_trampoline *tr);
 int arch_prepare_bpf_dispatcher(void *image, void *buf, s64 *funcs, int num_funcs);
-int __init bpf_arch_init_dispatcher_early(void *ip);
+int __init_or_module bpf_arch_init_dispatcher_early(void *ip);

 #define BPF_DISPATCHER_INIT(_name) {                           \
        .mutex = __MUTEX_INITIALIZER(_name.mutex),              \
diff --git a/kernel/bpf/dispatcher.c b/kernel/bpf/dispatcher.c
index 04f0a045dcaa..e14a68e9a74f 100644
--- a/kernel/bpf/dispatcher.c
+++ b/kernel/bpf/dispatcher.c
@@ -91,7 +91,7 @@ int __weak arch_prepare_bpf_dispatcher(void *image, void *buf, s64 *funcs, int n
        return -ENOTSUPP;
 }

-int __weak __init bpf_arch_init_dispatcher_early(void *ip)
+int __weak __init_or_module bpf_arch_init_dispatcher_early(void *ip)
 {
        return -ENOTSUPP;
 }
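
The reasoning behind this diff can be modelled in a few lines. The sketch below assumes only what the comment above states: `__init` places a function in .init.text, and `__init_or_module` expands to nothing when CONFIG_MODULES is enabled and to `__init` otherwise. The function name `text_section_for` is hypothetical, and the real annotations are C preprocessor macros rather than anything resembling this Python code.

```python
def text_section_for(annotation: str, config_modules: bool) -> str:
    """Model which text section a function with this annotation lands in."""
    if annotation == "":
        return ".text"
    if annotation == "__init":
        return ".init.text"
    if annotation == "__init_or_module":
        # Expands to nothing with CONFIG_MODULES=y, to __init otherwise.
        return ".text" if config_modules else ".init.text"
    raise ValueError(f"unknown annotation: {annotation!r}")


for modules in (True, False):
    before = text_section_for("__init", modules)
    after = text_section_for("__init_or_module", modules)
    setting = "y" if modules else "n"
    print(f"CONFIG_MODULES={setting}: {before} -> {after}")
```

With CONFIG_MODULES=y the weak bpf_arch_init_dispatcher_early() moves out of .init.text, so it is no longer the only symbol in that section. With CONFIG_MODULES=n nothing changes, which matches the note above that the failure remains reproducible in that configuration.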

@nathanchance
Member

Patch submitted: https://lore.kernel.org/[email protected]/

@nathanchance nathanchance added the [PATCH] Submitted A patch has been submitted for review label Oct 31, 2022
kernel-patches-bot pushed a commit to kernel-patches/bpf-rc that referenced this issue Oct 31, 2022
After commit dbe69b2 ("bpf: Fix dispatcher patchable function entry
to 5 bytes nop"), building kernel/bpf/dispatcher.c in certain
configurations with LLVM's integrated assembler results in a known
recordmcount bug:

  Cannot find symbol for section 4: .init.text.
  kernel/bpf/dispatcher.o: failed

This occurs when there are only weak symbols in a particular section in
the translation unit; in this case, bpf_arch_init_dispatcher_early() is
marked '__weak __init' and it is the only symbol in the .init.text
section. recordmcount expects there to be a symbol for a particular
section but LLVM's integrated assembler (and GNU as after 2.37) do not
generate section symbols. This has been worked around in the kernel
before in commit 55d5b7d ("initramfs: fix clang build failure")
and commit 6e7b64b ("elfcore: fix building with clang").

Fixing recordmcount has been brought up before but there is no clear
solution that does not break ftrace outright.

Unfortunately, working around this issue by removing the '__init' from
bpf_arch_init_dispatcher_early() is not an option, as the x86 version of
bpf_arch_init_dispatcher_early() calls text_poke_early(), which is
marked '__init_or_module', meaning that when CONFIG_MODULES is disabled,
bpf_arch_init_dispatcher_early() has to be marked '__init' as well to
avoid a section mismatch warning from modpost.

However, bpf_arch_init_dispatcher_early() can be marked
'__init_or_module' as well, which would resolve the recordmcount warning
for configurations that support modules (i.e., the vast majority of
them) while not introducing any new warnings for all configurations. Do
so to clear up the build failure for CONFIG_MODULES=y configurations.

Link: ClangBuiltLinux/linux#981
Signed-off-by: Nathan Chancellor <[email protected]>
kernel-patches-bot pushed a commit to kernel-patches/bpf that referenced this issue Oct 31, 2022
kernel-patches-bot pushed a commit to kernel-patches/bpf-rc that referenced this issue Nov 1, 2022
kernel-patches-bot pushed a commit to kernel-patches/bpf that referenced this issue Nov 1, 2022
@nathanchance
Member

nathanchance commented Nov 1, 2022

It sounds like the original patch that caused the recent bpf issue might get reverted in favor of a different fix:

https://lore.kernel.org/Y2DRVwI4bNUppmXJ@krava/

https://lore.kernel.org/[email protected]/

ajdlinux pushed a commit to ajdlinux/linux that referenced this issue Nov 4, 2022
@nathanchance nathanchance removed the [PATCH] Submitted A patch has been submitted for review label Dec 14, 2022
@arndb

arndb commented Apr 14, 2023

Sent a fix for another instance of this problem: https://lore.kernel.org/lkml/[email protected]/T/#u
