Skip to content

RISC-V linker relaxation breaks booting with LTO #1942

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
nathanchance opened this issue Oct 5, 2023 · 7 comments
Open

RISC-V linker relaxation breaks booting with LTO #1942

nathanchance opened this issue Oct 5, 2023 · 7 comments
Labels
[ARCH] risc-v This bug impacts ARCH=riscv [BUG] llvm A bug that should be fixed in upstream LLVM [FEATURE] LTO Related to building the kernel with LLVM Link Time Optimization [TOOL] lld The issue is relevant to LLD linker [WORKAROUND] Applied This bug has an applied workaround

Comments

@nathanchance
Copy link
Member

I forgot to boot test v3 of RISC-V LTO enablement and to my surprise, kernels built with LTO and LLVM 15.0.0 and newer were broken (do not get past OpenSBI). I did a bisect of LLVM and landed on the commit that added support for RISC-V linker relaxation to ld.lld: llvm/llvm-project@6611d58

# Apply https://lore.kernel.org/[email protected]/ first then turn on CONFIG_LTO_CLANG_THIN in menuconfig
$ make -skj"$(nproc)" ARCH=riscv LLVM=1 mrproper defconfig menuconfig Image

$ boot-qemu.py -k . -t 20s
...
OpenSBI v1.3.1
   ____                    _____ ____ _____
  / __ \                  / ____|  _ \_   _|
 | |  | |_ __   ___ _ __ | (___ | |_) || |
 | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
 | |__| | |_) |  __/ | | |____) | |_) || |_
  \____/| .__/ \___|_| |_|_____/|___/_____|
        | |
        |_|

Platform Name             : riscv-virtio,qemu
Platform Features         : medeleg
...
Boot HART ID              : 0
Boot HART Domain          : root
Boot HART Priv Version    : v1.12
Boot HART Base ISA        : rv64imafdch
Boot HART ISA Extensions  : time,sstc
Boot HART PMP Count       : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count      : 16
Boot HART MIDELEG         : 0x0000000000001666
Boot HART MEDELEG         : 0x0000000000f0b509
qemu-system-riscv64: terminating on signal 15 from pid 2261906 (timeout)
...

With this diff applied on top:

diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 1329e060c548..c65e9cff59f4 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -45,7 +45,6 @@ else
 endif

 ifeq ($(CONFIG_LD_IS_LLD),y)
-ifeq ($(call test-lt, $(CONFIG_LLD_VERSION), 150000),y)
        KBUILD_CFLAGS += -mno-relax
        KBUILD_AFLAGS += -mno-relax
 ifndef CONFIG_AS_IS_LLVM
@@ -53,7 +52,6 @@ ifndef CONFIG_AS_IS_LLVM
        KBUILD_AFLAGS += -Wa,-mno-relax
 endif
 endif
-endif

 # ISA string setting
 riscv-march-$(CONFIG_ARCH_RV32I)       := rv32ima

The machine boots again:

$ make -skj"$(nproc)" ARCH=riscv LLVM=1 clean Image

$ boot-qemu.py -k . -t 20s
...
OpenSBI v1.3.1
   ____                    _____ ____ _____
  / __ \                  / ____|  _ \_   _|
 | |  | |_ __   ___ _ __ | (___ | |_) || |
 | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
 | |__| | |_) |  __/ | | |____) | |_) || |_
  \____/| .__/ \___|_| |_|_____/|___/_____|
        | |
        |_|

Platform Name             : riscv-virtio,qemu
Platform Features         : medeleg
...
Boot HART ID              : 0
Boot HART Domain          : root
Boot HART Priv Version    : v1.12
Boot HART Base ISA        : rv64imafdch
Boot HART ISA Extensions  : time,sstc
Boot HART PMP Count       : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count      : 16
Boot HART MIDELEG         : 0x0000000000001666
Boot HART MEDELEG         : 0x0000000000f0b509
[    0.000000] Linux version 6.6.0-rc4-00002-g81137efe9431-dirty ([email protected]) (ClangBuiltLinux clang version 18.0.0 (https://github.com/llvm/llvm-project 9a954c693573281407f6ee3f4eb1b16cc545033d), ClangBuiltLinux LLD 18.0.0) #4 SMP Thu Oct  5 13:39:05 MST 2023
[    0.000000] random: crng init done
[    0.000000] Machine model: riscv-virtio,qemu
[    0.000000] SBI specification v1.0 detected
[    0.000000] SBI implementation ID=0x1 Version=0x10003
[    0.000000] SBI TIME extension detected
[    0.000000] SBI IPI extension detected
[    0.000000] SBI RFENCE extension detected
[    0.000000] SBI SRST extension detected
...

I will see if I can narrow down if this happens in a particular translation unit. In the meantime, if we wanted to workaround this in the kernel, we could add something like this to v4:

diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 1329e060c548..ecaa784e8198 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -45,7 +45,7 @@ else
 endif

 ifeq ($(CONFIG_LD_IS_LLD),y)
-ifeq ($(call test-lt, $(CONFIG_LLD_VERSION), 150000),y)
+ifeq ($(filter y, $(call test-lt, $(CONFIG_LLD_VERSION), 150000) $(CONFIG_LTO_CLANG)),y)
        KBUILD_CFLAGS += -mno-relax
        KBUILD_AFLAGS += -mno-relax
 ifndef CONFIG_AS_IS_LLVM
@nathanchance nathanchance added [BUG] Untriaged Something isn't working [TOOL] lld The issue is relevant to LLD linker [ARCH] risc-v This bug impacts ARCH=riscv [FEATURE] LTO Related to building the kernel with LLVM Link Time Optimization labels Oct 5, 2023
@twd2
Copy link
Member

twd2 commented Oct 14, 2023

I can reproduce this issue with Linux 6.5 and LLVM commit 0ce6255a5058.

After some debugging using QEMU, I found the return value of ss->css_alloc(NULL) (where it points to cpuset_css_alloc()) in cgroup_init_subsys() was NULL and later triggered an unexpected page fault.

The source code of cpuset_css_alloc() is:

static struct cgroup_subsys_state *
cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
{
	struct cpuset *cs;

	if (!parent_css)
		return &top_cpuset.css;
// ...

Disassembling cpuset_css_alloc() in vmlinux.o shows:

0000000000000000 <cpuset_css_alloc>:
   0:   1101                    addi    sp,sp,-32
   2:   ec06                    sd      ra,24(sp)
   4:   e822                    sd      s0,16(sp)
   6:   e426                    sd      s1,8(sp)
   8:   1000                    addi    s0,sp,32

000000000000000a <.Lpcrel_hi14724>:
   a:   00000597                auipc   a1,0x0
                        a: R_RISCV_PCREL_HI20   top_cpuset
   e:   00058593                mv      a1,a1
                        e: R_RISCV_PCREL_LO12_I .Lpcrel_hi14724
  12:   c935                    beqz    a0,86 <.LBB4160_4>

0000000000000014 <.Lpcrel_hi14725>:
...

0000000000000086 <.LBB4160_4>:
  86:   852e                    mv      a0,a1
  88:   60e2                    ld      ra,24(sp)
  8a:   6442                    ld      s0,16(sp)
  8c:   64a2                    ld      s1,8(sp)
  8e:   6105                    addi    sp,sp,32
  90:   8082                    ret

Note that there is no relocation entry (like llvm/llvm-project#59350) for the beqz, which makes the target of the beqz wrong after linker relaxation.

And disassembling cpuset_css_alloc() in vmlinux shows:

ffffffff800e84fc <cpuset_css_alloc>:
ffffffff800e84fc:       1101                    addi    sp,sp,-32
ffffffff800e84fe:       ec06                    sd      ra,24(sp)
ffffffff800e8500:       e822                    sd      s0,16(sp)
ffffffff800e8502:       e426                    sd      s1,8(sp)
ffffffff800e8504:       1000                    addi    s0,sp,32

ffffffff800e8506 <.Lpcrel_hi14724>:
ffffffff800e8506:       013a3597                auipc   a1,0x13a3
ffffffff800e850a:       ae258593                addi    a1,a1,-1310 # ffffffff8148afe8 <top_cpuset>
ffffffff800e850e:       c935                    beqz    a0,ffffffff800e8582 <.LBB4160_4+0x2>

ffffffff800e8510 <.Lpcrel_hi14725>:
...

ffffffff800e8580 <.LBB4160_4>:
ffffffff800e8580:       852e                    mv      a0,a1
ffffffff800e8582:       60e2                    ld      ra,24(sp)
ffffffff800e8584:       6442                    ld      s0,16(sp)
ffffffff800e8586:       64a2                    ld      s1,8(sp)
ffffffff800e8588:       6105                    addi    sp,sp,32
ffffffff800e858a:       8082                    ret

Note that the target of the beqz is wrong and mv a0,a1 is skipped, which is exactly the same issue as llvm/llvm-project#65090.

According to @MaskRay 's comment in llvm/llvm-project#59350, I tried the following diff:

diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 6ec6d52a4180..c8efe12b9238 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -55,6 +55,10 @@ endif
 endif
 endif
 
+ifdef CONFIG_LTO_CLANG
+       KBUILD_LDFLAGS += -mllvm -mattr=+c -mllvm -mattr=+relax
+endif
+

(something like llvm/llvm-project#50505 ?)

And the kernel can boot now.

@nickdesaulniers
Copy link
Member

Oh! Nice work!

(something like llvm/llvm-project#50505 ?)

Exactly!

  •   KBUILD_LDFLAGS += -mllvm -mattr=+c -mllvm -mattr=+relax
    

I thought llvm/llvm-project#50505 addressed -mattr=+c? Perhaps only relax is necessary?

I'm going to reopen llvm/llvm-project#50505. The above diff should be folded into the LTO series with a comment that it's working around that bug in LLVM.

@nathanchance
Copy link
Member Author

@twd2 Would you like to send v4 with that diff added or would you like me to? I do not mind either way.

@twd2
Copy link
Member

twd2 commented Oct 17, 2023

Oh! Nice work!

(something like llvm/llvm-project#50505 ?)

Exactly!

  •   KBUILD_LDFLAGS += -mllvm -mattr=+c -mllvm -mattr=+relax
    

I thought llvm/llvm-project#50505 addressed -mattr=+c? Perhaps only relax is necessary?

I checked the bitcode and IR and found that +c and +relax were indeed encoded in the IR's target-features:

attributes #2 = { ... "target-features"="+64bit,+a,+c,+m,+relax,...

But ld.lld somehow ignored these attributes when doing LTO. (?)

I thought the diff I referenced in llvm/llvm-project#50505 (llvm/llvm-project@e63455d) only solved the issue when writing nops. And, if we don't pass +c, e_flags of the ELF header will not contain EF_RISCV_RVC (MaskRay said in llvm/llvm-project#59350).

I'm going to reopen llvm/llvm-project#50505. The above diff should be folded into the LTO series with a comment that it's working around that bug in LLVM.

So, can you please change the title of llvm/llvm-project#50505 to something like "LLD ignores target-features when doing LTO"? I think this is the root cause of 50505 and this issue.

Thanks.

@twd2
Copy link
Member

twd2 commented Oct 17, 2023

@twd2 Would you like to send v4 with that diff added or would you like me to? I do not mind either way.

Please do that, thanks!

I thought there needed some if-conditions to check whether we should add +c or +relax, but I have not come up with a good solution.

@nathanchance
Copy link
Member Author

v4 with this issue worked around: https://lore.kernel.org/[email protected]/

@nathanchance nathanchance added [BUG] llvm A bug that should be fixed in upstream LLVM [WORKAROUND] Applied This bug has an applied workaround and removed [BUG] Untriaged Something isn't working labels Oct 17, 2023
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this issue Oct 17, 2023
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
@nickdesaulniers
Copy link
Member

@hiraditya has been tracking this kind of issue with LTO and riscv for Android. He pointed me to the discussion in:

https://discourse.llvm.org/t/myterious-soft-float-output-in-lto-cache/70753

prabhakarlad pushed a commit to prabhakarlad/linux that referenced this issue Dec 11, 2023
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
knagaitsev pushed a commit to knagaitsev/linux that referenced this issue Oct 4, 2024
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
knagaitsev pushed a commit to knagaitsev/linux that referenced this issue Oct 11, 2024
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
jmontleon pushed a commit to jmontleon/linux-bianbu that referenced this issue Feb 1, 2025
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux/linux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Change-Id: Ib5a2cfe304cf10594e3f8cfec8d44794fd433882
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
jmontleon pushed a commit to jmontleon/linux-bianbu that referenced this issue Feb 8, 2025
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux/linux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Change-Id: Ib5a2cfe304cf10594e3f8cfec8d44794fd433882
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
jmontleon pushed a commit to jmontleon/linux-bianbu that referenced this issue Feb 11, 2025
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux/linux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Change-Id: Ib5a2cfe304cf10594e3f8cfec8d44794fd433882
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
jmontleon pushed a commit to jmontleon/linux-bianbu that referenced this issue Feb 17, 2025
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux/linux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Change-Id: Ib5a2cfe304cf10594e3f8cfec8d44794fd433882
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
jmontleon pushed a commit to jmontleon/linux-bianbu that referenced this issue Feb 22, 2025
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux/linux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Change-Id: Ib5a2cfe304cf10594e3f8cfec8d44794fd433882
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
jmontleon pushed a commit to jmontleon/linux-bianbu that referenced this issue Mar 8, 2025
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux/linux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Change-Id: Ib5a2cfe304cf10594e3f8cfec8d44794fd433882
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
jmontleon pushed a commit to jmontleon/linux-bianbu that referenced this issue Mar 9, 2025
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux/linux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Change-Id: Ib5a2cfe304cf10594e3f8cfec8d44794fd433882
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
jmontleon pushed a commit to jmontleon/linux-bianbu that referenced this issue Mar 14, 2025
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux/linux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Change-Id: Ib5a2cfe304cf10594e3f8cfec8d44794fd433882
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
jmontleon pushed a commit to jmontleon/linux-bianbu that referenced this issue Mar 22, 2025
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux/linux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Change-Id: Ib5a2cfe304cf10594e3f8cfec8d44794fd433882
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
jmontleon pushed a commit to jmontleon/linux-bianbu that referenced this issue Mar 29, 2025
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux/linux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Change-Id: Ib5a2cfe304cf10594e3f8cfec8d44794fd433882
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
carbonfix pushed a commit to milkv-jupiter/linux-6.6 that referenced this issue Apr 3, 2025
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux/linux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Change-Id: Ib5a2cfe304cf10594e3f8cfec8d44794fd433882
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
jmontleon pushed a commit to jmontleon/linux-bianbu that referenced this issue Apr 7, 2025
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux/linux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Change-Id: Ib5a2cfe304cf10594e3f8cfec8d44794fd433882
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
jmontleon pushed a commit to jmontleon/linux-bianbu that referenced this issue Apr 10, 2025
Allow LTO to be selected for RISC-V, only when LLD >= 14, since there is
an issue [1] in prior LLD versions that prevents LLD to generate proper
machine code for RISC-V when writing `nop`s.

To avoid boot failures in QEMU [2], '-mattr=+c' and '-mattr=+relax'
need to be passed via '-mllvm' to ld.lld, as there appears to be an
issue with LLVM's target-features and LTO [3], which can result in
incorrect relocations to branch targets [4]. Once this is fixed in LLVM,
it can be made conditional on affected ld.lld versions.

Disable LTO for arch/riscv/kernel/pi, as llvm-objcopy expects an ELF
object file when manipulating the files in that subfolder, rather than
LLVM bitcode.

[1] llvm/llvm-project#50505, resolved by LLVM
    commit e63455d5e0e5 ("[MC] Use local MCSubtargetInfo in writeNops")
[2] ClangBuiltLinux/linux#1942
[3] llvm/llvm-project#59350
[4] llvm/llvm-project#65090

Tested-by: Wende Tan <[email protected]>
Change-Id: Ib5a2cfe304cf10594e3f8cfec8d44794fd433882
Signed-off-by: Wende Tan <[email protected]>
Co-developed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[ARCH] risc-v This bug impacts ARCH=riscv [BUG] llvm A bug that should be fixed in upstream LLVM [FEATURE] LTO Related to building the kernel with LLVM Link Time Optimization [TOOL] lld The issue is relevant to LLD linker [WORKAROUND] Applied This bug has an applied workaround
Projects
None yet
Development

No branches or pull requests

3 participants