ERROR: modpost: "__mulodi4" [drivers/block/nbd.ko] undefined! #1438
there's an upstream bug for this (@stephenhines and @kraj are on it): https://llvm.org/pr28629
The bad news is that this is probably still very low priority to get fixed. For Android, we just started out by including sub-components of compiler-rt directly with libgcc, and later switched completely to Clang's own builtin library with no libgcc.
My attempts to reproduce a long long multiplication come out lowered from clang on all Arm 32/64 soft/hard I could come up with, with the only exception being
But when I try to link, it just works. clang is calling
I think the "low priority" of it is because of the fuzzy nature of the original reports, which had
With a small enough reproducer, it should be much clearer to find where the problem is and perhaps it's even easy to implement. I'm guessing the pattern is not enabled in Thumb and ends up going to the runtime. It could be a simple case of adding a custom lowering for Thumb...
I'll see if I can get a smaller reproducer tomorrow then. NOTE: This is not just a 32-bit ARM error, I do also see it on 32-bit MIPS.
We can probably creduce the kernel easily.
Thanks! Please attach the reproducer to the original bug, so other people can see it, and we update this if/once we fix it.
This sounds like people just didn't get to it because no one cared enough about 32-bits by the time the change was introduced...
Reproducer posted: https://llvm.org/pr28629
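For readers who cannot reach the bug tracker, the kind of code under discussion can be sketched as below. This is not the reproducer attached to the upstream bug, only an assumed minimal shape: a signed 64-bit `__builtin_mul_overflow`, which as far as I understand is what `check_mul_overflow((loff_t)arg, ...)` ends up using when the compiler builtins are available. The names `mul_ll_overflows` and `mul_ll_overflows_selftest` are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: signed 64-bit multiply with overflow detection.
 * Per the thread, affected clang versions may lower this to a call to
 * __mulodi4 on 32-bit targets such as ARM and MIPS instead of inline code.
 */
bool mul_ll_overflows(long long a, long long b, long long *res)
{
    return __builtin_mul_overflow(a, b, res);
}

/* Small self-check of the expected semantics on a hosted build. */
void mul_ll_overflows_selftest(void)
{
    long long r;

    assert(!mul_ll_overflows(6, 7, &r) && r == 42);
    assert(mul_ll_overflows(LLONG_MAX, 2, &r));
}
```

Compiling something like this for a 32-bit target and looking for a `__mulodi4` reference in the generated assembly is one way to check a given toolchain; the exact flags depend on the setup.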
If we can fix clang for ARM and MIPS at least, then we might be able to ship in the kernel:
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 5170a630778d..2271a50ba4c0 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1388,6 +1388,12 @@ static void nbd_set_cmd_timeout(struct nbd_device *nbd, u64 timeout)
blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ);
}
+// FIXME: https://llvm.org/pr28629
+// https://github.com/ClangBuiltLinux/linux/issues/1438
+#if defined(__clang__) && (defined(__arm__) || defined(__mips__))
+#define MULODI4_BROKEN
+#endif
+
/* Must be called with config_lock held */
static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
unsigned int cmd, unsigned long arg)
@@ -1408,8 +1414,10 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
case NBD_SET_SIZE:
return nbd_set_size(nbd, arg, config->blksize);
case NBD_SET_SIZE_BLOCKS:
+#ifndef MULODI4_BROKEN
if (check_mul_overflow((loff_t)arg, config->blksize, &bytesize))
return -EINVAL;
+#endif
return nbd_set_size(nbd, bytesize, config->blksize);
case NBD_SET_TIMEOUT:
nbd_set_cmd_timeout(nbd, arg);
but with an added version check. Err, that would use
Shouldn't we be working around this issue generically, rather than in that specific call site? Initial idea would be breaking up the
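The comment above is cut off, so what follows is only one possible reading of "breaking up": a sketch of a signed 64-bit overflow check built from 32-bit halves, using multiplications, shifts and compares only (no division, no overflow builtin in the checked path). The helper names `mul_overflow_u64` and `mul_overflow_s64` are hypothetical and do not come from the kernel or from LLVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Unsigned 64x64 multiply; returns true if the product needs more than 64 bits. */
static bool mul_overflow_u64(uint64_t a, uint64_t b, uint64_t *res)
{
    uint32_t ah = (uint32_t)(a >> 32), al = (uint32_t)a;
    uint32_t bh = (uint32_t)(b >> 32), bl = (uint32_t)b;
    uint64_t cross, low;

    *res = a * b;                 /* low 64 bits; unsigned wrap is well defined */
    if (ah && bh)
        return true;              /* the ah*bh*2^64 term is non-zero */
    cross = (uint64_t)ah * bl + (uint64_t)al * bh;  /* at most one term is non-zero */
    if (cross >> 32)
        return true;              /* cross*2^32 does not fit in 64 bits */
    low = (uint64_t)al * bl;
    return (cross << 32) + low < low;   /* carry out of bit 63 */
}

/* Signed variant: multiply the magnitudes, then check against the signed range. */
bool mul_overflow_s64(int64_t a, int64_t b, int64_t *res)
{
    bool neg = (a < 0) != (b < 0);
    uint64_t ua = a < 0 ? 0 - (uint64_t)a : (uint64_t)a;
    uint64_t ub = b < 0 ? 0 - (uint64_t)b : (uint64_t)b;
    uint64_t limit = neg ? (uint64_t)INT64_MAX + 1 : (uint64_t)INT64_MAX;
    uint64_t up;
    bool ovf = mul_overflow_u64(ua, ub, &up);

    /* Assumes two's complement conversion, as GCC and Clang implement it. */
    *res = (int64_t)(neg ? 0 - up : up);
    return ovf || up > limit;
}

/* Cross-check against the builtin on a host where it is known to work. */
void mul_overflow_s64_selftest(void)
{
    static const int64_t v[] = {
        0, 1, -1, 3, -5, INT64_MAX, INT64_MIN,
        (int64_t)1 << 31, (int64_t)1 << 32, -((int64_t)1 << 32),
    };
    const int n = (int)(sizeof(v) / sizeof(v[0]));

    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            int64_t got, want;
            bool o1 = mul_overflow_s64(v[i], v[j], &got);
            bool o2 = __builtin_mul_overflow(v[i], v[j], &want);

            assert(o1 == o2 && got == want);
        }
    }
}
```

Whether the plain 64-bit `a * b` is itself expanded inline depends on the target, so this only shows the shape of a generic fallback, not a drop-in kernel patch.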
Quick update: This problem was mostly fixed earlier this month (by @topperc) for x86 and @nickdesaulniers is working to get that propagated to Arm and Mips. It will only fix from now on (clang 14+), leaving previous clangs with the same problem. We may be able to get this in clang 13 if we can backport the current RC3. No promises, though, as it's very late in the release process. But if it works, we would get 13 with the fix in a few weeks. That still leaves all previous clangs (12 and older) unfixed (we don't backport beyond six months). How the kernel is going to fix for those older versions is up to you guys.
__has_builtin(__builtin_mul_overflow) returns true for 32b ARM targets, but Clang is deferring to compiler RT when encountering `long long` types. This breaks sanitizer builds of the Linux kernel that are using __builtin_mul_overflow with these types for these targets.
If the semantics of __has_builtin mean "the compiler resolves these, always" then we shouldn't conditionally emit a libcall.
This will still need to be worked around in the Linux kernel in order to continue to support allmodconfig builds of the Linux kernel for this target with older releases of clang.
Link: https://bugs.llvm.org/show_bug.cgi?id=28629
Link: ClangBuiltLinux/linux#1438
Reviewed By: rengolin
Differential Revision: https://reviews.llvm.org/D108842
__has_builtin(__builtin_mul_overflow) returns true for 32b MIPS targets, but Clang is deferring to compiler RT when encountering `long long` types. This breaks sanitizer builds of the Linux kernel that are using __builtin_mul_overflow with these types for these targets.
If the semantics of __has_builtin mean "the compiler resolves these, always" then we shouldn't conditionally emit a libcall.
This will still need to be worked around in the Linux kernel in order to continue to support malta_defconfig builds of the Linux kernel for this target with older releases of clang.
Link: https://bugs.llvm.org/show_bug.cgi?id=28629
Link: ClangBuiltLinux/linux#1438
Reviewed By: rengolin
Differential Revision: https://reviews.llvm.org/D108844
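The point in these commit messages, that `__has_builtin` can say "yes" while the backend still emits a libcall, can be sketched as a compile-time gate. This is an assumption-laden illustration, not the kernel's actual workaround: the macro names and `checked_mul_ll` are invented here, the clang 14 threshold is taken from the "clang 14+" remark earlier in the thread, and only ARM and MIPS are listed (the `__mips__` test is conservative because it also matches 64-bit MIPS).

```c
#include <assert.h>

/* Compilers without __has_builtin get a conservative "no". */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Assumed gate: older clang on 32-bit ARM or on MIPS may emit __mulodi4. */
#if defined(__clang__) && __clang_major__ < 14 && \
    (defined(__arm__) || defined(__mips__))
#define MUL_OVERFLOW_MAY_LIBCALL 1
#else
#define MUL_OVERFLOW_MAY_LIBCALL 0
#endif

#if __has_builtin(__builtin_mul_overflow) && !MUL_OVERFLOW_MAY_LIBCALL
#define MUL_OVERFLOW_BUILTIN_USABLE 1
#else
#define MUL_OVERFLOW_BUILTIN_USABLE 0
#endif

/*
 * Hypothetical wrapper. Returns 0 on success, 1 on overflow, and -1 when
 * the builtin is not trusted on this toolchain, in which case the caller
 * needs a hand-written check instead.
 */
int checked_mul_ll(long long a, long long b, long long *res)
{
#if MUL_OVERFLOW_BUILTIN_USABLE
    return __builtin_mul_overflow(a, b, res) ? 1 : 0;
#else
    (void)a;
    (void)b;
    (void)res;
    return -1;
#endif
}

/* Self-check that holds whichever branch was compiled in. */
void checked_mul_ll_selftest(void)
{
    long long r = 0;
    int rc = checked_mul_ll(6, 7, &r);

    assert(rc == -1 || (rc == 0 && r == 42));
}
```

The feature test alone would pick the builtin on every target mentioned in this issue; the extra version and architecture condition is what keeps the libcall out of builds with older compilers.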
Some toolchain patches have gone in: MIPS is now hitting
Is 32-bit X86 still broken for __mulodi4?
It seems so. In your original patch you only add it for 128-bits, not 64-bits. However, further up the X86Lowering code, But when I compile the original code with
Isn't it in this for loop that has a continue for i64 on 32-bit targets?
aha! Yes, I missed that. Been looking at this from different computers and mobiles, lacking context on all of them. Not sure the kernel cares about i386 anymore but with your legalisation fix, it seems the same thing would work just like Arm and Mips.
Are we heading in the direction of removing MULO lib calls completely from llvm? My initial patch was just to fix the cases that wouldn’t link with either libgcc or compiler-rt.
Maybe? I'm not advocating for it, I'm just trying to fix an issue that has existed for over 5 years and still doesn't have a nice solution.
Linking compiler-rt isn't mandatory and it isn't used on many systems. Linking with compiler-rt and libgcc won't work, so if there is functionality that is only available in compiler-rt, we can't know if the linker will find it or not, so lowering calls to
The current legalisation is un-optimised, but Compiler-RT's implementation is plain-C for all platforms, so not the most optimised either. If we can lower the builtin with the compiler and avoid the linking errors on anything that doesn't use compiler-rt, then I think it's a good compromise.
Even if libgcc implements it, the kernel and embedded software won't have it, so are bound to have the same problem. That's why they use
I don't think eliding the calls as a target option is the best option, but right now it's the only option. I don't really know how Clang can emit code that it knows compiler-rt implements when it doesn't know if it will be linked against it.
There was a discussion many years ago to split builtins from compiler-rt and make it so clang always linked against that. It would obviously clash with symbols from libgcc and other RT libs, but we could mark them weak or some other linkage magic that escapes me now. Perhaps we need some more thorough discussion on the list...
For compatibility reasons I think it's best to remove it from clang.
Yes, this is broken for 32b x86 kernels as well (we just don't have CI coverage for ARCH=i386 defconfig + CONFIG_BLK_DEV_NBD=y). If I enable
mips patch 2/2: https://reviews.llvm.org/D108926 For future travelers, llvm/include/llvm/IR/RuntimeLibcalls.def has the relevant mapping for these functions.
* [X86] Remove isel predicates for xgetbv/xsetbv instructions so they can work on Windows. https://reviews.llvm.org/D56686 was supposed to allow these to work on Windows without needing to enable the xsave feature to match MSVC. It seems this didn't work because the backend isel patterns would still block it. This patch removes the predicates from the isel patterns. Fixes PR51706. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D109097 * [libc++] Remove an unused internal concept. Removed as suggested by @Quuxplusone during the review of D109075. * [AIX][PowerPC] Define __powerpc and __PPC macros %%% This patch defines the macros __powerpc and __PPC on AIX to be consistent with XL for AIX. See: https://www.ibm.com/docs/en/xl-c-and-cpp-aix/13.1.0?topic=macros-related-platform Note: GCC does not currently define __powerpc and __PPC so users should prefer the __powerpc__ and __PPC__ forms. %%% Reviewed By: cebowleratibm Differential Revision: https://reviews.llvm.org/D108917 * [Bazel] Add explicit dependency on llvm:Support to reflect layering Differential Revision: https://reviews.llvm.org/D109173 * [InlineCost] Introduce attributes to override InlineCost for inliner testing This patch introduces four new string attributes: function-inline-cost, function-inline-threshold, call-inline-cost and call-threshold-bonus. These attributes allow you to selectively override some aspects of InlineCost analysis. That would allow us to test inliner separately from the InlineCost analysis. That could be useful when you're trying to write tests for inliner and you need to test some very specific situation, like "the inline cost has to be this high", or "the threshold has to be this low". Right now every time someone does that, they have get creative to come up with a way to make the InlineCost give them the number they need (like adding ~30 load/add pairs for a trivial test). 
This process can be somewhat tedious which can discourage some people from writing enough tests for their changes. Also, that results in tests that are fragile and can be easily broken without anyone noticing it because the test writer can't explicitly control what input the inliner will get from the inline cost analysis. These new attributes will alleviate those problems to an extent. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D109033 * [MipsISelLowering] avoid emitting libcalls to __multi3 Similar to D108842 and D108844. __has_builtin(builtin_mul_overflow) returns true for 32b MIPS targets, but Clang is deferring to compiler RT when encountering long long types. This breaks MIPS malta_defconfig builds of the Linux kernel that are using __builtin_mul_overflow with these types for these targets. If the semantics of __has_builtin mean "the compiler resolves these, always" then we shouldn't conditionally emit a libcall. This will still need to be worked around in the Linux kernel in order to continue to support malta_defconfig builds of the Linux kernel for this target with older releases of clang. Link: https://bugs.llvm.org/show_bug.cgi?id=28629 Link: https://github.com/ClangBuiltLinux/linux/issues/1438 Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D108926 * [WebAssembly] Add Wasm SjLj support This add support for SjLj using Wasm exception handling instructions: https://github.com/WebAssembly/exception-handling/blob/master/proposals/exception-handling/Exceptions.md This does not yet support the mixed use of EH and SjLj within a function. It will be added in a follow-up CL. 
This currently passes all SjLj Emscripten tests for wasm0/1/2/3/s, except for the below: - `test_longjmp_standalone`: Uses Node - `test_dlfcn_longjmp`: Uses NodeRAWFS - `test_longjmp_throw`: Mixes EH and SjLj - `test_exceptions_longjmp1`: Mixes EH and SjLj - `test_exceptions_longjmp2`: Mixes EH and SjLj - `test_exceptions_longjmp3`: Mixes EH and SjLj Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D108960 * [WebAssembly] Fix names of WebAssemblyWrapper SDNodes. NFC Other platforms all use CamelCase as normal for these wrapper nodes. Differential Revision: https://reviews.llvm.org/D109172 * [SCEVExpander] Simplify pointer overflow check This is a followup to D104662 to generate slightly nicer code for pointer overflow checks. Bypass expandAddToGEP and instead explicitly generate i8 GEPs. This saves some bitcasts and negates the value in a more obvious way. In particular, this prevents SCEV from looking through the umul.with.overflow, same as in the integer case. The wrapping-pointer-ni.ll test deserves a comment: Previously, this generated a typed GEP which used the umulo argument rather than the multiplication result. This results in more compact IR in that case, but effectively does the multiplication twice, the second one is just hidden in the GEP. Reusing the umulo result seems pretty reasonable to me. Differential Revision: https://reviews.llvm.org/D109093 * [CSSPGO] Allow inlining recursive call for preinliner When preinliner is used for CSSPGO, we try to honor global preinliner decision as much as we can except for uninlinable callees. We rely on InlineCost::Never to prevent us from illegal inlining. However, it turns out that we use InlineCost::Never for both illeagle inlining and some of the "not-so-beneficial" inlining. The most common one is recursive inlining, while it can bloat size a lot during CGSCC bottom-up inlining, it's less of a problem when recursive inlining is guided by profile and done in top-down manner. 
Ideally it'd be better to have a clear separation between inline legality check vs cost-benefit check, but that requires a bigger change. This change enables InlineCost computation to allow inlining recursive calls, controlled by InlineParams. In SampleLoader, we now enable recursive inlining for CSSPGO when global preinliner decision is used. With this change, we saw a few perf improvements on SPEC2017 with CSSPGO and preinliner on: 2% for povray_r, 6% for xalancbmk_s, 3% omnetpp_s, while size is about the same (no noticeable perf change for all other benchmarks) Differential Revision: https://reviews.llvm.org/D109104 * [test][NewPM] Remove RUN lines using -analyze Only tests in llvm/test/Analysis. -analyze is legacy PM-specific. This only touches files with `-passes`. I looked through everything and made sure that everything had a new PM equivalent. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D109040 * [test] Remove missed RUN line after D109040 * Try to unbreak Win build differently after 973519826edb76 Looks like the MS STL wants StringMapKeyIterator::operator*() to be const. Return the result by copy instead of reference to do that. Assigning to a hash map key iterator doesn't make sense anyways. Also reverts 123f811fe5b0b which is now hopefully no longer needed. Differential Revision: https://reviews.llvm.org/D109167 * Revert "Try to unbreak Win build differently after 973519826edb76" Breaks the build and failed pre-merge checks: https://buildkite.com/llvm-project/premerge-checks/builds/54930#07373971-3d37-49cf-9def-22c0d724ee23 > llvm-project/lld/wasm/Writer.cpp:521:16: error: non-const lvalue reference to > type 'llvm::StringRef' cannot bind to a temporary of type 'llvm::StringRef' > for (auto &feature : used.keys()) { This reverts commit 5881dcff7e76a68323edc8bb3c6e14420ad9cf7c. * Fix lld build after 5881dcff7e76a68 * [WebAssemlby] Remove redundant SDTypeProfile. 
NFC I added this back in https://reviews.llvm.org/D54647 but it wasn't actually needed. Differential Revision: https://reviews.llvm.org/D109176 * [test] Remove legacy PM tests in llvm/test/Other Differential Revision: https://reviews.llvm.org/D109180 * [llvm-profgen] Turn off cold context trimming by default We merge cold context by default to save profile size. However trimming cold context after merging doesn't save size much, so default to off to reflect how it's commonly used. Differential Revision: https://reviews.llvm.org/D109166 * [NFC] Remove some unclear attribute methods To any downstream users broken by this change, please examine your uses of these methods and see if you can use a better method. For example, getAttribute(AttributeList::FunctionIndex) => getFnAttr(), or addAttribute(AttributeList::FirstArgIndex + ArgNo) => addParamAttribute(ArgNo). 0 corresponds to ReturnIndex, ~0 corresponds to FunctionIndex. This may make future cleanups less painful. I've made the mistake of assuming that these indexes are for parameters multiple times, but actually they're based off of a weird indexing scheme AttributeList::AttrIndex where 0 is the return value and ~0 is the function. Hopefully renaming these methods will make this clearer. Ideally users should use more specific methods like AttributeList::getFnAttr(). This touches all relevant methods in AttributeList, CallBase, and Function. This hopefully will make easier a future change to cleanup AttrIndex. A previous worry about cleaning up AttrIndex was that too many downstream users would have to look through all uses of AttrIndex and relevant attribute method calls to see if anything was unintentionally hardcoded (e.g. using 0 instead of ReturnIndex). With this change hopefully downstream users will look at existing usages of these methods and clean them up. 
Reviewed By: rnk, MaskRay Differential Revision: https://reviews.llvm.org/D108614 * [Verifier] Only allow invariant.group metadata on stores and loads As specified by https://llvm.org/docs/LangRef.html#invariant-group-metadata. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D109182 * [MemorySSA] Properly handle liveOnEntry in the walker printer Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D109177 * Fix lldb after D108614 * [libc++] Define insert_iterator::iter with ranges::iterator_t. The `insert_iterator::iter` member is defined as `Container::iterator` but the standard requires `iter` to be defined in terms of `ranges::iterator_t` as of C++20. So, if in C++20 or later, define the `iter` member as `ranges::iterator_t`. Original patch by Joe Loser! Differential Revision: https://reviews.llvm.org/D108575 * [NFC] Added testcase for PR40750 * [mlir] speed up construction of LLVM IR constants when possible The translation to LLVM IR used to construct sequential constants by recurring down to individual elements, creating constant values for them, and wrapping them into aggregate constants in post-order. This is highly inefficient for large constants with known data such as DenseElementsAttr. Use LLVM's ConstantData for the innermost dimension instead. LLVM does seem to support data constants for nested sequential constants so the outer dimensions are still handled recursively. Nevertheless, this speeds up the translation of large constants with equal dimensions by up to 30x. Users are advised to rewrite large constants to use flat types before translating to LLVM IR if more efficiency in translation is necessary. This is not done automatically as the translation is not aware of the expectations of the overall compilation flow about type changes and indexing, in particular for global constants with external linkage. 
Reviewed By: silvas Differential Revision: https://reviews.llvm.org/D109152 * [OpenCL] Remove decls for scalar vloada_half and vstorea_half* fns These functions are not part of the OpenCL C specification. See https://github.com/KhronosGroup/OpenCL-Docs/issues/648 for a clarification regarding the vloada_half declarations. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D108761 * [flang] NFC: change non-nullable pointer arguments to references Ticking off a Parser TODO: Preprocessor::Directive()'s Prescanner argument should be a reference, not a pointer. Differential Revision: https://reviews.llvm.org/D109094 * [flang] Fix scope in which undeclared symbols are created Don't create new symbols in FORALL, implied DO, or other construct scopes when an undeclared name appears; use the innermost enclosing program unit's scope. This clears up a pending TODO in name resolution, and also exposes (& fixes) an unnoticed name resolution problem in a module file test. Differential Revision: https://reviews.llvm.org/D109095 * [NFC] Regenerate SVE ACLE intrinsics tests Change-Id: Ic4ec50f9a53fcf58e86104bf19ba229c1dd132d0 * [Sanitizers] intercept clock_getcpuclockid on FreeBSD, and pthread_getcpuclockid. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D108884 * Revert "[CSSPGO] Honor preinliner decision for ThinLTO importing" This reverts commit a2768b4732a0216dfd346d34e428685f03f10549. Breaks sanitizer-x86_64-linux-fast buildbot: https://lab.llvm.org/buildbot/#/builders/5/builds/11334 Log snippet: Testing: 0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 
80 FAIL: LLVM :: Transforms/SampleProfile/early-inline.ll (65549 of 78729) ******************** TEST 'LLVM :: Transforms/SampleProfile/early-inline.ll' FAILED ******************** Script: -- : 'RUN: at line 1'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/early-inline.ll -instcombine -sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/einline.prof -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/early-inline.ll -- Exit Code: 2 Command Output (stderr): -- /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53: runtime error: member call on null pointer of type 'llvm::sampleprof::FunctionSamples' #0 0x5a730f8 in shouldInlineCandidate /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53 #1 0x5a730f8 in (anonymous namespace)::SampleProfileLoader::tryInlineCandidate((anonymous namespace)::InlineCandidate&, llvm::SmallVector<llvm::CallBase*, 8u>*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1178:21 #2 0x5a6cda6 in inlineHotFunctions /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1105:13 #3 0x5a6cda6 in (anonymous namespace)::SampleProfileLoader::emitAnnotations(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1633:16 #4 0x5a5fcbe in runOnFunction /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2008:12 #5 0x5a5fcbe in (anonymous namespace)::SampleProfileLoader::runOnModule(llvm::Module&, llvm::AnalysisManager<llvm::Module>*, llvm::ProfileSummaryInfo*, llvm::CallGraph*) 
/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1922:15 #6 0x5a5de55 in llvm::SampleProfileLoaderPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2038:21 #7 0x6552a01 in llvm::detail::PassModel<llvm::Module, llvm::SampleProfileLoaderPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:88:17 #8 0x57f807c in llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/PassManager.h:526:21 #9 0x37c8522 in llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::StringRef>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/opt/NewPMDriver.cpp:489:7 #10 0x37e7c11 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/opt/opt.cpp:830:12 #11 0x7fbf4de4009a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) #12 0x379e519 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt+0x379e519) SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53 in FileCheck error: '<stdin>' is empty. FileCheck command line: /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/early-inline.ll -- ******************** Testing: 0.. 10.. 20.. 30.. 40.. 50.. 60.. 
70.. 80 FAIL: LLVM :: Transforms/SampleProfile/inline-cold.ll (65643 of 78729) ******************** TEST 'LLVM :: Transforms/SampleProfile/inline-cold.ll' FAILED ******************** Script: -- : 'RUN: at line 4'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=NOTINLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll : 'RUN: at line 5'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -passes=sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=NOTINLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll : 'RUN: at line 8'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -sample-profile-inline-size -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=INLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll : 'RUN: at line 11'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll 
-passes=sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -sample-profile-inline-size -sample-profile-cold-inline-threshold=9999999 -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=INLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll : 'RUN: at line 14'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -passes=sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -sample-profile-inline-size -sample-profile-cold-inline-threshold=-500 -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=NOTINLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -- Exit Code: 2 Command Output (stderr): -- /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53: runtime error: member call on null pointer of type 'llvm::sampleprof::FunctionSamples' #0 0x5a730f8 in shouldInlineCandidate /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53 #1 0x5a730f8 in (anonymous namespace)::SampleProfileLoader::tryInlineCandidate((anonymous namespace)::InlineCandidate&, llvm::SmallVector<llvm::CallBase*, 8u>*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1178:21 #2 0x5a6cda6 in inlineHotFunctions /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1105:13 #3 0x5a6cda6 in (anonymous namespace)::SampleProfileLoader::emitAnnotations(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1633:16 #4 0x5a5fcbe 
in runOnFunction /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2008:12 #5 0x5a5fcbe in (anonymous namespace)::SampleProfileLoader::runOnModule(llvm::Module&, llvm::AnalysisManager<llvm::Module>*, llvm::ProfileSummaryInfo*, llvm::CallGraph*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1922:15 #6 0x5a5de55 in llvm::SampleProfileLoaderPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2038:21 #7 0x6552a01 in llvm::detail::PassModel<llvm::Module, llvm::SampleProfileLoaderPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:88:17 #8 0x57f807c in llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/PassManager.h:526:21 #9 0x37c8522 in llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::StringRef>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/opt/NewPMDriver.cpp:489:7 #10 0x37e7c11 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/opt/opt.cpp:830:12 #11 0x7fcd534a209a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) #12 0x379e519 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt+0x379e519) SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53 in FileCheck 
error: '<stdin>' is empty. FileCheck command line: /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=INLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -- ******************** Testing: 0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. ******************** Failed Tests (2): LLVM :: Transforms/SampleProfile/early-inline.ll LLVM :: Transforms/SampleProfile/inline-cold.ll * [asan] Fixed link error by setting jump symbol to R_X86_64_PLT32. Fixing this link error: ld: error: relocation R_X86_64_PC32 cannot be used against symbol __asan_report_load...; recompile with -fPIC Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D109183 * Fully qualify template template parameters when printing I discovered this quirk when working on some DWARF - AST printing prints type template parameters fully qualified, but printed template template parameters the way they were written syntactically, or wholely unqualified - instead, we should print them consistently with the way we print type template parameters: fully qualified. The one place this got weird was for partial specializations like in ast-print-temp-class.cpp - hence the need for checking for TemplateNameDependenceScope::DependentInstantiation template template parameters. (not 100% sure that's the right solution to that, though - open to ideas) Differential Revision: https://reviews.llvm.org/D108794 * [GlobalISel] Combine icmp eq/ne x, 0/1 -> x when x == 0 or 1 This adds the following combines: ``` x = ... 0 or 1 c = icmp eq x, 1 -> c = x ``` and ``` x = ... 0 or 1 c = icmp ne x, 0 -> c = x ``` When the target's true value for the relevant types is 1. This showed up in the following situation: https://godbolt.org/z/M5jKexWTW SDAG currently supports the `ne` case, but not the `eq` case. This can probably be further generalized, but I don't feel like thinking that hard right now. 
This gives some minor code size improvements across the board on CTMark at -Os for AArch64. (0.1% for 7zip and pairlocalalign in particular.)
Differential Revision: https://reviews.llvm.org/D109130
* [ORC] Move callWrapper and callSPSWrapper functions to ExecutorProcessControl.
The ExecutionSession versions now just forward to the implementations in ExecutorProcessControl. This allows callWrapper / callSPSWrapper to be used while bootstrapping an ExecutorProcessControl instance.
* [ORC] Add specialized SPSSerializationTraits for ArrayRef<char>.
Deserializing from an SPSSequence<char> to an ArrayRef<char> will point the ArrayRef<char> at the input buffer.
* [ORC] Add EPCGenericJITLinkMemoryManager: memory management via EPC calls.
All ExecutorProcessControl subclasses must provide a JITLinkMemoryManager object that can be used to allocate memory in the executor process. The EPCGenericJITLinkMemoryManager class provides an off-the-shelf JITLinkMemoryManager implementation for JITs that do not need (or cannot provide) a specialized JITLinkMemoryManager implementation. This simplifies the process of creating new ExecutorProcessControl implementations.
* [gn build] Port dad60f8071d5
* [ORC] Range check and narrow size value.
This should fix the build issues in https://lab.llvm.org/buildbot#builders/171/builds/3149.
* [Sanitizers] remove empty test case.
* Reland "Try to unbreak Win build differently after 973519826edb76""
Build should be fixed by https://github.com/llvm/llvm-project/commit/9d22754389
This reverts commit df052e1732ab57f5d9c684ceeaed3ab39073cd9f.
Differential Revision: https://reviews.llvm.org/D109181
* [openmp] NFC add bitcode comment
* [runtimeunroll] Under EXPENSIVE_CHECKS, validate loop info
Requested in review comment on D108476
* [runtimeunroll] Support epilogue unrolling with a parent loop
This patch adds support for unrolling inner loops using epilogue unrolling.
The basic issue is that the original latch exit block of the inner loop could be outside the outer loop. When we clone the inner loop and split the latch exit, the cloned blocks need to be in the outer loop.
Differential Revision: https://reviews.llvm.org/D108476
* [WebAssembly] Rename WrapperPIC -> WrapperREL. NFC
This ISD node/wrapper represents an address which is relative to a base address and therefore lowers to `i32.const` rather than `global.get`.
Use this wrapper type for TLS-relative addresses, paving the way for the non-REL wrapper to be used for external TLS addresses once those are supported.
Differential Revision: https://reviews.llvm.org/D109179
* [AMDGPU] Fold immediates in the optimizeCompareInstr
Peephole works before the first SIFoldOperands so most of the immediates are in registers.
Differential Revision: https://reviews.llvm.org/D109186
* [CSSPGO] Honor preinliner decision for ThinLTO importing
When pre-inliner decision is used for CSSPGO, we should take that into account for ThinLTO importing as well, so post-link sample loader inliner can favor that decision. This is handled by a small tweak in this patch. It also includes a change to transfer preinliner decision when merging context.
Differential Revision: https://reviews.llvm.org/D109088
* [Coroutines] Only run verifyFunction in debug mode
verifyFunction can be really slow on large functions. This can significantly slow down compilation in production. Given that coroutine passes are fairly stable now, we should only run it in debug mode.
Differential Revision: https://reviews.llvm.org/D109198
* [AMDGPU] Process any power of 2 in optimizeCompareInstr
Differential Revision: https://reviews.llvm.org/D109201
* [mlir][python] Simplify python extension loading.
  * Now that packaging has stabilized, removes old mechanisms for loading extensions, preferring direct importing.
  * Removes _cext_loader.py, _dlloader.py as unnecessary.
  * Fixes the path where the CAPI dll is written on Windows.
This enables that path of least resistance loading behavior to work with no further drama (see: https://bugs.python.org/issue36085).
  * With this patch, `ninja check-mlir` on Windows with Python bindings works for me, modulo some failures that are actually due to a couple of pre-existing Windows bugs. I think this is the first time the Windows Python bindings have worked upstream.
  * Downstream changes needed:
    * If downstreams are using the now removed `load_extension`, `reexport_cext`, etc, then those should be replaced with normal import statements as done in this patch.
Reviewed By: jdd, aartbik
Differential Revision: https://reviews.llvm.org/D108489
* [mlir][scf] Allow runtime type of iter_args to change
The limitation on iter_args introduced with D108806 is too restrictive. Changes of the runtime type should be allowed. Extends the dim op canonicalization with a simple analysis to determine when it is safe to canonicalize.
Differential Revision: https://reviews.llvm.org/D109125
* Fix typo in RISCVMatInt.cpp comments
* [LoopPredication] Fix MemorySSA crash in predicateLoopExits
The attached testcase crashes without the patch (Not the same accesses in the same order).
When we move instructions before another instruction, we also need to update the memory accesses corresponding to it.
Reviewed-By: asbirlea
Differential Revision: https://reviews.llvm.org/D109197
* Revert "[NFC] Regenerate SVE ACLE intrinsics tests"
This reverts commit 8749a556da96fb17df1a2e36b860527e557c8c7b.
* [NFC] Recommit "Regenerate SVE ACLE intrinsics tests"
Change-Id: Ida45fc41231cd71709048f2d37f228f14053514e
* [OMPIRBuilder] Add ordered directive to OMPBuilder
Add support for ordered directive in the OpenMPIRBuilder.
This patch also modifies clang to use the ordered directive when the option -fopenmp-enable-irbuilder is enabled.
Also fix one ICE when parsing a canonical for loop with the relational operator LE or GE in an OpenMP region, by replacing the unary increment of the expression "Expr A" minus "Expr B" (++(Expr A - Expr B)) with a binary addition of that expression and the constant "1" (Expr A - Expr B + "1").
Reviewed By: Meinersbur, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D107430
* [RISCV] Add SiFive core S51
Add SiFive core s51 as rv64imac RocketModel
Reviewed-By: MaskRay, evandro
Differential Revision: https://reviews.llvm.org/D108886
* [Coroutines] [Clang] Look up coroutine component in std namespace first
Summary: Now in libcxx and clang, all the coroutine components are defined in the std::experimental namespace.
And now the coroutine TS is merged into C++20. So in a working draft like N4892, the coroutine components are defined in the std namespace instead of the std::experimental namespace.
And the coroutine support in clang seems to be relatively stable. So I think it may be suitable to move the coroutine components into the std namespace now.
But moving the coroutine components into the std namespace may be a breaking change. So I planned to split this change into two patches: one in clang and the other in libcxx.
This patch makes clang look up coroutine_traits in the std namespace first. For compatibility, clang looks in the std::experimental namespace if it can't find definitions in the std namespace, and emits a warning in this case. So existing code won't break after a compiler update.
Test Plan: check-clang, check-libcxx
Reviewed By: lxfind
Differential Revision: https://reviews.llvm.org/D108696
* AMDGPU: Remove FeatureLocalMemorySize0
There's no reason to make this an explicit feature, since it's implied by the lack of a feature with a size.
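The lookup order described in the [Coroutines] [Clang] entry above (try `std` first, fall back to `std::experimental` and warn) can be illustrated with a small standalone model. This is only a sketch: `SymbolTable`, `LookupResult` and `lookupCoroutineComponent` are hypothetical names invented for the illustration and are not Clang APIs.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical model of declared names: namespace name -> names declared in it.
using SymbolTable = std::map<std::string, std::set<std::string>>;

struct LookupResult {
  std::string FoundIn; // namespace the name was found in; empty if not found
  bool Warn;           // true when the std::experimental fallback was used
};

// Try `std` first, then fall back to `std::experimental`.
// The fallback is the case in which a warning would be emitted.
inline LookupResult lookupCoroutineComponent(const SymbolTable &Syms,
                                             const std::string &Name) {
  for (const char *NS : {"std", "std::experimental"}) {
    auto It = Syms.find(NS);
    if (It != Syms.end() && It->second.count(Name) != 0)
      return {NS, std::string(NS) != "std"};
  }
  return {std::string(), false};
}
```

With only `std::experimental::coroutine_traits` declared, the model reports the fallback namespace and sets `Warn`; once `std::coroutine_traits` exists as well, `std` wins and no warning is requested.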
* Revert "[HardwareLoops] Change order of SCEV expression construction for InitLoopCount."
This causes https://bugs.llvm.org/show_bug.cgi?id=51714 and is not the right patch according to comments in D91724
This reverts commit 42eaf4fe0adef3344adfd9fbccd49f325cb549ef.
* [PowerPC] Enable fast-isel on AIX 64 subtarget
This patch basically enables fast-isel for AIX 64-bit subtarget (previously enabled only for ELF 64). The initial motivation is to introduce branch folding to AIX generated code for correct debug behavior. I also saw some compile time improvement in a few LLVM test-suite benchmarks. (toast, dbms, cjpeg, burg, etc.)
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D98844
* [AArch64][GlobalISel] Support for folding G_ROTR as shifted operands.
This allows selection like: eor w0, w1, w2, ror #8
Saves 500 bytes on ClamAV -Os, which is 0.1%.
Differential Revision: https://reviews.llvm.org/D109206
* Reformulate OrcJIT tutorial doc to make it more clear.
Fixed a minor writing error. The text was hard to understand.
Reviewed By: lhames, mehdi_amini
Differential Revision: https://reviews.llvm.org/D106235
* [Test] Missed opt test for D108910
We can fold loop phis after we've proved that some exit has EC=0 in IndVars.
Patch by Dmitry Makogon!
* [flang] Extend common block size to cover equivalence storage
The size of a common block should be extended to cover any storage sequences that are storage associated with the common block via equivalences (8.10.2.2 point 1 (2)).
In symbol size and offset computation, the size of the common block was not always extended to cover storage association. It was only done if the "base symbol of an equivalence group"(*) appeared in a common block statement.
Correct this to cover all cases where a symbol appearing in a common block statement is storage associated.
(*) the base symbol of an equivalence group is the symbol whose storage starts first in a storage association (if several symbols start first, the base symbol is the last one visited by the algorithm going through the equivalence sets).
Differential Revision: https://reviews.llvm.org/D109156
* [mlir][flang] Do not prevent integer types from being parsed as MLIR keywords
DialectAsmParser::parseKeyword is rejecting `'i' digit+` while it is a valid identifier according to mlir/docs/LangRef.md.
Integer types actually used to be TOK_KEYWORD a while back before the change: https://github.com/llvm/llvm-project/commit/6af866c58d21813fb243906611d02bb2a8ffa43a.
This patch modifies `isCurrentTokenAKeyword` to return true for tokens that match integer types too.
The motivation for this change is the parsing of `!fir.type<{` `component-name: component-type,`+ `}>` type in FIR that represents Fortran derived types. The component-names are parsed as keywords, and can very well be i32 or any ixxx (which are valid Fortran derived type component names).
The Quant dialect type parser had to be modified since it relied on `iw` not being parsed as keywords.
Differential Revision: https://reviews.llvm.org/D108913
* [lldb] [test] Mark *fork-follow-child* tests non-Darwin
* [flang] Remove *- C++ -* incantation from runtime .cpp files. NFC
We should only need to spell the language out in .h files.
Differential Revision: https://reviews.llvm.org/D109138
* [lldb/lua] Force Lua version to be 5.3
Due to CMake cache, find_package in FindLuaAndSwig.cmake will be ignored. This commit adds EXACT and REQUIRED flags to it and removes find_package in Lua ScriptInterpreter.
Signed-off-by: Siger Yang <[email protected]>
Reviewed By: tammela, JDevlieghere
Differential Revision: https://reviews.llvm.org/D108515
* [flang] COMMAND_ARGUMENT_COUNT runtime implementation
Grab whatever ProgramStart has stored in executionEnvironment.argc and subtract 1 (based on the assumption that ProgramStart is called with a C-style argc that counts the command name as an argument).
Spoiler alert: The tests will evolve into fixtures when we implement GET_COMMAND_ARGUMENT etc.
Differential Revision: https://reviews.llvm.org/D109048
* [AArch64][ISel] NFC: DAG.getMachineFunction() -> MF
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D109135
* [AArch64][SME] Support NEON vector to GPR integer moves in streaming mode
A small subset of the NEON instruction set is legal in streaming mode. This patch adds support for the following vector to integer move instructions:
    0x00 1110 0000 0001 0010 11xx xxxx xxxx # SMOV W|Xd,Vn.B[0]
    0x00 1110 0000 0010 0010 11xx xxxx xxxx # SMOV W|Xd,Vn.H[0]
    0100 1110 0000 0100 0010 11xx xxxx xxxx # SMOV Xd,Vn.S[0]
    0000 1110 0000 0001 0011 11xx xxxx xxxx # UMOV Wd,Vn.B[0]
    0000 1110 0000 0010 0011 11xx xxxx xxxx # UMOV Wd,Vn.H[0]
    0000 1110 0000 0100 0011 11xx xxxx xxxx # UMOV Wd,Vn.S[0]
    0100 1110 0000 1000 0011 11xx xxxx xxxx # UMOV Xd,Vn.D[0]
Only the zero index variants are legal; all other indexes are illegal. To support this, new instructions are defined specifically for zero index, which is hardcoded, along with an implicit 'VectorIndex0' operand. Since the index operand is implicit and takes no bits in the encoding, custom decoding is required to add the operand.
I'm not sure if this is the best approach but the predicate constraint on a subset of an operand is unusual. Would be interested to hear some alternatives.
The instructions are predicated on 'HasNEONorStreamingSVE', i.e. they're enabled by either +neon or +streaming-sve.
This follows on from the work in D106272 to support the subset of SVE(2) instructions that are legal in streaming mode.
Depends on D107902.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D107903
* [sanitizer_common] Define wordexp_wrde_dooffs for Solaris
The Solaris buildbots have been broken for some time:
    In file included from /opt/llvm-buildbot/home/solaris11-amd64/clang-solaris11-amd64/llvm/compiler-rt/lib/asan/asan_interceptors.cpp:174:
    /opt/llvm-buildbot/home/solaris11-amd64/clang-solaris11-amd64/llvm/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:4000:19: error: use of undeclared identifier 'wordexp_wrde_dooffs'
    ((flags & wordexp_wrde_dooffs) ? p->we_offs : 0) + p->we_wordc;
    ^
This was caused by D108646 <https://reviews.llvm.org/D108646>; the fix is equivalent to D108838 <https://reviews.llvm.org/D108838>.
Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D109193
* [LoopBoundSplit] Update phi node in exit block
It fixes https://bugs.llvm.org/show_bug.cgi?id=51700
Differential Revision:
* [JITLink] Add initial Aarch64 support
Set up basic infrastructure for 64-bit ARM architecture support in JITLink. It allows for loading a minimal object file and resolving a single relocation. Advanced features like GOT and PLT handling or relaxations were intentionally left out for the moment.
This patch follows the idea to keep implementations for ARM (32-bit) and AArch64 (64-bit) separate, because:
  * it might be easier to share code with the MachO "arm64" JITLink backend
  * LLVM has individual targets for ARM and AArch64 as well
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D108986
* [gn build] Port 2ed91da0f1f3
* [hwasan] Support more complicated lifetimes.
This is important as with exceptions enabled, non-POD allocas often have two lifetime ends: the exception handler, and the normal one.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D108365
* Revert "[lldb/lua] Force Lua version to be 5.3"
This commit causes buildbot failures if SWIG is available but Lua is not present.
This reverts commit 7bb42dc6b114f57200abfebaaa01160914be6bba.
* [OpenCL] Supports optional 64-bit floating point types in C++ for OpenCL 2021
Adds support for a feature macro `__opencl_c_fp64` in C++ for OpenCL 2021 enabling a respective optional core feature from OpenCL 3.0.
This change aims to achieve compatibility between C++ for OpenCL 2021 and OpenCL 3.0.
Differential Revision: https://reviews.llvm.org/D108989
* [AMDGPU][MC][NFC][DOC] Updated description of registers
Corrected list of available register tuples to reflect changes introduced by commits https://reviews.llvm.org/D103672 and https://reviews.llvm.org/D103800
See bug https://bugs.llvm.org/show_bug.cgi?id=51388
* [OptTable] Reapply Improve error message output for grouped short options
This reapplies 71d7fed3bc2ad6c22729d446526a59fcfd99bd03 which was reverted by 3e2bd82f02c6cbbfb0544897c7645867f04b3a7e. This change includes the fix for breaking the sanitizer bots.
As seen in https://bugs.llvm.org/show_bug.cgi?id=48880 the current implementation for parsing grouped short options can return unclear error messages. This change fixes the example given in the ticket in which a flag is incorrectly given an argument. Also when parsing a group we now keep reading past the first incorrect option and output errors for all incorrect options in the group.
Differential Revision: https://reviews.llvm.org/D108770
* [X86][SLM] Fix PBLENDVB uops and throughput
SLM PBLENDVB is just as bad as BLENDVPD/PS - so model it as such, fixing the rr vs rm uops diff as well.
The Intel AoM appears to have a copy+paste typo with PBLENDW, it doesn't match Agner or InstLatX64.
Noticed while investigating some of the weird discrepancies reported by the D103695 helper script (SLM had much better vector shift throughputs than it should).
* [GlobalISel] Add convenience constructors to MemDesc
This allows constructing a MemDesc from a MachineMemoryOperand, a pattern that starts to show up more frequently.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D109161
* [LoopDeletion] Move ICmpInst handling to getValueOnFirstIteration()
As noticed in https://reviews.llvm.org/D105688, it would be great to move handling of ICmpInst which was in canProveExitOnFirstIteration() to getValueOnFirstIteration().
Patch by Dmitry Makogon!
Differential Revision: https://reviews.llvm.org/D108978
Reviewed By: reames
* [analyzer][NFCI] Allow clients of NoStateChangeFuncVisitor to check entire function calls, rather than each ExplodedNode in it
D105553 added NoStateChangeFuncVisitor, an abstract class to aid in creating notes such as "Returning without writing to 'x'", or "Returning without changing the ownership status of allocated memory". Its clients need to define, among other things, what a change of state is.
For code like this:
    f() {
      g();
    }
    foo() {
      f();
      h();
    }
We'd have a path in the ExplodedGraph that looks like this:
                 -- <g> -->
                /          \
             ---     <f>    -------->        --- <h> --->
            /                         \     /            \
    --------        <foo>              ------    <foo>    -->
When we're interested in whether f neglected to change some property, NoStateChangeFuncVisitor asks these questions:
                        ÷×~
                 -- <g> -->
            ß   /          \$    @&#*
             ---     <f>    -------->        --- <h> --->
            /                         \     /            \
    --------        <foo>              ------    <foo>    -->
Has anything changed in between # and *?
Has anything changed in between & and *?
Has anything changed in between @ and *?
...
Has anything changed in between $ and *?
Has anything changed in between × and ~?
Has anything changed in between ÷ and ~?
...
Has anything changed in between ß and *?
...
This is a rather thorough line of questioning, which is why in D105819, I was only interested in whether state *right before* and *right after* a function call changed, and early returned to the CallEnter location:
    if (!CurrN->getLocationAs<CallEnter>())
      return;
Except that I made a typo, and forgot to negate the condition. So, in this patch, I'm fixing that, and under the same hood allow all clients to decide to do this whole-function check instead of the thorough one.
Differential Revision: https://reviews.llvm.org/D108695
* [gn build] Port a375bfb5b729
* Reland "[clang-repl] Re-implement clang-interpreter as a test case."
Original commit message: "
Original commit message:"
The current infrastructure in lib/Interpreter has a tool, clang-repl, very similar to clang-interpreter which also allows incremental compilation.
This patch moves clang-interpreter as a test case and drops it as conditionally built example as we already have clang-repl in place.
Differential revision: https://reviews.llvm.org/D107049
"
This patch also ignores ppc due to missing weak symbol for __gxx_personality_v0 which may be a feature request for the jit infrastructure. Also, adds a missing build system dependency to the orc jit.
"
Additionally, this patch defines a custom exception type and thus avoids the requirement to include header <exception>, making it easier to deploy across systems without standard location of the c++ headers.
Differential revision: https://reviews.llvm.org/D107049
* [ORC] Static cast more uint64_t to size_t
These instances don't have an obvious way to fail nicely so I've just asserted they are within range.
Fixes the Arm 32 bit builds.
* [compiler-rt][Profile] Disable test on Arm/AArch64 Linux
While a fix for flaky results is being reviewed.
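The "[ORC] Static cast more uint64_t to size_t" entry above describes asserting that a 64-bit value is in range before narrowing it. A minimal sketch of that pattern follows; `checkedNarrow` is a hypothetical helper written for this illustration and is not the code from the ORC commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: narrow a 64-bit size to size_t, asserting that it fits.
// On targets where size_t is 32 bits (e.g. 32-bit Arm) the assert can fire;
// where size_t is 64 bits the check is always true.
inline std::size_t checkedNarrow(std::uint64_t V) {
  assert(V <= std::numeric_limits<std::size_t>::max() &&
         "value does not fit in size_t");
  return static_cast<std::size_t>(V);
}
```

Because the check is an `assert`, it disappears in builds that define `NDEBUG`, which fits the entry's remark that these sites have no obvious way to fail nicely.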
* [gn build] (manually) port 6fe2beba7d2a (ExceptionTests)
* Revert "Reland "[clang-repl] Re-implement clang-interpreter as a test case.""
This reverts commit 6fe2beba7d2a41964af658c8c59dd172683ef739 which fails on clang-hexagon-elf
* Revert "[gn build] (manually) port 6fe2beba7d2a (ExceptionTests)"
This reverts commit da47c2719b1094a29427917ddb157c9c716e876d.
6fe2beba7d2a was reverted in 885964046114.
* [lldb] Support .debug_rnglists.dwo sections in dwp file
This patch considers the CU index entry when reading the .debug_rnglists.dwo section.
Reviewed By: jankratochvil
Differential Revision: https://reviews.llvm.org/D107456
* Revert "[NFC] Recommit "Regenerate SVE ACLE intrinsics tests""
This reverts commit 91eda9c30f33da6ec6da70b59a5f5da6c6397039.
Breaks tests on macOS, both intel and arm. See e.g.
https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket/8837137028177680097/+/u/package_clang/stdout?format=raw
https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket/8837137028177680081/+/u/package_clang/stdout?format=raw
http://45.33.8.238/macm1/17258/step_7.txt
http://45.33.8.238/mac/35004/step_7.txt
* [lldb] [test] Mark vfork-follow-child-* tests unsupported (flaky) on aarch64
* [lldb] [test] Mark the remaining vfork-follow-child test unsupported (flaky) on aarch64
* [CUDA][NFC] Fix wrong assert information
Reviewed By: fodinabor
Differential Revision: https://reviews.llvm.org/D109232
* Remove blank from NaN string representation
Flang front end function DumpHexadecimal generates a string representation of a REAL value. When the value is a NaN, the string contains a blank, as in "NaN 0x7fc00000". This function is used by lowering to generate a string that is then passed to llvm Support function convertFromStringSpecials, which does not expect a blank in the string. Remove the blank to allow correct recognition of a NaN by this llvm function.
Note that function DumpHexadecimal is not exercised by the front end itself.
This functionality is only exercised by code that is not yet present in llvm.
* [mlir] Update EmitC documentation
* [mlir][sparse] refine heuristic for iteration graph topsort
The sparse index order must always be satisfied, but this may give a choice in topsorts for several cases. We broke ties in favor of any dense index order, since this gives good locality. However, breaking ties in favor of pushing unrelated indices into sparse iteration spaces gives better asymptotic complexity. This revision improves the heuristic.
Note that in the long run, we are really interested in using ML for ML to find the best loop ordering as a replacement for such heuristics.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D109100
* [clangd] Use the active file's language for hover code blocks
This helps improve the syntax highlighting for Objective-C code, although it currently doesn't work well in VS Code with methods/properties/ivars since we don't currently include the proper decl context (e.g. class).
Differential Revision: https://reviews.llvm.org/D108584
* [CMake] Add targets for generating coverage reports
This is a pretty small bit of CMake goop to generate code coverage reports. I always forget the right script invocation and end up fumbling around too much. Wouldn't it be great to have targets that "Just Work"? Well, I thought so.
At present this only really works correctly for LLVM, but I'll extend it in subsequent patches to work for subprojects.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D109019
* [mlir][linalg] Extend tiled_loop to SCF conversion to generate scf.parallel.
Differential Revision: https://reviews.llvm.org/D109230
* [RISCV] Change how we encode AVL operands in vector pseudoinstructions to use GPRNoX0.
This patch changes the register class to avoid accidentally setting the AVL operand to X0 through MachineIR optimizations.
There are cases where we really want to use X0, but we can't get that past the MachineVerifier with the register class as GPRNoX0. So I've used a 64-bit -1 as a sentinel for X0. All other immediate values should be uimm5. I convert it to X0 at the earliest possible point in the VSETVLI insertion pass to avoid touching the rest of the algorithm. In SelectionDAG lowering I'm using a -1 TargetConstant to hide it from instruction selection and treat it differently than if the user used -1. A user -1 should be selected to a register since it doesn't fit in uimm5.
This is the rest of the changes started in D109110. As mentioned there, I don't have a failing test from MachineIR optimizations anymore.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D109116
* [lld/mac] Don't assert during thunk insertion if there are undefined symbols
We end up calling resolveBranchVA(), which asserts for Undefineds.
As a fix, just return early in Writer::run() if there are any diagnostics after processing relocations (which is where undefined symbol errors are emitted). This matches what the ELF port does.
Differential Revision: https://reviews.llvm.org/D109079
* Add missing `REQUIRES: asserts` to combine-icmp-to-lhs-known-bits.mir
* [ARM] Add VFP lowering for fptosi.sat
This extends D107865 to the VFP instructions, lowering llvm.fptosi.sat and llvm.fptoui.sat to VCVT instructions that inherently perform the saturate.
Differential Revision: https://reviews.llvm.org/D107866
* [libc++][NFC] Remove uses of 'using namespace std;' in the test suite
Differential Revision: https://reviews.llvm.org/D109120
* Revert "[analyzer][NFCI] Allow clients of NoStateChangeFuncVisitor to check entire function calls, rather than each ExplodedNode in it"
This reverts commit a375bfb5b729e0f3ca8d5e001f423fa89e74de87.
This was causing a bot to crash: https://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/23380/
* [lldb/Plugins] Introduce Scripted Interface Factory
This patch splits the previous `ScriptedProcessPythonInterface` into multiple specific classes:
1. The `ScriptedInterface` abstract class that carries the interface instance object and its pure virtual creation method.
2. The `ScriptedPythonInterface` that holds a generic `Dispatch` method that can be used by various interfaces to call python methods and also keeps a reference to the Python Script Interpreter instance.
3. The `ScriptedProcessInterface` that describes the base Scripted Process model with all the methods used in the underlying script.
All these components are used to refactor the `ScriptedProcessPythonInterface` class, making it more modular.
This patch is also a requirement for the upcoming work on `ScriptedThread`.
Differential Revision: https://reviews.llvm.org/D107521
Signed-off-by: Med Ismail Bennani <[email protected]>
* [gn build] Port b9e57e030560
* [NFC][CSSPGO] Add end of file newline to test input
On some platforms (eg: AIX), diff will complain about the missing newline.
diff: Missing newline at the end of file .../llvm/test/tools/llvm-profdata/Inputs/cs-sample.proftext.
* [flang] Move runtime API headers to flang/include/flang/Runtime
Move the closure of the subset of flang/runtime/*.h header files that are referenced by source files outside flang/runtime (apart from unit tests) into a new directory (flang/include/flang/Runtime) so that relative include paths into ../runtime need not be used.
flang/runtime/pgmath.h.inc is moved to flang/include/flang/Evaluate; it's not used by the runtime.
Differential Revision: https://reviews.llvm.org/D109107
* [modules] Use `HashBuilder` and `MD5` for the module hash.
Per the comments, `hash_code` values "are not stable to save or persist", so are unsuitable for the module hash, which must persist across compilations for the implicit module hashes to match. Note that in practice, today, `hash_code` values are stable. But this is an implementation detail, with a clear `FIXME` indicating we should switch to a per-execution seed.
The stability of `MD5` also allows modules cross-compilation use-cases. The `size_t` underlying storage for `hash_code` varying across platforms could cause mismatching hashes when cross-compiling from a 64bit target to a 32bit target.
Note that native endianness is still used for the hash computation. So hashes will differ between platforms of different endianness.
Reviewed By: jansvoboda11
Differential Revision: https://reviews.llvm.org/D102943
* [NFC][DWARF] Add triple to new TAG test file
The file is requiring x86, but using llc without triple. This will cause problems on non-x86 platforms, as the default triple will not be x86.
eg: On PowerPC le, it will emit warnings as:
    'x86-64' is not a recognized processor for this target (ignoring processor)
    '+cx8' is not a recognized feature for this target (ignoring feature)
    '+fxsr' is not a recognized feature for this target (ignoring feature)
    '+mmx' is not a recognized feature for this target (ignoring feature)
    '+sse' is not a recognized feature for this target (ignoring feature)
    ..
On some other platforms, it may even crash -- if some of the features have the same name (eg: soft-float).
Add the triple as this was the intended test target.
* [gn build] Reformat all files
Ran `git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format`.
* [ARM] Add patterns for store(fptosisat(..))
As an extension to D107866, this adds store(fptosisat(..)) patterns, similar to the existing fptosi patterns, to prevent unnecessarily moving into gpr regs where we can use fp stores directly.
Differential Revision: https://reviews.llvm.org/D108378
* [libc++abi] Remove workarounds for missing -Wno-exceptions on older GCCs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97675 has now been resolved in GCC 11, so we can remove those workarounds.
Differential Revision: https://reviews.llvm.org/D109188
* [libc++] Remove _LIBCPP_HAS_NO_LONG_LONG in favour of using_if_exists
_LIBCPP_HAS_NO_LONG_LONG was only defined on FreeBSD. Instead, use the using_if_exists attribute to skip over declarations that are not available on the base system. Note that there's an annoying limitation that we can't conditionally define a function based on whether the base system provides a function, so for example we still need preprocessor logic to define the abs() and div() overloads.
Differential Revision: https://reviews.llvm.org/D108630
* [AMDGPU] Small cleanup in optimizeCompareInstr. NFC.
* [clang] fix error recovery ICE on copy elision when returning invalid variable
See PR51708.
Attempting copy elision in dependent contexts with an invalid variable, such as a variable with incomplete type, would cause a crash when attempting to calculate its alignment.
The fix is to just skip this optimization on invalid VarDecl, as otherwise this provides no benefit to error recovery: This functionality does not try to diagnose anything, it only calculates a flag which will affect where the variable will be allocated during codegen.
Signed-off-by: Matheus Izvekov <[email protected]>
Reviewed By: rtrieu
Differential Revision: https://reviews.llvm.org/D109191
* [compiler-rt][Profile] Wait for child threads in set-file-object test
We've been seeing this test return 31 instead of 32 for the "functions" line in this test on our AArch64 bots. One possible cause is some of the children not finishing in time before the llvm-profdata commands are run, if the machine is heavily loaded.
Wait for all the children to finish before exiting the parent.
Reviewed By: zequanwu
Differential Revision: https://reviews.llvm.org/D109222
* [InstCombine] add tests for icmp of rotate (PR51566); NFC
* [InstCombine] reduce code duplication; NFC
* [InstCombine] fold (rotate X) eq/ne (0/-1)
This generalizes the examples shown in:
https://llvm.org/PR51566
https://alive2.llvm.org/ce/z/V-sEy9
* [libc++][NFC] Mark values in gdb pretty print comparison functions as live to prevent values being optimized out.
It appears when testing LLVM 13 on Power, we run into failures with the `libcxx/test/libcxx/gdb/gdb_pretty_printer_test.sh.cpp` test case optimizing values out. Despite some of the functions in the test already being marked with optnone, adding the `MarkAsLive()` calls inside of the pretty printer comparison functions resolves the issues of the values being optimized out.
This patch aims to address https://llvm.org/PR51675.
Differential Revision: https://reviews.llvm.org/D109204
* [SampleFDO] Fix -Wnon-virtual-dtor
Make the dtor virtual to fix the warning.
* DebugInfo: Correct/improve type formatting (pointers to function types especially)
This does add some extra superfluous whitespace (eg: "int *") intended to make the Simplified Template Names work easier - this makes the DIE-based names match more exactly the clang-generated names, so it's easier to identify cases that don't generate matching names.
(arguably we could change clang to skip that whitespace or add some fuzzy matching to accommodate differences in certain whitespace - but this seemed easier and fairly low-impact)
* Revert "[Coroutines] [Clang] Look up coroutine component in std namespace first"
This reverts commit 2fbd254aa46b, which broke the libc++ CI. I'm reverting to get things stable again until we've figured out a way forward.
Differential Revision: https://reviews.llvm.org/D108696 * [libc++] Add an assertion in the subrange constructors with a size hint Those constructors are very easy to misuse -- one could easily think that the size passed to the constructor is the size of the range to exhibit from the subrange. Instead, it's a size hint and it's UB to get it wrong. Hence, when it's cheap to compute the real size of the range, it's cheap to make sure that the user didn't get it wrong. Differential Revision: https://reviews.llvm.org/D108827 * [lldb] Adjust parse_frames for unnamed images Follow up to 2cbd3b04feaaaff7fab4c6500476839a23180886 which added support for unnamed images but missed the use case in parse_frames. * [NFC][OpenMP] Use clang_cc1 to driver tests The test driver-fopenmp-extensions.c is failing on platforms that does not use integrated-as. It can be reproduced using -fno-integrated-as on Linux too. bin/clang -c -Xclang -verify=omp -fopenmp -fopenmp-extensions -fno-openmp-extensions ../llvm-project/clang/test/OpenMP/driver-fopenmp-extensions.c -fno-integrated-as Assembler messages: Error: can't open /tmp/driver-fopenmp-extensions-8fafe8.s for reading: No such file or directory clang-14: error: assembler command failed with exit code 1 (use -v to see invocation) The goal of this test is to verify syntax diags only, so we should use clang_cc1 to test. Reviewed By: jdenny, ABataev Differential Revision: https://reviews.llvm.org/D109255 * [mlir][sparse] add convenience method for sparse tensor setup This simplifies setting up sparse tensors through C-style data structures. Useful for runtimes that want to interact with MLIR-generated code without knowning about all bufferization details (viz. memrefs). Reviewed By: bixia Differential Revision: https://reviews.llvm.org/D109251 * [libc] fix strtointeger hex prefix parsing Fix edge case where "0x" would be considered a complete hexadecimal number for purposes of str_end. 
Now the hexadecimal prefix needs a valid digit after it, else just the 0 will be counted as the number. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D109084 * [flang] Use CMake to determine endianness. The preprocessor definitions __BYTE_ORDER__, __ORDER_BIG_ENDIAN__, and __ORDER_LITTLE_ENDIAN__ are gcc extensions (also supported by clang), but msvc (and others) do not define them. As a result __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__ both evaluate to 0 by the prepreprocessor, and __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__, the first `#if` condition to 1, hence assuming the wrong byte order for x86(_64). This patch instead uses CMake's TestBigEndian module to determine target architecture's endianness at configure-time. Note this also uses the same mechanism for the runtime. If compiling flang as a cross-compiler, the runtime for the compile-target must be built separately (Flang does not support the LLVM_ENABLE_RUNTIMES mechanism yet). Fixes llvm.org/PR51597 Reviewed By: ijan1, Leporacanthicus Differential Revision: https://reviews.llvm.org/D109108 * DebugInfo: Fix a few bot failures for type dumping fixes * [clang] Allow the OpenBSD driver to link the libclang_rt.profile library. Differential Revision: https://reviews.llvm.org/D109244 * Make LLVM Linkage a first class attribute instead of using an integer attribute This makes the IR more readable, in particular when this will be used on the builtin func outside of the LLVM dialect. Reviewed By: wsmoses Differential Revision: https://reviews.llvm.org/D109209 * OpenBSD also needs execinfo * [lldb/Plugins] Move member template specialization out of class This patch should fix the build failure that surfaced when build llvm with GCC: https://lab.llvm.org/staging/#/builders/16/builds/10450 GCC complained that I explicitely specialized `ScriptedPythonInterface::ExtractValueFromPythonObject` in a in non-namespace scope, which is tolerated by Clang. 
To solve this issue, the specialization were declared out of the class and implemented in the source file. Signed-off-by: Med Ismail Bennani <[email protected]> * DebugInfo: additional fix missed in bc066e2. * [ORC] Silence a buggy GCC unused argument warning. * [AArch64] Implement target hook function to decide folding (mul (add x, c1), c2) Prevent the folding if it leads to worse code. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D108871 * Support linking against OpenMP runtime on OpenBSD. * [MLIR] Primitive linkage lowering of FuncOp FuncOp always lowers to an LLVM external linkage presently. This makes it …
commit fad7cd3 ("nbd: add the check to prevent overflow in __nbd_ioctl()") exposed something that's long been broken for semi-hosted environments like the kernel in Clang: check_mul_overflow() is implemented in terms of __builtin_mul_overflow(). For 64b operands on 32b hosts, LLVM was emitting libcalls to __mulodi4() which assumes that compiler-rt is being linked against. The kernel does not do so, so LLVM was emitting calls to functions that have no definition, resulting in the linkage failure: ERROR: modpost: "__mulodi4" [drivers/block/nbd.ko] undefined! I have been fixing LLVM upstream, see the six fixes linked from: https://bugs.llvm.org/show_bug.cgi?id=28629#c23. I still need to detect older toolchains that we'd still like to support, then find an appropriate workaround for the kernel. Disable network block devices for now, so that we don't lose coverage of 32b ARM allmodconfig builds which are currently red in our CI. Bug: 199191028 Link: ClangBuiltLinux/linux#1438 Signed-off-by: Nick Desaulniers <[email protected]> Change-Id: I79a597177f75370f60621b984cb8e21ca2a268d6
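For readers who want to see the failure mode in isolation, here is a minimal sketch of the kind of 64-bit checked multiply that `check_mul_overflow()` reduces to. The names `mul_s64_checked` and `mul_s64_checked_demo` are hypothetical and are not kernel code. On a 64-bit host the builtin is expanded inline; the claim in the commit message is that an affected Clang targeting a 32-bit architecture could instead emit a call to `__mulodi4` for this pattern.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not kernel code: returns true if a * b overflowed.
 * The (possibly wrapped) product is always stored in *res.
 */
bool mul_s64_checked(long long a, long long b, long long *res)
{
	return __builtin_mul_overflow(a, b, res);
}

/* Small usage example. */
void mul_s64_checked_demo(void)
{
	long long out;

	/* 2^31 * 2 fits comfortably in 64 bits. */
	assert(!mul_s64_checked(1LL << 31, 2, &out));
	assert(out == (1LL << 32));

	/* LLONG_MAX * 2 cannot be represented. */
	assert(mul_s64_checked(LLONG_MAX, 2, &out));
}
```

One way to check a given toolchain is to compile a file like this for a 32-bit target and look for `__mulodi4` in the generated assembly or in the object file's undefined symbols.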
commit fad7cd3 ("nbd: add the check to prevent overflow in __nbd_ioctl()") raised an issue from the fallback helpers added in commit f090782 ("compiler.h: enable builtin overflow checkers and add fallback code") Specifically, the helpers for checking whether the results of a multiplication overflowed (__unsigned_mul_overflow, __signed_add_overflow) use the division operator when !COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW. This is problematic for 64b operands on 32b hosts. Also, because the macro is type agnostic, it is very difficult to write a similarly type generic macro that dispatches to one of: * div64_s64 * div64_u64 * div_s64 * div_u64 Raising the minimum supported versions allows us to remove all of the fallback helpers for !COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW, instead dispatching the compiler builtins. arm64 has already raised the minimum supported GCC version to 5.1, do this for all targets now. See the link below for the previous discussion. Link: https://lore.kernel.org/all/[email protected]/ Link: https://lore.kernel.org/lkml/CAK7LNASs6dvU6D3jL2GG3jW58fXfaj6VNOe55NJnTB8UPuk2pA@mail.gmail.com/ Link: ClangBuiltLinux/linux#1438 Reported-by: Stephen Rothwell <[email protected]> Reported-by: Nathan Chancellor <[email protected]> Suggested-by: Rasmus Villemoes <[email protected]> Signed-off-by: Nick Desaulniers <[email protected]> Reviewed-by: Kees Cook <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
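As a rough illustration of why a division-based fallback is awkward here, the sketch below checks an unsigned 64-bit multiply with the `/` operator. It is a simplified stand-in, not the removed kernel macros, and the function names are hypothetical. On a 32-bit target the 64-bit division typically becomes a libcall such as `__udivdi3`, which the kernel does not provide; that is why the kernel has the `div64_u64()` family instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a division-based fallback (not the kernel's
 * macros). Returns true if a * b does not fit in 64 bits. The wrapped
 * product is stored in *res either way.
 */
bool mul_u64_overflow_by_division(uint64_t a, uint64_t b, uint64_t *res)
{
	*res = a * b; /* unsigned wraparound is well defined in C */

	/* This 64-bit '/' is the part a 32-bit target may turn into a libcall. */
	return a != 0 && b > UINT64_MAX / a;
}

/* Small usage example. */
void mul_u64_overflow_by_division_demo(void)
{
	uint64_t out;

	assert(!mul_u64_overflow_by_division(4, 5, &out) && out == 20);
	assert(mul_u64_overflow_by_division(UINT64_MAX, 2, &out));
	assert(!mul_u64_overflow_by_division(0, UINT64_MAX, &out) && out == 0);
}
```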
Once upgrading the minimum supported version of GCC to 5.1, we can drop the fallback code for !COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW. This is effectively a revert of commit f090782 ("compiler.h: enable builtin overflow checkers and add fallback code") Link: ClangBuiltLinux/linux#1438 (comment) Suggested-by: Rasmus Villemoes <[email protected]> Signed-off-by: Nick Desaulniers <[email protected]> Acked-by: Kees Cook <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
https://lore.kernel.org/lkml/[email protected]/ is the latest kernel patch in this saga. |
commit fad7cd3 ("nbd: add the check to prevent overflow in __nbd_ioctl()") raised an issue from the fallback helpers added in commit f090782 ("compiler.h: enable builtin overflow checkers and add fallback code") ERROR: modpost: "__divdi3" [drivers/block/nbd.ko] undefined! As Stephen Rothwell notes: The added check_mul_overflow() call is being passed 64 bit values. COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW is not set for this build (see include/linux/overflow.h). Specifically, the helpers for checking whether the results of a multiplication overflowed (__unsigned_mul_overflow, __signed_add_overflow) use the division operator when !COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW. This is problematic for 64b operands on 32b hosts. This was fixed upstream by commit 76ae847 ("Documentation: raise minimum supported version of GCC to 5.1") which is not suitable to be backported to stable. Further, __builtin_mul_overflow() would emit a libcall to a compiler-rt-only symbol when compiling with clang < 14 for 32b targets. ld.lld: error: undefined symbol: __mulodi4 In order to keep stable buildable with GCC 4.9 and clang < 14, modify struct nbd_config to instead track the number of bits of the block size; reconstructing the block size using runtime checked shifts that are not problematic for those compilers and in a ways that can be backported to stable. In nbd_set_size, we do validate that the value of blksize must be a power of two (POT) and is in the range of [512, PAGE_SIZE] (both inclusive). This does modify the debugfs interface. 
Cc: [email protected] Cc: Arnd Bergmann <[email protected]> Cc: Rasmus Villemoes <[email protected]> Link: ClangBuiltLinux/linux#1438 Link: https://lore.kernel.org/all/[email protected]/ Link: https://lore.kernel.org/stable/CAHk-=whiQBofgis_rkniz8GBP9wZtSZdcDEffgSLO62BUGV3gg@mail.gmail.com/ Reported-by: Naresh Kamboju <[email protected]> Reported-by: Nathan Chancellor <[email protected]> Reported-by: Stephen Rothwell <[email protected]> Suggested-by: Kees Cook <[email protected]> Suggested-by: Linus Torvalds <[email protected]> Suggested-by: Pavel Machek <[email protected]> Signed-off-by: Nick Desaulniers <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
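The idea in the commit message above (store the log2 of the block size, then rebuild byte counts with checked shifts) can be sketched in userspace C as follows. This is only an approximation under stated assumptions: `SKETCH_PAGE_SIZE`, `blksize_to_bits`, and `blocks_to_bytes` are hypothetical names, the page size is assumed to be 4096, and unsigned types are used for simplicity where the driver deals with signed sizes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/*
 * Returns log2(blksize) if blksize is a power of two in
 * [512, SKETCH_PAGE_SIZE] (both inclusive), or -1 otherwise.
 */
int blksize_to_bits(uint64_t blksize)
{
	int bits = 0;

	if (blksize < 512 || blksize > SKETCH_PAGE_SIZE)
		return -1;
	if (blksize & (blksize - 1)) /* more than one bit set */
		return -1;
	while ((blksize >> bits) != 1)
		bits++;
	return bits;
}

/*
 * Computes blocks << bits without any multiply or divide.
 * Returns true on overflow, in which case *bytes is left untouched.
 */
bool blocks_to_bytes(uint64_t blocks, unsigned int bits, uint64_t *bytes)
{
	if (bits >= 64 || blocks > (UINT64_MAX >> bits))
		return true;
	*bytes = blocks << bits;
	return false;
}

/* Small usage example. */
void blksize_bits_demo(void)
{
	uint64_t bytes;

	assert(blksize_to_bits(512) == 9);
	assert(blksize_to_bits(1000) == -1); /* not a power of two */
	assert(!blocks_to_bytes(8, 9, &bytes) && bytes == 4096);
}
```

Because only shifts and comparisons are involved, neither a 64-bit division helper nor `__mulodi4` is needed, which is the property the commit message relies on for older compilers.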
(once this lands in mainline, I'll need to chase this into stable, then re-enable CONFIG_BLK_DEV_NBD=y in ACK) |
41e76c6 is in v5.15-rc4. GKH pulled this into stable 5.14. |
This reverts commit ca7dad5. This re-enabled coverage of BLK_DEV_NBD. The clang-13 bug was worked around in commit 41e76c6 ("nbd: use shifts rather than multiplies") Bug: 199191028 Link: ClangBuiltLinux/linux#1438 Signed-off-by: Nick Desaulniers <[email protected]> Change-Id: Ifcb6131e5c8b12e4c784320adac639b573198b1f
BugLink: https://bugs.launchpad.net/bugs/1950516 commit 41e76c6 upstream. commit fad7cd3 ("nbd: add the check to prevent overflow in __nbd_ioctl()") raised an issue from the fallback helpers added in commit f090782 ("compiler.h: enable builtin overflow checkers and add fallback code") ERROR: modpost: "__divdi3" [drivers/block/nbd.ko] undefined! As Stephen Rothwell notes: The added check_mul_overflow() call is being passed 64 bit values. COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW is not set for this build (see include/linux/overflow.h). Specifically, the helpers for checking whether the results of a multiplication overflowed (__unsigned_mul_overflow, __signed_add_overflow) use the division operator when !COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW. This is problematic for 64b operands on 32b hosts. This was fixed upstream by commit 76ae847 ("Documentation: raise minimum supported version of GCC to 5.1") which is not suitable to be backported to stable. Further, __builtin_mul_overflow() would emit a libcall to a compiler-rt-only symbol when compiling with clang < 14 for 32b targets. ld.lld: error: undefined symbol: __mulodi4 In order to keep stable buildable with GCC 4.9 and clang < 14, modify struct nbd_config to instead track the number of bits of the block size; reconstructing the block size using runtime checked shifts that are not problematic for those compilers and in a ways that can be backported to stable. In nbd_set_size, we do validate that the value of blksize must be a power of two (POT) and is in the range of [512, PAGE_SIZE] (both inclusive). This does modify the debugfs interface. 
Cc: [email protected] Cc: Arnd Bergmann <[email protected]> Cc: Rasmus Villemoes <[email protected]> Link: ClangBuiltLinux/linux#1438 Link: https://lore.kernel.org/all/[email protected]/ Link: https://lore.kernel.org/stable/CAHk-=whiQBofgis_rkniz8GBP9wZtSZdcDEffgSLO62BUGV3gg@mail.gmail.com/ Reported-by: Naresh Kamboju <[email protected]> Reported-by: Nathan Chancellor <[email protected]> Reported-by: Stephen Rothwell <[email protected]> Suggested-by: Kees Cook <[email protected]> Suggested-by: Linus Torvalds <[email protected]> Suggested-by: Pavel Machek <[email protected]> Signed-off-by: Nick Desaulniers <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]> Signed-off-by: Stefan Bader <[email protected]>
__has_builtin(__builtin_mul_overflow) returns true for 32b ARM targets, but Clang is deferring to compiler RT when encountering `long long` types. This breaks sanitizer builds of the Linux kernel that are using __builtin_mul_overflow with these types for these targets. If the semantics of __has_builtin mean "the compiler resolves these, always" then we shouldn't conditionally emit a libcall. This will still need to be worked around in the Linux kernel in order to continue to support allmodconfig builds of the Linux kernel for this target with older releases of clang. Link: https://bugs.llvm.org/show_bug.cgi?id=28629 Link: ClangBuiltLinux/linux#1438 Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D108842
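To illustrate the `__has_builtin` point, here is a small sketch of how code typically gates on the feature test. The macro `SKETCH_HAVE_MUL_OVERFLOW` and the function names are hypothetical. The key observation is that the preprocessor check only says the compiler accepts the builtin; it cannot tell the caller whether the backend will expand it inline or emit a libcall for a given type and target. The fallback branch uses a division-based check in the style of CERT INT32-C and exists only so the sketch builds on compilers without `__has_builtin`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* compilers without the feature-test macro */
#endif

#if __has_builtin(__builtin_mul_overflow)
#define SKETCH_HAVE_MUL_OVERFLOW 1
#else
#define SKETCH_HAVE_MUL_OVERFLOW 0
#endif

/* Hypothetical helper: true if a * b overflows long long. */
bool sketch_mul_ll_overflows(long long a, long long b, long long *res)
{
#if SKETCH_HAVE_MUL_OVERFLOW
	/* The feature test passed, but lowering is still up to the backend. */
	return __builtin_mul_overflow(a, b, res);
#else
	bool ovf;

	if (a > 0)
		ovf = (b > 0) ? (a > LLONG_MAX / b) : (b < LLONG_MIN / a);
	else
		ovf = (b > 0) ? (a < LLONG_MIN / b)
			      : (a != 0 && b < LLONG_MAX / a);

	/* Wrapped product via unsigned math; the conversion back to signed
	 * is implementation-defined rather than undefined. */
	*res = (long long)((unsigned long long)a * (unsigned long long)b);
	return ovf;
#endif
}

/* Small usage example. */
void sketch_mul_ll_demo(void)
{
	long long out;

	assert(!sketch_mul_ll_overflows(6, 7, &out) && out == 42);
	assert(sketch_mul_ll_overflows(LLONG_MAX, 2, &out));
}
```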
__has_builtin(__builtin_mul_overflow) returns true for 32b MIPS targets, but Clang is deferring to compiler RT when encountering `long long` types. This breaks sanitizer builds of the Linux kernel that are using __builtin_mul_overflow with these types for these targets. If the semantics of __has_builtin mean "the compiler resolves these, always" then we shouldn't conditionally emit a libcall. This will still need to be worked around in the Linux kernel in order to continue to support malta_defconfig builds of the Linux kernel for this target with older releases of clang. Link: https://bugs.llvm.org/show_bug.cgi?id=28629 Link: ClangBuiltLinux/linux#1438 Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D108844
Similar to D108842, D108844, and D108926. __has_builtin(__builtin_mul_overflow) returns true for 32b PPC targets, but Clang is deferring to compiler RT when encountering `long long` types. This breaks ppc44x_defconfig + CONFIG_BLK_DEV_NBD=y builds of the Linux kernel that are using __builtin_mul_overflow with these types for these targets. If the semantics of __has_builtin mean "the compiler resolves these, always" then we shouldn't conditionally emit a libcall. This will still need to be worked around in the Linux kernel in order to continue to support these builds of the Linux kernel for this target with older releases of clang. Link: https://bugs.llvm.org/show_bug.cgi?id=28629 Link: ClangBuiltLinux/linux#1438 Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D108936
Similar to D108842 and D108844. __has_builtin(__builtin_mul_overflow) returns true for 32b MIPS targets, but Clang is deferring to compiler RT when encountering `long long` types. This breaks MIPS malta_defconfig builds of the Linux kernel that are using __builtin_mul_overflow with these types for these targets. If the semantics of __has_builtin mean "the compiler resolves these, always" then we shouldn't conditionally emit a libcall. This will still need to be worked around in the Linux kernel in order to continue to support malta_defconfig builds of the Linux kernel for this target with older releases of clang. Link: https://bugs.llvm.org/show_bug.cgi?id=28629 Link: ClangBuiltLinux/linux#1438 Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D108926
Similar to D108842, D108844, and D108926. __has_builtin(__builtin_mul_overflow) returns true for 32b x86 targets, but Clang is deferring to compiler RT when encountering `long long` types. This breaks ARCH=i386 + CONFIG_BLK_DEV_NBD=y builds of the Linux kernel that are using __builtin_mul_overflow with these types for these targets. If the semantics of __has_builtin mean "the compiler resolves these, always" then we shouldn't conditionally emit a libcall. This will still need to be worked around in the Linux kernel in order to continue to support these builds of the Linux kernel for this target with older releases of clang. Link: https://bugs.llvm.org/show_bug.cgi?id=28629 Link: https://bugs.llvm.org/show_bug.cgi?id=35922 Link: ClangBuiltLinux/linux#1438 Reviewed By: lebedev.ri, RKSimon Differential Revision: https://reviews.llvm.org/D108928
Once upgrading the minimum supported version of GCC to 5.1, we can drop the fallback code for !COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW. This is effectively a revert of commit f0907827a8a9 ("compiler.h: enable builtin overflow checkers and add fallback code") Link: ClangBuiltLinux/linux#1438 (comment) Suggested-by: Rasmus Villemoes <[email protected]> Signed-off-by: Nick Desaulniers <[email protected]> Acked-by: Kees Cook <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Conflicts: include/linux/compiler-clang.h include/linux/compiler-gcc.h tools/include/linux/compiler-gcc.h tools/include/linux/overflow.h
After commit 9aa0ebde0014 ("bpf, verifier: Improve precision of BPF_MUL") [1], there is an error in certain ARM configurations that enable CONFIG_BPF_SYSCALL: ld.lld: error: undefined symbol: __mulodi4 >>> referenced by verifier.c:14221 (/builds/linux/kernel/bpf/verifier.c:14221) >>> kernel/bpf/verifier.o:(adjust_reg_min_max_vals) in archive vmlinux.a >>> referenced by verifier.c:14222 (/builds/linux/kernel/bpf/verifier.c:14222) >>> kernel/bpf/verifier.o:(adjust_reg_min_max_vals) in archive vmlinux.a >>> referenced by verifier.c:14223 (/builds/linux/kernel/bpf/verifier.c:14223) >>> kernel/bpf/verifier.o:(adjust_reg_min_max_vals) in archive vmlinux.a >>> referenced 1 more times This was encountered previously [2], where it was fixed in clang-14 and avoided in the kernel with a source code workaround (that ended up being cleaner anyways). This time around, inserting a source code workaround would not be as clean, as it may involve disabling a core part of the kernel on a limited condition (as it only impacts one supported LLVM version and architecture combination) or having a separate code path for this situation. For now, just disable the builds that are impacted by this. If more people notice this problem, we can explore bumping the minimum supported version of LLVM for building `ARCH=arm` to 14. Link: https://git.kernel.org/bpf/bpf-next/c/9aa0ebde0014f01a8ca82adcbf43b92345da0d50 [1] Link: ClangBuiltLinux/linux#1438 [2] Signed-off-by: Nathan Chancellor <[email protected]>
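The commit message above does not show the verifier code, so the following is only a guess at the general shape of the pattern that pulls in `__mulodi4`: multiplying two signed 64-bit ranges and giving up on precision if any corner product overflows. All names here (`struct s64_range`, `s64_range_mul`) are hypothetical and this is not the verifier's implementation. Each checked multiply on `int64_t` is the kind of operation an affected Clang could lower to a libcall on 32-bit ARM.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical closed interval [min, max] of signed 64-bit values. */
struct s64_range {
	int64_t min;
	int64_t max;
};

/*
 * Multiplies two ranges. If any of the four corner products overflows,
 * returns the full int64_t range ("we know nothing"); otherwise returns
 * the tightest range containing all four products.
 */
struct s64_range s64_range_mul(struct s64_range a, struct s64_range b)
{
	struct s64_range out = { INT64_MIN, INT64_MAX };
	int64_t p[4];
	int i;

	if (__builtin_mul_overflow(a.min, b.min, &p[0]) ||
	    __builtin_mul_overflow(a.min, b.max, &p[1]) ||
	    __builtin_mul_overflow(a.max, b.min, &p[2]) ||
	    __builtin_mul_overflow(a.max, b.max, &p[3]))
		return out;

	out.min = out.max = p[0];
	for (i = 1; i < 4; i++) {
		if (p[i] < out.min)
			out.min = p[i];
		if (p[i] > out.max)
			out.max = p[i];
	}
	return out;
}

/* Small usage example. */
void s64_range_mul_demo(void)
{
	struct s64_range a = { -2, 3 };
	struct s64_range b = { 4, 5 };
	struct s64_range r = s64_range_mul(a, b);

	/* Corner products are -8, -10, 12 and 15. */
	assert(r.min == -10 && r.max == 15);
}
```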
After commit fad7cd3310db ("nbd: add the check to prevent overflow in __nbd_ioctl()") in the block tree, the following error is seen with several different 32-bit configurations:
`check_mul_overflow()` will ultimately call `__builtin_mul_overflow()`, where I assume that it is problematic that the operands are `long long`, so LLVM emits calls to the compiler-rt functions that we do not link against.
cc @kees