Initial statically linked clang image #5
The goal with this bootstrap is that it goes in the garbage ASAP; it's just enough to start building dependencies from source.
Force-pushed from 6c30d26 to 8566635.
Just some initial comments, hopefully that helps clarify some things!
Pushed a few changes, not quite done yet; happy to squash when merging, or force-push squashed.
docker produces the following error if the input file is not specified explicitly (via `-f Dockerfile`):

    unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /android2/containers/linux/Dockerfile: no such file or directory

Suggested by @nathanchance in #5 (comment)
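The fix is presumably to point docker at the Dockerfile explicitly; a minimal sketch (the tag and paths here are illustrative, not taken from the PR):

```sh
# Pass the Dockerfile path and the build context explicitly so docker does not
# look for ./Dockerfile in the current directory.
docker build -f containers/linux/Dockerfile -t clang-bootstrap containers/linux
```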
This image still has object files built with GCC (from library dependencies), but it's a starting point for rebuilding dependencies from scratch. Link: ClangBuiltLinux/tc-build#150
These are needed by the libc...though the libc is needed for HOSTCC...
Also, I just stumbled across the concept of https://reddit.com/r/archlinux/comments/ut383i/_/i97fgfk/?context=1
Right, we can export files between dependent workflows. We do so for CI with the builds.json file. Is that what you were thinking?
So that's something I'm still not certain about. Initially, I was thinking we'd publish the results of this build, then immediately rewrite the Dockerfile to replace the bootstrapping with the newly published container. But thinking about supporting other hosts (aarch64 in particular), I think we might want to keep the bootstrapping script alive a little longer, even if we use it only once to produce the initial image.

I don't have enough experience with Docker to know whether there are any conventions for what we're trying to do. My initial preference is to use the top-level directories of this repo as a namespace mirroring the open source projects we're building, so I guess that would be multiple Dockerfiles or multiple stages in this folder. Putting things in subfolders under llvm-project/ would be fine with me, too, e.g. llvm-project/stage1/Dockerfile, llvm-project/stage2/Dockerfile, llvm-project/stage3/Dockerfile. IDK.

The thing I still don't like about docker is that multi-stage builds across multiple distinct Dockerfiles seem to require some orchestration via a shell script to name/tag the resulting containers. I want to be able to test the full chain locally (which is easier with one Dockerfile, IMO) without being able to push to the container registry (no one other than CI should push, would be my preference), but also have CI test the individual pieces as distinct workflows, which gives each piece more time to run and lets us better express the dependency chain, it seems.
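For what it's worth, the orchestration script being described would presumably look something like the sketch below; the stage layout, tags, and `BASE_IMAGE` build argument are assumptions for illustration, not something this PR defines:

```sh
#!/bin/sh
# Hypothetical wrapper that chains per-stage Dockerfiles by tagging each
# result and handing the tag to the next stage as a build argument.
set -e

docker build -f llvm-project/stage1/Dockerfile -t llvm-stage1 llvm-project/stage1
docker build -f llvm-project/stage2/Dockerfile -t llvm-stage2 \
    --build-arg BASE_IMAGE=llvm-stage1 llvm-project/stage2
docker build -f llvm-project/stage3/Dockerfile -t llvm-stage3 \
    --build-arg BASE_IMAGE=llvm-stage2 llvm-project/stage3
```

Each stage's Dockerfile would then start with `ARG BASE_IMAGE` followed by `FROM ${BASE_IMAGE}` (again assumed, not from this PR), which is what makes the explicit tagging step necessary.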
How is that different from
Yes, exactly. I was thinking about having each container build be its own step but I think we need to make them individual jobs so that we get the time benefits. The Docker folks have a nice example of doing this: https://github.com/docker/build-push-action/blob/master/docs/advanced/share-image-jobs.md
I think this organization makes sense, especially if we are talking about reusing certain parts later.
Right, I think we can just get away with having a script that handles this; we might want to move the
Just makes grouping and removing dependencies a little easier, as you could do something like the two alternatives sketched below.
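Presumably the contrast here was between apk's virtual packages and plain per-package installs; the `.build-deps` name and the package list are illustrative guesses:

```sh
# Group the build-time packages under one virtual name, then drop them all at once:
apk add --no-cache --virtual .build-deps cmake ninja python3
# ... build ...
apk del .build-deps
```

vs.

```sh
# Track and remove each dependency by hand:
apk add --no-cache cmake ninja python3
# ... build ...
apk del cmake ninja python3
```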
Force-pushed from 609c40d to 25fc46e.
I think this is solid enough for now; I can nitpick style until the cows come home :)
It would be nice to add the comments on the CMake cache up front, but the rest of it seems reasonable.
Interesting, I see how this works. So it looks like you can serialize/deserialize a docker container to/from a tarball. It looks like it's just the upload/download artifact pieces we're using elsewhere. I see...
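Roughly, the mechanism in that example comes down to something like this (shown as plain docker commands rather than the action's YAML; the image tag and file name are illustrative):

```sh
# In the job that builds the image: serialize it to a tarball and hand it off
# as a workflow artifact.
docker build -t clang-bootstrap .
docker save clang-bootstrap -o /tmp/clang-bootstrap.tar
# (upload /tmp/clang-bootstrap.tar with actions/upload-artifact)

# In a dependent job: download the artifact and load the image back in.
docker load -i /tmp/clang-bootstrap.tar
docker run --rm clang-bootstrap clang --version
```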
SGTM. Perhaps "sysroot", since we will need to distribute clang, lld, and other LLVM binaries, but also a few compiler-supplied headers that are required. At that point, throwing in the libc++ headers + libraries wouldn't hurt. Then we have a sysroot that can be the basis of a musl- and libc++-based distro. ;) I'm going to save changing the directory structure for a follow-up PR where we focus more on the CI wiring.
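For illustration, such a sysroot might be assembled roughly like this; every path and file name below is a guess at the shape, not something the PR specifies:

```sh
#!/usr/bin/env bash
# Hypothetical sketch: collect compiler binaries, compiler-supplied headers,
# and the libc++ headers/libraries into one distributable tree.
set -e
mkdir -p sysroot/bin sysroot/lib sysroot/include/c++

cp -a install/bin/clang install/bin/lld install/bin/llvm-ar sysroot/bin/
cp -a install/lib/clang sysroot/lib/              # compiler-supplied headers (stddef.h, etc.)
cp -a install/include/c++/v1 sysroot/include/c++/ # libc++ headers
cp -a install/lib/libc++.a install/lib/libc++abi.a install/lib/libunwind.a sysroot/lib/

tar -czf sysroot.tar.gz -C sysroot .
```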
Ah, yeah, this seems a bit more descriptive. I wonder if you can repeat that command to "append" to the running list that a given identifier represents? Because I think I might want to install dependencies as late as possible ("just in time"), only when they are absolutely needed, but then clean everything up all at once. Though I do wonder if we can remove some dependencies as soon as they are no longer needed. That might be overkill.
Sure. I think code review of those to check that my decisions are sound would be helpful (even to clarify my own understanding). If we can't explain why a flag should be used, I feel it does not belong. Let me take the time to add some comments in this PR. Thanks both for all of the good tips; I'm impressed with how much better this looks than my initial commit! :)
llvm-project/stage2.cmake
# Use libunwind from stage1.
# Statically link resulting executable.
# TODO: why is -lc++abi necessary for static link?
Because static libraries do not carry `DT_NEEDED`, and thus we cannot get the transitive closure for the symbols and will fail to link because `lld` behaves incorrectly (undefined symbols are encouraged by ELF).
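To make that concrete: a static libc++.a has no record of its own dependencies the way a shared libc++.so does, so they have to be spelled out on the link line. A rough illustration (these commands are assumptions, not taken from the cache file):

```sh
# Shared case: libc++.so carries DT_NEEDED entries for libc++abi/libunwind,
# so the linker pulls them in transitively.
clang++ -stdlib=libc++ main.o -o main

# Static case: libc++.a has no such metadata, so the link fails with undefined
# ABI/unwinder symbols unless they are named explicitly.
clang++ -static -stdlib=libc++ main.o -lc++abi -lunwind -o main
```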
because lld behaves incorrectly (undefined symbols are encouraged by ELF).
"Behaves incorrectly?" Does BFD do something differently/correctly?
Yes, BFD does permit the undefined symbols, IIRC. However, re-reading the lines below, the executable should be fully linked and not have undefined symbols; that does make sense (even though it may be different from the default expected semantics).
That said, I think this is more about expectations and Unix defaults ... it's just different, and "correctness" here refers to the fact that it changes the expectations by default, which means the behaviour is now different, which for a drop-in replacement is arguably incorrect.
cc @MaskRay
# TODO: passing in the value of $(clang -print-multiarch) causes failures.
# It seems that alpine clang's default target triple is x86_64-linux-gnu.
# Perhaps missing alpine in the triple causes some incompatibility?
I don't think `clang -print-multiarch` prints the default target triple, it seems like we really want `clang -print-target-triple`?
On AArch64:
$ podman run --rm -ti docker.io/alpine:edge sh -c "apk add --no-cache clang 2>&1 >/dev/null && clang -print-target-triple"
aarch64-alpine-linux-musl
On x86_64:
$ podman run --rm -ti docker.io/alpine:edge sh -c "apk add --no-cache clang 2>&1 >/dev/null && clang -print-target-triple"
x86_64-alpine-linux-musl
Ah, makes sense that the flag for the target triple is named as such! My mistake for trying the wrong one. Let me give that a shot. Thanks for the tip.
idk, I'm still running into issues trying to replace this. Will look further some other time.
Are you trying to replace this within the cmake cache or in the Dockerfile? Happen to have a diff that you tried so I can see if I can reproduce locally?
I'm trying to have `cmake .... -D TRIPLE=$(clang -print-target-triple)` in the Dockerfile, then `set(LLVM_DEFAULT_TARGET_TRIPLE "${TRIPLE}")` in the stageX.cmake.
Ah yeah, I don't think that will work. I think we just ditch trying to preserve that in the cache and define the value during the cmake invocation. It is not a static value, so we shouldn't try to treat it as such.
`cmake -B … -DLLVM_DEFAULT_TARGET_TRIPLE=$(clang -print-target-triple) …`
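In the Dockerfile that would presumably end up looking something like the following; the build directory, source path, and cache file name are placeholders:

```sh
# Resolve the triple at configure time instead of baking it into stageX.cmake.
cmake -B build -S llvm-project/llvm -G Ninja -C stage2.cmake \
      -DLLVM_DEFAULT_TARGET_TRIPLE="$(clang -print-target-triple)"
```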
-fuse-ld=lld is already enabled by LLVM_ENABLE_LLD.
mucking with apk add re-untars sources, which takes a while.
This image still has object files built with GCC (from library dependencies), but it's a starting point for rebuilding dependencies from scratch.
This is less a request for code review and more a check that my "tagging" in build.sh is correct (not sure where to put version info, if any) and that the directory structure makes sense (llvm-project/ with multiple Dockerfiles for different stages?), plus a heads-up so I can publish this on our Docker registry, then start rebuilding dependencies from scratch.
Link: ClangBuiltLinux/tc-build#150