Add Support for CRaC enabled JDK Distributions #500
Comments
Thanks for raising this. Presently, you could make this work by packaging up your own copy of the Bellsoft Liberica buildpack. You would first need to adjust sha256 and uri to point to the CRaC-enabled JVM you want, then package the buildpack. Lastly, you can consume your buildpack with the instructions in that link. Instead of specifying an alternative Paketo JVM buildpack, point to your image.

Long term, I think we're waiting to see how folks want to use CRaC with buildpacks. We can certainly bundle and install a CRaC-enabled JVM, but there's more work that needs to happen for CRaC to be useful. In particular, you need to start and run the app to generate the checkpoint. How long the app should run and what it should do before the checkpoint is taken are unclear, and buildpacks would have limits if they were to attempt to generate a checkpoint automatically. Going further, that checkpoint needs to live somewhere. Buildpacks could put it into the image, but it's unclear if that's what people would want or expect.

Since you are asking for this feature, we'd like to hear your thoughts on how you plan to use CRaC and what you'd expect buildpacks to do. That will help us create better support in buildpacks for this functionality. Thanks
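For the sha256 adjustment mentioned above, a minimal sketch of how the digest of a downloaded JVM archive could be computed. It assumes `sha256sum` from GNU coreutils is available; the function name and the archive file name are placeholders, not anything defined by the buildpack.

```shell
#!/bin/sh
# Print the SHA-256 digest of a downloaded archive, i.e. the value that
# would go into the `sha256` field next to the matching `uri`.
# archive_sha256 is a hypothetical helper name.
archive_sha256() {
  # sha256sum prints "<digest>  <file>"; keep only the digest column.
  sha256sum "$1" | cut -d ' ' -f 1
}

# Example call (placeholder file name):
#   archive_sha256 ./crac-enabled-jdk.tar.gz
```

The printed value must match the archive behind the `uri` you configure; a mismatch should make the download verification fail.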
Thanks for the hint about packaging a custom buildpack. It would just be more sustainable to have the mentioned support built in. I would like to use buildpacks as part of a more comprehensive pipeline (e.g. Tekton) for deployments on Kubernetes clusters. This pipeline consists of a sequence of steps. For example:
This could resemble the workflow in the build environment, where the first start-up would still be slow. But all future deployments (e.g. staging or production environments) would benefit from the CRaC-enabled image.
The trouble is that you can't do step 5.) there. Your app image is generated in step 2.) when you build with buildpacks. Once the image is written, you can't change it. OCI images are immutable (guaranteed via hashes). You could build a new one with that information included, but it's a rebuild of the image that produces a new image. So something like:
The second build could be pretty quick because of all the caching that we do. It's extra steps though. Another possibility is:
This is one step less, but requires a volume mount, and those can be a problem/non-starter for some users. The other option might be a config map, but I suspect the checkpoint info might be too large for that. Another is:
This has the fewest steps and is the most automated, but it is very difficult for buildpacks to start up a random app successfully. It might require resources that are not available, like a service (DB, message queue, etc.). We might be able to get a little farther if we constrain the types of apps supported, for example if we only support Spring apps. In that case, we might be able to start the app more reliably, but even then you could still have issues with required services. Yet another possibility is:
I can see some advantages to this approach, but it has the drawback of needing work done before the app starts, which takes time. Ultimately, CRaC is about starting the app super fast, so that extra work negates its benefits. Anyway, I appreciate your thoughts and feedback. If anyone else comes across this thread, please add your feedback too.
The two possibilities mentioned in between (second and third) seem a little hard to implement. Possibility 2 would require the creation (and deletion) of an additional Persistent Volume/ConfigMap for each version of the app in every environment. As far as possibility 3 is concerned: I'm afraid it won't be possible for the buildpack to start all containers as part of the build process in enterprise environments, because they often depend on a number of other resources (ConfigMaps, DBs, Leader/Follower instances...) that might only be available in the namespace where they get deployed afterwards (e.g. by a Helm chart that provides these artifacts).

But I would like to pick up on your first and your last proposed possibilities. Let's name them "Heavy" (=first) and "Light" (=last) for now. The heavyweight option includes the checkpoint data in the app's image itself and the lightweight option fetches the checkpoint info separately at startup.

Heavy
Pros:
Cons:
These cons are not present in the lightweight version. But "Light" comes with other challenges already mentioned.

Light
Pros:
Cons:
The cons of the "Light" version could be dealt with as described below.

Checkpoint Data Storage
Since the checkpoint data must be fetched on demand during app deployment, it needs a place to be published alongside its corresponding image version.
Option A:
Option B:

Fast App Start of "Light" Option on Kubernetes
The concern about the extra work needed for fetching the checkpoint data on the fly at startup might be addressed like this: the sidecar container that fetches the checkpoint data should write an additional file (e.g. "checkpoint-fetch-completed") when either no checkpoint was available or its download was completed.

Conclusion
The disadvantages of storing app image and checkpoint data separately can be handled by existing Kubernetes functionality. I suppose both the heavyweight and the lightweight solutions come with some trade-offs but could work.
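To make the marker-file idea from this comment more concrete, here is a rough sketch of what the app container's entrypoint logic might look like. It assumes the sidecar and the app share a directory (for example an emptyDir volume), that restoring uses the `-XX:CRaCRestoreFrom` option of a CRaC-enabled JDK, and that the function names, paths and timeout are purely illustrative.

```shell
#!/bin/sh
# Sketch of entrypoint helpers for the "Light" option.
# wait_for_marker and launch_command are hypothetical names.

# Block until the sidecar has written its marker file, or give up after
# the given number of seconds. Returns 0 if the marker exists, 1 on timeout.
wait_for_marker() {
  marker=$1
  timeout_s=$2
  waited=0
  while [ ! -e "$marker" ]; do
    [ "$waited" -ge "$timeout_s" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Print the JVM command line to use: restore when checkpoint files were
# fetched, otherwise fall back to a normal (cold) start of the jar.
launch_command() {
  checkpoint_dir=$1
  app_jar=$2
  if [ -d "$checkpoint_dir" ] && [ -n "$(ls -A "$checkpoint_dir" 2>/dev/null)" ]; then
    echo "java -XX:CRaCRestoreFrom=$checkpoint_dir"
  else
    echo "java -jar $app_jar"
  fi
}
```

An entrypoint could then call `wait_for_marker /shared/checkpoint-fetch-completed 30` and `exec` the output of `launch_command /shared/checkpoint /workspace/app.jar` (both paths are placeholders). Falling back to a cold start when the directory is empty covers the "no checkpoint available" case the marker file is meant to signal.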
The spring-boot buildpack has added CDS support, which already performs a training run that generates the CDS archive. This seems to me to be almost the same as supporting CRaC with a training run.
As with the CDS training run, you as a developer have to make sure that your Spring Boot application can be started in a training run without external resources. As suggested, I gave it a try and built my own bellsoft-liberica and my own spring-boot buildpack. In the training run the checkpoint files couldn't be generated due to missing privileges. When I start a training run by hand I pass --privileged to docker, which is missing here. I don't know if something similar is possible when a buildpack is executed.
What specifically do you set when you do this with a Dockerfile?
Nothing special in the Dockerfile, only when running the Docker image. Quoting from https://bell-sw.com/blog/how-to-use-crac-with-spring-boot-apps-in-a-docker-container/ :

    docker run -d --privileged -v $(pwd)/storage/:/storage/ -w /storage --name petclinic-app-container bellsoft/liberica-runtime-container:jdk-21-crac-slim-glibc java -Xmx512m -XX:CRaCCheckpointTo=/storage/checkpoint-spring-petclinic -jar spring-petclinic-3.2.0-SNAPSHOT.jar

Please note the --privileged option, which is necessary for the correct CRaC and the underlying criu executable behavior.
Ok, that's what I suspected. I don't think that'll work with buildpacks, because a lot of effort goes into running things as non-privileged users (we don't even run as root, let alone with a privileged flag). I'll ask around though and see what I can find out.
From what I can remember, adding 'only' the PTRACE capability might also be sufficient: docker run --cap-add=SYS_PTRACE

The thing with CRaC is that you need to put in manual effort to exclude individual parts of the memory that shouldn't be projected into the checkpoint file (e.g. stage-specific properties/variables/URLs/passwords). But last January, Azul Systems told me they were working on providing their CRaC-enabled version as part of the Paketo Buildpack. Not sure if this was pursued any further, though.
Description of Enhancement
The BP_JVM_TYPE variable only lets us choose between "JDK" and "JRE" at the moment.
It would be nice to have a new variable or an additional value available for choosing a CRaC-enabled JDK/JRE distribution.
The buildpack should then download and use the CRaC enabled distribution instead of the standard version.
Possible Solution
Add a new boolean variable "BP_CRAC_ENABLED".
Or add allowed values to the existing BP_JVM_TYPE variable, something like "JDK-CRAC" or "JRE-CRAC".
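To illustrate how the two proposals could behave, here is a hypothetical sketch that maps the settings to a distribution flavor. This is not how the buildpack is implemented; `resolve_jvm_flavor`, the flavor names it prints, and the assumed defaults (JRE, CRaC off) are illustrative only, and `BP_CRAC_ENABLED`, "JDK-CRAC" and "JRE-CRAC" are just the values proposed above.

```shell
#!/bin/sh
# Hypothetical mapping of the proposed settings to a distribution flavor.
# Assumption: BP_JVM_TYPE defaults to JRE and BP_CRAC_ENABLED to false.
resolve_jvm_flavor() {
  jvm_type=${BP_JVM_TYPE:-JRE}
  crac=${BP_CRAC_ENABLED:-false}
  case "$jvm_type" in
    # Proposal 2: extra allowed values on the existing variable.
    JDK-CRAC) echo "jdk-crac" ;;
    JRE-CRAC) echo "jre-crac" ;;
    # Proposal 1: a separate boolean switch next to JDK/JRE.
    JDK|JRE)
      base=$(printf '%s' "$jvm_type" | tr '[:upper:]' '[:lower:]')
      if [ "$crac" = "true" ]; then
        echo "$base-crac"
      else
        echo "$base"
      fi
      ;;
    *)
      echo "unsupported BP_JVM_TYPE: $jvm_type" >&2
      return 1
      ;;
  esac
}
```

Either form selects the same flavor; the boolean keeps the existing JDK/JRE values untouched, while the extra values avoid introducing a second variable.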
Motivation
We could keep using the existing buildpack whilst migrating to CRaC for snappy container start-ups.