Implement pinning for container images consumed in dotnet/runtime #113455

Open
richlander opened this issue Mar 12, 2025 · 6 comments
Labels
area-Infrastructure untriaged New issue has not been triaged by the area owner

Comments

@richlander
Member

The dotnet/runtime repo CI is tough to maintain on a good day. We try to use a config-as-code approach wherever possible so that repo changes are controlled via merged PRs. Some systems do not play nicely with that, including the prereqs container images. Container images are a system we control, so we can adopt a different plan for them.

I want to use this issue as a formal request to invest in an automated code-flow style pinning system. It's another team that will implement this, but I wanted to create the issue here since the need is here. The work has already started (yeahh!) but I wanted to do this properly with a well-described request.

We had a conversation on this topic last year. @jkotas shared some good context on why this is needed. It stands on its own, so I'll not explain that further.

More recently, we wanted to move to clang 20 in an existing container image which was already in active use in our main branch. We knew that there was a prior break in clang 20 that required a runtime change. That was a sort of "where there's smoke, there's fire" signal. We decided to proactively pin the container images in runtime by hand to avoid any more risk. This required writing a tool to make that practical. We were then able to safely upgrade to clang 20 in the existing image. That proved smart since there was significant breakage. We discovered that on a Friday afternoon and were able to not care about it over the weekend since all of the breakage was contained within a PR.

We could continue to use this manual pattern if we believed that we could reliably predict and act on all of the prereqs images where there will be breakage. That's not a good bet. Instead, we believe that a code-flow model would be best. I propose a nightly digest model where all stale digests are updated in a single PR with an outer-loop run. That's intended as a starting-point proposal.
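As a rough illustration of the nightly digest idea, here is a minimal Python sketch, not the tool mentioned above and not how Renovate or any other existing tool works. It assumes images are pinned in config files as `repo:tag@sha256:<digest>`, and that a nightly job has already gathered the current digest for each tag; the function name `update_pins` and the `latest_digests` mapping are hypothetical, and the registry query itself is left out.

```python
import re

# Matches pinned references of the assumed form "repo:tag@sha256:<64 hex chars>".
# Simplification: registry hosts with an explicit port are not handled.
PINNED_REF = re.compile(
    r"(?P<image>[\w./-]+:[\w.-]+)@(?P<digest>sha256:[0-9a-f]{64})"
)


def update_pins(text, latest_digests):
    """Return (new_text, changed) with stale digests in `text` replaced.

    `latest_digests` maps "repo:tag" to its current digest. In a real job this
    mapping would come from a registry query; here the caller supplies it.
    """
    changed = []

    def replace(match):
        image = match.group("image")
        current = latest_digests.get(image)
        if current is None or current == match.group("digest"):
            return match.group(0)  # unknown image or already up to date
        changed.append(image)
        return f"{image}@{current}"

    return PINNED_REF.sub(replace, text), changed
```

A nightly job could run this over every pipeline file and open a single PR only when `changed` is non-empty, so that any breakage from a new image stays contained in that PR.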

This idea can be extended to other modalities. We had significant challenges deploying Azure Linux VMs in runtime. This also resulted in a roughly one-day break of runtime CI as part of the process. That wasn't fun. It would be great if we (as an org) could no-fear-deploy VMs at any time via a similar config-as-code codeflow process.

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Mar 12, 2025
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Mar 12, 2025
@mthalman
Member

This is essentially a request for this feature: dotnet/dotnet-buildtools-prereqs-docker#1321

Also related: dotnet/arcade#15594

Contributor

Tagging subscribers to this area: @dotnet/runtime-infrastructure
See info in area-owners.md if you want to be subscribed.

@jkoritzinsky
Member

We should also follow the same strategy in the VMR as it uses the same images to build the runtime repo's code.

@steveisok
Member

Why can't we use / expand darc for something like this? I guess that's along the same lines of dotnet/dotnet-buildtools-prereqs-docker#1321

Could be an AI opportunity ;-)

@mthalman
Member

> Why can't we use / expand darc for something like this? I guess that's along the same lines of dotnet/dotnet-buildtools-prereqs-docker#1321
>
> Could be an AI opportunity ;-)

Read through dotnet/arcade#15594 if you haven't already. Renovate already supports these scenarios. I don't think anyone's interested in reimplementing Renovate and stuffing it into darc.

@richlander
Member Author

Put a different way, we should be biasing toward off-the-shelf tools and reducing our reliance on custom ones.

@vcsjones vcsjones removed the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Mar 14, 2025