
when to drop (or add) ppc support #2510

Open
minrk opened this issue Apr 23, 2025 · 22 comments
@minrk
Member

minrk commented Apr 23, 2025

Your question:

The linux-ppc64le target platform seems to regularly require the most effort of all platforms to keep working, while also being the least used by at least 1-2 orders of magnitude. I've been frustrated by this, and have started to remove support from my feedstocks if/when ppc failures are blocking otherwise working builds on platforms where I know people actually use these packages. Since I'm not actually sure the number of users of my ppc packages exceeds zero, it starts to feel pretty bad to spend all this volunteer effort on users who may not even exist.

Is there any guidance for maintainers on how to decide if/when to support a platform for a given package, or drop it if it used to work but is now causing problems? And maybe how to best go about (temporarily) stopping builds on a platform? I know removing the ppc line from conda-forge.yml works, as does skip: target_platform == 'linux-ppc64le', but I'm not sure what is best for bots and such, or if there's any information that should go elsewhere when dropping support, whether it's temporary, indefinitely, or permanently.

@minrk minrk added the question label Apr 23, 2025
@beckermr
Member

The bot adds ppc64le once and then moves on.

IMHO, maintainers should feel free to drop support if they do not have the energy for it. People who need it can come and add it back if they like.

@h-vetinari
Member

I understand the sentiment.

Whenever I looked at download numbers (e.g. 1, 2), PPC is at least a factor of 10 behind the second-least-used platform, and fewer than 1 in 350 users overall. For example, using current libzlib downloads as a proxy:

| artefact | uptime | downloads | % | 1 PPC user to <target> users |
| --- | --- | --- | --- | --- |
| win-arm64/libzlib-1.3.1-hfbbf558_2.conda | 6 months, 20 days | 174 | 0.00% | - |
| linux-ppc64le/libzlib-1.3.1-h190368a_2.conda | 6 months, 20 days | 44'392 | 0.27% | 1 |
| linux-aarch64/libzlib-1.3.1-h86ecc28_2.conda | 6 months, 20 days | 448'351 | 2.73% | 10 |
| osx-64/libzlib-1.3.1-hd23fc13_2.conda | 6 months, 20 days | 532'977 | 3.25% | 12 |
| osx-arm64/libzlib-1.3.1-h8359307_2.conda | 6 months, 20 days | 690'866 | 4.21% | 16 |
| win-64/libzlib-1.3.1-h2466b09_2.conda | 6 months, 20 days | 1'551'925 | 9.46% | 35 |
| linux-64/libzlib-1.3.1-hb9d3cd8_2.conda | 6 months, 20 days | 13'132'481 | 80.07% | 296 |
| overall | - | 16'401'166 | 100.00% | 369 |
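
For anyone who wants to recompute the share and ratio columns, here is a minimal Python sketch. The download counts are copied from the libzlib table above; the function name `summarize` and the `baseline` parameter are illustrative only and not part of any conda-forge tooling.

```python
# Download counts copied from the libzlib table above.
downloads = {
    "win-arm64": 174,
    "linux-ppc64le": 44_392,
    "linux-aarch64": 448_351,
    "osx-64": 532_977,
    "osx-arm64": 690_866,
    "win-64": 1_551_925,
    "linux-64": 13_132_481,
}


def summarize(counts, baseline="linux-ppc64le"):
    """Map each platform to (share of all downloads in %, downloads per baseline download)."""
    total = sum(counts.values())
    base = counts[baseline]
    return {
        platform: (100 * n / total, n / base)
        for platform, n in counts.items()
    }


summary = summarize(downloads)
for platform, (share, ratio) in summary.items():
    print(f"{platform:15s} {share:6.2f}% {ratio:8.1f}")
```

The same function can be pointed at any other package's per-platform counts, as long as the baseline platform is present in the dictionary.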

Qualitatively, the numbers are the same for openssl:

| artefact | uptime | downloads | % | 1 PPC user to <target> users |
| --- | --- | --- | --- | --- |
| linux-ppc64le/openssl-3.5.0-hede31bd_0.conda | 15 days | 3'502 | 0.21% | 1 |
| linux-aarch64/openssl-3.5.0-hd08dc88_0.conda | 15 days | 47'608 | 2.81% | 14 |
| osx-arm64/openssl-3.5.0-h81ee809_0.conda | 15 days | 81'340 | 4.79% | 23 |
| osx-64/openssl-3.5.0-hc426f3f_0.conda | 15 days | 104'865 | 6.18% | 30 |
| win-64/openssl-3.5.0-ha4e3fda_0.conda | 15 days | 172'111 | 10.14% | 49 |
| linux-64/openssl-3.5.0-h7b32b05_0.conda | 15 days | 1'287'324 | 75.87% | 368 |
| overall | - | 1'696'750 | 100% | 485 |

OTOH, several 1000s of users isn't nothing either 🤷

Historically, we had a lot of problems with emulation in numpy/scipy/cvxpy, which at some point led to dropped builds and/or skipped testing. Recently, I've had fewer issues with PPC though. If a recipe builds on linux-aarch64, it usually also builds on linux-ppc64le, so I've kept support where it wasn't getting in the way.

The fact that CUDA dropped support for PPC is relevant here as well, and we're missing certain key packages like pytorch too.

> IMHO, maintainers should feel free to drop support if they do not have the energy for it. People who need it can come and add it back if they like.

A couple of years ago when support was first added, there were (non-core) contributors who helped get the arches going on several feedstocks I maintained, but then disappeared. I felt responsible for keeping things running and wasted a lot of time on ppc support at the time. So it's easy as a maintainer to get "trapped" in that sense.

I don't think it's realistic that conda-forge ever drops ppc support, but I'd be OK to empower maintainers to drop ppc support and reject it even if someone comes with a PR. Unless that person signs up for future maintenance, it will just mean ppc gets dropped at the next issue that comes up.

@mfansler
Member

> OTOH, several 1000s of users isn't nothing either 🤷

What fraction of that is Conda Forge CI? According to the migration status, those both have 2.2K children, so it seems possible that building other ppc64le packages could drive many downloads.

@h-vetinari
Member

> What fraction of that is Conda Forge CI?

It should be essentially 0. I don't know the specifics, but downloads from our CI aren't (supposed to be) counted. The first ~10 downloads or so are due to various mirrors(?) or so (at least, I've never seen a main package with fewer than that), but after that it should only be cf-external users.

@minrk
Member Author

minrk commented Apr 24, 2025

> It should be essentially 0

Perhaps it should be, but I'm not sure it is. I just triggered a rebuild of slepc and all the petsc downloads incremented by one. It seems unlikely to me that something else downloaded all the ppc builds the exact same number of times the CI builds did in the same 20 minutes I was watching, when they only had 7 downloads before. It does seem like every CI download is counted.

@beckermr
Member

Yes my guess is that CI downloads are counted.

@hmaarrfk
Contributor

I find that PPC64le struggles a lot with the graphics stack. My understanding is that PPC64le is used exclusively in data centers, so I've been liberal about dropping it when it causes me problems for those packages...

@h-vetinari
Member

> Perhaps it should be, but I'm not sure it is. I just triggered a rebuild of slepc and all the petsc downloads incremented by one.

I don't have the exact mechanics, but roughly: the first download is when the package makes it through the CDN, and as I mentioned

> The first ~10 downloads or so are due to various mirrors(?) or so

By "10", I literally mean ten. IME, anything <=10-12 downloads is completely unused (again, I don't know which factors exactly conspire for that to be the case).

> Yes my guess is that CI downloads are counted.

Hm. I had thought that this was taken into account (and had convinced myself of this based on some anecdotal evidence). We'd be generating 100'000s of downloads daily, so that would make up a very substantial chunk of our download numbers.

@minrk
Member Author

minrk commented Apr 30, 2025

Not at all scientific, and I'm not sure it's informative, but in the time since you posted numbers for openssl 3.5.0 (6 days), the download numbers have increased by:

| platform | increase |
| --- | --- |
| linux-64 | 631,289 |
| linux-aarch64 | 24,715 |
| linux-ppc64le | 1,389 |
| osx-64 | 33,760 |
| osx-arm64 | 36,817 |
| win-64 | 70,574 |

with this number of non-cancelled builds per target platform on azure in a similar time frame:

| name | count |
| --- | --- |
| linux-64 | 6,652 |
| linux-aarch64 | 2,540 |
| linux-ppc64le | 1,793 |
| linux-s390x | 5 |
| osx-64 | 4,008 |
| osx-arm64 | 2,805 |
| win-64 | 2,927 |
| win-arm64 | 79 |

As I understand it, during CI setup openssl gets downloaded for the build platform. Assuming all linux builds are cross-compiled (not the case, but probably close), that would only account for ~10k of 630k linux-64 downloads, and 7k of 33k osx-64 downloads. I don't know how common openssl is as a host dependency. Notably, ppc is the only platform where the build count exceeds the total download count. No other platform comes anywhere close.

This obviously doesn't take into account lots of information, so it might be useless:

  • ci builds not on azure
  • no idea how often openssl shows up in build, host, or run dependencies
  • multi-output builds, test env installation
  • caching?

But I think another point in favor of CI installs being counted is that osx-64 is downloaded almost as much as osx-arm64, when ~all macs sold in the last 6 years have been arm, and the arm mac miniforge installer is about 10 times as popular as osx-64, while mac CI builds for conda-forge still all run on osx-64. I suspect conda-forge CI accounts for a pretty large fraction of counted osx-64 downloads.
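
The comparison between the two tables can be reproduced with a short Python sketch. The numbers are copied from the tables above; the assumption that every linux-* build runs on linux-64 and every osx-* build on osx-64 is the same simplification made in the text, and the helper name `ci_upper_bound` is purely illustrative.

```python
# Increase in openssl 3.5.0 downloads over ~6 days (first table above).
download_increase = {
    "linux-64": 631_289,
    "linux-aarch64": 24_715,
    "linux-ppc64le": 1_389,
    "osx-64": 33_760,
    "osx-arm64": 36_817,
    "win-64": 70_574,
}

# Non-cancelled Azure builds per target platform (second table above).
builds = {
    "linux-64": 6_652,
    "linux-aarch64": 2_540,
    "linux-ppc64le": 1_793,
    "linux-s390x": 5,
    "osx-64": 4_008,
    "osx-arm64": 2_805,
    "win-64": 2_927,
    "win-arm64": 79,
}

# Target platforms where builds outnumber the counted downloads.
exceeds = [p for p, n in download_increase.items() if builds[p] > n]


def ci_upper_bound(prefix, build_platform):
    """Builds whose target starts with `prefix`, and their share of the
    build platform's downloads.

    Assumes (as in the text) one openssl download per build, fetched for
    the build platform because all of these builds are cross-compiled.
    """
    ci = sum(n for p, n in builds.items() if p.startswith(prefix))
    return ci, ci / download_increase[build_platform]


linux_ci, linux_share = ci_upper_bound("linux-", "linux-64")
osx_ci, osx_share = ci_upper_bound("osx-", "osx-64")
print(exceeds)
print(f"linux-64: {linux_ci} builds, {linux_share:.1%} of downloads")
print(f"osx-64:   {osx_ci} builds, {osx_share:.1%} of downloads")
```

This only covers Azure builds and a "similar" time frame, so the shares are rough and subject to the caveats listed above.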

@minrk
Member Author

minrk commented Apr 30, 2025

> about 10 times as popular as osx-64

Oops, I missed that there were 2 urls per platform. ARM mac downloads are only 50% higher than Intel.

@h-vetinari
Member

> ARM mac downloads are only 50% higher than Intel.

This is consistent with numbers I'm seeing across the board in recent months [1].

> But I think another point in favor of CI installs being counted is that osx-64 is downloaded almost as much as osx-arm64, when ~all macs sold in the last 6 years have been arm

I've been puzzled by the longevity of the osx-64 numbers, but OTOH, the latest macOS 15 still supports hardware from ~2018 onwards, and these old(er) devices are still around in big numbers [2].

> Notably, ppc is the only platform where the build count exceeds the total download count.

Regardless of the exact mechanics (i.e. whether it's downloads from the ci-setup, or build/host/run), I think that's a pretty solid argument that CI downloads are not being counted. :)

Footnotes

  1. the fact that osx-64 was ahead of osx-arm64 for openssl was an outlier, not the rule

  2. I won't speculate to what degree macro trends like e.g. a worsening economy might influence more people to keep using their old hardware.

@hmaarrfk
Contributor

I know one user of ppc64le, @jeongseok-meta. Perhaps you want to share your "user story" so we can better understand one example of the need for ppc64le packages in the real world.

@hmaarrfk
Contributor

I also have a friend who used to work at IBM and used ppc64le, but honestly, jayfurmanek was the ppc64le champion at the time, and I understand he no longer works there. I'm not sure how much those ppc64le supercomputers are used there anymore.

@leofang
Member

leofang commented Apr 30, 2025

FWIW, CUDA dropped ppc64le support entirely by CUDA 12.5. In my (personal) opinion, the maintenance overhead of ppc packages on conda-forge is too high, and we should drop it too, sooner rather than later.

@minrk
Member Author

minrk commented Apr 30, 2025

> I think that's a pretty solid argument that CI downloads are not being counted. :)

I'm not sure I follow there. Due to cross compiling, only builds which have openssl in host or run dependencies (or the rare emulated builds) would increment the download count for arm/ppc. If most but not all packages depend on openssl, then slightly lower but similar order is exactly what I'd expect to see if all or at least most CI downloads were counted. So to me, these numbers indicate that CI downloads are counted and also account for a likely very large fraction of ppc installs.

Since I seem to be able to reliably at-will increment download counts by triggering CI and watching anaconda.org numbers go up by the exact number of builds, I think we can confidently say that CI downloads are counted, at least up to some point.
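
To make that reasoning concrete, here is a small Python sketch under loudly stated assumptions: every CI download is counted, each build that has openssl in host or run dependencies triggers exactly one ppc download, and the build and download windows line up. The function `non_ci_downloads` and the fraction values tried are hypothetical; only the two ppc numbers come from the tables earlier in the thread.

```python
def non_ci_downloads(downloads, builds, fraction_with_openssl):
    """Downloads left over for non-CI users, assuming each build that has
    openssl in host/run dependencies causes exactly one counted download."""
    ci_downloads = builds * fraction_with_openssl
    return max(downloads - ci_downloads, 0.0)


# linux-ppc64le numbers from the tables earlier in the thread.
ppc_downloads, ppc_builds = 1_389, 1_793

# Fraction of builds at which CI alone would explain every ppc download.
break_even = ppc_downloads / ppc_builds

for fraction in (0.25, 0.5, 0.75, break_even):
    remaining = non_ci_downloads(ppc_downloads, ppc_builds, fraction)
    print(f"fraction={fraction:.2f} -> non-CI downloads <= {remaining:.0f}")
```

Under these assumptions, if roughly three quarters of ppc builds pull in openssl, CI alone accounts for all of the counted ppc downloads in that window.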

@beckermr
Member

Yes, for sure CI downloads are counted. Anaconda would have to track and build reject lists for the IP addresses of all of the Microsoft-hosted Azure, Travis, and other CI provider workers to exclude them, which seems a priori impossible or would require so much effort as to not be worth the cost.

@jeongseok-meta

> I know one user of ppc64le, @jeongseok-meta. Perhaps you want to share your "user story" so we can better understand one example of the need for ppc64le packages in the real world.

Thanks for pinging. I don't currently have a use case for ppc64le. I've just attempted to support it by ensuring its availability in the conda-forge ecosystem, and as such, I also have the same question regarding this issue. ;)

@isuruf
Member

isuruf commented May 1, 2025

ppc64le is used in supercomputers like Summit and Lassen by maintainers like @matthiasdiener. If a package is having specific problems with ppc64le, it's okay to drop it.

@hmaarrfk
Contributor

hmaarrfk commented May 1, 2025

Would it be acceptable to "split" the migrator for PPC64le away from the aarch migrator?

@isuruf
Member

isuruf commented May 1, 2025

Sure, but that requires some coding on the bot.

@hmaarrfk
Contributor

hmaarrfk commented May 3, 2025

I really tried to find where the bot reads the aarch migration file but couldn’t find it. Can you point me to it?

@isuruf
Member

isuruf commented May 5, 2025

It's at https://github.com/regro/cf-scripts/blob/main/conda_forge_tick/migrators/arch.py#L134


8 participants