Description
Problem
I would like to load `zarr` data directly onto non-CPU devices (especially GPUs). The current approach appears to rely on `cupy` to load onto `cupy`-supported devices, e.g. https://github.com/rapidsai/kvikio/blob/branch-25.02/notebooks/zarr.ipynb. Unfortunately, a number of devices are not supported by `cupy`; for example, I don't believe my Apple Metal GPU is. This means that to use these devices I must load from `zarr` via the CPU, e.g. `zarr` on disk -> `numpy` -> `torch` (which has Metal support). This is slower, and I don't believe it is necessary given the `zarr` specification alone (?).
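For concreteness, a minimal sketch of the CPU round-trip described above, assuming a hypothetical array at `./weights.zarr` and a PyTorch build with Metal (MPS) support:

```python
import zarr
import torch

# Hypothetical on-disk array; zarr-python decodes the chunks into
# host (CPU) memory as a NumPy array.
arr = zarr.open_array("./weights.zarr", mode="r")
np_data = arr[:]  # zarr on disk -> numpy (CPU)

# numpy -> torch, plus an extra host-to-device copy onto the Metal GPU.
tensor = torch.from_numpy(np_data).to("mps")
```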
Background
Multi-device support is an important requirement in the AI/ML community. I would like to use `zarr` (and specifically the Python implementation) to run models such as LLMs on multiple devices. The faster a model can be loaded onto a device (and with lower memory usage etc.), the better the user and developer experience.
Questions
- Is `cupy` the correct/only way to load directly to the GPU with `zarr-python`?
- Is there/will there be any way of loading directly to devices such as Metal with `zarr-python`?
- (Related) What is the best way to load a PyTorch neural network onto the GPU with `zarr-python`? Is it `cupy` and then something like DLPack for zero-copy exchange (see the sketch below)? Are there alternatives?
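For context, here is a minimal sketch of the `cupy` + DLPack path the last question refers to, assuming a CUDA device and a CuPy array already populated (e.g. by a GPU-backed `zarr` read via kvikio, stubbed out here with `cp.arange`):

```python
import cupy as cp
import torch

# Hypothetical stand-in for a zarr chunk already read onto the GPU.
cp_arr = cp.arange(10, dtype=cp.float32)

# Zero-copy exchange via the DLPack protocol: the torch tensor views
# the same CUDA memory rather than copying it back through the host.
tensor = torch.from_dlpack(cp_arr)
assert tensor.device.type == "cuda"
```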
cc @jhamman (as suggested by @TomNicholas)