Replies: 6 comments 4 replies
-
Ultimately I think we need access to the underlying `cudaStream_t`. This comment in torch suggests that you can access it via the stream's `cuda_stream` attribute.
-
@Matt711

```python
import cupy
import torch
import rmm
import rmm.pylibrmm.stream

device = torch.device("cuda")
torch_stream = torch.cuda.Stream(device=device)
# PyTorch exposes the raw cudaStream_t as torch_stream.cuda_stream;
# wrap it in a CuPy ExternalStream, which RMM's Stream constructor accepts.
cupy_stream = cupy.cuda.ExternalStream(torch_stream.cuda_stream)
rmm_stream = rmm.pylibrmm.stream.Stream(cupy_stream)
print(rmm_stream)
print(rmm_stream.is_default())
rmm_stream.synchronize()
d_buffer = rmm.DeviceBuffer(size=10, stream=rmm_stream)
```
-
One small but useful feature that I propose is to replicate the behavior of CuPy and PyTorch in accepting an integer stream pointer when constructing a stream (as `cupy.cuda.ExternalStream` and `torch.cuda.ExternalStream` do). Another feature that I believe is essential, and should mimic the behavior of CuPy and PyTorch, is the ability to export the stream pointer as an `int` (CuPy exposes `ptr`; PyTorch exposes `cuda_stream`).
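For concreteness, here is the round trip that CuPy and PyTorch already support through plain integer pointers; the proposal is for RMM streams to participate in the same way (the RMM side is exactly what's missing, so it doesn't appear below):

```python
import cupy
import torch

torch_stream = torch.cuda.Stream()
ptr = torch_stream.cuda_stream               # export: raw cudaStream_t as an int
cupy_stream = cupy.cuda.ExternalStream(ptr)  # import: wrap an existing stream by pointer
assert cupy_stream.ptr == ptr                # CuPy exports the pointer as an int too
torch_again = torch.cuda.ExternalStream(cupy_stream.ptr)  # and back into PyTorch
```

Note that an `ExternalStream` in both libraries wraps a stream it does not own, so whichever library created the stream must keep it alive while the wrappers are in use.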
-
@leofang can you comment on how cuda.core is aiming to standardize cross-library stream references in Python?
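For context, my understanding (hedged, since cuda.core is still experimental and details may change) is that cuda.core standardizes this via a `__cuda_stream__` protocol: any object returning a `(version, handle)` pair can be consumed as a stream by a library that speaks the protocol. A minimal sketch of a foreign stream wrapper:

```python
class ForeignStream:
    """Sketch of an object exposing the cuda.core __cuda_stream__ protocol."""

    def __init__(self, handle: int):
        self._handle = handle  # raw cudaStream_t address as an int

    def __cuda_stream__(self):
        # (protocol version, stream handle as an int)
        return (0, self._handle)
```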
-
The relevant pieces from
I am planning to overhaul RMM's Python/Cython interface to improve interoperability for vocabulary types like CUDA streams that should be usable across libraries. I want to consolidate
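To illustrate one possible shape of that interoperability (all names here are hypothetical sketches, not the planned RMM API), a vocabulary stream type could normalize whatever it is given at construction time:

```python
class CudaStream:
    """Hypothetical vocabulary type holding a raw cudaStream_t handle."""

    def __init__(self, obj):
        if isinstance(obj, int):               # raw pointer
            self._handle = obj
        elif hasattr(obj, "__cuda_stream__"):  # cuda.core-style protocol
            _, self._handle = obj.__cuda_stream__()
        elif hasattr(obj, "cuda_stream"):      # torch.cuda.Stream
            self._handle = obj.cuda_stream
        elif hasattr(obj, "ptr"):              # cupy.cuda.Stream
            self._handle = obj.ptr
        else:
            raise TypeError(f"cannot interpret {type(obj)!r} as a CUDA stream")

    def __cuda_stream__(self):
        # Export in protocol form so other libraries can consume it.
        return (0, self._handle)
```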
-
As an end-user of both RMM and PyTorch, I posed the question regarding interoperable streams between PyTorch and RMM as #1829. The reason is that both the RMM and PyTorch APIs have their own stream implementations, and I'm interested in exploring the possibility of converting between PyTorch streams and RMM streams. I'm aware that, currently, the conversion has to go through an intermediary such as CuPy, as shown above.

I'll keep this discussion open until I receive word from NVIDIA indicating that it can be closed. Initially, my question was from the perspective of a high-level user of PyTorch and RMM, but it has since delved into the underlying implementation details.
-
This is also potentially a PyTorch-related question.

I'm aware that we can use RMM with PyTorch for efficient memory allocation. I also know that it's possible to create a stream in Python via `rmm.pylibrmm.stream`. Moreover, in C++ RMM there's even `rmm::cuda_stream_pool` for the efficient utilization of streams.

This leads me to wonder if it's possible to create an RMM stream (which is essentially a `cudaStream_t` under the hood) and then convert it to a PyTorch stream. And furthermore, in the Python world, is there something planned for the future similar to `rmm::cuda_stream_pool` that PyTorch users could also benefit from, in the form of a stream pool? See the sketches below for what I have in mind.

I did check around inside this repo but only found PyTorch with RMM for memory allocations: https://github.com/rapidsai/rmm/blob/branch-25.04/python/rmm/rmm/tests/test_rmm_pytorch.py
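For the RMM-to-PyTorch direction, one workaround (a sketch, not an official API; it relies on CuPy owning the stream and on keeping `cupy_stream` alive as long as the other views are in use) is to create the stream where the pointer is easy to read, then hand the same `cudaStream_t` to all three libraries:

```python
import cupy
import torch
import rmm.pylibrmm.stream

# One cudaStream_t, viewed by three libraries.
cupy_stream = cupy.cuda.Stream(non_blocking=True)
rmm_stream = rmm.pylibrmm.stream.Stream(cupy_stream)       # RMM view
torch_stream = torch.cuda.ExternalStream(cupy_stream.ptr)  # PyTorch view

with torch.cuda.stream(torch_stream):
    x = torch.ones(1024, device="cuda")  # work enqueued on the shared stream
rmm_stream.synchronize()                 # synchronizes the same underlying stream
```

On the stream-pool question: I'm not aware of a Python counterpart to `rmm::cuda_stream_pool` today, but a minimal round-robin pool is easy to sketch (a hypothetical helper, not an RMM API; it assumes `rmm.pylibrmm.stream.Stream()` with no arguments creates a new stream):

```python
import itertools
import rmm.pylibrmm.stream

class StreamPool:
    """Hypothetical round-robin pool mimicking rmm::cuda_stream_pool."""

    def __init__(self, size: int = 16):
        self._streams = [rmm.pylibrmm.stream.Stream() for _ in range(size)]
        self._cycle = itertools.cycle(self._streams)

    def get_stream(self) -> rmm.pylibrmm.stream.Stream:
        return next(self._cycle)
```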