Skip to content

[ISSUE-815] Generate Random Numbers Asynchronously on the GPU #859

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 37 commits into
base: SharedDevelopment
Choose a base branch
from

Conversation

AndrewBMadison
Copy link

@AndrewBMadison AndrewBMadison commented Jun 10, 2025

Closes #815

Description

Replaced the custom Mersenne Twister GPU kernel with an AsyncPhilox_d class that asynchronously fills GPU buffers with random noise using cuRAND's Philox generator. The class supports double-buffering and is designed for concurrent execution.

GPUModel initializes Philox states and fills two initial buffers via loadPhilox() on a member AsyncPhilox_d instance. During each advance() call, requestSegment() retrieves a float* slice from the currently active buffer, sized appropriately for each vertex and ready to be used in advanceVertices().

Once a buffer is consumed, fillBuffer() is triggered on the other buffer while the current one continues to serve slices. This ensures continuous data availability through double-buffering.

AsyncPhilox_d uses its own internal CUDA stream to launch fill kernels asynchronously. To enable true concurrency, all other compute kernels needed to also use non-default streams. This is necessary because stream 0 (the default stream) implicitly synchronizes with all other streams, preventing concurrent execution and causing the scheduler to serialize kernel launches even when they could run in parallel.

Checklist (Mandatory for new features)

  • Added Documentation
  • Added Unit Tests

Testing (Mandatory for all changes)

  • GPU Test: test-medium-connected.xml Passed
  • GPU Test: test-large-long.xml Passed

@AndrewBMadison AndrewBMadison self-assigned this Jun 11, 2025
@AndrewBMadison AndrewBMadison requested a review from stiber June 11, 2025 01:25
Copy link
Contributor

@stiber stiber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor cleanup.

@AndrewBMadison
Copy link
Author

Minor cleanup.

I have implemented all the changes you requested as well as renamed the stream used by all the synchronous kernels to simulationStream (simulationStream_ as a member variable). I have also added new documentation to the developer docs and linked it into index.md. The old MersenneTwister files are still there if anyone wanted to try it out again, but if you'd like I could remove them.

@AndrewBMadison AndrewBMadison requested a review from stiber June 23, 2025 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants