Use device functions that accept pointer arguments in cccl.c and cuda.parallel #4249

Draft: @shwina wants to merge 8 commits into main from fix-struct-types-handling

Conversation

shwina
Contributor

@shwina shwina commented Mar 24, 2025

Description

Closes #4372

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

copy-pr-bot bot commented Mar 24, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

cccl-authenticator-app bot moved this from Todo to In Progress in CCCL on Mar 24, 2025
@shwina force-pushed the fix-struct-types-handling branch from 6804f02 to 4c3cbee on April 10, 2025 20:21
@shwina changed the title from "Fix struct types handling to eliminate ABI mismatch between C++ and Numba" to "Use device functions that accept pointer arguments in cccl.c and cuda.parallel" on Apr 10, 2025
@shwina
Contributor Author

shwina commented Apr 10, 2025

/ok to test

1 similar comment
@shwina
Contributor Author

shwina commented Apr 10, 2025

/ok to test

Contributor

🟨 CI finished in 1h 37m: Pass: 66%/3 | Total: 2h 00m | Avg: 40m 05s | Max: 1h 36m | Hits: 98%/164
  • 🟨 cccl_c_parallel: Pass: 50%/2 | Total: 23m 39s | Avg: 11m 49s | Max: 21m 21s | Hits: 98%/164

    🚨 jobs: Test 🚨
      🟩 Build              Pass: 100%/1   | Total:  2m 18s | Avg:  2m 18s | Max:  2m 18s | Hits:  98%/164   
      🔥 Test               Pass:   0%/1   | Total: 21m 21s | Avg: 21m 21s | Max: 21m 21s
    🟨 cpu
      🟨 amd64              Pass:  50%/2   | Total: 23m 39s | Avg: 11m 49s | Max: 21m 21s | Hits:  98%/164   
    🟨 ctk
      🟨 12.8               Pass:  50%/2   | Total: 23m 39s | Avg: 11m 49s | Max: 21m 21s | Hits:  98%/164   
    🟨 cudacxx
      🟨 nvcc12.8           Pass:  50%/2   | Total: 23m 39s | Avg: 11m 49s | Max: 21m 21s | Hits:  98%/164   
    🟨 cudacxx_family
      🟨 nvcc               Pass:  50%/2   | Total: 23m 39s | Avg: 11m 49s | Max: 21m 21s | Hits:  98%/164   
    🟨 cxx
      🟨 GCC13              Pass:  50%/2   | Total: 23m 39s | Avg: 11m 49s | Max: 21m 21s | Hits:  98%/164   
    🟨 cxx_family
      🟨 GCC                Pass:  50%/2   | Total: 23m 39s | Avg: 11m 49s | Max: 21m 21s | Hits:  98%/164   
    🟨 gpu
      🟨 rtx2080            Pass:  50%/2   | Total: 23m 39s | Avg: 11m 49s | Max: 21m 21s | Hits:  98%/164   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 36m | Avg: 1h 36m | Max: 1h 36m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 36m | Avg:  1h 36m | Max:  1h 36m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 36m | Avg:  1h 36m | Max:  1h 36m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 36m | Avg:  1h 36m | Max:  1h 36m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 36m | Avg:  1h 36m | Max:  1h 36m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 36m | Avg:  1h 36m | Max:  1h 36m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 36m | Avg:  1h 36m | Max:  1h 36m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 36m | Avg:  1h 36m | Max:  1h 36m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 36m | Avg:  1h 36m | Max:  1h 36m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
stdpar
+/- python
+/- CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
stdpar
+/- python
+/- CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 3)

# Runner
2 linux-amd64-gpu-rtx2080-latest-1
1 linux-amd64-cpu16

@shwina
Contributor Author

shwina commented Apr 11, 2025

/ok to test

Contributor

🟩 CI finished in 1h 50m: Pass: 100%/3 | Total: 1h 55m | Avg: 38m 38s | Max: 1h 31m | Hits: 98%/328
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 24m 34s | Avg: 12m 17s | Max: 22m 10s | Hits: 98%/328

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 24m 34s | Avg: 12m 17s | Max: 22m 10s | Hits:  98%/328   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 24m 34s | Avg: 12m 17s | Max: 22m 10s | Hits:  98%/328   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 24m 34s | Avg: 12m 17s | Max: 22m 10s | Hits:  98%/328   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 24m 34s | Avg: 12m 17s | Max: 22m 10s | Hits:  98%/328   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 24m 34s | Avg: 12m 17s | Max: 22m 10s | Hits:  98%/328   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 24m 34s | Avg: 12m 17s | Max: 22m 10s | Hits:  98%/328   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 24m 34s | Avg: 12m 17s | Max: 22m 10s | Hits:  98%/328   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 24s | Avg:  2m 24s | Max:  2m 24s | Hits:  98%/164   
      🟩 Test               Pass: 100%/1   | Total: 22m 10s | Avg: 22m 10s | Max: 22m 10s | Hits:  98%/164   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 31m | Avg: 1h 31m | Max: 1h 31m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 31m | Avg:  1h 31m | Max:  1h 31m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 31m | Avg:  1h 31m | Max:  1h 31m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 31m | Avg:  1h 31m | Max:  1h 31m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 31m | Avg:  1h 31m | Max:  1h 31m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 31m | Avg:  1h 31m | Max:  1h 31m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 31m | Avg:  1h 31m | Max:  1h 31m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 31m | Avg:  1h 31m | Max:  1h 31m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 31m | Avg:  1h 31m | Max:  1h 31m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
stdpar
+/- python
+/- CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
stdpar
+/- python
+/- CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 3)

# Runner
2 linux-amd64-gpu-rtx2080-latest-1
1 linux-amd64-cpu16

@shwina
Contributor Author

shwina commented Apr 17, 2025

/ok to test

2 similar comments
@shwina
Contributor Author

shwina commented Apr 17, 2025

/ok to test

@shwina
Contributor Author

shwina commented Apr 17, 2025

/ok to test

Contributor

🟨 CI finished in 26m 38s: Pass: 60%/5 | Total: 58m 10s | Avg: 11m 38s | Max: 23m 28s | Hits: 98%/160
  • 🟨 python: Pass: 66%/3 | Total: 32m 20s | Avg: 10m 46s | Max: 16m 39s

    🚨 jobs: cuda.parallel 🚨
      🟩 cuda.cccl          Pass: 100%/1   | Total:  9m 57s | Avg:  9m 57s | Max:  9m 57s
      🟩 cuda.cooperative   Pass: 100%/1   | Total: 16m 39s | Avg: 16m 39s | Max: 16m 39s
      🔥 cuda.parallel      Pass:   0%/1   | Total:  5m 44s | Avg:  5m 44s | Max:  5m 44s
    🟨 cpu
      🟨 amd64              Pass:  66%/3   | Total: 32m 20s | Avg: 10m 46s | Max: 16m 39s
    🟨 ctk
      🟨 12.8               Pass:  66%/3   | Total: 32m 20s | Avg: 10m 46s | Max: 16m 39s
    🟨 cudacxx
      🟨 nvcc12.8           Pass:  66%/3   | Total: 32m 20s | Avg: 10m 46s | Max: 16m 39s
    🟨 cudacxx_family
      🟨 nvcc               Pass:  66%/3   | Total: 32m 20s | Avg: 10m 46s | Max: 16m 39s
    🟨 cxx
      🟨 GCC13              Pass:  66%/3   | Total: 32m 20s | Avg: 10m 46s | Max: 16m 39s
    🟨 cxx_family
      🟨 GCC                Pass:  66%/3   | Total: 32m 20s | Avg: 10m 46s | Max: 16m 39s
    🟨 gpu
      🟨 rtx2080            Pass:  66%/3   | Total: 32m 20s | Avg: 10m 46s | Max: 16m 39s
    
  • 🟨 cccl_c_parallel: Pass: 50%/2 | Total: 25m 50s | Avg: 12m 55s | Max: 23m 28s | Hits: 98%/160

    🚨 jobs: Test 🚨
      🟩 Build              Pass: 100%/1   | Total:  2m 22s | Avg:  2m 22s | Max:  2m 22s | Hits:  98%/160   
      🔥 Test               Pass:   0%/1   | Total: 23m 28s | Avg: 23m 28s | Max: 23m 28s
    🟨 cpu
      🟨 amd64              Pass:  50%/2   | Total: 25m 50s | Avg: 12m 55s | Max: 23m 28s | Hits:  98%/160   
    🟨 ctk
      🟨 12.8               Pass:  50%/2   | Total: 25m 50s | Avg: 12m 55s | Max: 23m 28s | Hits:  98%/160   
    🟨 cudacxx
      🟨 nvcc12.8           Pass:  50%/2   | Total: 25m 50s | Avg: 12m 55s | Max: 23m 28s | Hits:  98%/160   
    🟨 cudacxx_family
      🟨 nvcc               Pass:  50%/2   | Total: 25m 50s | Avg: 12m 55s | Max: 23m 28s | Hits:  98%/160   
    🟨 cxx
      🟨 GCC13              Pass:  50%/2   | Total: 25m 50s | Avg: 12m 55s | Max: 23m 28s | Hits:  98%/160   
    🟨 cxx_family
      🟨 GCC                Pass:  50%/2   | Total: 25m 50s | Avg: 12m 55s | Max: 23m 28s | Hits:  98%/160   
    🟨 gpu
      🟨 rtx2080            Pass:  50%/2   | Total: 25m 50s | Avg: 12m 55s | Max: 23m 28s | Hits:  98%/160   
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
stdpar
+/- python
+/- CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
stdpar
+/- python
+/- CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 5)

# Runner
4 linux-amd64-gpu-rtx2080-latest-1
1 linux-amd64-cpu16

@shwina force-pushed the fix-struct-types-handling branch from ed9fc21 to c6ab4a0 on April 22, 2025 15:03
@shwina
Contributor Author

shwina commented Apr 22, 2025

/ok to test

Contributor

🟩 CI finished in 35m 56s: Pass: 100%/5 | Total: 1h 05m | Avg: 13m 05s | Max: 32m 42s | Hits: 95%/324
  • 🟩 python: Pass: 100%/3 | Total: 30m 23s | Avg: 10m 07s | Max: 16m 10s

    🟩 cpu
      🟩 amd64              Pass: 100%/3   | Total: 30m 23s | Avg: 10m 07s | Max: 16m 10s
    🟩 ctk
      🟩 12.8               Pass: 100%/3   | Total: 30m 23s | Avg: 10m 07s | Max: 16m 10s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/3   | Total: 30m 23s | Avg: 10m 07s | Max: 16m 10s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/3   | Total: 30m 23s | Avg: 10m 07s | Max: 16m 10s
    🟩 cxx
      🟩 GCC13              Pass: 100%/3   | Total: 30m 23s | Avg: 10m 07s | Max: 16m 10s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/3   | Total: 30m 23s | Avg: 10m 07s | Max: 16m 10s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/3   | Total: 30m 23s | Avg: 10m 07s | Max: 16m 10s
    🟩 jobs
      🟩 cuda.cccl          Pass: 100%/1   | Total:  7m 16s | Avg:  7m 16s | Max:  7m 16s
      🟩 cuda.cooperative   Pass: 100%/1   | Total: 16m 10s | Avg: 16m 10s | Max: 16m 10s
      🟩 cuda.parallel      Pass: 100%/1   | Total:  6m 57s | Avg:  6m 57s | Max:  6m 57s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 35m 04s | Avg: 17m 32s | Max: 32m 42s | Hits: 95%/324

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 35m 04s | Avg: 17m 32s | Max: 32m 42s | Hits:  95%/324   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 35m 04s | Avg: 17m 32s | Max: 32m 42s | Hits:  95%/324   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 35m 04s | Avg: 17m 32s | Max: 32m 42s | Hits:  95%/324   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 35m 04s | Avg: 17m 32s | Max: 32m 42s | Hits:  95%/324   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 35m 04s | Avg: 17m 32s | Max: 32m 42s | Hits:  95%/324   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 35m 04s | Avg: 17m 32s | Max: 32m 42s | Hits:  95%/324   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 35m 04s | Avg: 17m 32s | Max: 32m 42s | Hits:  95%/324   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 22s | Avg:  2m 22s | Max:  2m 22s | Hits:  92%/162   
      🟩 Test               Pass: 100%/1   | Total: 32m 42s | Avg: 32m 42s | Max: 32m 42s | Hits:  98%/162   
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
stdpar
+/- python
+/- CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
CUDA Experimental
stdpar
+/- python
+/- CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 5)

# Runner
4 linux-amd64-gpu-rtx2080-latest-1
1 linux-amd64-cpu16

@griwes
Contributor

griwes commented Apr 23, 2025

You probably want this to also deal with the changes from #3439 now that it is merged.

Labels
None yet
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

Fix custom type handling in cuda.parallel
2 participants