Allow NVRTC to compile more of CUB #3951

Merged: 5 commits, Feb 27, 2025


@bernhardmgruber (Contributor) commented Feb 26, 2025

I mainly used this to find more places where we include std headers instead of cuda/std headers.
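
As a rough illustration of the kind of include being hunted for here, the sketch below scans source text for standard library headers that have a `cuda/std` counterpart. It is only a textual stand-in: the PR itself relied on compiling with NVRTC, which typically has no host standard library headers on its include path, so such includes show up as compile errors. The function name `find_std_includes` and the header set `CUDA_STD_CANDIDATES` are hypothetical and deliberately non-exhaustive.

```python
import re

# Assumption: a small, non-exhaustive set of standard headers that libcu++
# also provides under <cuda/std/...>. Extend as needed.
CUDA_STD_CANDIDATES = {
    "array", "cstddef", "cstdint", "limits", "tuple", "type_traits", "utility",
}

# Matches lines such as `#include <limits>` or `#  include <utility>`.
INCLUDE_RE = re.compile(r"^\s*#\s*include\s*<([^>]+)>")


def find_std_includes(source: str):
    """Return (line_number, header, suggested_replacement) for flagged includes."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = INCLUDE_RE.match(line)
        if match and match.group(1) in CUDA_STD_CANDIDATES:
            header = match.group(1)
            hits.append((lineno, header, f"cuda/std/{header}"))
    return hits


# Hypothetical input, not taken from the PR's diff.
sample = """#include <cuda/std/type_traits>
#include <limits>
#include <cub/config.cuh>
  #  include <utility>
"""

for lineno, header, suggestion in find_std_includes(sample):
    print(f"line {lineno}: <{header}> could become <{suggestion}>")
```

A text scan like this cannot see includes pulled in transitively or guarded by preprocessor conditions, which is one reason an actual NVRTC compile is the more reliable check.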


🟩 CI finished in 1h 33m: Pass: 100%/93 | Total: 2d 15h | Avg: 41m 03s | Max: 1h 16m | Hits: 72%/133929
  • 🟩 cub: Pass: 100%/45 | Total: 1d 16h | Avg: 54m 10s | Max: 1h 16m | Hits: 64%/53485

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 14h | Avg: 53m 49s | Max:  1h 16m | Hits:  64%/51055 
      🟩 arm64              Pass: 100%/2   | Total:  2h 03m | Avg:  1h 01m | Max:  1h 05m | Hits:  61%/2430  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 54m | Avg: 58m 55s | Max:  1h 04m | Hits:  53%/5908  
      🟩 12.5               Pass: 100%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 14m | Hits:  48%/2248  
      🟩 12.8               Pass: 100%/38  | Total:  1d 09h | Avg: 52m 33s | Max:  1h 16m | Hits:  66%/45329 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 04m | Hits:  67%/2100  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 54m | Avg: 58m 55s | Max:  1h 04m | Hits:  53%/5908  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 14m | Hits:  48%/2248  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  1d 07h | Avg: 51m 55s | Max:  1h 16m | Hits:  66%/43229 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 04m | Hits:  67%/2100  
      🟩 nvcc               Pass: 100%/43  | Total:  1d 14h | Avg: 53m 43s | Max:  1h 16m | Hits:  64%/51385 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 52m | Avg: 58m 09s | Max:  1h 00m | Hits:  62%/4868  
      🟩 Clang15            Pass: 100%/2   | Total:  1h 50m | Avg: 55m 18s | Max: 55m 26s | Hits:  62%/2430  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 59m | Avg: 59m 46s | Max:  1h 00m | Hits:  62%/2430  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 58m | Avg: 59m 19s | Max:  1h 02m | Hits:  62%/2430  
      🟩 Clang18            Pass: 100%/7   | Total:  5h 41m | Avg: 48m 50s | Max:  1h 04m | Hits:  74%/8175  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 59m | Avg: 59m 31s | Max:  1h 01m | Hits:  61%/2434  
      🟩 GCC8               Pass: 100%/1   | Total: 57m 23s | Avg: 57m 23s | Max: 57m 23s | Hits:  61%/1217  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 54m | Avg: 57m 07s | Max: 57m 23s | Hits:  61%/2434  
      🟩 GCC10              Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 01m | Hits:  61%/2434  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 57m | Avg: 58m 33s | Max:  1h 00m | Hits:  61%/2430  
      🟩 GCC12              Pass: 100%/2   | Total:  2h 05m | Avg:  1h 02m | Max:  1h 03m | Hits:  61%/2430  
      🟩 GCC13              Pass: 100%/11  | Total:  7h 07m | Avg: 38m 53s | Max:  1h 16m | Hits:  82%/13365 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 17m | Avg:  1h 08m | Max:  1h 13m | Hits:  13%/2080  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 14m | Hits:  13%/2080  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 14m | Hits:  48%/2248  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 15h 23m | Avg: 54m 18s | Max:  1h 04m | Hits:  67%/20333 
      🟩 GCC                Pass: 100%/22  | Total: 18h 02m | Avg: 49m 13s | Max:  1h 16m | Hits:  72%/26744 
      🟩 MSVC               Pass: 100%/4   | Total:  4h 45m | Avg:  1h 11m | Max:  1h 14m | Hits:  13%/4160  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 14m | Hits:  48%/2248  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 14m | Avg: 24m 45s | Max: 27m 01s | Hits:  87%/3645  
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 11h | Avg:  1h 02m | Max:  1h 16m | Hits:  56%/40120 
      🟩 rtxa6000           Pass: 100%/8   | Total:  4h 07m | Avg: 30m 54s | Max:  1h 00m | Hits:  90%/9720  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 13h | Avg:  1h 01m | Max:  1h 16m | Hits:  56%/43765 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 21m 21s | Avg: 21m 21s | Max: 21m 21s | Hits:  99%/1215  
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 50s | Avg: 16m 50s | Max: 16m 50s | Hits:  99%/1215  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 12m | Avg: 24m 17s | Max: 25m 40s | Hits:  99%/3645  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 06m | Avg: 22m 09s | Max: 25m 14s | Hits:  99%/3645  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 14m | Avg: 24m 45s | Max: 27m 01s | Hits:  87%/3645  
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 16m | Avg:  1h 16m | Max:  1h 16m | Hits:  61%/1215  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 20h 40m | Avg:  1h 02m | Max:  1h 14m | Hits:  55%/23535 
      🟩 20                 Pass: 100%/25  | Total: 19h 57m | Avg: 47m 54s | Max:  1h 16m | Hits:  72%/29950 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 21h 54m | Avg: 29m 12s | Max: 56m 00s | Hits: 76%/80136

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 33m 50s | Avg: 16m 55s | Max: 23m 25s | Hits:  88%/3564  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 21h 02m | Avg: 29m 22s | Max: 56m 00s | Hits:  76%/76573 
      🟩 arm64              Pass: 100%/2   | Total: 51m 37s | Avg: 25m 48s | Max: 27m 40s | Hits:  76%/3563  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 47m | Avg: 33m 24s | Max: 51m 47s | Hits:  72%/8901  
      🟩 12.5               Pass: 100%/2   | Total:  1h 47m | Avg: 53m 41s | Max: 56m 00s | Hits:  62%/3562  
      🟩 12.8               Pass: 100%/38  | Total: 17h 20m | Avg: 27m 22s | Max: 55m 16s | Hits:  78%/67673 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 49m 37s | Avg: 24m 48s | Max: 26m 35s | Hits:  77%/3562  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 47m | Avg: 33m 24s | Max: 51m 47s | Hits:  72%/8901  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 47m | Avg: 53m 41s | Max: 56m 00s | Hits:  62%/3562  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 16h 30m | Avg: 27m 30s | Max: 55m 16s | Hits:  78%/64111 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 49m 37s | Avg: 24m 48s | Max: 26m 35s | Hits:  77%/3562  
      🟩 nvcc               Pass: 100%/43  | Total: 21h 04m | Avg: 29m 25s | Max: 56m 00s | Hits:  76%/76574 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 53m | Avg: 28m 16s | Max: 29m 52s | Hits:  76%/7124  
      🟩 Clang15            Pass: 100%/2   | Total: 55m 41s | Avg: 27m 50s | Max: 28m 10s | Hits:  76%/3562  
      🟩 Clang16            Pass: 100%/2   | Total: 59m 10s | Avg: 29m 35s | Max: 30m 18s | Hits:  76%/3562  
      🟩 Clang17            Pass: 100%/2   | Total: 55m 06s | Avg: 27m 33s | Max: 28m 19s | Hits:  76%/3562  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 25m | Avg: 20m 50s | Max: 27m 32s | Hits:  83%/12467 
      🟩 GCC7               Pass: 100%/2   | Total: 57m 19s | Avg: 28m 39s | Max: 28m 49s | Hits:  76%/3564  
      🟩 GCC8               Pass: 100%/1   | Total: 30m 03s | Avg: 30m 03s | Max: 30m 03s | Hits:  76%/1782  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 02m | Avg: 31m 02s | Max: 31m 40s | Hits:  76%/3564  
      🟩 GCC10              Pass: 100%/2   | Total: 59m 12s | Avg: 29m 36s | Max: 31m 28s | Hits:  76%/3564  
      🟩 GCC11              Pass: 100%/2   | Total: 57m 37s | Avg: 28m 48s | Max: 29m 01s | Hits:  76%/3564  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 01m | Avg: 30m 52s | Max: 30m 58s | Hits:  76%/3564  
      🟩 GCC13              Pass: 100%/10  | Total:  3h 27m | Avg: 20m 44s | Max: 31m 40s | Hits:  84%/17820 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 40m | Avg: 50m 29s | Max: 51m 47s | Hits:  54%/3550  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  2h 21m | Avg: 47m 16s | Max: 55m 16s | Hits:  59%/5325  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 47m | Avg: 53m 41s | Max: 56m 00s | Hits:  62%/3562  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  7h 08m | Avg: 25m 13s | Max: 30m 18s | Hits:  79%/30277 
      🟩 GCC                Pass: 100%/21  | Total:  8h 55m | Avg: 25m 29s | Max: 31m 40s | Hits:  80%/37422 
      🟩 MSVC               Pass: 100%/5   | Total:  4h 02m | Avg: 48m 33s | Max: 55m 16s | Hits:  57%/8875  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 47m | Avg: 53m 41s | Max: 56m 00s | Hits:  62%/3562  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 30m 18s | Avg: 15m 09s | Max: 18m 58s | Hits:  88%/3564  
      🟩 rtx2080            Pass: 100%/33  | Total: 17h 44m | Avg: 32m 15s | Max: 56m 00s | Hits:  73%/58769 
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 39m | Avg: 21m 58s | Max: 55m 13s | Hits:  84%/17803 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 20h 21m | Avg: 32m 08s | Max: 56m 00s | Hits:  73%/67671 
      🟩 TestCPU            Pass: 100%/3   | Total: 46m 39s | Avg: 15m 33s | Max: 31m 19s | Hits:  90%/5338  
      🟩 TestGPU            Pass: 100%/4   | Total: 46m 39s | Avg: 11m 39s | Max: 14m 43s | Hits:  96%/7127  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 30m 18s | Avg: 15m 09s | Max: 18m 58s | Hits:  88%/3564  
      🟩 90;90a;100         Pass: 100%/1   | Total: 31m 24s | Avg: 31m 24s | Max: 31m 24s | Hits:  76%/1782  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 11h 17m | Avg: 33m 53s | Max: 56m 00s | Hits:  72%/35611 
      🟩 20                 Pass: 100%/23  | Total: 10h 03m | Avg: 26m 13s | Max: 55m 13s | Hits:  79%/40961 
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 05s | Avg: 7m 32s | Max: 12m 31s | Hits: 97%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 31s | Hits:  97%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 31s | Hits:  97%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 31s | Hits:  97%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 31s | Hits:  97%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 31s | Hits:  97%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 31s | Hits:  97%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 05s | Avg:  7m 32s | Max: 12m 31s | Hits:  97%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 34s | Avg:  2m 34s | Max:  2m 34s | Hits:  96%/154   
      🟩 Test               Pass: 100%/1   | Total: 12m 31s | Avg: 12m 31s | Max: 12m 31s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 51m 05s | Avg: 51m 05s | Max: 51m 05s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 51m 05s | Avg: 51m 05s | Max: 51m 05s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 51m 05s | Avg: 51m 05s | Max: 51m 05s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 51m 05s | Avg: 51m 05s | Max: 51m 05s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 51m 05s | Avg: 51m 05s | Max: 51m 05s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 51m 05s | Avg: 51m 05s | Max: 51m 05s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 51m 05s | Avg: 51m 05s | Max: 51m 05s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 51m 05s | Avg: 51m 05s | Max: 51m 05s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 51m 05s | Avg: 51m 05s | Max: 51m 05s
    

👃 Inspect Changes

Modifications in project?

     Project
     CCCL Infrastructure
     libcu++
+/-  CUB
+/-  Thrust
     CUDA Experimental
     python
     CCCL C Parallel Library
     Catch2Helper

Modifications in project or dependencies?

     Project
     CCCL Infrastructure
     libcu++
+/-  CUB
+/-  Thrust
     CUDA Experimental
+/-  python
+/-  CCCL C Parallel Library
+/-  Catch2Helper

🏃‍ Runner counts (total jobs: 93)

# Runner
66 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1

@leofang (Member) commented Feb 27, 2025

btw, there is other ongoing NVRTC-related work in #3699

@leofang leofang self-requested a review February 27, 2025 07:30
bernhardmgruber and others added 4 commits February 27, 2025 10:23
Co-authored-by: Michael Schellenberger Costa <miscco@nvidia.com>
Co-authored-by: Michael Schellenberger Costa <miscco@nvidia.com>
@miscco miscco enabled auto-merge (squash) February 27, 2025 09:33

🟩 CI finished in 1h 24m: Pass: 100%/93 | Total: 2d 13h | Avg: 39m 59s | Max: 1h 14m | Hits: 75%/133929
  • 🟩 cub: Pass: 100%/45 | Total: 1d 15h | Avg: 52m 31s | Max: 1h 14m | Hits: 69%/53485

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 13h | Avg: 52m 12s | Max:  1h 14m | Hits:  69%/51055 
      🟩 arm64              Pass: 100%/2   | Total:  1h 58m | Avg: 59m 08s | Max: 59m 53s | Hits:  68%/2430  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 58m | Avg: 59m 45s | Max:  1h 08m | Hits:  58%/5908  
      🟩 12.5               Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 07m | Hits:  63%/2248  
      🟩 12.8               Pass: 100%/38  | Total:  1d 08h | Avg: 50m 52s | Max:  1h 14m | Hits:  71%/45329 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 57m | Avg: 58m 45s | Max: 59m 37s | Hits:  73%/2100  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 58m | Avg: 59m 45s | Max:  1h 08m | Hits:  58%/5908  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 07m | Hits:  63%/2248  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  1d 06h | Avg: 50m 25s | Max:  1h 14m | Hits:  71%/43229 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 57m | Avg: 58m 45s | Max: 59m 37s | Hits:  73%/2100  
      🟩 nvcc               Pass: 100%/43  | Total:  1d 13h | Avg: 52m 13s | Max:  1h 14m | Hits:  69%/51385 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 47m | Avg: 56m 45s | Max: 59m 53s | Hits:  67%/4868  
      🟩 Clang15            Pass: 100%/2   | Total:  1h 52m | Avg: 56m 14s | Max: 57m 43s | Hits:  67%/2430  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 56m | Avg: 58m 05s | Max: 59m 44s | Hits:  67%/2430  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 53m | Avg: 56m 54s | Max: 57m 26s | Hits:  67%/2430  
      🟩 Clang18            Pass: 100%/7   | Total:  5h 35m | Avg: 47m 57s | Max: 59m 53s | Hits:  78%/8175  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 54m | Avg: 57m 03s | Max: 58m 19s | Hits:  67%/2434  
      🟩 GCC8               Pass: 100%/1   | Total: 59m 47s | Avg: 59m 47s | Max: 59m 47s | Hits:  67%/1217  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 58m | Avg: 59m 18s | Max:  1h 01m | Hits:  67%/2434  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 56m | Avg: 58m 25s | Max:  1h 00m | Hits:  67%/2434  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 53m | Avg: 56m 44s | Max: 57m 35s | Hits:  67%/2430  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 54m | Avg: 57m 29s | Max: 58m 41s | Hits:  67%/2430  
      🟩 GCC13              Pass: 100%/11  | Total:  6h 36m | Avg: 36m 02s | Max:  1h 05m | Hits:  85%/13365 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 14m | Hits:  15%/2080  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 14m | Hits:  15%/2080  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 07m | Hits:  63%/2248  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 15h 05m | Avg: 53m 14s | Max: 59m 53s | Hits:  72%/20333 
      🟩 GCC                Pass: 100%/22  | Total: 17h 14m | Avg: 47m 00s | Max:  1h 05m | Hits:  76%/26744 
      🟩 MSVC               Pass: 100%/4   | Total:  4h 52m | Avg:  1h 13m | Max:  1h 14m | Hits:  15%/4160  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 07m | Hits:  63%/2248  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 08m | Avg: 22m 55s | Max: 24m 43s | Hits:  88%/3645  
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 10h | Avg:  1h 00m | Max:  1h 14m | Hits:  62%/40120 
      🟩 rtxa6000           Pass: 100%/8   | Total:  4h 05m | Avg: 30m 41s | Max: 57m 21s | Hits:  91%/9720  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 12h | Avg: 59m 08s | Max:  1h 14m | Hits:  62%/43765 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 22m 46s | Avg: 22m 46s | Max: 22m 46s | Hits:  99%/1215  
      🟩 GraphCapture       Pass: 100%/1   | Total: 18m 26s | Avg: 18m 26s | Max: 18m 26s | Hits:  99%/1215  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 09m | Avg: 23m 18s | Max: 25m 12s | Hits:  99%/3645  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 04m | Avg: 21m 21s | Max: 22m 08s | Hits:  99%/3645  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 08m | Avg: 22m 55s | Max: 24m 43s | Hits:  88%/3645  
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 05m | Avg:  1h 05m | Max:  1h 05m | Hits:  67%/1215  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 20h 12m | Avg:  1h 00m | Max:  1h 14m | Hits:  60%/23535 
      🟩 20                 Pass: 100%/25  | Total: 19h 11m | Avg: 46m 03s | Max:  1h 14m | Hits:  76%/29950 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 21h 28m | Avg: 28m 38s | Max: 51m 22s | Hits: 78%/80136

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total:  1h 03m | Avg: 31m 35s | Max: 38m 51s | Hits:  64%/3564  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 20h 39m | Avg: 28m 49s | Max: 51m 22s | Hits:  78%/76573 
      🟩 arm64              Pass: 100%/2   | Total: 49m 06s | Avg: 24m 33s | Max: 26m 11s | Hits:  81%/3563  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 37m | Avg: 31m 32s | Max: 43m 05s | Hits:  78%/8901  
      🟩 12.5               Pass: 100%/2   | Total:  1h 39m | Avg: 49m 58s | Max: 51m 22s | Hits:  63%/3562  
      🟩 12.8               Pass: 100%/38  | Total: 17h 10m | Avg: 27m 07s | Max: 44m 54s | Hits:  79%/67673 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 46m 48s | Avg: 23m 24s | Max: 23m 58s | Hits:  79%/3562  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 37m | Avg: 31m 32s | Max: 43m 05s | Hits:  78%/8901  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 39m | Avg: 49m 58s | Max: 51m 22s | Hits:  63%/3562  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 16h 24m | Avg: 27m 20s | Max: 44m 54s | Hits:  79%/64111 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 46m 48s | Avg: 23m 24s | Max: 23m 58s | Hits:  79%/3562  
      🟩 nvcc               Pass: 100%/43  | Total: 20h 41m | Avg: 28m 52s | Max: 51m 22s | Hits:  78%/76574 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 51m | Avg: 27m 56s | Max: 29m 31s | Hits:  79%/7124  
      🟩 Clang15            Pass: 100%/2   | Total: 55m 03s | Avg: 27m 31s | Max: 28m 01s | Hits:  79%/3562  
      🟩 Clang16            Pass: 100%/2   | Total: 55m 14s | Avg: 27m 37s | Max: 28m 36s | Hits:  79%/3562  
      🟩 Clang17            Pass: 100%/2   | Total: 54m 15s | Avg: 27m 07s | Max: 27m 18s | Hits:  79%/3562  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 20m | Avg: 20m 04s | Max: 26m 58s | Hits:  85%/12467 
      🟩 GCC7               Pass: 100%/2   | Total: 57m 20s | Avg: 28m 40s | Max: 29m 32s | Hits:  79%/3564  
      🟩 GCC8               Pass: 100%/1   | Total: 26m 53s | Avg: 26m 53s | Max: 26m 53s | Hits:  79%/1782  
      🟩 GCC9               Pass: 100%/2   | Total: 57m 08s | Avg: 28m 34s | Max: 28m 46s | Hits:  79%/3564  
      🟩 GCC10              Pass: 100%/2   | Total: 58m 39s | Avg: 29m 19s | Max: 29m 37s | Hits:  79%/3564  
      🟩 GCC11              Pass: 100%/2   | Total: 59m 21s | Avg: 29m 40s | Max: 29m 47s | Hits:  79%/3564  
      🟩 GCC12              Pass: 100%/2   | Total: 59m 51s | Avg: 29m 55s | Max: 30m 41s | Hits:  79%/3564  
      🟩 GCC13              Pass: 100%/10  | Total:  4h 09m | Avg: 24m 58s | Max: 38m 51s | Hits:  79%/17820 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 25m | Avg: 42m 35s | Max: 43m 05s | Hits:  70%/3550  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 57m | Avg: 39m 12s | Max: 44m 54s | Hits:  70%/5325  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 39m | Avg: 49m 58s | Max: 51m 22s | Hits:  63%/3562  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  6h 56m | Avg: 24m 31s | Max: 29m 31s | Hits:  82%/30277 
      🟩 GCC                Pass: 100%/21  | Total:  9h 28m | Avg: 27m 05s | Max: 38m 51s | Hits:  79%/37422 
      🟩 MSVC               Pass: 100%/5   | Total:  3h 22m | Avg: 40m 33s | Max: 44m 54s | Hits:  70%/8875  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 39m | Avg: 49m 58s | Max: 51m 22s | Hits:  63%/3562  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 45m 35s | Avg: 22m 47s | Max: 27m 49s | Hits:  71%/3564  
      🟩 rtx2080            Pass: 100%/33  | Total: 16h 48m | Avg: 30m 34s | Max: 51m 22s | Hits:  78%/58769 
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 54m | Avg: 23m 25s | Max: 44m 54s | Hits:  82%/17803 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 19h 13m | Avg: 30m 21s | Max: 51m 22s | Hits:  78%/67671 
      🟩 TestCPU            Pass: 100%/3   | Total: 46m 33s | Avg: 15m 31s | Max: 30m 56s | Hits:  90%/5338  
      🟩 TestGPU            Pass: 100%/4   | Total:  1h 28m | Avg: 22m 09s | Max: 38m 51s | Hits:  78%/7127  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 45m 35s | Avg: 22m 47s | Max: 27m 49s | Hits:  71%/3564  
      🟩 90;90a;100         Pass: 100%/1   | Total: 32m 31s | Avg: 32m 31s | Max: 32m 31s | Hits:  79%/1782  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 10h 26m | Avg: 31m 18s | Max: 48m 35s | Hits:  77%/35611 
      🟩 20                 Pass: 100%/23  | Total:  9h 59m | Avg: 26m 02s | Max: 51m 22s | Hits:  81%/40961 
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 27s | Avg: 7m 43s | Max: 13m 01s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 27s | Avg:  7m 43s | Max: 13m 01s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 27s | Avg:  7m 43s | Max: 13m 01s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 27s | Avg:  7m 43s | Max: 13m 01s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 27s | Avg:  7m 43s | Max: 13m 01s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 27s | Avg:  7m 43s | Max: 13m 01s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 27s | Avg:  7m 43s | Max: 13m 01s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 27s | Avg:  7m 43s | Max: 13m 01s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 26s | Avg:  2m 26s | Max:  2m 26s | Hits:  97%/154   
      🟩 Test               Pass: 100%/1   | Total: 13m 01s | Avg: 13m 01s | Max: 13m 01s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 51m 45s | Avg: 51m 45s | Max: 51m 45s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 51m 45s | Avg: 51m 45s | Max: 51m 45s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 51m 45s | Avg: 51m 45s | Max: 51m 45s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 51m 45s | Avg: 51m 45s | Max: 51m 45s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 51m 45s | Avg: 51m 45s | Max: 51m 45s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 51m 45s | Avg: 51m 45s | Max: 51m 45s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 51m 45s | Avg: 51m 45s | Max: 51m 45s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 51m 45s | Avg: 51m 45s | Max: 51m 45s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 51m 45s | Avg: 51m 45s | Max: 51m 45s
    


@miscco miscco merged commit ac268c3 into NVIDIA:main Feb 27, 2025
107 of 110 checks passed
@bernhardmgruber bernhardmgruber deleted the more_nvrtc branch February 27, 2025 22:29