Fix thrust::raw_reference_cast for tuple_of_iterator_references and simplify thrust::generate #3970

Merged: 10 commits, Mar 1, 2025

Conversation

bernhardmgruber
Contributor

@bernhardmgruber bernhardmgruber commented Feb 28, 2025

This PR fixes an issue with thrust::raw_reference_cast for a tuple_of_iterator_references that does not contain any wrapped references. The fix allows simplifying thrust::generate, which in turn makes the test for calling thrust::generate with const iterators unnecessary. With that test gone, the runtime_static_assert "feature" of the Thrust unit test framework can be dropped as well.

@bernhardmgruber bernhardmgruber requested a review from a team as a code owner February 28, 2025 21:28
Comment on lines +22 to +24
[[maybe_unused]] auto zip = thrust::make_zip_iterator(vec.begin(), vec.begin());
static_assert(
is_same_v<decltype(thrust::raw_reference_cast(*zip)), thrust::detail::tuple_of_iterator_references<int&, int&>>);
Contributor Author

@bernhardmgruber bernhardmgruber Feb 28, 2025

Before this PR, thrust::raw_reference_cast(*zip) would return const thrust::detail::tuple_of_iterator_references<int&, int&>& (const- and reference-qualified) if vec is a host vector. It already worked correctly for a device vector.
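The difference between the two return types can be sketched without Thrust. The following is a minimal C++17 illustration, not Thrust's implementation: ref_tuple, cast_passthrough, cast_by_value and writes_through_copy are hypothetical stand-ins invented for this example. It only shows how decltype tells a const- and reference-qualified result apart from a by-value tuple of references, which is what the static_assert in the test above checks.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>

// Hypothetical stand-in for thrust::detail::tuple_of_iterator_references.
template <class... Ts>
struct ref_tuple
{
  std::tuple<Ts...> refs;
};

// Pass-through flavour (assumed shape of the old behaviour): the argument is
// handed back unchanged, so the result is const- and reference-qualified.
template <class... Ts>
const ref_tuple<Ts...>& cast_passthrough(const ref_tuple<Ts...>& t)
{
  return t;
}

// By-value flavour (assumed shape of the fixed behaviour): a new tuple holding
// the same references is returned, with no const or reference qualifier.
template <class... Ts>
ref_tuple<Ts...> cast_by_value(const ref_tuple<Ts...>& t)
{
  return t;
}

// Returns true if writes through the by-value result reach the originals.
bool writes_through_copy()
{
  int a = 1;
  int b = 2;
  ref_tuple<int&, int&> refs{std::tie(a, b)};

  // decltype on a call expression exposes the qualifiers of the return type.
  static_assert(std::is_same_v<decltype(cast_passthrough(refs)), const ref_tuple<int&, int&>&>);
  static_assert(std::is_same_v<decltype(cast_by_value(refs)), ref_tuple<int&, int&>>);

  // The copy still refers to a and b, because its elements are references.
  auto copy = cast_by_value(refs);
  assert(&std::get<0>(copy.refs) == &a);
  std::get<0>(copy.refs) = 10;
  std::get<1>(copy.refs) = 20;
  return a == 10 && b == 20;
}
```

Calling writes_through_copy() from main exercises both static_asserts at compile time. Returning by value costs nothing semantically here, since copying a tuple of references copies the references, not the referenced objects.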

@bernhardmgruber
Contributor Author

I am seeing a lot of these failures with nvcc 12.8:

  /home/coder/cccl/lib/cmake/thrust/../../../thrust/thrust/mr/disjoint_pool.h(221): error #20011-D: calling a __host__ function("thrust::THRUST_300000_SM_600_700_800_NS::host_vector< ::thrust::THRUST_300000_SM_600_700_800_NS::pointer<void,  ::thrust::THRUST_300000_SM_600_700_800_NS::cuda_cub::tag,  ::thrust::THRUST_300000_SM_600_700_800_NS::tagged_reference<void,  ::thrust::THRUST_300000_SM_600_700_800_NS::cuda_cub::tag> ,  ::thrust::THRUST_300000_SM_600_700_800_NS::use_default> ,  ::thrust::THRUST_300000_SM_600_700_800_NS::mr::allocator< ::thrust::THRUST_300000_SM_600_700_800_NS::pointer<void,  ::thrust::THRUST_300000_SM_600_700_800_NS::cuda_cub::tag,  ::thrust::THRUST_300000_SM_600_700_800_NS::tagged_reference<void,  ::thrust::THRUST_300000_SM_600_700_800_NS::cuda_cub::tag> ,  ::thrust::THRUST_300000_SM_600_700_800_NS::use_default> ,  ::thrust::THRUST_300000_SM_600_700_800_NS::mr::new_delete_resource> > ::operator =(const thrust::THRUST_300000_SM_600_700_800_NS::host_vector< ::thrust::THRUST_300000_SM_600_700_800_NS::pointer<void,  ::thrust::THRUST_300000_SM_600_700_800_NS::cuda_cub::tag,  ::thrust::THRUST_300000_SM_600_700_800_NS::tagged_reference<void,  ::thrust::THRUST_300000_SM_600_700_800_NS::cuda_cub::tag> ,  ::thrust::THRUST_300000_SM_600_700_800_NS::use_default> ,  ::thrust::THRUST_300000_SM_600_700_800_NS::mr::allocator< ::thrust::THRUST_300000_SM_600_700_800_NS::pointer<void,  ::thrust::THRUST_300000_SM_600_700_800_NS::cuda_cub::tag,  ::thrust::THRUST_300000_SM_600_700_800_NS::tagged_reference<void,  ::thrust::THRUST_300000_SM_600_700_800_NS::cuda_cub::tag> ,  ::thrust::THRUST_300000_SM_600_700_800_NS::use_default> ,  ::thrust::THRUST_300000_SM_600_700_800_NS::mr::new_delete_resource> > &)") from a __host__ __device__ function("thrust::THRUST_300000_SM_600_700_800_NS::mr::disjoint_unsynchronized_pool_resource< ::thrust::THRUST_300000_SM_600_700_800_NS::system::cuda::detail::cuda_memory_resource<&::cudaMalloc, &::cudaFree,  
::thrust::THRUST_300000_SM_600_700_800_NS::pointer<void,  ::thrust::THRUST_300000_SM_600_700_800_NS::cuda_cub::tag,  ::thrust::THRUST_300000_SM_600_700_800_NS::tagged_reference<void,  ::thrust::THRUST_300000_SM_600_700_800_NS::cuda_cub::tag> ,  ::thrust::THRUST_300000_SM_600_700_800_NS::use_default> > ,  ::thrust::THRUST_300000_SM_600_700_800_NS::mr::new_delete_resource> ::pool::operator =") is not allowed

Contributor

github-actions bot commented Mar 1, 2025

🟨 CI finished in 52m 13s: Pass: 51%/93 | Total: 18h 51m | Avg: 12m 09s | Max: 51m 38s | Hits: 93%/53793
  • 🟥 thrust: Pass: 0%/45 | Total: 8h 55m | Avg: 11m 53s | Max: 28m 27s

    🟥 cmake_options
      🟥 -DTHRUST_DISPATCH_TYPE=Force32bit Pass:   0%/2   | Total: 10m 42s | Avg:  5m 21s | Max: 10m 42s
    🟥 cpu
      🟥 amd64              Pass:   0%/43  | Total:  8h 27m | Avg: 11m 47s | Max: 28m 27s
      🟥 arm64              Pass:   0%/2   | Total: 28m 20s | Avg: 14m 10s | Max: 14m 36s
    🟥 ctk
      🟥 12.0               Pass:   0%/5   | Total:  1h 17m | Avg: 15m 30s | Max: 25m 11s
      🟥 12.5               Pass:   0%/2   | Total: 40m 27s | Avg: 20m 13s | Max: 20m 33s
      🟥 12.8               Pass:   0%/38  | Total:  6h 57m | Avg: 10m 59s | Max: 28m 27s
    🟥 cudacxx
      🟥 ClangCUDA18        Pass:   0%/2   | Total:  9m 09s | Avg:  4m 34s | Max:  4m 35s
      🟥 nvcc12.0           Pass:   0%/5   | Total:  1h 17m | Avg: 15m 30s | Max: 25m 11s
      🟥 nvcc12.5           Pass:   0%/2   | Total: 40m 27s | Avg: 20m 13s | Max: 20m 33s
      🟥 nvcc12.8           Pass:   0%/36  | Total:  6h 48m | Avg: 11m 20s | Max: 28m 27s
    🟥 cudacxx_family
      🟥 ClangCUDA          Pass:   0%/2   | Total:  9m 09s | Avg:  4m 34s | Max:  4m 35s
      🟥 nvcc               Pass:   0%/43  | Total:  8h 46m | Avg: 12m 14s | Max: 28m 27s
    🟥 cxx
      🟥 Clang14            Pass:   0%/4   | Total: 49m 43s | Avg: 12m 25s | Max: 12m 39s
      🟥 Clang15            Pass:   0%/2   | Total: 25m 13s | Avg: 12m 36s | Max: 12m 55s
      🟥 Clang16            Pass:   0%/2   | Total: 25m 25s | Avg: 12m 42s | Max: 13m 14s
      🟥 Clang17            Pass:   0%/2   | Total: 26m 02s | Avg: 13m 01s | Max: 13m 34s
      🟥 Clang18            Pass:   0%/7   | Total: 47m 34s | Avg:  6m 47s | Max: 13m 44s
      🟥 GCC7               Pass:   0%/2   | Total: 27m 54s | Avg: 13m 57s | Max: 13m 58s
      🟥 GCC8               Pass:   0%/1   | Total: 12m 29s | Avg: 12m 29s | Max: 12m 29s
      🟥 GCC9               Pass:   0%/2   | Total: 26m 16s | Avg: 13m 08s | Max: 13m 26s
      🟥 GCC10              Pass:   0%/2   | Total: 24m 52s | Avg: 12m 26s | Max: 12m 47s
      🟥 GCC11              Pass:   0%/2   | Total: 25m 16s | Avg: 12m 38s | Max: 12m 59s
      🟥 GCC12              Pass:   0%/2   | Total: 27m 01s | Avg: 13m 30s | Max: 13m 45s
      🟥 GCC13              Pass:   0%/10  | Total:  1h 13m | Avg:  7m 21s | Max: 14m 36s
      🟥 MSVC14.29          Pass:   0%/2   | Total: 50m 50s | Avg: 25m 25s | Max: 25m 39s
      🟥 MSVC14.42          Pass:   0%/3   | Total: 52m 50s | Avg: 17m 36s | Max: 28m 27s
      🟥 NVHPC24.7          Pass:   0%/2   | Total: 40m 27s | Avg: 20m 13s | Max: 20m 33s
    🟥 cxx_family
      🟥 Clang              Pass:   0%/17  | Total:  2h 53m | Avg: 10m 13s | Max: 13m 44s
      🟥 GCC                Pass:   0%/21  | Total:  3h 37m | Avg: 10m 21s | Max: 14m 36s
      🟥 MSVC               Pass:   0%/5   | Total:  1h 43m | Avg: 20m 44s | Max: 28m 27s
      🟥 NVHPC              Pass:   0%/2   | Total: 40m 27s | Avg: 20m 13s | Max: 20m 33s
    🟥 gpu
      🟥 h100               Pass:   0%/2   | Total:  7m 17s | Avg:  3m 38s | Max:  7m 17s
      🟥 rtx2080            Pass:   0%/33  | Total:  7h 43m | Avg: 14m 02s | Max: 25m 39s
      🟥 rtx4090            Pass:   0%/10  | Total:  1h 04m | Avg:  6m 27s | Max: 28m 27s
    🟥 jobs
      🟥 Build              Pass:   0%/38  | Total:  8h 55m | Avg: 14m 05s | Max: 28m 27s
      🟥 TestCPU            Pass:   0%/3  
      🟥 TestGPU            Pass:   0%/4  
    🟥 sm
      🟥 90                 Pass:   0%/2   | Total:  7m 17s | Avg:  3m 38s | Max:  7m 17s
      🟥 90;90a;100         Pass:   0%/1   | Total: 13m 57s | Avg: 13m 57s | Max: 13m 57s
    🟥 std
      🟥 17                 Pass:   0%/20  | Total:  4h 56m | Avg: 14m 49s | Max: 25m 39s
      🟥 20                 Pass:   0%/23  | Total:  3h 48m | Avg:  9m 55s | Max: 28m 27s
    
  • 🟩 cub: Pass: 100%/45 | Total: 8h 49m | Avg: 11m 45s | Max: 40m 13s | Hits: 93%/53485

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  8h 37m | Avg: 12m 02s | Max: 40m 13s | Hits:  92%/51055 
      🟩 arm64              Pass: 100%/2   | Total: 11m 15s | Avg:  5m 37s | Max:  5m 56s | Hits:  99%/2430  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 49m 47s | Avg:  9m 57s | Max: 27m 19s | Hits:  85%/5908  
      🟩 12.5               Pass: 100%/2   | Total: 50m 57s | Avg: 25m 28s | Max: 40m 13s | Hits:  98%/2248  
      🟩 12.8               Pass: 100%/38  | Total:  7h 08m | Avg: 11m 16s | Max: 32m 38s | Hits:  94%/45329 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  5m 07s | Hits: 100%/2100  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 49m 47s | Avg:  9m 57s | Max: 27m 19s | Hits:  85%/5908  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 50m 57s | Avg: 25m 28s | Max: 40m 13s | Hits:  98%/2248  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  6h 58m | Avg: 11m 37s | Max: 32m 38s | Hits:  93%/43229 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  5m 07s | Hits: 100%/2100  
      🟩 nvcc               Pass: 100%/43  | Total:  8h 39m | Avg: 12m 04s | Max: 40m 13s | Hits:  92%/51385 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 22m 50s | Avg:  5m 42s | Max:  6m 02s | Hits: 100%/4868  
      🟩 Clang15            Pass: 100%/2   | Total: 12m 37s | Avg:  6m 18s | Max:  6m 35s | Hits: 100%/2430  
      🟩 Clang16            Pass: 100%/2   | Total: 11m 56s | Avg:  5m 58s | Max:  6m 05s | Hits: 100%/2430  
      🟩 Clang17            Pass: 100%/2   | Total: 11m 50s | Avg:  5m 55s | Max:  6m 02s | Hits: 100%/2430  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 09m | Avg:  9m 54s | Max: 22m 23s | Hits: 100%/8175  
      🟩 GCC7               Pass: 100%/2   | Total: 12m 09s | Avg:  6m 04s | Max:  6m 25s | Hits:  99%/2434  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 57s | Avg:  5m 57s | Max:  5m 57s | Hits:  99%/1217  
      🟩 GCC9               Pass: 100%/2   | Total: 12m 51s | Avg:  6m 25s | Max:  6m 55s | Hits:  99%/2434  
      🟩 GCC10              Pass: 100%/2   | Total: 12m 19s | Avg:  6m 09s | Max:  6m 12s | Hits:  99%/2434  
      🟩 GCC11              Pass: 100%/2   | Total: 13m 31s | Avg:  6m 45s | Max:  6m 47s | Hits:  99%/2430  
      🟩 GCC12              Pass: 100%/2   | Total: 13m 26s | Avg:  6m 43s | Max:  6m 43s | Hits:  99%/2430  
      🟩 GCC13              Pass: 100%/11  | Total:  2h 41m | Avg: 14m 43s | Max: 24m 21s | Hits:  99%/13365 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 56m 07s | Avg: 28m 03s | Max: 28m 48s | Hits:  15%/2080  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  1h 01m | Avg: 30m 37s | Max: 32m 38s | Hits:  15%/2080  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 50m 57s | Avg: 25m 28s | Max: 40m 13s | Hits:  98%/2248  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  2h 08m | Avg:  7m 33s | Max: 22m 23s | Hits: 100%/20333 
      🟩 GCC                Pass: 100%/22  | Total:  3h 52m | Avg: 10m 33s | Max: 24m 21s | Hits:  99%/26744 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 57m | Avg: 29m 20s | Max: 32m 38s | Hits:  15%/4160  
      🟩 NVHPC              Pass: 100%/2   | Total: 50m 57s | Avg: 25m 28s | Max: 40m 13s | Hits:  98%/2248  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total: 52m 47s | Avg: 17m 35s | Max: 24m 21s | Hits:  99%/3645  
      🟩 rtx2080            Pass: 100%/34  | Total:  5h 38m | Avg:  9m 57s | Max: 40m 13s | Hits:  91%/40120 
      🟩 rtxa6000           Pass: 100%/8   | Total:  2h 17m | Avg: 17m 10s | Max: 22m 24s | Hits:  99%/9720  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 56m | Avg:  9m 38s | Max: 40m 13s | Hits:  91%/43765 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 21m 07s | Avg: 21m 07s | Max: 21m 07s | Hits:  99%/1215  
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 51s | Avg: 16m 51s | Max: 16m 51s | Hits:  99%/1215  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 08m | Avg: 22m 41s | Max: 23m 25s | Hits:  99%/3645  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 06m | Avg: 22m 04s | Max: 24m 21s | Hits:  99%/3645  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 52m 47s | Avg: 17m 35s | Max: 24m 21s | Hits:  99%/3645  
      🟩 90;90a;100         Pass: 100%/1   | Total:  7m 03s | Avg:  7m 03s | Max:  7m 03s | Hits:  99%/1215  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 13m | Avg:  9m 41s | Max: 28m 48s | Hits:  88%/23535 
      🟩 20                 Pass: 100%/25  | Total:  5h 35m | Avg: 13m 24s | Max: 40m 13s | Hits:  96%/29950 
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 23s | Avg: 7m 41s | Max: 13m 04s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 23s | Avg:  7m 41s | Max: 13m 04s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 23s | Avg:  7m 41s | Max: 13m 04s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 23s | Avg:  7m 41s | Max: 13m 04s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 23s | Avg:  7m 41s | Max: 13m 04s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 23s | Avg:  7m 41s | Max: 13m 04s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 23s | Avg:  7m 41s | Max: 13m 04s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 23s | Avg:  7m 41s | Max: 13m 04s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 19s | Avg:  2m 19s | Max:  2m 19s | Hits:  98%/154   
      🟩 Test               Pass: 100%/1   | Total: 13m 04s | Avg: 13m 04s | Max: 13m 04s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 51m 38s | Avg: 51m 38s | Max: 51m 38s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 51m 38s | Avg: 51m 38s | Max: 51m 38s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 51m 38s | Avg: 51m 38s | Max: 51m 38s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 51m 38s | Avg: 51m 38s | Max: 51m 38s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 51m 38s | Avg: 51m 38s | Max: 51m 38s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 51m 38s | Avg: 51m 38s | Max: 51m 38s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 51m 38s | Avg: 51m 38s | Max: 51m 38s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 51m 38s | Avg: 51m 38s | Max: 51m 38s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 51m 38s | Avg: 51m 38s | Max: 51m 38s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
+/- Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 93)

# Runner
66 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1

Contributor

github-actions bot commented Mar 1, 2025

🟩 CI finished in 1h 41m: Pass: 100%/93 | Total: 2d 17h | Avg: 42m 04s | Max: 1h 17m | Hits: 60%/133884
  • 🟩 cub: Pass: 100%/45 | Total: 1d 15h | Avg: 52m 28s | Max: 1h 17m | Hits: 69%/53485

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 13h | Avg: 52m 08s | Max:  1h 17m | Hits:  69%/51055 
      🟩 arm64              Pass: 100%/2   | Total:  1h 59m | Avg: 59m 37s | Max:  1h 00m | Hits:  67%/2430  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 56m | Avg: 59m 12s | Max:  1h 04m | Hits:  58%/5908  
      🟩 12.5               Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 07m | Hits:  67%/2248  
      🟩 12.8               Pass: 100%/38  | Total:  1d 08h | Avg: 50m 52s | Max:  1h 17m | Hits:  71%/45329 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 56m | Avg: 58m 28s | Max: 58m 38s | Hits:  73%/2100  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 56m | Avg: 59m 12s | Max:  1h 04m | Hits:  58%/5908  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 07m | Hits:  67%/2248  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  1d 06h | Avg: 50m 27s | Max:  1h 17m | Hits:  70%/43229 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 56m | Avg: 58m 28s | Max: 58m 38s | Hits:  73%/2100  
      🟩 nvcc               Pass: 100%/43  | Total:  1d 13h | Avg: 52m 11s | Max:  1h 17m | Hits:  69%/51385 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 53m | Avg: 58m 27s | Max:  1h 01m | Hits:  67%/4868  
      🟩 Clang15            Pass: 100%/2   | Total:  1h 51m | Avg: 55m 59s | Max: 56m 01s | Hits:  67%/2430  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 54m | Avg: 57m 08s | Max: 57m 49s | Hits:  67%/2430  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 52m | Avg: 56m 27s | Max: 57m 32s | Hits:  67%/2430  
      🟩 Clang18            Pass: 100%/7   | Total:  5h 35m | Avg: 47m 54s | Max:  1h 02m | Hits:  78%/8175  
      🟩 GCC7               Pass: 100%/2   | Total:  1h 55m | Avg: 57m 53s | Max: 58m 29s | Hits:  67%/2434  
      🟩 GCC8               Pass: 100%/1   | Total: 56m 36s | Avg: 56m 36s | Max: 56m 36s | Hits:  67%/1217  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 54m | Avg: 57m 15s | Max: 58m 12s | Hits:  67%/2434  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 55m | Avg: 57m 50s | Max: 59m 23s | Hits:  67%/2434  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 54m | Avg: 57m 23s | Max: 58m 35s | Hits:  67%/2430  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 59m | Avg: 59m 39s | Max:  1h 03m | Hits:  67%/2430  
      🟩 GCC13              Pass: 100%/11  | Total:  6h 35m | Avg: 35m 58s | Max:  1h 07m | Hits:  85%/13365 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 16m | Avg:  1h 08m | Max:  1h 11m | Hits:  13%/2080  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 17m | Hits:  13%/2080  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 07m | Hits:  67%/2248  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 15h 08m | Avg: 53m 26s | Max:  1h 02m | Hits:  72%/20333 
      🟩 GCC                Pass: 100%/22  | Total: 17h 12m | Avg: 46m 55s | Max:  1h 07m | Hits:  76%/26744 
      🟩 MSVC               Pass: 100%/4   | Total:  4h 48m | Avg:  1h 12m | Max:  1h 17m | Hits:  13%/4160  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 07m | Hits:  67%/2248  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 11m | Avg: 23m 42s | Max: 25m 00s | Hits:  88%/3645  
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 10h | Avg:  1h 00m | Max:  1h 17m | Hits:  62%/40120 
      🟩 rtxa6000           Pass: 100%/8   | Total:  3h 57m | Avg: 29m 42s | Max: 57m 31s | Hits:  91%/9720  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 12h | Avg: 59m 11s | Max:  1h 17m | Hits:  62%/43765 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 49s | Avg: 20m 49s | Max: 20m 49s | Hits:  99%/1215  
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 49s | Avg: 16m 49s | Max: 16m 49s | Hits:  99%/1215  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 09m | Avg: 23m 00s | Max: 23m 53s | Hits:  99%/3645  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 04m | Avg: 21m 39s | Max: 22m 13s | Hits:  99%/3645  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 11m | Avg: 23m 42s | Max: 25m 00s | Hits:  88%/3645  
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m | Hits:  67%/1215  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 20h 15m | Avg:  1h 00m | Max:  1h 14m | Hits:  60%/23535 
      🟩 20                 Pass: 100%/25  | Total: 19h 05m | Avg: 45m 49s | Max:  1h 17m | Hits:  76%/29950 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 1d 00h | Avg: 32m 59s | Max: 1h 05m | Hits: 54%/80091

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 40m 51s | Avg: 20m 25s | Max: 29m 42s | Hits:  73%/3562  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 23h 45m | Avg: 33m 09s | Max:  1h 05m | Hits:  55%/76530 
      🟩 arm64              Pass: 100%/2   | Total: 58m 58s | Avg: 29m 29s | Max: 30m 47s | Hits:  46%/3561  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 08m | Avg: 37m 41s | Max: 59m 28s | Hits:  60%/8896  
      🟩 12.5               Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 05m | Hits:  26%/3560  
      🟩 12.8               Pass: 100%/38  | Total: 19h 28m | Avg: 30m 44s | Max:  1h 03m | Hits:  55%/67635 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 54m 56s | Avg: 27m 28s | Max: 28m 41s | Hits:  46%/3560  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 08m | Avg: 37m 41s | Max: 59m 28s | Hits:  60%/8896  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 05m | Hits:  26%/3560  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 18h 33m | Avg: 30m 55s | Max:  1h 03m | Hits:  56%/64075 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 54m 56s | Avg: 27m 28s | Max: 28m 41s | Hits:  46%/3560  
      🟩 nvcc               Pass: 100%/43  | Total: 23h 49m | Avg: 33m 14s | Max:  1h 05m | Hits:  55%/76531 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 04m | Avg: 31m 07s | Max: 31m 55s | Hits:  57%/7120  
      🟩 Clang15            Pass: 100%/2   | Total:  1h 06m | Avg: 33m 06s | Max: 34m 18s | Hits:  46%/3560  
      🟩 Clang16            Pass: 100%/2   | Total:  1h 08m | Avg: 34m 03s | Max: 34m 48s | Hits:  46%/3560  
      🟩 Clang17            Pass: 100%/2   | Total:  1h 03m | Avg: 31m 35s | Max: 31m 40s | Hits:  46%/3560  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 46m | Avg: 23m 43s | Max: 32m 44s | Hits:  63%/12460 
      🟩 GCC7               Pass: 100%/2   | Total:  1h 06m | Avg: 33m 10s | Max: 33m 32s | Hits:  59%/3562  
      🟩 GCC8               Pass: 100%/1   | Total: 32m 53s | Avg: 32m 53s | Max: 32m 53s | Hits:  46%/1781  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 06m | Avg: 33m 05s | Max: 33m 08s | Hits:  55%/3562  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 09m | Avg: 34m 38s | Max: 36m 04s | Hits:  46%/3562  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 08m | Avg: 34m 15s | Max: 36m 46s | Hits:  46%/3562  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 11m | Avg: 35m 40s | Max: 36m 35s | Hits:  46%/3562  
      🟩 GCC13              Pass: 100%/10  | Total:  3h 38m | Avg: 21m 53s | Max: 33m 20s | Hits:  73%/17810 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 57m | Avg: 58m 51s | Max: 59m 28s | Hits:  33%/3548  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  2h 37m | Avg: 52m 32s | Max:  1h 03m | Hits:  34%/5322  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 05m | Hits:  26%/3560  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  8h 08m | Avg: 28m 42s | Max: 34m 48s | Hits:  56%/30260 
      🟩 GCC                Pass: 100%/21  | Total:  9h 53m | Avg: 28m 15s | Max: 36m 46s | Hits:  61%/37401 
      🟩 MSVC               Pass: 100%/5   | Total:  4h 35m | Avg: 55m 04s | Max:  1h 03m | Hits:  34%/8870  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 05m | Hits:  26%/3560  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 34m 33s | Avg: 17m 16s | Max: 22m 40s | Hits:  73%/3562  
      🟩 rtx2080            Pass: 100%/33  | Total: 20h 14m | Avg: 36m 47s | Max:  1h 05m | Hits:  47%/58736 
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 55m | Avg: 23m 35s | Max:  1h 03m | Hits:  76%/17793 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 23h 12m | Avg: 36m 38s | Max:  1h 05m | Hits:  47%/67633 
      🟩 TestCPU            Pass: 100%/3   | Total: 47m 31s | Avg: 15m 50s | Max: 32m 20s | Hits:  90%/5335  
      🟩 TestGPU            Pass: 100%/4   | Total: 44m 43s | Avg: 11m 10s | Max: 11m 53s | Hits:  99%/7123  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 34m 33s | Avg: 17m 16s | Max: 22m 40s | Hits:  73%/3562  
      🟩 90;90a;100         Pass: 100%/1   | Total: 30m 00s | Avg: 30m 00s | Max: 30m 00s | Hits:  76%/1781  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 12h 50m | Avg: 38m 32s | Max:  1h 05m | Hits:  46%/35591 
      🟩 20                 Pass: 100%/23  | Total: 11h 12m | Avg: 29m 15s | Max:  1h 03m | Hits:  60%/40938 
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 42s | Avg: 7m 51s | Max: 13m 26s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 42s | Avg:  7m 51s | Max: 13m 26s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 42s | Avg:  7m 51s | Max: 13m 26s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 42s | Avg:  7m 51s | Max: 13m 26s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 42s | Avg:  7m 51s | Max: 13m 26s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 42s | Avg:  7m 51s | Max: 13m 26s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 42s | Avg:  7m 51s | Max: 13m 26s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 42s | Avg:  7m 51s | Max: 13m 26s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 16s | Avg:  2m 16s | Max:  2m 16s | Hits:  98%/154   
      🟩 Test               Pass: 100%/1   | Total: 13m 26s | Avg: 13m 26s | Max: 13m 26s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 50m 49s | Avg: 50m 49s | Max: 50m 49s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 50m 49s | Avg: 50m 49s | Max: 50m 49s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 50m 49s | Avg: 50m 49s | Max: 50m 49s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 50m 49s | Avg: 50m 49s | Max: 50m 49s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 50m 49s | Avg: 50m 49s | Max: 50m 49s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 50m 49s | Avg: 50m 49s | Max: 50m 49s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 50m 49s | Avg: 50m 49s | Max: 50m 49s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 50m 49s | Avg: 50m 49s | Max: 50m 49s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 50m 49s | Avg: 50m 49s | Max: 50m 49s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
+/- Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 93)

# Runner
66 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1

@miscco miscco merged commit 096596b into NVIDIA:main Mar 1, 2025
105 of 108 checks passed
Labels
None yet
Projects
Status: Done
2 participants