Mirror of https://github.com/NVIDIA/cuda-samples.git (synced 2025-04-04 07:21:33 +01:00)
reductionMultiBlockCG - Reduction using MultiBlock Cooperative Groups
Description
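This sample demonstrates a single-pass reduction using Multi Block Cooperative Groups: all thread blocks synchronize inside one kernel launch, so no second kernel is needed to combine per-block results. It requires a device with compute capability 6.0 or higher that supports compute preemption.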
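The device requirement above can be checked at run time before a grid-synchronizing kernel is attempted. The following is a minimal sketch, not the sample's own code: `supportsMultiBlockCG` is a hypothetical helper name, and treating the cooperative-launch attribute as the deciding check (with the compute-preemption attribute only reported) is an assumption of this sketch.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the sample): returns true if `device`
// reports compute capability >= 6.0 and support for cooperative launches.
bool supportsMultiBlockCG(int device) {
  cudaDeviceProp prop;
  if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
    return false;  // invalid device index or no usable CUDA device
  }

  int cooperativeLaunch = 0;
  int computePreemption = 0;
  cudaDeviceGetAttribute(&cooperativeLaunch, cudaDevAttrCooperativeLaunch, device);
  cudaDeviceGetAttribute(&computePreemption, cudaDevAttrComputePreemptionSupported, device);

  printf("%s: SM %d.%d, cooperative launch %s, compute preemption %s\n",
         prop.name, prop.major, prop.minor,
         cooperativeLaunch ? "yes" : "no",
         computePreemption ? "yes" : "no");

  // Assumption: cooperative launch support is the deciding attribute here;
  // the preemption flag is printed for information only.
  return prop.major >= 6 && cooperativeLaunch != 0;
}

int main() {
  int device = 0;  // assumed device index
  if (!supportsMultiBlockCG(device)) {
    printf("Device %d cannot run a multi-block cooperative kernel.\n", device);
    return 1;
  }
  printf("Device %d can run a multi-block cooperative kernel.\n", device);
  return 0;
}
```

Running this check first lets a program exit with a clear message on unsupported hardware instead of failing at launch time.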
Key Concepts
Cooperative Groups, MultiBlock Cooperative Groups
Supported SM Architectures
SM 6.0, SM 6.1, SM 7.0, SM 7.2, SM 7.5, SM 8.0, SM 8.6, SM 8.7, SM 8.9, SM 9.0
Supported OSes
Linux, Windows
Supported CPU Architecture
x86_64, aarch64
CUDA APIs involved
CUDA Runtime API
cudaMemcpy, cudaFree, cudaSetDevice, cudaDeviceSynchronize, cudaLaunchCooperativeKernel, cudaMalloc, cudaOccupancyMaxActiveBlocksPerMultiprocessor, cudaGetDeviceProperties, cudaOccupancyMaxPotentialBlockSize
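To show how several of the calls listed above fit together, here is a minimal sketch of a single-pass sum that synchronizes across blocks with `grid.sync()`. It is an illustration under stated assumptions rather than the sample's implementation: the names `reduceSinglePass`, `blockSum`, `reduceOnDevice`, and `CHECK` are hypothetical, the block size is fixed at 256 threads (a power of two, which the shared-memory tree reduction relies on) instead of being chosen with `cudaOccupancyMaxPotentialBlockSize`, and the input is `float`.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>
#include <cooperative_groups.h>

namespace cg = cooperative_groups;

// Hypothetical error-checking macro: abort on any CUDA runtime error.
#define CHECK(call)                                                     \
  do {                                                                  \
    cudaError_t err_ = (call);                                          \
    if (err_ != cudaSuccess) {                                          \
      fprintf(stderr, "%s failed: %s\n", #call, cudaGetErrorString(err_)); \
      exit(EXIT_FAILURE);                                               \
    }                                                                   \
  } while (0)

// Sum `value` across one block in shared memory.
// Assumes the block size is a power of two.
__device__ float blockSum(cg::thread_block block, float *scratch, float value) {
  unsigned int tid = block.thread_rank();
  scratch[tid] = value;
  block.sync();
  for (unsigned int stride = block.size() / 2; stride > 0; stride >>= 1) {
    if (tid < stride) scratch[tid] += scratch[tid + stride];
    block.sync();
  }
  float total = scratch[0];
  block.sync();  // scratch may be reused by the caller after this point
  return total;
}

// Each block writes one partial sum, the whole grid synchronizes,
// then block 0 combines the partial sums in the same launch.
__global__ void reduceSinglePass(const float *in, float *partial, float *out,
                                 unsigned int n) {
  cg::grid_group grid = cg::this_grid();
  cg::thread_block block = cg::this_thread_block();
  extern __shared__ float scratch[];

  float sum = 0.0f;
  for (unsigned int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
       i += gridDim.x * blockDim.x) {
    sum += in[i];  // grid-stride loop over the input
  }
  sum = blockSum(block, scratch, sum);
  if (threadIdx.x == 0) partial[blockIdx.x] = sum;

  grid.sync();  // every block's partial sum is visible after this barrier

  if (blockIdx.x == 0) {
    float total = 0.0f;
    for (unsigned int i = threadIdx.x; i < gridDim.x; i += blockDim.x) {
      total += partial[i];
    }
    total = blockSum(block, scratch, total);
    if (threadIdx.x == 0) *out = total;
  }
}

float reduceOnDevice(const std::vector<float> &host) {
  const int threads = 256;  // assumed block size, power of two
  const size_t smem = threads * sizeof(float);
  unsigned int n = static_cast<unsigned int>(host.size());

  cudaDeviceProp prop;
  CHECK(cudaGetDeviceProperties(&prop, 0));
  if (!prop.cooperativeLaunch) {
    fprintf(stderr, "cooperative launch not supported\n");
    exit(EXIT_FAILURE);
  }

  // A cooperative grid must be fully resident, so size it from occupancy.
  int blocksPerSm = 0;
  CHECK(cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSm, reduceSinglePass,
                                                      threads, smem));
  int blocks = blocksPerSm * prop.multiProcessorCount;

  float *d_in, *d_partial, *d_out, result = 0.0f;
  CHECK(cudaMalloc(&d_in, n * sizeof(float)));
  CHECK(cudaMalloc(&d_partial, blocks * sizeof(float)));
  CHECK(cudaMalloc(&d_out, sizeof(float)));
  CHECK(cudaMemcpy(d_in, host.data(), n * sizeof(float), cudaMemcpyHostToDevice));

  void *args[] = {&d_in, &d_partial, &d_out, &n};
  CHECK(cudaLaunchCooperativeKernel((void *)reduceSinglePass, dim3(blocks),
                                    dim3(threads), args, smem, 0));
  CHECK(cudaDeviceSynchronize());
  CHECK(cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost));

  CHECK(cudaFree(d_in));
  CHECK(cudaFree(d_partial));
  CHECK(cudaFree(d_out));
  return result;
}

int main() {
  CHECK(cudaSetDevice(0));
  std::vector<float> data(1 << 20, 1.0f);  // all ones, so the sum equals the count
  float sum = reduceOnDevice(data);
  printf("device sum = %.1f, expected %.1f\n", sum, static_cast<float>(data.size()));
  return sum == static_cast<float>(data.size()) ? 0 : 1;
}
```

The grid is sized from `cudaOccupancyMaxActiveBlocksPerMultiprocessor` because a cooperative launch fails if the blocks cannot all be resident at once. When building this sketch by hand, it is safest to target an architecture from the supported list, for example `nvcc -arch=sm_70`.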
Dependencies needed to build/run
Prerequisites
Download and install the CUDA Toolkit 12.5 for your platform. Make sure the dependencies listed in the Dependencies section above are installed.