conjugateGradientMultiDeviceCG - conjugateGradient using MultiDevice Cooperative Groups
Description
This sample implements a conjugate gradient solver on multiple GPUs using Multi Device Cooperative Groups, also uses Unified Memory optimized using prefetching and usage hints.
Key Concepts
Unified Memory, Linear Algebra, Cooperative Groups, MultiDevice Cooperative Groups, CUBLAS Library, CUSPARSE Library
Supported SM Architectures
SM 6.0 SM 6.1 SM 7.0 SM 7.2 SM 7.5 SM 8.0 SM 8.6 SM 8.7 SM 8.9 SM 9.0
Supported OSes
Linux, Windows
Supported CPU Architecture
x86_64, aarch64
CUDA APIs involved
CUDA Runtime API
cudaHostAlloc, cudaMemPrefetchAsync, cudaFree, cudaLaunchCooperativeKernel, cudaMallocManaged, cudaSetDevice, cudaGetDeviceCount, cudaGetDeviceProperties, cudaFreeHost, cudaMemset, cudaStreamCreate, cudaStreamSynchronize, cudaDeviceEnablePeerAccess, cudaMemAdvise, cudaOccupancyMaxActiveBlocksPerMultiprocessor, cudaDeviceCanAccessPeer
Dependencies needed to build/run
Prerequisites
Download and install the CUDA Toolkit 12.5 for your corresponding platform. Make sure the dependencies mentioned in Dependencies section above are installed.