clock_nvrtc - Clock libNVRTC
Description
This example shows how to use the clock function with libNVRTC to accurately measure the performance of a block of threads in a kernel.
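As a rough illustration of how per-block clock readings might be reduced to a single figure on the host, the sketch below averages the elapsed clocks over all blocks. It assumes a layout in which one thread of each block stores a start stamp at timer[b] and an end stamp at timer[b + numBlocks]. That layout and the function name averageElapsedClocks are assumptions made for this example, not details stated in this README.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed (hypothetical) layout of the timer buffer copied back from the device:
//   timer[b]             = clock value read when block b started
//   timer[b + numBlocks] = clock value read when block b finished
long double averageElapsedClocks(const std::vector<long long>& timer,
                                 std::size_t numBlocks) {
    if (numBlocks == 0) {
        return 0.0L;  // nothing to average
    }
    // The buffer must hold one start and one end stamp per block.
    assert(timer.size() >= 2 * numBlocks);

    long double total = 0.0L;
    for (std::size_t b = 0; b < numBlocks; ++b) {
        // Elapsed clocks for block b: end stamp minus start stamp.
        total += static_cast<long double>(timer[b + numBlocks] - timer[b]);
    }
    return total / static_cast<long double>(numBlocks);
}
```

For example, with made-up start stamps {100, 120, 90, 110} followed by end stamps {600, 640, 580, 630} for four blocks, the function returns 507.5. Clock counts are in device clock cycles, so converting them to time would additionally require the device clock rate.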
Key Concepts
Performance Strategies, Runtime Compilation
Supported SM Architectures
SM 5.0 SM 5.2 SM 5.3 SM 6.0 SM 6.1 SM 7.0 SM 7.2 SM 7.5 SM 8.0 SM 8.6 SM 8.7 SM 8.9 SM 9.0
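Because the kernel is compiled at run time, the target architecture is typically handed to NVRTC as a compile option. The following is a minimal sketch, assuming the device's compute capability (major and minor version) has already been queried. NVRTC accepts an option of the form --gpu-architecture=compute_XX; the helper name gpuArchOption is hypothetical and used only for illustration.

```cpp
#include <string>

// Build an NVRTC architecture option from an SM version,
// e.g. SM 7.5 -> "--gpu-architecture=compute_75".
// gpuArchOption is a hypothetical helper name, not part of any CUDA library.
std::string gpuArchOption(int smMajor, int smMinor) {
    return "--gpu-architecture=compute_" + std::to_string(smMajor) +
           std::to_string(smMinor);
}
```

The resulting string would be one entry in the options array passed to nvrtcCompileProgram.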
Supported OSes
Linux, Windows, QNX
Supported CPU Architecture
x86_64, aarch64
CUDA APIs involved
CUDA Driver API
cuMemcpyDtoH, cuLaunchKernel, cuMemcpyHtoD, cuCtxSynchronize, cuMemAlloc, cuMemFree, cuModuleGetFunction
CUDA Runtime API
cudaBlockSize, cudaGridSize
Dependencies needed to build/run
NVRTC
Prerequisites
Download and install the CUDA Toolkit 12.5 for your platform. Make sure the dependencies listed in the Dependencies section above are installed.