mirror of
https://github.com/NVIDIA/cuda-samples.git
synced 2025-04-10 10:12:11 +01:00
jitLto - Saxpy with libnvJitLink
Description
This sample does a simple saxpy multiply and add using nvrtc and nvJitLink with LTO (Link Time Optimization). It has been written for clarity of exposition to illustrate various CUDA programming principles, not with the goal of providing the most performant generic kernel for saxpy.
Key Concepts
CUDA Runtime API, Runtime Compilation
Supported SM Architectures
SM 5.0 SM 5.2 SM 5.3 SM 6.0 SM 6.1 SM 7.0 SM 7.2 SM 7.5 SM 8.0 SM 8.6 SM 8.7 SM 8.9 SM 9.0
Supported OSes
Linux, Windows
Supported CPU Architecture
x86_64, aarch64
CUDA APIs involved
CUDA Driver API
cuModuleLoad, cuModuleLoadDataEx, cuModuleGetFunction, cuMemAlloc, cuMemFree, cuMemcpyHtoD, cuMemcpyDtoH, cuLaunchKernel
Dependencies needed to build/run
Prerequisites
Download and install the CUDA Toolkit 12.5 for your corresponding platform. Make sure the dependencies mentioned in Dependencies section above are installed.