
memMapIPCDrv - Memmap IPC Driver API
Description
This basic CUDA Driver API sample demonstrates inter-process communication (IPC) using the cuMemMap APIs, with one process per GPU for computation. It requires Compute Capability 3.0 or higher and a Linux or Windows operating system.
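To make the IPC step more concrete, here is a minimal sketch of one way a shareable handle could travel between processes on Linux. It assumes the handle is exported as a POSIX file descriptor (for example via cuMemExportToShareableHandle with CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR) and passed over a Unix domain socket using SCM_RIGHTS. This README does not state which handle type or transport the sample uses, so treat the transport as an assumption. The helper names send_fd and recv_fd are hypothetical and are not part of the sample or of CUDA. The sketch uses only POSIX calls, so it builds without the CUDA Toolkit.

```cpp
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>
#include <cassert>
#include <cstring>

// Hypothetical helper (not part of the sample): send one file descriptor
// over a connected Unix domain socket. Returns 0 on success, -1 on failure.
int send_fd(int sock, int fd_to_send) {
    assert(sock >= 0 && fd_to_send >= 0);

    char payload = 'F';  // at least one byte of regular data must accompany the fd
    iovec iov{};
    iov.iov_base = &payload;
    iov.iov_len = sizeof(payload);

    // Control buffer sized and aligned for exactly one int-sized descriptor.
    alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))];
    std::memset(control, 0, sizeof(control));

    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control;
    msg.msg_controllen = sizeof(control);

    // SCM_RIGHTS asks the kernel to duplicate the descriptor into the receiver.
    cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    std::memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

// Hypothetical helper: receive one file descriptor sent by send_fd.
// Returns the new descriptor (valid in this process) or -1 on failure.
int recv_fd(int sock) {
    assert(sock >= 0);

    char payload = 0;
    iovec iov{};
    iov.iov_base = &payload;
    iov.iov_len = sizeof(payload);

    alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))];
    std::memset(control, 0, sizeof(control));

    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control;
    msg.msg_controllen = sizeof(control);

    if (recvmsg(sock, &msg, 0) <= 0) return -1;

    cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == nullptr || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS) {
        return -1;
    }

    int fd = -1;
    std::memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

Under these assumptions, the exporting process would call send_fd with the descriptor obtained from the export call, and each importing process would hand the result of recv_fd to cuMemImportFromShareableHandle before mapping the memory. On Windows a different handle type and transport would be needed.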
Key Concepts
CUDA Driver API, cuMemMap IPC, MMAP
Supported SM Architectures
SM 5.0 SM 5.2 SM 5.3 SM 6.0 SM 6.1 SM 7.0 SM 7.2 SM 7.5 SM 8.0 SM 8.6 SM 8.7 SM 8.9 SM 9.0
Supported OSes
Linux, Windows, QNX
Supported CPU Architecture
x86_64, armv7l, aarch64
CUDA APIs involved
CUDA Driver API
cuDeviceCanAccessPeer, cuMemImportFromShareableHandle, cuModuleLoadDataEx, cuModuleGetFunction, cuMemSetAccess, cuModuleLoad, cuStreamCreate, cuMemRelease, cuInit, cuLaunchKernel, cuMemcpyDtoHAsync, cuMemCreate, cuDeviceGet, cuCtxDestroy, cuDeviceGetCount, cuMemMap, cuMemExportToShareableHandle, cuStreamSynchronize, cuCtxEnablePeerAccess, cuDeviceGetAttribute, cuOccupancyMaxActiveBlocksPerMultiprocessor, cuCtxSetCurrent, cuMemGetAllocationGranularity, cuMemAddressFree, cuMemUnmap, cuCtxCreate, cuStreamDestroy, cuMemAddressReserve
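Two of the calls listed above interact in a way that is easy to miss: the size passed to cuMemCreate must be a multiple of the granularity reported by cuMemGetAllocationGranularity. The following is a minimal sketch of the rounding step only. The helper name round_up_to_granularity is hypothetical and does not come from the sample, and the sketch deliberately makes no CUDA calls so that it builds on its own.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the sample): round `size` up to the next
// multiple of `granularity`, where `granularity` is the value that
// cuMemGetAllocationGranularity would report for the target device.
std::size_t round_up_to_granularity(std::size_t size, std::size_t granularity) {
    assert(granularity > 0);  // a zero granularity would divide by zero below
    return ((size + granularity - 1) / granularity) * granularity;
}
```

For instance, with an assumed granularity of 2 MiB (an illustrative value; the real one must be queried per device), a request for 1 byte is padded to 2 MiB, while a request for exactly 2 MiB is left unchanged. The same padded size would then be used for cuMemAddressReserve and cuMemMap.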
Dependencies needed to build/run
Prerequisites
Download and install the CUDA Toolkit 12.5 for your corresponding platform. Make sure the dependencies mentioned in the Dependencies section above are installed.