% 2024-04-16 08:59:23 -07:00
%\pagebreak
\section{\kcode{interop} Construct}
\label{sec:interop}
\index{constructs!interop@\kcode{interop}}
\index{interop construct@\kcode{interop} construct}
The \kcode{interop} construct allows OpenMP to interoperate with foreign runtime environments.
In the example below, asynchronous CUDA memory copies and a \ucode{cublasDaxpy} routine are executed
in a CUDA stream. In addition, an asynchronous target task (having a \kcode{nowait} clause)
and two explicit tasks are executed through OpenMP directives. Scheduling dependences (synchronization) are
imposed on the foreign stream and the OpenMP tasks through \kcode{depend} clauses.
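
As a minimal sketch of this dependence pattern, the following host-only C code
replaces the foreign-stream work with an ordinary OpenMP task, so that it can run
without CUDA. The names \ucode{hostVectorSet}, \ucode{hostDaxpy}, \ucode{hostDscal}
and \ucode{runPipeline}, and the size \ucode{N}, are hypothetical and are not taken
from the example; only the use of \kcode{depend} clauses to order the work is illustrated.
\begin{verbatim}
#include <stdio.h>

#define N 8   /* hypothetical problem size, chosen for illustration */

/* x[i] = s*(i+1) */
static void hostVectorSet(int n, double s, double *x)
{
   for (int i = 0; i < n; ++i) x[i] = s * (i + 1);
}

/* y = s*x + y; host stand-in for the work done in the foreign stream */
static void hostDaxpy(int n, double s, const double *x, double *y)
{
   for (int i = 0; i < n; ++i) y[i] = s * x[i] + y[i];
}

/* x = s*x */
static void hostDscal(int n, double s, double *x)
{
   for (int i = 0; i < n; ++i) x[i] = s * x[i];
}

/* Runs the task graph and returns the sum of the elements of y. */
double runPipeline(void)
{
   double x[N], y[N], sum = 0.0;

   #pragma omp parallel num_threads(3)
   #pragma omp single
   {
      /* Two independent producer tasks. */
      #pragma omp task shared(x) depend(out: x[0:N])
      hostVectorSet(N, 1.0, x);

      #pragma omp task shared(y) depend(out: y[0:N])
      hostVectorSet(N, -1.0, y);

      /* Consumer: waits for both producers (assumption: an ordinary
         task replaces the foreign-stream work of the real example). */
      #pragma omp task shared(x, y) depend(in: x[0:N]) depend(inout: y[0:N])
      hostDaxpy(N, 2.0, x, y);

      /* Follow-up task: ordered after the consumer through y. */
      #pragma omp task shared(y) depend(inout: y[0:N])
      hostDscal(N, 2.0, y);
   }  /* the implicit barrier here guarantees that all tasks have completed */

   for (int i = 0; i < N; ++i) sum += y[i];
   return sum;
}
\end{verbatim}
With \ucode{N} set to 8, \ucode{runPipeline()} returns 72. If the code is compiled
without OpenMP support, the directives are ignored and the sequential execution
gives the same result.
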
\index{interop construct@\kcode{interop} construct!init clause@\kcode{init} clause}
\index{init clause@\kcode{init} clause}
\index{clauses!init@\kcode{init}}
\index{interop construct@\kcode{interop} construct!depend clause@\kcode{depend} clause}
\index{depend clause@\kcode{depend} clause}
\index{clauses!depend@\kcode{depend}}
First, an interop object, \ucode{obj}, is initialized for synchronization by including the
\kcode{targetsync} \plc{interop-type} in the interop \kcode{init} clause
(\kcode{init(targetsync: \ucode{obj})}).
The object provides access to the foreign runtime.
The \kcode{depend} clause establishes dependences between OpenMP tasks
and the foreign tasks associated with a valid object.
\index{routines!omp_get_interop_int@\kcode{omp_get_interop_int}}
\index{omp_get_interop_int routine@\kcode{omp_get_interop_int} routine}
Next, the \kcode{omp_get_interop_int} routine is used to extract the foreign
runtime id (\kcode{omp_ipr_fr_id}), and a test in the next statement ensures
that the CUDA runtime (\kcode{omp_ifr_cuda}) is available.
\index{routines!omp_get_interop_ptr@\kcode{omp_get_interop_ptr}}
\index{omp_get_interop_ptr routine@\kcode{omp_get_interop_ptr} routine}
\index{interop construct@\kcode{interop} construct!destroy clause@\kcode{destroy} clause}
\index{destroy clause@\kcode{destroy} clause}
\index{clauses!destroy@\kcode{destroy}}
Within the block for executing the \ucode{cublasDaxpy} routine, a stream is acquired
with the \kcode{omp_get_interop_ptr} routine, which returns a CUDA stream (\ucode{s}).
The stream is attached to the cuBLAS handle and used directly in the asynchronous memory
copy routines. The following \kcode{interop} construct, with the \kcode{destroy} clause,
ensures that the foreign tasks have completed.
\cexample[5.1]{interop}{1}