OpenMP-Examples/Examples_async_target_nowait.tex
2020-06-26 07:54:45 -07:00

32 lines
1.6 KiB
TeX

\subsection{\code{nowait} Clause on \code{target} Construct}
\label{subsec:target_nowait_clause}
The following example shows how to execute code asynchronously on a
device without an explicit task. The \code{nowait} clause on a \code{target}
construct allows the thread of the \plc{target task} to perform other
work while waiting for the \code{target} region execution to complete.
Hence, the the \code{target} region can execute asynchronously on the
device (without requiring a host thread to idle while waiting for
the \plc{target task} execution to complete).
In this example the product of two vectors (arrays), \plc{v1}
and \plc{v2}, is formed. One half of the operations is performed
on the device, and the last half on the host, concurrently.
After a team of threads is formed the master thread generates
the \plc{target task} while the other threads can continue on, without a barrier,
to the execution of the host portion of the vector product.
The completion of the \plc{target task} (asynchronous target execution) is
guaranteed by the synchronization in the implicit barrier at the end of the
host vector-product worksharing loop region. See the \code{barrier}
glossary entry in the OpenMP specification for details.
The host loop scheduling is \code{dynamic}, to balance the host thread executions, since
one thread is being used for offload generation. In the situation where
little time is spent by the \plc{target task} in setting
up and tearing down the the target execution, \code{static} scheduling may be desired.
\cexample[4.5]{async_target}{3}
\ffreeexample[4.5]{async_target}{3}