\subsection{\code{nowait} Clause on \code{target} Construct}
\label{subsec:target_nowait_clause}

The following example shows how to execute code asynchronously on a
device without an explicit task. The \code{nowait} clause on a \code{target}
construct allows the thread of the \plc{target task} to perform other
work while waiting for the \code{target} region execution to complete.
Hence, the \code{target} region can execute asynchronously on the
device (without requiring a host thread to idle while waiting for
the \plc{target task} execution to complete).

In this example, the product of two vectors (arrays), \plc{v1}
and \plc{v2}, is formed. One half of the operations is performed
on the device, and the other half on the host, concurrently.

After a team of threads is formed, the master thread generates
the \plc{target task} while the other threads can continue on, without a barrier,
to the execution of the host portion of the vector product.
The completion of the \plc{target task} (asynchronous target execution) is
guaranteed by the synchronization in the implicit barrier at the end of the
host vector-product worksharing loop region. See the \code{barrier}
glossary entry in the OpenMP specification for details.

The host loop scheduling is \code{dynamic}, to balance the host thread executions, since
one thread is being used for offload generation. In the situation where
little time is spent by the \plc{target task} in setting
up and tearing down the target execution, \code{static} scheduling may be desired.

\cexample[4.5]{async_target}{3}

\ffreeexample[4.5]{async_target}{3}
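The pattern described in this subsection can be sketched as a small stand-alone C function.
This is a minimal illustration under stated assumptions, not the text of the
\plc{async\_target} example: the function name \plc{vec\_mult\_async}, its argument
order, and the chunk size of 1000 are hypothetical choices made for the sketch.

```c
#include <assert.h>

/* Hypothetical helper (an assumption for illustration, not part of the
 * OpenMP Examples sources): forms p[i] = v1[i] * v2[i] for i in [0, n).
 * The first half is offloaded asynchronously; the second half is
 * computed by the host team at the same time. */
void vec_mult_async(int n, const float *v1, const float *v2, float *p)
{
    assert(n >= 0);
    int half  = n / 2;
    int chunk = 1000;   /* assumed chunk size for the dynamic schedule */

    #pragma omp parallel
    {
        /* One thread generates the target task. Because of nowait, it
         * does not wait for the device and there is no barrier here. */
        #pragma omp master
        #pragma omp target teams distribute parallel for nowait \
                map(to: v1[0:half], v2[0:half]) map(from: p[0:half])
        for (int i = 0; i < half; i++)
            p[i] = v1[i] * v2[i];

        /* Host portion. The implicit barrier at the end of this
         * worksharing loop also guarantees completion of the target task. */
        #pragma omp for schedule(dynamic, chunk)
        for (int i = half; i < n; i++)
            p[i] = v1[i] * v2[i];
    }
}
```

If the sketch is compiled without OpenMP support, the directives are ignored and
both loops run sequentially, producing the same result.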