mirror of
https://github.com/OpenMP/Examples.git
synced 2025-04-04 05:41:33 +01:00
69 lines
3.9 KiB
TeX
69 lines
3.9 KiB
TeX
\subsection{Asynchronous \kcode{target} with Tasks}
|
|
\label{subsec:async_target_with_tasks}
|
|
\index{target construct@\kcode{target} construct}
|
|
\index{task construct@\kcode{task} construct}
|
|
|
|
\index{directives!declare target@\kcode{declare target}}
|
|
\index{declare target directive@\kcode{declare target} directive}
|
|
|
|
\index{directives!begin declare target@\kcode{begin declare target}}
|
|
\index{begin declare target directive@\kcode{begin declare target} directive}
|
|
|
|
The following example shows how the \kcode{task} and \kcode{target} constructs
|
|
are used to execute multiple \kcode{target} regions asynchronously. The task that
|
|
encounters the \kcode{task} construct generates an explicit task that contains
|
|
a \kcode{target} region. The thread executing the explicit task encounters a task
|
|
scheduling point while waiting for the execution of the \kcode{target} region
|
|
to complete, allowing the thread to switch back to the execution of the encountering
|
|
task or one of the previously generated explicit tasks.
|
|
|
|
\cexample[5.1]{async_target}{1}
|
|
|
|
\pagebreak
|
|
\index{directives!declare target@\kcode{declare target}}
|
|
\index{declare target directive@\kcode{declare target} directive}
|
|
The Fortran version has an interface block that contains the \kcode{declare target}.
|
|
An identical statement exists in the function declaration (not shown here).
|
|
|
|
\ffreeexample[4.0]{async_target}{1}
|
|
|
|
The following example shows how the \kcode{task} and \kcode{target} constructs
|
|
are used to execute multiple \kcode{target} regions asynchronously. The task dependence
|
|
ensures that the storage is allocated and initialized on the device before it is
|
|
accessed.
|
|
|
|
\cexample[5.1]{async_target}{2}
|
|
|
|
The Fortran example below is similar to the C version above. Instead of pointers, though, it uses
|
|
the convenience of Fortran allocatable arrays on the device. In order to preserve the arrays
|
|
allocated on the device across multiple \kcode{target} regions, a \kcode{target data} region
|
|
is used in this case.
|
|
|
|
If there is no shape specified for an allocatable array in a \kcode{map} clause, only the array descriptor
|
|
(also called a dope vector) is mapped. That is, device space is created for the descriptor, and it
|
|
is initially populated with host values. In this case, the \ucode{v1} and \ucode{v2} arrays will be in a
|
|
non-associated state on the device. When space for \ucode{v1} and \ucode{v2} is allocated on the device
|
|
in the first \kcode{target} region the addresses to the space will be included in their descriptors.
|
|
|
|
At the end of the first \kcode{target} region, the arrays \ucode{v1} and \ucode{v2} are preserved on the device
|
|
for access in the second \kcode{target} region. At the end of the second \kcode{target} region, the data
|
|
in array \ucode{p} is copied back, the arrays \ucode{v1} and \ucode{v2} are not.
|
|
|
|
\index{task construct@\kcode{task} construct!depend clause@\kcode{depend} clause}
|
|
\index{clauses!depend@\kcode{depend}}
|
|
\index{depend clause@\kcode{depend} clause}
|
|
A \kcode{depend} clause is used in the \kcode{task} directive to provide a wait at the beginning of the second
|
|
\kcode{target} region, to insure that there is no race condition with \ucode{v1} and \ucode{v2} in the two tasks.
|
|
It would be noncompliant to use \ucode{v1} and/or \ucode{v2} in lieu of \ucode{N} in the \kcode{depend} clauses,
|
|
because the use of non-allocated allocatable arrays as list items in a \kcode{depend} clause would
|
|
lead to unspecified behavior.
|
|
|
|
\noteheader{--} This example is not strictly compliant with the OpenMP 4.5 specification since the allocation status
|
|
of allocatable arrays \ucode{v1} and \ucode{v2} is changed inside the \kcode{target} region, which is not allowed.
|
|
(See the restrictions for the \kcode{map} clause in the \docref{Data-mapping Attribute Rules and Clauses}
|
|
section of the specification.)
|
|
However, the intention is to relax the restrictions on mapping of allocatable variables in the next release
|
|
of the specification so that the example will be compliant.
|
|
|
|
\ffreeexample[4.0]{async_target}{2}
|