mirror of
https://github.com/OpenMP/Examples.git
synced 2025-04-04 05:41:33 +01:00
176 lines
7.9 KiB
TeX
176 lines
7.9 KiB
TeX
\pagebreak
|
|
\section{\kcode{target} Construct}
|
|
\label{sec:target}
|
|
|
|
\subsection{\kcode{target} Construct on \kcode{parallel} Construct}
|
|
\label{subsec:target_parallel}
|
|
\index{constructs!target@\kcode{target}}
|
|
\index{target construct@\kcode{target} construct}
|
|
\index{target construct@\kcode{target} construct!implicit mapping}
|
|
|
|
This following example shows how the \kcode{target} construct offloads a code
|
|
region to a target device. The variables \ucode{p}, \ucode{v1}, \ucode{v2}, and \ucode{N} are implicitly mapped
|
|
to the target device.
|
|
|
|
\cexample[4.0]{target}{1}
|
|
|
|
\ffreeexample[4.0]{target}{1}
|
|
|
|
\subsection{\kcode{target} Construct with \kcode{map} Clause}
|
|
\label{subsec:target_map}
|
|
\index{target construct@\kcode{target} construct!map clause@\kcode{map} clause}
|
|
\index{clauses!map@\kcode{map}}
|
|
\index{map clause@\kcode{map} clause}
|
|
|
|
This following example shows how the \kcode{target} construct offloads a code
|
|
region to a target device. The variables \ucode{p}, \ucode{v1} and \ucode{v2} are explicitly mapped to the
|
|
target device using the \kcode{map} clause. The variable \ucode{N} is implicitly mapped to
|
|
the target device.
|
|
|
|
\cexample[4.0]{target}{2}
|
|
|
|
\ffreeexample[4.0]{target}{2}
|
|
|
|
\subsection{\kcode{map} Clause with \kcode{to}/\kcode{from} map-types}
|
|
\label{subsec:target_map_tofrom}
|
|
\index{map clause@\kcode{map} clause!to map-type@\kcode{to} map-type}
|
|
\index{map clause@\kcode{map} clause!from map-type@\kcode{from} map-type}
|
|
|
|
The following example shows how the \kcode{target} construct offloads a code region
|
|
to a target device. In the \kcode{map} clause, the \kcode{to} and \kcode{from}
|
|
map-types define the mapping between the original (host) data and the target (device)
|
|
data. The \kcode{to} map-type specifies that the data will only be read on the
|
|
device, and the \kcode{from} map-type specifies that the data will only be written
|
|
to on the device. By specifying a guaranteed access on the device, data transfers
|
|
can be reduced for the \kcode{target} region.
|
|
|
|
The \kcode{to} map-type indicates that at the start of the \kcode{target} region
|
|
the variables \ucode{v1} and \ucode{v2} are initialized with the values of the corresponding variables
|
|
on the host device, and at the end of the \kcode{target} region the variables
|
|
\ucode{v1} and \ucode{v2} are not assigned to their corresponding variables on the host device.
|
|
|
|
The \kcode{from} map-type indicates that at the start of the \kcode{target} region
|
|
the variable \ucode{p} is not initialized with the value of the corresponding variable
|
|
on the host device, and at the end of the \kcode{target} region the variable \ucode{p}
|
|
is assigned to the corresponding variable on the host device.
|
|
|
|
\cexample[4.0]{target}{3}
|
|
|
|
The \kcode{to} and \kcode{from} map-types allow programmers to optimize data
|
|
motion. Since data for the \ucode{v} arrays are not returned, and data for the \ucode{p} array
|
|
are not transferred to the device, only one-half of the data is moved, compared
|
|
to the default behavior of an implicit mapping.
|
|
|
|
\ffreeexample[4.0]{target}{3}
|
|
|
|
\subsection{\kcode{map} Clause with Array Sections}
|
|
\label{subsec:target_array_section}
|
|
\index{map clause@\kcode{map} clause!array sections in}
|
|
|
|
The following example shows how the \kcode{target} construct offloads a code region
|
|
to a target device. In the \kcode{map} clause, map-types are used to optimize
|
|
the mapping of variables to the target device. Because variables \ucode{p}, \ucode{v1} and \ucode{v2} are
|
|
pointers, array section notation must be used to map the arrays. The notation \ucode{:N}
|
|
is equivalent to \ucode{0:N}.
|
|
|
|
\cexample[4.0]{target}{4}
|
|
\clearpage
|
|
|
|
In C, the length of the pointed-to array must be specified. In Fortran the extent
|
|
of the array is known and the length need not be specified. A section of the array
|
|
can be specified with the usual Fortran syntax, as shown in the following example.
|
|
The value 1 is assumed for the lower bound for array section \ucode{v2(:N)}.
|
|
|
|
\ffreeexample[4.0]{target}{4}
|
|
|
|
A more realistic situation in which an assumed-size array is passed to \ucode{vec_mult}
|
|
requires that the length of the arrays be specified, because the compiler does
|
|
not know the size of the storage. A section of the array must be specified with
|
|
the usual Fortran syntax, as shown in the following example. The value 1 is assumed
|
|
for the lower bound for array section \ucode{v2(:N)}.
|
|
|
|
\ffreeexample[4.0]{target}{4b}
|
|
|
|
\subsection{\kcode{target} Construct with \kcode{if} Clause}
|
|
\label{subsec:target_if}
|
|
\index{target construct@\kcode{target} construct!if clause@\kcode{if} clause}
|
|
\index{clauses!if@\kcode{if}}
|
|
\index{if clause@\kcode{if} clause}
|
|
|
|
The following example shows how the \kcode{target} construct offloads a code region
|
|
to a target device.
|
|
|
|
The \kcode{if} clause on the \kcode{target} construct indicates that if the variable
|
|
\ucode{N} is smaller than a given threshold, then the \kcode{target} region will be executed
|
|
by the host device.
|
|
|
|
The \kcode{if} clause on the \kcode{parallel} construct indicates that if the
|
|
variable \ucode{N} is smaller than a second threshold then the \kcode{parallel} region
|
|
is inactive.
|
|
|
|
\cexample[4.0]{target}{5}
|
|
|
|
\ffreeexample[4.0]{target}{5}
|
|
|
|
The following example is a modification of the above \example{target.5} code to show the combined \kcode{target}
|
|
and \kcode{parallel} directives. It uses the \plc{directive-name} modifier in multiple \kcode{if}
|
|
clauses to specify the component directive to which it applies.
|
|
|
|
The \kcode{if} clause with the \kcode{target} modifier applies to the \kcode{target} component of the
|
|
combined directive, and the \kcode{if} clause with the \kcode{parallel} modifier applies
|
|
to the \kcode{parallel} component of the combined directive.
|
|
|
|
\cexample[4.5]{target}{6}
|
|
|
|
\ffreeexample[4.5]{target}{6}
|
|
|
|
\subsection{Target Reverse Offload}
|
|
\label{subsec:target_reverse_offload}
|
|
\index{target reverse offload!reverse_offload clause@\kcode{reverse_offload} clause}
|
|
\index{target reverse offload!requires directive@\kcode{requires} directive}
|
|
\index{requires directive@\kcode{requires} directive!reverse_offload clause@\kcode{reverse_offload} clause}
|
|
\index{directives!requires@\kcode{requires}}
|
|
\index{clauses!reverse_offload@\kcode{reverse_offload}}
|
|
\index{reverse_offload clause@\kcode{reverse_offload} clause}
|
|
\index{target construct@\kcode{target} construct!device clause@\kcode{device} clause}
|
|
\index{clauses!device@\kcode{device}}
|
|
\index{device clause@\kcode{device} clause!ancestor modifier@\kcode{ancestor} modifier}
|
|
\index{ancestor modifier@\kcode{ancestor} modifier}
|
|
\index{declare target directive@\kcode{declare target} directive!device_type clause@\kcode{device_type} clause}
|
|
\index{clauses!device_type@\kcode{device_type}}
|
|
\index{device_type clause@\kcode{device_type} clause}
|
|
\index{clauses!enter@\kcode{enter}}
|
|
\index{enter clause@\kcode{enter} clause}
|
|
|
|
\index{directives!declare target@\kcode{declare target}}
|
|
\index{declare target directive@\kcode{declare target} directive}
|
|
|
|
Beginning with OpenMP 5.0, implementations are allowed to
|
|
offload back to the host (reverse offload).
|
|
|
|
In the example below the \ucode{error_handler} function
|
|
is executed back on the host, if an erroneous value is
|
|
detected in the \ucode{A} array on the device.
|
|
|
|
This is accomplished by specifying the \plc{device-modifier}
|
|
\kcode{ancestor} modifier, along with a device number of \ucode{1},
|
|
to indicate that the execution is to be performed on the
|
|
immediate parent (\plc{1st ancestor})-- the host.
|
|
|
|
The \kcode{requires} directive (another 5.0 feature)
|
|
uses the \kcode{reverse_offload} clause to guarantee
|
|
that the reverse offload is implemented.
|
|
|
|
Note that the \kcode{declare target} directive uses the
|
|
\kcode{device_type} clause (another 5.0 feature) to specify that
|
|
the \ucode{error_handler} function is compiled to
|
|
execute on the \plc{host} only. This ensures that no
|
|
attempt will be made to create a device version of the
|
|
function. This feature may be necessary if the function
|
|
exists in another compile unit.
|
|
|
|
|
|
\cexample[5.2]{target_reverse_offload}{7}
|
|
|
|
\ffreeexample[5.0]{target_reverse_offload}{7}
|