mirror of
https://github.com/OpenMP/Examples.git
synced 2025-04-04 05:41:33 +01:00
%\pagebreak
\section{Declare Target Directive}
\label{sec:declare_target}

%\index{declare target directive@\code{declare target} directive!enter clause@\code{enter} clause}
%\index{enter clause@\code{enter} clause}
%\index{clauses!enter@\code{enter}}

\subsection{Declare Target Directive for a Procedure}

\label{subsec:declare_target_function}

\index{directives!declare target@\kcode{declare target}}
\index{declare target directive@\kcode{declare target} directive}

\index{directives!begin declare target@\kcode{begin declare target}}
\index{begin declare target directive@\kcode{begin declare target} directive}

The following example shows how the declare target directive
is used to indicate that the corresponding call inside a \kcode{target} region
is to a \ucode{fib} procedure that can execute on the default target device.

A version of the function is also available on the host device. When the \kcode{if}
clause conditional expression on the \kcode{target} construct evaluates to \plc{false},
the \kcode{target} region (thus \ucode{fib}) will execute on the host device.

For the following C/C++ code the declaration of the function \ucode{fib} appears between the
\kcode{begin declare target} and \kcode{end declare target} directives.
In the corresponding Fortran code, the \kcode{declare target} directive appears at the
end of the specification part of the subroutine.

\cexample[5.1]{declare_target}{1}
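
The C/C++ arrangement described above can be sketched in a compact, self-contained form.
This is only an illustrative sketch and not the \ucode{declare_target.1} example itself:
the recursive body of \ucode{fib}, the \ucode{fib_wrapper} name, and the \ucode{THRESHOLD} value
are assumptions made for illustration, and a compiler with OpenMP 5.1 support is assumed.
When the \kcode{if} clause evaluates to \plc{false}, or when no target device is available,
the region executes on the host and calls the host version of \ucode{fib}.

```c
/* Sketch only: the function body and the cutoff are hypothetical. */

/* Functions declared between the begin/end pair get a device version
   in addition to the usual host version. */
#pragma omp begin declare target
int fib(int n)
{
    if (n < 2)
        return n;
    return fib(n - 1) + fib(n - 2);
}
#pragma omp end declare target

#define THRESHOLD 20   /* hypothetical cutoff for offloading */

/* Offload the call only when n is large enough; otherwise the if clause
   is false and the target region runs on the host. */
int fib_wrapper(int n)
{
    int x = 0;

    #pragma omp target if(n > THRESHOLD) map(from: x)
    x = fib(n);

    return x;
}
```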

The Fortran \ucode{fib} subroutine contains a \kcode{declare target} declaration
that directs the compiler to create a device-executable version of the procedure.
The subroutine name is not included on the \kcode{declare target}
directive and is, therefore, implicitly assumed.

The program uses the \ucode{module_fib} module, which presents an explicit interface to
the compiler with the \kcode{declare target} declarations for processing
the \ucode{fib} call.

\ffreeexample[4.0]{declare_target}{1}

\pagebreak
The next Fortran example shows the use of an external subroutine. As the subroutine
is neither use associated nor an internal procedure, the \kcode{declare target}
declarations within the external subroutine are unknown to the main program unit;
therefore, a \kcode{declare target} must be provided within the program
scope for the compiler to determine that a target binary should be available.

\ffreeexample[4.0]{declare_target}{2}

\subsection{Declare Target Directive for Indirect Procedure Call}
\label{subsec:indirect}

\index{clauses!indirect@\kcode{indirect}}
\index{indirect clause@\kcode{indirect} clause}

In the OpenMP 5.1 Specification the \kcode{indirect} clause was added to allow
indirect procedure calls, via function pointers, in a \kcode{target} region.
The functions to be allowed indirect invocation are specified in an \kcode{enter}
clause of a declare target directive, along with the \kcode{indirect} clause.
The clause has an optional enabling/disabling argument (default enabled). In the
absence of the \kcode{indirect} clause the function pointer would be mapped as a scalar
(firstprivate) that would point to the host versions of the functions.
The \kcode{indirect} clause informs the compiler that the functions may be
called via function pointers and that the device versions of the functions
are to be used within the \kcode{target} region.

Only with an enabled \kcode{indirect} clause and a function specification in an \kcode{enter} clause
of a declare target directive may a function be called with an indirect invocation in a \kcode{target} region.
(Note: this feature limits the functions that can be called through function
pointers in the \kcode{target} region to those listed in the \kcode{enter} clause.)
%% KFM should be "to a restricted... -> to those listed in the \code{enter} clause.

In the following example, the \kcode{declare target} \kcode{enter(\ucode{fun1,fun2})}
\kcode{indirect} directive specifies that the \ucode{fun1} and \ucode{fun2} functions may
be invoked with a function pointer in the \kcode{target} region.
Either the \ucode{fun1} or \ucode{fun2} function is invoked by the \ucode{fptr} function
pointer in the \kcode{target} construct, as determined by the value of \ucode{count}.

\cexample[5.2]{declare_target_indirect_call}{1}
\ffreeexample[5.2]{declare_target_indirect_call}{1}
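
The indirect call pattern described above can be sketched as follows.
This is a minimal sketch and not the \ucode{declare_target_indirect_call.1} example:
the function bodies and the \ucode{call_indirect} wrapper are hypothetical, and a compiler
that supports the OpenMP 5.2 \kcode{enter} and \kcode{indirect} clauses is assumed.
The function pointer is selected on the host and is firstprivate in the \kcode{target} region.

```c
/* Sketch only: function bodies and the wrapper name are hypothetical. */
int fun1(int x);
int fun2(int x);

/* Create device versions of fun1 and fun2 and allow them to be
   called through function pointers inside target regions. */
#pragma omp declare target enter(fun1, fun2) indirect

int fun1(int x) { return x + 1; }
int fun2(int x) { return x + 2; }

int call_indirect(int count, int x)
{
    /* The pointer is chosen on the host from the value of count. */
    int (*fptr)(int) = (count > 0) ? fun1 : fun2;
    int result = 0;

    /* fptr is firstprivate here; the indirect clause lets the call
       resolve to the device version of the selected function. */
    #pragma omp target map(from: result)
    result = fptr(x);

    return result;
}
```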

\subsection{Declare Target Directive for Class Type}
\label{subsec:declare_target_class}

\index{directives!begin declare target@\kcode{begin declare target}}
\index{begin declare target directive@\kcode{begin declare target} directive}

The following example shows the use of the \kcode{begin declare target}
and \kcode{end declare target} pair to designate the beginning and
end of the affected declarations, as introduced in OpenMP 5.1.
The \kcode{begin declare target} directive was defined
to symmetrically complement the terminating (``end'') directive.

\begin{cppspecific}

The example also shows three different ways to use a \kcode{declare target} directive for a
class and an external member-function definition (for the \ucode{XOR1}, \ucode{XOR2},
and \ucode{XOR3} classes and definitions for their corresponding \ucode{foo()} member functions).

For \ucode{XOR1}, a \kcode{begin declare target} and
\kcode{end declare target} directive
enclose both the class and its member function definition. The compiler immediately
knows to create a device version of the function for execution in a \kcode{target} region.

For \ucode{XOR2}, the class member function definition is not specified with a
\kcode{declare target} directive.
An implicit declare target is created for the member function definition.
The same applies if this declaration arrangement for the class and function
is included through a header file.

For \ucode{XOR3}, the class and its member function are not enclosed by \kcode{begin declare target}
and \kcode{end declare target} directives,
but there is an implicit declare target since the class, its function,
and the \kcode{target} construct are in the same file scope. That is, the class
and its function are treated as if delimited by a \kcode{declare target} directive.
The same applies if the class and function are included through a header file.

\cppnexample[5.1]{declare_target}{2a}

\topmarker{C++}

Often class definitions and their function definitions are included in separate files,
as shown in the \example{declare_target.2b_classes.hpp} and \example{declare_target.2b_functions.cpp} example code files below.
In this case, it is necessary to specify a declare target directive for the classes.
However, as long as the \example{declare_target.2b_functions.cpp} file includes the corresponding declare target classes,
there is no need to specify the functions with a declare target directive.
The functions are treated as if they are specified with a declare target directive.
Compiling the \example{declare_target.2b_functions.cpp} and \example{declare_target.2b_main.cpp} files
separately and linking them will create appropriate executable device functions for the target device.

\srcnexample[5.1]{declare_target}{2b_classes}{hpp}
\smallskip
\cppnexample[5.1]{declare_target}{2b_functions}[1]
\smallskip
\cppnexample[5.1]{declare_target}{2b_main}[1]

%\end{cppspecific}
\topmarker{C++}

%\begin{cppspecific}
The following example shows how the \kcode{begin declare target} and \kcode{end declare target} directives are used to enclose the declaration
of a variable \ucode{varY} with a class type \ucode{typeY}.
%Prior to OpenMP 5.0, the member function \code{typeY::foo()} cannot
%be accessed on a target device because its declaration did not appear between \code{begin declare}
%\code{target} and \code{end declare target} directives.

This example shows pre-OpenMP 5.0 behavior for the \ucode{varY.foo()} function call (an error).
The member function \ucode{typeY::foo()} cannot be accessed on a target device because its
declaration does not appear between \kcode{begin declare target} and
\kcode{end declare target} directives. As of OpenMP 5.0, the
function is implicitly declared with a declare target directive
and will execute successfully on the device. See previous examples.
%as if it were included in list or block of a declare target directive,

\cppnexample[5.1]{declare_target}{2c}
\end{cppspecific}

\subsection{Declare Target Directive for Variables}
\label{subsec:declare_target_variables}

\index{directives!declare target@\kcode{declare target}}
\index{declare target directive@\kcode{declare target} directive}

\index{directives!begin declare target@\kcode{begin declare target}}
\index{begin declare target directive@\kcode{begin declare target} directive}

The following examples show how the declare target directive is used to indicate that
global variables are mapped to the implicit device data environment of each target device.

In the following example, the declarations of the variables \ucode{p}, \ucode{v1}, and \ucode{v2} appear
between \kcode{begin declare target} and \kcode{end declare target}
directives indicating that the variables are mapped to the implicit device data
environment of each target device. The \kcode{target update} directive
is then used to manage the consistency of the variables \ucode{p}, \ucode{v1}, and \ucode{v2} between the
data environment of the encountering host device task and the implicit device data
environment of the default target device.

\cexample[5.1]{declare_target}{3}
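
The interplay between declare target variables and \kcode{target update} can be sketched as follows.
This is an illustrative sketch rather than the \ucode{declare_target.3} example:
the array size, the initial values, and the \ucode{vec_mult_sum} function are assumptions.
Because the variables already exist in the device data environment, no \kcode{map} clause is needed
on the \kcode{target} construct; only their values are transferred by \kcode{target update}.

```c
#define N 4   /* hypothetical small size, for illustration only */

/* p, v1 and v2 are mapped to the implicit device data environment. */
#pragma omp begin declare target
float p[N], v1[N], v2[N];
#pragma omp end declare target

/* Hypothetical driver: multiplies v1 and v2 on the device and
   returns the sum of the products computed into p. */
float vec_mult_sum(void)
{
    for (int i = 0; i < N; i++) {
        v1[i] = (float)(i + 1);
        v2[i] = 2.0f;
    }

    /* Make the device copies of v1 and v2 consistent with the host. */
    #pragma omp target update to(v1, v2)

    #pragma omp target
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        p[i] = v1[i] * v2[i];

    /* Make the host copy of p consistent with the device. */
    #pragma omp target update from(p)

    float sum = 0.0f;
    for (int i = 0; i < N; i++)
        sum += p[i];
    return sum;
}
```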

The Fortran version of the above C code uses a different syntax. Fortran modules
use a list syntax on the \kcode{declare target} directive to declare
mapped variables.

\ffreeexample[4.0]{declare_target}{3}

\pagebreak
The following example also indicates that the function \ucode{Pfun()} is available on the
target device, as well as the variable \ucode{Q}, which is mapped to the implicit device
data environment of each target device. The \kcode{target update} directive
is then used to manage the consistency of the variable \ucode{Q} between the data environment
of the encountering host device task and the implicit device data environment of
the default target device.

In the following example, the function and variable declarations appear between
the \kcode{begin declare target} and \kcode{end declare target}
directives.

\cexample[5.1]{declare_target}{4}

The Fortran version of the above C code uses a different syntax. In Fortran modules
a list syntax on the \kcode{declare target} directive is used to declare
mapped variables and procedures. The \ucode{N} and \ucode{Q} variables are declared as a
comma-separated list. When the \kcode{declare target} directive is used to
declare just the procedure, the procedure name need not be listed; it is implicitly
assumed, as illustrated in the \ucode{Pfun()} function.

\ffreeexample[4.0]{declare_target}{4}

\subsection{Declare Target Directive with \kcode{declare simd}}
\label{subsec:declare_target_simd}

\index{directives!declare target@\kcode{declare target}}
\index{declare target directive@\kcode{declare target} directive}

\index{directives!begin declare target@\kcode{begin declare target}}
\index{begin declare target directive@\kcode{begin declare target} directive}

\index{directives!declare simd@\kcode{declare simd}}
\index{declare simd directive@\kcode{declare simd} directive}

The following example shows how the \kcode{begin declare target} and
\kcode{end declare target} directives are used to indicate that a function
is available on a target device. The \kcode{declare simd} directive indicates
that there is a SIMD version of the function \ucode{P()} that is available on the target
device as well as one that is available on the host device.

\cexample[5.1]{declare_target}{5}
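
The combination of the two directives can be sketched as follows.
This is a reduced sketch under stated assumptions, not the \ucode{declare_target.5} example:
the square array size, the initial values, and the \ucode{accum} driver are chosen for illustration.
The \kcode{declare simd} directive sits inside the declare target block, so SIMD versions of \ucode{P()}
may be generated for both the host and the target device.

```c
#define N 8   /* hypothetical small, square size for illustration */

#pragma omp begin declare target
float Q[N][N];

/* A SIMD version of P may be created for the host and for the device;
   i is the same for all lanes and k advances by one per lane. */
#pragma omp declare simd uniform(i) linear(k) notinbranch
float P(const int i, const int k)
{
    return Q[i][k] * Q[k][i];
}
#pragma omp end declare target

/* Hypothetical driver: fills Q with ones and sums P(i,k) on the device. */
float accum(void)
{
    for (int i = 0; i < N; i++)
        for (int k = 0; k < N; k++)
            Q[i][k] = 1.0f;

    /* Q lives in the device data environment; refresh its values. */
    #pragma omp target update to(Q)

    float tmp = 0.0f;

    #pragma omp target map(tofrom: tmp)
    #pragma omp parallel for reduction(+:tmp)
    for (int i = 0; i < N; i++) {
        float tmp1 = 0.0f;

        #pragma omp simd reduction(+:tmp1)
        for (int k = 0; k < N; k++)
            tmp1 += P(i, k);

        tmp += tmp1;
    }
    return tmp;
}
```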

The Fortran version of the above C code uses a different syntax. Fortran modules
use a list syntax in the \kcode{declare target} directive for the mapping.
%%KFM use a list syntax in the \kcode{declare target} directive for the mapping.
Here the \ucode{N} and \ucode{Q} variables are declared in the list form as a comma-separated list.
The function declaration does not use a list and implicitly assumes the function
name. In this Fortran example, row and column indices are reversed relative to the
C/C++ example, as is usual for codes optimized for memory access.

\ffreeexample[4.0]{declare_target}{5}


\subsection{Declare Target Directive with \kcode{link} Clause}
\label{subsec:declare_target_link}

\index{directives!declare target@\kcode{declare target}}
\index{declare target directive@\kcode{declare target} directive}

\index{directives!begin declare target@\kcode{begin declare target}}
\index{begin declare target directive@\kcode{begin declare target} directive}

\index{clauses!link@\kcode{link}}
\index{link clause@\kcode{link} clause}

In the OpenMP 4.5 standard the \kcode{declare target} directive was extended to allow static
data to be mapped, \emph{when needed}, through a \kcode{link} clause.

Data storage for items listed in the \kcode{link} clause becomes available on the device
when it is mapped implicitly or explicitly in a \kcode{map} clause, and it persists for the scope of
the mapping (as specified by a \kcode{target} construct,
a \kcode{target data} construct, or
\kcode{target enter/exit data} constructs).

Tip: When the global data items will not all fit on a device and are not needed
simultaneously, use the \kcode{link} clause and map the data only when it is needed.
%%KFM simultaneously, use the \kcode{link} clause and map sections of the data only when it is needed.

The following C and Fortran examples show two sets of data (single precision and double precision)
that are global on the host for the entire execution on the host, but are used
globally on the device for only part of the program execution. The single precision data
are allocated and persist only for the first \kcode{target} region. Similarly, the
double precision data are in scope on the device only for the second \kcode{target} region.

\cexample[5.1]{declare_target}{6}
\ffreeexample[4.5]{declare_target}{6}
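
The effect of the \kcode{link} clause can be sketched with a single set of data.
This is a reduced sketch and not the \ucode{declare_target.6} example:
the array size, the initial values, and the \ucode{s_run} driver are assumptions.
Device storage for the linked arrays exists only while they are mapped,
here for the duration of the one \kcode{target} region.

```c
#define N 4   /* hypothetical size; real uses would be far larger */

float sp[N], sv1[N], sv2[N];

/* Linked variables get device storage only while they are mapped. */
#pragma omp declare target link(sp, sv1, sv2)

/* The function may reference the linked variables in its device version. */
#pragma omp begin declare target
void s_vec_mult_accum(void)
{
    for (int i = 0; i < N; i++)
        sp[i] = sv1[i] * sv2[i];
}
#pragma omp end declare target

/* Hypothetical driver: maps the arrays for one target region only. */
float s_run(void)
{
    for (int i = 0; i < N; i++) {
        sv1[i] = (float)i;
        sv2[i] = 3.0f;
    }

    /* The map clauses create the device storage for this region;
       it is released when the region ends. */
    #pragma omp target map(to: sv1, sv2) map(from: sp)
    s_vec_mult_accum();

    float sum = 0.0f;
    for (int i = 0; i < N; i++)
        sum += sp[i];
    return sum;
}
```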


\subsection{Declare Target Directive with \kcode{device_type} Clause}
\label{subsec:declare_target_device_type}

\index{directives!declare target@\kcode{declare target}}
\index{declare target directive@\kcode{declare target} directive}

\index{directives!begin declare target@\kcode{begin declare target}}
\index{begin declare target directive@\kcode{begin declare target} directive}

\index{clauses!device_type@\kcode{device_type}}
\index{device_type clause@\kcode{device_type} clause}

The \kcode{declare target} directives apply to procedures and variables to ensure that they can be executed or accessed on a device.
The \kcode{device_type} clause specifies whether a version of the procedure or variable should be made available on the host, on the device, or on both.
This example uses \kcode{nohost} for a procedure \ucode{foo()}. Only a device version of the procedure \ucode{foo()} is made available.
If the variant function \ucode{foo_onhost()} is not specified for the host fallback execution, the call to \ucode{foo()} from the \kcode{target} region will result in a link-time error due to the code generated for host execution of the target region.
This is because the host symbol for the device routine \ucode{foo()} marked as \kcode{nohost} is not required to be present in the host environment.

\cexample[5.2]{declare_target}{7}
\ffreeexample[5.2]{declare_target}{7}