OpenMP-Examples/devices/declare_target.tex

\pagebreak
\section{Declare Target Directive}
\label{sec:declare_target}

%\index{declare target directive@\code{declare}~\code{target} directive!enter clause@\code{enter} clause}
%\index{enter clause@\code{enter} clause}
%\index{clauses!enter@\code{enter}}

\subsection{Declare Target Directive for a Procedure}

\label{subsec:declare_target_function}

\index{directives!declare target@\code{declare}~\code{target}}
\index{declare target directive@\code{declare}~\code{target} directive}

\index{directives!begin declare target@\code{begin}~\code{declare}~\code{target}}
\index{begin declare target directive@\code{begin}~\code{declare}~\code{target} directive}

The following example shows how the declare target directive
is used to indicate that the corresponding call inside a \code{target} region
is to a \code{fib} function that can execute on the default target device.

A version of the function is also available on the host device. When the \code{if}
clause conditional expression on the \code{target} construct evaluates to \plc{false},
the \code{target} region (thus \code{fib}) will execute on the host device.

For the following C/C++ code the declaration of the function \code{fib} appears between the
\code{begin}~\code{declare}~\code{target} and \code{end}~\code{declare}~\code{target} directives.
In the corresponding Fortran code, the \code{declare}~\code{target} directive appears at the
end of the specification part of the subroutine.

\cexample[5.1]{declare_target}{1}

The Fortran \code{fib} subroutine contains a \code{declare} \code{target} declaration
to indicate to the compiler to create an device executable version of the procedure.
The subroutine name has not been included on the \code{declare} \code{target}
directive and is, therefore, implicitly assumed.

The program uses the \code{module\_fib} module, which presents an explicit interface to
the compiler with the \code{declare} \code{target} declarations for processing
the \code{fib} call.

\ffreeexample[4.0]{declare_target}{1}

The next Fortran example shows the use of an external subroutine. As the subroutine
is neither use associated nor an internal procedure, the \code{declare} \code{target}
declarations within a external subroutine are unknown to the main program unit;
therefore, a \code{declare} \code{target} must be provided within the program
scope for the compiler to determine that a target binary should be available.

\ffreeexample[4.0]{declare_target}{2}

\subsection{Declare Target Directive for Class Type}
\label{subsec:declare_target_class}

\index{directives!begin declare target@\code{begin}~\code{declare}~\code{target}}
\index{begin declare target directive@\code{begin}~\code{declare}~\code{target} directive}

The following example shows the use of the \code{begin}~\code{declare}~\code{target}
and \code{end}~\code{declare}~\code{target} pair to designate the beginning and
end of the affected declarations, as introduced in OpenMP 5.1.
The \code{begin}~\code{declare}~\code{target} directive was defined
to symmetrically complement the terminating (``end'') directive.

\cppspecificstart

The example also shows 3 different ways to use a declare target directive for a
class and an external member-function definition (for the \plc{XOR1}, \plc{XOR2},
and \plc{XOR3} classes and definitions for their corresponding \plc{foo} member functions).

For \plc{XOR1}, a \code{begin}~\code{declare}~\code{target} and
\code{end}~\code{declare}~\code{target} directive
enclose both the class and its member function definition. The compiler immediately
knows to create a device version of the function for execution in a \code{target} region.

For \plc{XOR2}, the class member function definition is not specified with a
declare target directive.
An implicit declare target is created for the member function definition.
The same applies if this declaration arrangement for the class and function
are included through a header file.

For \plc{XOR3}, the class and its member function are not enclosed by \code{begin}~\code{declare}~\code{target}
and \code{end}~\code{declare}~\code{target} directives,
but there is an implicit declare target since the class, its function
and the \code{target} construct are in the same file scope. That is, the class
and its function are treated as if delimited by a declare target directive.
The same applies if the class and function are included through a header file.

\cppnexample[5.1]{declare_target}{2a}

% blue line floater at top of this page for "C++, cont."
\begin{figure}[t!]
\linewitharrows{-1}{dashed}{C++ (cont.)}{8em}
\end{figure}

Often class definitions and their function definitions are included in separate files,
as shown in \splc{declare_target.2b_classes.hpp} and \splc{declare_target.2b_functions.cpp} below.
In this case, it is necessary to specify in a declare target directive for the classes.
However, as long as the \splc{2b_functions.cpp} file includes the corresponding declare target classes,
there is no need to specify the functions with a declare target directive.
The functions are treated as if they are specified with a declare target directive.
Compiling the \splc{declare_target.2b_functions.cpp} and \splc{declare_target.2b_main.cpp} files
separately and linking them, will create appropriate executable device functions for the target device.

\srcnexample[5.1]{declare_target}{2b_classes}{hpp}
\smallskip
\cppnexample[5.1]{declare_target}{2b_functions}[1]
\smallskip
\cppnexample[5.1]{declare_target}{2b_main}[1]

%\cppspecificend
% blue line floater at top of this page for "C++, cont."
\begin{figure}[t!]
\linewitharrows{-1}{dashed}{C++ (cont.)}{8em}
\end{figure}

%\cppspecificstart
The following example shows how the \code{begin}~\code{declare} \code{target} and \code{end}
\code{declare} \code{target} directives are used to enclose the declaration
of a variable \plc{varY} with a class type \code{typeY}.
%Prior to OpenMP 5.0, the member function \code{typeY::foo()} cannot
%be accessed on a target device because its declaration did not appear between \code{begin}~\code{declare}
%\code{target} and \code{end} \code{declare} \code{target} directives.

This example shows pre-OpenMP 5.0 behavior for the \plc{varY.foo()} function call (an error).
The member function \code{typeY::foo()} cannot be accessed on a target device because its
declaration does not appear between \code{begin}~\code{declare}~\code{target} and
\code{end}~\code{declare}~\code{target} directives. As of OpenMP 5.0, the
function is implicitly declared with a declare target directive
and will successfully execute the function on the device.  See previous examples.
%as if it were included in list or block of a declare target directive,

\cppnexample[5.1]{declare_target}{2c}
\cppspecificend

\subsection{Declare Target Directive for Variables}
\label{subsec:declare_target_variables}

\index{directives!declare target@\code{declare}~\code{target}}
\index{declare target directive@\code{declare}~\code{target} directive}

\index{directives!begin declare target@\code{begin}~\code{declare}~\code{target}}
\index{begin declare target directive@\code{begin}~\code{declare}~\code{target} directive}

The following examples show how the declare target directive is used to indicate that
global variables are mapped to the implicit device data environment of each target device.

In the following example, the declarations of the variables \plc{p}, \plc{v1}, and \plc{v2} appear
between \code{begin}~\code{declare}~\code{target} and \code{end}~\code{declare}~\code{target}
directives indicating that the variables are mapped to the implicit device data
environment of each target device. The \code{target} \code{update} directive
is then used to manage the consistency of the variables \plc{p}, \plc{v1}, and \plc{v2} between the
data environment of the encountering host device task and the implicit device data
environment of the default target device.

\cexample[5.1]{declare_target}{3}

The Fortran version of the above C code uses a different syntax. Fortran modules
use a list syntax on the \code{declare} \code{target} directive to declare
mapped variables.

\ffreeexample[4.0]{declare_target}{3}

The following example also indicates that the function \plc{Pfun()} is available on the
target device, as well as the variable \plc{Q}, which is mapped to the implicit device
data environment of each target device. The \code{target} \code{update} directive
is then used to manage the consistency of the variable \plc{Q} between the data environment
of the encountering host device task and the implicit device data environment of
the default target device.

In the following example, the function and variable declarations appear between
the \code{begin}~\code{declare}~\code{target} and \code{end}~\code{declare}~\code{target}
directives.

\cexample[5.1]{declare_target}{4}

The Fortran version of the above C code uses a different syntax. In Fortran modules
a list syntax on the \code{declare} \code{target} directive is used to declare
mapped variables and procedures. The \plc{N} and \plc{Q} variables are declared as a comma
separated list. When the \code{declare} \code{target} directive is used to
declare just the procedure, the procedure name need not be listed -- it is implicitly
assumed, as illustrated in the \plc{Pfun()} function.

\ffreeexample[4.0]{declare_target}{4}

\subsection{Declare Target Directive with \code{declare}~\code{simd}}
\label{subsec:declare_target_simd}

\index{directives!declare target@\code{declare}~\code{target}}
\index{declare target directive@\code{declare}~\code{target} directive}

\index{directives!begin declare target@\code{begin}~\code{declare}~\code{target}}
\index{begin declare target directive@\code{begin}~\code{declare}~\code{target} directive}

\index{directives!declare simd@\code{declare}~\code{simd}}
\index{declare simd directive@\code{declare}~\code{simd} directive}

The following example shows how the \code{begin}~\code{declare}~\code{target} and
\code{end}~\code{declare}~\code{target} directives are used to indicate that a function
is available on a target device. The \code{declare} \code{simd} directive indicates
that there is a SIMD version of the function \plc{P()} that is available on the target
device as well as one that is available on the host device.

\cexample[5.1]{declare_target}{5}

The Fortran version of the above C code uses a different syntax. Fortran modules
use a list syntax of the \code{declare} \code{target} declaration for the mapping.
Here the \plc{N} and \plc{Q} variables are declared in the list form as a comma separated list.
The function declaration does not use a list and implicitly assumes the function
name. In this Fortran example row and column indices are reversed relative to the
C/C++ example, as is usual for codes optimized for memory access.

\ffreeexample[4.0]{declare_target}{5}


\subsection{Declare Target Directive with \code{link} Clause}
\label{subsec:declare_target_link}

\index{directives!declare target@\code{declare}~\code{target}}
\index{declare target directive@\code{declare}~\code{target} directive}

\index{directives!begin declare target@\code{begin}~\code{declare}~\code{target}}
\index{begin declare target directive@\code{begin}~\code{declare}~\code{target} directive}

\index{clauses!link@\code{link}}
\index{link clause@\code{link} clause}

In the OpenMP 4.5 standard the declare target directive was extended to allow static
data to be mapped, \emph{when needed}, through a \code{link} clause.

Data storage for items listed in the \code{link} clause becomes available on the device
when it is mapped implicitly or explicitly in a \code{map} clause, and it persists for the scope of
the mapping (as specified by a \code{target} construct,
a \code{target}~\code{data} construct, or
\code{target}~\code{enter/exit}~\code{data} constructs).

Tip: When all the global data items will not fit on a device and are not needed
simultaneously, use the \code{link} clause and map the data only when it is needed.

The following C and Fortran examples show two sets of data (single precision and double precision)
that are global on the host for the entire execution on the host; but are only used
globally on the device for part of the program execution. The single precision data
are allocated and persist only for the first \code{target} region. Similarly, the
double precision data are in scope on the device only for the second \code{target} region.

\cexample[5.1]{declare_target}{6}
\ffreeexample[4.5]{declare_target}{6}


\subsection{Declare Target Directive with \code{device\_type} Clause}
\label{subsec:declare_target_device_type}

\index{directives!declare target@\code{declare}~\code{target}}
\index{declare target directive@\code{declare}~\code{target} directive}

\index{directives!begin declare target@\code{begin}~\code{declare}~\code{target}}
\index{begin declare target directive@\code{begin}~\code{declare}~\code{target} directive}

\index{clauses!device_type@\code{device\_type}}
\index{device_type clause@\code{device\_type} clause}

The \code{declare}~\code{target} directives apply to procedures to ensure that they can be executed or accessed on a device.
The \code{device\_type} clause specifies whether a version of the procedure or variable should be made available on the host, device or both.
This example uses  \code{nohost} for a procedure \plc{foo}. Only a device version of the procedure \plc{foo} is made available.
If the variant function \plc{foo\_onhost} is not specified for the host fallback execution, the call to \plc{foo} from the \code{target} region will result in a link time error due to the code generated for host execution of the target region.
This is because host symbol for the device routine \plc{foo} marked as \code{nohost} is not required to be present in the host environment.

\cexample[5.2]{declare_target}{7}
\ffreeexample[5.2]{declare_target}{7}