mirror of
https://github.com/OpenMP/Examples.git
synced 2025-04-23 14:21:23 +01:00
155 lines
8.1 KiB
TeX
155 lines
8.1 KiB
TeX
\pagebreak
|
|
\section{Metadirectives}
|
|
\label{sec:metadirective}
|
|
\index{directives!metadirective@\code{metadirective}}
|
|
\index{metadirective directive@\code{metadirective} directive}
|
|
|
|
\index{metadirective directive@\code{metadirective} directive!when clause@\code{when} clause}
|
|
\index{metadirective directive@\code{metadirective} directive!otherwise clause@\code{otherwise} clause}
|
|
\index{clauses!when@\code{when}}
|
|
\index{when clause@\code{when} clause}
|
|
\index{clauses!otherwise@\code{otherwise}}
|
|
\index{otherwise clause@\code{otherwise} clause}
|
|
A \code{metadirective} directive provides a mechanism to select a directive in
|
|
a \code{when} clause to be used, depending upon one or more contexts:
|
|
implementation, available devices and the present enclosing construct.
|
|
The directive in an \code{otherwise} clause is used when a directive of the
|
|
\code{when} clause is not selected.
|
|
|
|
\index{context selector!construct@\plc{construct}}
|
|
In the \code{when} clause the \plc{context selector} (or just \plc{selector}) defines traits that are
|
|
evaluated for selection of the directive that follows the selector.
|
|
This ``selectables'' directive is called a \plc{directive variant}.
|
|
Traits are grouped by \plc{construct}, \plc{implementation} and
|
|
\plc{device} \plc{sets} to be used by a selector of the same name.
|
|
|
|
\index{context selector!device@\plc{device}}
|
|
In the first example the architecture trait \plc{arch} of the
|
|
\plc{device} selector set specifies that if an \plc{nvptx} architecture is
|
|
active in the OpenMP context, then the \code{teams}~\code{loop}
|
|
\plc{directive variant} is selected as the directive; otherwise, the \code{parallel}~\code{loop}
|
|
\plc{directive variant} of the \code{otherwise} clause is selected as the directive.
|
|
That is, if a \plc{device} of \plc{nvptx} architecture is supported by the implementation within
|
|
the enclosing \code{target} construct, its \plc{directive variant} is selected.
|
|
The architecture names, such as \plc{nvptx}, are implementation defined.
|
|
Also, note that \plc{device} as used in a \code{target} construct specifies
|
|
a device number, while \plc{device}, as used in the \code{metadirective}
|
|
directive as selector set, has traits of \plc{kind}, \plc{isa} and \plc{arch}.
|
|
|
|
|
|
\cexample[5.2]{metadirective}{1}
|
|
|
|
\ffreeexample[5.2]{metadirective}{1}
|
|
|
|
%\pagebreak
|
|
\index{context selector!implementation@\plc{implementation}}
|
|
In the second example, the \plc{implementation} selector set is specified
|
|
in the \code{when} clause to distinguish between platforms.
|
|
Additionally, specific architectures are specified with the \plc{device}
|
|
selector set.
|
|
|
|
In the code, different \code{teams} constructs are employed as determined
|
|
by the \code{metadirective} directive.
|
|
The number of teams is restricted by a \code{num\_teams} clause
|
|
and a thread limit is also set by a \code{thread\_limit} clause for
|
|
\plc{vendor} platforms and specific architecture
|
|
traits. Otherwise, just the \code{teams} construct is used without
|
|
any clauses, as prescribed by the \code{otherwise} clause.
|
|
|
|
|
|
\cexample[5.2]{metadirective}{2}
|
|
|
|
\ffreeexample[5.2]{metadirective}{2}
|
|
\clearpage
|
|
|
|
\index{context selector!construct@\plc{construct}}
|
|
|
|
\index{directives!declare target@\code{declare}~\code{target}}
|
|
\index{declare target directive@\code{declare}~\code{target} directive}
|
|
|
|
\index{directives!begin declare target@\code{begin}~\code{declare}~\code{target}}
|
|
\index{begin declare target directive@\code{begin}~\code{declare}~\code{target} directive}
|
|
|
|
In the third example, a \plc{construct} selector set is specified in the \code{when} clause.
|
|
Here, a \code{metadirective} directive is used within a function that is also
|
|
compiled as a function for a target device as directed by a declare target directive.
|
|
The \plc{target} directive name of the \code{construct} selector ensures that the
|
|
\code{distribute}~\code{parallel}~\code{for/do} construct is employed for the target compilation.
|
|
Otherwise, for the host-compiled version the \code{parallel}~\code{for/do}~\code{simd} construct is used.
|
|
|
|
In the first call to the \plc{exp\_pi\_diff()} routine the context is a
|
|
\code{target}~\code{teams} construct and the \code{distribute}~\code{parallel}~\code{for/do}
|
|
construct version of the function is invoked,
|
|
while in the second call the \code{parallel}~\code{for/do}~\code{simd} construct version is used.
|
|
|
|
%%%%%%%%
|
|
This case illustrates an important point for users that may want to hoist the
|
|
\code{target} directive out of a function that contains the usual
|
|
\code{target}~\code{teams}~\code{distribute}~\code{parallel}~\code{for/do} construct
|
|
(for providing alternate constructs through the \code{metadirective} directive as here).
|
|
While this combined construct can be decomposed into a \code{target} and
|
|
\code{teams distribute parallel for/do} constructs, the OpenMP 5.0 specification has the restriction:
|
|
``If a \code{teams} construct is nested within a \code{target} construct, that \code{target} construct must
|
|
contain no statements, declarations or directives outside of the \code{teams} construct''.
|
|
So, the \code{teams} construct must immediately follow the \code{target} construct without any intervening
|
|
code statements (which includes function calls).
|
|
Since the \code{target} construct alone cannot be hoisted out of a function,
|
|
the \code{target}~\code{teams} construct has been hoisted out of the function, and the
|
|
\code{distribute}~\code{parallel}~\code{for/do} construct is used
|
|
as the \plc{variant} directive of the \code{metadirective} directive within the function.
|
|
%%%%%%%%
|
|
|
|
\cexample[5.2]{metadirective}{3}
|
|
|
|
\ffreeexample[5.2]{metadirective}{3}
|
|
|
|
\index{context selector!user@\plc{user}}
|
|
\index{context selector!condition selector@\code{condition} selector}
|
|
The \code{user} selector set can be used in a metadirective
|
|
to select directives at execution time when the
|
|
\code{condition(}~\plc{boolean-expr}~\code{)} selector expression is not a constant expression.
|
|
In this case it is a \plc{dynamic} trait set, and the selection is made at run time, rather
|
|
than at compile time.
|
|
|
|
In the following example the \plc{foo} function employs the \code{condition}
|
|
selector to choose a device for execution at run time.
|
|
In the \plc{bar} routine metadirectives are nested.
|
|
At the outer level a selection between serial and parallel execution in performed
|
|
at run time, followed by another run time selection on the schedule kind in the inner
|
|
level when the active \plc{construct} trait is \code{parallel}.
|
|
|
|
(Note, the variable \plc{b} in two of the ``selected'' constructs is declared private for the sole purpose
|
|
of detecting and reporting that the construct is used. Since the variable is private, its value
|
|
is unchanged outside of the construct region, whereas it is changed if the ``unselected'' construct
|
|
is used.)
|
|
|
|
%(Note: The value of \plc{b} after the \code{parallel} region remains 0 for the
|
|
%\code{guided} scheduling case, because its \code{parallel} construct also contains
|
|
%the \code{private(}~\plc{b}~\code{)} clause.
|
|
%The variable \plc{b} is employed for the sole purpose of distinguishing which
|
|
%\code{parallel} construct is selected-- for testing.)
|
|
|
|
%While there might be other ways to make these decisions at run time, such as using
|
|
%an \code{if} clause on a \code{parallel} construct, this mechanism is much more general.
|
|
%For instance, an input ``gpu\_type'' string could be used and tested in boolean expressions
|
|
%to select from one of several possible \code{target} constructs.
|
|
%Also, setting the scheduling variable (\plc{unbalanced}) within the execution through a
|
|
%``work balance'' function might be a more practical approach for setting the schedule kind.
|
|
|
|
|
|
\cexample[5.2]{metadirective}{4}
|
|
|
|
\ffreeexample[5.2]{metadirective}{4}
|
|
|
|
Metadirectives can be used in conjunction with templates as shown in the C++ code below.
|
|
Here the template definition generates two versions of the Fibonacci function.
|
|
The \splc{tasking} boolean is used in the \scode{condition} selector to enable tasking.
|
|
The true form implements a parallel version with \scode{task} and \scode{taskwait}
|
|
constructs as in the \splc{tasking.4.c} code in Section~\ref{sec:task_taskwait}.
|
|
The false form implements a serial version without any tasking constructs.
|
|
Note that the serial version is used in the parallel function for optimally
|
|
processing numbers less than 8.
|
|
|
|
\cppexample[5.0]{metadirective}{5}
|
|
|