mirror of
https://github.com/OpenMP/Examples.git
synced 2025-04-04 05:41:33 +01:00
\pagebreak
\section{\kcode{apply} Clause}
\label{sec:apply_clause}

\index{unroll construct@\kcode{unroll} construct!apply clause@\kcode{apply} clause}
\index{tile construct@\kcode{tile} construct!apply clause@\kcode{apply} clause}

\index{apply clause@\kcode{apply} clause}
\index{clauses!apply@\kcode{apply}}

A loop transformation construct can be applied to another loop transformation
construct that is nested within it, but the ``outer'' transformation is then
applied only to the outermost generated loop of the ``inner'' transformation.

The \kcode{apply} clause on a loop transformation construct can specify additional
loop transformation directives that are applied to its generated loops, including loops other than the outermost one.
Clause modifiers specify which generated loop to target.
An applied directive within the clause may itself specify another \kcode{apply} clause.
%The \code{apply} clause on a loop transformation construct can specify (other)
%loop transformation directives to be applied to its transformation.
%Clause modifiers can be used to target specific generated loops, providing a mechanism
%to overcome the restriction of applying a transformation immediately to the next loop
%transformation construct. Also, an applied directive within a clause may be another
%\code{apply} clause.

Any nested loop transformation constructs, including constructs that
result from \kcode{apply} clauses of nested constructs, are replaced before any enclosing
loop transformation construct. This section refers to that rule as the
\plc{innermost-first order}.
\subsection{Syntax and Effect}

In the example below, the \ucode{construct_unroll} and \ucode{apply_unroll} functions
illustrate the syntax for two equivalent means of applying the \kcode{unroll} loop transformation
directive to the outermost generated (grid) loop of the \kcode{tile} construct transformation.
In the \ucode{construct_unroll} function, the \kcode{tile} transformation creates the generated (tiled) loops,
and the \kcode{unroll} construct is then applied to the outermost loop of the replacement.
In the \ucode{apply_unroll} function, the \kcode{apply} clause on the \kcode{tile} construct
applies an \kcode{unroll} transformation to the \plc{grid} loop (the outermost loop
generated by the tile transformation), as specified by the \kcode{grid} modifier.
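
As a minimal sketch of the two forms just described, the directives might be
written as follows. The loop body, the array argument, the tile size and the
\ucode{USE_OMP_60} macro are illustrative assumptions and are not taken from
the example files. Because the \kcode{apply} clause is an OpenMP 6.0 feature,
the directives are guarded by the \kcode{_OPENMP} version macro (202411 for
OpenMP 6.0); without them the loops simply run untransformed.

\begin{verbatim}
#include <assert.h>

/* Assumption: enable the directives only for OpenMP 6.0 or later. */
#if defined(_OPENMP) && _OPENMP >= 202411
#define USE_OMP_60 1
#else
#define USE_OMP_60 0
#endif

/* Form 1: unroll is a separate construct applied to the tile
   construct, so it acts on the outermost generated (grid) loop. */
void construct_unroll(int n, double a[])
{
    assert(n >= 0);
#if USE_OMP_60
    #pragma omp unroll partial(2)
    #pragma omp tile sizes(4)
#endif
    for (int i = 0; i < n; i++)
        a[i] = 2.0 * a[i];   /* hypothetical loop body */
}

/* Form 2: the apply clause on tile names the grid loop directly. */
void apply_unroll(int n, double a[])
{
    assert(n >= 0);
#if USE_OMP_60
    #pragma omp tile sizes(4) apply(grid: unroll partial(2))
#endif
    for (int i = 0; i < n; i++)
        a[i] = 2.0 * a[i];   /* same hypothetical loop body */
}
\end{verbatim}

Both functions are intended to compute the same result for the same input;
only the way the \kcode{unroll} transformation is attached to the grid loop differs.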
\cexample[6.0]{apply_syntax}{1}
\ffreeexample[6.0]{apply_syntax}{1}

The \ucode{equivalent} function in the next example shows code,
equivalent to the two functions in the previous example,
that a user could have written without using the \kcode{tile} construct
or the \kcode{apply} clause.
\cexample[5.1]{apply_syntax_equivalent}{1}
\ffreeexample[5.1]{apply_syntax_equivalent}{1}

The following example shows how multiple loop transformation directives
can be applied to different generated loops resulting from a loop transformation.
The 4x4 \kcode{tile} construct generates two (outer) \plc{grid} loops and two (inner) \plc{intratile} loops.
The first \kcode{apply} clause specifies that an \kcode{interchange} directive and a \kcode{nothing} directive
(a placeholder that indicates no directive application) are applied to the two \plc{grid} loops, which are the two outermost generated loops.
Directives, read from left to right, are applied to the \plc{grid} loops from outermost to innermost, respectively.
The second \kcode{apply} clause specifies that \kcode{nothing} and \kcode{interchange} directives are applied,
respectively, to the two \plc{intratile} loops.
Note that the \ucode{A} array is dimensioned as \ucode{A[100][100][3]} in the C/C++ code and as \ucode{A(0:2,0:99,0:99)}
in the Fortran code, so that the \ucode{i}, \ucode{j} and \ucode{k} loops
have equivalent sequential memory access in both languages.
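
To visualize which loops the \plc{grid} and \plc{intratile} modifiers refer to,
the following sketch writes out by hand one possible form of the loops that a
4x4 tiling of the \ucode{i} and \ucode{j} loops generates. The function names,
the loop body and the loop variables \ucode{i1}, \ucode{j1}, \ucode{i2} and
\ucode{j2} are illustrative assumptions, not the content of the example files.
The comments only label the generated loops; no transformation directive is
applied in this sketch.

\begin{verbatim}
#include <assert.h>

enum { N = 100, K = 3, TILE = 4 };   /* N is a multiple of TILE */

/* Original loop nest; the value stored is a hypothetical body. */
void untiled(double A[N][N][K])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < K; k++)
                A[i][j][k] = (double)((i * N + j) * K + k);
}

/* Hand-written 4x4 tiling of the i and j loops. */
void tiled_by_hand(double A[N][N][K])
{
    assert(N % TILE == 0);   /* keeps the sketch free of remainder tiles */
    for (int i1 = 0; i1 < N; i1 += TILE)         /* grid loop 1 */
      for (int j1 = 0; j1 < N; j1 += TILE)       /* grid loop 2 */
        for (int i2 = i1; i2 < i1 + TILE; i2++)  /* intratile loop 1 */
          for (int j2 = j1; j2 < j1 + TILE; j2++) /* intratile loop 2 */
            for (int k = 0; k < K; k++)          /* not tiled */
                A[i2][j2][k] = (double)((i2 * N + j2) * K + k);
}
\end{verbatim}

In this picture, the directives listed in an \kcode{apply} clause with the
\kcode{grid} modifier correspond, in order, to the loops labeled grid loop 1 and
grid loop 2, and those listed with the \kcode{intratile} modifier correspond to
intratile loop 1 and intratile loop 2.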
\index{interchange directive@\kcode{interchange} directive}
\index{directives!interchange@\kcode{interchange}}
\index{nothing directive@\kcode{nothing} directive}
\index{directives!nothing@\kcode{nothing}}

\cexample[6.0]{apply_syntax}{2}
\pagebreak
\ffreeexample[6.0]{apply_syntax}{2}

For the function in the previous example,
the \ucode{equivalent} function in the next example shows one possible
tile replacement code (the loops generated by \kcode{tile}) with the
\kcode{interchange} and \kcode{nothing} directives placed on the corresponding generated loops.
\cexample[6.0]{apply_syntax_equivalent}{2}
\pagebreak
\ffreeexample[6.0]{apply_syntax_equivalent}{2}

\index{tile construct@\kcode{tile} construct!apply clause@\kcode{apply} clause}
\index{grid modifier@\kcode{grid} modifier}
\index{intratile modifier@\kcode{intratile} modifier}

The following example illustrates the use of \kcode{apply} clause
modifiers with an argument. The argument gives the index of the generated loop
to which the applied directive is applied, instead of relying on the position
of the directive in the list.
The \kcode{grid(1)} modifier indicates the first \plc{grid} loop
generated by the \kcode{tile} directive,
and the \kcode{intratile(2)} modifier indicates the second \plc{intratile} loop
generated by the \kcode{tile} directive.
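
The difference between the indexed and the positional form can be sketched as
follows. The choice of \kcode{interchange} as the applied directive, the loop
body, the function names and the \ucode{USE_OMP_60} macro are assumptions made
for illustration and are not taken from the example files. The directives are
guarded by the \kcode{_OPENMP} version macro so that the code still compiles,
and runs untransformed, when OpenMP 6.0 is not enabled.

\begin{verbatim}
#include <assert.h>

enum { N = 100, K = 3 };

/* Assumption: enable the directives only for OpenMP 6.0 or later. */
#if defined(_OPENMP) && _OPENMP >= 202411
#define USE_OMP_60 1
#else
#define USE_OMP_60 0
#endif

/* Indexed form: each modifier names its target loop explicitly. */
void apply_indexed(double A[N][N][K])
{
    assert(A != 0);
#if USE_OMP_60
    #pragma omp tile sizes(4,4) \
                apply(grid(1): interchange) \
                apply(intratile(2): interchange)
#endif
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < K; k++)
                A[i][j][k] += 1.0;   /* order-independent body */
}

/* Positional form: nothing fills the positions that are skipped. */
void apply_positional(double A[N][N][K])
{
    assert(A != 0);
#if USE_OMP_60
    #pragma omp tile sizes(4,4) \
                apply(grid: interchange, nothing) \
                apply(intratile: nothing, interchange)
#endif
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < K; k++)
                A[i][j][k] += 1.0;   /* same body */
}
\end{verbatim}

The loop body is deliberately independent of the iteration order, so both
functions produce the same array contents whether or not the directives take effect.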
\cexample[6.0]{apply_syntax}{3}
\pagebreak
\ffreeexample[6.0]{apply_syntax}{3}

Without the index arguments, the \kcode{nothing} directive would
be needed as a placeholder, as the following equivalent code
for the above example illustrates.
\cexample[6.0]{apply_syntax_equivalent}{3}
\pagebreak
\ffreeexample[6.0]{apply_syntax_equivalent}{3}

\subsection{Spanning Loop Associations}

A loop transformation directive can be applied to multiple generated loops,
and multiple directives can be applied to the same generated loop.
The latter is illustrated in the following example.
\cexample[6.0]{apply_span}{1}
\ffreeexample[6.0]{apply_span}{1}

In the next example, the functions show, as equivalent user-written code,
the successive steps in which the loop transformations of the previous example are applied.
First, the tiling is applied in the \ucode{step1} function.
Next, the loop transformations in the generated loop nest are replaced according to the \plc{innermost-first order} rule.
Applying the innermost transformation, loop reversal, results in the loop nest in the \ucode{step2} function.
After that, the inner \kcode{tile} directive is applied in the \ucode{step3} function.
\index{reverse directive@\kcode{reverse} directive}
\index{directives!reverse@\kcode{reverse}}

\cexample[6.0]{apply_span_equivalent}{1}
\ffreeexample[6.0]{apply_span_equivalent}{1}
\subsection{Nested apply}

The following example illustrates how multiple loop transformations can be chained by nesting \kcode{apply} clauses.
In the \ucode{nested_apply} function, a loop is first tiled, then the \plc{intratile}
loop is unrolled, and finally the iteration order of the unrolled loop is reversed.
For C/C++ codes, reversing a loop whose index has an unsigned type may require the compiler
to ensure that wraparound of the index below zero (underflow) is handled correctly.
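
The underflow concern also arises in hand-written code. An unsigned index
never becomes negative; decrementing it at zero wraps around to a large value,
so a reversed loop cannot rely on a test for a non-negative index to terminate.
The following sketch, with hypothetical function names, shows a forward loop
and one common way to write its reversal so that the index is never decremented
below zero.

\begin{verbatim}
#include <assert.h>
#include <stddef.h>

/* Forward loop: records the indices 0, 1, ..., n-1 in out[]. */
size_t visit_forward(unsigned n, unsigned out[])
{
    size_t p = 0;
    assert(out != NULL || n == 0);
    for (unsigned i = 0; i < n; i++)
        out[p++] = i;
    return p;   /* number of iterations executed */
}

/* Reversed loop: records the indices n-1, ..., 1, 0 in out[].
   The counter i runs from n down to 1 and the body uses i - 1,
   so i is tested against zero before it is ever decremented. */
size_t visit_reversed(unsigned n, unsigned out[])
{
    size_t p = 0;
    assert(out != NULL || n == 0);
    for (unsigned i = n; i > 0; i--)
        out[p++] = i - 1;
    return p;   /* number of iterations executed */
}
\end{verbatim}

A compiler that reverses a loop with an unsigned index has to preserve the same
property in the code that it generates.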
\cexample[6.0]{apply_nested}{1}
\ffreeexample[6.0]{apply_nested}{1}

In the next example, the \ucode{step1}, \ucode{step2} and \ucode{step3}
functions are all equivalent to the \ucode{nested_apply} function; they illustrate
a possible chain of transformations as a user would perform them manually.
\cexample[6.0]{apply_nested_equivalent}{1}
\ffreeexample[6.0]{apply_nested_equivalent}{1}