\pagebreak
\section{\kcode{apply} Clause}
\label{sec:apply_clause}
\index{unroll construct@\kcode{unroll} construct!apply clause@\kcode{apply} clause}
\index{tile construct@\kcode{tile} construct!apply clause@\kcode{apply} clause}
\index{apply clause@\kcode{apply} clause}
\index{clauses!apply@\kcode{apply}}

A loop transformation construct can be applied to another nested loop transformation
construct, but the application of the ``outer'' transformation is limited to the
outermost generated loop of the ``inner'' transformation.
The \kcode{apply} clause on a loop transformation construct can specify additional
loop transformation directives that apply to any of the generated loops, not only
the outermost one.
Clause modifiers are used to specify which generated loop to target.
Also, an applied directive within a clause may specify another \kcode{apply} clause.

%The \code{apply} clause on a loop transformation construct can specify (other)
%loop transformation directives to be applied to its transformation.
%Clause modifiers can be used to target specific generated loops, providing a mechanism
%to overcome the restriction of applying a transformation immediately to the next loop
%transformation construct. Also, an applied directive within a clause may be another
%\code{apply} clause.

Any nested loop transformation constructs, including any constructs that result from
\kcode{apply} clauses of nested constructs, are replaced before any enclosing
loop transformation construct.
This is referred to as the \plc{innermost-first order} here.

\subsection{Syntax and Effect}

In the example below, the \ucode{construct_unroll} and \ucode{apply_unroll} functions
illustrate the syntax for two equivalent means of applying the \kcode{unroll}
loop transformation directive to the outermost generated (grid) loop of the
\kcode{tile} transformation.
In function \ucode{construct_unroll}, the tile transformation creates the generated
(tiled) loops and then the \kcode{unroll} construct is applied to the outermost loop
of the replacement.
In the \ucode{apply_unroll} function, the \kcode{apply} clause on the \kcode{tile}
construct is used to apply an \kcode{unroll} transformation to the \plc{grid} loop
(the outermost loop of the tile transformation) as specified by the \kcode{grid} modifier.

\cexample[6.0]{apply_syntax}{1}
\ffreeexample[6.0]{apply_syntax}{1}

For the two functions in the previous example, the \ucode{equivalent} function
in the next example shows equivalent code that a user could have written
without using the \kcode{tile} construct or \kcode{apply} clause.

\cexample[5.1]{apply_syntax_equivalent}{1}
\ffreeexample[5.1]{apply_syntax_equivalent}{1}

The following example shows how multiple loop transformation directives can be
applied to different generated loops resulting from a loop transformation.
For the 4x4 \kcode{tile} construct there will be two (outer) \plc{grid} loops
and two (inner) \plc{intratile} loops.
The first \kcode{apply} clause specifies that an \kcode{interchange} directive and
a \kcode{nothing} directive (just a placeholder to indicate no directive application)
are applied to the two \plc{grid} (outermost) loops.
Directives, read from left to right, are applied to the \plc{grid} loops
from outermost to innermost, respectively.
The second \kcode{apply} clause specifies that \kcode{nothing} and \kcode{interchange}
directives are applied to the two \plc{intratile} loops, respectively.
Note that the \ucode{A} array dimensions are \ucode{A[100][100][3]} and
\ucode{A(0:2,0:99,0:99)} in the C/C++ and Fortran codes, respectively, to illustrate
equivalent sequential memory access for the \ucode{i}, \ucode{j} and \ucode{k} loops.
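
As a rough illustration of the loop structure described above, the following plain C
sketch writes out by hand the two \plc{grid} loops and the two \plc{intratile} loops
that a 4x4 tiling of a two-deep loop nest could produce, together with a variant in
which the two grid loops are interchanged.
The sketch is not taken from the example sources: the function names, the array
\ucode{order} and the sizes \ucode{N}, \ucode{M} and \ucode{TS} are hypothetical,
and the replacement code generated by an implementation may differ.

\begin{verbatim}
/* Hypothetical sizes: N and M are deliberately not multiples of TS,
   so the partial tiles at the edges are exercised too. */
enum { N = 10, M = 6, TS = 4 };

static int min_int(int a, int b) { return a < b ? a : b; }

/* Untiled reference nest: order[i][j] records when (i,j) is visited. */
int visit_plain(int order[N][M])
{
    int count = 0;
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < M; ++j)
            order[i][j] = count++;
    return count;
}

/* Hand-written 4x4 tiling: i1 and j1 are the grid loops,
   i2 and j2 are the intratile loops. */
int visit_tiled(int order[N][M])
{
    int count = 0;
    for (int i1 = 0; i1 < N; i1 += TS)                  /* grid loop 1 */
        for (int j1 = 0; j1 < M; j1 += TS)              /* grid loop 2 */
            for (int i2 = i1; i2 < min_int(i1 + TS, N); ++i2)
                for (int j2 = j1; j2 < min_int(j1 + TS, M); ++j2)
                    order[i2][j2] = count++;
    return count;
}

/* Same tiling with the two grid loops interchanged by hand;
   the intratile loops are left unchanged. */
int visit_tiled_grid_interchanged(int order[N][M])
{
    int count = 0;
    for (int j1 = 0; j1 < M; j1 += TS)                  /* was grid loop 2 */
        for (int i1 = 0; i1 < N; i1 += TS)              /* was grid loop 1 */
            for (int i2 = i1; i2 < min_int(i1 + TS, N); ++i2)
                for (int j2 = j1; j2 < min_int(j1 + TS, M); ++j2)
                    order[i2][j2] = count++;
    return count;
}
\end{verbatim}

All three functions visit every \ucode{(i,j)} pair exactly once; only the visiting
order changes, which is what makes the transformations interchangeable for loop
bodies without cross-iteration dependences.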
\index{interchange directive@\kcode{interchange} directive}
\index{directives!interchange@\kcode{interchange}}
\index{nothing directive@\kcode{nothing} directive}
\index{directives!nothing@\kcode{nothing}}

\cexample[6.0]{apply_syntax}{2}
\pagebreak
\ffreeexample[6.0]{apply_syntax}{2}

For the function in the previous example, the \ucode{equivalent} function
in the next example shows a possible equivalent tile replacement code
(\kcode{tile} generated loops) and the appropriately positioned
\kcode{interchange} and \kcode{nothing} directives.

\cexample[6.0]{apply_syntax_equivalent}{2}
\pagebreak
\ffreeexample[6.0]{apply_syntax_equivalent}{2}

\index{tile construct@\kcode{tile} construct!apply clause@\kcode{apply} clause}
\index{grid modifier@\kcode{grid} modifier}
\index{intratile modifier@\kcode{intratile} modifier}

The following example illustrates the use of \kcode{apply} clause modifiers with arguments.
The index of the generated loop, instead of a positional location in the directive list,
can be used to select the target of the applied directive.
The \kcode{grid(1)} modifier indicates the first grid loop generated by the
\kcode{tile} directive and the \kcode{intratile(2)} modifier indicates the
second intratile loop generated by the \kcode{tile} directive.

\cexample[6.0]{apply_syntax}{3}
\pagebreak
\ffreeexample[6.0]{apply_syntax}{3}

Without the index arguments, the \kcode{nothing} argument would be needed as a
placeholder, as illustrated by the following equivalent code for the above example.

\cexample[6.0]{apply_syntax_equivalent}{3}
\pagebreak
\ffreeexample[6.0]{apply_syntax_equivalent}{3}

\subsection{Spanning Loop Associations}

It is possible for a loop transformation directive to be applied to multiple
generated loops, and for multiple directives to be applied to the same generated loop.
The latter is illustrated in the following example.
\cexample[6.0]{apply_span}{1}
\ffreeexample[6.0]{apply_span}{1}

In the next example, the functions show, as equivalent user-written code,
the successive steps in applying the loop transformations of the previous example.
First, the tiling is applied in the \ucode{step1} function.
Next, loop transformations in the generated loop nest are replaced according
to the innermost-first order rule.
Applying the innermost transformation, loop reversal, results in the loop nest
in \ucode{step2}.
After that, the inner tile directive is applied in the \ucode{step3} function.

\index{reverse directive@\kcode{reverse} directive}
\index{directives!reverse@\kcode{reverse}}

\cexample[6.0]{apply_span_equivalent}{1}
\ffreeexample[6.0]{apply_span_equivalent}{1}

\subsection{Nested apply}

The following example illustrates how multiple loop transformations can be
chained by nesting \kcode{apply} clauses.
In the \ucode{nested_apply} function, a loop is first tiled, then the intratile
loop is unrolled, and finally the iteration order of the unrolled loop is reversed.
For C/C++ codes, reversing a loop whose index has an unsigned type may require
the compiler to ensure that underflow is handled correctly.

\cexample[6.0]{apply_nested}{1}
\ffreeexample[6.0]{apply_nested}{1}

In the next example, the \ucode{step1}, \ucode{step2} and \ucode{step3} functions
are all equivalent to the \ucode{nested_apply} function and illustrate a possible
chain of the transformations as a user might apply them manually.

\cexample[6.0]{apply_nested_equivalent}{1}
\ffreeexample[6.0]{apply_nested_equivalent}{1}
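
The remark above on unsigned loop indices can be made concrete with a small plain C
sketch that reverses a loop by hand rather than with the \kcode{reverse} directive.
The function names are hypothetical and are not part of the example sources.
The sketch only shows why a hand-written reversal needs care; it does not describe
how an implementation performs the transformation.

\begin{verbatim}
/* Forward traversal: stores the indices 0 .. n-1 in increasing order. */
void fill_forward(unsigned n, unsigned out[])
{
    unsigned pos = 0;
    for (unsigned i = 0; i < n; ++i)
        out[pos++] = i;
}

/* Hand-reversed traversal: stores the indices n-1 .. 0.
   A naive "for (unsigned i = n - 1; i >= 0; --i)" would never terminate,
   because an unsigned i is always >= 0, and n - 1 wraps around when n == 0.
   Testing before decrementing avoids both problems. */
void fill_reversed(unsigned n, unsigned out[])
{
    unsigned pos = 0;
    for (unsigned i = n; i-- > 0; )
        out[pos++] = i;
}
\end{verbatim}

With \ucode{n} equal to 4, \ucode{fill_reversed} stores 3, 2, 1, 0; with \ucode{n}
equal to 0, the loop body is never executed.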