2024-04-16 08:59:23 -07:00

103 lines
4.4 KiB
TeX

%\pagebreak
\section{\kcode{collapse} Clause}
\label{sec:collapse}
\index{clauses!collapse@\kcode{collapse}}
\index{collapse clause@\kcode{collapse} clause}
In the following example, the \ucode{k} and \ucode{j} loops are associated with
the worksharing-loop construct. So the iterations of the \ucode{k} and \ucode{j} loops are
collapsed into one loop with a larger iteration space, and that loop is then divided
among the threads in the current team. Since the \ucode{i} loop is not associated
with the worksharing-loop construct, it is not collapsed, and the \ucode{i} loop is executed
sequentially in its entirety in every iteration of the collapsed \ucode{k} and
\ucode{j} loop.
The variable \ucode{j} can be omitted from the \kcode{private} clause when the
\kcode{collapse} clause is used since it is implicitly private. However, if the
\kcode{collapse} clause is omitted then \ucode{j} will be shared if it is omitted
from the \kcode{private} clause. In either case, \ucode{k} is implicitly private
and could be omitted from the \kcode{private} clause.
\cexample[3.0]{collapse}{1}
\fexample[3.0]{collapse}{1}
In the next example, the \ucode{k} and \ucode{j} loops are associated with the
worksharing-loop construct. So the iterations of the \ucode{k} and \ucode{j} loops are collapsed
into one loop with a larger iteration space, and that loop is then divided among
the threads in the current team.
The sequential execution of the iterations in the \ucode{k} and \ucode{j} loops
determines the order of the iterations in the collapsed iteration space. This implies
that in the sequentially last iteration of the collapsed iteration space, \ucode{k}
will have the value \ucode{2} and \ucode{j} will have the value \ucode{3}. Since
\ucode{klast} and \ucode{jlast} are \kcode{lastprivate}, their values are assigned
by the sequentially last iteration of the collapsed \ucode{k} and \ucode{j} loop.
This example prints: \ucode{2 3}.
\cexample[3.0]{collapse}{2}
\fexample[3.0]{collapse}{2}
\index{clauses!collapse@\kcode{collapse}}
\index{collapse clause@\kcode{collapse} clause}
\index{clauses!ordered@\kcode{ordered}}
\index{ordered clause@\kcode{ordered} clause}
The next example illustrates the interaction of the \kcode{collapse} and \kcode{ordered}
clauses.
In the example, the worksharing-loop construct has both a \kcode{collapse} clause and an \kcode{ordered}
clause. The \kcode{collapse} clause causes the iterations of the \ucode{k} and
\ucode{j} loops to be collapsed into one loop with a larger iteration space, and
that loop is divided among the threads in the current team. An \kcode{ordered}
clause is added to the worksharing-loop construct because an ordered region binds to the loop
region arising from the worksharing-loop construct.
According to the \docref{\kcode{ordered} Construct} section of the OpenMP 4.0 specification,
a thread must not execute more than one ordered region that binds
to the same loop region. So the \kcode{collapse} clause is required for the example
to be conforming. With the \kcode{collapse} clause, the iterations of the \ucode{k}
and \ucode{j} loops are collapsed into one loop, and therefore only one ordered
region will bind to the collapsed \ucode{k} and \ucode{j} loop. Without the \kcode{collapse}
clause, there would be two ordered regions that bind to each iteration of the \ucode{k}
loop (one arising from the first iteration of the \ucode{j} loop, and the other
arising from the second iteration of the \ucode{j} loop).
\pagebreak
The code prints
\pout{0 1 1}
\\
\pout{0 1 2}
\\
\pout{0 2 1}
\\
\pout{1 2 2}
\\
\pout{1 3 1}
\\
\pout{1 3 2}
\cexample[3.0]{collapse}{3}
\fexample[3.0]{collapse}{3}
%\clearpage
\index{non-rectangular loop nest}
The following example illustrates the collapse of a non-rectangular loop nest,
a new feature in OpenMP 5.0. In a loop nest, a non-rectangular loop has a
loop bound that references the iteration variable of an enclosing loop.
The motivation for this feature is illustrated
in the example below that creates a symmetric correlation matrix for a set of
variables. Note that the initial value of the second loop depends on the index
variable of the first loop for the loops to be collapsed.
Here the data are represented by a 2D array, each row corresponds to a variable
and each column corresponds to a sample of the variable -- the last two columns
are the sample mean and standard deviation (for Fortran, rows and columns are swapped).
\cexample[5.0]{collapse}{4}
\ffreeexample[5.0]{collapse}{4}