%\pagebreak
\section{\kcode{collapse} Clause}
\label{sec:collapse}
\index{clauses!collapse@\kcode{collapse}}
\index{collapse clause@\kcode{collapse} clause}

In the following example, the \ucode{k} and \ucode{j} loops are associated with
the worksharing-loop construct. The iterations of the \ucode{k} and \ucode{j} loops
are therefore collapsed into one loop with a larger iteration space, and that loop
is then divided among the threads in the current team. Because the \ucode{i} loop
is not associated with the worksharing-loop construct, it is not collapsed; it is
executed sequentially in its entirety in every iteration of the collapsed \ucode{k}
and \ucode{j} loop.

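
The collapsing described above can be sketched by flattening the two loops by hand.
This is only an illustration, not the example source: it assumes unit-stride,
zero-based loops, and the bounds \ucode{NK}, \ucode{NJ}, \ucode{NI} and both function
names are hypothetical. Logical iteration \ucode{n} of the collapsed loop corresponds
to \ucode{k = n / NJ} and \ucode{j = n \% NJ}, while the \ucode{i} loop runs
sequentially inside each logical iteration.

```c
enum { NK = 4, NJ = 3, NI = 2 };   /* hypothetical bounds, for illustration only */

/* Nested form: k and j are collapsed; i runs sequentially inside each (k, j). */
void fill_collapsed(int out[NK][NJ])
{
    int i, j, k;
    #pragma omp parallel for collapse(2) private(i)
    for (k = 0; k < NK; k++)
        for (j = 0; j < NJ; j++) {
            int sum = 0;
            for (i = 0; i < NI; i++)        /* not collapsed: sequential per (k, j) */
                sum += 100 * k + 10 * j + i;
            out[k][j] = sum;
        }
}

/* Hand-flattened equivalent: a single loop over NK*NJ logical iterations. */
void fill_flattened(int out[NK][NJ])
{
    int n;
    #pragma omp parallel for
    for (n = 0; n < NK * NJ; n++) {
        int k = n / NJ;                     /* outer index recovered from n */
        int j = n % NJ;                     /* inner index recovered from n */
        int sum = 0;
        for (int i = 0; i < NI; i++)
            sum += 100 * k + 10 * j + i;
        out[k][j] = sum;
    }
}
```

Both functions should fill identical arrays, with or without OpenMP enabled, because
each element of \ucode{out} is written by exactly one logical iteration.
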

When the \kcode{collapse} clause is used, the variable \ucode{j} can be omitted
from the \kcode{private} clause because it is implicitly private. However, if the
\kcode{collapse} clause is omitted, then \ucode{j} is shared unless it is listed
in the \kcode{private} clause. In either case, \ucode{k} is implicitly private
and could be omitted from the \kcode{private} clause.

\cexample[3.0]{collapse}{1}

\fexample[3.0]{collapse}{1}

In the next example, the \ucode{k} and \ucode{j} loops are associated with the
worksharing-loop construct. So the iterations of the \ucode{k} and \ucode{j} loops are collapsed
into one loop with a larger iteration space, and that loop is then divided among
the threads in the current team.

The sequential execution of the iterations in the \ucode{k} and \ucode{j} loops
determines the order of the iterations in the collapsed iteration space. This implies
that in the sequentially last iteration of the collapsed iteration space, \ucode{k}
will have the value \ucode{2} and \ucode{j} will have the value \ucode{3}. Since
\ucode{klast} and \ucode{jlast} are \kcode{lastprivate}, their values are assigned
by the sequentially last iteration of the collapsed \ucode{k} and \ucode{j} loop.
This example prints: \ucode{2 3}.

\cexample[3.0]{collapse}{2}

\fexample[3.0]{collapse}{2}

\index{clauses!collapse@\kcode{collapse}}
\index{collapse clause@\kcode{collapse} clause}
\index{clauses!ordered@\kcode{ordered}}
\index{ordered clause@\kcode{ordered} clause}
The next example illustrates the interaction of the \kcode{collapse} and \kcode{ordered}
clauses.

In the example, the worksharing-loop construct has both a \kcode{collapse} clause and
an \kcode{ordered} clause. The \kcode{collapse} clause causes the iterations of the
\ucode{k} and \ucode{j} loops to be collapsed into one loop with a larger iteration
space, and that loop is divided among the threads in the current team. The
\kcode{ordered} clause is needed on the worksharing-loop construct because the ordered
region in the loop body binds to the loop region arising from that construct.

According to the \docref{\kcode{ordered} Construct} section of the OpenMP 4.0 specification,
during execution of an iteration of a loop or loop nest within a loop region,
a thread must not execute more than one ordered region that binds
to the same loop region. So the \kcode{collapse} clause is required for the example
to be conforming. With the \kcode{collapse} clause, the iterations of the \ucode{k}
and \ucode{j} loops are collapsed into one loop, and therefore only one ordered
region binds to each iteration of the collapsed \ucode{k} and \ucode{j} loop.
Without the \kcode{collapse} clause, there would be two ordered regions that bind
to each iteration of the \ucode{k} loop (one arising from the first iteration of
the \ucode{j} loop, and the other arising from the second iteration of the
\ucode{j} loop).

\pagebreak
The code prints

\pout{0 1 1}
\\
\pout{0 1 2}
\\
\pout{0 2 1}
\\
\pout{1 2 2}
\\
\pout{1 3 1}
\\
\pout{1 3 2}

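
The thread numbers in this output can be reproduced with a small sequential sketch.
It assumes that the example runs on two threads with \kcode{schedule(static,3)},
which is consistent with the output above but is not stated in this text; the
constants and function names are hypothetical. Under a static schedule, chunks of
three logical iterations are assigned to the threads round-robin, and the ordered
region forces the lines to be printed in sequential iteration order.

```c
#include <stdio.h>

/* Assumed settings: 2 threads, chunk size 3, k = 1..3, j = 1..2. */
enum { NUM_THREADS = 2, CHUNK = 3, NK = 3, NJ = 2 };

/* Thread that a static schedule with chunk size CHUNK assigns to logical
   iteration n: chunks are dealt out round-robin in thread-number order. */
int owner_of(int n)
{
    return (n / CHUNK) % NUM_THREADS;
}

/* Prints "thread k j" for every logical iteration of the collapsed loop,
   in the sequential order that the ordered region enforces. */
void print_expected(void)
{
    for (int n = 0; n < NK * NJ; n++) {
        int k = n / NJ + 1;   /* outer index, 1..NK */
        int j = n % NJ + 1;   /* inner index, 1..NJ */
        printf("%d %d %d\n", owner_of(n), k, j);
    }
}
```

Calling \ucode{print\_expected()} lists logical iterations 0 to 2 under thread 0 and
iterations 3 to 5 under thread 1, matching the six lines shown above.
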

\cexample[3.0]{collapse}{3}

\fexample[3.0]{collapse}{3}
%\clearpage

\index{non-rectangular loop nest}
The following example illustrates the collapse of a non-rectangular loop nest,
a feature introduced in OpenMP 5.0. In a loop nest, a non-rectangular loop has a
loop bound that references the iteration variable of an enclosing loop.

The motivation for this feature is illustrated
in the example below, which creates a symmetric correlation matrix for a set of
variables. Note that, among the loops to be collapsed, the initial value of the
second loop's iteration variable depends on the index variable of the first loop.
Here the data are represented by a 2D array: each row corresponds to a variable
and each column corresponds to a sample of the variable, except for the last two
columns, which hold the sample mean and standard deviation (for Fortran, rows and
columns are swapped).

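
The shape of such a nest can be sketched in a few lines. This is a simplified
stand-in, not the example source: the size \ucode{M}, the function names, and the
product used in place of the correlation value are all hypothetical. The inner
lower bound \ucode{j = i} makes the nest non-rectangular, so only the upper triangle
is visited and the lower triangle is filled by symmetry. Compiling it with OpenMP
enabled requires an implementation that supports the OpenMP 5.0 feature.

```c
enum { M = 4 };   /* hypothetical matrix size, for illustration only */

/* Fills a symmetric matrix by visiting only the upper triangle (j >= i).
   The inner lower bound depends on i, so the collapsed nest is non-rectangular. */
void fill_symmetric(int b[M][M])
{
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < M; i++)
        for (int j = i; j < M; j++) {
            b[i][j] = (i + 1) * (j + 1);   /* stand-in for the correlation value */
            b[j][i] = b[i][j];             /* mirror into the lower triangle */
        }
}

/* Number of logical iterations in the collapsed triangular iteration space. */
int triangular_iterations(void)
{
    return M * (M + 1) / 2;
}
```

Each pair \ucode{(i, j)} writes a distinct pair of elements, so the iterations are
independent. The collapsed space has \ucode{M * (M + 1) / 2} logical iterations
rather than the \ucode{M * M} of a rectangular nest.
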

\cexample[5.0]{collapse}{4}

\ffreeexample[5.0]{collapse}{4}