%\pagebreak
\section{Internal Control Variables (ICVs)}
\label{sec:icv}
\index{internal control variables}
According to the \docref{Internal Control Variables} section of the OpenMP 4.0 specification, an OpenMP implementation must act as if there are ICVs that control
the behavior of the program. This example illustrates two ICVs, \plc{nthreads-var}
and \plc{max-active-levels-var}. The \plc{nthreads-var} ICV controls the
number of threads requested for encountered parallel regions; there is one copy
of this ICV per task. The \plc{max-active-levels-var} ICV controls the maximum
number of nested active parallel regions; there is one copy of this ICV for the
whole program.

In the following example, the \plc{nest-var}, \plc{max-active-levels-var},
\plc{dyn-var}, and \plc{nthreads-var} ICVs are modified through calls to
the runtime library routines \kcode{omp_set_nested}, \kcode{omp_set_max_active_levels}, \kcode{omp_set_dynamic}, and \kcode{omp_set_num_threads}, respectively. These ICVs
affect the operation of \kcode{parallel} regions. Each implicit task generated
by a \kcode{parallel} region has its own copy of the \plc{nest-var}, \plc{dyn-var},
and \plc{nthreads-var} ICVs.

In this example, the new value of \plc{nthreads-var} applies only to
the implicit tasks that execute the call to \kcode{omp_set_num_threads}. There
is one copy of the \plc{max-active-levels-var} ICV for the whole program and
its value is the same for all tasks. This example assumes that nested parallelism
is supported.

The outer \kcode{parallel} region creates a team of two threads; each of the threads
will execute one of the two implicit tasks generated by the outer \kcode{parallel}
region.

Each implicit task generated by the outer \kcode{parallel} region calls \kcode{omp_set_num_threads(\ucode{3})},
assigning the value 3 to its respective copy of \plc{nthreads-var}. Then each
implicit task encounters an inner \kcode{parallel} region that creates a team
of three threads; each of the threads will execute one of the three implicit tasks
generated by that inner \kcode{parallel} region.

Since the outer \kcode{parallel} region is executed by two threads, and each inner
region by three, there will be a total of six implicit tasks generated by the two inner \kcode{parallel}
regions.

Each implicit task generated by an inner \kcode{parallel} region will execute
the call to \kcode{omp_set_num_threads(\ucode{4})}, assigning the value 4 to its respective
copy of \plc{nthreads-var}.

The print statement in the outer \kcode{parallel} region is executed by only one
of the threads in the team, so it is executed only once.

The print statement in an inner \kcode{parallel} region is also executed by only
one of the threads in the team. Since there are two inner \kcode{parallel}
regions in total, the print statement is executed twice, once per inner \kcode{parallel}
region.
\pagebreak
\cexample{icv}{1}
\fexample{icv}{1}
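
The task and print counts described above can be checked with a small model in plain C. This is only a sketch: it does not call the OpenMP runtime, the names \ucode{task_model}, \ucode{icv_trace}, and \ucode{model_icv_example} are hypothetical, and it assumes that every requested team size is granted and that the first thread of each team is the one that executes the print statement.

```c
#include <assert.h>

/* Hypothetical model of the example above; none of these names are OpenMP API. */
typedef struct {
    int nthreads_var;          /* this task's own copy of nthreads-var */
} task_model;

typedef struct {
    int inner_tasks;           /* implicit tasks generated by all inner regions */
    int outer_prints;          /* executions of the outer print statement */
    int inner_prints;          /* executions of the inner print statement */
    int outer_nthreads_after;  /* nthreads-var of an outer task at region end */
    int inner_nthreads_after;  /* nthreads-var of an inner task at region end */
} icv_trace;

/* initial_nthreads: nthreads-var of the task that encounters the outer region.
   Assumption: each requested team size is granted exactly. */
icv_trace model_icv_example(int initial_nthreads)
{
    icv_trace t = {0, 0, 0, 0, 0};
    task_model initial = { initial_nthreads };

    assert(initial_nthreads > 0);

    for (int i = 0; i < initial.nthreads_var; i++) {
        task_model outer = initial;     /* each outer task gets its own copy */
        outer.nthreads_var = 3;         /* models omp_set_num_threads(3) */

        for (int j = 0; j < outer.nthreads_var; j++) {
            task_model inner = outer;   /* each inner task gets its own copy */
            inner.nthreads_var = 4;     /* models omp_set_num_threads(4) */
            t.inner_tasks++;
            t.inner_nthreads_after = inner.nthreads_var;
            if (j == 0)
                t.inner_prints++;       /* one thread per inner team prints */
        }

        /* The inner calls changed only the inner copies. */
        t.outer_nthreads_after = outer.nthreads_var;
        if (i == 0)
            t.outer_prints++;           /* one thread of the outer team prints */
    }
    return t;
}
```

Calling \ucode{model_icv_example(2)} reproduces the counts given in the text: six inner implicit tasks, one outer print, and two inner prints. The outer copies of \plc{nthreads-var} remain 3 because the inner calls modify only the inner tasks' copies.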
\pagebreak
\subsection{\kcode{num_threads} Clause with a List}
\label{subsec:icv_nthreads}
\index{clauses!num_threads@\kcode{num_threads}}
\index{num_threads clause@\kcode{num_threads} clause}
Prior to OpenMP 6.0, only a single argument could be specified in the
\kcode{num_threads} clause of a \kcode{parallel} construct.
In this case, the clause argument is used as the requested team size for
that \kcode{parallel} region only and does not affect the value of the
\plc{nthreads-var} ICV in the generated implicit tasks, which governs any
nested \kcode{parallel} regions.
Instead, each generated implicit task inherits its \plc{nthreads-var} value
from the task that encountered the \kcode{parallel} construct,
with the first integer stripped away if that value is a list of
multiple integers.

In OpenMP 6.0, the \kcode{num_threads} clause permits more than one argument.
In this case, the first argument is still used as the requested team size for
the \kcode{parallel} region. The difference is that the \plc{nthreads-var} ICVs of
the generated implicit tasks are set to the list of values given by the
remaining clause arguments, rather than inheriting the value of the
encountering task's \plc{nthreads-var} ICV. Consequently, a
\kcode{num_threads} clause with an argument list may be used to control not
only the team size for a given \kcode{parallel} region, but also the
requested team size of any nested \kcode{parallel} regions.
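
The two rules above can be written down as a pair of small functions. The following is a minimal sketch in plain C, not OpenMP runtime code: the type \ucode{nthreads_list}, the capacity \ucode{ICV_MAX}, and the functions \ucode{requested_team_size} and \ucode{child_nthreads_var} are hypothetical names introduced only for illustration, and a null \ucode{clause} pointer stands for a \kcode{parallel} construct without a \kcode{num_threads} clause.

```c
#include <assert.h>
#include <stddef.h>

#define ICV_MAX 8   /* hypothetical capacity, chosen only for this sketch */

/* Hypothetical representation of an nthreads-var value or a clause list. */
typedef struct {
    int    vals[ICV_MAX];
    size_t len;            /* number of integers in the list, at least 1 */
} nthreads_list;

/* Requested team size: the first clause argument if a clause is present,
   otherwise the first integer of the encountering task's nthreads-var. */
int requested_team_size(const nthreads_list *icv, const nthreads_list *clause)
{
    assert(icv != NULL && icv->len >= 1);
    if (clause != NULL && clause->len >= 1)
        return clause->vals[0];
    return icv->vals[0];
}

/* nthreads-var of the implicit tasks generated by the parallel region.
   A clause with a list supplies the value (its remaining arguments);
   otherwise the encountering task's list is inherited, minus its first
   integer when it holds more than one. */
nthreads_list child_nthreads_var(const nthreads_list *icv,
                                 const nthreads_list *clause)
{
    nthreads_list out = { {0}, 0 };
    const nthreads_list *src;

    assert(icv != NULL && icv->len >= 1 && icv->len <= ICV_MAX);
    src = (clause != NULL && clause->len > 1) ? clause : icv;
    assert(src->len <= ICV_MAX);

    if (src->len > 1) {
        for (size_t i = 1; i < src->len; i++)
            out.vals[i - 1] = src->vals[i];   /* strip the first integer */
        out.len = src->len - 1;
    } else {
        out = *src;                           /* a single value persists */
    }
    return out;
}
```

With an encountering-task value of \{\vcode{4,5,6}\}, this model yields \{\vcode{5,6}\} for the generated tasks both without a clause and with a single-argument clause, while a clause with the list \kcode{(\ucode{8,2})} yields \{\vcode{2}\}.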

The following example illustrates the effect of the \kcode{num_threads} clause
for nested \kcode{parallel} regions. The program starts with the environment
variable \kcode{OMP_NUM_THREADS} set to \ucode{"4,5,6"}, which initializes the
\plc{nthreads-var} ICV of the initial task to the list \{\vcode{4,5,6}\}. Case 1 shows
how this ICV is used to control the requested team size for a nest of three
\kcode{parallel} regions. As indicated by the comments, with each
successive nesting level the \plc{nthreads-var} ICV inherits all but the first
integer in the \plc{nthreads-var} ICV of the task that encounters the
\kcode{parallel} construct. This pattern continues until the \plc{nthreads-var}
ICV contains only a single integer, at which point that value persists for any
further nesting levels. In Case 2, a \kcode{num_threads(\ucode{8})} clause appears on
the outermost \kcode{parallel} construct. This alters only
the requested team size for that \kcode{parallel} region. Note that the values of
the \plc{nthreads-var} ICVs inside the \kcode{parallel} region are the same as
for Case 1. In Case 3, the \kcode{num_threads} clause is specified with
multiple arguments \kcode{(\ucode{8,2})}. This sets the \plc{nthreads-var} ICV value in each of
the generated implicit tasks to \{\vcode{2}\}, in accordance with the rule
for a \kcode{num_threads} clause with an argument list described above.
\cexample[6.0]{icv}{2}[2]
\ffreeexample[6.0]{icv}{2}[2]
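
The requested team sizes for Cases 1 through 3 can also be traced by hand or with a short model. The sketch below is plain C rather than OpenMP code; the names \ucode{icv_view}, \ucode{strip_first}, and \ucode{trace_team_sizes} are hypothetical, and the model assumes that only the outermost \kcode{parallel} construct may carry a \kcode{num_threads} clause, as in the three cases discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical read-only view of an integer list (not an OpenMP type).
   len == 0 stands for "no num_threads clause". */
typedef struct {
    const int *vals;
    size_t     len;
} icv_view;

/* Drop the first integer unless only one remains. */
static icv_view strip_first(icv_view v)
{
    if (v.len > 1) {
        v.vals++;
        v.len--;
    }
    return v;
}

/* Store in sizes[0..levels-1] the requested team size at each nesting
   level. env models the initial task's nthreads-var; outer_clause models
   the num_threads clause on the outermost parallel construct. */
void trace_team_sizes(icv_view env, icv_view outer_clause,
                      size_t levels, int *sizes)
{
    const icv_view no_clause = { NULL, 0 };
    icv_view icv = env;

    assert(env.len >= 1 && sizes != NULL);

    for (size_t lvl = 0; lvl < levels; lvl++) {
        icv_view clause = (lvl == 0) ? outer_clause : no_clause;

        /* The first clause argument, if any, overrides the ICV's first integer. */
        sizes[lvl] = (clause.len >= 1) ? clause.vals[0] : icv.vals[0];

        /* A clause list replaces the inherited value; otherwise inherit. */
        icv = strip_first(clause.len > 1 ? clause : icv);
    }
}
```

For an initial \plc{nthreads-var} of \{\vcode{4,5,6}\} and three nesting levels, this model gives requested team sizes of 4, 5, 6 with no clause (Case 1), 8, 5, 6 with \kcode{num_threads(\ucode{8})} (Case 2), and 8, 2, 2 with \kcode{num_threads(\ucode{8,2})} (Case 3).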