OpenMP-Examples/affinity/affinity.tex
2024-11-13 11:07:08 -08:00

262 lines
9.0 KiB
TeX

\pagebreak
\section{\kcode{proc_bind} Clause}
\label{sec:affinity}
\index{affinity!proc_bind clause@\kcode{proc_bind} clause}
\index{clauses!proc_bind@\kcode{proc_bind}}
\index{proc_bind clause@\kcode{proc_bind} clause}
The following examples demonstrate how to use the \kcode{proc_bind} clause to
control the thread binding for a team of threads in a \kcode{parallel} region.
The machine architecture is depicted in Figure~\ref{fig:mach_arch}. It consists of two sockets,
each equipped with a quad-core processor and configured to execute two hardware
threads simultaneously on each core. These examples assume a contiguous core numbering
starting from 0, such that the hardware threads 0,1 form the first physical core.
\ifpdf
\begin{figure}[htb]
\centerline{\includegraphics[width=3.0in,keepaspectratio=true]%
{figs/proc_bind_fig.pdf}}
\caption{A machine architecture with two quad-core processors}
\label{fig:mach_arch}
\end{figure}
\fi
The following equivalent place list declarations consist of eight places (which
we designate as p0 to p7):
\begin{boxeducode}
\kcode{export OMP_PLACES=}"{0,1},{2,3},{4,5},{6,7},{8,9},{10,11},{12,13},
{14,15}"
\end{boxeducode}
or
\begin{boxeducode}
\kcode{export OMP_PLACES=}"{0:2}:8:2"
\end{boxeducode}
\subsection{Spread Affinity Policy}
\label{subsec:affinity_spread}
\index{affinity!spread policy@\kcode{spread} policy}
\index{spread policy@\kcode{spread} policy}
The following example shows the result of the \kcode{spread} affinity policy on
the partition list when the number of threads is less than or equal to the number
of places in the parent's place partition, for the machine architecture depicted
above. Note that the threads are bound to the first place of each subpartition.
\cexample[4.0]{affinity}{1}
\fexample[4.0]{affinity}{1}
It is unspecified on which place the primary thread is initially started. If the
primary thread is initially started on p0, the following placement of threads will
be applied in the parallel region:
\begin{compactitem}
\item thread 0 executes on p0 with the place partition p0,p1
\item thread 1 executes on p2 with the place partition p2,p3
\item thread 2 executes on p4 with the place partition p4,p5
\item thread 3 executes on p6 with the place partition p6,p7
\end{compactitem}
If the primary thread would initially be started on p2, the placement of threads
and distribution of the place partition would be as follows:
\begin{compactitem}
\item thread 0 executes on p2 with the place partition p2,p3
\item thread 1 executes on p4 with the place partition p4,p5
\item thread 2 executes on p6 with the place partition p6,p7
\item thread 3 executes on p0 with the place partition p0,p1
\end{compactitem}
The following example illustrates the \kcode{spread} thread affinity policy when
the number of threads is greater than the number of places in the parent's place
partition.
Let \ucode{T} be the number of threads in the team, and \ucode{P} be the number of places in the
parent's place partition. The first \ucode{T/P} threads of the team (including the primary
thread) execute on the parent's place. The next \ucode{T/P} threads execute on the next
place in the place partition, and so on, with wrap around.
\cexample[4.0]{affinity}{2}
\ffreeexample[4.0]{affinity}{2}
It is unspecified on which place the primary thread is initially started. If the
primary thread is initially started on p0, the following placement of threads will
be applied in the parallel region:
\begin{compactitem}
\item threads 0,1 execute on p0 with the place partition p0
\item threads 2,3 execute on p1 with the place partition p1
\item threads 4,5 execute on p2 with the place partition p2
\item threads 6,7 execute on p3 with the place partition p3
\item threads 8,9 execute on p4 with the place partition p4
\item threads 10,11 execute on p5 with the place partition p5
\item threads 12,13 execute on p6 with the place partition p6
\item threads 14,15 execute on p7 with the place partition p7
\end{compactitem}
If the primary thread would initially be started on p2, the placement of threads
and distribution of the place partition would be as follows:
\begin{compactitem}
\item threads 0,1 execute on p2 with the place partition p2
\item threads 2,3 execute on p3 with the place partition p3
\item threads 4,5 execute on p4 with the place partition p4
\item threads 6,7 execute on p5 with the place partition p5
\item threads 8,9 execute on p6 with the place partition p6
\item threads 10,11 execute on p7 with the place partition p7
\item threads 12,13 execute on p0 with the place partition p0
\item threads 14,15 execute on p1 with the place partition p1
\end{compactitem}
\subsection{Close Affinity Policy}
\label{subsec:affinity_close}
\index{affinity!close policy@\kcode{close} policy}
\index{close policy@\kcode{close} policy}
The following example shows the result of the \kcode{close} affinity policy on
the partition list when the number of threads is less than or equal to the number
of places in parent's place partition, for the machine architecture depicted above.
The place partition is not changed by the \kcode{close} policy.
\cexample[4.0]{affinity}{3}
\fexample[4.0]{affinity}{3}
It is unspecified on which place the primary thread is initially started. If the
primary thread is initially started on p0, the following placement of threads will
be applied in the \kcode{parallel} region:
\begin{compactitem}
\item thread 0 executes on p0 with the place partition p0-p7
\item thread 1 executes on p1 with the place partition p0-p7
\item thread 2 executes on p2 with the place partition p0-p7
\item thread 3 executes on p3 with the place partition p0-p7
\end{compactitem}
If the primary thread would initially be started on p2, the placement of threads
and distribution of the place partition would be as follows:
\begin{compactitem}
\item thread 0 executes on p2 with the place partition p0-p7
\item thread 1 executes on p3 with the place partition p0-p7
\item thread 2 executes on p4 with the place partition p0-p7
\item thread 3 executes on p5 with the place partition p0-p7
\end{compactitem}
The following example illustrates the \kcode{close} thread affinity policy when
the number of threads is greater than the number of places in the parent's place
partition.
Let \ucode{T} be the number of threads in the team, and \ucode{P} be the number of places in the
parent's place partition. The first \ucode{T/P} threads of the team (including the primary
thread) execute on the parent's place. The next \ucode{T/P} threads execute on the next
place in the place partition, and so on, with wrap around. The place partition
is not changed by the \kcode{close} policy.
\cexample[4.0]{affinity}{4}
\ffreeexample[4.0]{affinity}{4}
It is unspecified on which place the primary thread is initially started. If the
primary thread is initially running on p0, the following placement of threads will
be applied in the parallel region:
\begin{compactitem}
\item threads 0,1 execute on p0 with the place partition p0-p7
\item threads 2,3 execute on p1 with the place partition p0-p7
\item threads 4,5 execute on p2 with the place partition p0-p7
\item threads 6,7 execute on p3 with the place partition p0-p7
\item threads 8,9 execute on p4 with the place partition p0-p7
\item threads 10,11 execute on p5 with the place partition p0-p7
\item threads 12,13 execute on p6 with the place partition p0-p7
\item threads 14,15 execute on p7 with the place partition p0-p7
\end{compactitem}
If the primary thread would initially be started on p2, the placement of threads
and distribution of the place partition would be as follows:
\begin{compactitem}
\item threads 0,1 execute on p2 with the place partition p0-p7
\item threads 2,3 execute on p3 with the place partition p0-p7
\item threads 4,5 execute on p4 with the place partition p0-p7
\item threads 6,7 execute on p5 with the place partition p0-p7
\item threads 8,9 execute on p6 with the place partition p0-p7
\item threads 10,11 execute on p7 with the place partition p0-p7
\item threads 12,13 execute on p0 with the place partition p0-p7
\item threads 14,15 execute on p1 with the place partition p0-p7
\end{compactitem}
\subsection{Primary Affinity Policy}
\label{subsec:affinity_primary}
\index{affinity!primary policy@\kcode{primary} policy}
\index{primary policy@\kcode{primary} policy}
The following example shows the result of the \kcode{primary} affinity policy on
the partition list for the machine architecture depicted above. The place partition
is not changed by the primary policy.
\cexample[5.1]{affinity}{5}
\fexample[5.1]{affinity}{5}
\clearpage
It is unspecified on which place the primary thread is initially started. If the
primary thread is initially running on p0, the following placement of threads will
be applied in the parallel region:
\begin{compactitem}
\item threads 0-3 execute on p0 with the place partition p0-p7
\end{compactitem}
If the primary thread would initially be started on p2, the placement of threads
and distribution of the place partition would be as follows:
\begin{compactitem}
\item threads 0-3 execute on p2 with the place partition p0-p7
\end{compactitem}