mirror of
https://github.com/OpenMP/Examples.git
synced 2025-04-04 05:41:33 +01:00
249 lines
8.5 KiB
TeX
249 lines
8.5 KiB
TeX
\pagebreak
|
|
\section{\code{proc\_bind} Clause}
|
|
\label{sec:affinity}
|
|
|
|
The following examples demonstrate how to use the \code{proc\_bind} clause to
|
|
control the thread binding for a team of threads in a \code{parallel} region.
|
|
The machine architecture is depicted in the figure below. It consists of two sockets,
|
|
each equipped with a quad-core processor and configured to execute two hardware
|
|
threads simultaneously on each core. These examples assume a contiguous core numbering
|
|
starting from 0, such that the hardware threads 0,1 form the first physical core.
|
|
|
|
\ifpdf
|
|
%\begin{figure}[htbp]
|
|
\centerline{\includegraphics[width=3.8in,keepaspectratio=true]%
|
|
{figs/proc_bind_fig.pdf}}
|
|
%\end{figure}
|
|
\fi
|
|
|
|
The following equivalent place list declarations consist of eight places (which
|
|
we designate as p0 to p7):
|
|
|
|
\code{OMP\_PLACES=\texttt{"}\{0,1\},\{2,3\},\{4,5\},\{6,7\},\{8,9\},\{10,11\},\{12,13\},\{14,15\}\texttt{"}}
|
|
|
|
or
|
|
|
|
\code{OMP\_PLACES=\texttt{"}\{0:2\}:8:2\texttt{"}}
|
|
|
|
\subsection{Spread Affinity Policy}
|
|
\label{subsec:affinity_spread}
|
|
|
|
|
|
The following example shows the result of the \code{spread} affinity policy on
|
|
the partition list when the number of threads is less than or equal to the number
|
|
of places in the parent's place partition, for the machine architecture depicted
|
|
above. Note that the threads are bound to the first place of each subpartition.
|
|
|
|
\cexample[4.0]{affinity}{1}
|
|
|
|
\fexample[4.0]{affinity}{1}
|
|
|
|
It is unspecified on which place the primary thread is initially started. If the
|
|
primary thread is initially started on p0, the following placement of threads will
|
|
be applied in the parallel region:
|
|
|
|
\begin{compactitem}
|
|
\item thread 0 executes on p0 with the place partition p0,p1
|
|
|
|
\item thread 1 executes on p2 with the place partition p2,p3
|
|
|
|
\item thread 2 executes on p4 with the place partition p4,p5
|
|
|
|
\item thread 3 executes on p6 with the place partition p6,p7
|
|
\end{compactitem}
|
|
|
|
|
|
If the primary thread would initially be started on p2, the placement of threads
|
|
and distribution of the place partition would be as follows:
|
|
|
|
\begin{compactitem}
|
|
\item thread 0 executes on p2 with the place partition p2,p3
|
|
|
|
\item thread 1 executes on p4 with the place partition p4,p5
|
|
|
|
\item thread 2 executes on p6 with the place partition p6,p7
|
|
|
|
\item thread 3 executes on p0 with the place partition p0,p1
|
|
\end{compactitem}
|
|
|
|
The following example illustrates the \code{spread} thread affinity policy when
|
|
the number of threads is greater than the number of places in the parent's place
|
|
partition.
|
|
|
|
Let \plc{T} be the number of threads in the team, and \plc{P} be the number of places in the
|
|
parent's place partition. The first \plc{T/P} threads of the team (including the primary
|
|
thread) execute on the parent's place. The next \plc{T/P} threads execute on the next
|
|
place in the place partition, and so on, with wrap around.
|
|
|
|
\cexample[4.0]{affinity}{2}
|
|
|
|
\ffreeexample[4.0]{affinity}{2}
|
|
|
|
It is unspecified on which place the primary thread is initially started. If the
|
|
primary thread is initially started on p0, the following placement of threads will
|
|
be applied in the parallel region:
|
|
|
|
\begin{compactitem}
|
|
\item threads 0,1 execute on p0 with the place partition p0
|
|
|
|
\item threads 2,3 execute on p1 with the place partition p1
|
|
|
|
\item threads 4,5 execute on p2 with the place partition p2
|
|
|
|
\item threads 6,7 execute on p3 with the place partition p3
|
|
|
|
\item threads 8,9 execute on p4 with the place partition p4
|
|
|
|
\item threads 10,11 execute on p5 with the place partition p5
|
|
|
|
\item threads 12,13 execute on p6 with the place partition p6
|
|
|
|
\item threads 14,15 execute on p7 with the place partition p7
|
|
\end{compactitem}
|
|
|
|
If the primary thread would initially be started on p2, the placement of threads
|
|
and distribution of the place partition would be as follows:
|
|
|
|
\begin{compactitem}
|
|
\item threads 0,1 execute on p2 with the place partition p2
|
|
|
|
\item threads 2,3 execute on p3 with the place partition p3
|
|
|
|
\item threads 4,5 execute on p4 with the place partition p4
|
|
|
|
\item threads 6,7 execute on p5 with the place partition p5
|
|
|
|
\item threads 8,9 execute on p6 with the place partition p6
|
|
|
|
\item threads 10,11 execute on p7 with the place partition p7
|
|
|
|
\item threads 12,13 execute on p0 with the place partition p0
|
|
|
|
\item threads 14,15 execute on p1 with the place partition p1
|
|
\end{compactitem}
|
|
|
|
\subsection{Close Affinity Policy}
|
|
\label{subsec:affinity_close}
|
|
|
|
The following example shows the result of the \code{close} affinity policy on
|
|
the partition list when the number of threads is less than or equal to the number
|
|
of places in parent's place partition, for the machine architecture depicted above.
|
|
The place partition is not changed by the \code{close} policy.
|
|
|
|
\cexample[4.0]{affinity}{3}
|
|
|
|
\fexample[4.0]{affinity}{3}
|
|
|
|
It is unspecified on which place the primary thread is initially started. If the
|
|
primary thread is initially started on p0, the following placement of threads will
|
|
be applied in the \code{parallel} region:
|
|
|
|
\begin{compactitem}
|
|
\item thread 0 executes on p0 with the place partition p0-p7
|
|
|
|
\item thread 1 executes on p1 with the place partition p0-p7
|
|
|
|
\item thread 2 executes on p2 with the place partition p0-p7
|
|
|
|
\item thread 3 executes on p3 with the place partition p0-p7
|
|
\end{compactitem}
|
|
|
|
If the primary thread would initially be started on p2, the placement of threads
|
|
and distribution of the place partition would be as follows:
|
|
|
|
\begin{compactitem}
|
|
\item thread 0 executes on p2 with the place partition p0-p7
|
|
|
|
\item thread 1 executes on p3 with the place partition p0-p7
|
|
|
|
\item thread 2 executes on p4 with the place partition p0-p7
|
|
|
|
\item thread 3 executes on p5 with the place partition p0-p7
|
|
\end{compactitem}
|
|
|
|
The following example illustrates the \code{close} thread affinity policy when
|
|
the number of threads is greater than the number of places in the parent's place
|
|
partition.
|
|
|
|
Let \plc{T} be the number of threads in the team, and \plc{P} be the number of places in the
|
|
parent's place partition. The first \plc{T/P} threads of the team (including the primary
|
|
thread) execute on the parent's place. The next \plc{T/P} threads execute on the next
|
|
place in the place partition, and so on, with wrap around. The place partition
|
|
is not changed by the \code{close} policy.
|
|
|
|
\cexample[4.0]{affinity}{4}
|
|
|
|
\ffreeexample[4.0]{affinity}{4}
|
|
|
|
It is unspecified on which place the primary thread is initially started. If the
|
|
primary thread is initially running on p0, the following placement of threads will
|
|
be applied in the parallel region:
|
|
|
|
\begin{compactitem}
|
|
\item threads 0,1 execute on p0 with the place partition p0-p7
|
|
|
|
\item threads 2,3 execute on p1 with the place partition p0-p7
|
|
|
|
\item threads 4,5 execute on p2 with the place partition p0-p7
|
|
|
|
\item threads 6,7 execute on p3 with the place partition p0-p7
|
|
|
|
\item threads 8,9 execute on p4 with the place partition p0-p7
|
|
|
|
\item threads 10,11 execute on p5 with the place partition p0-p7
|
|
|
|
\item threads 12,13 execute on p6 with the place partition p0-p7
|
|
|
|
\item threads 14,15 execute on p7 with the place partition p0-p7
|
|
\end{compactitem}
|
|
|
|
If the primary thread would initially be started on p2, the placement of threads
|
|
and distribution of the place partition would be as follows:
|
|
|
|
\begin{compactitem}
|
|
\item threads 0,1 execute on p2 with the place partition p0-p7
|
|
|
|
\item threads 2,3 execute on p3 with the place partition p0-p7
|
|
|
|
\item threads 4,5 execute on p4 with the place partition p0-p7
|
|
|
|
\item threads 6,7 execute on p5 with the place partition p0-p7
|
|
|
|
\item threads 8,9 execute on p6 with the place partition p0-p7
|
|
|
|
\item threads 10,11 execute on p7 with the place partition p0-p7
|
|
|
|
\item threads 12,13 execute on p0 with the place partition p0-p7
|
|
|
|
\item threads 14,15 execute on p1 with the place partition p0-p7
|
|
\end{compactitem}
|
|
|
|
\subsection{Primary Affinity Policy}
|
|
\label{subsec:affinity_primary}
|
|
|
|
The following example shows the result of the \code{primary} affinity policy on
|
|
the partition list for the machine architecture depicted above. The place partition
|
|
is not changed by the primary policy.
|
|
|
|
\cexample[4.0]{affinity}{5}
|
|
|
|
\fexample[4.0]{affinity}{5}[1]
|
|
\clearpage
|
|
|
|
It is unspecified on which place the primary thread is initially started. If the
|
|
primary thread is initially running on p0, the following placement of threads will
|
|
be applied in the parallel region:
|
|
|
|
\begin{compactitem}
|
|
\item threads 0-3 execute on p0 with the place partition p0-p7
|
|
\end{compactitem}
|
|
|
|
If the primary thread would initially be started on p2, the placement of threads
|
|
and distribution of the place partition would be as follows:
|
|
|
|
\begin{compactitem}
|
|
\item threads 0-3 execute on p2 with the place partition p0-p7
|
|
\end{compactitem}
|
|
|
|
|