\pagebreak
\section{\kcode{proc_bind} Clause}
\label{sec:affinity}
\index{affinity!proc_bind clause@\kcode{proc_bind} clause}
\index{clauses!proc_bind@\kcode{proc_bind}}
\index{proc_bind clause@\kcode{proc_bind} clause}

The following examples demonstrate how to use the \kcode{proc_bind} clause to
control the thread binding for a team of threads in a \kcode{parallel} region.
The machine architecture is depicted in Figure~\ref{fig:mach_arch}. It consists of two sockets,
each equipped with a quad-core processor and configured to execute two hardware
threads simultaneously on each core. These examples assume a contiguous core numbering
starting from 0, such that hardware threads 0 and 1 form the first physical core.

\ifpdf
\begin{figure}[htb]
\centerline{\includegraphics[width=3.0in,keepaspectratio=true]%
{figs/proc_bind_fig.pdf}}
\caption{A machine architecture with two quad-core processors}
\label{fig:mach_arch}
\end{figure}
\fi

The following equivalent place list declarations consist of eight places (which
we designate as p0 to p7):
\begin{boxeducode}
\kcode{export OMP_PLACES=}"{0,1},{2,3},{4,5},{6,7},{8,9},{10,11},{12,13},
{14,15}"
\end{boxeducode}
or
\begin{boxeducode}
\kcode{export OMP_PLACES=}"{0:2}:8:2"
\end{boxeducode}

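The equivalence of the two declarations can be checked by expanding the interval
notation. As a sketch (the index $k$ and the notation $p_k$ for place p$k$ are
introduced here only for illustration): the place \texttt{\{0:2\}} contains two
consecutive hardware threads starting at 0, and the suffix \texttt{:8:2} repeats
that place eight times with a stride of 2, so that
\[
  p_k = \{\, 2k,\; 2k+1 \,\}, \qquad k = 0, 1, \ldots, 7 .
\]
This yields $p_0 = \{0,1\}$, $p_1 = \{2,3\}$, \ldots, $p_7 = \{14,15\}$, which is
the explicit list given in the first declaration.
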
\subsection{Spread Affinity Policy}
\label{subsec:affinity_spread}
\index{affinity!spread policy@\kcode{spread} policy}
\index{spread policy@\kcode{spread} policy}

The following example shows the result of the \kcode{spread} affinity policy on
the partition list when the number of threads is less than or equal to the number
of places in the parent's place partition, for the machine architecture depicted
above. Note that the threads are bound to the first place of each subpartition.

\cexample[4.0]{affinity}{1}

\fexample[4.0]{affinity}{1}

It is unspecified on which place the primary thread is initially started. If the
primary thread is initially started on p0, the following placement of threads will
be applied in the \kcode{parallel} region:

\begin{compactitem}
\item thread 0 executes on p0 with the place partition p0,p1
\item thread 1 executes on p2 with the place partition p2,p3
\item thread 2 executes on p4 with the place partition p4,p5
\item thread 3 executes on p6 with the place partition p6,p7
\end{compactitem}

If the primary thread were initially started on p2 instead, the placement of threads
and distribution of the place partition would be as follows:

\begin{compactitem}
\item thread 0 executes on p2 with the place partition p2,p3
\item thread 1 executes on p4 with the place partition p4,p5
\item thread 2 executes on p6 with the place partition p6,p7
\item thread 3 executes on p0 with the place partition p0,p1
\end{compactitem}

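The two placements listed above follow a single pattern, which can be sketched
as a formula. Let $T$ be the number of threads, $P$ the number of places in the
parent's place partition, and $p_i$ the place designated p$i$. Assume, as in this
example, that $P$ is a multiple of $T$; let $k = P/T$ be the subpartition size and
$s$ the index of the place on which the primary thread starts. The symbols $k$,
$s$ and $j_t$ are introduced here for illustration only. Thread $t$ is then
assigned the subpartition with index $j_t$:
\[
  j_t = \left( \lfloor s/k \rfloor + t \right) \bmod T , \qquad
  \mathrm{partition}(t) = \{\, p_{k j_t}, \ldots, p_{k j_t + k - 1} \,\} , \qquad
  t = 0, \ldots, T-1 .
\]
The primary thread stays on $p_s$, and every other thread executes on the first
place of its subpartition, $p_{k j_t}$. With $T = 4$, $P = 8$, $k = 2$ and $s = 2$,
this gives $j_3 = (1 + 3) \bmod 4 = 0$, so thread 3 executes on p0 with the place
partition p0,p1, as listed above. When $P$ is not a multiple of $T$, the
subpartitions are not all of equal size and this simple formula no longer applies.
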
The following example illustrates the \kcode{spread} thread affinity policy when
the number of threads is greater than the number of places in the parent's place
partition.

Let \ucode{T} be the number of threads in the team, and \ucode{P} be the number of places in the
parent's place partition. The first \ucode{T/P} threads of the team (including the primary
thread) execute on the parent's place. The next \ucode{T/P} threads execute on the next
place in the place partition, and so on, with wrap around. Each thread's place
partition is reduced to the single place on which it executes.

\cexample[4.0]{affinity}{2}

\ffreeexample[4.0]{affinity}{2}

It is unspecified on which place the primary thread is initially started. If the
primary thread is initially started on p0, the following placement of threads will
be applied in the \kcode{parallel} region:

\begin{compactitem}
\item threads 0,1 execute on p0 with the place partition p0
\item threads 2,3 execute on p1 with the place partition p1
\item threads 4,5 execute on p2 with the place partition p2
\item threads 6,7 execute on p3 with the place partition p3
\item threads 8,9 execute on p4 with the place partition p4
\item threads 10,11 execute on p5 with the place partition p5
\item threads 12,13 execute on p6 with the place partition p6
\item threads 14,15 execute on p7 with the place partition p7
\end{compactitem}

If the primary thread were initially started on p2 instead, the placement of threads
and distribution of the place partition would be as follows:

\begin{compactitem}
\item threads 0,1 execute on p2 with the place partition p2
\item threads 2,3 execute on p3 with the place partition p3
\item threads 4,5 execute on p4 with the place partition p4
\item threads 6,7 execute on p5 with the place partition p5
\item threads 8,9 execute on p6 with the place partition p6
\item threads 10,11 execute on p7 with the place partition p7
\item threads 12,13 execute on p0 with the place partition p0
\item threads 14,15 execute on p1 with the place partition p1
\end{compactitem}

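The placements listed above can be summarized in one expression. As a sketch,
let $T$ be the number of threads, $P$ the number of places in the parent's place
partition, and $p_i$ the place designated p$i$; assume that $T$ is a multiple of
$P$ (here $T = 16$ and $P = 8$), let $n = T/P$, and let $s$ be the index of the
place on which the primary thread starts. The symbols $n$ and $s$ are introduced
here for illustration only. Thread $t$ then executes on
\[
  p_{\,(s + \lfloor t/n \rfloor) \bmod P} , \qquad t = 0, \ldots, T-1 ,
\]
and, under the \kcode{spread} policy, that single place is also its place partition.
With $n = 2$ and $s = 2$, thread 12 executes on $p_{(2 + 6) \bmod 8} = p_0$, which
matches the list above. When $T$ is not a multiple of $P$, the number of threads
per place is not uniform and this expression does not apply as written.
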
\subsection{Close Affinity Policy}
\label{subsec:affinity_close}
\index{affinity!close policy@\kcode{close} policy}
\index{close policy@\kcode{close} policy}

The following example shows the result of the \kcode{close} affinity policy on
the partition list when the number of threads is less than or equal to the number
of places in the parent's place partition, for the machine architecture depicted above.
The place partition is not changed by the \kcode{close} policy.

\cexample[4.0]{affinity}{3}

\fexample[4.0]{affinity}{3}

It is unspecified on which place the primary thread is initially started. If the
primary thread is initially started on p0, the following placement of threads will
be applied in the \kcode{parallel} region:

\begin{compactitem}
\item thread 0 executes on p0 with the place partition p0-p7
\item thread 1 executes on p1 with the place partition p0-p7
\item thread 2 executes on p2 with the place partition p0-p7
\item thread 3 executes on p3 with the place partition p0-p7
\end{compactitem}

If the primary thread were initially started on p2 instead, the placement of threads
and distribution of the place partition would be as follows:

\begin{compactitem}
\item thread 0 executes on p2 with the place partition p0-p7
\item thread 1 executes on p3 with the place partition p0-p7
\item thread 2 executes on p4 with the place partition p0-p7
\item thread 3 executes on p5 with the place partition p0-p7
\end{compactitem}

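For the \kcode{close} policy with no more threads than places, the placement
reduces to a simple shift. As a sketch, let $P$ be the number of places in the
parent's place partition, $p_i$ the place designated p$i$, and $s$ the index of
the place on which the primary thread starts ($s$ is a symbol introduced here for
illustration only). Thread $t$ then executes on
\[
  p_{\,(s + t) \bmod P} , \qquad t = 0, \ldots, T-1 ,
\]
while its place partition remains p0-p7. With $P = 8$ and $s = 2$, thread 3
executes on $p_5$, as listed above. The wrap around becomes visible for a later
starting place: with $s = 6$, threads 0 to 3 would execute on p6, p7, p0 and p1.
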
The following example illustrates the \kcode{close} thread affinity policy when
the number of threads is greater than the number of places in the parent's place
partition.

Let \ucode{T} be the number of threads in the team, and \ucode{P} be the number of places in the
parent's place partition. The first \ucode{T/P} threads of the team (including the primary
thread) execute on the parent's place. The next \ucode{T/P} threads execute on the next
place in the place partition, and so on, with wrap around. The place partition
is not changed by the \kcode{close} policy.

\cexample[4.0]{affinity}{4}

\ffreeexample[4.0]{affinity}{4}

It is unspecified on which place the primary thread is initially started. If the
primary thread is initially started on p0, the following placement of threads will
be applied in the \kcode{parallel} region:

\begin{compactitem}
\item threads 0,1 execute on p0 with the place partition p0-p7
\item threads 2,3 execute on p1 with the place partition p0-p7
\item threads 4,5 execute on p2 with the place partition p0-p7
\item threads 6,7 execute on p3 with the place partition p0-p7
\item threads 8,9 execute on p4 with the place partition p0-p7
\item threads 10,11 execute on p5 with the place partition p0-p7
\item threads 12,13 execute on p6 with the place partition p0-p7
\item threads 14,15 execute on p7 with the place partition p0-p7
\end{compactitem}

If the primary thread were initially started on p2 instead, the placement of threads
and distribution of the place partition would be as follows:

\begin{compactitem}
\item threads 0,1 execute on p2 with the place partition p0-p7
\item threads 2,3 execute on p3 with the place partition p0-p7
\item threads 4,5 execute on p4 with the place partition p0-p7
\item threads 6,7 execute on p5 with the place partition p0-p7
\item threads 8,9 execute on p6 with the place partition p0-p7
\item threads 10,11 execute on p7 with the place partition p0-p7
\item threads 12,13 execute on p0 with the place partition p0-p7
\item threads 14,15 execute on p1 with the place partition p0-p7
\end{compactitem}

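The rule stated before the example can be written as a pair of place and
partition for each thread. As a sketch, assume that $T$ is a multiple of $P$ (here
$T = 16$ and $P = 8$), let $n = T/P$, let $p_i$ be the place designated p$i$, and
let $s$ be the index of the place on which the primary thread starts; $n$ and $s$
are symbols introduced here for illustration only. Then, for $t = 0, \ldots, T-1$,
\[
  \mathrm{place}(t) = p_{\,(s + \lfloor t/n \rfloor) \bmod P} , \qquad
  \mathrm{partition}(t) = \{\, p_0, \ldots, p_{P-1} \,\} .
\]
With $n = 2$ and $s = 2$, thread 14 executes on $p_{(2 + 7) \bmod 8} = p_1$ with the
place partition p0-p7, as listed above. When $T$ is not a multiple of $P$, the
number of threads per place is not uniform and the expression does not apply as
written.
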
\subsection{Primary Affinity Policy}
\label{subsec:affinity_primary}
\index{affinity!primary policy@\kcode{primary} policy}
\index{primary policy@\kcode{primary} policy}

The following example shows the result of the \kcode{primary} affinity policy on
the partition list for the machine architecture depicted above. Under this policy,
every thread in the team is assigned to the same place as the primary thread. The
place partition is not changed by the \kcode{primary} policy.

\cexample[5.1]{affinity}{5}

\fexample[5.1]{affinity}{5}
\clearpage

It is unspecified on which place the primary thread is initially started. If the
primary thread is initially started on p0, the following placement of threads will
be applied in the \kcode{parallel} region:

\begin{compactitem}
\item threads 0-3 execute on p0 with the place partition p0-p7
\end{compactitem}

If the primary thread were initially started on p2 instead, the placement of threads
and distribution of the place partition would be as follows:

\begin{compactitem}
\item threads 0-3 execute on p2 with the place partition p0-p7
\end{compactitem}