OpenMP-Examples/affinity/affinity_query.tex
2024-04-16 08:59:23 -07:00

58 lines
3.1 KiB
TeX

\newpage
\section{Affinity Query Functions}
\label{sec: affinity_query}
\index{affinity query!omp_get_num_places routine@\kcode{omp_get_num_places} routine}
\index{routines!omp_get_num_places@\kcode{omp_get_num_places}}
\index{omp_get_num_places routine@\kcode{omp_get_num_places} routine}
\index{affinity query!omp_get_place_num routine@\kcode{omp_get_place_num} routine}
\index{routines!omp_get_place_num@\kcode{omp_get_place_num}}
\index{omp_get_place_num routine@\kcode{omp_get_place_num} routine}
\index{affinity query!omp_get_place_num_procs routine@\kcode{omp_get_place_num_procs} routine}
\index{routines!omp_get_place_num_procs@\kcode{omp_get_place_num_procs}}
\index{omp_get_place_num_procs routine@\kcode{omp_get_place_num_procs} routine}
\index{affinity!spread policy@\kcode{spread} policy}
\index{spread policy@\kcode{spread} policy}
\index{environment variables!OMP_PLACES@\kcode{OMP_PLACES}}
\index{OMP_PLACES@\kcode{OMP_PLACES}}
In the example below a team of threads is generated on each socket of
the system, using nested parallelism. Several query functions are used
to gather information to support the creation of the teams and to obtain
socket and thread numbers.
For proper execution of the code, the user must create a place partition, such that
each place is a listing of the core numbers for a socket. For example,
in a 2 socket system with 8 cores in each socket, and sequential numbering
in the socket for the core numbers, the \kcode{OMP_PLACES} variable would be set
to "\{0:8\},\{8:8\}", using the place syntax \{\splc{lower_bound:length:stride}\},
and the default stride of 1.
The code determines the number of sockets (\ucode{n_sockets})
using the \kcode{omp_get_num_places()} query function.
In this example each place is constructed with a list of
each socket's core numbers, hence the number of places is equal
to the number of sockets.
The outer parallel region forms a team of threads, and each thread
executes on a socket (place) because the \kcode{proc_bind} clause uses
\kcode{spread} in the outer \kcode{parallel} construct.
Next, in the \ucode{socket_init} function, an inner parallel region creates a team
of threads equal to the number of elements (core numbers) from the place
of the parent thread. Because the outer \kcode{parallel} construct uses
a \kcode{spread} affinity policy, each of its threads inherits a sub-partition of
the original partition. Hence, the \kcode{omp_get_place_num_procs} query function
returns the number of elements (here procs = cores) in the sub-partition of the thread.
After each parent thread creates its nested parallel region on the section,
the socket number and thread number are reported.
Note: Portable tools like hwloc (Portable HardWare LOCality package), which support
many common operating systems, can be used to determine the configuration of a system.
On some systems there are utilities, files or user guides that provide configuration
information. For instance, the socket number and proc\_id's for a socket
can be found in the /proc/cpuinfo text file on Linux systems.
\cexample[4.5]{affinity_query}{1}
\ffreeexample[4.5]{affinity_query}{1}