OpenMP-Examples/Chap_memory_model.tex

\cchapter{Memory Model}{memory_model}
\label{chap:memory_model}

OpenMP provides a shared-memory model that allows all threads on a given
device shared access to \emph{memory}. For a given OpenMP region that may be
executed by more than one thread or SIMD lane, variables in memory may be
\plc{shared} or \plc{private} with respect to those threads or SIMD lanes. A
variable's data-sharing attribute indicates whether it is shared (the
\plc{shared} attribute) or private (the \plc{private}, \plc{firstprivate},
\plc{lastprivate}, \plc{linear}, and \plc{reduction} attributes) in the data
environment of an OpenMP region. While private variables in an OpenMP region
are new copies of the original variable (with same name) that may then be
concurrently accessed or modified by their respective threads or SIMD lanes, a
shared variable in an OpenMP region is the same as the variable of the same
name in the enclosing region. Concurrent accesses or modifications to a
shared variable may therefore require synchronization to avoid data races.

OpenMP's memory model also includes a \emph{temporary view} of memory that is
associated with each thread. Two different threads may see different values for
a given variable in their respective temporary views. Threads may employ flush
operations for the purposes of making their temporary view of a variable
consistent with the value of the variable in memory. The effect of a given
flush operation is characterized by its flush properties -- some combination of
\plc{strong}, \plc{release}, and \plc{acquire} -- and, for \plc{strong}
flushes, a \plc{flush-set}.

A \plc{strong} flush will force consistency between the temporary view and the
memory for all variables in its \plc{flush-set}.  Furthermore, all strong flushes in a
program that have intersecting flush-sets will execute in some total order, and
within a thread strong flushes may not be reordered with respect to other
memory operations on variables in its flush-set. \plc{Release} and
\plc{acquire} flushes operate in pairs. A release flush may ``synchronize''
with an acquire flush, and when it does so the local memory operations that
precede the release flush will appear to have been completed before the local
memory operations on the same variables that follow the acquire flush.

Flush operations arise from explicit \kcode{flush} directives, implicit
\kcode{flush} directives, and also from the execution of \kcode{atomic}
constructs. The \kcode{flush} directive forces a  consistent view of local
variables of the thread executing the \kcode{flush}.  When a list is supplied on
the directive, only the items (variables) in the list are guaranteed to be
flushed.  Implied flushes exist at prescribed locations of certain constructs.
For the complete list of these locations and associated constructs, please
refer to the \docref{\kcode{flush} Construct} section of the OpenMP Specifications
document.

In this chapter, examples illustrate how race conditions may arise for accesses
to variables with a \plc{shared} data-sharing attribute when flush operations
are not properly employed.  A race condition can exist when two or more threads
are involved in accessing a variable and at least one of the accesses modifies
the variable.  In particular, a data race will arise when conflicting accesses
do not have a well-defined \emph{completion order}.  The existence of data
races in OpenMP programs result in undefined behavior, and so they should
generally be avoided for programs to be correct.  The completion order of
accesses to a shared variable is guaranteed in OpenMP through a set of memory
consistency rules that are described in the \docref{OpenMP Memory Consistency}
section of the OpenMP Specifications document.

%This chapter also includes examples that exhibit non-sequentially consistent
%(\emph{non-SC}) behavior. Sequential consistency (\emph{SC}) is the desirable
%property that the results of a multi-threaded program are as if all operations
%are performed in some total order, consistent with the program order of
%operations performed by each thread. OpenMP guarantees that a correct program
%(i.e. a program that does not have a data race) will exhibit SC behavior
%so long as the only \code{atomic} constructs it uses are SC atomic directives.


% The following table lists construct in which implied flushes exist, and the
% location of their execution.
%
% %\begin{table}[hb]
% \begin{center}
% %\caption {Execution Location for Implicit Flushes. }
% \begin{tabular}{ | p{0.6\linewidth} | l | }
% \hline
% \code{CONSTRUCT}                                   & \makecell{\code{EXECUTION} \\ \code{LOCATION}} \\
% \hline
% \code{parallel}                                    & upon entry and exit \\
% \hline
% \makecell[l]{worksharing \\ \hspace{1.5em}\code{for}, \code{do}
%                          \\ \hspace{1.5em}\code{sections}
%                          \\ \hspace{1.5em}\code{single}
%                          \\ \hspace{1.5em}\code{workshare} }
%                                                    & upon exit \\
% \hline
% \code{critical}                                    & upon entry and exit \\
% \hline
% \code{target}                                      & upon entry and exit \\
% \hline
% \code{barrier}                                     & during \\
% \hline
% \code{atomic} operation with \plc{seq\_cst} clause & upon entry and exit \\
% \hline
% \code{ordered}*                                    & upon entry and exit \\
% \hline
% \code{cancel}** and \code{cancellation point}**    & during \\
% \hline
% \code{target data}                                 & upon entry and exit \\
% \hline
% \code{target update} + \code{to} clause,
% \code{target enter data}                           & on entry \\
% \hline
% \code{target update} + \code{from} clause,
% \code{target exit data}                            & on exit \\
% \hline
% \code{omp\_set\_lock}                              & during \\
% \hline
% \makecell[l]{ \code{omp\_set/unset\_lock}, \code{omp\_test\_lock}***
%            \\ \code{omp\_set/unset/test\_nest\_lock}*** }
%                                                    & during \\
% \hline
% task scheduling point                              & \makecell[l]{immediately \\ before and after} \\
% \hline
% \end{tabular}
% %\caption {Execution Location for Implicit Flushes. }
%
% \end{center}
% %\end{table}
%
% * without clauses and with \code{threads} or \code{depend} clauses \newline
% ** when \plc{cancel-var} ICV is \plc{true} (cancellation is turned on) and cancellation is activated \newline
% *** if the region causes the lock to be set or unset
%
% A flush with a list is implied for non-sequentially consistent \code{atomic} operations
% (\code{atomic} directive without a \code{seq\_cst} clause), where the list item is the
% specific storage location accessed atomically (specified as the \plc{x} variable
% in \plc{atomic Construct} subsection of the OpenMP Specifications document).

% Examples 1-3 show the difficulty of synchronizing threads through \code{flush} and \code{atomic} directives.


%===== Examples Sections =====
\input{memory_model/mem_model}
\input{memory_model/allocators}
\input{memory_model/fort_race}