\cchapter{Memory Model}{memory_model}
\label{chap:memory_model}

OpenMP provides a shared-memory model that allows all threads on a given
device shared access to \emph{memory}. For a given OpenMP region that may be
executed by more than one thread or SIMD lane, variables in memory may be
\plc{shared} or \plc{private} with respect to those threads or SIMD lanes.
A variable's data-sharing attribute indicates whether it is shared (the
\plc{shared} attribute) or private (the \plc{private}, \plc{firstprivate},
\plc{lastprivate}, \plc{linear}, and \plc{reduction} attributes) in the data
environment of an OpenMP region. While private variables in an OpenMP region
are new copies of the original variable (with the same name) that may then be
concurrently accessed or modified by their respective threads or SIMD lanes, a
shared variable in an OpenMP region is the same as the variable of the same
name in the enclosing region. Concurrent accesses or modifications to a shared
variable may therefore require synchronization to avoid data races.

OpenMP's memory model also includes a \emph{temporary view} of memory that is
associated with each thread. Two different threads may see different values
for a given variable in their respective temporary views. Threads may employ
flush operations to make their temporary view of a variable consistent with
the value of the variable in memory. The effect of a given flush operation is
characterized by its flush properties -- some combination of \plc{strong},
\plc{release}, and \plc{acquire} -- and, for \plc{strong} flushes, a
\plc{flush-set}.

A \plc{strong} flush will force consistency between the temporary view and the
memory for all variables in its \plc{flush-set}. Furthermore, all strong
flushes in a program that have intersecting flush-sets will execute in some
total order, and within a thread a strong flush may not be reordered with
respect to other memory operations on variables in its flush-set.
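
The distinction between shared and private variables described above can be
sketched as follows. This is a minimal illustration rather than one of the
chapter's numbered examples: the function name \texttt{sum\_of\_squares} and
its variables are hypothetical. Each thread works on its own private copy of
\texttt{sq} without synchronization, while updates to the shared accumulator
\texttt{total} are protected by an \kcode{atomic} construct to avoid a data race.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the OpenMP Examples sources):
 * returns the sum of the squares of a[0..n-1]. */
long sum_of_squares(const int *a, int n)
{
    assert(a != NULL && n >= 0);

    long total = 0;  /* shared: a single copy seen by every thread */
    long sq = 0;     /* private: each thread gets its own uninitialized copy */

    #pragma omp parallel for shared(total, a, n) private(sq)
    for (int i = 0; i < n; i++) {
        /* Writes to the private copy need no synchronization. */
        sq = (long)a[i] * a[i];

        /* Concurrent updates to the shared variable must be synchronized;
         * without this directive the update would be a data race. */
        #pragma omp atomic update
        total += sq;
    }
    return total;
}
```

In practice a \plc{reduction} attribute on \texttt{total} would usually be
preferred for this pattern; the \kcode{atomic} construct is used here only to
make the synchronization on the shared variable explicit.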
\plc{Release} and \plc{acquire} flushes operate in pairs. A release flush may
``synchronize'' with an acquire flush, and when it does so the local memory
operations that precede the release flush will appear to have been completed
before the local memory operations on the same variables that follow the
acquire flush.

Flush operations arise from explicit \kcode{flush} directives, implicit
\kcode{flush} directives, and also from the execution of \kcode{atomic}
constructs. The \kcode{flush} directive forces a consistent view of local
variables of the thread executing the \kcode{flush}. When a list is supplied
on the directive, only the items (variables) in the list are guaranteed to be
flushed.

Implied flushes exist at prescribed locations of certain constructs. For the
complete list of these locations and associated constructs, refer to the
\docref{\kcode{flush} Construct} section of the OpenMP Specifications document.

In this chapter, examples illustrate how race conditions may arise for
accesses to variables with a \plc{shared} data-sharing attribute when flush
operations are not properly employed. A race condition can exist when two or
more threads access a variable and at least one of the accesses modifies the
variable. In particular, a data race will arise when conflicting accesses do
not have a well-defined \emph{completion order}. The existence of data races
in an OpenMP program results in undefined behavior, and so they should be
avoided for the program to be correct. The completion order of accesses to a
shared variable is guaranteed in OpenMP through a set of memory consistency
rules that are described in the \docref{OpenMP Memory Consistency} section of
the OpenMP Specifications document.

%This chapter also includes examples that exhibit non-sequentially consistent
%(\emph{non-SC}) behavior.
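
The pairing of release and acquire flushes described above can be sketched
with \kcode{atomic} constructs. This is an illustrative sketch under stated
assumptions, not one of the chapter's numbered examples: the function name
\texttt{pass\_value} and its variables are hypothetical, the memory-order
clauses assume a compiler that supports OpenMP 5.0 or later, and the spin loop
assumes that the two sections run on different threads or, on a single thread,
in source order.

```c
#include <assert.h>

/* Hypothetical helper (not from the OpenMP Examples sources):
 * hands `value` from a producer section to a consumer section. */
int pass_value(int value)
{
    int data = 0;       /* shared payload, written with an ordinary store */
    int flag = 0;       /* shared signal, accessed only via atomic constructs */
    int received = -1;  /* written by the consumer section only */

    #pragma omp parallel sections shared(data, flag, received)
    {
        #pragma omp section
        {
            /* Producer: the write to data precedes the release flush. */
            data = value;
            #pragma omp atomic write release
            flag = 1;
        }

        #pragma omp section
        {
            /* Consumer: spin until the acquire read observes flag == 1. */
            int seen = 0;
            while (!seen) {
                #pragma omp atomic read acquire
                seen = flag;
            }
            /* The release flush synchronized with the acquire flush,
             * so this read of data has a well-defined completion order. */
            received = data;
        }
    }

    /* The implicit barrier at the end of the region makes received visible. */
    assert(received == value);
    return received;
}
```

If the \plc{release} and \plc{acquire} clauses were removed and no other flush
were introduced, the write and read of \texttt{data} would no longer be ordered
with respect to each other, and the program would contain a data race.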
%Sequential consistency (\emph{SC}) is the desirable
%property that the results of a multi-threaded program are as if all operations
%are performed in some total order, consistent with the program order of
%operations performed by each thread. OpenMP guarantees that a correct program
%(i.e. a program that does not have a data race) will exhibit SC behavior
%so long as the only \code{atomic} constructs it uses are SC atomic directives.

% The following table lists construct in which implied flushes exist, and the
% location of their execution.
%
% %\begin{table}[hb]
%   \begin{center}
% %\caption {Execution Location for Implicit Flushes. }
%   \begin{tabular}{ | p{0.6\linewidth} | l | }
%     \hline
%     \code{CONSTRUCT} & \makecell{\code{EXECUTION} \\ \code{LOCATION}} \\
%     \hline
%     \code{parallel} & upon entry and exit \\
%     \hline
%     \makecell[l]{worksharing \\ \hspace{1.5em}\code{for}, \code{do}
%                              \\ \hspace{1.5em}\code{sections}
%                              \\ \hspace{1.5em}\code{single}
%                              \\ \hspace{1.5em}\code{workshare} }
%                              & upon exit \\
%     \hline
%     \code{critical} & upon entry and exit \\
%     \hline
%     \code{target} & upon entry and exit \\
%     \hline
%     \code{barrier} & during \\
%     \hline
%     \code{atomic} operation with \plc{seq\_cst} clause & upon entry and exit \\
%     \hline
%     \code{ordered}* & upon entry and exit \\
%     \hline
%     \code{cancel}** and \code{cancellation point}** & during \\
%     \hline
%     \code{target data} & upon entry and exit \\
%     \hline
%     \code{target update} + \code{to} clause,
%     \code{target enter data} & on entry \\
%     \hline
%     \code{target update} + \code{from} clause,
%     \code{target exit data} & on exit \\
%     \hline
%     \code{omp\_set\_lock} & during \\
%     \hline
%     \makecell[l]{ \code{omp\_set/unset\_lock}, \code{omp\_test\_lock}***
%                \\ \code{omp\_set/unset/test\_nest\_lock}*** }
%                & during \\
%     \hline
%     task scheduling point & \makecell[l]{immediately \\ before and after} \\
%     \hline
%   \end{tabular}
% %\caption {Execution Location for Implicit Flushes. }
%   \end{center}
% %\end{table}
%
% * without clauses and with \code{threads} or \code{depend} clauses \newline
% ** when \plc{cancel-var} ICV is \plc{true} (cancellation is turned on) and cancellation is activated \newline
% *** if the region causes the lock to be set or unset
%
% A flush with a list is implied for non-sequentially consistent \code{atomic} operations
% (\code{atomic} directive without a \code{seq\_cst} clause), where the list item is the
% specific storage location accessed atomically (specified as the \plc{x} variable
% in \plc{atomic Construct} subsection of the OpenMP Specifications document).

% Examples 1-3 show the difficulty of synchronizing threads through \code{flush} and \code{atomic} directives.

%===== Examples Sections =====

\input{memory_model/mem_model}
\input{memory_model/allocators}
\input{memory_model/fort_race}