% Mirror of https://github.com/OpenMP/Examples.git (synced 2025-04-04 05:41:33 +01:00)
\pagebreak
\section{Pointer mapping}
\label{sec:pointer_mapping}

Pointers that contain host addresses require that those addresses are translated to device addresses for them to be useful in the context of a device data environment. Broadly speaking, there are two scenarios where this is important.

The first scenario is where the pointer is mapped to the device data environment, such that references to the pointer inside a \code{target} region are to the corresponding pointer. Pointer attachment ensures that the corresponding pointer will contain a device address when all of the following conditions are true:
\begin{itemize}
\item the pointer is mapped by directive $A$ to a device;
\item a list item that uses the pointer as its base pointer (call it the \emph{pointee}) is mapped, to the same device, by directive $B$, which may be the same as $A$;
\item the effect of directive $B$ is to create either the corresponding pointer or pointee in the device data environment of the device.
\end{itemize}

Given the above conditions, pointer attachment is initiated as a result of directive $B$, and subsequent references to the pointee list item in a \code{target} region that use the pointer will access the corresponding pointee. The corresponding pointer remains in this \emph{attached} state until it is removed from the device data environment.
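
The three conditions above can be sketched as follows. This is a minimal illustration and not one of the official examples: the function name \plc{attach\_sketch}, its parameter \plc{n}, and the stored values are hypothetical. Here directive $A$ is a \code{target}~\code{enter}~\code{data} directive that maps the pointer, and directive $B$ is a second one that maps the pointee and thereby creates it on the device, which should initiate the attachment.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: writes 2*i into p[i] on the device through an
   attached pointer and returns the sum of the values seen on the host,
   or -1 if the host allocation fails. */
int attach_sketch(int n)
{
    assert(n > 0);  /* precondition: non-empty array */

    int *p = malloc(n * sizeof *p);
    if (p == NULL)
        return -1;

    /* Directive A: map the pointer variable itself. */
    #pragma omp target enter data map(to: p)

    /* Directive B: map the pointee. Creating p[:n] on the device while the
       corresponding p is present attaches the corresponding p to it. */
    #pragma omp target enter data map(alloc: p[:n])

    /* Mapping p explicitly makes the region reference the corresponding
       (attached) pointer instead of a private copy. */
    #pragma omp target map(p)
    for (int i = 0; i < n; i++)
        p[i] = 2 * i;

    /* Copy the pointee back, then remove pointee and pointer from the
       device data environment. */
    #pragma omp target exit data map(from: p[:n])
    #pragma omp target exit data map(release: p)

    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += p[i];

    free(p);
    return sum;
}
```

With these hypothetical values, a call such as \plc{attach\_sketch(8)} should return 56 whether the region executes on a device or falls back to the host, since the host pointer itself is never modified.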

The second scenario, which is only applicable for C/C++, is where the pointer is implicitly privatized inside a \code{target} construct when it appears as the base pointer to a list item on the construct and does not appear explicitly as a list item in a \code{map} clause, \code{is\_device\_ptr} clause, or data-sharing attribute clause. This scenario can be further split into two cases: the list item is a zero-length array section (e.g., \plc{p[:0]}) or it is not.

If it is a zero-length array section, this will trigger a runtime check on entry to the \code{target} region for a previously mapped list item where the value of the pointer falls within the range of its base address and ending address. If such a match is found, the private pointer is initialized to the device address corresponding to the value of the original pointer; otherwise, it is initialized to NULL (or retains its original value if the \code{unified\_address} requirement is specified for that compilation unit).
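
The zero-length case can be sketched as follows, under the assumption that the array has already been mapped by an enclosing \code{target}~\code{data} region. The function name \plc{zero\_length\_sketch}, the size \plc{N}, and the parameter \plc{offset} are hypothetical and chosen only for illustration. The pointer deliberately points into the middle of the mapped array to show that the runtime check matches any address within the mapped range.

```c
#include <assert.h>

#define N 8  /* hypothetical array size */

/* Hypothetical helper: increments a[offset..N-1] on the device through an
   unmapped pointer and returns the sum of all elements of a. */
int zero_length_sketch(int offset)
{
    assert(offset >= 0 && offset < N);  /* p must point into a */

    int a[N];
    for (int i = 0; i < N; i++)
        a[i] = i;

    int *p = a + offset;  /* host address inside the storage of a */

    /* Map the array first so that it is present when the target region starts. */
    #pragma omp target data map(tofrom: a[:N])
    {
        /* p[:0] maps no data; it only requests the runtime lookup of the
           host address held in p. That address falls inside the mapped
           a[:N], so the private p is initialized to the corresponding
           device address. */
        #pragma omp target map(p[:0])
        for (int i = 0; i < N - offset; i++)
            p[i] += 1;
    }

    int sum = 0;
    for (int i = 0; i < N; i++)
        sum += a[i];
    return sum;
}
```

In this sketch the updated values reach the host through the \code{tofrom} mapping of \plc{a} on the enclosing \code{target}~\code{data} construct, not through \plc{p}.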

If the list item (again, call it the \emph{pointee}) is not a zero-length array section, the private pointer will be initialized such that references in the \code{target} region to the pointee list item that use the pointer will access the corresponding pointee.
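
A minimal sketch of this second case follows. The function name \plc{section\_sketch} and its parameter \plc{n} are hypothetical. Only the array section appears in the \code{map} clause, so the pointer itself is implicitly privatized and initialized to refer to the corresponding pointee.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: triples each element of a heap array on the device
   and returns the sum of the results, or -1 if the host allocation fails. */
int section_sketch(int n)
{
    assert(n > 0);  /* precondition: non-empty array */

    int *p = malloc(n * sizeof *p);
    if (p == NULL)
        return -1;

    for (int i = 0; i < n; i++)
        p[i] = i;

    /* Only the pointee p[:n] is mapped. The pointer p is private inside the
       region and is initialized so that p[i] accesses the corresponding
       pointee on the device. */
    #pragma omp target map(tofrom: p[:n])
    for (int i = 0; i < n; i++)
        p[i] *= 3;

    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += p[i];

    free(p);
    return sum;
}
```

Because the pointer is private to the region, the host value of \plc{p} is unaffected by the construct; only the pointee data is copied back at the end of the region.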

The following example shows the basics of mapping pointers with and without
associated storage on the host.

Storage for pointers \plc{ptr1} and \plc{ptr2} is created on the host.
To map storage that is associated with a pointer on the host, the data can be
explicitly mapped as an array section so that the compiler knows
the amount of data to be allocated on the device (in the ``corresponding'' data storage area).
On the \code{target} construct, both array sections are mapped; however, the pointer \plc{ptr1}
is also mapped, while \plc{ptr2} is not. Since \plc{ptr2} is not explicitly mapped, it is
firstprivate. This creates a subtle difference in the way these pointers can be used.

As a firstprivate pointer, \plc{ptr2} can be manipulated on the device;
however, as an explicitly mapped pointer,
\plc{ptr1} becomes an \emph{attached} pointer and cannot be manipulated.
In both cases the host pointer is not updated with the device pointer
address, as one would expect for distributed memory.
The data stored on the host is updated from the corresponding device
data at the end of the \code{target} region.

As a comparison, note that the \plc{aray} array is automatically mapped,
since the compiler knows the extent of the array.

The pointer \plc{ptr3} is used inside the \code{target} construct, but it does
not appear in a data-mapping or data-sharing clause. Nor is there a
\code{defaultmap} clause on the construct to indicate what its implicit
data-mapping or data-sharing attribute should be. For such a case, \plc{ptr3}
will be implicitly privatized within the construct and there will be a runtime
check to see if the host memory to which it is pointing has corresponding memory
in the device data environment. If this runtime check passes, the private
\plc{ptr3} would be initialized to point to the corresponding memory. In
this case, however, the check does not pass, so it is initialized to NULL.
Since \plc{ptr3} is private, the value to which it is assigned in the
\code{target} region is not returned into the original \plc{ptr3} on the host.

\cexample[5.0]{target_ptr_map}{1}

In the following example the global pointer \plc{p} appears in a
\code{declare}~\code{target} directive. Hence, the pointer \plc{p} will
persist on the device across the execution of all \code{target} regions.

The pointer is also used in an array section of a \code{map} clause on
a \code{target} construct. When storage associated with
a \code{declare}~\code{target} pointer
is mapped, as for the array section \plc{p[:N]} in the
\code{target} construct, the array section on the device is \emph{attached}
to the device pointer \plc{p} on entry to the construct, and
the value of the device pointer \plc{p} becomes undefined on exit.
(Of course, storage allocation for
the array section on the device will occur before the
pointer on the device is \emph{attached}.)
% For globals with declare target, is there such a thing as an
% original and corresponding?

\cexample[5.0]{target_ptr_map}{2}

The following two examples illustrate subtle differences in pointer attachment
to a device address that arise from the order of data mapping.

In example \plc{target\_ptr\_map.3a}
the global pointer \plc{p1} points to array \plc{x} and \plc{p2} points to
array \plc{y} on the host.
The array section \plc{x[:N]} is mapped by the \code{target}~\code{enter}~\code{data} directive while array \plc{y} is mapped
on the \code{target} construct.
Since the \code{declare}~\code{target} directive is applied to the declaration
of \plc{p1}, \plc{p1} is treated like a mapped variable on the \code{target}
construct and references to \plc{p1} inside the construct will be to the
corresponding \plc{p1} that exists on the device. However, the corresponding
\plc{p1} will be undefined since there is no pointer attachment for it. Pointer
attachment for \plc{p1} would require that (1) \plc{p1} (or an lvalue
expression that refers to the same storage as \plc{p1}) appears as a base
pointer to a list item in a \code{map} clause, and (2) the construct that has
the \code{map} clause causes the list item to transition from \emph{not mapped}
to \emph{mapped}. The conditions are clearly not satisfied for this example.

The problem for \plc{p2} in this example is also subtle. It will be privatized
inside the \code{target} construct, with a runtime check for whether the memory
to which it is pointing has corresponding memory that is accessible on the
device. If this check is successful then the \plc{p2} inside the construct
would be appropriately initialized to point to that corresponding memory.
Unfortunately, despite there being an implicit map of the array \plc{y} (to
which \plc{p2} is pointing) on the construct, the order of this map relative to
the initialization of \plc{p2} is unspecified. Therefore, the initial value of
\plc{p2} will also be undefined.

Thus, referencing values via either \plc{p1} or \plc{p2} inside
the \code{target} region would be invalid.

\cexample[5.0]{target_ptr_map}{3a}

In example \plc{target\_ptr\_map.3b} the mapping orders for arrays \plc{x}
and \plc{y} were rearranged to allow proper pointer attachments.
On the \code{target} construct, the \code{map(x)} clause triggers pointer
attachment for \plc{p1} to the device address of \plc{x}.
Pointer \plc{p2} is assigned the device address of the previously mapped
array \plc{y}.
Referencing values via either \plc{p1} or \plc{p2} inside the \code{target} region is now valid.

\cexample[5.0]{target_ptr_map}{3b}