mirror of
https://github.com/OpenMP/Examples.git
synced 2025-04-04 05:41:33 +01:00
168 lines
11 KiB
TeX
168 lines
11 KiB
TeX
\pagebreak
|
|
\section{Pointer Mapping}
|
|
\label{sec:pointer_mapping}
|
|
\index{mapping!pointer}
|
|
\index{mapping!pointer attachment}
|
|
\index{pointer attachment}
|
|
|
|
Pointers that contain host addresses require that those addresses are translated to device addresses for them to be useful in the context of a device data environment. Broadly speaking, there are two scenarios where this is important.
|
|
|
|
The first scenario is where the pointer is mapped to the device data environment, such that references to the pointer inside a \kcode{target} region are to the corresponding pointer. Pointer \plc{attachment} ensures that the corresponding pointer will contain a device address when all of the following conditions are true:
|
|
\begin{itemize}
|
|
\item the pointer is mapped by directive $A$ to a device;
|
|
\item a list item that uses the pointer as its base pointer (call it the \emph{pointee}) is mapped, to the same device, by directive $B$, which may be the same as $A$;
|
|
\item the effect of directive $B$ is to create either the corresponding pointer or pointee in the device data environment of the device.
|
|
\end{itemize}
|
|
|
|
Given the above conditions, pointer attachment is initiated as a result of directive $B$ and subsequent references to the pointee list item in a target region that use the pointer will access the corresponding pointee. The corresponding pointer remains in this \plc{attached} state until it is removed from the device data environment.
|
|
|
|
The second scenario, which is only applicable for C/C++, is where the pointer is implicitly privatized inside a \kcode{target} construct when it appears as the base pointer to a list item on the construct and does not appear explicitly as a list item in a \kcode{map} clause, \kcode{is_device_ptr} clause, or data-sharing attribute clause. This scenario can be further split into two cases: the list item is a zero-length array section (e.g., \ucode{p[:0]}) or it is not.
|
|
|
|
If it is a zero-length array section, this will trigger a runtime check on entry to the \kcode{target} region for a previously mapped list item where the value of the pointer falls within the range of its base address and ending address. If such a match is found the private pointer is initialized to the device address corresponding to the value of the original pointer, and otherwise it is initialized to \bcode{NULL} (or retains its original value if the \kcode{unified_address} requirement is specified for that compilation unit).
|
|
|
|
If the list item (again, call it the \emph{pointee}) is not a zero-length array section, the private pointer will be initialized such that references in the \kcode{target} region to the pointee list item that use the pointer will access the corresponding pointee.
|
|
|
|
The following example shows the basics of mapping pointers with and without
|
|
associated storage on the host.
|
|
|
|
Storage for pointers \ucode{ptr1} and \ucode{ptr2} is created on the host.
|
|
To map storage that is associated with a pointer on the host, the data can be
|
|
explicitly mapped as an array section so that the compiler knows
|
|
the amount of data to be assigned in the device (to the \plc{corresponding} data storage area).
|
|
On the \kcode{target} construct array sections are mapped; however, the pointer \ucode{ptr1}
|
|
is mapped, while \ucode{ptr2} is not. Since \ucode{ptr2} is not explicitly mapped, it is
|
|
firstprivate. This creates a subtle difference in the way these pointers can be used.
|
|
|
|
As a firstprivate pointer, \ucode{ptr2} can be manipulated on the device;
|
|
however, as an explicitly mapped pointer,
|
|
\ucode{ptr1} becomes an \emph{attached} pointer and cannot be manipulated.
|
|
In both cases the host pointer is not updated with the device pointer
|
|
address---as one would expect for distributed memory.
|
|
The storage data on the host is updated from the corresponding device
|
|
data at the end of the \kcode{target} region.
|
|
|
|
As a comparison, note that the \ucode{aray} array is automatically mapped,
|
|
since the compiler knows the extent of the array.
|
|
|
|
The pointer \ucode{ptr3} is used inside the \kcode{target} construct, but it does
|
|
not appear in a data-mapping or data-sharing clause. Nor is there a
|
|
\kcode{defaultmap} clause on the construct to indicate what its implicit
|
|
data-mapping or data-sharing attribute should be. For such a case, \ucode{ptr3}
|
|
will be implicitly privatized within the construct and there will be a runtime
|
|
check to see if the host memory to which it is pointing has corresponding memory
|
|
in the device data environment. If this runtime check passes, the private
|
|
\ucode{ptr3} would be initialized to point to the corresponding memory. But in
|
|
this case the check does not pass and so it is initialized to null.
|
|
Since \ucode{ptr3} is private, the value to which it is assigned in the
|
|
\kcode{target} region is not returned into the original \ucode{ptr3} on the host.
|
|
|
|
\cexample[5.0]{target_ptr_map}{1}
|
|
|
|
\index{directives!begin declare target@\kcode{begin declare target}}
|
|
\index{begin declare target directive@\kcode{begin declare target} directive}
|
|
|
|
In the following example the global pointer \ucode{p} appears in a
|
|
declare target directive. Hence, the pointer \ucode{p} will
|
|
persist on the device throughout executions in all \kcode{target} regions.
|
|
|
|
The pointer is also used in an array section of a \kcode{map} clause on
|
|
a \kcode{target} construct. When the pointer of storage associated with
|
|
a declare target directive
|
|
is mapped, as for the array section \ucode{p[:N]} in the
|
|
\kcode{target} construct, the array section on the device is \emph{attached}
|
|
to the device pointer \ucode{p} on entry to the construct, and
|
|
the value of the device pointer \ucode{p} becomes undefined on exit.
|
|
(Of course, storage allocation for
|
|
the array section on the device will occur before the
|
|
pointer on the device is attached.)
|
|
% For globals with declare target is there such a things a
|
|
% original and corresponding?
|
|
|
|
\cexample[5.1]{target_ptr_map}{2}
|
|
|
|
\index{directives!begin declare target@\kcode{begin declare target}}
|
|
\index{begin declare target directive@\kcode{begin declare target} directive}
|
|
|
|
The following two examples illustrate subtle differences in pointer attachment
|
|
to device address because of the order of data mapping.
|
|
|
|
In example \example{target_ptr_map.3a}
|
|
the global pointer \ucode{p1} points to array \ucode{x} and \ucode{p2} points to
|
|
array \ucode{y} on the host.
|
|
The array section \ucode{x[:N]} is mapped by the \kcode{target enter data} directive while array \ucode{y} is mapped
|
|
on the \kcode{target} construct.
|
|
Since the \kcode{begin declare target} directive is applied to the declaration
|
|
of \ucode{p1}, \ucode{p1} is a treated like a mapped variable on the \kcode{target}
|
|
construct and references to \ucode{p1} inside the construct will be to the
|
|
corresponding \ucode{p1} that exists on the device. However, the corresponding
|
|
\ucode{p1} will be undefined since there is no pointer attachment for it. Pointer
|
|
attachment for \ucode{p1} would require that (1) \ucode{p1} (or an lvalue
|
|
expression that refers to the same storage as \ucode{p1}) appears as a base
|
|
pointer to a list item in a \kcode{map} clause, and (2) the construct that has
|
|
the \kcode{map} clause causes the list item to transition from \emph{not mapped}
|
|
to \emph{mapped}. The conditions are clearly not satisfied for this example.
|
|
|
|
The problem for \ucode{p2} in this example is also subtle. It will be privatized
|
|
inside the \kcode{target} construct, with a runtime check for whether the memory
|
|
to which it is pointing has corresponding memory that is accessible on the
|
|
device. If this check is successful, then the \ucode{p2} inside the construct
|
|
would be appropriately initialized to point to that corresponding memory.
|
|
Unfortunately, despite there being an implicit map of the array \ucode{y} (to
|
|
which \ucode{p2} is pointing) on the construct, the order of this map relative to
|
|
the initialization of \ucode{p2} is unspecified. Therefore, the initial value of
|
|
\ucode{p2} will also be undefined.
|
|
|
|
Thus, referencing values via either \ucode{p1} or \ucode{p2} inside
|
|
the \kcode{target} region would be invalid.
|
|
|
|
\cexample[5.1]{target_ptr_map}{3a}
|
|
|
|
In example \example{target_ptr_map.3b} the mapping orders for arrays \ucode{x}
|
|
and \ucode{y} were rearranged to allow proper pointer attachments.
|
|
On the \kcode{target} construct, the \kcode{map(\ucode{x})} clause triggers pointer
|
|
attachment for \ucode{p1} to the device address of \ucode{x}.
|
|
Pointer \ucode{p2} is assigned the device address of the previously mapped
|
|
array \ucode{y}.
|
|
Referencing values via either \ucode{p1} or \ucode{p2} inside the \kcode{target} region is now valid.
|
|
|
|
\cexample[5.1]{target_ptr_map}{3b}
|
|
%\clearpage
|
|
|
|
\index{routines!omp_target_is_accessible@\kcode{omp_target_is_accessible}}
|
|
\index{omp_target_is_accessible routine@\kcode{omp_target_is_accessible} routine}
|
|
|
|
In the following example, storage allocated on the host is not mapped in a \kcode{target}
|
|
region if it is determined that the host memory is accessible from the device.
|
|
On platforms that support host memory access from a target device,
|
|
it may be more efficient to omit map clauses and avoid the potential memory allocation
|
|
and data transfers that may result from the map.
|
|
%For discrete memory storage on host and devices, explicit mapping may be required, whereas for
|
|
%Unified Shared Memory platforms it may be optimal to avoid using map clauses,
|
|
%because re-allocation of the space may occur when map clauses are present.
|
|
The \kcode{omp_target_is_accessible} API routine is used to determine if the
|
|
host storage of size \ucode{buf_size} is accessible on the device, and a metadirective
|
|
is used to select the directive variant (a \kcode{target} with/without a \kcode{map} clause).
|
|
|
|
The \kcode{omp_target_is_accessible} routine will return true if the storage indicated
|
|
by the first and second arguments is accessible on the target device. In this case,
|
|
the host pointer \ucode{ptr} may be directly dereferenced in the subsequent
|
|
\kcode{target} region to access this storage, rather than mapping an array section based
|
|
off the pointer. By explicitly specifying the host pointer in a \kcode{firstprivate}
|
|
clause on the construct, its original value will be used directly in the \kcode{target} region.
|
|
In OpenMP 5.1, removing the \kcode{firstprivate} clause will result in an implicit presence
|
|
check of the storage to which \ucode{ptr} points, and since this storage is not mapped by the
|
|
program, \ucode{ptr} will be \bcode{NULL}-initialized in the \kcode{target} region.
|
|
In the OpenMP 5.2 Specification, a false presence check without
|
|
the \kcode{firstprivate} clause will cause the pointer to retain its original value.
|
|
|
|
\cexample[5.2]{target_ptr_map}{4}
|
|
|
|
\index{mapping!deep copy}
|
|
Similar to the previous example, the \kcode{omp_target_is_accessible} routine is used to
|
|
discover if a deep copy is required for the platform. Here, the \ucode{deep_copy} map,
|
|
defined in the \kcode{declare mapper} directive, is used if the host storage referenced by
|
|
\ucode{s.ptr} (or \ucode{s\%ptr} in Fortran) is not accessible from the device.
|
|
|
|
\cexample[5.2]{target_ptr_map}{5}
|
|
\ffreeexample[5.2]{target_ptr_map}{5}
|