% Mirror of https://github.com/OpenMP/Examples.git (synced 2025-04-04 05:41:33 +01:00)
%\pagebreak
\section{Memory Allocators}
\label{sec:allocators}

\index{memory allocators!allocator traits}
\index{memory allocators!memory space}
\index{memory allocators!omp_alloc routine@\kcode{omp_alloc} routine}
\index{memory allocators!allocators directive@\kcode{allocators} directive}

\index{omp_alloc routine@\kcode{omp_alloc} routine}
\index{routines!omp_alloc@\kcode{omp_alloc}}

\index{directives!allocators@\kcode{allocators}}
\index{allocators directive@\kcode{allocators} directive}
\index{allocators directive@\kcode{allocators} directive!allocator clause@\kcode{allocator} clause}

\index{clauses!allocator@\kcode{allocator}}
\index{allocator clause@\kcode{allocator} clause}
\index{omp_init_allocator routine@\kcode{omp_init_allocator} routine}
\index{routines!omp_init_allocator@\kcode{omp_init_allocator}}

OpenMP memory allocators can be used to allocate memory with
specific allocator traits. In the following example an OpenMP allocator is used to
specify an alignment for arrays \ucode{x} and \ucode{y}. The
general approach for attributing traits to variables allocated by
OpenMP is to create or specify a predefined \plc{memory space}, create an
array of \plc{traits}, and then form an \plc{allocator} from the
memory space and traits.
The allocator is then specified
in an OpenMP allocation (using an API \kcode{omp_alloc()} function
for C/C++ code and an \kcode{allocators} directive for Fortran code
in the \example{allocators.1} example).

In the example below the \ucode{xy_memspace} variable is declared
and assigned the default memory space (\kcode{omp_default_mem_space}).
Next, an array for \plc{traits} is created. Since only one
trait will be used, the array size is \ucode{1}.
A trait is a structure in C/C++ and a derived type in Fortran,
containing two components: a key and a corresponding value (key-value pair).
The trait key used here is \kcode{omp_atk_alignment} (an enum for C/C++
and a parameter for Fortran)
and the trait value of 64 is specified in the \ucode{xy_traits} declaration.
These declarations are followed by a call to the
\kcode{omp_init_allocator()} function to combine the memory
space (\ucode{xy_memspace}) and the traits (\ucode{xy_traits})
to form an allocator (\ucode{xy_alloc}).

%In the C/C++ code the API \plc{omp\_allocate()} function is used
%to allocate space, similar to \plc{malloc}, except that the allocator
%is specified as the second argument.
%In Fortran an API allocation function is not available.
%An \code{allocate} construct is used (with \plc{x} and \plc{y}
%listed as the variables to be allocated), along
%with an \code{allocator} clause (specifying the \plc{xy\_alloc} as the allocator)
%for the following Fortran \plc{allocate} statement.

In the C/C++ code the API \kcode{omp_alloc()} function is used
to allocate space, similar to \bcode{malloc}, except that the allocator
is specified as the second argument.
In Fortran an \kcode{allocators} directive is used to specify an allocator
for the following Fortran \bcode{allocate} statement.
A variable list in the \kcode{allocate} clause may be supplied if the allocator
is to be applied to a subset of variables in the Fortran \bcode{allocate}
statement.
Here, the \ucode{xy_alloc} allocator is specified
in the \kcode{allocator} modifier of the \kcode{allocate} clause,
and the set of all variables used in the \bcode{allocate} statement is specified in the list.
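
The C/C++ steps described above (memory space, trait array, allocator,
allocation with \kcode{omp_alloc}) can be condensed into the following minimal sketch.
It is not the \example{allocators.1} source: the helper names \ucode{is_aligned} and
\ucode{check_xy_alignment} are hypothetical, and the \ucode{_OPENMP} guard with a C11
\bcode{aligned_alloc} fallback is an assumption added only so that the sketch still
builds when OpenMP is disabled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

enum { XY_ALIGNMENT = 64 };   /* alignment trait value used in the text */

/* Returns 1 when p is non-NULL and its address is a multiple of XY_ALIGNMENT. */
static int is_aligned(const void *p)
{
    return p != NULL && ((uintptr_t)p % XY_ALIGNMENT) == 0;
}

/* Hypothetical helper: allocates x and y (n floats each) with 64-byte
   alignment, checks the alignment, and releases everything again. */
int check_xy_alignment(size_t n)
{
    int ok;
    assert(n > 0);
#ifdef _OPENMP
    /* Memory space + one key-value trait -> allocator. */
    omp_memspace_handle_t  xy_memspace  = omp_default_mem_space;
    omp_alloctrait_t       xy_traits[1] = { { omp_atk_alignment, XY_ALIGNMENT } };
    omp_allocator_handle_t xy_alloc =
        omp_init_allocator(xy_memspace, 1, xy_traits);

    /* Like malloc, but the allocator is passed as the second argument. */
    float *x = (float *)omp_alloc(n * sizeof(float), xy_alloc);
    float *y = (float *)omp_alloc(n * sizeof(float), xy_alloc);
    ok = is_aligned(x) && is_aligned(y);

    /* Free the memory before destroying the allocator that provided it. */
    omp_free(x, xy_alloc);
    omp_free(y, xy_alloc);
    omp_destroy_allocator(xy_alloc);
#else
    /* Assumption: without OpenMP, C11 aligned_alloc stands in for omp_alloc;
       the size is rounded up to a multiple of the alignment. */
    size_t bytes = (n * sizeof(float) + XY_ALIGNMENT - 1)
                   / XY_ALIGNMENT * XY_ALIGNMENT;
    float *x = (float *)aligned_alloc(XY_ALIGNMENT, bytes);
    float *y = (float *)aligned_alloc(XY_ALIGNMENT, bytes);
    ok = is_aligned(x) && is_aligned(y);
    free(x);
    free(y);
#endif
    return ok;
}
```

Calling \ucode{check_xy_alignment(1000)} is expected to return 1 when the implementation
honors the alignment trait.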

%"for a following Fortran allocation statement" (no using "immediately" here)
% it looks like if you have a list, the allocation statement does not need
% to follow immediately.(?)
% spec5.0 157:19-20 The allocate directive must appear in the same scope as
% the declarations of each of its list items and must follow all such declarations.

\cexample[5.0]{allocators}{1}
\ffreeexample[5.2]{allocators}{1}

When using the \kcode{allocators} construct with optional clauses in Fortran code,
users should be aware of the behavior of a reallocation.

In the following example, the \ucode{a} variable is allocated with 64-byte
alignment through the \kcode{align} modifier of the \kcode{allocate} clause
on the \kcode{allocators} construct.
%The alignment of the newly allocated object, \splc{a}, in the (reallocation)
%assignment \splc{a = b} may not be the same as before.
The object newly allocated for \ucode{a} in the (reallocation)
assignment \ucode{a = b} will not have the 64-byte alignment, but
the 32-byte alignment prescribed by the trait of the \ucode{my_alloctr}
allocator. It is best to avoid this problem by constructing and using an
allocator (not the \kcode{align} modifier) with the required alignment in
the \kcode{allocators} construct.
Note that the subsequent
deallocation of \ucode{a} must precede the destruction
of the allocator used in the allocation of \ucode{a}.

\ffreeexample[5.2]{allocators}{2}

When creating an allocator and using it in an \kcode{allocators} construct within a
Fortran procedure for allocating storage (and subsequently destroying the allocator with the
\kcode{omp_destroy_allocator} routine), users should be aware of the necessity
of using an explicit Fortran deallocation instead of relying on auto-deallocation.

In the following example, a user-defined allocator is used in the allocation
of the \ucode{c} variable, and then the allocator is destroyed.
Auto-deallocation at the end of the \ucode{broken_auto_deallocation} procedure
will fail without the allocator, hence an explicit deallocation should be used
(before the call to the \kcode{omp_destroy_allocator} routine).
Note that an allocator may be specified directly in the \kcode{allocate} clause
without using the \kcode{allocator} complex modifier, so long as no other modifier
is specified in the clause.

\ffreeexample[5.2]{allocators}{3}
\pagebreak

\index{directives!allocate@\kcode{allocate}}
\index{allocate directive@\kcode{allocate} directive}
\index{allocate directive@\kcode{allocate} directive!allocator clause@\kcode{allocator} clause}

The \kcode{allocate} directive is a convenient way to apply an OpenMP
allocator to the allocation of declared variables.

This example illustrates the allocation of specific types of storage in a program
for use in libraries, privatized variables, and with offloading.

Two groups of variables, \{\ucode{v1, v2}\} and \{\ucode{v3, v4}\}, are used with the \kcode{allocate}
directive, and the \{\ucode{v5, v6}\} pair is used with the \kcode{allocate} clause.
Here the predefined allocators \kcode{omp_high_bw_mem_alloc} and \kcode{omp_default_mem_alloc}
are used explicitly with the \kcode{allocate} directive in CASE 1. Similar effects are achieved
for private variables of a task
by using the \kcode{allocate} clause, as shown in CASE 2.

Note that when the \kcode{allocate} directive does not specify an \kcode{allocator} clause, an
implementation-defined default, stored in the \plc{def-allocator-var} ICV, is used
(not illustrated here).
Users can set and get the default allocator with the \kcode{omp_set_default_allocator}
and \kcode{omp_get_default_allocator} API routines.
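
As a rough illustration of the \kcode{allocate} clause on a task (in the spirit of CASE 2),
the following sketch privatizes a scratch array in a task and requests its storage from a
predefined allocator. The function name \ucode{task_sum_of_squares}, the array
\ucode{scratch}, and the constant \ucode{NBUF} are hypothetical and do not come from the
\example{allocators.4} source. The sketch uses \kcode{omp_default_mem_alloc};
\kcode{omp_high_bw_mem_alloc} could be named instead to request high-bandwidth memory.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

enum { NBUF = 8 };   /* hypothetical scratch size */

/* Hypothetical helper: sums the squares 0^2 .. (NBUF-1)^2 inside a task whose
   private scratch array is allocated through an OpenMP allocator. */
int task_sum_of_squares(void)
{
    int result = 0;
    int scratch[NBUF];   /* privatized in the task below */

    #pragma omp parallel
    #pragma omp single
    {
        /* The allocate clause names the allocator for the private copy;
           its list items must also appear in a data-sharing clause
           (here: private) on the same construct. */
        #pragma omp task private(scratch) shared(result) \
                         allocate(omp_default_mem_alloc: scratch)
        {
            for (int i = 0; i < NBUF; i++)
                scratch[i] = i * i;
            for (int i = 0; i < NBUF; i++)
                result += scratch[i];
        }
        #pragma omp taskwait
    }
    return result;
}
```

When compiled without OpenMP support the pragmas are ignored and the function computes
the same sum serially.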

\cexample[5.1]{allocators}{4}
\ffreeexample[5.1]{allocators}{4}

\index{uses_allocators clause@\kcode{uses_allocators} clause}
\index{clauses!uses_allocators@\kcode{uses_allocators}}

The use of allocators in \kcode{target} regions is facilitated by the
\kcode{uses_allocators} clause as shown in the cases below.

In CASE 1, the predefined \kcode{omp_cgroup_mem_alloc} allocator is made available on the
device in the first \kcode{target} construct as specified in the \kcode{uses_allocators} clause.
The allocator is then used in the \kcode{allocate}
clause of the \kcode{teams} construct to allocate a private array for each
team (contention group). The private \ucode{xbuf} arrays that are filled by each
team are reduced as specified in the \kcode{reduction} clause on the \kcode{teams} construct.

In CASE 2, user-defined traits are specified in the \ucode{cgroup_traits} variable.
An allocator is initialized for the \kcode{target} region in the \kcode{uses_allocators} clause,
and the traits specified in \ucode{cgroup_traits} are included by the \kcode{traits} modifier.

In CASE 3, the \ucode{cgroup_alloc} variable is initialized on the host with traits
and a memory space. However, these are ignored by the \kcode{uses_allocators} clause
and a new allocator for the \kcode{target} region is initialized with default traits.

\cexample[5.2]{allocators}{5}
\ffreeexample[5.2]{allocators}{5}

\index{dynamic_allocators clause@\kcode{dynamic_allocators} clause}
\index{clauses!dynamic_allocators@\kcode{dynamic_allocators}}

The following example shows how to make an allocator available in a \kcode{target} region
without specifying a \kcode{uses_allocators} clause.

In CASE 1, the predefined \kcode{omp_cgroup_mem_alloc} allocator is used in the \kcode{target}
region as in CASE 1 of the previous example, but without specifying a \kcode{uses_allocators} clause.
This is accomplished by specifying the \kcode{requires} directive with a
\kcode{dynamic_allocators} clause in the same compilation unit, to remove
restrictions on allocator usage in \kcode{target} regions.

CASE 2 also uses the \kcode{dynamic_allocators} clause to remove allocator
restrictions in \kcode{target} regions. Here, an allocator is initialized
by calling the \kcode{omp_init_allocator} routine in the \kcode{target} region.
The allocator is then used for the allocations of array \ucode{xbuf} in
an \kcode{allocate} clause of the \kcode{target teams} construct
for each team and destroyed after its use.
The use of separate \kcode{target} regions is needed here since
no statement is allowed between a \kcode{target} directive and
its nested \kcode{teams} construct.
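
The CASE 1 pattern can be sketched roughly as follows. This is not the
\example{allocators.6} source: the function name \ucode{team_buffer_sum}, the constant
\ucode{N}, and the \ucode{nteams} counter are hypothetical. The counter is an assumption
added so that the returned value does not depend on how many teams the implementation
actually creates. The sketch also assumes a compiler that accepts the \kcode{requires}
directive with the \kcode{dynamic_allocators} clause.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Removes the need for a uses_allocators clause on target constructs
   in this compilation unit. */
#pragma omp requires dynamic_allocators

enum { N = 16 };   /* hypothetical per-team buffer length */

/* Hypothetical helper: every team fills a private buffer obtained from the
   predefined omp_cgroup_mem_alloc allocator and adds its contents to sum.
   Returns the per-team contribution (sum divided by the team count). */
int team_buffer_sum(void)
{
    int sum = 0;
    int nteams = 0;
    int xbuf[N];   /* privatized per team below */

    #pragma omp target teams num_teams(4) private(xbuf) \
                allocate(omp_cgroup_mem_alloc: xbuf) reduction(+: sum, nteams)
    {
        for (int i = 0; i < N; i++)
            xbuf[i] = 1;
        for (int i = 0; i < N; i++)
            sum += xbuf[i];
        nteams += 1;   /* each team contributes exactly once */
    }

    assert(nteams > 0);
    return sum / nteams;
}
```

Each team adds \ucode{N} to \ucode{sum}, so the function is expected to return \ucode{N}
regardless of the number of teams, including the serial case in which the pragmas are ignored.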

\cexample[5.2]{allocators}{6}
\ffreeexample[5.2]{allocators}{6}