
%\pagebreak
\section{Memory Allocators}
\label{sec:allocators}

\index{memory allocators!allocator traits}
\index{memory allocators!memory space}
\index{memory allocators!omp_alloc routine@\kcode{omp_alloc} routine}
\index{memory allocators!allocators directive@\kcode{allocators} directive}

\index{omp_alloc routine@\kcode{omp_alloc} routine}
\index{routines!omp_alloc@\kcode{omp_alloc}}

\index{directives!allocators@\kcode{allocators}}
\index{allocators directive@\kcode{allocators} directive}
\index{allocators directive@\kcode{allocators} directive!allocator clause@\kcode{allocator} clause}

\index{clauses!allocator@\kcode{allocator}}
\index{allocator clause@\kcode{allocator} clause}
\index{omp_init_allocator routine@\kcode{omp_init_allocator} routine}
\index{routines!omp_init_allocator@\kcode{omp_init_allocator}}

OpenMP memory allocators can be used to allocate memory with
specific allocator traits.  In the following example, an OpenMP allocator is used to
specify an alignment for arrays \ucode{x} and \ucode{y}. The
general approach for attributing traits to variables allocated by
OpenMP is to create or specify a predefined \plc{memory space}, create an
array of \plc{traits}, and then form an \plc{allocator} from the
memory space and traits.
The allocator is then specified
in an OpenMP allocation (the \kcode{omp_alloc()} API routine
for C/C++ code and an \kcode{allocators} directive for Fortran code
in the \example{allocators.1} example).

In the example below the \ucode{xy_memspace} variable is declared
and assigned the default memory space (\kcode{omp_default_mem_space}).
Next, an array for \plc{traits} is created. Since only one
trait will be used, the array size is \ucode{1}.
A trait is a structure in C/C++ and a derived type in Fortran,
containing two components: a key and a corresponding value (a key-value pair).
The trait key used here is \kcode{omp_atk_alignment} (an enum for C/C++
and a parameter for Fortran)
and the trait value of 64 is specified in the \ucode{xy_traits} declaration.
These declarations are followed by a call to the
\kcode{omp_init_allocator()} function to combine the memory
space (\ucode{xy_memspace}) and the traits (\ucode{xy_traits})
to form an allocator (\ucode{xy_alloc}).

%In the C/C++ code the API  \plc{omp\_allocate()} function is used
%to allocate space, similar to \plc{malloc}, except that the allocator
%is specified as the second argument.
%In Fortran an API allocation function is not available.
%An \code{allocate} construct is used (with \plc{x} and \plc{y}
%listed as the variables to be allocated), along
%with an \code{allocator} clause (specifying the \plc{xy\_alloc} as the allocator)
%for the following Fortran \plc{allocate} statement.

In the C/C++ code the \kcode{omp_alloc()} API routine is used
to allocate space, similar to \bcode{malloc}, except that the allocator
is specified as the second argument.
In Fortran an \kcode{allocators} directive is used to specify an allocator
for the following Fortran \bcode{allocate} statement.
A variable list in the \kcode{allocate} clause may be supplied if the allocator
is to be applied to a subset of variables in the Fortran \bcode{allocate}
statement.
Here, the \ucode{xy_alloc} allocator is specified
in the \kcode{allocator} modifier of the \kcode{allocate} clause,
and the set of all variables used in the \bcode{allocate} statement is specified in the list.

%"for a following Fortran allocation statement" (no using "immediately" here)
% it looks like if you have a list, the allocation statement does not need
% to follow immediately.(?)
% spec5.0 157:19-20 The allocate directive must appear in the same scope as
% the declarations of each of its list items and must follow all such declarations.

\cexample[5.0]{allocators}{1}
\ffreeexample[5.2]{allocators}{1}


When using the \kcode{allocators} construct with optional clauses in Fortran code,
users should be aware of the behavior of a reallocation.

In the following example, the \ucode{a} variable is allocated with 64-byte
alignment through the \kcode{align} modifier of the \kcode{allocate} clause.
%The alignment of the newly allocated object, \splc{a}, in the (reallocation)
%assignment \splc{a = b} may not be the same as before.
In the (reallocation) assignment \ucode{a = b}, the newly allocated object
\ucode{a} does not receive the 64-byte alignment. Instead, it receives
the 32-byte alignment prescribed by the trait of the \ucode{my_alloctr}
allocator. It is best to avoid this problem by constructing and using an
allocator (not the \kcode{align} modifier) with the required alignment in
the \kcode{allocators} construct.
Note that the subsequent deallocation of \ucode{a} must precede the destruction
of the allocator used in the allocation of \ucode{a}.

\ffreeexample[5.2]{allocators}{2}

When an \kcode{allocators} construct is used within a Fortran procedure
to allocate storage (and the allocator is subsequently destroyed with a call to
the \kcode{omp_destroy_allocator} routine), users should be aware of the necessity
of using an explicit Fortran deallocation instead of relying on auto-deallocation.

In the following example, a user-defined allocator is used in the allocation
of the \ucode{c} variable, and then the allocator is destroyed.
Auto-deallocation at the end of the \ucode{broken_auto_deallocation} procedure
will fail without the allocator, hence an explicit deallocation should be used
(before the call to the \kcode{omp_destroy_allocator} routine).
Note that an allocator may be specified directly in the \kcode{allocate} clause
without using the \kcode{allocator} complex modifier, so long as no other modifier
is specified in the clause.

\ffreeexample[5.2]{allocators}{3}
\pagebreak

\index{directives!allocate@\kcode{allocate}}
\index{allocate directive@\kcode{allocate} directive}
\index{allocate directive@\kcode{allocate} directive!allocator clause@\kcode{allocator} clause}

The \kcode{allocate} directive is a convenient way to apply an OpenMP
allocator to the allocation of declared variables.

This example illustrates the allocation of specific types of storage in a program
for use in libraries, privatized variables, and with offloading.

Two groups of variables, \{\ucode{v1, v2}\} and \{\ucode{v3, v4}\}, are used with the \kcode{allocate}
directive, and the \{\ucode{v5, v6}\} pair is used with the \kcode{allocate} clause.
Here we explicitly use predefined allocators \kcode{omp_high_bw_mem_alloc} and \kcode{omp_default_mem_alloc}
with the \kcode{allocate} directive in CASE 1. Similar effects are achieved for private variables of a task
by using the \kcode{allocate} clause, as shown in CASE 2.

Note that when the \kcode{allocate} directive does not specify an \kcode{allocator} clause,
the allocator stored in the \plc{def-allocator-var} ICV, whose initial value is
implementation defined, is used (not illustrated here).
Users can set and get the default allocator with the \kcode{omp_set_default_allocator}
and \kcode{omp_get_default_allocator} API routines.

\cexample[5.1]{allocators}{4}
\ffreeexample[5.1]{allocators}{4}

\index{uses_allocators clause@\kcode{uses_allocators} clause}
\index{clauses!uses_allocators@\kcode{uses_allocators}}

The use of allocators in \kcode{target} regions is facilitated by the
\kcode{uses_allocators} clause as shown in the cases below.

In CASE 1, the predefined \kcode{omp_cgroup_mem_alloc} allocator is made available on the
device in the first \kcode{target} construct as specified in the \kcode{uses_allocators} clause.
The allocator is then used in the \kcode{allocate}
clause of the \kcode{teams} construct to allocate a private array for each
team (contention group). The private \ucode{xbuf} arrays that are filled by each
team are reduced as specified in the \kcode{reduction} clause on the \kcode{teams} construct.

In CASE 2, user-defined traits are specified in the \ucode{cgroup_traits} variable.
An allocator is initialized for the \kcode{target} region in the \kcode{uses_allocators} clause,
and the traits specified in \ucode{cgroup_traits} are included by the \kcode{traits} modifier.

In CASE 3, the \ucode{cgroup_alloc} variable is initialized on the host with traits
and a memory space. However, these are ignored by the \kcode{uses_allocators} clause
and a new allocator for the \kcode{target} region is initialized with default traits.

\cexample[5.2]{allocators}{5}
\ffreeexample[5.2]{allocators}{5}

\index{dynamic_allocators clause@\kcode{dynamic_allocators} clause}
\index{clauses!dynamic_allocators@\kcode{dynamic_allocators}}

The following example shows how to make an allocator available in a \kcode{target} region
without specifying a \kcode{uses_allocators} clause.

In CASE 1, the predefined \kcode{omp_cgroup_mem_alloc} allocator is used in the \kcode{target}
region as in CASE 1 of the previous example, but without specifying a \kcode{uses_allocators} clause.
This is accomplished by specifying the \kcode{requires} directive with a
\kcode{dynamic_allocators} clause in the same compilation unit, to remove
restrictions on allocator usage in \kcode{target} regions.

CASE 2 also uses the \kcode{dynamic_allocators} clause to remove allocator
restrictions in \kcode{target} regions. Here, an allocator is initialized
by calling the \kcode{omp_init_allocator} routine in the \kcode{target} region.
The allocator is then used in an \kcode{allocate} clause of the
\kcode{target teams} construct to allocate the \ucode{xbuf} array
for each team, and it is destroyed after its use.
The use of separate \kcode{target} regions is needed here since
no statement is allowed between a \kcode{target} directive and
its nested \kcode{teams} construct.

\cexample[5.2]{allocators}{6}
\ffreeexample[5.2]{allocators}{6}