%\pagebreak
\section{Memory Allocators}
\label{sec:allocators}
\index{memory allocators!allocator traits}
\index{memory allocators!memory space}
\index{memory allocators!omp_alloc routine@\kcode{omp_alloc} routine}
\index{memory allocators!allocators directive@\kcode{allocators} directive}
\index{omp_alloc routine@\kcode{omp_alloc} routine}
\index{routines!omp_alloc@\kcode{omp_alloc}}
\index{directives!allocators@\kcode{allocators}}
\index{allocators directive@\kcode{allocators} directive}
\index{allocators directive@\kcode{allocators} directive!allocator clause@\kcode{allocator} clause}
\index{clauses!allocator@\kcode{allocator}}
\index{allocator clause@\kcode{allocator} clause}
\index{omp_init_allocator routine@\kcode{omp_init_allocator} routine}
\index{routines!omp_init_allocator@\kcode{omp_init_allocator}}

OpenMP memory allocators can be used to allocate memory with
specific allocator traits. In the following example an OpenMP allocator is used to
specify an alignment for arrays \ucode{x} and \ucode{y}. The
general approach for attributing traits to variables allocated by
OpenMP is to create or specify a predefined \plc{memory space}, create an
array of \plc{traits}, and then form an \plc{allocator} from the
memory space and traits. The allocator is then specified
in an OpenMP allocation (using the \kcode{omp_alloc()} API routine
for C/C++ code and an \kcode{allocators} directive for Fortran code
in the \example{allocators.1} example).

In the example below the \ucode{xy_memspace} variable is declared
and assigned the default memory space (\kcode{omp_default_mem_space}).
Next, an array for \plc{traits} is created. Since only one trait
will be used, the array size is \ucode{1}.
A trait is a structure in C/C++ and a derived type in Fortran,
containing two components: a key and a corresponding value (key-value pair).
The trait key used here is \kcode{omp_atk_alignment} (an enum for C/C++
and a parameter for Fortran) and the trait value of 64 is specified
in the \ucode{xy_traits} declaration.
These declarations are followed by a call to the
\kcode{omp_init_allocator()} function to combine the memory
space (\ucode{xy_memspace}) and the traits (\ucode{xy_traits})
to form an allocator (\ucode{xy_alloc}).

%In the C/C++ code the API \plc{omp\_allocate()} function is used
%to allocate space, similar to \plc{malloc}, except that the allocator
%is specified as the second argument.
%In Fortran an API allocation function is not available.
%An \code{allocate} construct is used (with \plc{x} and \plc{y}
%listed as the variables to be allocated), along
%with an \code{allocator} clause (specifying the \plc{xy\_alloc} as the allocator)
%for the following Fortran \plc{allocate} statement.

In the C/C++ code the \kcode{omp_alloc()} API routine is used
to allocate space, similar to \bcode{malloc}, except that the allocator
is specified as the second argument.
In Fortran an \kcode{allocators} directive is used to specify an allocator
for the following Fortran \bcode{allocate} statement.
A variable list in the \kcode{allocate} clause may be supplied if the allocator
is to be applied to a subset of variables in the Fortran \bcode{allocate} statement.
Here, the \ucode{xy_alloc} allocator is specified
in the \kcode{allocator} modifier of the \kcode{allocate} clause,
and the set of all variables used in the \bcode{allocate} statement is specified in the list.

%"for a following Fortran allocation statement" (no using "immediately" here)
% it looks like if you have a list, the allocation statement does not need
% to follow immediately.(?)
% spec5.0 157:19-20 The allocate directive must appear in the same scope as
% the declarations of each of its list items and must follow all such declarations.
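
The C/C++ steps described above (memory space, trait array, allocator, allocation)
can be sketched as follows. This is a minimal illustration under stated assumptions,
not the source of the \example{allocators.1} example: the function name
\ucode{xy_aligned_demo} and the macro \ucode{XY_ALIGN} are hypothetical, and the
non-OpenMP branch, which uses C11 \bcode{aligned_alloc}, is added only so that the
sketch still builds when OpenMP is not enabled.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define XY_ALIGN 64   /* hypothetical macro; the value matches the trait in the text */

/* Hypothetical helper: allocates two arrays of n doubles with 64-byte
 * alignment and releases them again.
 * Returns 1 if both pointers are 64-byte aligned, 0 if they are not,
 * and -1 if n is 0 or the allocator or an allocation could not be obtained. */
int xy_aligned_demo(size_t n)
{
    double *x, *y;
    int result;

    if (n == 0)
        return -1;

#ifdef _OPENMP
    /* memory space + trait array -> allocator */
    omp_memspace_handle_t  xy_memspace  = omp_default_mem_space;
    omp_alloctrait_t       xy_traits[1] = { { omp_atk_alignment, XY_ALIGN } };
    omp_allocator_handle_t xy_alloc     =
        omp_init_allocator(xy_memspace, 1, xy_traits);
    if (xy_alloc == omp_null_allocator)
        return -1;

    /* like malloc, with the allocator as the second argument */
    x = (double *)omp_alloc(n * sizeof(double), xy_alloc);
    y = (double *)omp_alloc(n * sizeof(double), xy_alloc);
#else
    /* Assumed fallback without OpenMP: aligned_alloc expects a size that
     * is a multiple of the alignment, so round up. */
    size_t bytes = (n * sizeof(double) + XY_ALIGN - 1) / XY_ALIGN * XY_ALIGN;
    x = (double *)aligned_alloc(XY_ALIGN, bytes);
    y = (double *)aligned_alloc(XY_ALIGN, bytes);
#endif

    if (x == NULL || y == NULL)
        result = -1;
    else
        result = ((uintptr_t)x % XY_ALIGN == 0) && ((uintptr_t)y % XY_ALIGN == 0);

#ifdef _OPENMP
    /* free the storage first, then destroy the allocator */
    omp_free(x, xy_alloc);
    omp_free(y, xy_alloc);
    omp_destroy_allocator(xy_alloc);
#else
    free(x);
    free(y);
#endif
    return result;
}
```

A call such as \ucode{xy_aligned_demo(1000)} should return 1 when the allocator
honors the alignment trait. Both arrays are freed before the allocator is destroyed,
which is the ordering the Fortran examples later in this section also rely on.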
\cexample[5.0]{allocators}{1}
\ffreeexample[5.2]{allocators}{1}

When using the \kcode{allocators} construct with optional modifiers in Fortran code,
users should be aware of the behavior of a reallocation.

In the following example, the \ucode{a} variable is allocated with 64-byte
alignment through the \kcode{align} modifier of the \kcode{allocate} clause
on the \kcode{allocators} construct.
%The alignment of the newly allocated object, \splc{a}, in the (reallocation)
%assignment \splc{a = b} may not be the same as before.
In the (reallocation) assignment \ucode{a = b}, the newly allocated object
\ucode{a} does not receive the 64-byte alignment, but the 32-byte alignment
prescribed by the trait of the \ucode{my_alloctr} allocator.
It is best to avoid this problem by constructing and using an allocator
(not the \kcode{align} modifier) with the required alignment in
the \kcode{allocators} construct.
Note that the subsequent deallocation of \ucode{a} must precede the
destruction of the allocator used to allocate \ucode{a}.

\ffreeexample[5.2]{allocators}{2}

When an allocator is created and used in an \kcode{allocators} construct
within a Fortran procedure for allocating storage (and the allocator is
subsequently destroyed with the \kcode{omp_destroy_allocator} routine),
users should be aware of the necessity of using an explicit Fortran
deallocation instead of relying on auto-deallocation.

In the following example, a user-defined allocator is used in the allocation
of the \ucode{c} variable, and then the allocator is destroyed.
Auto-deallocation at the end of the \ucode{broken_auto_deallocation} procedure
will fail without the allocator, hence an explicit deallocation should be used
(before the call to the \kcode{omp_destroy_allocator} routine).
Note that an allocator may be specified directly in the \kcode{allocate} clause
without using the \kcode{allocator} complex modifier, so long as no other
modifier is specified in the clause.
\ffreeexample[5.2]{allocators}{3}

\pagebreak
\index{directives!allocate@\kcode{allocate}}
\index{allocate directive@\kcode{allocate} directive}
\index{allocate directive@\kcode{allocate} directive!allocator clause@\kcode{allocator} clause}

The \kcode{allocate} directive is a convenient way to apply an OpenMP
allocator to the allocation of declared variables.

This example illustrates the allocation of specific types of storage
in a program for use in libraries, privatized variables, and with offloading.

Two groups of variables, \{\ucode{v1, v2}\} and \{\ucode{v3, v4}\},
are used with the \kcode{allocate} directive, and the \{\ucode{v5, v6}\} pair
is used with the \kcode{allocate} clause.
Here we explicitly use the predefined allocators \kcode{omp_high_bw_mem_alloc}
and \kcode{omp_default_mem_alloc} with the \kcode{allocate} directive in CASE 1.
Similar effects are achieved for private variables of a task by using
the \kcode{allocate} clause, as shown in CASE 2.

Note that when the \kcode{allocate} directive does not specify an
\kcode{allocator} clause, an implementation-defined default, stored in
the \plc{def-allocator-var} ICV, is used (not illustrated here).
Users can set and get the default allocator with the
\kcode{omp_set_default_allocator} and \kcode{omp_get_default_allocator}
API routines.

\cexample[5.1]{allocators}{4}
\ffreeexample[5.1]{allocators}{4}

\index{uses_allocators clause@\kcode{uses_allocators} clause}
\index{clauses!uses_allocators@\kcode{uses_allocators}}

The use of allocators in \kcode{target} regions is facilitated by the
\kcode{uses_allocators} clause as shown in the cases below.

In CASE 1, the predefined \kcode{omp_cgroup_mem_alloc} allocator is made
available on the device in the first \kcode{target} construct as specified
in the \kcode{uses_allocators} clause.
The allocator is then used in the \kcode{allocate} clause of the
\kcode{teams} construct to allocate a private array for each team
(contention group).
The private \ucode{xbuf} arrays that are filled by each team are reduced
as specified in the \kcode{reduction} clause on the \kcode{teams} construct.

In CASE 2, user-defined traits are specified in the \ucode{cgroup_traits} variable.
An allocator is initialized for the \kcode{target} region in the
\kcode{uses_allocators} clause, and the traits specified in
\ucode{cgroup_traits} are included by the \kcode{traits} modifier.

In CASE 3, the \ucode{cgroup_alloc} variable is initialized on the host
with traits and a memory space. However, these are ignored by the
\kcode{uses_allocators} clause and a new allocator for the \kcode{target}
region is initialized with default traits.

\cexample[5.2]{allocators}{5}
\ffreeexample[5.2]{allocators}{5}

\index{dynamic_allocators clause@\kcode{dynamic_allocators} clause}
\index{clauses!dynamic_allocators@\kcode{dynamic_allocators}}

The following example shows how to make an allocator available in a
\kcode{target} region without specifying a \kcode{uses_allocators} clause.

In CASE 1, the predefined \kcode{omp_cgroup_mem_alloc} allocator is used in
the \kcode{target} region as in CASE 1 of the previous example, but without
specifying a \kcode{uses_allocators} clause.
This is accomplished by specifying the \kcode{requires} directive with a
\kcode{dynamic_allocators} clause in the same compilation unit, which
removes the restrictions on allocator usage in \kcode{target} regions.

CASE 2 also uses the \kcode{dynamic_allocators} clause to remove allocator
restrictions in \kcode{target} regions. Here, an allocator is initialized
by calling the \kcode{omp_init_allocator} routine in the \kcode{target} region.
The allocator is then used in an \kcode{allocate} clause of the
\kcode{target teams} construct to allocate array \ucode{xbuf} for each team,
and it is destroyed after its use.
Separate \kcode{target} regions are needed here since no statement is allowed
between a \kcode{target} directive and its nested \kcode{teams} construct.
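
As a reduced illustration of the \kcode{dynamic_allocators} approach, the sketch
below initializes an allocator with \kcode{omp_init_allocator} inside a
\kcode{target} region that has no \kcode{uses_allocators} clause. It is a
simplification under stated assumptions, not the \example{allocators.6} source:
it uses \kcode{omp_alloc} in a single \kcode{target} region instead of an
\kcode{allocate} clause on a separate \kcode{target teams} construct, the function
name \ucode{target_scratch_sum} is hypothetical, \ucode{n} is assumed to be
positive, and the \bcode{malloc} branch is added only so that the sketch still
builds when OpenMP is not enabled.

```c
#include <stddef.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* File scope, before any target construct in this compilation unit:
 * lifts the restrictions on allocator usage in target regions. */
#pragma omp requires dynamic_allocators

/* Hypothetical helper: inside a target region, allocates a scratch buffer
 * of n ints, fills it with 1..n and sums it.
 * Returns the sum, or -1 if the buffer could not be allocated. */
long target_scratch_sum(int n)
{
    long sum = -1;

    #pragma omp target map(tofrom: sum)
    {
#ifdef _OPENMP
        /* Allocator created in the target region itself, default traits. */
        omp_allocator_handle_t scratch_alloc =
            omp_init_allocator(omp_default_mem_space, 0, NULL);
        int *buf = (int *)omp_alloc((size_t)n * sizeof(int), scratch_alloc);
#else
        /* Assumed fallback without OpenMP. */
        int *buf = (int *)malloc((size_t)n * sizeof(int));
#endif
        if (buf != NULL) {
            sum = 0;
            for (int i = 0; i < n; i++)
                buf[i] = i + 1;
            for (int i = 0; i < n; i++)
                sum += buf[i];
        }
#ifdef _OPENMP
        /* Free the buffer before destroying the allocator. */
        omp_free(buf, scratch_alloc);
        omp_destroy_allocator(scratch_alloc);
#else
        free(buf);
#endif
    }
    return sum;
}
```

With \ucode{n} set to 100 the function returns 5050. Whether the region actually
runs on a device or falls back to the host depends on the implementation and the
available hardware.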
\cexample[5.2]{allocators}{6}
\ffreeexample[5.2]{allocators}{6}