OpenMP-Examples/SIMD/linear_modifier.tex
2024-04-16 08:59:23 -07:00

84 lines
4.2 KiB
TeX

%%% section
\section{\kcode{ref}, \kcode{val}, \kcode{uval} Modifiers for \kcode{linear} Clause}
\label{sec:linear_modifier}
\index{modifiers, linear@modifiers, \kcode{linear}!ref@\kcode{ref}}
\index{modifiers, linear@modifiers, \kcode{linear}!val@\kcode{val}}
\index{modifiers, linear@modifiers, \kcode{linear}!uval@\kcode{uval}}
\index{clauses!linear@\kcode{linear}}
\index{linear clause@\kcode{linear} clause}
When generating vector functions from \kcode{declare simd} directives,
it is important for a compiler to know the proper types of function arguments in
order to generate efficient codes.
This is especially true for C++ reference types and Fortran arguments.
In the following example, the function \ucode{add_one2} has a C++ reference
parameter (or Fortran argument) \ucode{p}. Variable \ucode{p} gets incremented by 1 in the function.
The caller loop \ucode{i} in the main program passes
a variable \ucode{k} as a reference to the function \ucode{add_one2} call.
The \kcode{ref} modifier for the \kcode{linear} clause on the
\kcode{declare simd} directive specifies that the
reference-type parameter \ucode{p} is to match the property of the variable
\ucode{k} in the loop.
This use of reference type is equivalent to the second call to
\ucode{add_one2} with a direct passing of the array element \ucode{a[i]}.
In the example, the preferred vector
length 8 is specified for both the caller loop and the callee function.
When \kcode{linear(\ucode{p}: ref)} is applied to an argument passed by reference,
it tells the compiler that the addresses in its vector argument are consecutive,
and so the compiler can generate a single vector load or store instead of
a gather or scatter. This allows more efficient SIMD code to be generated with
less source changes.
\cppexample[5.2]{linear_modifier}{1}
\ffreeexample[5.2]{linear_modifier}{1}
%\clearpage
The following example is a variant of the above example. The function \ucode{add_one2}
in the C++ code includes an additional C++ reference parameter \ucode{i}.
The loop index \ucode{i} of the caller loop \ucode{i} in the main program
is passed as a reference to the function \ucode{add_one2} call.
The loop index \ucode{i} has a uniform address with
linear value of step 1 across SIMD lanes.
Thus, the \kcode{uval} modifier is used for the \kcode{linear} clause
to specify that the C++ reference-type parameter \ucode{i} is to match
the property of loop index \ucode{i}.
In the corresponding Fortran code the arguments \ucode{p} and
\ucode{i} in the routine \ucode{add_on2} are passed by references.
Similar modifiers are used for these variables in the \kcode{linear} clauses
to match with the property at the caller loop in the main program.
When \kcode{linear(\ucode{i}: uval)} is applied to an argument passed by reference, it
tells the compiler that its addresses in the vector argument are uniform
so that the compiler can generate a scalar load or scalar store and create
linear values. This allows more efficient SIMD code to be generated with
less source changes.
\cppexample[5.2]{linear_modifier}{2}
\ffreeexample[5.2]{linear_modifier}{2}
In the following example, the function \ucode{func} takes arrays \ucode{x} and \ucode{y}
as arguments, and accesses the array elements referenced by the index \ucode{i}.
The caller loop \ucode{i} in the main program passes a linear copy of
the variable \ucode{k} to the function \ucode{func}.
The \kcode{val} modifier is used for the \kcode{linear} clause
in the \kcode{declare simd} directive for the function
\ucode{func} to specify that the argument \ucode{i} is to match the property of
the actual argument \ucode{k} passed in the SIMD loop.
Arrays \ucode{x} and \ucode{y} have uniform addresses across SIMD lanes.
When \kcode{linear(\ucode{i}: val,step(\ucode{1}))} is applied to an argument,
it tells the compiler that its addresses in the vector argument may not be
consecutive, however, their values are linear (with stride 1 here). When the value of \ucode{i} is used
in subscript of array references (e.g., \ucode{x[i]}), the compiler can generate
a vector load or store instead of a gather or scatter. This allows more
efficient SIMD code to be generated with less source changes.
\cexample[5.2]{linear_modifier}{3}
\ffreeexample[5.2]{linear_modifier}{3}