% Mirror of https://github.com/OpenMP/Examples.git (synced 2025-04-03 13:21:33 +01:00)
\cchapter{SIMD}{SIMD}
\label{chap:simd}

Single instruction, multiple data (SIMD) is a form of parallel execution
in which the same operation is performed on multiple data elements
independently in hardware vector processing units (VPUs), also called SIMD units.
The addition of two vectors to form a third vector is a SIMD operation.
Many processors have SIMD (vector) units, each of which can perform
2, 4, 8 or more executions of the same operation simultaneously.

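As a minimal sketch of the vector addition mentioned above (the function name
\texttt{vec\_add} and its signature are illustrative assumptions, not part of
the example sources), a \kcode{simd} construct can be placed on the loop:

```c
#include <stddef.h>

/* Hypothetical helper: element-wise c[i] = a[i] + b[i].
   The simd construct asks the compiler to execute groups of
   iterations with SIMD instructions. The caller is assumed to
   pass arrays for which the iterations are independent. */
void vec_add(size_t n, const float *a, const float *b, float *c)
{
    #pragma omp simd
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```

Without OpenMP support enabled, the pragma is ignored and the loop runs as
ordinary scalar code with the same result.
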
Loops without loop-carried backward dependences (or with dependences preserved using
\kcode{ordered simd}) are candidates for vectorization by the compiler for
execution with SIMD units. In addition, with state-of-the-art vectorization
technology and the \kcode{declare simd} directive extensions for function vectorization
in the OpenMP 4.5 specification, loops with function calls can be vectorized as well.
The basic idea is that a scalar function call in a loop can be replaced by a call to a
vector version of the function, so that the loop and the function are vectorized
together by combining loop vectorization (\kcode{simd} directive on the loop) with
function vectorization (\kcode{declare simd} directive on the function).

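The combination described above might look as follows. This is a sketch only:
\texttt{scale\_add} and \texttt{apply} are hypothetical names chosen for
illustration, and whether a vector version is actually generated and called
depends on the compiler.

```c
/* Hypothetical scalar kernel. The declare simd directive asks the
   compiler to also build a vector version of this function. */
#pragma omp declare simd
double scale_add(double x, double y)
{
    return 2.0 * x + y;
}

/* The simd construct on the loop allows the call below to be
   replaced by a call to the vector version of scale_add. */
void apply(int n, const double *x, const double *y, double *out)
{
    #pragma omp simd
    for (int i = 0; i < n; i++)
        out[i] = scale_add(x[i], y[i]);
}
```
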
A \kcode{simd} construct specifies that the iterations of the associated loop
be executed with SIMD operations. A number of clauses are available to provide
data-sharing attributes (\kcode{private}, \kcode{linear}, \kcode{reduction} and
\kcode{lastprivate}). Other clauses provide vector length preferences/restrictions
(\kcode{simdlen} / \kcode{safelen}), loop collapsing (\kcode{collapse}), and data
alignment (\kcode{aligned}).

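A few of these clauses can be sketched as below. The function names
(\texttt{dot}, \texttt{shift\_add}, \texttt{scale2d}) and the particular clause
values are assumptions made for illustration; suitable values depend on the
target hardware.

```c
/* reduction: each SIMD lane accumulates a private partial sum that
   is combined into sum at the end. simdlen(4) is only a preference. */
double dot(int n, const double *a, const double *b)
{
    double sum = 0.0;
    #pragma omp simd reduction(+:sum) simdlen(4)
    for (int i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* safelen: a[i] depends on a[i-8], so iterations that are 8 or more
   apart must not be executed concurrently in one SIMD chunk. */
void shift_add(int n, float *a, const float *b)
{
    #pragma omp simd safelen(8)
    for (int i = 8; i < n; i++)
        a[i] = a[i - 8] + b[i];
}

/* collapse: the two perfectly nested loops form a single iteration
   space that is then vectorized. */
void scale2d(int rows, int cols, float *m, float s)
{
    #pragma omp simd collapse(2)
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            m[i * cols + j] *= s;
}
```
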
The \kcode{declare simd} directive specifies
that a vector version of the function should also be constructed for
execution within loops that call the function and have a \kcode{simd}
directive. Clauses provide argument specifications (\kcode{linear},
\kcode{uniform}, and \kcode{aligned}), a requested vector length
(\kcode{simdlen}), and designate whether the function is always/never
called conditionally in a loop (\kcode{inbranch}/\kcode{notinbranch}).
The latter two clauses are for optimizing performance.

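The argument and branch clauses can be illustrated with the following sketch.
All function names here are hypothetical, and \kcode{simdlen(4)} is an assumed
value rather than a recommendation.

```c
/* p advances by one element per iteration (linear), scale is the
   same in every iteration (uniform), and the function is never
   called from inside a conditional (notinbranch). */
#pragma omp declare simd uniform(scale) linear(p:1) simdlen(4) notinbranch
float scaled_load(const float *p, float scale)
{
    return scale * (*p);
}

void scale_array(int n, const float *in, float *out, float scale)
{
    #pragma omp simd simdlen(4)
    for (int i = 0; i < n; i++)
        out[i] = scaled_load(&in[i], scale);
}

/* inbranch: the function is always called from inside a conditional,
   so only a masked vector version is needed. */
#pragma omp declare simd inbranch
float halve(float x)
{
    return 0.5f * x;
}

void halve_positive(int n, float *a)
{
    #pragma omp simd
    for (int i = 0; i < n; i++)
        if (a[i] > 0.0f)
            a[i] = halve(a[i]);
}
```
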
Also, the \kcode{simd} construct has been combined with the worksharing loop
constructs (\kcode{for simd} and \kcode{do simd}) to enable simultaneous thread
execution in different SIMD units.
%Hence, the \code{simd} construct can be
%used alone on a loop to direct vectorization (SIMD execution), or in
%combination with a parallel loop construct to include thread parallelism
%(a parallel loop sequentially followed by a \code{simd} construct,
%or a combined construct such as \code{parallel do simd} or
%\code{parallel for simd}).


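A sketch of the combined worksharing form in C is given below, assuming a
hypothetical \texttt{saxpy} routine. The loop iterations are divided among the
threads of the team, and each thread may execute its share with SIMD
instructions.

```c
/* Hypothetical routine: y[i] = a * x[i] + y[i].
   The for simd construct distributes iterations across threads and
   vectorizes the iterations assigned to each thread. */
void saxpy(int n, float a, const float *x, float *y)
{
    #pragma omp parallel
    {
        #pragma omp for simd
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }
}
```
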
%===== Examples Sections =====
\input{SIMD/SIMD}
\input{SIMD/linear_modifier}