v5.2 release

Henry Jin 2022-04-18 15:02:25 -07:00
parent fb0edc81e7
commit a5e3d8b3f2
674 changed files with 5315 additions and 1609 deletions

@ -31,7 +31,7 @@ execution within loops that contain the function and have a \code{simd}
directive. Clauses provide argument specifications (\code{linear},
\code{uniform}, and \code{aligned}), a requested vector length
(\code{simdlen}), and designate whether the function is always/never
called conditionally in a loop (\code{branch}/\code{inbranch}).
called conditionally in a loop (\code{notinbranch}/\code{inbranch}).
The latter is for optimizing performance.
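As an illustration of the two branch clauses, the following C sketch declares one SIMD function that is always called unconditionally and one that is only called under a condition. The function names (`scale_value`, `flip_sign`, `process`) are hypothetical and are not taken from the Examples sources.

```c
/* Hypothetical helpers; the names are illustrative only. */

/* Called on every iteration, so no masked variant is needed. */
#pragma omp declare simd uniform(scale) notinbranch
float scale_value(float x, float scale)
{
    return x * scale;
}

/* Called only from inside an if statement, hence inbranch. */
#pragma omp declare simd inbranch
float flip_sign(float x)
{
    return -x;
}

void process(float *a, int n, float scale)
{
    #pragma omp simd
    for (int i = 0; i < n; i++) {
        a[i] = scale_value(a[i], scale);   /* unconditional call */
        if (a[i] < 0.0f)
            a[i] = flip_sign(a[i]);        /* conditional call */
    }
}
```

When compiled without OpenMP support the pragmas are ignored and the loop runs as ordinary scalar code with the same result.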
Also, the \code{simd} construct has been combined with the worksharing loop

@ -22,7 +22,7 @@ Data-sharing attributes of variables can be classified as being \plc{predetermin
Certain variables and objects have predetermined attributes.
A commonly found case is the loop iteration variable in associated loops
of a \code{for} or \code{do} construct. It has a private data-sharing attribute.
Variables with predetermined data-sharing attributes can not be listed in a data-sharing clause; but there are some
Variables with predetermined data-sharing attributes cannot be listed in a data-sharing clause; but there are some
exceptions (mainly concerning loop iteration variables).
Variables with explicitly determined data-sharing attributes are those that are
@ -50,7 +50,7 @@ The common \plc{list items} are arrays, array sections, scalars, pointers, and
structure elements (members).
Procedures and global variables have predetermined data mapping if they appear
within the list or block of a \code{declare target} directive. Also, a C/C++ pointer
within the list or block of a \code{declare}~\code{target} directive. Also, a C/C++ pointer
is mapped as a zero-length array section, as is a C++ variable that is a reference to a pointer.
% Waiting for response from Eric on this.

@ -15,7 +15,7 @@ data to the device storage.
The constructs that explicitly
create storage, transfer data, and free storage on the device
are catagorized as structured and unstructured. The
are categorized as structured and unstructured. The
\code{target} \code{data} construct is structured. It creates
a data region around \code{target} constructs, and is
convenient for providing persistent data throughout multiple
@ -33,14 +33,14 @@ the device, and controls on the storage duration.
There is an important change in the OpenMP 4.5 specification
that alters the data model for scalar variables and C/C++ pointer variables.
The default behavior for scalar variables and C/C++ pointer variables
in an 4.5 compliant code is \code{firstprivate}. Example
in a 4.5 compliant code is \code{firstprivate}. Example
codes that have been updated to reflect this new behavior are
annotated with a description that describes changes required
for correct execution. Often it is a simple matter of mapping
the variable as \code{tofrom} to obtain the intended 4.0 behavior.
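The effect of the 4.5 default can be sketched in C as follows. The function `count_positive` is a hypothetical example, not one of the document's sources; it assumes only that the scalar result must be visible on the host after the `target` region.

```c
/* Hypothetical function: counts the positive entries of v in a target region. */
int count_positive(const int *v, int n)
{
    int count = 0;

    /* Under the 4.5 rules a scalar such as count is firstprivate by default,
       so the target region would update a private copy only. Mapping it as
       tofrom copies the result back to the host, which is the 4.0 behavior. */
    #pragma omp target map(to: v[0:n]) map(tofrom: count)
    for (int i = 0; i < n; i++) {
        if (v[i] > 0)
            count++;
    }
    return count;
}
```

Dropping the `map(tofrom: count)` clause in a 4.5 compliant implementation would leave the host value of `count` at zero.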
In OpenMP version 4.5 the mechanism for target
execution is specified as occuring through a \plc{target task}.
execution is specified as occurring through a \plc{target task}.
When the \code{target} construct is encountered a new
\plc{target task} is generated. The \plc{target task}
completes after the \code{target} region has executed and all data
@ -59,13 +59,14 @@ clause introduced in OpenMP 4.5.
\input{devices/target_structure_mapping}
\input{devices/target_fort_allocatable_array_mapping}
\input{devices/array_sections}
\input{devices/C++_virtual_functions}
\input{devices/array_shaping}
\input{devices/target_mapper}
\input{devices/target_data}
\input{devices/target_unstructured_data}
\input{devices/target_update}
\input{devices/target_associate_ptr}
\input{devices/declare_target}
\input{devices/lambda_expressions}
\input{devices/teams}
\input{devices/async_target_depend}
\input{devices/async_target_with_tasks}

@ -1,11 +1,12 @@
\cchapter{OpenMP Directive Syntax}{directives}
\label{chap:directive_syntax}
\index{directive syntax}
OpenMP \emph{directives} use base-language mechanisms to specify OpenMP program behavior.
In C code, the directives are formed exclusively with pragmas, whereas in C++
code, directives are formed from either pragmas or attributes.
Fortran directives are formed with comments in free form and fixed form sources (codes).
All of these mechanism allow the compilation to ignore the OpenMP directives if
All of these mechanisms allow the compilation to ignore the OpenMP directives if
OpenMP is not supported or enabled.
@ -35,6 +36,18 @@ Fortran comments
where \code{c\$omp} and \code{*\$omp} may be used in Fortran fixed form sources.
Most OpenMP directives accept clauses that alter the semantics of the directive in some way,
and some directives also accept parenthesized arguments that follow the directive name.
A clause may just be a keyword (e.g., \scode{untied}) or it may also accept argument lists
(e.g., \scode{shared(x,y,z)}) and/or optional modifiers (e.g., \scode{tofrom} in
\scode{map(tofrom:}~\scode{x,y,z)}).
Clause modifiers may be "simple" or "complex" -- a complex modifier consists of a
keyword followed by one or more parameters, bracketed by parentheses, while a simple
modifier does not. An example of a complex modifier is the \scode{iterator} modifier,
as in \scode{map(iterator(i=0:n),}~\scode{tofrom:}~\scode{p[i])}, or the \scode{step} modifier, as in
\scode{linear(x:}~\scode{ref,}~\scode{step(4))}.
In the preceding examples, \scode{tofrom} and \scode{ref} are simple modifiers.
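The clause forms described above can be sketched in C as shown here. The function `scale_and_sum` is hypothetical and only serves to place a keyword-only clause, a clause with an argument list, and a clause with a simple modifier side by side. Complex modifiers such as `iterator` or `step` are left out of this sketch because they assume a compiler that supports the newer syntax.

```c
/* Hypothetical function: scales p in place and returns twice the sum. */
double scale_and_sum(double *p, int n, double factor)
{
    double total = 0.0;
    double result = 0.0;

    /* map takes an argument list (p[0:n], total) and the simple modifier tofrom. */
    #pragma omp target map(tofrom: p[0:n], total)
    for (int i = 0; i < n; i++) {
        p[i] *= factor;
        total += p[i];
    }

    /* shared takes an argument list only. */
    #pragma omp parallel shared(total, result)
    #pragma omp single
    {
        /* untied is a keyword-only clause. */
        #pragma omp task untied shared(total, result)
        result = 2.0 * total;
    }
    return result;
}
```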
%===== Examples Sections =====
\input{directives/pragmas}

Chap_introduction.tex (new file, 73 lines)

@ -0,0 +1,73 @@
% This is the introduction for the OpenMP Examples document.
% This is an included file. See the main file (openmp-examples.tex) for more information.
%
% When editing this file:
%
% 1. To change formatting, appearance, or style, please edit openmp.sty.
%
% 2. Custom commands and macros are defined in openmp.sty.
%
% 3. Be kind to other editors -- keep a consistent style by copying-and-pasting to
% create new content.
%
% 4. We use semantic markup, e.g. (see openmp.sty for a full list):
% \code{} % for bold monospace keywords, code, operators, etc.
% \plc{} % for italic placeholder names, grammar, etc.
%
% 5. Other recommendations:
% Use the convenience macros defined in openmp.sty for the minor headers
% such as Comments, Syntax, etc.
%
% To keep items together on the same page, prefer the use of
% \begin{samepage}.... Avoid \parbox for text blocks as it interrupts line numbering.
% When possible, avoid \filbreak, \pagebreak, \newpage, \clearpage unless that's
% what you mean. Use \needspace{} cautiously for troublesome paragraphs.
%
% Avoid absolute lengths and measures in this file; use relative units when possible.
% Vertical space can be relative to \baselineskip or ex units. Horizontal space
% can be relative to \linewidth or em units.
%
% Prefer \emph{} to italicize terminology, e.g.:
% This is a \emph{definition}, not a placeholder.
% This is a \plc{var-name}.
%
\cchapter{Introduction}{introduction}
\label{chap:introduction}
This collection of programming examples supplements the OpenMP API for Shared
Memory Parallelization specifications, and is not part of the formal specifications. It
assumes familiarity with the OpenMP specifications, and shares the typographical
conventions used in that document.
The OpenMP API specification provides a model for parallel programming that is
portable across shared memory architectures from different vendors. Compilers from
numerous vendors support the OpenMP API.
The directives, library routines, and environment variables demonstrated in this
document allow users to create and manage parallel programs while permitting
portability. The directives extend the C, C++ and Fortran base languages with single
program multiple data (SPMD) constructs, tasking constructs, device constructs,
worksharing constructs, and synchronization constructs, and they provide support for
sharing and privatizing data. The functionality to control the runtime environment is
provided by library routines and environment variables. Compilers that support the
OpenMP API often include a command line option to the compiler that activates and
allows interpretation of all OpenMP directives.
The documents and source codes for OpenMP Examples can be downloaded from
\href{https://github.com/OpenMP/Examples}{https://github.com/OpenMP/Examples}.
Each directory holds the contents of a chapter and has a \splc{sources} subdirectory of its codes.
The codes for this OpenMP \VER{} Examples document have the tag
\href{https://github.com/OpenMP/Examples/tree/v\VER}{\plc{v\PVER}}.
Complete information about the OpenMP API and a list of the compilers that support
the OpenMP API can be found at the OpenMP.org web site
\code{https://www.openmp.org}.
\clearpage
\input{introduction/Examples}
% This is the end of introduction.tex of the OpenMP Examples document.

@ -22,4 +22,5 @@ whereby specific hot spots can be affected by transformation directives.
%===== Examples Sections =====
\input{loop_transformations/tile}
\input{loop_transformations/unroll}
\input{loop_transformations/partial_tile}

@ -25,7 +25,7 @@ flush operation is characterized by its flush properties -- some combination of
flushes, a \emph{flush-set}.
A \emph{strong} flush will force consistency between the temporary view and the
memory for all variables in its \emph{flush-set}. Furthermore all strong flushes in a
memory for all variables in its \emph{flush-set}. Furthermore, all strong flushes in a
program that have intersecting flush-sets will execute in some total order, and
within a thread strong flushes may not be reordered with respect to other
memory operations on variables in its flush-set. \emph{Release} and
@ -53,7 +53,7 @@ do not have a well-defined \emph{completion order}. The existence of data
races in OpenMP programs results in undefined behavior, and so they should
generally be avoided for programs to be correct. The completion order of
accesses to a shared variable is guaranteed in OpenMP through a set of memory
consistency rules that are described in the \plc{OpenMP Memory Consitency}
consistency rules that are described in the \plc{OpenMP Memory Consistency}
section of the OpenMP Specifications document.
%This chapter also includes examples that exhibit non-sequentially consistent

@ -102,8 +102,8 @@ The \code{masked} construct is not a worksharing construct. The \code{masked} r
executed only by the primary thread. There is no implicit barrier (and flush)
at the end of the \code{masked} region; hence the other threads of the team continue
execution of the code statements beyond the \code{masked} region.
The \code{master} contruct, which has been deprecated in OpenMP 5.1, has identical semantics
to the \code{masked} contruct with no \code{filter} clause.
The \code{master} construct, which has been deprecated in OpenMP 5.1, has identical semantics
to the \code{masked} construct with no \code{filter} clause.
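A minimal C sketch of the behavior described above follows. The function `record_on_primary` is a hypothetical name, and the sketch assumes a compiler that accepts the OpenMP 5.1 `masked` directive; with an earlier compiler the deprecated `master` directive would take its place.

```c
/* Hypothetical function: stores value from the primary thread only. */
int record_on_primary(int value)
{
    int recorded = 0;

    #pragma omp parallel shared(recorded)
    {
        /* Only the primary thread executes this statement. The other threads
           skip it without waiting, since masked has no implied barrier. */
        #pragma omp masked
        recorded = value;
    }   /* the implicit barrier ending the parallel region publishes the store */

    return recorded;
}
```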
%===== Examples Sections =====

@ -108,6 +108,7 @@ chapter in the OpenMP Specifications document.
\input{program_control/nested_loop}
\input{program_control/nesting_restrict}
\input{program_control/target_offload}
\input{program_control/reproducible}
\input{program_control/interop}
\input{program_control/utilities}

@ -36,7 +36,7 @@ of ordered regions while allowing code outside the region to run in parallel.
Since OpenMP 4.5 the \code{ordered} construct can also be a stand-alone
directive that specifies cross-iteration dependences in a doacross loop nest.
The \code{depend} clause uses a \code{sink} \plc{dependence-type}, along with a
The \code{depend} clause uses a \code{sink} \plc{dependence-type}, along with an
iteration vector argument (vec) to indicate the iteration that satisfies the
dependence. The \code{depend} clause with a \code{source}
\plc{dependence-type} specifies dependence satisfaction.
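The cross-iteration dependence described here can be sketched in C with a running sum, where iteration `i` must wait for iteration `i-1`. The function `prefix_sum` is a hypothetical example written in the OpenMP 4.5 form from the text above; in OpenMP 5.2 the `depend` clause on `ordered` is deprecated in favor of the `doacross` clause.

```c
/* Hypothetical function: in-place inclusive prefix sum over a[0..n-1]. */
void prefix_sum(int *a, int n)
{
    /* ordered(1) declares a doacross loop nest of depth one. */
    #pragma omp parallel for ordered(1)
    for (int i = 1; i < n; i++) {
        /* Wait until iteration i-1 has signaled completion. */
        #pragma omp ordered depend(sink: i-1)
        a[i] += a[i - 1];
        /* Signal that this iteration's result is available. */
        #pragma omp ordered depend(source)
    }
}
```

Code placed before the `sink` directive or after the `source` directive could still run in parallel across iterations; in this sketch the whole loop body is the dependent part.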

@ -54,25 +54,28 @@ For a brief revision history, see `Changes.log` in the repo.
* Insert the code in the sources directory for each chapter, and include the following metadata:
* Metadata Tags for example sources:
```
@@name: <ename>.<seq-no>[c|cpp|f|f90]
@@name: <ename>.<seq-no>
@@type: C|C++|F-fixed|F-free
@@requires: preprocessing
@@compilable: yes|no|maybe
@@linkable: yes|no|maybe
@@expect: success|failure|nothing|rt-error
@@expect: success|compile-time-error|runtime-error|undefined-behavior
@@version: omp_<verno>
```
* **name**
is the name of an example
* **type**
is the source code type, which can be translated into or from proper file extension (c,cpp,f,f90)
is the source code type, which can be translated into or from proper file extension (C:c,C++:cpp,F-fixed:f,F-free:f90)
* **requires**
any additional requirements, currently `preprocessing` for requiring preprocessing
* **compilable**
indicates whether the source code is compilable
* **linkable**
indicates whether the source code is linkable
* **expect**
indicates some expected result for testing purpose "`success|failure|nothing`" applies
to the result of code compilation "`rt-error`" is for a case where compilation may be
successful, but the code contains potential runtime issues (such as race condition).
indicates the expected result for testing purposes. "`success|compile-time-error|ct-error`" applies
to the result of code compilation; "`runtime-error|rt-error`" is for a case where compilation may be
successful, but the code contains potential runtime issues (such as a race condition); `undefined-behavior` could result from non-conforming code.
An alternative would be to just use "`conforming`" or "`non-conforming`".
* **version**
indicates features for a specific OpenMP version, such as "`omp_5.0`"
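As a sketch of how these tags might look in practice, the C source below carries a metadata header in its leading comment. The example name `vec_add.1`, the function itself, and the exact comment layout are assumptions made for illustration; consult an existing file in a `sources` directory for the layout actually used.

```c
/*
 * @@name:        vec_add.1
 * @@type:        C
 * @@compilable:  yes
 * @@linkable:    no
 * @@expect:      success
 * @@version:     omp_4.0
 */

/* Hypothetical example body: element-wise sum of two arrays. */
void vec_add(float *c, const float *a, const float *b, int n)
{
    #pragma omp parallel for simd
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```

The file is marked as not linkable here because it has no `main` function, and the version tag reflects that the combined `parallel for simd` construct first appeared in OpenMP 4.0.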
@ -94,23 +97,30 @@ For a brief revision history, see `Changes.log` in the repo.
# LaTeX macros for examples
## LaTeX macros for examples
The following describes LaTeX macros defined specifically for examples.
* Source code with language h-rules
* Source code without language h-rules
* Language h-rules
* Other macros
* See `openmp.sty` for more information
### Source code with language h-rules
```
\cexample[<verno>]{<ename>}{<seq-no>} % for C/C++ examples
\cppexample[<verno>]{<ename>}{<seq-no>} % for C++ examples
\fexample[<verno>]{<ename>}{<seq-no>} % for fixed-form Fortran examples
\ffreeexample[<verno>]{<ename>}{<seq-no>} % for free-form Fortran examples
\cexample[<verno>]{<ename>}{<seq-no>}[<s>] % for C/C++ examples
\cppexample[<verno>]{<ename>}{<seq-no>}[<s>] % for C++ examples
\fexample[<verno>]{<ename>}{<seq-no>}[<s>] % for fixed-form Fortran examples
\ffreeexample[<verno>]{<ename>}{<seq-no>}[<s>] % for free-form Fortran examples
```
* Source code without language h-rules
### Source code without language h-rules
```
\cnexample[<verno>]{<ename>}{<seq-no>}
\cppnexample[<verno>]{<ename>}{<seq-no>}
\fnexample[<verno>]{<ename>}{<seq-no>}
\ffreenexample[<verno>]{<ename>}{<seq-no>}
\srcnexample[<verno>]{<ename>}{<seq-no>}{<ext>}
\cnexample[<verno>]{<ename>}{<seq-no>}[<s>]
\cppnexample[<verno>]{<ename>}{<seq-no>}[<s>]
\fnexample[<verno>]{<ename>}{<seq-no>}[<s>]
\ffreenexample[<verno>]{<ename>}{<seq-no>}[<s>]
\srcnexample[<verno>]{<ename>}{<seq-no>}{<ext>}[<s>]
```
Optional `<verno>` can be supplied in a macro to include a specific OpenMP
@ -123,7 +133,11 @@ For a brief revision history, see `Changes.log` in the repo.
source code should not contain any `@@` metadata tags. The `ext` argument
to this macro is the file extension (such as `h`, `hpp`, `inc`).
* Language h-rules
The `<s>` option to each macro allows finer control of any additional lines
to be skipped due to addition of new `@@` tags, such as `@@requires`.
The default value for `<s>` is 0.
### Language h-rules
```
\cspecificstart, \cspecificend
\cppspecificstart, \cppspecificend
@ -131,9 +145,11 @@ For a brief revision history, see `Changes.log` in the repo.
\fortranspecificstart, \fortranspecificend
```
* Chapter and section macros
### Other macros
```
\cchapter{<Chapter Name>}{<chap_directory>}
\hexentry[ext1]{<example_name>}[ext2]{<earlier_tag>}
\hexmentry[ext1]{<example_name>}[ext2]{<earlier_tag>}{<prior_name>}
```
The `\cchapter` macro is used for starting a chapter with proper page spacing.
@ -146,8 +162,15 @@ A previously-defined macro `\sinput{<section_file>}` to import a section
file from `<chap_directory>` is no longer supported. Please use
`\input{<chap_directory>/<section_file>}` explicitly.
* See `openmp.sty` for more information
The two macros `\hexentry` and `\hexmentry` are defined for simplifying
entries in the feature deprecation and update tables. Option `[ext1]` is
the file extension with a default value of `c` and option `[ext2]` is
the file extension for the associated second file if present.
`<earlier_tag>` is the version tag of the corresponding example
in the earlier version. `\hexentry` assumes no name change for an example
in different versions; `\hexmentry` can be used to specify a prior name
if it is different.
### License
## License
For copyright information, please see `omp_copyright.txt`.

Deprecated_Features.tex (new file, 281 lines)

@ -0,0 +1,281 @@
\cchapter{Feature Deprecations and Updates in Examples}{deprecated_features}
\label{chap:deprecated_features}
\label{sec:deprecated_features}
\index{deprecated features}
Deprecation of features began in OpenMP 5.0.
Examples that use a deprecated feature have been updated with an equivalent
replacement feature.
Table~\ref{tab:Deprecated Features} summarizes deprecated features and
their replacements in each version. Affected examples are updated
accordingly and listed in Section~\ref{sec:Updated Examples}.
\nolinenumbers
\renewcommand{\arraystretch}{1.4}
\tablefirsthead{%
\hline
\textbf{Version} & \textbf{Deprecated Feature} & \textbf{Replacement}\\
\hline\\[-3.5ex]
}
\tablehead{%
\multicolumn{2}{l}{\small\slshape table continued from previous page}\\
\hline
\textbf{Version} & \textbf{Deprecated Feature} & \textbf{Replacement}\\
\hline\\[-3ex]
}
\tabletail{%
\hline\\[-4ex]
\multicolumn{2}{l}{\small\slshape table continued on next page}\\
}
\tablelasttail{\hline\\[-2ex]}
\tablecaption{Deprecated Features and Their Replacements\label{tab:Deprecated Features}}
\begin{supertabular}{p{0.4in} p{2.3in} p{2.2in}}
5.2 & \scode{default} clause on metadirectives
& \scode{otherwise} clause \\
5.2 & delimited \scode{declare}~\scode{target} directive for C/C++
& \scode{begin}~\scode{declare}~\scode{target} directive \\
5.2 & \scode{to} clause on \scode{declare}~\scode{target} directive
& \scode{enter} clause \\
5.2 & non-argument \scode{destroy} clause on \scode{depobj} construct
& \scode{destroy(}\plc{argument}\code{)} \\
5.2 & \scode{allocate} construct for Fortran \scode{ALLOCATE} statements
& \scode{allocators} construct \\
5.2 & \scode{depend} clause on \scode{ordered} construct
& \scode{doacross} clause \\
5.2 & \scode{linear(}\plc{modifier(list): linear-step}\code{)} clause
& \scode{linear(}\plc{list:}~\scode{step(}\plc{linear-step}\scode{)}\plc{, modifier}\scode{)} clause \\
\hline
5.1 & \scode{master} construct
& \scode{masked} construct \\
5.1 & \scode{master} affinity policy
& \scode{primary} affinity policy \\
\hline
5.0 & \scode{omp_lock_hint_*} constants
& \scode{omp_sync_hint_*} constants \\[2pt]
\end{supertabular}
\linenumbers
These replacements also appear in examples that otherwise illustrate earlier features.
For compilers that are compliant only with a version prior to
the indicated version, the earlier form of an example from a previous
version is listed as a reference.
\newpage
\section{Updated Examples for Different Versions}
\label{sec:Updated Examples}
The following tables list the updated examples for different versions as
a result of feature deprecation. The \emph{Earlier Version} column of
the tables shows the version tag of the earlier version. It also shows
the prior name of an example when it has been renamed.
Table~\ref{tab:Updated Examples 5.2} lists the updated examples for OpenMP 5.2
in the Examples Document Version
\href{https://github.com/OpenMP/Examples/tree/v5.2}{5.2}.
The \emph{Earlier Version} column of the table lists the earlier version
tags of the examples that can be found in
the Examples Document Version
\href{https://github.com/OpenMP/Examples/tree/v5.1}{5.1}.
\index{clauses!default@\code{default}}
\index{clauses!otherwise@\code{otherwise}}
\index{clauses!to@\code{to}}
\index{clauses!enter@\code{enter}}
\index{clauses!depend@\code{depend}}
\index{clauses!doacross@\code{doacross}}
\index{clauses!linear@\code{linear}}
\index{clauses!destroy@\code{destroy}}
\index{default clause@\code{default} clause}
\index{otherwise clause@\code{otherwise} clause}
\index{to clause@\code{to} clause}
\index{enter clause@\code{enter} clause}
\index{depend clause@\code{depend} clause}
\index{doacross clause@\code{doacross} clause}
\index{linear clause@\code{linear} clause}
\index{destroy clause@\code{destroy} clause}
\index{directives!begin declare target@\code{begin}~\code{declare}~\code{target}}
\index{begin declare target directive@\code{begin}~\code{declare}~\code{target} directive}
\index{allocate construct@\code{allocate} construct}
\index{allocators construct@\code{allocators} construct}
\nolinenumbers
\renewcommand{\arraystretch}{1.0}
\tablefirsthead{%
\hline\\[-2ex]
\textbf{Example Name} & \textbf{Earlier Version} & \textbf{Feature Updated}
\\[2pt]
\hline\\[-2ex]
}
\tablehead{%
\multicolumn{2}{l}{\small\slshape table continued from previous page}\\[2pt]
\hline\\[-2ex]
\textbf{Example Name} & \textbf{Earlier Version} & \textbf{Feature Updated}\\[2pt]
\hline\\[-2ex]
}
\tabletail{%
\hline\\[-2.5ex]
\multicolumn{2}{l}{\small\slshape table continued on next page}\\
}
\tablelasttail{\hline\\[-1ex]}
\tablecaption{Updated Examples for Version 5.2\label{tab:Updated Examples 5.2}}
\begin{supertabular}{p{1.7in} p{1.2in} p{2.1in}}
\hexentry{error.1}[f90]{5.1} &
\scode{default} clause on metadirectives \\
\hexentry{metadirective.1}[f90]{5.0} &
replaced with \scode{otherwise} clause \\
\hexentry{metadirective.2}[f90]{5.0} & \\
\hexentry{metadirective.3}[f90]{5.0} & \\
\hexentry{metadirective.4}[f90]{5.1} & \\
\hexentry{target_ptr_map.4}{5.1} & \\
\hexentry{target_ptr_map.5}[f90]{5.1} & \\[2pt]
\hline\\[-2ex]
\hexentry[f90]{array_shaping.1}{5.0} &
\scode{to} clause on \scode{declare} \scode{target} \\
\hexentry{target_reverse_offload.7}{5.0} &
directive replaced with \scode{enter} clause \\
\hexentry{target_task_reduction.1}[f90]{5.1} & \\
\hexentry{target_task_reduction.2a}[f90]{5.0} & \\
\hexentry{target_task_reduction.2b}[f90]{5.1} &\\[2pt]
\hline\\[-2ex]
\hexentry{array_shaping.1}{5.0} &
delimited \scode{declare}~\scode{target} \\
\hexentry{async_target.1}{4.0} &
directive replaced with \\
\hexentry{async_target.2}{4.0} &
\scode{begin}~\scode{declare}~\scode{target} \\
\hexentry{declare_target.1}{4.0} &
directive for C/C++ \\
\hexentry[cpp]{declare_target.2c}{4.0} & \\
\hexentry{declare_target.3}{4.0} & \\
\hexentry{declare_target.4}{4.0} & \\
\hexentry{declare_target.5}{4.0} & \\
\hexentry{declare_target.6}{4.0} & \\
\hexentry{declare_variant.1}{5.0} & \\
\hexentry{device.1}{4.0} & \\
\hexentry{metadirective.3}{5.0} & \\
\hexentry{target_ptr_map.2}{5.0} & \\
\hexentry{target_ptr_map.3a}{5.0} & \\
\hexentry{target_ptr_map.3b}{5.0} & \\
\hexentry{target_struct_map.1}{5.0} & \\
\hexentry[cpp]{target_struct_map.2}{5.0} & \\
\hexentry{target_struct_map.3}{5.0} & \\
\hexentry{target_struct_map.4}{5.0} & \\[2pt]
\hline\\[-2ex]
\hexentry{doacross.1}[f90]{4.5} &
\scode{depend} clause on \scode{ordered} \\
\hexentry{doacross.2}[f90]{4.5} &
construct replaced with \scode{doacross} \\
\hexentry{doacross.3}[f90]{4.5} &
clause \\
\hexentry{doacross.4}[f90]{4.5} & \\[2pt]
\hline\\[-2ex]
\hexentry[cpp]{linear_modifier.1}[f90]{4.5} &
modifier syntax change for \scode{linear} \\
\hexentry[cpp]{linear_modifier.2}[f90]{4.5} &
clause on \scode{declare}~\scode{simd} directive \\
\hexentry{linear_modifier.3}[f90]{4.5} & \\[2pt]
\hline\\[-2ex]
\hexentry[f90]{allocators.1}{5.0} &
\scode{allocate} construct replaced with \scode{allocators} construct
for Fortran allocate statements \\[2pt]
\hline\\[-2ex]
\hexentry{depobj.1}[f90]{5.0} &
argument added to \scode{destroy} clause on \scode{depobj}
construct \\[2pt]
\end{supertabular}
\linenumbers
Table~\ref{tab:Updated Examples 5.1} lists the updated examples for OpenMP 5.1
in the Examples Document Version
\href{https://github.com/OpenMP/Examples/tree/v5.1}{5.1}.
The \emph{Earlier Version} column of the table lists the earlier version
tags and prior names of the examples that can be found in
the Examples Document Version
\href{https://github.com/OpenMP/Examples/tree/v5.0.1}{5.0.1}.
\index{affinity!master policy@\code{master} policy}
\index{affinity!primary policy@\code{primary} policy}
\index{constructs!master@\code{master}}
\index{constructs!masked@\code{masked}}
\index{master construct@\code{master} construct}
\index{masked construct@\code{masked} construct}
\nolinenumbers
\renewcommand{\arraystretch}{1.0}
\tablefirsthead{%
\hline\\[-2ex]
\textbf{Example Name} & \textbf{Earlier Version} & \textbf{Feature Updated}
\\[2pt]
\hline\\[-2ex]
}
\tablehead{%
\multicolumn{2}{l}{\small\slshape table continued from previous page}\\[2pt]
\hline\\[-2ex]
\textbf{Example Name} & \textbf{Earlier Version} & \textbf{Feature Updated}\\[2pt]
\hline\\[-2ex]
}
\tabletail{%
\hline\\[-2.5ex]
\multicolumn{2}{l}{\small\slshape table continued on next page}\\
}
\tablelasttail{\hline\\[-1ex]}
\tablecaption{Updated Examples for Version 5.1\label{tab:Updated Examples 5.1}}
\begin{supertabular}{p{1.8in} p{1.4in} p{1.8in}}
\hexentry{affinity.5}[f]{4.0} &
\scode{master} affinity policy replaced with \scode{primary} policy \\[2pt]
\hline\\[-2ex]
\hexentry{async_target.3}[f90]{5.0} &
\scode{master} construct replaced \\
\hexentry{cancellation.2}[f90]{4.0} &
with \scode{masked} construct \\
\hexentry{copyprivate.2}[f]{3.0} & \\
\hexentry[f]{fort_sa_private.5}{3.0} & \\
\hexentry{lock_owner.1}[f]{3.0} & \\
\hexmentry{masked.1}[f]{3.0}{master.1} & \\
\hexmentry{parallel_masked_taskloop.1}[f90]{5.0}{parallel_master_taskloop.1} &\\
\hexentry{reduction.6}[f]{3.0} & \\
\hexentry{target_task_reduction.1}[f90]{5.0} & \\
\hexentry{target_task_reduction.2b}[f90]{5.0} & \\
\hexentry{taskloop_simd_reduction.1}[f90]{5.0} & \\
\hexentry{task_detach.1}[f90]{5.0} & \\[2pt]
\end{supertabular}
\linenumbers
Table~\ref{tab:Updated Examples 5.0} lists the updated examples for OpenMP 5.0
in the Examples Document Version
\href{https://github.com/OpenMP/Examples/tree/v5.1}{5.1}.
The \emph{Earlier Version} column of the table lists the earlier version
tags of the examples that can be found in
the Examples Document Version
\href{https://github.com/OpenMP/Examples/tree/v5.0.1}{5.0.1}.
\nolinenumbers
\renewcommand{\arraystretch}{1.0}
\tablefirsthead{%
\hline\\[-2ex]
\textbf{Example Name} & \textbf{Earlier Version} & \textbf{Feature Updated}
\\[2pt]
\hline\\[-2ex]
}
\tablehead{%
\multicolumn{2}{l}{\small\slshape table continued from previous page}\\[2pt]
\hline\\[-2ex]
\textbf{Example Name} & \textbf{Earlier Version} & \textbf{Feature Updated}\\[2pt]
\hline\\[-2ex]
}
\tabletail{%
\hline\\[-2.5ex]
\multicolumn{2}{l}{\small\slshape table continued on next page}\\
}
\tablelasttail{\hline\\[-1ex]}
\tablecaption{Updated Examples for Version 5.0\label{tab:Updated Examples 5.0}}
\begin{supertabular}{p{1.6in} p{1.3in} p{2.1in}}
\hexentry{critical.2}[f]{4.5} &
\scode{omp_lock_hint_*} constants \\
\hexentry[cpp]{init_lock_with_hint.1}[f]{4.5} &
replaced with \scode{omp_sync_hint_*} constants \\[2pt]
\end{supertabular}
\linenumbers

@ -1,23 +1,33 @@
\bchapter{Foreword}
\chapter*{Foreword}
\label{chap:foreword}
The OpenMP Examples document has been updated with new features
found in the OpenMP 5.1 Specification. The additional examples and updates
found in the OpenMP \VER\ Specification. The additional examples and updates
are referenced in the Document Revision History of the Appendix on page~\pageref{chap:history}.
Text describing an example with a 5.1 feature specifically states
that the feature support begins in the OpenMP 5.1 Specification. Also,
an \code{\small omp\_5.1} keyword is included in the metadata of the source code.
These distinctions are presented to remind readers that a 5.1 compliant
Text describing an example with a \VER\ feature specifically states
that the feature support begins in the OpenMP \VER\ Specification. Also,
an \code{\small omp\_\VER} keyword is included in the metadata of the source code.
These distinctions are presented to remind readers that a \VER\ compliant
OpenMP implementation is necessary to use these features in codes.
Examples for most of the 5.1 features are included in this document,
Examples for most of the \VER\ features are included in this document,
and incremental releases will become available as more feature examples
and updates are submitted, and approved by the OpenMP Examples Subcommittee.
and updates are submitted and approved by the OpenMP Examples Subcommittee.
Examples are accepted for this document after discussions, revisions and reviews
in the Examples Subcommittee, and two reviews/discussions and two votes
in the OpenMP Language Committee.
Draft examples are often derived from case studies for new features in the language,
and are revised to illustrate the basic application of the features with code comments,
and a text description. We are grateful to the numerous members of the Language Committee
who took the time to prepare codes and descriptions, and shepherd them through
the acceptance process. We sincerely appreciate the Examples Subcommittee members, who
actively participated and contributed in weekly meetings over the years.
\bigskip
Examples Subcommitee Co-chairs: \smallskip\linebreak
Examples Subcommittee Co-chairs: \smallskip\linebreak
Henry Jin (\textsc{NASA} Ames Research Center) \linebreak
Kent Milfeld (\textsc{TACC}, Texas Advanced Research Center)
Kent Milfeld (\textsc{TACC}, Texas Advanced Computing Center)

@ -1,6 +1,72 @@
\cchapter{Document Revision History}{history}
\label{chap:history}
%=====================================
\section{Changes from 5.1 to 5.2}
\label{sec:history_51_to_52}
\begin{itemize}
\item General changes:
\begin{itemize}
\item Included a description of the semantics for OpenMP directive syntax
(see \specref{chap:directive_syntax})
\item Reorganized the Introduction Chapter and moved the Feature
Deprecation Chapter to Appendix~\ref{chap:deprecated_features}
\item Included a list of examples that were updated for feature deprecation
and replacement in each version (see Appendix~\ref{sec:Updated Examples})
\item Added Index entries
\end{itemize}
\item Updated the examples for feature deprecation and replacement in OpenMP 5.2.
See Table~\ref{tab:Deprecated Features} and
Table~\ref{tab:Updated Examples 5.2} for details.
\item Added the following examples for the 5.2 features:
\begin{itemize}
\item Mapping class objects with virtual functions
(\specref{sec:virtual_functions})
\item \scode{allocators} construct for Fortran \code{allocate} statement
(\specref{sec:allocators})
\item Behavior of reallocation of variables through OpenMP allocator in
Fortran (\specref{sec:allocators})
\end{itemize}
\item Added the following examples for the 5.1 features:
\begin{itemize}
\item Clarification of optional \code{end} directive for strictly structured
block in Fortran (\specref{sec:fortran_free_format_comments})
\item \scode{filter} clause on \scode{masked} construct (\specref{sec:masked})
\item \scode{omp_all_memory} reserved locator for specifying task dependences
(\specref{subsec:depend_undefer_task})
\item Behavior of Fortran allocatable variables in \code{target} regions
(\specref{sec:fort_allocatable_array_mapping})
\item Device memory routines in Fortran
(\specref{subsec:target_mem_and_device_ptrs})
\item Partial tiles from \scode{tile} construct
(\specref{sec:incomplete_tiles})
\item Fortran associate names and selectors in \code{target} region
(\specref{sec:associate_target})
\item \scode{allocate} directive for variable declarations and
\scode{allocate} clause on \scode{task} constructs
(\specref{sec:allocators})
\item Controlling concurrency and reproducibility with \code{order} clause
(\specref{sec:reproducible_modifier})
\end{itemize}
\item Added other examples:
\begin{itemize}
\item Using lambda expressions with \scode{target} constructs
(\specref{sec:lambda_expressions})
\item Target memory and device pointer routines
(\specref{subsec:target_mem_and_device_ptrs})
\item Examples to illustrate the ordering properties of
the \plc{flush} operation (\specref{sec:mem_model})
\item User selector in the \code{metadirective} directive
(\specref{sec:metadirective})
\end{itemize}
\end{itemize}
%=====================================
\section{Changes from 5.0.1 to 5.1}
\label{sec:history_501_to_51}

View File

@ -1,17 +1,18 @@
# Makefile for the OpenMP Examples document in LaTeX format.
# For more information, see the main document, openmp-examples.tex.
version=5.1
version=5.2
default: openmp-examples.pdf
diff: openmp-diff-abridged.pdf
book: BOOK_BUILD="\\\\def\\\\bookbuild{1}"
book: clean openmp-examples.pdf
cp openmp-examples-${version}.pdf openmp-examples-${version}-book.pdf
CHAPTERS=Title_Page.tex \
Foreword_Chapt.tex \
Introduction_Chapt.tex \
Examples_Chapt.tex \
Deprecated_Features_Chapt.tex \
Chap_*.tex \
Deprecated_Features.tex \
History.tex \
*/*.tex
@ -22,6 +23,8 @@ SOURCES=*/sources/*.c \
INTERMEDIATE_FILES=openmp-examples.pdf \
openmp-examples.toc \
openmp-examples.lof \
openmp-examples.lot \
openmp-examples.idx \
openmp-examples.aux \
openmp-examples.ilg \
@ -29,20 +32,30 @@ INTERMEDIATE_FILES=openmp-examples.pdf \
openmp-examples.out \
openmp-examples.log
LATEXCMD=pdflatex -interaction=batchmode -file-line-error
LATEXDCMD=$(LATEXCMD) -draftmode
# check for branch names of the form "name_XXX"
DIFF_TICKET_ID=$(shell git rev-parse --abbrev-ref HEAD)
openmp-examples.pdf: $(CHAPTERS) $(SOURCES) openmp.sty openmp-examples.tex openmp-logo.png
openmp-examples.pdf: $(CHAPTERS) $(SOURCES) openmp.sty openmp-examples.tex openmp-logo.png generated-include.tex
rm -f $(INTERMEDIATE_FILES)
pdflatex -interaction=batchmode -file-line-error openmp-examples.tex
pdflatex -interaction=batchmode -file-line-error openmp-examples.tex
pdflatex -interaction=batchmode -file-line-error openmp-examples.tex
touch generated-include.tex
$(LATEXDCMD) openmp-examples.tex
makeindex -s openmp-index.ist openmp-examples.idx
$(LATEXDCMD) openmp-examples.tex
$(LATEXCMD) openmp-examples.tex
cp openmp-examples.pdf openmp-examples-${version}.pdf
clean:
rm -f $(INTERMEDIATE_FILES)
rm -f generated-include.tex
rm -f openmp-diff-full.pdf openmp-diff-abridged.pdf
rm -rf *.tmpdir
cd util; make clean
realclean: clean
rm -f openmp-examples-${version}.pdf openmp-examples-${version}-book.pdf
ifdef DIFF_TO
VC_DIFF_TO := -r ${DIFF_TO}
@ -52,11 +65,11 @@ endif
ifdef DIFF_FROM
VC_DIFF_FROM := -r ${DIFF_FROM}
else
VC_DIFF_FROM := -r work_5.1
VC_DIFF_FROM := -r work_5.2
endif
DIFF_TO:=HEAD
DIFF_FROM:=work_5.1
DIFF_FROM:=work_5.2
DIFF_TYPE:=UNDERLINE
COMMON_DIFF_OPTS:=--math-markup=whole \
@ -67,6 +80,10 @@ VC_DIFF_OPTS:=${COMMON_DIFF_OPTS} --force -c latexdiff.cfg --flatten --type="${D
VC_DIFF_MINIMAL_OPTS:= --only-changes --force
generated-include.tex:
echo "$(BOOK_BUILD)"
echo "$(BOOK_BUILD)" > $@
%.tmpdir: $(wildcard *.sty) $(wildcard *.png) $(wildcard *.aux) openmp-examples.pdf
mkdir -p $@/sources
for i in affinity devices loop_transformations parallel_execution SIMD tasking \
@ -88,3 +105,5 @@ openmp-diff-minimal.pdf: diffs-slow-minimal.tmpdir
env PATH="$(shell pwd)/util/latexdiff:$(PATH)" latexdiff-vc ${VC_DIFF_MINIMAL_OPTS} -d $< ${VC_DIFF_OPTS} openmp-examples.tex
cp $</openmp-examples.pdf $@
if [ "x$(DIFF_TICKET_ID)" != "x" ]; then cp $@ ${@:.pdf=-$(DIFF_TICKET_ID).pdf}; fi
.PHONY: diff default book clean realclean

View File

@ -2,6 +2,8 @@
\section{\code{simd} and \code{declare} \code{simd} Directives}
\label{sec:SIMD}
\index{constructs!simd@\code{simd}}
\index{simd construct@\code{simd} construct}
The following example illustrates the basic use of the \code{simd} construct
to assure the compiler that the loop can be vectorized.
@ -10,6 +12,12 @@ to assure the compiler that the loop can be vectorized.
\ffreeexample[4.0]{SIMD}{1}
\index{directives!declare simd@\code{declare}~\code{simd}}
\index{declare simd directive@\code{declare}~\code{simd} directive}
\index{clauses!uniform@\code{uniform}}
\index{uniform clause@\code{uniform} clause}
\index{clauses!linear@\code{linear}}
\index{linear clause@\code{linear} clause}
When a function can be inlined within a loop, the compiler has an opportunity to
vectorize the loop. By guaranteeing SIMD behavior of a function's operations,
characterizing the arguments of the function and privatizing temporary
@ -43,6 +51,11 @@ variable.
\ffreeexample[4.0]{SIMD}{2}
%\pagebreak
\index{clauses!private@\code{private}}
\index{private clause@\code{private} clause}
\index{clauses!reduction@\code{reduction}}
\index{reduction clause@\code{reduction} clause}
\index{reductions!reduction clause@\code{reduction} clause}
A thread that encounters a SIMD construct executes vectorized code for the
iterations. As with a worksharing loop, a loop vectorized
with a SIMD construct must assure that temporary and reduction variables are
@ -56,6 +69,8 @@ construct.
%\pagebreak
\index{clauses!safelen@\code{safelen}}
\index{safelen clause@\code{safelen} clause}
A \code{safelen(N)} clause in a \code{simd} construct assures the compiler that
there are no loop-carried dependencies for vectors of size \plc{N} or below. If
the \code{safelen} clause is not specified, then the default safelen value is
@ -71,6 +86,8 @@ than 16, the behavior is undefined.
\ffreeexample[4.0]{SIMD}{4}
%\pagebreak
\index{clauses!collapse@\code{collapse}}
\index{collapse clause@\code{collapse} clause}
The following SIMD construct instructs the compiler to collapse the \plc{i} and
\plc{j} loops into a single SIMD loop in which SIMD chunks are executed by
threads of the team. Within the workshared loop chunks of a thread, the SIMD
@ -84,6 +101,10 @@ chunks are executed in the lanes of the vector units.
%%% section
\section{\code{inbranch} and \code{notinbranch} Clauses}
\label{sec:SIMD_branch}
\index{clauses!inbranch@\code{inbranch}}
\index{inbranch clause@\code{inbranch} clause}
\index{clauses!notinbranch@\code{notinbranch}}
\index{notinbranch clause@\code{notinbranch} clause}
The following examples illustrate the use of the \code{declare} \code{simd}
directive with the \code{inbranch} and \code{notinbranch} clauses. The
@ -114,6 +135,7 @@ version of the \plc{fib()} function.
\pagebreak
\section{Loop-Carried Lexical Forward Dependence}
\label{sec:SIMD_forward_dep}
\index{dependences!loop-carried lexical forward}
The following example tests the restriction on a SIMD loop with a loop-carried lexical forward dependence. This dependence must be preserved for the correct execution of SIMD loops.
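A minimal sketch of such a dependence, written for illustration only: the function name `forward_dep` and its arguments are hypothetical and are not taken from the Examples sources. Statement S1 writes `b[i]`, and statement S2, which appears later in the loop body, reads `b[i-1]` in the following iteration. A conforming implementation is expected to keep that ordering when it vectorizes the loop.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the OpenMP Examples sources).
   S1 writes b[i]; S2, placed after S1 in the loop body, reads b[i-1],
   which S1 produced in the previous iteration. This is a loop-carried
   lexical forward dependence. */
void forward_dep(const double *a, double *b, double *c, size_t n)
{
    if (n == 0) return;

    /* Peel iteration 0 so that b[i-1] is always defined inside the loop. */
    b[0] = a[0] + 1.0;
    c[0] = 0.0;

    #pragma omp simd
    for (size_t i = 1; i < n; i++) {
        b[i] = a[i] + 1.0;      /* S1: produces b[i]                        */
        c[i] = b[i - 1] * 2.0;  /* S2: consumes b[i-1] written by S1 earlier */
    }
}
```

Compiled without OpenMP support the pragma is ignored and the loop runs serially; the results should be the same in both cases because the forward dependence is preserved.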

View File

@ -1,8 +1,14 @@
%%% section
\section{\code{ref}, \code{val}, \code{uval} Modifiers for \code{linear} Clause}
\label{sec:linear_modifier}
\index{modifiers, linear@modifiers, \code{linear}!ref@\code{ref}}
\index{modifiers, linear@modifiers, \code{linear}!val@\code{val}}
\index{modifiers, linear@modifiers, \code{linear}!uval@\code{uval}}
\index{clauses!linear@\code{linear}}
\index{linear clause@\code{linear} clause}
When generating vector functions from \code{declare}~\code{simd} directives, it is important for a compiler to know the proper types of function arguments in
When generating vector functions from \code{declare}~\code{simd} directives,
it is important for a compiler to know the proper types of function arguments in
order to generate efficient code.
This is especially true for C++ reference types and Fortran arguments.
@ -11,66 +17,67 @@ parameter (or Fortran argument) \plc{p}. Variable \plc{p} gets incremented by 1
The caller loop \plc{i} in the main program passes
a variable \plc{k} as a reference to the function \plc{add\_one2} call.
The \code{ref} modifier for the \code{linear} clause on the
\code{declare}~\code{simd} directive is used to annotate the
reference-type parameter \plc{p} to match the property of the variable
\code{declare}~\code{simd} directive specifies that the
reference-type parameter \plc{p} is to match the property of the variable
\plc{k} in the loop.
This use of reference type is equivalent to the second call to
\plc{add\_one2} with a direct passing of the array element \plc{a[i]}.
In the example, the preferred vector
length 8 is specified for both the caller loop and the callee function.
When \code{linear(ref(p))} is applied to an argument passed by reference,
When \code{linear(p:~ref)} is applied to an argument passed by reference,
it tells the compiler that the addresses in its vector argument are consecutive,
and so the compiler can generate a single vector load or store instead of
a gather or scatter. This allows more efficient SIMD code to be generated with
fewer source changes.
\cppexample[4.5]{linear_modifier}{1}
\ffreeexample[4.5]{linear_modifier}{1}
\cppexample[5.2]{linear_modifier}{1}
\ffreeexample[5.2]{linear_modifier}{1}
\clearpage
The following example is a variant of the above example. The function \plc{add\_one2} in the C++ code includes an additional C++ reference parameter \plc{i}.
The following example is a variant of the above example. The function \plc{add\_one2}
in the C++ code includes an additional C++ reference parameter \plc{i}.
The loop index \plc{i} of the caller loop \plc{i} in the main program
is passed as a reference to the function \plc{add\_one2} call.
The loop index \plc{i} has a uniform address with
linear value of step 1 across SIMD lanes.
Thus, the \code{uval} modifier is used for the \code{linear} clause
to annotate the C++ reference-type parameter \plc{i} to match
to specify that the C++ reference-type parameter \plc{i} is to match
the property of loop index \plc{i}.
In the correponding Fortran code the arguments \plc{p} and
In the corresponding Fortran code the arguments \plc{p} and
\plc{i} in the routine \plc{add\_one2} are passed by reference.
Similar modifiers are used for these variables in the \code{linear} clauses
to match with the property at the caller loop in the main program.
When \code{linear(uval(i))} is applied to an argument passed by reference, it
When \code{linear(i:~uval)} is applied to an argument passed by reference, it
tells the compiler that its addresses in the vector argument are uniform
so that the compiler can generate a scalar load or scalar store and create
linear values. This allows more efficient SIMD code to be generated with
fewer source changes.
\cppexample[4.5]{linear_modifier}{2}
\ffreeexample[4.5]{linear_modifier}{2}
\cppexample[5.2]{linear_modifier}{2}
\ffreeexample[5.2]{linear_modifier}{2}
In the following example, the function \plc{func} takes arrays \plc{x} and \plc{y} as arguments, and accesses the array elements referenced by
the index \plc{i}.
In the following example, the function \plc{func} takes arrays \plc{x} and \plc{y}
as arguments, and accesses the array elements referenced by the index \plc{i}.
The caller loop \plc{i} in the main program passes a linear copy of
the variable \plc{k} to the function \plc{func}.
The \code{val} modifier is used for the \code{linear} clause
in the \code{declare}~\code{simd} directive for the function
\plc{func} to annotate argument \plc{i} to match the property of
\plc{func} to specify that the argument \plc{i} is to match the property of
the actual argument \plc{k} passed in the SIMD loop.
Arrays \plc{x} and \plc{y} have uniform addresses across SIMD lanes.
When \code{linear(val(i):1)} is applied to an argument,
When \code{linear(i:~val,step(1))} is applied to an argument,
it tells the compiler that its addresses in the vector argument may not be
consecutive; however, their values are linear (with stride 1 here). When the value of \plc{i} is used
in the subscript of an array reference (e.g., \plc{x[i]}), the compiler can generate
a vector load or store instead of a gather or scatter. This allows more
efficient SIMD code to be generated with fewer source changes.
\cexample[4.5]{linear_modifier}{3}
\ffreeexample[4.5]{linear_modifier}{3}
\cexample[5.2]{linear_modifier}{3}
\ffreeexample[5.2]{linear_modifier}{3}
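The uniform/linear combination described above can be sketched in C as follows. This is an illustrative sketch, not the text of linear_modifier.3: the names `add_elems` and `sum_pairs` are hypothetical. It uses the simple `linear(i:1)` form, which older compilers also accept; for a by-value C argument this form is assumed to behave like the `val` modifier, which OpenMP 5.2 spells `linear(i: val, step(1))`.

```c
/* uniform(x, y): the array base addresses are identical in every SIMD lane.
   linear(i:1):   the value of i grows by 1 from one lane to the next, so
                  x[i] and y[i] can be fetched with vector loads.
   (OpenMP 5.2 spelling of the same request: linear(i: val, step(1)).) */
#pragma omp declare simd simdlen(4) uniform(x, y) linear(i:1)
double add_elems(const double x[], const double y[], int i)
{
    return x[i] + y[i];
}

/* Hypothetical driver: k advances by 1 per iteration, which matches the
   linear property declared for argument i of add_elems. */
double sum_pairs(const double x[], const double y[], int n)
{
    double sum = 0.0;
    int k = 0;

    #pragma omp simd linear(k:1) reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += add_elems(x, y, k);
        k++;
    }
    return sum;
}
```

Calling `sum_pairs(x, y, n)` adds up `x[j] + y[j]` for `j` from 0 to `n-1`. Without OpenMP support the pragmas are ignored and the code runs as ordinary scalar C.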

View File

@ -1,5 +1,5 @@
/*
* @@name: SIMD.1c
* @@name: SIMD.1
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: SIMD.1f
! @@name: SIMD.1
! @@type: F-free
! @@compilable: yes
! @@linkable: no

View File

@ -1,5 +1,5 @@
/*
* @@name: SIMD.2c
* @@name: SIMD.2
* @@type: C
* @@compilable: yes
* @@linkable: yes

View File

@ -1,4 +1,4 @@
! @@name: SIMD.2f
! @@name: SIMD.2
! @@type: F-free
! @@compilable: yes
! @@linkable: yes

View File

@ -1,5 +1,5 @@
/*
* @@name: SIMD.3c
* @@name: SIMD.3
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: SIMD.3f
! @@name: SIMD.3
! @@type: F-free
! @@compilable: yes
! @@linkable: no

View File

@ -1,5 +1,5 @@
/*
* @@name: SIMD.4c
* @@name: SIMD.4
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: SIMD.4f
! @@name: SIMD.4
! @@type: F-free
! @@compilable: yes
! @@linkable: no

View File

@ -1,5 +1,5 @@
/*
* @@name: SIMD.5c
* @@name: SIMD.5
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: SIMD.5f
! @@name: SIMD.5
! @@type: F-free
! @@compilable: yes
! @@linkable: no

View File

@ -1,5 +1,5 @@
/*
* @@name: SIMD.6c
* @@name: SIMD.6
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: SIMD.6f
! @@name: SIMD.6
! @@type: F-free
! @@compilable: yes
! @@linkable: no

View File

@ -1,5 +1,5 @@
/*
* @@name: SIMD.7c
* @@name: SIMD.7
* @@type: C
* @@compilable: yes
* @@linkable: yes

View File

@ -1,4 +1,4 @@
! @@name: SIMD.7f
! @@name: SIMD.7
! @@type: F-free
! @@compilable: yes
! @@linkable: yes

View File

@ -1,5 +1,5 @@
/*
* @@name: SIMD.8c
* @@name: SIMD.8
* @@type: C
* @@compilable: yes
* @@linkable: yes

View File

@ -1,4 +1,4 @@
! @@name: SIMD.8f
! @@name: SIMD.8
! @@type: F-free
! @@compilable: yes
! @@linkable: yes

View File

@ -1,17 +1,17 @@
/*
* @@name: linear_modifier.1cpp
* @@name: linear_modifier.1
* @@type: C++
* @@compilable: yes
* @@linkable: yes
* @@expect: success
* @@version: omp_4.5
* @@version: omp_5.2
*/
#include <stdio.h>
#define NN 1023
int a[NN];
#pragma omp declare simd linear(ref(p)) simdlen(8)
#pragma omp declare simd linear(p: ref) simdlen(8)
void add_one2(int& p)
{
p += 1;

View File

@ -1,16 +1,16 @@
! @@name: linear_modifier.1.f90
! @@name: linear_modifier.1
! @@type: F-free
! @@compilable: yes
! @@linkable: yes
! @@expect: success
! @@version: omp_4.5
! @@version: omp_5.2
module m
integer, parameter :: NN = 1023
integer :: a(NN)
contains
subroutine add_one2(p)
!$omp declare simd(add_one2) linear(ref(p)) simdlen(8)
!$omp declare simd(add_one2) linear(p: ref) simdlen(8)
implicit none
integer :: p

View File

@ -1,17 +1,17 @@
/*
* @@name: linear_modifier.2cpp
* @@name: linear_modifier.2
* @@type: C++
* @@compilable: yes
* @@linkable: yes
* @@expect: success
* @@version: omp_4.5
* @@version: omp_5.2
*/
#include <stdio.h>
#define NN 1023
int a[NN];
#pragma omp declare simd linear(ref(p)) linear(uval(i))
#pragma omp declare simd linear(p: ref) linear(i: uval)
void add_one2(int& p, const int& i)
{
p += i;

View File

@ -1,16 +1,16 @@
! @@name: linear_modifier.2f90
! @@name: linear_modifier.2
! @@type: F-free
! @@compilable: yes
! @@linkable: yes
! @@expect: success
! @@version: omp_4.5
! @@version: omp_5.2
module m
integer, parameter :: NN = 1023
integer :: a(NN)
contains
subroutine add_one2(p, i)
!$omp declare simd(add_one2) linear(ref(p)) linear(uval(i))
!$omp declare simd(add_one2) linear(p: ref) linear(i: uval)
implicit none
integer :: p
integer, intent(in) :: i

View File

@ -1,16 +1,16 @@
/*
* @@name: linear_modifier.3c
* @@name: linear_modifier.3
* @@type: C
* @@compilable: yes
* @@linkable: yes
* @@expect: success
* @@version: omp_4.5
* @@version: omp_5.2
*/
#include <stdio.h>
#define N 128
#pragma omp declare simd simdlen(4) uniform(x, y) linear(val(i):1)
#pragma omp declare simd simdlen(4) uniform(x, y) linear(i:val,step(1))
double func(double x[], double y[], int i)
{
return (x[i] + y[i]);

View File

@ -1,13 +1,13 @@
! @@name: linear_modifier.3f
! @@name: linear_modifier.3
! @@type: F-free
! @@compilable: yes
! @@linkable: yes
! @@expect: success
! @@version: omp_4.5
! @@version: omp_5.2
module func_mod
contains
real(8) function func(x, y, i)
!$omp declare simd(func) simdlen(4) uniform(x, y) linear(val(i):1)
!$omp declare simd(func) simdlen(4) uniform(x, y) linear(i:val,step(1))
implicit none
real(8), intent(in) :: x(*), y(*)
integer, intent(in) :: i

View File

@ -23,11 +23,12 @@
\vspace{2.3in} %was 3.0
Source codes for OpenMP \PVER{} Examples can be downloaded from
\href{https://github.com/OpenMP/Examples/tree/v\VER}{github}.\\
Source codes for OpenMP \VER{} Examples are available at
\href{https://github.com/OpenMP/Examples/tree/v\VER}%
{github (https://github.com/OpenMP/Examples/tree/v\VER)}.\\
\begin{adjustwidth}{0pt}{1em}\setlength{\parskip}{0.25\baselineskip}%
Copyright \copyright{} 1997-2021 OpenMP Architecture Review Board.\\
Copyright \copyright{} 1997-2022 OpenMP Architecture Review Board.\\
Permission to copy without fee all or part of this material is granted,
provided the OpenMP Architecture Review Board copyright notice and
the title of this document appear. Notice is given that copying is by
@ -37,7 +38,7 @@ permission of OpenMP Architecture Review Board.\end{adjustwidth}
% Blank page
\cleardoublepage
%\cleardoublepage
%For final version, uncomment the line above, comment out the lines below
%This working version enacted the following tickets: 287, 519, 550, 593,

View File

@ -1,19 +1,24 @@
\pagebreak
\section{\code{proc\_bind} Clause}
\label{sec:affinity}
\index{affinity!proc_bind clause@\scode{proc_bind} clause}
\index{clauses!proc_bind@\scode{proc_bind}}
\index{proc_bind clause@\scode{proc_bind} clause}
The following examples demonstrate how to use the \code{proc\_bind} clause to
control the thread binding for a team of threads in a \code{parallel} region.
The machine architecture is depicted in the figure below. It consists of two sockets,
The machine architecture is depicted in Figure~\ref{fig:mach_arch}. It consists of two sockets,
each equipped with a quad-core processor and configured to execute two hardware
threads simultaneously on each core. These examples assume a contiguous core numbering
starting from 0, such that the hardware threads 0,1 form the first physical core.
\ifpdf
%\begin{figure}[htbp]
\centerline{\includegraphics[width=3.8in,keepaspectratio=true]%
\begin{figure}[htb]
\centerline{\includegraphics[width=3.0in,keepaspectratio=true]%
{figs/proc_bind_fig.pdf}}
%\end{figure}
\caption{A machine architecture with two quad-core processors}
\label{fig:mach_arch}
\end{figure}
\fi
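As a rough sketch of how the clause is attached to a parallel region (the helper name `spread_team_size` is hypothetical and does not come from the affinity examples), the function below requests a team and asks the runtime to spread its threads over the place partition. It counts the threads with a reduction, so it needs no runtime library calls.

```c
/* Hypothetical helper (not part of the OpenMP Examples sources).
   Starts a team of at most `requested` threads (requested must be positive)
   and asks the runtime to spread them over the available places.
   Returns the team size actually used; this is 1 when the code is
   compiled without OpenMP support, because the pragma is then ignored. */
int spread_team_size(int requested)
{
    int team = 0;

    #pragma omp parallel proc_bind(spread) num_threads(requested) reduction(+:team)
    {
        team += 1;  /* every thread of the team contributes exactly one */
    }
    return team;
}
```

The return value only reports how many threads took part. The binding itself is not visible here; a query routine such as `omp_get_place_num` can be called inside the region to inspect where each thread was placed.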
The following equivalent place list declarations consist of eight places (which
@ -27,6 +32,8 @@ or
\subsection{Spread Affinity Policy}
\label{subsec:affinity_spread}
\index{affinity!spread policy@\code{spread} policy}
\index{spread policy@\code{spread} policy}
The following example shows the result of the \code{spread} affinity policy on
@ -124,6 +131,8 @@ and distribution of the place partition would be as follows:
\subsection{Close Affinity Policy}
\label{subsec:affinity_close}
\index{affinity!close policy@\code{close} policy}
\index{close policy@\code{close} policy}
The following example shows the result of the \code{close} affinity policy on
the partition list when the number of threads is less than or equal to the number
@ -220,6 +229,8 @@ and distribution of the place partition would be as follows:
\subsection{Primary Affinity Policy}
\label{subsec:affinity_primary}
\index{affinity!primary policy@\code{primary} policy}
\index{primary policy@\code{primary} policy}
The following example shows the result of the \code{primary} affinity policy on
the partition list for the machine architecture depicted above. The place partition
@ -227,7 +238,7 @@ is not changed by the primary policy.
\cexample[4.0]{affinity}{5}
\fexample[4.0]{affinity}{5}[1]
\fexample[4.0]{affinity}{5}
\clearpage
It is unspecified on which place the primary thread is initially started. If the

View File

@ -1,5 +1,14 @@
\section{Affinity Display}
\label{sec:affinity_display}
\index{affinity display!OMP_DISPLAY_AFFINITY@\scode{OMP_DISPLAY_AFFINITY}}
\index{environment variables!OMP_DISPLAY_AFFINITY@\scode{OMP_DISPLAY_AFFINITY}}
\index{OMP_DISPLAY_AFFINITY@\scode{OMP_DISPLAY_AFFINITY}}
\index{affinity display!OMP_AFFINITY_FORMAT@\scode{OMP_AFFINITY_FORMAT}}
\index{environment variables!OMP_AFFINITY_FORMAT@\scode{OMP_AFFINITY_FORMAT}}
\index{OMP_AFFINITY_FORMAT@\scode{OMP_AFFINITY_FORMAT}}
\index{affinity display!omp_display_affinity routine@\scode{omp_display_affinity} routine}
\index{routines!omp_display_affinity@\scode{omp_display_affinity}}
\index{omp_display_affinity routine@\scode{omp_display_affinity} routine}
The following examples illustrate ways to display thread affinity.
Automatic display of affinity can be invoked by setting
@ -49,6 +58,8 @@ where the numbers correspond to core ids for the system. Note, \code{OMP\_DISPLA
set and is \code{FALSE} by default. This example shows how to use API routines to
perform affinity display operations.
\index{environment variables!OMP_PLACES@\scode{OMP_PLACES}}
\index{OMP_PLACES@\scode{OMP_PLACES}}
For each of the two first-level threads the \code{OMP\_PLACES} variable specifies
a place with all the core-ids of the socket (\{0,2,4,6\} for one thread and \{1,3,5,7\} for the other).
(As is sometimes the case in 2-socket systems, one socket may consist
@ -62,8 +73,14 @@ the affinities for the threads on each socket are printed according to this form
\ffreeexample[5.0]{affinity_display}{2}
\index{affinity display!omp_get_affinity_format routine@\scode{omp_get_affinity_format} routine}
\index{routines!omp_get_affinity_format@\scode{omp_get_affinity_format}}
\index{omp_get_affinity_format routine@\scode{omp_get_affinity_format} routine}
\index{affinity display!omp_set_affinity_format routine@\scode{omp_set_affinity_format} routine}
\index{routines!omp_set_affinity_format@\scode{omp_set_affinity_format}}
\index{omp_set_affinity_format routine@\scode{omp_set_affinity_format} routine}
The next example illustrates more details about affinity formatting.
First, the \code{omp\_get\_affininity\_format()} API routine is used to
First, the \code{omp\_get\_affinity\_format()} API routine is used to
obtain the default format. The code checks to make sure the storage
provides enough space to hold the format.
Next, the \code{omp\_set\_affinity\_format()} API routine sets a user-defined
@ -83,6 +100,9 @@ and the "0" indicates that any unused space is to be prefixed with zeros
%The period (\plc{.}) indicates right justified and \plc{0} leading zeros.
%All other text in the format is just user narrative.
\index{affinity display!omp_capture_affinity routine@\scode{omp_capture_affinity} routine}
\index{routines!omp_capture_affinity@\scode{omp_capture_affinity}}
\index{omp_capture_affinity routine@\scode{omp_capture_affinity} routine}
Within the parallel region the affinity for each thread is captured by
\code{omp\_capture\_affinity()} into a buffer array with elements indexed
by the thread number (\plc{thrd\_num}).
@ -98,6 +118,7 @@ The maximum value for the number of characters (\plc{nchars}) returned by
clause and the \plc{if(nchars >= max\_req\_store) max\_req\_store=nchars} statement.
It is used to report possible truncation (if \plc{max\_req\_store} > \plc{buffer\_store}).
\newpage
\cexample[5.0]{affinity_display}{3}
\ffreeexample[5.0]{affinity_display}{3}
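The capture step described above can be sketched as a small helper. This is an assumption-laden illustration rather than the code of affinity_display.3: the name `capture_my_affinity` is hypothetical, the format string is chosen only for this sketch, and the OpenMP branch assumes a runtime that provides the 5.0 routine `omp_capture_affinity`.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper (not part of the OpenMP Examples sources).
   Stores the calling thread's affinity text in buf and returns the number
   of characters the complete text needs, excluding the terminating null.
   A return value >= size means the stored text was truncated. */
size_t capture_my_affinity(char *buf, size_t size)
{
#ifdef _OPENMP
    /* %n = thread number, %A = thread affinity; both field types appear in
       the OMP_AFFINITY_FORMAT settings used by the affinity_display examples.
       Assumes the OpenMP runtime implements the 5.0 affinity routines. */
    return omp_capture_affinity(buf, size, "thrd_num= %n, thrd_affinity= %A");
#else
    /* Assumption for this sketch: without OpenMP there is nothing to report. */
    if (size > 0) buf[0] = '\0';
    return 0;
#endif
}
```

Inside a parallel region each thread would call the helper with its own buffer, and the largest return value can then be compared with the buffer size to detect truncation, in the same spirit as the `max` reduction mentioned above.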

View File

@ -1,5 +1,18 @@
\section{Affinity Query Functions}
\label{sec: affinity_query}
\index{affinity query!omp_get_num_places routine@\scode{omp_get_num_places} routine}
\index{routines!omp_get_num_places@\scode{omp_get_num_places}}
\index{omp_get_num_places routine@\scode{omp_get_num_places} routine}
\index{affinity query!omp_get_place_num routine@\scode{omp_get_place_num} routine}
\index{routines!omp_get_place_num@\scode{omp_get_place_num}}
\index{omp_get_place_num routine@\scode{omp_get_place_num} routine}
\index{affinity query!omp_get_place_num_procs routine@\scode{omp_get_place_num_procs} routine}
\index{routines!omp_get_place_num_procs@\scode{omp_get_place_num_procs}}
\index{omp_get_place_num_procs routine@\scode{omp_get_place_num_procs} routine}
\index{affinity!spread policy@\code{spread} policy}
\index{spread policy@\code{spread} policy}
\index{environment variables!OMP_PLACES@\scode{OMP_PLACES}}
\index{OMP_PLACES@\scode{OMP_PLACES}}
In the example below a team of threads is generated on each socket of
the system, using nested parallelism. Several query functions are used

View File

@ -1,5 +1,5 @@
/*
* @@name: affinity.1c
* @@name: affinity.1
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: affinity.1f
! @@name: affinity.1
! @@type: F-fixed
! @@compilable: yes
! @@linkable: no

View File

@ -1,5 +1,5 @@
/*
* @@name: affinity.2c
* @@name: affinity.2
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: affinity.2f
! @@name: affinity.2
! @@type: F-free
! @@compilable: yes
! @@linkable: no

View File

@ -1,5 +1,5 @@
/*
* @@name: affinity.3c
* @@name: affinity.3
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: affinity.3f
! @@name: affinity.3
! @@type: F-fixed
! @@compilable: yes
! @@linkable: no

View File

@ -1,5 +1,5 @@
/*
* @@name: affinity.4c
* @@name: affinity.4
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: affinity.4f
! @@name: affinity.4
! @@type: F-free
! @@compilable: yes
! @@linkable: no

View File

@ -1,15 +1,11 @@
/*
* @@name: affinity.5c
* @@name: affinity.5
* @@type: C
* @@compilable: yes
* @@linkable: no
* @@expect: success
* @@version: omp_5.1
*/
#if _OPENMP < 202011
#define primary master
#endif
void work();
int main()
{

View File

@ -1,14 +1,9 @@
! @@name: affinity.5f
! @@name: affinity.5
! @@type: F-fixed
! @@compilable: yes
! @@requires: preprocessing
! @@linkable: no
! @@expect: success
! @@version: omp_5.1
#if _OPENMP < 202011
#define primary master
#endif
PROGRAM EXAMPLE
!$OMP PARALLEL PROC_BIND(primary) NUM_THREADS(4)
CALL WORK()

View File

@ -1,5 +1,5 @@
/*
* @@name: affinity.1.c
* @@name: affinity.6
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: affinity.6f
! @@name: affinity.6
! @@type: F-free
! @@compilable: yes
! @@linkable: no

View File

@ -1,5 +1,5 @@
/*
* @@name: affinity_display.1.c
* @@name: affinity_display.1
* @@type: C
* @@compilable: yes
* @@linkable: yes
@ -9,52 +9,53 @@
#include <stdio.h>
#include <omp.h>
int main(void){ //MAX threads = 8, single socket system
int main(void){ //MAX threads = 8, single socket system
omp_display_affinity(NULL); //API call-- Displays Affinity of Primary Thread
//API call-- Displays Affinity of Primary Thread
omp_display_affinity(NULL);
// API CALL OUTPUT (default format):
//team_num= 0, nesting_level= 0, thread_num= 0, thread_affinity= 0,1,2,3,4,5,6,7
// API CALL OUTPUT (default format):
// team_num= 0, nesting_level= 0, thread_num= 0,
// thread_affinity= 0,1,2,3,4,5,6,7
// OMP_DISPLAY_AFFINITY=TRUE, OMP_NUM_THREADS=8
// OMP_DISPLAY_AFFINITY=TRUE, OMP_NUM_THREADS=8
#pragma omp parallel num_threads(omp_get_num_procs())
{
if(omp_get_thread_num()==0)
if(omp_get_thread_num()==0)
printf("1st Parallel Region -- Affinity Reported \n");
// DISPLAY OUTPUT (default format) has been sorted:
// team_num= 0, nesting_level= 1, thread_num= 0, thread_affinity= 0
// team_num= 0, nesting_level= 1, thread_num= 1, thread_affinity= 1
// ...
// team_num= 0, nesting_level= 1, thread_num= 7, thread_affinity= 7
// DISPLAY OUTPUT (default format) has been sorted:
// team_num= 0, nesting_level= 1, thread_num= 0, thread_affinity= 0
// team_num= 0, nesting_level= 1, thread_num= 1, thread_affinity= 1
// ...
// team_num= 0, nesting_level= 1, thread_num= 7, thread_affinity= 7
// doing work here
// doing work here
}
#pragma omp parallel num_threads( omp_get_num_procs() )
{
if(omp_get_thread_num()==0)
printf("%s%s\n","Same Affinity as in Previous Parallel Region",
" -- no Affinity Reported\n");
if(omp_get_thread_num()==0)
printf("%s%s\n","Same Affinity as in Previous Parallel Region",
" -- no Affinity Reported\n");
// NO AFFINITY OUTPUT:
//(output in 1st parallel region only for OMP_DISPLAY_AFFINITY=TRUE)
// doing more work here
// NO AFFINITY OUTPUT:
//(output in 1st parallel region only for OMP_DISPLAY_AFFINITY=TRUE)
// doing more work here
}
// Report Affinity for 1/2 number of threads
// Report Affinity for 1/2 number of threads
#pragma omp parallel num_threads( omp_get_num_procs()/2 )
{
if(omp_get_thread_num()==0)
if(omp_get_thread_num()==0)
printf("Report Affinity for using 1/2 of max threads.\n");
// DISPLAY OUTPUT (default format) has been sorted:
// team_num= 0, nesting_level= 1, thread_num= 0, thread_affinity= 0,1
// team_num= 0, nesting_level= 1, thread_num= 1, thread_affinity= 2,3
// team_num= 0, nesting_level= 1, thread_num= 2, thread_affinity= 4,5
// team_num= 0, nesting_level= 1, thread_num= 3, thread_affinity= 6,7
// DISPLAY OUTPUT (default format) has been sorted:
// team_num= 0, nesting_level= 1, thread_num= 0, thread_affinity= 0,1
// team_num= 0, nesting_level= 1, thread_num= 1, thread_affinity= 2,3
// team_num= 0, nesting_level= 1, thread_num= 2, thread_affinity= 4,5
// team_num= 0, nesting_level= 1, thread_num= 3, thread_affinity= 6,7
// do work
}

View File

@ -1,4 +1,4 @@
! @@name: affinity_display.1.f90
! @@name: affinity_display.1
! @@type: F-free
! @@compilable: yes
! @@linkable: yes
@ -10,13 +10,15 @@ program affinity_display ! MAX threads = 8, single socket system
implicit none
character(len=0) :: null
call omp_display_affinity(null) !API call- Displays Affinity of Primary Thrd
! API call - Displays Affinity of Primary Thread
call omp_display_affinity(null)
! API CALL OUTPUT (default format):
!team_num= 0, nesting_level= 0, thread_num= 0, thread_affinity= 0,1,2,3,4,5,6,7
! API CALL OUTPUT (default format):
! team_num= 0, nesting_level= 0, thread_num= 0, &
! thread_affinity= 0,1,2,3,4,5,6,7
! OMP_DISPLAY_AFFINITY=TRUE, OMP_NUM_THREADS=8
! OMP_DISPLAY_AFFINITY=TRUE, OMP_NUM_THREADS=8
!$omp parallel num_threads(omp_get_num_procs())
@ -24,11 +26,11 @@ program affinity_display ! MAX threads = 8, single socket system
print*, "1st Parallel Region -- Affinity Reported"
endif
! DISPLAY OUTPUT (default format) has been sorted:
! team_num= 0, nesting_level= 1, thread_num= 0, thread_affinity= 0
! team_num= 0, nesting_level= 1, thread_num= 1, thread_affinity= 1
! ...
! team_num= 0, nesting_level= 1, thread_num= 7, thread_affinity= 7
! DISPLAY OUTPUT (default format) has been sorted:
! team_num= 0, nesting_level= 1, thread_num= 0, thread_affinity= 0
! team_num= 0, nesting_level= 1, thread_num= 1, thread_affinity= 1
! ...
! team_num= 0, nesting_level= 1, thread_num= 7, thread_affinity= 7
! doing work here
@ -40,25 +42,30 @@ program affinity_display ! MAX threads = 8, single socket system
print*, "Same Affinity in Parallel Region -- no Affinity Reported"
endif
! NO AFFINITY OUTPUT:
!(output in 1st parallel region only for OMP_DISPLAY_AFFINITY=TRUE)
! NO AFFINITY OUTPUT:
! (output in 1st parallel region only for
! OMP_DISPLAY_AFFINITY=TRUE)
! doing more work here
!$omp end parallel
! Report Affinity for 1/2 number of threads
! Report Affinity for 1/2 number of threads
!$omp parallel num_threads( omp_get_num_procs()/2 )
if(omp_get_thread_num()==0) then
print*, "Different Affinity in Parallel Region -- Affinity Reported"
print*, "Altered Affinity in Parallel Region -- Affinity Reported"
endif
! DISPLAY OUTPUT (default format) has been sorted:
! team_num= 0, nesting_level= 1, thread_num= 0, thread_affinity= 0,1
! team_num= 0, nesting_level= 1, thread_num= 1, thread_affinity= 2,3
! team_num= 0, nesting_level= 1, thread_num= 2, thread_affinity= 4,5
! team_num= 0, nesting_level= 1, thread_num= 3, thread_affinity= 6,7
! DISPLAY OUTPUT (default format) has been sorted:
! team_num= 0, nesting_level= 1, thread_num= 0, &
! thread_affinity= 0,1
! team_num= 0, nesting_level= 1, thread_num= 1, &
! thread_affinity= 2,3
! team_num= 0, nesting_level= 1, thread_num= 2, &
! thread_affinity= 4,5
! team_num= 0, nesting_level= 1, thread_num= 3, &
! thread_affinity= 6,7
! do work

View File

@ -1,5 +1,5 @@
/*
* @@name: affinity_display.2c
* @@name: affinity_display.2
* @@type: C
* @@compilable: yes
* @@linkable: yes
@ -14,62 +14,65 @@ void socket_work(int socket_num, int n_thrds);
int main(void)
{
int n_sockets, socket_num, n_thrds_on_socket;
int n_sockets, socket_num, n_thrds_on_socket;
omp_set_nested(1); // or env var= OMP_NESTED=true
omp_set_max_active_levels(2); // or env var= OMP_MAX_ACTIVE_LEVELS=2
omp_set_nested(1); // or env var= OMP_NESTED=true
omp_set_max_active_levels(2); // or env var= OMP_MAX_ACTIVE_LEVELS=2
n_sockets = omp_get_num_places();
n_thrds_on_socket = omp_get_place_num_procs(0);
n_sockets = omp_get_num_places();
n_thrds_on_socket = omp_get_place_num_procs(0);
// OMP_NUM_THREADS=2,4
// OMP_PLACES="{0,2,4,6},{1,3,5,7}" #2 sockets; even/odd proc-ids
// OMP_AFFINITY_FORMAT=\
// "nest_level= %L, parent_thrd_num= %a, thrd_num= %n, thrd_affinity= %A"
#pragma omp parallel num_threads(n_sockets) private(socket_num)
{
socket_num = omp_get_place_num();
// OMP_NUM_THREADS=2,4
// OMP_PLACES="{0,2,4,6},{1,3,5,7}" #2 sockets; even/odd proc-ids
// OMP_AFFINITY_FORMAT=\
// "nest_level= %L, parent_thrd_num= %a, thrd_num= %n, thrd_affinity= %A"
if(socket_num==0)
printf(" LEVEL 1 AFFINITIES 1 thread/socket, %d sockets:\n\n", n_sockets);
#pragma omp parallel num_threads(n_sockets) private(socket_num)
{
socket_num = omp_get_place_num();
omp_display_affinity(NULL); // not needed if OMP_DISPLAY_AFFINITY=TRUE
if(socket_num==0)
printf(" LEVEL 1 AFFINITIES 1 thread/socket, %d sockets:\n\n",
n_sockets);
// OUTPUT:
// LEVEL 1 AFFINITIES 1 thread/socket, 2 sockets:
// nest_level= 1, parent_thrd_num= 0, thrd_num= 0, thrd_affinity= 0,2,4,6
// nest_level= 1, parent_thrd_num= 0, thrd_num= 1, thrd_affinity= 1,3,5,7
// not needed if OMP_DISPLAY_AFFINITY=TRUE
omp_display_affinity(NULL);
socket_work(socket_num, n_thrds_on_socket);
}
return 0;
// OUTPUT:
// LEVEL 1 AFFINITIES 1 thread/socket, 2 sockets:
// nest_level= 1, parent_thrd_num= 0, thrd_num= 0, thrd_affinity= 0,2,4,6
// nest_level= 1, parent_thrd_num= 0, thrd_num= 1, thrd_affinity= 1,3,5,7
socket_work(socket_num, n_thrds_on_socket);
}
return 0;
}
void socket_work(int socket_num, int n_thrds)
{
#pragma omp parallel num_threads(n_thrds)
{
if(omp_get_thread_num()==0)
printf(" LEVEL 2 AFFINITIES, %d threads on socket %d\n",n_thrds, socket_num);
omp_display_affinity(NULL); // not needed if OMP_DISPLAY_AFFINITY=TRUE
// OUTPUT:
// LEVEL 2 AFFINITIES, 4 threads on socket 0
// nest_level= 2, parent_thrd_num= 0, thrd_num= 0, thrd_affinity= 0
// nest_level= 2, parent_thrd_num= 0, thrd_num= 1, thrd_affinity= 2
// nest_level= 2, parent_thrd_num= 0, thrd_num= 2, thrd_affinity= 4
// nest_level= 2, parent_thrd_num= 0, thrd_num= 3, thrd_affinity= 6
{
#pragma omp parallel num_threads(n_thrds)
{
if(omp_get_thread_num()==0)
printf(" LEVEL 2 AFFINITIES, %d threads on socket %d\n",
n_thrds, socket_num);
// not needed if OMP_DISPLAY_AFFINITY=TRUE
omp_display_affinity(NULL);
// OUTPUT:
// LEVEL 2 AFFINITIES, 4 threads on socket 0
// nest_level= 2, parent_thrd_num= 0, thrd_num= 0, thrd_affinity= 0
// nest_level= 2, parent_thrd_num= 0, thrd_num= 1, thrd_affinity= 2
// nest_level= 2, parent_thrd_num= 0, thrd_num= 2, thrd_affinity= 4
// nest_level= 2, parent_thrd_num= 0, thrd_num= 3, thrd_affinity= 6
// LEVEL 2 AFFINITIES, 4 threads on socket 1
// nest_level= 2, parent_thrd_num= 1, thrd_num= 0, thrd_affinity= 1
// nest_level= 2, parent_thrd_num= 1, thrd_num= 1, thrd_affinity= 3
// nest_level= 2, parent_thrd_num= 1, thrd_num= 2, thrd_affinity= 5
// nest_level= 2, parent_thrd_num= 1, thrd_num= 3, thrd_affinity= 7
// LEVEL 2 AFFINITIES, 4 threads on socket 1
// nest_level= 2, parent_thrd_num= 1, thrd_num= 0, thrd_affinity= 1
// nest_level= 2, parent_thrd_num= 1, thrd_num= 1, thrd_affinity= 3
// nest_level= 2, parent_thrd_num= 1, thrd_num= 2, thrd_affinity= 5
// nest_level= 2, parent_thrd_num= 1, thrd_num= 3, thrd_affinity= 7
// ... Do Some work on Socket
}
}
}
}

View File

@ -1,4 +1,4 @@
! @@name: affinity_display.2.f90
! @@name: affinity_display.2
! @@type: F-free
! @@compilable: yes
! @@linkable: yes
@ -20,22 +20,26 @@ program affinity_display
! OMP_NUM_THREADS=2,4
! OMP_PLACES="{0,2,4,6},{1,3,5,7}" #2 sockets; even/odd proc-ids
! OMP_AFFINITY_FORMAT=\
! "nest_level= %L, parent_thrd_num= %a, thrd_num= %n, thrd_affinity= %A"
!"nest_level= %L, parent_thrd_num= %a, thrd_num= %n, thrd_affinity= %A"
!$omp parallel num_threads(n_sockets) private(socket_num)
socket_num = omp_get_place_num()
if(socket_num==0) then
write(*,'("LEVEL 1 AFFINITIES 1 thread/socket ",i0," sockets")')n_sockets
write(*,'("LEVEL 1 AFFINITIES 1 thread/socket ",i0," sockets")') &
n_sockets
endif
call omp_display_affinity(null) !not needed if OMP_DISPLAY_AFFINITY=TRUE
call omp_display_affinity(null) ! not needed
! if OMP_DISPLAY_AFFINITY=TRUE
! OUTPUT:
! LEVEL 1 AFFINITIES 1 thread/socket, 2 sockets:
! nest_level= 1, parent_thrd_num= 0, thrd_num= 0, thrd_affinity= 0,2,4,6
! nest_level= 1, parent_thrd_num= 0, thrd_num= 1, thrd_affinity= 1,3,5,7
! nest_level= 1, parent_thrd_num= 0, thrd_num= 0, &
! thrd_affinity= 0,2,4,6
! nest_level= 1, parent_thrd_num= 0, thrd_num= 1, &
! thrd_affinity= 1,3,5,7
call socket_work(socket_num, n_thrds_on_socket)
@ -56,7 +60,8 @@ subroutine socket_work(socket_num, n_thrds)
n_thrds,socket_num
endif
call omp_display_affinity(null); !not needed if OMP_DISPLAY_AFFINITY=TRUE
call omp_display_affinity(null) ! not needed
! if OMP_DISPLAY_AFFINITY=TRUE
! OUTPUT:
! LEVEL 2 AFFINITIES, 4 threads on socket 0

View File

@ -1,5 +1,5 @@
/*
* @@name: affinity_display.3.c
* @@name: affinity_display.3
* @@type: C
* @@compilable: yes
* @@linkable: yes
@ -25,9 +25,9 @@ int main(void){
char **buffer;
// CODE SEGMENT 1 AFFINITY FORMAT
// CODE SEGMENT 1 AFFINITY FORMAT
// Get and Display Default Affinity Format
// Get and Display Default Affinity Format
nchars = omp_get_affinity_format(default_format,(size_t)FORMAT_STORE);
printf("Default Affinity Format is: %s\n",default_format);
@ -37,44 +37,49 @@ int main(void){
printf(" FORMAT_STORE to %d.\n", nchars+1);
}
// Set Affinity Format
// Set Affinity Format
omp_set_affinity_format(my_format);
printf("Affinity Format set to: %s\n",my_format);
// CODE SEGMENT 2 CAPTURE AFFINITY
// CODE SEGMENT 2 CAPTURE AFFINITY
// Set up buffer for affinity of n threads
// Set up buffer for affinity of n threads
n = omp_get_num_procs();
buffer = (char **)malloc( sizeof(char *) * n );
for(i=0;i<n;i++){ buffer[i]=(char *)malloc( sizeof(char) * BUFFER_STORE); }
for(i=0;i<n;i++){
buffer[i]=(char *)malloc( sizeof(char) * BUFFER_STORE);
}
// Capture Affinity using Affinity Format set above.
// Use max reduction to check size of buffer areas
// Capture Affinity using Affinity Format set above.
// Use max reduction to check size of buffer areas
max_req_store = 0;
#pragma omp parallel private(thrd_num,nchars) reduction(max:max_req_store)
#pragma omp parallel private(thrd_num,nchars) \
reduction(max:max_req_store)
{
if(omp_get_num_threads()>n) exit(1); //safety: don't exceed # of buffers
//safety: don't exceed # of buffers
if(omp_get_num_threads()>n) exit(1);
thrd_num=omp_get_thread_num();
nchars=omp_capture_affinity(buffer[thrd_num],(size_t)BUFFER_STORE,NULL);
nchars=omp_capture_affinity(buffer[thrd_num],
(size_t)BUFFER_STORE,NULL);
if(nchars > max_req_store) max_req_store=nchars;
// ...
}
for(i=0;i<n;i++){
printf("thrd_num= %d, affinity: %s\n", i,buffer[i]);
for(i=0;i<n;i++){
printf("thrd_num= %d, affinity: %s\n", i,buffer[i]);
}
// For 4 threads with OMP_PLACES='{0,1},{2,3},{4,5},{6,7}'
// Format host=%20H thrd_num=%0.4n binds_to=%A
// For 4 threads with OMP_PLACES='{0,1},{2,3},{4,5},{6,7}'
// Format host=%20H thrd_num=%0.4n binds_to=%A
// affinity: host=hpc.cn567 thrd_num=0000 binds_to=0,1
// affinity: host=hpc.cn567 thrd_num=0001 binds_to=2,3
// affinity: host=hpc.cn567 thrd_num=0002 binds_to=4,5
// affinity: host=hpc.cn567 thrd_num=0003 binds_to=6,7
// affinity: host=hpc.cn567 thrd_num=0000 binds_to=0,1
// affinity: host=hpc.cn567 thrd_num=0001 binds_to=2,3
// affinity: host=hpc.cn567 thrd_num=0002 binds_to=4,5
// affinity: host=hpc.cn567 thrd_num=0003 binds_to=6,7
if(max_req_store>=BUFFER_STORE){

View File

@ -1,4 +1,4 @@
! @@name: affinity_display.3.f90
! @@name: affinity_display.3
! @@type: F-free
! @@compilable: yes
! @@linkable: yes

View File

@ -1,5 +1,5 @@
/*
* @@name: affinity_query.1c
* @@name: affinity_query.1
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: affinity_query.1f
! @@name: affinity_query.1
! @@type: F-free
! @@compilable: yes
! @@linkable: no

View File

@ -1,5 +1,9 @@
\section{Task Affinity}
\label{sec: task_affinity}
\index{affinity!task affinity}
\index{affinity!affinity clause@\code{affinity} clause}
\index{clauses!affinity@\code{affinity}}
\index{affinity clause@\code{affinity} clause}
The next example illustrates the use of the \code{affinity}
clause with a \code{task} construct.

View File

@ -2,6 +2,7 @@
\section{Fortran \code{ASSOCIATE} Construct}
\fortranspecificstart
\label{sec:associate}
\index{ASSOCIATE construct, Fortran@\code{ASSOCIATE} construct, Fortran}
The following is an invalid example of specifying an associate name on a data-sharing attribute
clause. The constraint in the Data Sharing Attribute Rules section in the OpenMP
@ -29,5 +30,40 @@ region, \plc{v} has the value of -1 and \plc{u} has the value of the original \p
\pagebreak
\ffreenexample[4.0]{associate}{3}
% blue line floater at top of this page for "Fortran, cont."
\begin{figure}[t!]
\linewitharrows{-1}{dashed}{Fortran (cont.)}{8em}
\end{figure}
\label{sec:associate_target}
\bigskip
The following example illustrates mapping behavior for a Fortran
associate name and its selector for a \scode{target} construct.
For the first 3 \scode{target} constructs the associate name \splc{a_aray} is
associated with the selector \splc{aray}, an array.
For the \scode{target} construct of code block TARGET 1 just the selector
\splc{aray} is used and is implicitly mapped,
likewise for the associate name \splc{a_aray} in the TARGET 2 block.
However, mapping an associate name and its selector is not valid for the same
\scode{target} construct. Hence the TARGET 3 block is non-conforming.
In TARGET 4, the \splc{scalr} selector used in the \scode{target} region
has an implicit data-sharing attribute of firstprivate since it is a scalar.
Hence, the assigned value is not returned.
In TARGET 5, the associate name \splc{a_scalr} is implicitly mapped and the
assigned value is returned to the host (default \scode{tofrom} mapping behavior).
In TARGET 6, the use of the associate name and its selector in the \scode{target}
region is conforming because the scalar firstprivate behavior of the selector
and the implicit mapping of the associate name are allowed.
At the end of the \scode{target} region only the
associate name's value is returned to the host.
In TARGET 7, the selector and associate name appear in
an explicit mapping for the same \scode{target} construct,
hence the code block is non-conforming.
\ffreenexample[5.1]{associate}{4}
\fortranspecificend

View File

@ -2,6 +2,8 @@
\section{C/C++ Arrays in a \code{firstprivate} Clause}
\ccppspecificstart
\label{sec:carrays_fpriv}
\index{clauses!firstprivate@\code{firstprivate}}
\index{firstprivate clause@\code{firstprivate} clause!C/C++ arrays in}
The following example illustrates the size and value of list items of array or
pointer type in a \code{firstprivate} clause. The size of new list items is

View File

@ -1,6 +1,10 @@
\pagebreak
\section{\code{copyin} Clause}
\label{sec:copyin}
\index{clauses!copyin@\code{copyin}}
\index{copyin clause@\code{copyin} clause}
\index{directives!threadprivate@\code{threadprivate}}
\index{threadprivate directive@\code{threadprivate} directive}
The \code{copyin} clause is used to initialize threadprivate data upon entry
to a \code{parallel} region. The value of the threadprivate variable in the primary

View File

@ -1,6 +1,8 @@
\pagebreak
\section{\code{copyprivate} Clause}
\label{sec:copyprivate}
\index{clauses!copyprivate@\code{copyprivate}}
\index{copyprivate clause@\code{copyprivate} clause}
The \code{copyprivate} clause can be used to broadcast values acquired by a single
thread directly to all instances of the private variables in the other threads.
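
As a minimal sketch of the broadcast described above (not one of the repository's examples), the C fragment below has one thread obtain a value inside a \code{single} construct and uses \code{copyprivate} to copy it into every other thread's private \code{v}. The names \code{read\_value} and \code{count\_threads\_missing\_value} are hypothetical, and the constant 2.5 merely stands in for real input.

```c
/* Hypothetical input routine; stands in for a read from a file or stdin. */
static float read_value(void)
{
    return 2.5f;
}

/* Returns how many threads did NOT receive the broadcast value.
   With copyprivate on the single construct this should be zero. */
int count_threads_missing_value(void)
{
    int missing = 0;

    #pragma omp parallel reduction(+: missing)
    {
        float v = 0.0f;   /* declared inside the region, hence private */

        /* One thread executes the assignment; on exit from the construct
           its value of v is copied to the private v of every other thread. */
        #pragma omp single copyprivate(v)
        v = read_value();

        if (v != 2.5f)
            missing += 1;
    }
    return missing;
}
```

Because the copy happens before any thread leaves the implicit barrier of the \code{single} construct, each thread can use \code{v} immediately afterwards without further synchronization.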
@ -9,6 +11,8 @@ is not affected by the presence of the directives. If it is called from a \code{
region, then the actual arguments with which \code{a} and \code{b} are associated
must be private.
\index{constructs!single@\code{single}}
\index{single construct@\code{single} construct}
The thread that executes the structured block associated with the \code{single}
construct broadcasts the values of the private variables \code{a}, \code{b},
\code{x}, and
@ -20,6 +24,8 @@ any of the threads have left the barrier at the end of the construct.
\fexample{copyprivate}{1}
\index{constructs!masked@\code{masked}}
\index{masked construct@\code{masked} construct}
In this example, assume that the input must be performed by the primary thread.
Since the \code{masked} construct does not support the \code{copyprivate} clause,
it cannot broadcast the input value that is read. However, \code{copyprivate}
@ -27,7 +33,7 @@ is used to broadcast an address where the input value is stored.
\cexample[5.1]{copyprivate}{2}
\fexample[5.1]{copyprivate}{2}[1]
\fexample[5.1]{copyprivate}{2}
Suppose that the number of lock variables required within a \code{parallel} region
cannot easily be determined prior to entering it. The \code{copyprivate} clause

View File

@ -1,6 +1,8 @@
\section{C++ Reference in Data-Sharing Clauses}
\cppspecificstart
\label{sec:cpp_reference}
\index{clauses!data-sharing, C++ reference in}
\index{data-sharing clauses, C++ reference in}
C++ reference types are allowed in data-sharing attribute clauses as of OpenMP 4.5, except
for the \code{threadprivate}, \code{copyin} and \code{copyprivate} clauses.

View File

@ -1,6 +1,8 @@
\pagebreak
\section{\code{default(none)} Clause}
\label{sec:default_none}
\index{clauses!default(none)@\code{default(none)}}
\index{default(none) clause@\code{default(none)} clause}
The following example distinguishes the variables that are affected by the \code{default(none)}
clause from those that are not.

View File

@ -2,6 +2,7 @@
\section{Fortran Private Loop Iteration Variables}
\label{sec:fort_loopvar}
\fortranspecificstart
\index{loop variables, Fortran}
In general, loop iteration variables are private when used in the \plc{do-loop}
of a \code{do} and \code{parallel do} construct or in sequential loops in a

View File

@ -2,6 +2,8 @@
\section{Fortran Restrictions on Storage Association with the \code{private} Clause}
\fortranspecificstart
\label{sec:fort_sa_private}
\index{clauses!private@\code{private}}
\index{private clause@\code{private} clause!storage association, Fortran}
The following non-conforming examples illustrate the implications of the \code{private}
clause rules with regard to storage association.

View File

@ -2,6 +2,10 @@
\section{Fortran Restrictions on \code{shared} and \code{private} Clauses with Common Blocks}
\fortranspecificstart
\label{sec:fort_sp_common}
\index{clauses!private@\code{private}}
\index{clauses!shared@\code{shared}}
\index{private clause@\code{private} clause!common blocks, Fortran}
\index{shared clause@\code{shared} clause!common blocks, Fortran}
When a named common block is specified in a \code{private}, \code{firstprivate},
or \code{lastprivate} clause of a construct, none of its members may be declared

View File

@ -1,6 +1,8 @@
\pagebreak
\section{\code{lastprivate} Clause}
\label{sec:lastprivate}
\index{clauses!lastprivate@\code{lastprivate}}
\index{lastprivate clause@\code{lastprivate} clause}
Correct execution sometimes depends on the value that the last iteration of a loop
assigns to a variable. Such programs must list all such variables in a \code{lastprivate}
@ -12,6 +14,8 @@ sequentially.
\fexample{lastprivate}{1}
\clearpage
\index{lastprivate clause@\code{lastprivate} clause!conditional modifier@\code{conditional} modifier}
\index{conditional modifier@\code{conditional} modifier}
The next example illustrates the use of the \code{conditional} modifier in
a \code{lastprivate} clause to return the last value when it may not come from
the last iteration of a loop.
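
The following C fragment is a small sketch of that behavior, written for illustration rather than taken from the examples directory. The function name \code{last\_negative} is hypothetical. The variable \code{x} is assigned only in some iterations, and the \code{conditional} modifier requests the value from the sequentially last iteration that performed the assignment.

```c
/* Hypothetical helper: returns the last negative element of a[0..n-1].
   The sketch assumes the array contains at least one negative element;
   the initial value of x is only a placeholder. */
int last_negative(const int *a, int n)
{
    int x = 0;
    int i;

    #pragma omp parallel for lastprivate(conditional: x)
    for (i = 0; i < n; i++) {
        if (a[i] < 0)
            x = a[i];   /* assigned in only some of the iterations */
    }
    return x;
}
```

Without the \code{conditional} modifier, the value copied out would be the one held by the thread that ran the final iteration, whether or not that iteration assigned \code{x}.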

View File

@ -1,6 +1,8 @@
\pagebreak
\section{\code{private} Clause}
\label{sec:private}
\index{clauses!private@\code{private}}
\index{private clause@\code{private} clause}
In the following example, the values of original list items \plc{i} and \plc{j}
are retained on exit from the \code{parallel} region, while the private list

View File

@ -7,6 +7,9 @@ This section covers ways to perform reductions in parallel, task, taskloop, and
\subsection{\code{reduction} Clause}
\label{subsec:reduction}
\index{clauses!reduction@\code{reduction}}
\index{reduction clause@\code{reduction} clause}
\index{reductions!reduction clause@\code{reduction} clause}
The following example demonstrates the \code{reduction} clause; note that some
reductions can be expressed in the loop in several ways, as shown for the \code{max}
@ -64,7 +67,7 @@ the start of the \code{parallel} region.
\cexample[5.1]{reduction}{6}
\fexample[5.1]{reduction}{6}[1]
\fexample[5.1]{reduction}{6}
The following example demonstrates the reduction of array \plc{a}. In C/C++ this is illustrated by the explicit use of an array section \plc{a[0:N]} in the \code{reduction} clause. The corresponding Fortran example uses array syntax supported in the base language. As of the OpenMP 4.5 specification the explicit use of array section in the \code{reduction} clause in Fortran is not permitted. But this oversight has been fixed in the OpenMP 5.0 specification.
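
A compact sketch of an array-section reduction in C is given below, under the assumption of a compiler supporting OpenMP 4.5 or later. The function \code{column\_sums} and the macro \code{N} are illustrative names that do not appear in the examples directory.

```c
#define N 4

/* Sums the rows of m[rows][N] element-wise into a[0:N]. */
void column_sums(int rows, int m[][N], int a[N])
{
    int i, j;

    for (j = 0; j < N; j++)
        a[j] = 0;

    /* Each thread works on a private, zero-initialized copy of a[0:N];
       the copies are combined into the original array at the end. */
    #pragma omp parallel for private(j) reduction(+: a[0:N])
    for (i = 0; i < rows; i++)
        for (j = 0; j < N; j++)
            a[j] += m[i][j];
}
```

The explicit section \plc{a[0:N]} is needed here because \code{a} is a pointer inside the function, so the clause must state how many elements take part in the reduction.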
@ -75,6 +78,12 @@ The following example demonstrates the reduction of array \plc{a}. In C/C++ thi
\subsection{Task Reduction}
\label{subsec:task_reduction}
\index{clauses!task_reduction@\scode{task_reduction}}
\index{task_reduction clause@\scode{task_reduction} clause}
\index{reductions!task_reduction clause@\scode{task_reduction} clause}
\index{clauses!in_reduction@\scode{in_reduction}}
\index{in_reduction clause@\scode{in_reduction} clause}
\index{reductions!in_reduction clause@\scode{in_reduction} clause}
In OpenMP 5.0 the \code{task\_reduction} clause was created for the \code{taskgroup} construct,
to allow reductions among explicit tasks that have an \code{in\_reduction} clause.
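
The pairing of the two clauses can be sketched as follows. This is an illustrative fragment rather than the \plc{task\_reduction.1} example itself, and \code{sum\_with\_tasks} is a hypothetical name.

```c
/* Sums a[0..n-1] by creating one explicit task per element. */
int sum_with_tasks(const int *a, int n)
{
    int sum = 0;
    int i;

    #pragma omp parallel
    #pragma omp single
    {
        /* The taskgroup defines the scope of the reduction on sum. */
        #pragma omp taskgroup task_reduction(+: sum)
        {
            for (i = 0; i < n; i++) {
                /* Each task takes part in the enclosing reduction;
                   i is captured by value when the task is created. */
                #pragma omp task firstprivate(i) in_reduction(+: sum)
                sum += a[i];
            }
        }
        /* On exit from the taskgroup the partial results have been
           combined into the original sum. */
    }
    return sum;
}
```

The reduction identifier and list item in each \code{in\_reduction} clause must match those of the \code{task\_reduction} clause, as they do here with \code{+} and \code{sum}.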
@ -97,6 +106,8 @@ reduction).
\ffreeexample[5.0]{task_reduction}{1}
\index{reduction clause@\code{reduction} clause!task modifier@\code{task} modifier}
\index{task modifier@\code{task} modifier}
In OpenMP 5.0 the \code{task} \plc{reduction-modifier} for the \code{reduction} clause was
introduced to provide a means of performing reductions among implicit and explicit tasks.
@ -134,6 +145,9 @@ and list item (variable \code{x}) match as required.
\subsection{Reduction on Combined Target Constructs}
\label{subsec:target_reduction}
\index{reduction clause@\code{reduction} clause!on target construct@on \code{target} construct}
\index{constructs!target@\code{target}}
\index{target construct@\code{target} construct}
When a \code{reduction} clause appears on a combined construct that combines
a \code{target} construct with another construct, there is an implicit map
@ -174,6 +188,12 @@ first construct.
\subsection{Task Reduction with Target Constructs}
\label{subsec:target_task_reduction}
\index{in_reduction clause@\scode{in_reduction} clause}
\index{constructs!target@\code{target}}
\index{target construct@\code{target} construct}
\index{clauses!enter@\code{enter}}
\index{enter clause@\code{enter} clause}
The following examples illustrate how task reductions can apply to target tasks
that result from a \code{target} construct with the \code{in\_reduction}
@ -184,34 +204,43 @@ task reduction will be combined (in some order) into the original variable
listed in the \code{task\_reduction} clause before exiting the \code{taskgroup}
region.
\cexample[5.1]{target_task_reduction}{1}
\cexample[5.2]{target_task_reduction}{1}
\ffreeexample[5.1]{target_task_reduction}{1}[1]
\ffreeexample[5.2]{target_task_reduction}{1}
\clearpage
\index{reduction clause@\code{reduction} clause!task modifier@\code{task} modifier}
\index{task modifier@\code{task} modifier}
In the next pair of examples, the task reduction is defined by a
\code{reduction} clause with the \code{task} modifier, rather than a
\code{task\_reduction} clause on a \code{taskgroup} construct. Again, the
partial results from the participating tasks will be combined in some order
into the original reduction variable, \code{sum}.
\cexample[5.0]{target_task_reduction}{2a}
\cexample[5.2]{target_task_reduction}{2a}
\ffreeexample[5.0]{target_task_reduction}{2a}
\ffreeexample[5.2]{target_task_reduction}{2a}
\index{in_reduction clause@\scode{in_reduction} clause!with target construct@with \code{target} construct}
\index{constructs!target@\code{target}}
\index{target construct@\code{target} construct}
Next, the \code{task} modifier is again used to define a task reduction over
participating tasks. This time, the participating tasks are a target task
resulting from a \code{target} construct with the \code{in\_reduction} clause,
and the implicit task (executing on the primary thread) that calls
\code{host\_compute}. As before, the partial results from these paricipating
\code{host\_compute}. As before, the partial results from these participating
tasks are combined in some order into the original reduction variable.
\cexample[5.1]{target_task_reduction}{2b}
\cexample[5.2]{target_task_reduction}{2b}
\ffreeexample[5.1]{target_task_reduction}{2b}[1]
\ffreeexample[5.2]{target_task_reduction}{2b}
\subsection{Taskloop Reduction}
\label{subsec:taskloop_reduction}
\index{reduction clause@\code{reduction} clause!on taskloop construct@on \code{taskloop} construct}
\index{constructs!taskloop@\code{taskloop}}
\index{taskloop construct@\code{taskloop} construct}
In the OpenMP 5.0 Specification the \code{taskloop} construct
was extended to include the reductions.
@ -249,7 +278,7 @@ reduction that has not been defined.
%create a new reduction and also that all tasks generated by the taskloop will
%participate on it.
The second example computes exactly the same value as in the preceding\plc{taskloop\_reduction.1} code section,
The second example computes exactly the same value as in the preceding \plc{taskloop\_reduction.1} code section,
but in a very different way.
First, in the \plc{array\_sum} function a \code{taskgroup} region is created
that defines the scope of a new reduction using the \code{task\_reduction} clause.
@ -261,7 +290,7 @@ This is allowed because what is expressed with the \code{in\_reduction} clause
is different from what is expressed with the \code{reduction} clause.
In one case the generated tasks are specified to participate in a previously
declared reduction (\code{in\_reduction} clause) whereas in the other case
creation of a new reduction is specified and also that all tasks generated
creation of a new reduction is specified and also all tasks generated
by the taskloop will participate on it.
\cexample[5.0]{taskloop_reduction}{2}
@ -271,6 +300,9 @@ by the taskloop will participate on it.
In the OpenMP 5.0 Specification, \code{reduction} clauses for the
\code{taskloop}~\code{simd} construct were also added.
\index{reduction clause@\code{reduction} clause!on taskloop simd construct@on \code{taskloop}~\code{simd} construct}
\index{combined constructs!taskloop simd@\code{taskloop}~\code{simd}}
\index{taskloop simd construct@\code{taskloop}~\code{simd} construct}
The examples below compare reductions for the \code{taskloop} and the \code{taskloop}~\code{simd} constructs.
These examples illustrate the use of \code{reduction} clauses within
"stand-alone" \code{taskloop} constructs, and the use of \code{in\_reduction} clauses for tasks of taskloops to participate
@ -341,11 +373,14 @@ At the end of the parallel region \plc{asum} contains the combined result of all
\cexample[5.1]{taskloop_simd_reduction}{1}
\ffreeexample[5.1]{taskloop_simd_reduction}{1}[1]
\ffreeexample[5.1]{taskloop_simd_reduction}{1}
\subsection{Reduction with the \code{scope} Construct}
\label{subsec:reduction_scope}
\index{reduction clause@\code{reduction} clause!on scope construct@on \code{scope} construct}
\index{constructs!scope@\code{scope}}
\index{scope construct@\code{scope} construct}
The following example illustrates the use of the \code{scope} construct
to perform a reduction in a \code{parallel} region. The case is useful for

View File

@ -1,6 +1,10 @@
\pagebreak
\section{\code{scan} Directive}
\label{sec:scan}
\index{directives!scan@\code{scan}}
\index{scan directive@\code{scan} directive}
\index{reduction clause@\code{reduction} clause!inscan modifier@\code{inscan} modifier}
\index{inscan modifier@\code{inscan} modifier}
The following examples illustrate how to parallelize a loop that saves
the \emph{prefix sum} of a reduction. This is accomplished by using
@ -9,6 +13,12 @@ variable of the scan, and specifying with a \code{scan} directive whether
the storage statement includes or excludes the scan input of the present
iteration (\texttt{k}).
\index{scan directive@\code{scan} directive!inclusive clause@\code{inclusive} clause}
\index{scan directive@\code{scan} directive!exclusive clause@\code{exclusive} clause}
\index{clauses!inclusive@\code{inclusive}}
\index{inclusive clause@\code{inclusive} clause}
\index{clauses!exclusive@\code{exclusive}}
\index{exclusive clause@\code{exclusive} clause}
Basically, the \code{inscan} modifier connects a loop and/or SIMD reduction to
the scan operation, and a \code{scan} construct with an \code{inclusive} or
\code{exclusive} clause specifies whether the ``scan phase'' (lexical block

View File

@ -1,4 +1,4 @@
! @@name: associate.1f
! @@name: associate.1
! @@type: F-fixed
! @@compilable: no
! @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: associate.2f
! @@name: associate.2
! @@type: F-fixed
! @@compilable: yes
! @@linkable: yes

View File

@ -1,4 +1,4 @@
! @@name: associate.3f
! @@name: associate.3
! @@type: F-free
! @@compilable: yes
! @@linkable: yes

View File

@ -0,0 +1,58 @@
! @@name: associate.4
! @@type: F-free
! @@compilable: yes
! @@linkable: yes
! @@expect: success
! @@version: omp_5.1
program main
integer :: scalr, aray(3)
scalr = -1 ; aray = -1
associate(a_scalr=>scalr, a_aray=>aray)
!$omp target !! TARGET 1
aray = [1,2,3]
!$omp end target
print *, a_aray, aray !! 1 2 3 1 2 3
!$omp target !! TARGET 2
a_aray = [4,5,6]
!$omp end target
print *, a_aray, aray !! 4 5 6 4 5 6
!!!$omp target !! TARGET 3
!! !! mapping, in this case implicit,
!! !! of aray AND a_aray NOT ALLOWED
!! aray = [4,5,6]
!! a_aray = [1,2,3]
!!!$omp end target
!$omp target !! TARGET 4
scalr = 1 !! scalr is firstprivate
!$omp end target
print *, a_scalr, scalr !! -1 -1
!$omp target !! TARGET 5
a_scalr = 2 !! a_scalr implicitly mapped
!$omp end target
print *, a_scalr, scalr !! 2 2
!$omp target !! TARGET 6
scalr = 3 !! scalr is firstprivate
print *, a_scalr, scalr !! 2 3
a_scalr = 4 !! a_scalr implicitly mapped
print *, a_scalr, scalr !! 4 3
!$omp end target
print *, a_scalr, scalr !! 4 4
!!!$omp target map(a_scalr,scalr) !! TARGET 7
!! mapping, in this case explicit,
!! of scalr AND a_scalr NOT ALLOWED
!! scalr = 5
!! a_scalr = 5
!!!$omp end target
end associate
end program

View File

@ -1,5 +1,5 @@
/*
* @@name: carrays_fpriv.1c
* @@name: carrays_fpriv.1
* @@type: C
* @@compilable: yes
* @@linkable: yes

View File

@ -1,5 +1,5 @@
/*
* @@name: copyin.1c
* @@name: copyin.1
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: copyin.1f
! @@name: copyin.1
! @@type: F-fixed
! @@compilable: yes
! @@linkable: no

View File

@ -1,5 +1,5 @@
/*
* @@name: copyprivate.1c
* @@name: copyprivate.1
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: copyprivate.1f
! @@name: copyprivate.1
! @@type: F-fixed
! @@compilable: yes
! @@linkable: no

View File

@ -1,15 +1,11 @@
/*
* @@name: copyprivate.2c
* @@name: copyprivate.2
* @@type: C
* @@compilable: yes
* @@linkable: no
* @@expect: success
* @@version: omp_5.1
*/
#if _OPENMP < 202011
#define masked master
#endif
#include <stdio.h>
#include <stdlib.h>

View File

@ -1,14 +1,9 @@
! @@name: copyprivate.2f
! @@name: copyprivate.2
! @@type: F-fixed
! @@compilable: yes
! @@requires: preprocessing
! @@linkable: no
! @@expect: success
! @@version: omp_5.1
#if _OPENMP < 202011
#define MASKED MASTER
#endif
REAL FUNCTION READ_NEXT()
REAL, POINTER :: TMP

View File

@ -1,5 +1,5 @@
/*
* @@name: copyprivate.3c
* @@name: copyprivate.3
* @@type: C
* @@compilable: yes
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: copyprivate.3f
! @@name: copyprivate.3
! @@type: F-fixed
! @@compilable: yes
! @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: copyprivate.4f
! @@name: copyprivate.4
! @@type: F-fixed
! @@compilable: yes
! @@linkable: no

View File

@ -1,5 +1,5 @@
/*
* @@name: cpp_reference.1c
* @@name: cpp_reference.1
* @@type: C++
* @@compilable: yes
* @@linkable: no

View File

@ -1,5 +1,5 @@
/*
* @@name: default_none.1c
* @@name: default_none.1
* @@type: C
* @@compilable: no
* @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: default_none.1f
! @@name: default_none.1
! @@type: F-fixed
! @@compilable: no
! @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: fort_loopvar.1f
! @@name: fort_loopvar.1
! @@type: F-free
! @@compilable: yes
! @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: fort_loopvar.2f
! @@name: fort_loopvar.2
! @@type: F-free
! @@compilable: yes
! @@linkable: no

View File

@ -1,4 +1,4 @@
! @@name: fort_sa_private.1f
! @@name: fort_sa_private.1
! @@type: F-fixed
! @@compilable: yes
! @@linkable: yes

View File

@ -1,4 +1,4 @@
! @@name: fort_sa_private.2f
! @@name: fort_sa_private.2
! @@type: F-fixed
! @@compilable: maybe
! @@linkable: maybe

View File

@ -1,4 +1,4 @@
! @@name: fort_sa_private.3f
! @@name: fort_sa_private.3
! @@type: F-fixed
! @@compilable: maybe
! @@linkable: maybe

View File

@ -1,4 +1,4 @@
! @@name: fort_sa_private.4f
! @@name: fort_sa_private.4
! @@type: F-fixed
! @@compilable: maybe
! @@linkable: maybe
