diff --git a/Chap_SIMD.tex b/Chap_SIMD.tex index e7874c2..dc57ef3 100644 --- a/Chap_SIMD.tex +++ b/Chap_SIMD.tex @@ -1,5 +1,4 @@ -\pagebreak -\chapter{SIMD} +\cchapter{SIMD}{SIMD} \label{chap:simd} Single instruction, multiple data (SIMD) is a form of parallel execution @@ -12,7 +11,7 @@ Many processors have SIMD (vector) units that can perform simultaneously Loops without loop-carried backward dependency (or with dependency preserved using ordered simd) are candidates for vectorization by the compiler for execution with SIMD units. In addition, with state-of-the-art vectorization -technology and \code{declare simd} construct extensions for function vectorization +technology and \code{declare simd} directive extensions for function vectorization in the OpenMP 4.5 specification, loops with function calls can be vectorized as well. The basic idea is that a scalar function call in a loop can be replaced by a vector version of the function, and the loop can be vectorized simultaneously by combining a loop @@ -46,3 +45,8 @@ execution in different SIMD units. %\code{parallel for simd}). +%===== Examples Sections ===== +\input{SIMD/SIMD} +\input{SIMD/linear_modifier} + + diff --git a/Chap_affinity.tex b/Chap_affinity.tex index 91cb6cd..c465b97 100644 --- a/Chap_affinity.tex +++ b/Chap_affinity.tex @@ -1,5 +1,4 @@ -\pagebreak -\chapter{OpenMP Affinity} +\cchapter{OpenMP Affinity}{affinity} \label{chap:openmp_affinity} OpenMP Affinity consists of a \code{proc\_bind} policy (thread affinity policy) and a specification of @@ -53,21 +52,21 @@ variables for the MPI library. %Forked threads within an MPI process %which sets \code{OMP\_PLACES} specifically for the MPI process. 
Threads of a team are positioned onto places in a compact manner, a -scattered distribution, or onto the master's place, by setting the +scattered distribution, or onto the primary thread's place, by setting the \code{OMP\_PROC\_BIND} environment variable or the \code{proc\_bind} clause to -\plc{close}, \plc{spread}, or \plc{master}, respectively. When +\code{close}, \code{spread}, or \code{primary} (\code{master} has been deprecated), respectively. When \code{OMP\_PROC\_BIND} is set to FALSE no binding is enforced; and when the value is TRUE, the binding is implementation defined to a set of places in the \code{OMP\_PLACES} variable or to places defined by the implementation if the \code{OMP\_PLACES} variable -is not set. +is not set. The \code{OMP\_PLACES} variable can also be set to an abstract name -(\plc{threads}, \plc{cores}, \plc{sockets}) to specify that a place is +(\code{threads}, \code{cores}, \code{sockets}) to specify that a place is either a single hardware thread, a core, or a socket, respectively. This description of the \code{OMP\_PLACES} is most useful when the number of threads is equal to the number of hardware thread, cores -or sockets. It can also be used with a \plc{close} or \plc{spread} +or sockets. It can also be used with a \code{close} or \code{spread} distribution policy when the equality doesn't hold. @@ -116,3 +115,11 @@ distribution policy when the equality doesn't hold. 
% thread # 0 * * * * _ _ _ _ _ _ _ _ #mask for thread 0 % thread # 0 _ _ _ _ * * * * _ _ _ _ #mask for thread 1 % thread # 0 _ _ _ _ _ _ _ _ * * * * #mask for thread 2 + + +%===== Examples Sections ===== +\input{affinity/affinity} +\input{affinity/task_affinity} +\input{affinity/affinity_display} +\input{affinity/affinity_query} + diff --git a/Chap_data_environment.tex b/Chap_data_environment.tex index e38d8aa..fad3911 100644 --- a/Chap_data_environment.tex +++ b/Chap_data_environment.tex @@ -1,5 +1,4 @@ -\pagebreak -\chapter{Data Environment} +\cchapter{Data Environment}{data_environment} \label{chap:data_environment} The OpenMP \plc{data environment} contains data attributes of variables and objects. Many constructs (such as \code{parallel}, \code{simd}, \code{task}) @@ -73,3 +72,22 @@ it has been referenced (+1 on entry and -1 on exited) in nested (structured) map regions and/or accumulative (unstructured) mappings, determines the operation. Details of the \code{map} clause and reference count operation are specified in the \plc{map Clause} subsection of the OpenMP Specifications document. 
+ + +%===== Examples Sections ===== +\input{data_environment/threadprivate} +\input{data_environment/default_none} +\input{data_environment/private} +\input{data_environment/fort_loopvar} +\input{data_environment/fort_sp_common} +\input{data_environment/fort_sa_private} +\input{data_environment/carrays_fpriv} +\input{data_environment/lastprivate} +\input{data_environment/reduction} +\input{data_environment/udr} +\input{data_environment/scan} +\input{data_environment/copyin} +\input{data_environment/copyprivate} +\input{data_environment/cpp_reference} +\input{data_environment/associate} + diff --git a/Chap_devices.tex b/Chap_devices.tex index ca7da2e..0be8779 100644 --- a/Chap_devices.tex +++ b/Chap_devices.tex @@ -1,5 +1,4 @@ -\pagebreak -\chapter{Devices} +\cchapter{Devices}{devices} \label{chap:devices} The \code{target} construct consists of a \code{target} directive @@ -51,3 +50,26 @@ This new specification does not affect the execution of pre-4.5 code; it is a necessary element for asynchronous execution of the \code{target} region when using the new \code{nowait} clause introduced in OpenMP 4.5. 
+ + +%===== Examples Sections ===== +\input{devices/target} +\input{devices/target_defaultmap} +\input{devices/target_pointer_mapping} +\input{devices/target_structure_mapping} +\input{devices/target_fort_allocatable_array_mapping} +\input{devices/array_sections} +\input{devices/array_shaping} +\input{devices/target_mapper} +\input{devices/target_data} +\input{devices/target_unstructured_data} +\input{devices/target_update} +\input{devices/target_associate_ptr} +\input{devices/declare_target} +\input{devices/teams} +\input{devices/async_target_depend} +\input{devices/async_target_with_tasks} +\input{devices/async_target_nowait} +\input{devices/async_target_nowait_depend} +\input{devices/device} + diff --git a/Chap_directives.tex b/Chap_directives.tex new file mode 100644 index 0000000..c0daeba --- /dev/null +++ b/Chap_directives.tex @@ -0,0 +1,45 @@ +\cchapter{OpenMP Directive Syntax}{directives} +\label{chap:directive_syntax} + +OpenMP \emph{directives} use base-language mechanisms to specify OpenMP program behavior. +In C code, the directives are formed exclusively with pragmas, whereas in C++ +code, directives are formed from either pragmas or attributes. +Fortran directives are formed with comments in free form and fixed form sources (codes). +All of these mechanisms allow the compilation to ignore the OpenMP directives if +OpenMP is not supported or enabled. + + +The OpenMP directive is a combination of the base-language mechanism and a \plc{directive-specification}, +as shown below. The \plc{directive-specification} consists +of the \plc{directive-name} which may seldom have arguments, +followed by optional \plc{clauses}. Full details of the syntax can be found in the OpenMP Specification. +Illustrations of the syntax are given in the examples. 
+ +The formats for combining a base-language mechanism and a \plc{directive-specification} are: + +C/C++ pragmas +\begin{indentedcodelist} +\code{\#pragma omp} \plc{directive-specification} +\end{indentedcodelist} + +C++ attributes +\begin{indentedcodelist} +\code{[[omp :: directive(} \plc{directive-specification} \code{)]]} +\code{[[using omp : directive(} \plc{directive-specification} \code{)]]} +\end{indentedcodelist} + +Fortran comments +\begin{indentedcodelist} +\code{!\$omp} \plc{directive-specification} +\end{indentedcodelist} + +where \code{c\$omp} and \code{*\$omp} may be used in Fortran fixed form sources. + + +%===== Examples Sections ===== +\input{directives/pragmas} +\input{directives/attributes} +\input{directives/fixed_format_comments} +\input{directives/free_format_comments} + + diff --git a/Chap_loop_transformations.tex b/Chap_loop_transformations.tex new file mode 100644 index 0000000..a9f49bd --- /dev/null +++ b/Chap_loop_transformations.tex @@ -0,0 +1,25 @@ +\cchapter{Loop Transformations}{loop_transformations} +\label{chap:loop_transformations} + +To obtain better performance on a platform, code may need to be restructured +relative to the way it is written (which is often for best readability). +User-directed loop transformations accomplish this goal by providing a means +to separate code semantics from its optimization. + +A loop transformation construct states that a transformation operation is to be +performed on a set of nested loops. This directive approach can target specific loops +for transformation, rather than applying more time-consuming general compiler +heuristic methods with compiler options that may not be able to discover +optimal transformations. + +Loop transformations can be augmented by preprocessor support or OpenMP \code{metadirective} +directives, to select optimal dimension and size parameters for specific platforms, +facilitating a single code base for multiple platforms. 
+Moreover, directive-based transformations make experimenting easier, +since specific hot spots can be targeted by transformation directives. + + +%===== Examples Sections ===== +\input{loop_transformations/tile} +\input{loop_transformations/unroll} + diff --git a/Chap_memory_model.tex b/Chap_memory_model.tex index 85dfcdf..c3bc1a8 100644 --- a/Chap_memory_model.tex +++ b/Chap_memory_model.tex @@ -1,5 +1,4 @@ -\pagebreak -\chapter{Memory Model} +\cchapter{Memory Model}{memory_model} \label{chap:memory_model} OpenMP provides a shared-memory model that allows all threads on a given @@ -129,3 +128,10 @@ section of the OpenMP Specifications document. % in \plc{atomic Construct} subsection of the OpenMP Specifications document). % Examples 1-3 show the difficulty of synchronizing threads through \code{flush} and \code{atomic} directives. + + +%===== Examples Sections ===== +\input{memory_model/mem_model} +\input{memory_model/allocators} +\input{memory_model/fort_race} + diff --git a/Chap_ompt_interface.tex b/Chap_ompt_interface.tex new file mode 100644 index 0000000..079ce60 --- /dev/null +++ b/Chap_ompt_interface.tex @@ -0,0 +1,19 @@ +\cchapter{OMPT Interface}{ompt_interface} +\label{chap:ompt_interface} +OMPT defines mechanisms and an API for interfacing with tools in the OpenMP program. + +The OMPT API provides the following functionality: +\begin{itemize} + \addtolength{\itemindent}{1cm} + \item examines the state associated with an OpenMP thread + \item interprets the call stack of an OpenMP thread + \item receives notification about OpenMP events + \item traces activity on OpenMP target devices + \item assesses implementation-dependent details + \item controls a tool from an OpenMP application +\end{itemize} + +The following sections will illustrate basic mechanisms and operations of the OMPT API. 
+ + +\input{ompt_interface/ompt_start} diff --git a/Chap_parallel_execution.tex b/Chap_parallel_execution.tex index db21ca4..af139a3 100644 --- a/Chap_parallel_execution.tex +++ b/Chap_parallel_execution.tex @@ -1,5 +1,4 @@ -\pagebreak -\chapter{Parallel Execution} +\cchapter{Parallel Execution}{parallel_execution} \label{chap:parallel_execution} A single thread, the \plc{initial thread}, begins sequential execution of @@ -10,7 +9,7 @@ A \code{parallel} construct encloses code, forming a parallel region. An \plc{initial thread} encountering a \code{parallel} region forks (creates) a team of threads at the beginning of the \code{parallel} region, and joins them (removes from execution) at the -end of the region. The initial thread becomes the master thread of the team in a +end of the region. The initial thread becomes the primary thread of the team in a \code{parallel} region with a \plc{thread} number equal to zero, the other threads are numbered from 1 to number of threads minus 1. A team may be comprised of just a single thread. @@ -19,9 +18,9 @@ Each thread of a team is assigned an implicit task consisting of code within the parallel region. The task that creates a parallel region is suspended while the tasks of the team are executed. A thread is tied to its task; that is, only the thread assigned to the task can execute that task. After completion -of the \code{parallel} region, the master thread resumes execution of the generating task. +of the \code{parallel} region, the primary thread resumes execution of the generating task. -%After the \code{parallel} region the master thread becomes the initial +%After the \code{parallel} region the primary thread becomes the initial %thread again, and continues to execute the \plc{sequential part}. Any task within a \code{parallel} region is allowed to encounter another @@ -43,7 +42,8 @@ defined. 
When dynamic adjustment is on and the number of threads is specified, the number of threads becomes an upper limit for the number of threads to be provided by the OpenMP runtime. -\pagebreak +%\pagebreak +\bigskip WORKSHARING CONSTRUCTS A worksharing construct distributes the execution of the associated region @@ -96,9 +96,33 @@ region with a single structure block (section of code). Statements in the by threads of the team. \bigskip -MASTER CONSTRUCT +MASKED CONSTRUCT + +The \code{masked} construct is not a worksharing construct. The \code{masked} region is +executed only by the primary thread. There is no implicit barrier (and flush) +at the end of the \code{masked} region; hence the other threads of the team continue +execution of code statements beyond the \code{masked} region. +The \code{master} construct, which has been deprecated in OpenMP 5.1, has identical semantics +to the \code{masked} construct with no \code{filter} clause. + + +%===== Examples Sections ===== +\input{parallel_execution/ploop} +\input{parallel_execution/parallel} +\input{parallel_execution/host_teams} +\input{parallel_execution/nthrs_nesting} +\input{parallel_execution/nthrs_dynamic} +\input{parallel_execution/fort_do} +\input{parallel_execution/nowait} +\input{parallel_execution/collapse} +\input{parallel_execution/linear_in_loop} +\input{parallel_execution/psections} +\input{parallel_execution/fpriv_sections} +\input{parallel_execution/single} +\input{parallel_execution/workshare} +\input{parallel_execution/masked} +\input{parallel_execution/loop} +\input{parallel_execution/pra_iterator} +\input{parallel_execution/set_dynamic_nthrs} +\input{parallel_execution/get_nthrs} -The \code{master} construct is not a worksharing construct. The master region is -is executed only by the master thread. There is no implicit barrier (and flush) -at the end of the \code{master} region; hence the other threads of the team continue -execution beyond code statements beyond the \code{master} region. 
diff --git a/Chap_program_control.tex b/Chap_program_control.tex index 0b53b97..3ed6d24 100644 --- a/Chap_program_control.tex +++ b/Chap_program_control.tex @@ -1,17 +1,26 @@ -\pagebreak -\chapter{Program Control} -\label{sec:program_control} +\cchapter{Program Control}{program_control} +\label{chap:program_control} -Some specific and elementary concepts of controlling program execution are -illustrated in the examples of this chapter. Control can be directly -managed with conditional control code (ifdef's with the \code{\_OPENMP} -macro, and the Fortran sentinel (\code{!\$}) -for conditionally compiling). The \code{if} clause on some constructs +Basic concepts and mechanisms for directing and controlling a program compilation and execution +are provided in this introduction and illustrated in subsequent examples. + +\bigskip +CONDITIONAL COMPILATION and EXECUTION + +Conditional compilation can be performed with conventional \#ifdef directives +in C, C++, and Fortran, and additionally with OpenMP sentinel (\code{!\$}) in Fortran. +The \code{if} clause on some directives can direct the runtime to ignore or alter the behavior of the construct. -Of course, the base-language \code{if} statements can be used to control the "execution" +Of course, the base-language \code{if} statements can be used to control the execution of stand-alone directives (such as \code{flush}, \code{barrier}, \code{taskwait}, and \code{taskyield}). -However, the directives must appear in a block structure, and not as a substatement as shown in examples 1 and 2 of this chapter. +However, the directives must appear in a block structure, and not as a substatement. +The \code{metadirective} and \code{declare}~\code{variant} directives provide conditional +selection of directives and routines for compilation (and use), respectively. 
+The \code{assume} and \code{requires} directives provide invariants +for optimizing compilation, and essential features for compilation +and correct execution, respectively. + \bigskip CANCELLATION @@ -28,15 +37,15 @@ The \code{cancel} construct is also a cancellation point for any other thread of to also continue execution at the end of the named region. Also, once the specified region has been activated for cancellation any thread that encounnters -a \code{cancellation point} construct with the same named region (\plc{construct-type-clause}), +a \code{cancellation}~\code{point} construct with the same named region (\plc{construct-type-clause}), continues execution at the end of the region. For an activated \code{cancel taskgroup} construct, the tasks that belong to the taskgroup set of the innermost enclosing taskgroup region will be canceled. -A task that encounters the cancel taskgroup construct continues execution at the end of its +A task that encounters a \code{cancel}~\code{taskgroup} construct continues execution at the end of its task region. Any task of the taskgroup that has already begun execution will run to completion, -unless it encounters a \code{cancellation point}; tasks that have not begun execution "may" be +unless it encounters a \code{cancellation}~\code{point}; tasks that have not begun execution may be discarded as completed tasks. \bigskip @@ -44,9 +53,10 @@ CONTROL VARIABLES Internal control variables (ICV) are used by implementations to hold values which control the execution of OpenMP regions. Control (and hence the ICVs) may be set as implementation defaults, - or set and adjusted through environment variables, clauses, and API functions. Many of the ICV control - values are accessible through API function calls. Also, initial ICV values are reported by the runtime - if the \code{OMP\_DISPLAY\_ENV} environment variable has been set to \code{TRUE}. + or set and adjusted through environment variables, clauses, and API functions. 
+ %Many of the ICV control values are accessible through API function calls. + Initial ICV values are reported by the runtime + if the \code{OMP\_DISPLAY\_ENV} environment variable has been set to \code{TRUE} or \code{VERBOSE}. %As an example, the \plc{nthreads-var} is the ICV that holds the number of threads %to be used in a \code{parallel} region. It can be set with the \code{OMP\_NUM\_THREADS} environment variable, @@ -59,9 +69,9 @@ CONTROL VARIABLES \bigskip NESTED CONSTRUCTS -Certain combinations of nested constructs are permitted, giving rise to a \plc{combined} construct -consisting of two or more constructs. These can be used when the two (or several) constructs would be used -immediately in succession (closely nested). A combined construct can use the clauses of the component +Certain combinations of nested constructs are permitted, giving rise to \plc{combined} constructs +consisting of two or more directives. These can be used when the two (or several) constructs would be used +immediately in succession (closely nested). A \plc{combined} construct can use the clauses of the component constructs without restrictions. A \plc{composite} construct is a combined construct which has one or more clauses with (an often obviously) modified or restricted meaning, relative to when the constructs are uncombined. %%[appear separately (singly). @@ -72,14 +82,32 @@ modified or restricted meaning, relative to when the constructs are uncombined. %the parallel loop constructs and the \code{SIMD} construct), because the \code{collapse} clause must %explicitly address the ordering of loop chunking \plc{and} SIMD "combined" execution. -Certain nestings are forbidden, and often the reasoning is obvious. Worksharing constructs cannot be nested, and +Certain nestings are forbidden, and often the reasoning is obvious. 
For example, worksharing constructs cannot be nested, and the \code{barrier} construct cannot be nested inside a worksharing construct, or a \code{critical} construct. -Also, \code{target} constructs cannot be nested. +Also, \code{target} constructs cannot be nested, unless the nested target is a reverse offload. -The \code{parallel} construct can be nested, as well as the \code{task} construct. The parallel -execution in the nested \code{parallel} construct(s) is control by the \code{OMP\_NESTED} and -\code{OMP\_MAX\_ACTIVE\_LEVELS} environment variables, and the \code{omp\_set\_nested()} and -\code{omp\_set\_max\_active\_levels()} functions. +The \code{parallel} construct can be nested, as well as the \code{task} construct. +The parallel execution in the nested parallel construct(s) is controlled by the +\code{OMP\_MAX\_ACTIVE\_LEVELS} environment variable, and the \code{omp\_set\_max\_active\_levels} routine. +Use the \code{omp\_get\_max\_active\_levels} routine to determine the maximum levels provided by an implementation. +As of OpenMP 5.0, use of the \code{OMP\_NESTED} environment variable and the \code{omp\_set\_nested} routine +has been deprecated. More details on nesting can be found in the \plc{Nesting of Regions} of the \plc{Directives} chapter in the OpenMP Specifications document. 
+ + +%===== Examples Sections ===== +\input{program_control/cond_comp} +\input{program_control/icv} +\input{program_control/standalone} +\input{program_control/cancellation} +\input{program_control/requires} +\input{program_control/variant} +\input{program_control/metadirective} +\input{program_control/nested_loop} +\input{program_control/nesting_restrict} +\input{program_control/target_offload} +\input{program_control/interop} +\input{program_control/utilities} + diff --git a/Chap_synchronization.tex b/Chap_synchronization.tex index ec75388..00a7cc9 100644 --- a/Chap_synchronization.tex +++ b/Chap_synchronization.tex @@ -1,5 +1,4 @@ -\pagebreak -\chapter{Synchronization} +\cchapter{Synchronization}{synchronization} \label{chap:synchronization} The \code{barrier} construct is a stand-alone directive that requires all threads @@ -79,3 +78,23 @@ Scheduling constraints on task execution can be prescribed by the \code{depend} clause to enforce dependence on previously generated tasks. More details on controlling task executions can be found in the \plc{Tasking} Chapter in the OpenMP Specifications document. %(DO REF. RIGHT.) 
+ + +%===== Examples Sections ===== +\input{synchronization/critical} +\input{synchronization/worksharing_critical} +\input{synchronization/barrier_regions} +\input{synchronization/atomic} +\input{synchronization/atomic_restrict} +\input{synchronization/flush_nolist} +\input{synchronization/acquire_release} +\input{synchronization/ordered} +\input{synchronization/depobj} +\input{synchronization/doacross} +\input{synchronization/locks} +\input{synchronization/init_lock} +\input{synchronization/init_lock_with_hint} +\input{synchronization/lock_owner} +\input{synchronization/simple_lock} +\input{synchronization/nestable_lock} + diff --git a/Chap_tasking.tex b/Chap_tasking.tex index 59e15da..a8b6692 100644 --- a/Chap_tasking.tex +++ b/Chap_tasking.tex @@ -1,5 +1,4 @@ -\pagebreak -\chapter{Tasking} +\cchapter{Tasking}{tasking} \label{chap:tasking} Tasking constructs provide units of work to a thread for execution. @@ -50,3 +49,14 @@ A complete list of the tasking constructs and details of their clauses can be found in the \plc{Tasking Constructs} chapter of the OpenMP Specifications, in the \plc{OpenMP Application Programming Interface} section. + +%===== Examples Sections ===== +\input{tasking/tasking} +\input{tasking/task_priority} +\input{tasking/task_dep} +\input{tasking/task_detach} +\input{tasking/taskgroup} +\input{tasking/taskyield} +\input{tasking/taskloop} +\input{tasking/parallel_masked_taskloop} + diff --git a/Contributions.md b/Contributions.md new file mode 100644 index 0000000..c5a8dcc --- /dev/null +++ b/Contributions.md @@ -0,0 +1,153 @@ +# Contributing + +The usual process for adding new examples, making changes or adding corrections +is to submit an issue for discussion and initial evaluation of changes or example additions. +When there is a consensus at a meeting about the contribution, +you will be asked to submit a pull request. 
+ +Of course, if your contribution is an obvious correction, clarification, or note, you +may want to submit a pull request directly. + +----------------------------------------------------------- + +## The OpenMP Examples document + +The OpenMP Examples document is in LaTeX format. +Please see the master LaTeX file, `openmp-examples.tex`, for more information. + +## Maintainer + +[OpenMP Examples Subcommittee](http://twiki.openmp.org/twiki/bin/view/OpenMPLang/OpenMPExamplesSubCommittee) +For a brief revision history, see `Changes.log` in the repo. + +## Git procedure + + * Fork your own branch of the OpenMP [examples-internal repo](https://github.com/openmp/examples-internal) + * Clone your fork locally + * If you are working on generic or old-version updates, create a branch off master. + * If you are working on an example for a release candidate for version #.#, create a branch off work_#.#. + 1.) `git clone --branch https://github.com//examples-internal` + 2.) `git checkout -b ` + 3.) ... `add`, `commit` + 4.) `git push -u origin ` + 5.) `make` or `make diff` will create a full-document pdf or just a pdf with differences (do this at any point). + * `git status` and `git branch -a` are your friends + * Submit an issue for your work (usually with a diff pdf), and then you will be asked to submit a pull request + * Create an issue by selecting the [issue tab](https://github.com/openmp/examples-internal/issues) and clicking on `new issue`. + * Use this Markdown Cheatsheet for [issue formatting](https://wordpress.com/support/markdown-quick-reference/) + * More Markdown details are available [here](https://markdown-it.github.io) + * You can cut and paste Markdown formatted text in a [reader](https://dillinger.io) to see formatting effects. + * Forced spaces are available in Markdown. On a Mac it is "option+space". + * Polling is available. Go to [gh-poll](https://app.gh-polls.com/). 
Type an option on each line, then click `copy markdown`, and paste the contents into the issue. (Use preview to check your poll, and then submit it.) + * Create a pull request + + +## Processing source code + + * Prepare source code (C/C++ and Fortran) and a text description (use similar styles found in recent examples) + * Determine the *example* name ``, *sequence* number `` and *compiler* suffix `` for the example + * The syntax is: `..` (e.g. `affinity_display.1.f90`) + * The example name may be a Section name (e.g. affinity), or a Subsection name (affinity_display) + * If you are creating a new Chapter, it may be the chapter name. + * New examples are usually added at the end of a Section or Subsection. Number it as the next number in the sequence numbers for examples in that Section or Subsection. + * The compiler suffix `` is `c`, `cpp`, `f`, and `f90` for C, C++ and Fortran codes. + * Insert the code in the sources directory for each chapter, and include the following metadata: + * Metadata Tags for example sources: + ``` + @@name: .[c|cpp|f|f90] + @@type: C|C++|F-fixed|F-free + @@compilable: yes|no|maybe + @@linkable: yes|no|maybe + @@expect: success|failure|nothing|rt-error + @@version: omp_ + ``` + * **name** + is the name of an example + * **type** + is the source code type, which can be translated into or from proper file extension (c,cpp,f,f90) + * **compilable** + indicates whether the source code is compilable + * **linkable** + indicates whether the source code is linkable + * **expect** + indicates some expected result for testing purpose "`success|failure|nothing`" applies + to the result of code compilation "`rt-error`" is for a case where compilation may be + successful, but the code contains potential runtime issues (such as race condition). + Alternative would be to just use "`conforming`" or "`non-conforming`". 
+ * **version** + indicates features for a specific OpenMP version, such as "`omp_5.0`" + + +## Process for text + * Create or update the description text in a Section/Subsection file under each chapter directory, usually `/.tex` + * If adding a new Subsection, just include it in the appropriate subsection file (`.tex`) + * If adding a new Section, create an `
.tex` file and add an entry in the corresponding chapter file, such as `Chap_affinity.tex` + * If adding a new Chapter, create a `Chap_.tex` file with introductory text, and add a new `
.tex` file with text and links to the code. Update `Makefile` and `openmp-examples.tex` to include the new chapter file. + * Commit your changes into your fork of examples-internal + * Submit your issue at [OpenMP Examples internal repo](https://github.com/openmp/examples-internal/issues), and include a PDF when ready. + * Examples subcommittee members can view [meeting schedule and notes](http://twiki.openmp.org/twiki/bin/view/OpenMPLang/ExamplesSchedules) + * Shepherd your issue to acceptance (discussed at weekly Examples meeting and in issue comments) + * When it is in a ready state, you should then submit a pull request. + * It will be reviewed and voted on, and changes may be requested. + * Once the last changes are made, it will be verified and merged into an appropriate branch (either the `master` branch or a working branch). + + + + +# LaTeX macros for examples + +* Source code with language h-rules +``` + \cexample[]{}{} % for C/C++ examples + \cppexample[]{}{} % for C++ examples + \fexample[]{}{} % for fixed-form Fortran examples + \ffreeexample[]{}{} % for free-form Fortran examples +``` + +* Source code without language h-rules +``` + \cnexample[]{}{} + \cppnexample[]{}{} + \fnexample[]{}{} + \ffreenexample[]{}{} + \srcnexample[]{}{}{} +``` + + Optional `` can be supplied in a macro to include a specific OpenMP + version in the example header. This option also suggests one additional + tag (`@@version`) line is included in the corresponding source code. + If this is not the case (i.e., no `@@version` tag line), one needs to + prefix `` with an underscore '\_' symbol in the macro. + + The exception is macro `\srcnexample`, for which the corresponding + source code should not contain any `@@` metadata tags. The `ext` argument + to this macro is the file extension (such as `h`, `hpp`, `inc`). 
+ +* Language h-rules +``` + \cspecificstart, \cspecificend + \cppspecificstart, \cppspecificend + \ccppspecificstart, \ccppspecificend + \fortranspecificstart, \fortranspecificend +``` + +* Chapter and section macros +``` + \cchapter{}{} +``` + +The `\cchapter` macro is used for starting a chapter with proper page spacing. +`` is the name of a chapter and `` is the name +of the chapter directory. All section and subsection files for the chapter +should be placed under ``. The corresponding example sources +should be placed under the `sources` directory inside ``. + +A previously-defined macro `\sinput{}` to import a section +file from `` is no longer supported. Please use +`\input{/}` explicitly. + +* See `openmp.sty` for more information + +### License + +For copyright information, please see `omp_copyright.txt`. diff --git a/Deprecated_Features_Chapt.tex b/Deprecated_Features_Chapt.tex new file mode 100644 index 0000000..de35e00 --- /dev/null +++ b/Deprecated_Features_Chapt.tex @@ -0,0 +1,21 @@ +\bchapter{Deprecated Features} +\label{chap:deprecated_features} + +Deprecation of features began in OpenMP 5.0. +Examples that use a deprecated feature have been updated with an equivalent replacement feature. + +Deprecations affecting examples are the following: +\begin{description}[labelindent=5mm,font=\normalfont] +\item[5.1] -- \ \scode{masked} construct replaces \scode{master} construct. +\item[5.1] -- \ \scode{primary} affinity policy replaces \scode{master} affinity policy. +\item[5.0] -- \ \scode{omp_sync_hint_*} constants replace \scode{omp_lock_hint_*} constants. +\end{description} + +These replacements appear in examples that illustrate, otherwise, earlier features. +When using a compiler that is compliant with a version prior to +the indicated version, the earlier form of +an example is restored by a C-style conditional compilation using the \scode{_OPENMP} macro. 
+ +Since Fortran compilers do not preprocess codes by default, a Fortran preprocessor +flag will be required to compile Fortran examples with the C-style conditional +compilation statements. diff --git a/Examples_Chapt.tex b/Examples_Chapt.tex index ce772ee..f022693 100644 --- a/Examples_Chapt.tex +++ b/Examples_Chapt.tex @@ -1,7 +1,6 @@ - -\chapter*{Examples} +\bchapter{Examples} \label{chap:examples} -\addcontentsline{toc}{chapter}{\protect\numberline{}Examples} + The following are examples of the OpenMP API directives, constructs, and routines. \ccppspecificstart A statement following a directive is compound only when necessary, and a @@ -12,15 +11,14 @@ Each example is labeled as \plc{ename.seqno.ext}, where \plc{ename} is the example name, \plc{seqno} is the sequence number in a section, and \plc{ext} is the source file extension to indicate the code type and source form. \plc{ext} is one of the following: -\begin{compactitem} -\item \plc{c} -- C code, -\item \plc{cpp} -- C++ code, -\item \plc{f} -- Fortran code in fixed form, and -\item \plc{f90} -- Fortran code in free form. -\end{compactitem} +\begin{description}[noitemsep,labelindent=5mm,widest=f90] +\item[\plc{c}] -- \ C code, +\item[\plc{cpp}] -- \ C++ code, +\item[\plc{f}] -- \ Fortran code in fixed form, and +\item[\plc{f90}] -- \ Fortran code in free form. +\end{description} Some of the example labels may include version information (\code{\small{}omp\_\plc{verno}}) to indicate features that are illustrated by an example for a specific OpenMP version, such as ``\plc{scan.1.c} \;(\code{\small{}omp\_5.0}).'' - diff --git a/Examples_master.tex b/Examples_master.tex deleted file mode 100644 index 48e7548..0000000 --- a/Examples_master.tex +++ /dev/null @@ -1,13 +0,0 @@ -\pagebreak -\section{The \code{master} Construct} -\label{sec:master} - -The following example demonstrates the master construct . 
In the example, the master -keeps track of how many iterations have been executed and prints out a progress -report. The other threads skip the master region without waiting. - -\cexample{master}{1} - -\fexample{master}{1} - - diff --git a/Examples_parallel_master_taskloop.tex b/Examples_parallel_master_taskloop.tex deleted file mode 100644 index d88ead4..0000000 --- a/Examples_parallel_master_taskloop.tex +++ /dev/null @@ -1,33 +0,0 @@ -\pagebreak -\section{The \code{parallel master taskloop} Construct} -\label{sec:parallel_master_taskloop} - -In the OpenMP 5.0 Specification several combined constructs containing -the \code{taskloop} construct were added. - -Just as the \code{for} and \code{do} constructs have been combined -with the \code{parallel} construct for convenience, so too, the combined -\code{parallel}~\code{master}~\code{taskloop} and -\code{parallel}~\code{master}~\code{taskloop}~\code{simd} -constructs have been created for convenience. - -In the following example the first \code{taskloop} construct is enclosed -by the usual \code{parallel} and \code{master} constructs to form -a team of threads, and a single task generator (master thread) for -the \code{taskloop} construct. - -The same OpenMP operations for the first taskloop are accomplished by the second -taskloop with the \code{parallel}~\code{master}~\code{taskloop} -combined construct. -The third taskloop uses the combined \code{parallel}~\code{master}~\code{taskloop}~\code{simd} -construct to accomplish the same behavior as closely nested \code{parallel master}, -and \code{taskloop simd} constructs. - -As with any combined construct the clauses of the components may be used -with appropriate restrictions. The combination of the \code{parallel}~\code{master} construct -with the \code{taskloop} or \code{taskloop}~\code{simd} construct produces no additional -restrictions. 
- -\cexample[5.0]{parallel_master_taskloop}{1} - -\ffreeexample[5.0]{parallel_master_taskloop}{1} diff --git a/Examples_target_structure_mapping.tex b/Examples_target_structure_mapping.tex deleted file mode 100644 index 9cb52f1..0000000 --- a/Examples_target_structure_mapping.tex +++ /dev/null @@ -1,54 +0,0 @@ -\pagebreak -\section{Structure mapping} -\label{sec:structure_mapping} - - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -In the example below, only structure elements \plc{S.a}, \plc{S.b} and \plc{S.p} -of the \plc{S} structure appear in \code{map} clauses of a \code{target} construct. -Only these components have corresponding variables and storage on the device. -Hence, the large arrays, \plc{S.buffera} and \plc{S.bufferb}, and the \plc{S.x} component have no storage -on the device and cannot be accessed. - -Also, since the pointer member \plc{S.p} is used in an array section of a -\code{map} clause, the array storage of the array section on the device, -\plc{S.p[:N]}, is \emph{attached} to the pointer member \plc{S.p} on the device. -Explicitly mapping the pointer member \plc{S.p} is optional in this case. - -Note: The buffer arrays and the \plc{x} variable have been grouped together, so that -the components that will reside on the device are all together (without gaps). -This allows the runtime to optimize the transfer and the storage footprint on the device. - -\cexample[5.0]{target_struct_map}{1} - - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -The following example is a slight modification of the above example for -a C++ class. In the member function \plc{SAXPY::driver} -the array section \plc{p[:N]} is \emph{attached} to the pointer member \plc{p} -on the device. 
- -\cppexample[5.0]{target_struct_map}{2} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% - -%In this example a pointer, \plc{p}, is mapped in a -%\code{target}~\code{data} construct (\code{map(p)}) and remains -%persistent throughout the \code{target}~\code{data} region. The address stored -%on the host is not assigned to the device pointer variable, and -%the device value is not copied back to the host at the end of the -%region (for a pointer, it is as though \code{map(alloc:p}) is effectively -%used). The array section, \plc{p[:N]}, is mapped on both \code{target} -%constructs, and the pointer \plc{p} on the device is attached at the -%beginning and detached at the end of the regions to the newly created -%array section on the device. -% -%Also, in the following example the global variable, \plc{a}, becomes -%allocated when it is first used on the device in a \code{target} region, -%and persists on the device for all target regions. The value on the -%device and host may be different, as shown by the print statements. -%The values may be made consistent with the \code{update} construct, -%as shown in the \plc{declare\_target.3.c} and \plc{declare\_target.3.f90} -%examples. -% -%\cexample{target_struct_map}{2} diff --git a/Foreword_Chapt.tex b/Foreword_Chapt.tex index 00b452a..b1af0e3 100644 --- a/Foreword_Chapt.tex +++ b/Foreword_Chapt.tex @@ -1,19 +1,17 @@ -\pagebreak -\chapter*{Foreword} +\bchapter{Foreword} \label{chap:foreword} -\addcontentsline{toc}{chapter}{\protect\numberline{}Foreword} The OpenMP Examples document has been updated with new features -found in the OpenMP 5.0 Specification. The additional examples and updates +found in the OpenMP 5.1 Specification. The additional examples and updates are referenced in the Document Revision History of the Appendix on page~\pageref{chap:history}. 
-Text describing an example with a 5.0 feature specifically states -that the feature support begins in the OpenMP 5.0 Specification. Also, -an \code{\small omp\_5.0} keyword has been added to metadata in the source code. -These distinctions are presented to remind readers that a 5.0 compliant +Text describing an example with a 5.1 feature specifically states +that the feature support begins in the OpenMP 5.1 Specification. Also, +an \code{\small omp\_5.1} keyword is included in the metadata of the source code. +These distinctions are presented to remind readers that a 5.1 compliant OpenMP implementation is necessary to use these features in codes. -Examples for most of the 5.0 features are included in this document, +Examples for most of the 5.1 features are included in this document, and incremental releases will become available as more feature examples and updates are submitted, and approved by the OpenMP Examples Subcommittee. @@ -21,3 +19,5 @@ and updates are submitted, and approved by the OpenMP Examples Subcommittee. 
Examples Subcommitee Co-chairs: \smallskip\linebreak Henry Jin (\textsc{NASA} Ames Research Center) \linebreak Kent Milfeld (\textsc{TACC}, Texas Advanced Research Center) + + diff --git a/History.tex b/History.tex index 2787a36..75e69e9 100644 --- a/History.tex +++ b/History.tex @@ -1,6 +1,73 @@ -\chapter{Document Revision History} +\cchapter{Document Revision History}{history} \label{chap:history} +%===================================== +\section{Changes from 5.0.1 to 5.1} +\label{sec:history_501_to_51} + +\begin{itemize} +\item General changes: +\begin{itemize} + \item Replaced \code{master} construct example with equivalent \code{masked} construct example (\specref{sec:masked}) + \item Primary thread is now used to describe thread number 0 in the current team + \item \code{primary} thread affinity policy is now used to specify that every + thread in the team is assigned to the same place as the primary thread (\specref{subsec:affinity_primary}) + \item The \scode{omp_lock_hint_*} constants have been renamed \scode{omp_sync_hint_*} (\specref{sec:critical}, \specref{sec:locks}) +\end{itemize} + +\item Added the following new chapters: +\begin{itemize} + \item Deprecated Features (on page~\pageref{chap:deprecated_features}) + \item Directive Syntax (\specref{chap:directive_syntax}) + \item Loop Transformations (\specref{chap:loop_transformations}) + \item OMPT Interface (\specref{chap:ompt_interface}) +\end{itemize} + +\item Added the following examples for the 5.1 features: +\begin{itemize} + \item OpenMP directives in C++ \plc{attribute} specifiers + (\specref{sec:attributes}) + \item Directive syntax adjustment to allow Fortran \code{BLOCK} ... 
+ \code{END}~\code{BLOCK} as a structured block + (\specref{sec:fortran_free_format_comments}) + \item \code{omp\_target\_is\_accessible} API routine + (\specref{sec:pointer_mapping}) + \item Fortran allocatable array mapping in \code{target} regions (\specref{sec:fort_allocatable_array_mapping}) + \item \code{begin}~\code{declare}~\code{target} (with + \code{end}~\code{declare}~\code{target}) directive + (\specref{subsec:declare_target_class}) + \item \code{tile} construct (\specref{sec:tile}) + \item \code{unroll} construct (\specref{sec:unroll}) + \item Reduction with the \code{scope} construct + (\specref{subsec:reduction_scope}) + \item \code{metadirective} directive with dynamic \code{condition} selector + (\specref{sec:metadirective}) + \item \code{interop} construct (\specref{sec:interop}) + \item Environment display with the \scode{omp_display_env} routine + (\specref{subsec:display_env}) + \item \code{error} directive (\specref{subsec:error}) +\end{itemize} + +\item Included additional examples for the 5.0 features: +\begin{itemize} + \item \code{collapse} clause for non-rectangular loop nest + (\specref{sec:collapse}) + \item \code{detach} clause for tasks (\specref{sec:task_detachment}) + \item Pointer attachment for a structure member (\specref{sec:structure_mapping}) + \item Host and device pointer association with the \scode{omp_target_associate_ptr} routine (\specref{sec:target_associate_ptr}) + + \item Sample code on activating the tool interface + (\specref{sec:ompt_start}) +\end{itemize} + +\item Added other examples: +\begin{itemize} + \item The \scode{omp_get_wtime} routine (\specref{subsec:get_wtime}) +\end{itemize} +\end{itemize} + + +%===================================== \section{Changes from 5.0.0 to 5.0.1} \label{sec:history_50_to_501} @@ -18,7 +85,7 @@ OpenMP 3.0 and later. 
\item \code{conditional} modifier for the \code{lastprivate} clause (\specref{sec:lastprivate}) \item \code{task} modifier for the \code{reduction} clause (\specref{subsec:task_reduction}) \item Reduction on combined target constructs (\specref{subsec:target_reduction}) -\item Task reduction with target constructs +\item Task reduction with \code{target} constructs (\specref{subsec:target_task_reduction}) \item \code{scan} directive for returning the \emph{prefix sum} of a reduction (\specref{sec:scan}) @@ -59,12 +126,12 @@ in \specref{sec:mem_model}. \item \code{mutexinoutset} task dependences (\specref{subsec:task_dep_mutexinoutset}) \item Multidependence Iterators (in \code{depend} clauses) (\specref{subsec:depend_iterator}) \item Combined constructs: \code{parallel}~\code{master}~\code{taskloop} and \code{parallel}~\code{master}~\code{taskloop}~\code{simd} - (\specref{sec:parallel_master_taskloop}) + (\specref{sec:parallel_masked_taskloop}) \item Reverse Offload through \plc{ancestor} modifier of \code{device} clause. 
(\specref{subsec:target_reverse_offload}) \item Pointer Mapping - behavior of mapped pointers (\specref{sec:pointer_mapping}) %Example_target_ptr_map* \item Structure Mapping - behavior of mapped structures (\specref{sec:structure_mapping}) %Examples_target_structure_mapping.tex target_struct_map* \item Array Shaping with the \plc{shape-operator} (\specref{sec:array-shaping}) -\item The \code{declare}~\code{mapper} construct (\specref{sec:declare_mapper}) +\item The \code{declare}~\code{mapper} directive (\specref{sec:declare_mapper}) \item Acquire and Release Semantics Synchronization: Memory ordering clauses \code{acquire}, \code{release}, and \code{acq\_rel} were added to flush and atomic constructs @@ -150,7 +217,7 @@ Added the following new examples: \item array sections in device constructs (\specref{sec:array_sections}) \item \code{target}~\code{data} construct (\specref{sec:target_data}) \item \code{target}~\code{update} construct (\specref{sec:target_update}) -\item \code{declare}~\code{target} construct (\specref{sec:declare_target}) +\item \code{declare}~\code{target} directive (\specref{sec:declare_target}) \item \code{teams} constructs (\specref{sec:teams}) \item asynchronous execution of a \code{target} region using tasks (\specref{subsec:async_target_with_tasks}) \item device runtime routines (\specref{sec:device}) diff --git a/Introduction_Chapt.tex b/Introduction_Chapt.tex index d042c4a..22e0455 100644 --- a/Introduction_Chapt.tex +++ b/Introduction_Chapt.tex @@ -1,5 +1,5 @@ % This is the introduction for the OpenMP Examples document. -% This is an included file. See the master file (openmp-examples.tex) for more information. +% This is an included file. See the main file (openmp-examples.tex) for more information. % % When editing this file: % @@ -32,9 +32,9 @@ % This is a \plc{var-name}. 
% -\chapter*{Introduction} +\bchapter{Introduction} \label{chap:introduction} -\addcontentsline{toc}{chapter}{\protect\numberline{}Introduction} + This collection of programming examples supplements the OpenMP API for Shared Memory Parallelization specifications, and is not part of the formal specifications. It assumes familiarity with the OpenMP specifications, and shares the typographical @@ -59,7 +59,7 @@ directory at \href{https://github.com/OpenMP/Examples}{https://github.com/OpenMP/Examples}. The codes for this OpenMP \VER{} Examples document have the tag \plc{v\VER}. -%\href{https://github.com/OpenMP/Examples/tree/master/sources}{https://github.com/OpenMP/Examples/sources}. +%\href{https://github.com/OpenMP/Examples/tree/main/sources}{https://github.com/OpenMP/Examples/sources}. Complete information about the OpenMP API and a list of the compilers that support the OpenMP API can be found at the OpenMP.org web site diff --git a/Makefile b/Makefile index a5522ff..376edae 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ # Makefile for the OpenMP Examples document in LaTex format. -# For more information, see the master document, openmp-examples.tex. +# For more information, see the main document, openmp-examples.tex. 
-version=5.0.1 +version=5.1 default: openmp-examples.pdf diff: openmp-diff-abridged.pdf @@ -9,13 +9,16 @@ diff: openmp-diff-abridged.pdf CHAPTERS=Title_Page.tex \ Foreword_Chapt.tex \ Introduction_Chapt.tex \ - Examples_*.tex \ - History.tex + Examples_Chapt.tex \ + Deprecated_Features_Chapt.tex \ + Chap_*.tex \ + History.tex \ + */*.tex -SOURCES=sources/*.c \ - sources/*.cpp \ - sources/*.f90 \ - sources/*.f +SOURCES=*/sources/*.c \ + */sources/*.cpp \ + */sources/*.f90 \ + */sources/*.f INTERMEDIATE_FILES=openmp-examples.pdf \ openmp-examples.toc \ @@ -49,11 +52,11 @@ endif ifdef DIFF_FROM VC_DIFF_FROM := -r ${DIFF_FROM} else - VC_DIFF_FROM := -r master + VC_DIFF_FROM := -r work_5.1 endif DIFF_TO:=HEAD -DIFF_FROM:=master +DIFF_FROM:=work_5.1 DIFF_TYPE:=UNDERLINE COMMON_DIFF_OPTS:=--math-markup=whole \ @@ -66,6 +69,10 @@ VC_DIFF_MINIMAL_OPTS:= --only-changes --force %.tmpdir: $(wildcard *.sty) $(wildcard *.png) $(wildcard *.aux) openmp-examples.pdf mkdir -p $@/sources + for i in affinity devices loop_transformations parallel_execution SIMD tasking \ + data_environment memory_model program_control synchronization \ + directives ompt_interface; do \ + mkdir -p $@/$$i; ln -sf "$$PWD"/$$i/sources $@/$$i/sources; done mkdir -p $@/figs cp -f $^ "$@/" cp -f sources/* "$@/sources" diff --git a/README b/README deleted file mode 100644 index ff85f04..0000000 --- a/README +++ /dev/null @@ -1,76 +0,0 @@ -This is the OpenMP Examples document in LaTeX format. -Please see the master file, openmp-examples.tex, for more information. - -For a brief revision history, please see Changes.log. - -For copyright information, please see omp_copyright.txt. 
- - -1) Process for adding an example - - - Prepare source code and text description - - Give a high level description in a trac ticket - - Determine a name (ename) for the example - - Propose a new name if creating a new chapter - - Use the existing name if adding to an existing chapter - - Number the example within the chapter (seq-no) - - Create files for the source code with proper tags in - sources/Example_.c.c - sources/Example_.f.f - - Create or update the description text in the chapter file - Examples_,tex - - If needed, add the new chapter file name in - Makefile - openmp-examples.tex - - Commit the changes in git and push to the GitHub repo - - Discuss and vote in committee - -2) Tags (meta data) for example sources - - @@name: .[c|cpp|f|f90] - @@type: C|C++|F-fixed|F-free - @@compilable: yes|no|maybe - @@linkable: yes|no|maybe - @@expect: success|failure|nothing|rt-error - @@version: omp_ - - "name" is the name of an example - "type" is the source code type, which can be translated into or from - proper file extension (c,cpp,f,f90) - "compilable" indicates whether the source code is compilable - "linkable" indicates whether the source code is linkable - "expect" indicates some expected result for testing purpose - "success|failure|nothing" applies to the result of code compilation - "rt-error" is for a case where compilation may be successful, - but the code contains potential runtime issues (such as race condition). - Alternative would be to just use "conforming" or "non-conforming". 
- "version" indicates features for a specific OpenMP version, such as "omp_5.0" - -3) LaTeX macros for examples - -- Source code with language h-rules - \cexample[]{}{} % for C/C++ examples - \cppexample[]{}{} % for C++ examples - \fexample[]{}{} % for fixed-form Fortran examples - \ffreeexample[]{}{} % for free-form Fortran examples - -- Source code without language h-rules - \cnexample[]{}{} - \cppnexample[]{}{} - \fnexample[]{}{} - \ffreenexample[]{}{} - - Optional can be supplied in a macro to include a specific OpenMP - version in the example header. This option also suggests one additional - tag (@@version) line is included in the corresponding source code. - If this is not the case (i.e., no @@version tag line), one needs to - prefix with an underscore '_' symbol in the macro. - -- Language h-rules - \cspecificstart, \cspecificend - \cppspecificstart, \cppspecificend - \ccppspecificstart, \ccppspecificend - \fortranspecificstart, \fortranspecificend - -- See openmp.sty for more information - diff --git a/README.md b/README.md new file mode 100644 index 0000000..be9c5fe --- /dev/null +++ b/README.md @@ -0,0 +1,10 @@ +# OpenMP Examples Document + +This is the OpenMP Examples document in LaTeX format. + +Please see [Contributions.md](Contributions.md) on how to make contributions to adding new examples. + +For a brief revision history, please see [Changes.log](Changes.log). + +For copyright information, please see [omp_copyright.txt](omp_copyright.txt). 
+ diff --git a/Examples_SIMD.tex b/SIMD/SIMD.tex similarity index 93% rename from Examples_SIMD.tex rename to SIMD/SIMD.tex index fd1f29e..071b719 100644 --- a/Examples_SIMD.tex +++ b/SIMD/SIMD.tex @@ -1,5 +1,5 @@ %\pagebreak -\section{\code{simd} and \code{declare} \code{simd} Constructs} +\section{\code{simd} and \code{declare} \code{simd} Directives} \label{sec:SIMD} The following example illustrates the basic use of the \code{simd} construct @@ -8,29 +8,27 @@ to assure the compiler that the loop can be vectorized. \cexample[4.0]{SIMD}{1} \ffreeexample[4.0]{SIMD}{1} - -\clearpage When a function can be inlined within a loop the compiler has an opportunity to vectorize the loop. By guaranteeing SIMD behavior of a function's operations, characterizing the arguments of the function and privatizing temporary variables of the loop, the compiler can often create faster, vector code for -the loop. In the examples below the \code{declare} \code{simd} construct is +the loop. In the examples below the \code{declare} \code{simd} directive is used on the \plc{add1} and \plc{add2} functions to enable creation of their corresponding SIMD function versions for execution within the associated SIMD loop. The functions characterize two different approaches of accessing data within the function: by a single variable and as an element in a data array, respectively. The \plc{add3} C function uses dereferencing. -The \code{declare} \code{simd} constructs also illustrate the use of +The \code{declare} \code{simd} directives also illustrate the use of \code{uniform} and \code{linear} clauses. The \code{uniform(fact)} clause indicates that the variable \plc{fact} is invariant across the SIMD lanes. In the \plc{add2} function \plc{a} and \plc{b} are included in the \code{uniform} list because the C pointer and the Fortran array references are constant. 
The \plc{i} index used in the \plc{add2} function is included in a \code{linear} clause with a constant-linear-step of 1, to guarantee a unity increment of the -associated loop. In the \code{declare} \code{simd} construct for the \plc{add3} +associated loop. In the \code{declare} \code{simd} directive for the \plc{add3} C function the \code{linear(a,b:1)} clause instructs the compiler to generate unit-stride loads across the SIMD lanes; otherwise, costly \emph{gather} instructions would be generated for the unknown sequence of access of the @@ -44,7 +42,7 @@ variable. \ffreeexample[4.0]{SIMD}{2} -\pagebreak +%\pagebreak A thread that encounters a SIMD construct executes a vectorized code of the iterations. Similar to the concerns of a worksharing loop a loop vectorized with a SIMD construct must assure that temporary and reduction variables are @@ -57,7 +55,7 @@ construct. \ffreeexample[4.0]{SIMD}{3} -\pagebreak +%\pagebreak A \code{safelen(N)} clause in a \code{simd} construct assures the compiler that there are no loop-carried dependencies for vectors of size \plc{N} or below. If the \code{safelen} clause is not specified, then the default safelen value is @@ -72,7 +70,7 @@ than 16, the behavior is undefined. \ffreeexample[4.0]{SIMD}{4} -\pagebreak +%\pagebreak The following SIMD construct instructs the compiler to collapse the \plc{i} and \plc{j} loops into a single SIMD loop in which SIMD chunks are executed by threads of the team. Within the workshared loop chunks of a thread, the SIMD @@ -88,7 +86,7 @@ chunks are executed in the lanes of the vector units. \label{sec:SIMD_branch} The following examples illustrate the use of the \code{declare} \code{simd} -construct with the \code{inbranch} and \code{notinbranch} clauses. The +directive with the \code{inbranch} and \code{notinbranch} clauses. The \code{notinbranch} clause informs the compiler that the function \plc{foo} is never called conditionally in the SIMD loop of the function \plc{myaddint}. 
On the other hand, the \code{inbranch} clause for the function goo indicates that diff --git a/Examples_linear_modifier.tex b/SIMD/linear_modifier.tex similarity index 100% rename from Examples_linear_modifier.tex rename to SIMD/linear_modifier.tex diff --git a/sources/Example_SIMD.1.c b/SIMD/sources/SIMD.1.c similarity index 100% rename from sources/Example_SIMD.1.c rename to SIMD/sources/SIMD.1.c diff --git a/sources/Example_SIMD.1.f90 b/SIMD/sources/SIMD.1.f90 similarity index 100% rename from sources/Example_SIMD.1.f90 rename to SIMD/sources/SIMD.1.f90 diff --git a/sources/Example_SIMD.2.c b/SIMD/sources/SIMD.2.c similarity index 100% rename from sources/Example_SIMD.2.c rename to SIMD/sources/SIMD.2.c diff --git a/sources/Example_SIMD.2.f90 b/SIMD/sources/SIMD.2.f90 similarity index 100% rename from sources/Example_SIMD.2.f90 rename to SIMD/sources/SIMD.2.f90 diff --git a/sources/Example_SIMD.3.c b/SIMD/sources/SIMD.3.c similarity index 100% rename from sources/Example_SIMD.3.c rename to SIMD/sources/SIMD.3.c diff --git a/sources/Example_SIMD.3.f90 b/SIMD/sources/SIMD.3.f90 similarity index 100% rename from sources/Example_SIMD.3.f90 rename to SIMD/sources/SIMD.3.f90 diff --git a/sources/Example_SIMD.4.c b/SIMD/sources/SIMD.4.c similarity index 100% rename from sources/Example_SIMD.4.c rename to SIMD/sources/SIMD.4.c diff --git a/sources/Example_SIMD.4.f90 b/SIMD/sources/SIMD.4.f90 similarity index 100% rename from sources/Example_SIMD.4.f90 rename to SIMD/sources/SIMD.4.f90 diff --git a/sources/Example_SIMD.5.c b/SIMD/sources/SIMD.5.c similarity index 100% rename from sources/Example_SIMD.5.c rename to SIMD/sources/SIMD.5.c diff --git a/sources/Example_SIMD.5.f90 b/SIMD/sources/SIMD.5.f90 similarity index 100% rename from sources/Example_SIMD.5.f90 rename to SIMD/sources/SIMD.5.f90 diff --git a/sources/Example_SIMD.6.c b/SIMD/sources/SIMD.6.c similarity index 100% rename from sources/Example_SIMD.6.c rename to SIMD/sources/SIMD.6.c diff --git 
a/sources/Example_SIMD.6.f90 b/SIMD/sources/SIMD.6.f90 similarity index 100% rename from sources/Example_SIMD.6.f90 rename to SIMD/sources/SIMD.6.f90 diff --git a/sources/Example_SIMD.7.c b/SIMD/sources/SIMD.7.c similarity index 100% rename from sources/Example_SIMD.7.c rename to SIMD/sources/SIMD.7.c diff --git a/sources/Example_SIMD.7.f90 b/SIMD/sources/SIMD.7.f90 similarity index 100% rename from sources/Example_SIMD.7.f90 rename to SIMD/sources/SIMD.7.f90 diff --git a/sources/Example_SIMD.8.c b/SIMD/sources/SIMD.8.c similarity index 100% rename from sources/Example_SIMD.8.c rename to SIMD/sources/SIMD.8.c diff --git a/sources/Example_SIMD.8.f90 b/SIMD/sources/SIMD.8.f90 similarity index 100% rename from sources/Example_SIMD.8.f90 rename to SIMD/sources/SIMD.8.f90 diff --git a/sources/Example_linear_modifier.1.cpp b/SIMD/sources/linear_modifier.1.cpp similarity index 100% rename from sources/Example_linear_modifier.1.cpp rename to SIMD/sources/linear_modifier.1.cpp diff --git a/sources/Example_linear_modifier.1.f90 b/SIMD/sources/linear_modifier.1.f90 similarity index 100% rename from sources/Example_linear_modifier.1.f90 rename to SIMD/sources/linear_modifier.1.f90 diff --git a/sources/Example_linear_modifier.2.cpp b/SIMD/sources/linear_modifier.2.cpp similarity index 100% rename from sources/Example_linear_modifier.2.cpp rename to SIMD/sources/linear_modifier.2.cpp diff --git a/sources/Example_linear_modifier.2.f90 b/SIMD/sources/linear_modifier.2.f90 similarity index 100% rename from sources/Example_linear_modifier.2.f90 rename to SIMD/sources/linear_modifier.2.f90 diff --git a/sources/Example_linear_modifier.3.c b/SIMD/sources/linear_modifier.3.c similarity index 100% rename from sources/Example_linear_modifier.3.c rename to SIMD/sources/linear_modifier.3.c diff --git a/sources/Example_linear_modifier.3.f90 b/SIMD/sources/linear_modifier.3.f90 similarity index 100% rename from sources/Example_linear_modifier.3.f90 rename to SIMD/sources/linear_modifier.3.f90 
diff --git a/Title_Page.tex b/Title_Page.tex index 0520fce..b43a169 100644 --- a/Title_Page.tex +++ b/Title_Page.tex @@ -27,7 +27,7 @@ Source codes for OpenMP \PVER{} Examples can be downloaded from \href{https://github.com/OpenMP/Examples/tree/v\VER}{github}.\\ \begin{adjustwidth}{0pt}{1em}\setlength{\parskip}{0.25\baselineskip}% -Copyright © 1997-2020 OpenMP Architecture Review Board.\\ +Copyright \copyright{} 1997-2021 OpenMP Architecture Review Board.\\ Permission to copy without fee all or part of this material is granted, provided the OpenMP Architecture Review Board copyright notice and the title of this document appear. Notice is given that copying is by @@ -37,14 +37,11 @@ permission of OpenMP Architecture Review Board.\end{adjustwidth} % Blank page -\clearpage -\thispagestyle{empty} -\phantom{a} -\emph{This page intentionally left blank} +\cleardoublepage %For final version, uncomment the line above, comment out the lines below %This working version enacted the following tickets: 287, 519, 550, 593, %674, 688, 689, %and a few other editorial changes. -\vfill +%\vfill diff --git a/Examples_affinity.tex b/affinity/affinity.tex similarity index 82% rename from Examples_affinity.tex rename to affinity/affinity.tex index be0c94c..c69a3b9 100644 --- a/Examples_affinity.tex +++ b/affinity/affinity.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{proc\_bind} Clause} +\section{\code{proc\_bind} Clause} \label{sec:affinity} The following examples demonstrate how to use the \code{proc\_bind} clause to @@ -38,8 +38,8 @@ above. Note that the threads are bound to the first place of each subpartition. \fexample[4.0]{affinity}{1} -It is unspecified on which place the master thread is initially started. If the -master thread is initially started on p0, the following placement of threads will +It is unspecified on which place the primary thread is initially started. 
If the +primary thread is initially started on p0, the following placement of threads will be applied in the parallel region: \begin{compactitem} @@ -53,7 +53,7 @@ be applied in the parallel region: \end{compactitem} -If the master thread would initially be started on p2, the placement of threads +If the primary thread would initially be started on p2, the placement of threads and distribution of the place partition would be as follows: \begin{compactitem} @@ -71,7 +71,7 @@ the number of threads is greater than the number of places in the parent's place partition. Let \plc{T} be the number of threads in the team, and \plc{P} be the number of places in the -parent's place partition. The first \plc{T/P} threads of the team (including the master +parent's place partition. The first \plc{T/P} threads of the team (including the primary thread) execute on the parent's place. The next \plc{T/P} threads execute on the next place in the place partition, and so on, with wrap around. @@ -79,8 +79,8 @@ place in the place partition, and so on, with wrap around. \ffreeexample[4.0]{affinity}{2} -It is unspecified on which place the master thread is initially started. If the -master thread is initially started on p0, the following placement of threads will +It is unspecified on which place the primary thread is initially started. If the +primary thread is initially started on p0, the following placement of threads will be applied in the parallel region: \begin{compactitem} @@ -101,7 +101,7 @@ be applied in the parallel region: \item threads 14,15 execute on p7 with the place partition p7 \end{compactitem} -If the master thread would initially be started on p2, the placement of threads +If the primary thread would initially be started on p2, the placement of threads and distribution of the place partition would be as follows: \begin{compactitem} @@ -134,8 +134,8 @@ The place partition is not changed by the \code{close} policy. 
\fexample[4.0]{affinity}{3} -It is unspecified on which place the master thread is initially started. If the -master thread is initially started on p0, the following placement of threads will +It is unspecified on which place the primary thread is initially started. If the +primary thread is initially started on p0, the following placement of threads will be applied in the \code{parallel} region: \begin{compactitem} @@ -148,7 +148,7 @@ be applied in the \code{parallel} region: \item thread 3 executes on p3 with the place partition p0-p7 \end{compactitem} -If the master thread would initially be started on p2, the placement of threads +If the primary thread would initially be started on p2, the placement of threads and distribution of the place partition would be as follows: \begin{compactitem} @@ -166,7 +166,7 @@ the number of threads is greater than the number of places in the parent's place partition. Let \plc{T} be the number of threads in the team, and \plc{P} be the number of places in the -parent's place partition. The first \plc{T/P} threads of the team (including the master +parent's place partition. The first \plc{T/P} threads of the team (including the primary thread) execute on the parent's place. The next \plc{T/P} threads execute on the next place in the place partition, and so on, with wrap around. The place partition is not changed by the \code{close} policy. @@ -175,8 +175,8 @@ is not changed by the \code{close} policy. \ffreeexample[4.0]{affinity}{4} -It is unspecified on which place the master thread is initially started. If the -master thread is initially running on p0, the following placement of threads will +It is unspecified on which place the primary thread is initially started. 
If the +primary thread is initially running on p0, the following placement of threads will be applied in the parallel region: \begin{compactitem} @@ -197,7 +197,7 @@ be applied in the parallel region: \item threads 14,15 execute on p7 with the place partition p0-p7 \end{compactitem} -If the master thread would initially be started on p2, the placement of threads +If the primary thread would initially be started on p2, the placement of threads and distribution of the place partition would be as follows: \begin{compactitem} @@ -218,26 +218,27 @@ and distribution of the place partition would be as follows: \item threads 14,15 execute on p1 with the place partition p0-p7 \end{compactitem} -\subsection{Master Affinity Policy} -\label{subsec:affinity_master} +\subsection{Primary Affinity Policy} +\label{subsec:affinity_primary} -The following example shows the result of the \code{master} affinity policy on +The following example shows the result of the \code{primary} affinity policy on the partition list for the machine architecture depicted above. The place partition -is not changed by the master policy. +is not changed by the primary policy. \cexample[4.0]{affinity}{5} -\fexample[4.0]{affinity}{5} +\fexample[4.0]{affinity}{5}[1] +\clearpage -It is unspecified on which place the master thread is initially started. If the -master thread is initially running on p0, the following placement of threads will +It is unspecified on which place the primary thread is initially started. 
If the +primary thread is initially running on p0, the following placement of threads will be applied in the parallel region: \begin{compactitem} \item threads 0-3 execute on p0 with the place partition p0-p7 \end{compactitem} -If the master thread would initially be started on p2, the placement of threads +If the primary thread would initially be started on p2, the placement of threads and distribution of the place partition would be as follows: \begin{compactitem} diff --git a/Examples_affinity_display.tex b/affinity/affinity_display.tex similarity index 97% rename from Examples_affinity_display.tex rename to affinity/affinity_display.tex index 10aa0dc..3cda660 100644 --- a/Examples_affinity_display.tex +++ b/affinity/affinity_display.tex @@ -12,9 +12,9 @@ at selected locations within code. For the first example the environment variable \code{OMP\_DISPLAY\_AFFINITY} has been set to \code{TRUE}, and execution occurs on an 8-core system with \code{OMP\_NUM\_THREADS} set to 8. -The affinity for the master thread is reported through a call to the API +The affinity for the primary thread is reported through a call to the API \code{omp\_display\_affinity()} routine. For default affinity settings -the report shows that the master thread can execute on any of the cores. +the report shows that the primary thread can execute on any of the cores. In the following parallel region the affinity for each of the team threads is reported automatically since the \code{OMP\_DISPLAY\_AFFINITY} environment variable has been set to \code{TRUE}. 
diff --git a/Examples_affinity_query.tex b/affinity/affinity_query.tex similarity index 100% rename from Examples_affinity_query.tex rename to affinity/affinity_query.tex diff --git a/sources/Example_affinity.1.c b/affinity/sources/affinity.1.c similarity index 92% rename from sources/Example_affinity.1.c rename to affinity/sources/affinity.1.c index dfb7e68..8ab77ba 100644 --- a/sources/Example_affinity.1.c +++ b/affinity/sources/affinity.1.c @@ -2,7 +2,7 @@ * @@name: affinity.1c * @@type: C * @@compilable: yes -* @@linkable: yes +* @@linkable: no * @@expect: success * @@version: omp_4.0 */ diff --git a/sources/Example_affinity.1.f b/affinity/sources/affinity.1.f similarity index 92% rename from sources/Example_affinity.1.f rename to affinity/sources/affinity.1.f index 3f37bbf..8bb0b48 100644 --- a/sources/Example_affinity.1.f +++ b/affinity/sources/affinity.1.f @@ -1,7 +1,7 @@ ! @@name: affinity.1f ! @@type: F-fixed ! @@compilable: yes -! @@linkable: yes +! @@linkable: no ! @@expect: success ! 
@@version: omp_4.0 PROGRAM EXAMPLE diff --git a/sources/Example_affinity.2.c b/affinity/sources/affinity.2.c similarity index 100% rename from sources/Example_affinity.2.c rename to affinity/sources/affinity.2.c diff --git a/sources/Example_affinity.2.f90 b/affinity/sources/affinity.2.f90 similarity index 100% rename from sources/Example_affinity.2.f90 rename to affinity/sources/affinity.2.f90 diff --git a/sources/Example_affinity.3.c b/affinity/sources/affinity.3.c similarity index 92% rename from sources/Example_affinity.3.c rename to affinity/sources/affinity.3.c index c35a063..87e9524 100644 --- a/sources/Example_affinity.3.c +++ b/affinity/sources/affinity.3.c @@ -2,7 +2,7 @@ * @@name: affinity.3c * @@type: C * @@compilable: yes -* @@linkable: yes +* @@linkable: no * @@expect: success * @@version: omp_4.0 */ diff --git a/sources/Example_affinity.3.f b/affinity/sources/affinity.3.f similarity index 92% rename from sources/Example_affinity.3.f rename to affinity/sources/affinity.3.f index 12c8225..a9225a7 100644 --- a/sources/Example_affinity.3.f +++ b/affinity/sources/affinity.3.f @@ -1,7 +1,7 @@ ! @@name: affinity.3f ! @@type: F-fixed ! @@compilable: yes -! @@linkable: yes +! @@linkable: no ! @@expect: success ! 
@@version: omp_4.0 PROGRAM EXAMPLE diff --git a/sources/Example_affinity.4.c b/affinity/sources/affinity.4.c similarity index 100% rename from sources/Example_affinity.4.c rename to affinity/sources/affinity.4.c diff --git a/sources/Example_affinity.4.f90 b/affinity/sources/affinity.4.f90 similarity index 100% rename from sources/Example_affinity.4.f90 rename to affinity/sources/affinity.4.f90 diff --git a/affinity/sources/affinity.5.c b/affinity/sources/affinity.5.c new file mode 100644 index 0000000..b3916b8 --- /dev/null +++ b/affinity/sources/affinity.5.c @@ -0,0 +1,21 @@ +/* +* @@name: affinity.5c +* @@type: C +* @@compilable: yes +* @@linkable: no +* @@expect: success +* @@version: omp_5.1 +*/ +#if _OPENMP < 202011 +#define primary master +#endif + +void work(); +int main() +{ +#pragma omp parallel proc_bind(primary) num_threads(4) + { + work(); + } + return 0; +} diff --git a/affinity/sources/affinity.5.f b/affinity/sources/affinity.5.f new file mode 100644 index 0000000..6b85909 --- /dev/null +++ b/affinity/sources/affinity.5.f @@ -0,0 +1,16 @@ +! @@name: affinity.5f +! @@type: F-fixed +! @@compilable: yes +! @@requires: preprocessing +! @@linkable: no +! @@expect: success +! 
@@version: omp_5.1 +#if _OPENMP < 202011 +#define primary master +#endif + + PROGRAM EXAMPLE +!$OMP PARALLEL PROC_BIND(primary) NUM_THREADS(4) + CALL WORK() +!$OMP END PARALLEL + END PROGRAM EXAMPLE diff --git a/sources/Example_affinity.6.c b/affinity/sources/affinity.6.c similarity index 99% rename from sources/Example_affinity.6.c rename to affinity/sources/affinity.6.c index 05f102e..061af9c 100644 --- a/sources/Example_affinity.6.c +++ b/affinity/sources/affinity.6.c @@ -6,7 +6,6 @@ * @@expect: success * @@version: omp_5.0 */ - double * alloc_init_B(double *A, int N); void compute_on_B(double *B, int N); diff --git a/sources/Example_affinity.6.f90 b/affinity/sources/affinity.6.f90 similarity index 99% rename from sources/Example_affinity.6.f90 rename to affinity/sources/affinity.6.f90 index 699d29d..47bf55e 100644 --- a/sources/Example_affinity.6.f90 +++ b/affinity/sources/affinity.6.f90 @@ -4,7 +4,6 @@ ! @@linkable: no ! @@expect: success ! @@version: omp_5.0 - subroutine task_affinity(A, N) external alloc_init_B diff --git a/sources/Example_affinity_display.1.c b/affinity/sources/affinity_display.1.c similarity index 99% rename from sources/Example_affinity_display.1.c rename to affinity/sources/affinity_display.1.c index 9e99456..8a0a98a 100644 --- a/sources/Example_affinity_display.1.c +++ b/affinity/sources/affinity_display.1.c @@ -11,7 +11,7 @@ int main(void){ //MAX threads = 8, single socket system - omp_display_affinity(NULL); //API call-- Displays Affinity of Master Thread + omp_display_affinity(NULL); //API call-- Displays Affinity of Primary Thread // API CALL OUTPUT (default format): //team_num= 0, nesting_level= 0, thread_num= 0, thread_affinity= 0,1,2,3,4,5,6,7 diff --git a/sources/Example_affinity_display.1.f90 b/affinity/sources/affinity_display.1.f90 similarity index 95% rename from sources/Example_affinity_display.1.f90 rename to affinity/sources/affinity_display.1.f90 index cea3c23..6a1957e 100644 --- a/sources/Example_affinity_display.1.f90 
+++ b/affinity/sources/affinity_display.1.f90 @@ -4,17 +4,16 @@ ! @@linkable: yes ! @@expect: success ! @@version: omp_5.0 - program affinity_display ! MAX threads = 8, single socket system use omp_lib implicit none character(len=0) :: null - call omp_display_affinity(null) !API call- Displays Affinity of Master Thread + call omp_display_affinity(null) !API call- Displays Affinity of Primary Thrd ! API CALL OUTPUT (default format): -! team_num= 0, nesting_level= 0, thread_num= 0, thread_affinity= 0,1,2,3,4,5,6,7 +!team_num= 0, nesting_level= 0, thread_num= 0, thread_affinity= 0,1,2,3,4,5,6,7 ! OMP_DISPLAY_AFFINITY=TRUE, OMP_NUM_THREADS=8 diff --git a/sources/Example_affinity_display.2.c b/affinity/sources/affinity_display.2.c similarity index 100% rename from sources/Example_affinity_display.2.c rename to affinity/sources/affinity_display.2.c diff --git a/sources/Example_affinity_display.2.f90 b/affinity/sources/affinity_display.2.f90 similarity index 99% rename from sources/Example_affinity_display.2.f90 rename to affinity/sources/affinity_display.2.f90 index 18cdb42..f274473 100644 --- a/sources/Example_affinity_display.2.f90 +++ b/affinity/sources/affinity_display.2.f90 @@ -4,7 +4,6 @@ ! @@linkable: yes ! @@expect: success ! @@version: omp_5.0 - program affinity_display use omp_lib diff --git a/sources/Example_affinity_display.3.c b/affinity/sources/affinity_display.3.c similarity index 100% rename from sources/Example_affinity_display.3.c rename to affinity/sources/affinity_display.3.c diff --git a/sources/Example_affinity_display.3.f90 b/affinity/sources/affinity_display.3.f90 similarity index 99% rename from sources/Example_affinity_display.3.f90 rename to affinity/sources/affinity_display.3.f90 index 46800a4..a262411 100644 --- a/sources/Example_affinity_display.3.f90 +++ b/affinity/sources/affinity_display.3.f90 @@ -4,7 +4,6 @@ ! @@linkable: yes ! @@expect: success ! 
@@version: omp_5.0 - program affinity_display use omp_lib implicit none diff --git a/sources/Example_affinity_query.1.c b/affinity/sources/affinity_query.1.c similarity index 100% rename from sources/Example_affinity_query.1.c rename to affinity/sources/affinity_query.1.c diff --git a/sources/Example_affinity_query.1.f90 b/affinity/sources/affinity_query.1.f90 similarity index 99% rename from sources/Example_affinity_query.1.f90 rename to affinity/sources/affinity_query.1.f90 index 86bcba0..96dbca6 100644 --- a/sources/Example_affinity_query.1.f90 +++ b/affinity/sources/affinity_query.1.f90 @@ -4,7 +4,6 @@ ! @@linkable: no ! @@expect: success ! @@version: omp_4.5 - subroutine socket_init(socket_num) use omp_lib integer :: socket_num, n_procs diff --git a/Examples_task_affinity.tex b/affinity/task_affinity.tex similarity index 100% rename from Examples_task_affinity.tex rename to affinity/task_affinity.tex diff --git a/Examples_associate.tex b/data_environment/associate.tex similarity index 99% rename from Examples_associate.tex rename to data_environment/associate.tex index 5d7899c..339b121 100644 --- a/Examples_associate.tex +++ b/data_environment/associate.tex @@ -27,6 +27,7 @@ The association between \plc{u} and the original \plc{v} is retained (see the Da Attribute Rules section in the OpenMP 4.0 API Specifications). Inside the \code{parallel} region, \plc{v} has the value of -1 and \plc{u} has the value of the original \plc{v}. 
+\pagebreak \ffreenexample[4.0]{associate}{3} \fortranspecificend diff --git a/Examples_carrays_fpriv.tex b/data_environment/carrays_fpriv.tex similarity index 100% rename from Examples_carrays_fpriv.tex rename to data_environment/carrays_fpriv.tex diff --git a/Examples_copyin.tex b/data_environment/copyin.tex similarity index 84% rename from Examples_copyin.tex rename to data_environment/copyin.tex index ada9a5a..b7a9b9e 100644 --- a/Examples_copyin.tex +++ b/data_environment/copyin.tex @@ -1,9 +1,9 @@ \pagebreak -\section{The \code{copyin} Clause} +\section{\code{copyin} Clause} \label{sec:copyin} The \code{copyin} clause is used to initialize threadprivate data upon entry -to a \code{parallel} region. The value of the threadprivate variable in the master +to a \code{parallel} region. The value of the threadprivate variable in the primary thread is copied to the threadprivate variable of each other team member. \cexample{copyin}{1} diff --git a/Examples_copyprivate.tex b/data_environment/copyprivate.tex similarity index 91% rename from Examples_copyprivate.tex rename to data_environment/copyprivate.tex index d6ccf66..abf739a 100644 --- a/Examples_copyprivate.tex +++ b/data_environment/copyprivate.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{copyprivate} Clause} +\section{\code{copyprivate} Clause} \label{sec:copyprivate} The \code{copyprivate} clause can be used to broadcast values acquired by a single @@ -20,14 +20,14 @@ any of the threads have left the barrier at the end of the construct. \fexample{copyprivate}{1} -In this example, assume that the input must be performed by the master thread. -Since the \code{master} construct does not support the \code{copyprivate} clause, +In this example, assume that the input must be performed by the primary thread. +Since the \code{masked} construct does not support the \code{copyprivate} clause, it cannot broadcast the input value that is read. 
However, \code{copyprivate} -is used to broadcast an address where the input value is stored. +is used to broadcast an address where the input value is stored. -\cexample{copyprivate}{2} +\cexample[5.1]{copyprivate}{2} -\fexample{copyprivate}{2} +\fexample[5.1]{copyprivate}{2}[1] Suppose that the number of lock variables required within a \code{parallel} region cannot easily be determined prior to entering it. The \code{copyprivate} clause diff --git a/Examples_cpp_reference.tex b/data_environment/cpp_reference.tex similarity index 100% rename from Examples_cpp_reference.tex rename to data_environment/cpp_reference.tex diff --git a/Examples_default_none.tex b/data_environment/default_none.tex similarity index 93% rename from Examples_default_none.tex rename to data_environment/default_none.tex index 2b4dd5b..0129b88 100644 --- a/Examples_default_none.tex +++ b/data_environment/default_none.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{default(none)} Clause} +\section{\code{default(none)} Clause} \label{sec:default_none} The following example distinguishes the variables that are affected by the \code{default(none)} diff --git a/Examples_fort_loopvar.tex b/data_environment/fort_loopvar.tex similarity index 100% rename from Examples_fort_loopvar.tex rename to data_environment/fort_loopvar.tex diff --git a/Examples_fort_sa_private.tex b/data_environment/fort_sa_private.tex similarity index 92% rename from Examples_fort_sa_private.tex rename to data_environment/fort_sa_private.tex index 2333b34..1ee270b 100644 --- a/Examples_fort_sa_private.tex +++ b/data_environment/fort_sa_private.tex @@ -9,6 +9,7 @@ clause rules with regard to storage association. \fnexample{fort_sa_private}{1} \fnexample{fort_sa_private}{2} +\clearpage \fnexample{fort_sa_private}{3} % blue line floater at top of this page for "Fortran, cont." @@ -18,6 +19,6 @@ clause rules with regard to storage association. 
\fnexample{fort_sa_private}{4} -\fnexample{fort_sa_private}{5} +\fnexample[5.1]{fort_sa_private}{5} \fortranspecificend diff --git a/Examples_fort_sp_common.tex b/data_environment/fort_sp_common.tex similarity index 98% rename from Examples_fort_sp_common.tex rename to data_environment/fort_sp_common.tex index 0d56381..712ea67 100644 --- a/Examples_fort_sp_common.tex +++ b/data_environment/fort_sp_common.tex @@ -19,6 +19,7 @@ The following example is also conforming: %\begin{figure}[t!] %\linewitharrows{-1}{dashed}{Fortran (cont.)}{8em} %\end{figure} +\clearpage The following example is conforming: diff --git a/Examples_lastprivate.tex b/data_environment/lastprivate.tex similarity index 95% rename from Examples_lastprivate.tex rename to data_environment/lastprivate.tex index d2cbd98..d87783f 100644 --- a/Examples_lastprivate.tex +++ b/data_environment/lastprivate.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{lastprivate} Clause} +\section{\code{lastprivate} Clause} \label{sec:lastprivate} Correct execution sometimes depends on the value that the last iteration of a loop diff --git a/Examples_private.tex b/data_environment/private.tex similarity index 96% rename from Examples_private.tex rename to data_environment/private.tex index 8a912cf..c5f0556 100644 --- a/Examples_private.tex +++ b/data_environment/private.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{private} Clause} +\section{\code{private} Clause} \label{sec:private} In the following example, the values of original list items \plc{i} and \plc{j} diff --git a/Examples_reduction.tex b/data_environment/reduction.tex similarity index 94% rename from Examples_reduction.tex rename to data_environment/reduction.tex index 02cb8b0..6876a99 100644 --- a/Examples_reduction.tex +++ b/data_environment/reduction.tex @@ -5,7 +5,7 @@ This section covers ways to perform reductions in parallel, task, taskloop, and SIMD regions. 
-\subsection{The \code{reduction} Clause} +\subsection{\code{reduction} Clause} \label{subsec:reduction} The following example demonstrates the \code{reduction} clause; note that some @@ -49,7 +49,7 @@ to \code{MIN}. \ffreenexample{reduction}{5} \fortranspecificend -\pagebreak +%\pagebreak The following example is non-conforming because the initialization (\code{a = 0}) of the original list item \code{a} is not synchronized with the update of \code{a} as a result of the reduction computation in the \code{for} loop. Therefore, @@ -62,9 +62,9 @@ clause. This can be achieved by adding an explicit barrier after the assignment directive (which has an implied barrier), or by initializing \code{a} before the start of the \code{parallel} region. -\cexample{reduction}{6} +\cexample[5.1]{reduction}{6} -\fexample{reduction}{6} +\fexample[5.1]{reduction}{6}[1] The following example demonstrates the reduction of array \plc{a}. In C/C++ this is illustrated by the explicit use of an array section \plc{a[0:N]} in the \code{reduction} clause. The corresponding Fortran example uses array syntax supported in the base language. As of the OpenMP 4.5 specification the explicit use of array section in the \code{reduction} clause in Fortran is not permitted. But this oversight has been fixed in the OpenMP 5.0 specification. @@ -154,7 +154,7 @@ second \code{target} construct. \cexample[5.0]{target_reduction}{1} \ffreeexample[5.0]{target_reduction}{1} -\clearpage +%\clearpage In next example, the variables \code{sum1} and \code{sum2} remain on the device for the duration of the \code{target}~\code{data} region so that it is @@ -184,9 +184,9 @@ task reduction will be combined (in some order) into the original variable listed in the \code{task\_reduction} clause before exiting the \code{taskgroup} region. 
-\cexample[5.0]{target_task_reduction}{1} +\cexample[5.1]{target_task_reduction}{1} -\ffreeexample[5.0]{target_task_reduction}{1} +\ffreeexample[5.1]{target_task_reduction}{1}[1] In the next pair of examples, the task reduction is defined by a \code{reduction} clause with the \code{task} modifier, rather than a @@ -201,13 +201,13 @@ into the original reduction variable, \code{sum}. Next, the \code{task} modifier is again used to define a task reduction over participating tasks. This time, the participating tasks are a target task resulting from a \code{target} construct with the \code{in\_reduction} clause, -and the implicit task (executing on the master thread) that calls +and the implicit task (executing on the primary thread) that calls \code{host\_compute}. As before, the partial results from these paricipating tasks are combined in some order into the original reduction variable. -\cexample[5.0]{target_task_reduction}{2b} +\cexample[5.1]{target_task_reduction}{2b} -\ffreeexample[5.0]{target_task_reduction}{2b} +\ffreeexample[5.1]{target_task_reduction}{2b}[1] \subsection{Taskloop Reduction} @@ -266,7 +266,7 @@ by the taskloop will participate on it. \cexample[5.0]{taskloop_reduction}{2} \ffreeexample[5.0]{taskloop_reduction}{2} -\clearpage +%\clearpage In the OpenMP 5.0 Specification, \code{reduction} clauses for the \code{taskloop}~\code{ simd} construct were also added. @@ -339,8 +339,21 @@ At the end of the parallel region \plc{asum} contains the combined result of all %At the end of the parallel region \plc{asum} contains the combined result of all reductions. -\cexample[5.0]{taskloop_simd_reduction}{1} +\cexample[5.1]{taskloop_simd_reduction}{1} -\ffreeexample[5.0]{taskloop_simd_reduction}{1} +\ffreeexample[5.1]{taskloop_simd_reduction}{1}[1] +\subsection{Reduction with the \code{scope} Construct} +\label{subsec:reduction_scope} + +The following example illustrates the use of the \code{scope} construct +to perform a reduction in a \code{parallel} region. 
The case is useful for +producing a reduction and accessing reduction variables inside a \code{parallel} region +without using a worksharing-loop construct. + +\cppexample[5.1]{scope_reduction}{1} +\clearpage + +\ffreeexample[5.1]{scope_reduction}{1} + diff --git a/Examples_scan.tex b/data_environment/scan.tex similarity index 98% rename from Examples_scan.tex rename to data_environment/scan.tex index 4ba3232..ea8b781 100644 --- a/Examples_scan.tex +++ b/data_environment/scan.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{scan} Directive} +\section{\code{scan} Directive} \label{sec:scan} The following examples illustrate how to parallelize a loop that saves diff --git a/sources/Example_associate.1.f b/data_environment/sources/associate.1.f similarity index 100% rename from sources/Example_associate.1.f rename to data_environment/sources/associate.1.f diff --git a/sources/Example_associate.2.f b/data_environment/sources/associate.2.f similarity index 100% rename from sources/Example_associate.2.f rename to data_environment/sources/associate.2.f diff --git a/sources/Example_associate.3.f90 b/data_environment/sources/associate.3.f90 similarity index 100% rename from sources/Example_associate.3.f90 rename to data_environment/sources/associate.3.f90 diff --git a/sources/Example_carrays_fpriv.1.c b/data_environment/sources/carrays_fpriv.1.c similarity index 100% rename from sources/Example_carrays_fpriv.1.c rename to data_environment/sources/carrays_fpriv.1.c diff --git a/sources/Example_copyin.1.c b/data_environment/sources/copyin.1.c similarity index 100% rename from sources/Example_copyin.1.c rename to data_environment/sources/copyin.1.c diff --git a/sources/Example_copyin.1.f b/data_environment/sources/copyin.1.f similarity index 100% rename from sources/Example_copyin.1.f rename to data_environment/sources/copyin.1.f diff --git a/sources/Example_copyprivate.1.c b/data_environment/sources/copyprivate.1.c similarity index 100% rename from 
sources/Example_copyprivate.1.c rename to data_environment/sources/copyprivate.1.c diff --git a/sources/Example_copyprivate.1.f b/data_environment/sources/copyprivate.1.f similarity index 100% rename from sources/Example_copyprivate.1.f rename to data_environment/sources/copyprivate.1.f diff --git a/sources/Example_copyprivate.2.c b/data_environment/sources/copyprivate.2.c similarity index 83% rename from sources/Example_copyprivate.2.c rename to data_environment/sources/copyprivate.2.c index c389b36..3b46ce5 100644 --- a/sources/Example_copyprivate.2.c +++ b/data_environment/sources/copyprivate.2.c @@ -4,7 +4,12 @@ * @@compilable: yes * @@linkable: no * @@expect: success +* @@version: omp_5.1 */ +#if _OPENMP < 202011 +#define masked master +#endif + #include #include @@ -18,7 +23,7 @@ float read_next( ) { } /* copies the pointer only */ - #pragma omp master + #pragma omp masked { scanf("%f", tmp); } diff --git a/sources/Example_copyprivate.2.f b/data_environment/sources/copyprivate.2.f similarity index 76% rename from sources/Example_copyprivate.2.f rename to data_environment/sources/copyprivate.2.f index fa105a0..c78f2d7 100644 --- a/sources/Example_copyprivate.2.f +++ b/data_environment/sources/copyprivate.2.f @@ -1,8 +1,14 @@ ! @@name: copyprivate.2f ! @@type: F-fixed ! @@compilable: yes +! @@requires: preprocessing ! @@linkable: no ! @@expect: success +! @@version: omp_5.1 +#if _OPENMP < 202011 +#define MASKED MASTER +#endif + REAL FUNCTION READ_NEXT() REAL, POINTER :: TMP @@ -10,9 +16,9 @@ ALLOCATE (TMP) !$OMP END SINGLE COPYPRIVATE (TMP) ! 
copies the pointer only -!$OMP MASTER +!$OMP MASKED READ (11) TMP -!$OMP END MASTER +!$OMP END MASKED !$OMP BARRIER READ_NEXT = TMP diff --git a/sources/Example_copyprivate.3.c b/data_environment/sources/copyprivate.3.c similarity index 100% rename from sources/Example_copyprivate.3.c rename to data_environment/sources/copyprivate.3.c diff --git a/sources/Example_copyprivate.3.f b/data_environment/sources/copyprivate.3.f similarity index 100% rename from sources/Example_copyprivate.3.f rename to data_environment/sources/copyprivate.3.f diff --git a/sources/Example_copyprivate.4.f b/data_environment/sources/copyprivate.4.f similarity index 100% rename from sources/Example_copyprivate.4.f rename to data_environment/sources/copyprivate.4.f diff --git a/sources/Example_cpp_reference.1.cpp b/data_environment/sources/cpp_reference.1.cpp similarity index 100% rename from sources/Example_cpp_reference.1.cpp rename to data_environment/sources/cpp_reference.1.cpp diff --git a/sources/Example_default_none.1.c b/data_environment/sources/default_none.1.c similarity index 100% rename from sources/Example_default_none.1.c rename to data_environment/sources/default_none.1.c diff --git a/sources/Example_default_none.1.f b/data_environment/sources/default_none.1.f similarity index 100% rename from sources/Example_default_none.1.f rename to data_environment/sources/default_none.1.f diff --git a/sources/Example_fort_loopvar.1.f90 b/data_environment/sources/fort_loopvar.1.f90 similarity index 100% rename from sources/Example_fort_loopvar.1.f90 rename to data_environment/sources/fort_loopvar.1.f90 diff --git a/sources/Example_fort_loopvar.2.f90 b/data_environment/sources/fort_loopvar.2.f90 similarity index 100% rename from sources/Example_fort_loopvar.2.f90 rename to data_environment/sources/fort_loopvar.2.f90 diff --git a/sources/Example_fort_sa_private.1.f b/data_environment/sources/fort_sa_private.1.f similarity index 100% rename from sources/Example_fort_sa_private.1.f rename to 
data_environment/sources/fort_sa_private.1.f diff --git a/sources/Example_fort_sa_private.2.f b/data_environment/sources/fort_sa_private.2.f similarity index 100% rename from sources/Example_fort_sa_private.2.f rename to data_environment/sources/fort_sa_private.2.f diff --git a/sources/Example_fort_sa_private.3.f b/data_environment/sources/fort_sa_private.3.f similarity index 100% rename from sources/Example_fort_sa_private.3.f rename to data_environment/sources/fort_sa_private.3.f diff --git a/sources/Example_fort_sa_private.4.f b/data_environment/sources/fort_sa_private.4.f similarity index 100% rename from sources/Example_fort_sa_private.4.f rename to data_environment/sources/fort_sa_private.4.f diff --git a/sources/Example_fort_sa_private.5.f b/data_environment/sources/fort_sa_private.5.f similarity index 93% rename from sources/Example_fort_sa_private.5.f rename to data_environment/sources/fort_sa_private.5.f index 58fc92b..80788ff 100644 --- a/sources/Example_fort_sa_private.5.f +++ b/data_environment/sources/fort_sa_private.5.f @@ -3,6 +3,7 @@ ! @@compilable: maybe ! @@linkable: maybe ! @@expect: rt-error +! @@version: omp_5.1 SUBROUTINE SUB1(X) DIMENSION X(10) @@ -33,9 +34,8 @@ ! sequence-associated. 
CALL SUB1(A) -!$OMP MASTER +!$OMP MASKED PRINT *, A -!$OMP END MASTER +!$OMP END MASKED !$OMP END PARALLEL - END PROGRAM PRIV_RESTRICT5 diff --git a/sources/Example_fort_sp_common.1.f b/data_environment/sources/fort_sp_common.1.f similarity index 100% rename from sources/Example_fort_sp_common.1.f rename to data_environment/sources/fort_sp_common.1.f diff --git a/sources/Example_fort_sp_common.2.f b/data_environment/sources/fort_sp_common.2.f similarity index 100% rename from sources/Example_fort_sp_common.2.f rename to data_environment/sources/fort_sp_common.2.f diff --git a/sources/Example_fort_sp_common.3.f b/data_environment/sources/fort_sp_common.3.f similarity index 100% rename from sources/Example_fort_sp_common.3.f rename to data_environment/sources/fort_sp_common.3.f diff --git a/sources/Example_fort_sp_common.4.f b/data_environment/sources/fort_sp_common.4.f similarity index 100% rename from sources/Example_fort_sp_common.4.f rename to data_environment/sources/fort_sp_common.4.f diff --git a/sources/Example_fort_sp_common.5.f b/data_environment/sources/fort_sp_common.5.f similarity index 100% rename from sources/Example_fort_sp_common.5.f rename to data_environment/sources/fort_sp_common.5.f diff --git a/sources/Example_lastprivate.1.c b/data_environment/sources/lastprivate.1.c similarity index 100% rename from sources/Example_lastprivate.1.c rename to data_environment/sources/lastprivate.1.c diff --git a/sources/Example_lastprivate.1.f b/data_environment/sources/lastprivate.1.f similarity index 100% rename from sources/Example_lastprivate.1.f rename to data_environment/sources/lastprivate.1.f diff --git a/sources/Example_lastprivate.2.c b/data_environment/sources/lastprivate.2.c similarity index 100% rename from sources/Example_lastprivate.2.c rename to data_environment/sources/lastprivate.2.c diff --git a/sources/Example_lastprivate.2.f90 b/data_environment/sources/lastprivate.2.f90 similarity index 100% rename from sources/Example_lastprivate.2.f90 
rename to data_environment/sources/lastprivate.2.f90 diff --git a/sources/Example_private.1.c b/data_environment/sources/private.1.c similarity index 100% rename from sources/Example_private.1.c rename to data_environment/sources/private.1.c diff --git a/sources/Example_private.1.f b/data_environment/sources/private.1.f similarity index 100% rename from sources/Example_private.1.f rename to data_environment/sources/private.1.f diff --git a/sources/Example_private.2.c b/data_environment/sources/private.2.c similarity index 100% rename from sources/Example_private.2.c rename to data_environment/sources/private.2.c diff --git a/sources/Example_private.2.f b/data_environment/sources/private.2.f similarity index 100% rename from sources/Example_private.2.f rename to data_environment/sources/private.2.f diff --git a/sources/Example_private.3.c b/data_environment/sources/private.3.c similarity index 100% rename from sources/Example_private.3.c rename to data_environment/sources/private.3.c diff --git a/sources/Example_private.3.f b/data_environment/sources/private.3.f similarity index 100% rename from sources/Example_private.3.f rename to data_environment/sources/private.3.f diff --git a/sources/Example_reduction.1.c b/data_environment/sources/reduction.1.c similarity index 100% rename from sources/Example_reduction.1.c rename to data_environment/sources/reduction.1.c diff --git a/sources/Example_reduction.1.f90 b/data_environment/sources/reduction.1.f90 similarity index 100% rename from sources/Example_reduction.1.f90 rename to data_environment/sources/reduction.1.f90 diff --git a/sources/Example_reduction.2.c b/data_environment/sources/reduction.2.c similarity index 100% rename from sources/Example_reduction.2.c rename to data_environment/sources/reduction.2.c diff --git a/sources/Example_reduction.2.f90 b/data_environment/sources/reduction.2.f90 similarity index 100% rename from sources/Example_reduction.2.f90 rename to data_environment/sources/reduction.2.f90 diff 
--git a/sources/Example_reduction.3.c b/data_environment/sources/reduction.3.c similarity index 90% rename from sources/Example_reduction.3.c rename to data_environment/sources/reduction.3.c index e5c7812..f1fcd34 100644 --- a/sources/Example_reduction.3.c +++ b/data_environment/sources/reduction.3.c @@ -4,6 +4,7 @@ * @@compilable: yes * @@linkable: yes * @@expect: rt-error +* @@version: omp_5.1 */ #include @@ -13,7 +14,7 @@ int main (void) #pragma omp parallel shared(a) private(i) { - #pragma omp master + #pragma omp masked a = 0; // To avoid race conditions, add a barrier here. diff --git a/sources/Example_reduction.3.f90 b/data_environment/sources/reduction.3.f90 similarity index 100% rename from sources/Example_reduction.3.f90 rename to data_environment/sources/reduction.3.f90 diff --git a/sources/Example_reduction.4.f90 b/data_environment/sources/reduction.4.f90 similarity index 100% rename from sources/Example_reduction.4.f90 rename to data_environment/sources/reduction.4.f90 diff --git a/sources/Example_reduction.5.f90 b/data_environment/sources/reduction.5.f90 similarity index 100% rename from sources/Example_reduction.5.f90 rename to data_environment/sources/reduction.5.f90 diff --git a/sources/Example_reduction.6.c b/data_environment/sources/reduction.6.c similarity index 81% rename from sources/Example_reduction.6.c rename to data_environment/sources/reduction.6.c index 28a507e..b470e45 100644 --- a/sources/Example_reduction.6.c +++ b/data_environment/sources/reduction.6.c @@ -4,7 +4,12 @@ * @@compilable: yes * @@linkable: yes * @@expect: rt-error +* @@version: omp_5.1 */ +#if _OPENMP < 202011 +#define masked master +#endif + #include int main (void) @@ -13,7 +18,7 @@ int main (void) #pragma omp parallel shared(a) private(i) { - #pragma omp master + #pragma omp masked a = 0; // To avoid race conditions, add a barrier here. 
diff --git a/sources/Example_reduction.6.f b/data_environment/sources/reduction.6.f similarity index 74% rename from sources/Example_reduction.6.f rename to data_environment/sources/reduction.6.f index eed2bab..44e2ff6 100644 --- a/sources/Example_reduction.6.f +++ b/data_environment/sources/reduction.6.f @@ -1,15 +1,21 @@ ! @@name: reduction.6f ! @@type: F-fixed ! @@compilable: yes +! @@requires: preprocessing ! @@linkable: yes ! @@expect: rt-error +! @@version: omp_5.1 +#if _OPENMP < 202011 +#define MASKED MASTER +#endif + INTEGER A, I !$OMP PARALLEL SHARED(A) PRIVATE(I) -!$OMP MASTER +!$OMP MASKED A = 0 -!$OMP END MASTER +!$OMP END MASKED ! To avoid race conditions, add a barrier here. @@ -23,4 +29,5 @@ !$OMP END SINGLE !$OMP END PARALLEL + END diff --git a/sources/Example_reduction.7.c b/data_environment/sources/reduction.7.c similarity index 100% rename from sources/Example_reduction.7.c rename to data_environment/sources/reduction.7.c diff --git a/sources/Example_reduction.7.f90 b/data_environment/sources/reduction.7.f90 similarity index 100% rename from sources/Example_reduction.7.f90 rename to data_environment/sources/reduction.7.f90 diff --git a/sources/Example_scan.1.c b/data_environment/sources/scan.1.c similarity index 100% rename from sources/Example_scan.1.c rename to data_environment/sources/scan.1.c diff --git a/sources/Example_scan.1.f90 b/data_environment/sources/scan.1.f90 similarity index 100% rename from sources/Example_scan.1.f90 rename to data_environment/sources/scan.1.f90 diff --git a/sources/Example_scan.2.c b/data_environment/sources/scan.2.c similarity index 100% rename from sources/Example_scan.2.c rename to data_environment/sources/scan.2.c diff --git a/sources/Example_scan.2.f90 b/data_environment/sources/scan.2.f90 similarity index 100% rename from sources/Example_scan.2.f90 rename to data_environment/sources/scan.2.f90 diff --git a/data_environment/sources/scope_reduction.1.cpp b/data_environment/sources/scope_reduction.1.cpp new 
file mode 100644 index 0000000..917e377 --- /dev/null +++ b/data_environment/sources/scope_reduction.1.cpp @@ -0,0 +1,39 @@ +/* +* @@name: scope_reduction.1c +* @@type: C++ +* @@compilable: yes +* @@linkable: no +* @@expect: success +* @@version: omp_5.1 +*/ +#include +void do_work(int n, float a[], float &s) +{ + float loc_s = 0.0f; // local sum + static int nthrs; + #pragma omp for + for (int i = 0; i < n; i++) + loc_s += a[i]; + #pragma omp single + { + s = 0.0f; // total sum + nthrs = 0; + } + #pragma omp scope reduction(+:s,nthrs) + { + s += loc_s; + nthrs++; + } + #pragma omp masked + printf("total sum = %f, nthrs = %d\n", s, nthrs); +} + +float work(int n, float a[]) +{ + float s; + #pragma omp parallel + { + do_work(n, a, s); + } + return s; +} diff --git a/data_environment/sources/scope_reduction.1.f90 b/data_environment/sources/scope_reduction.1.f90 new file mode 100644 index 0000000..678b8bc --- /dev/null +++ b/data_environment/sources/scope_reduction.1.f90 @@ -0,0 +1,39 @@ +! @@name: scope_reduction.1f +! @@type: F-free +! @@compilable: yes +! @@linkable: no +! @@expect: success +! @@version: omp_5.1 +subroutine do_work(n, a, s) + implicit none + integer n, i + real a(*), s, loc_s + integer, save :: nthrs + + loc_s = 0.0 ! local sum + !$omp do + do i = 1, n + loc_s = loc_s + a(i) + end do + !$omp single + s = 0.0 ! 
total sum + nthrs = 0 + !$omp end single + !$omp scope reduction(+:s,nthrs) + s = s + loc_s + nthrs = nthrs + 1 + !$omp end scope + !$omp masked + print *, "total sum = ", s, ", nthrs = ", nthrs + !$omp end masked +end subroutine + +function work(n, a) result(s) + implicit none + integer n + real a(*), s + + !$omp parallel + call do_work(n, a, s) + !$omp end parallel +end function diff --git a/sources/Example_target_reduction.1.c b/data_environment/sources/target_reduction.1.c similarity index 100% rename from sources/Example_target_reduction.1.c rename to data_environment/sources/target_reduction.1.c diff --git a/sources/Example_target_reduction.1.f90 b/data_environment/sources/target_reduction.1.f90 similarity index 100% rename from sources/Example_target_reduction.1.f90 rename to data_environment/sources/target_reduction.1.f90 diff --git a/sources/Example_target_reduction.2.c b/data_environment/sources/target_reduction.2.c similarity index 100% rename from sources/Example_target_reduction.2.c rename to data_environment/sources/target_reduction.2.c diff --git a/sources/Example_target_reduction.2.f90 b/data_environment/sources/target_reduction.2.f90 similarity index 100% rename from sources/Example_target_reduction.2.f90 rename to data_environment/sources/target_reduction.2.f90 diff --git a/sources/Example_target_task_reduction.1.c b/data_environment/sources/target_task_reduction.1.c similarity index 85% rename from sources/Example_target_task_reduction.1.c rename to data_environment/sources/target_task_reduction.1.c index 9011629..d2bde32 100644 --- a/sources/Example_target_task_reduction.1.c +++ b/data_environment/sources/target_task_reduction.1.c @@ -4,8 +4,12 @@ * @@compilable: yes * @@linkable: yes * @@expect: success -* @@version: omp_5.0 +* @@version: omp_5.1 */ +#if _OPENMP < 202011 +#define masked master +#endif + #include #pragma omp declare target to(device_compute) void device_compute(int *); @@ -14,7 +18,7 @@ int main() { int sum = 0; - #pragma omp 
parallel master + #pragma omp parallel masked #pragma omp taskgroup task_reduction(+:sum) { #pragma omp target in_reduction(+:sum) nowait diff --git a/sources/Example_target_task_reduction.1.f90 b/data_environment/sources/target_task_reduction.1.f90 similarity index 75% rename from sources/Example_target_task_reduction.1.f90 rename to data_environment/sources/target_task_reduction.1.f90 index b66851f..aa46179 100644 --- a/sources/Example_target_task_reduction.1.f90 +++ b/data_environment/sources/target_task_reduction.1.f90 @@ -1,9 +1,14 @@ -! @@name: target_task_reduction.1.f90 -! @@type: F-free -! @@compilable: yes -! @@linkable: no -! @@expect: success -! @@version: omp_5.0 +! @@name: target_task_reduction.1.f90 +! @@type: F-free +! @@compilable: yes +! @@requires: preprocessing +! @@linkable: no +! @@expect: success +! @@version: omp_5.1 +#if _OPENMP < 202011 +#define masked master +#endif + program target_task_reduction_ex1 interface subroutine device_compute(res) @@ -16,7 +21,7 @@ program target_task_reduction_ex1 end interface integer :: sum sum = 0 - !$omp parallel master + !$omp parallel masked !$omp taskgroup task_reduction(+:sum) !$omp target in_reduction(+:sum) nowait call device_compute(sum) @@ -25,7 +30,7 @@ program target_task_reduction_ex1 call host_compute(sum) !$omp end task !$omp end taskgroup - !$omp end parallel master + !$omp end parallel masked print *, "sum = ", sum !!OUTPUT: sum = 2 end program diff --git a/sources/Example_target_task_reduction.2a.c b/data_environment/sources/target_task_reduction.2a.c similarity index 100% rename from sources/Example_target_task_reduction.2a.c rename to data_environment/sources/target_task_reduction.2a.c diff --git a/sources/Example_target_task_reduction.2a.f90 b/data_environment/sources/target_task_reduction.2a.f90 similarity index 100% rename from sources/Example_target_task_reduction.2a.f90 rename to data_environment/sources/target_task_reduction.2a.f90 diff --git 
a/sources/Example_target_task_reduction.2b.c b/data_environment/sources/target_task_reduction.2b.c similarity index 81% rename from sources/Example_target_task_reduction.2b.c rename to data_environment/sources/target_task_reduction.2b.c index 07054c7..23b39eb 100644 --- a/sources/Example_target_task_reduction.2b.c +++ b/data_environment/sources/target_task_reduction.2b.c @@ -4,8 +4,12 @@ * @@compilable: yes * @@linkable: yes * @@expect: success -* @@version: omp_5.0 +* @@version: omp_5.1 */ +#if _OPENMP < 202011 +#define masked master +#endif + #include #pragma omp declare target to(device_compute) extern void device_compute(int *); @@ -14,7 +18,7 @@ int main() { int sum = 0; - #pragma omp parallel master reduction(task, +:sum) + #pragma omp parallel masked reduction(task, +:sum) { #pragma omp target in_reduction(+:sum) nowait device_compute(&sum); diff --git a/sources/Example_target_task_reduction.2b.f90 b/data_environment/sources/target_task_reduction.2b.f90 similarity index 70% rename from sources/Example_target_task_reduction.2b.f90 rename to data_environment/sources/target_task_reduction.2b.f90 index af0b6d5..ff06ab4 100644 --- a/sources/Example_target_task_reduction.2b.f90 +++ b/data_environment/sources/target_task_reduction.2b.f90 @@ -1,9 +1,14 @@ -! @@name: target_task_reduction.2b.f90 -! @@type: F-free -! @@compilable: yes -! @@linkable: yes -! @@expect: success -! @@version: omp_5.0 +! @@name: target_task_reduction.2b.f90 +! @@type: F-free +! @@compilable: yes +! @@requires: preprocessing +! @@linkable: yes +! @@expect: success +! 
@@version: omp_5.1 +#if _OPENMP < 202011 +#define masked master +#endif + program target_task_reduction_ex2b interface subroutine device_compute(res) @@ -16,12 +21,12 @@ program target_task_reduction_ex2b end interface integer :: sum sum = 0 - !$omp parallel master reduction(task,+:sum) + !$omp parallel masked reduction(task,+:sum) !$omp target in_reduction(+:sum) nowait call device_compute(sum) !$omp end target call host_compute(sum) - !$omp end parallel sections + !$omp end parallel masked print *, "sum = ", sum !!OUTPUT: sum = 2 end program diff --git a/sources/Example_task_reduction.1.c b/data_environment/sources/task_reduction.1.c similarity index 100% rename from sources/Example_task_reduction.1.c rename to data_environment/sources/task_reduction.1.c diff --git a/sources/Example_task_reduction.1.f90 b/data_environment/sources/task_reduction.1.f90 similarity index 100% rename from sources/Example_task_reduction.1.f90 rename to data_environment/sources/task_reduction.1.f90 diff --git a/sources/Example_task_reduction.2.c b/data_environment/sources/task_reduction.2.c similarity index 100% rename from sources/Example_task_reduction.2.c rename to data_environment/sources/task_reduction.2.c diff --git a/sources/Example_task_reduction.2.f90 b/data_environment/sources/task_reduction.2.f90 similarity index 100% rename from sources/Example_task_reduction.2.f90 rename to data_environment/sources/task_reduction.2.f90 diff --git a/sources/Example_taskloop_reduction.1.c b/data_environment/sources/taskloop_reduction.1.c similarity index 100% rename from sources/Example_taskloop_reduction.1.c rename to data_environment/sources/taskloop_reduction.1.c diff --git a/sources/Example_taskloop_reduction.1.f90 b/data_environment/sources/taskloop_reduction.1.f90 similarity index 96% rename from sources/Example_taskloop_reduction.1.f90 rename to data_environment/sources/taskloop_reduction.1.f90 index 632a08e..cb8e3f2 100644 --- a/sources/Example_taskloop_reduction.1.f90 +++ 
b/data_environment/sources/taskloop_reduction.1.f90 @@ -4,7 +4,6 @@ ! @@linkable: yes ! @@expect: success ! @@version: omp_5.0 - function array_sum(n, v) result(res) implicit none integer :: n, v(n), res @@ -15,7 +14,7 @@ function array_sum(n, v) result(res) do i=1, n res = res + v(i) end do - !$omp end taskoop + !$omp end taskloop end function array_sum diff --git a/sources/Example_taskloop_reduction.2.c b/data_environment/sources/taskloop_reduction.2.c similarity index 100% rename from sources/Example_taskloop_reduction.2.c rename to data_environment/sources/taskloop_reduction.2.c diff --git a/sources/Example_taskloop_reduction.2.f90 b/data_environment/sources/taskloop_reduction.2.f90 similarity index 97% rename from sources/Example_taskloop_reduction.2.f90 rename to data_environment/sources/taskloop_reduction.2.f90 index 6fb32d9..90b5c0b 100644 --- a/sources/Example_taskloop_reduction.2.f90 +++ b/data_environment/sources/taskloop_reduction.2.f90 @@ -4,7 +4,6 @@ ! @@linkable: yes ! @@expect: success ! 
@@version: omp_5.0 - function array_sum(n, v) result(res) implicit none integer :: n, v(n), res @@ -21,7 +20,7 @@ function array_sum(n, v) result(res) do i=2, n res = res + v(i) end do - !$omp end taskoop + !$omp end taskloop endif !$omp end taskgroup diff --git a/sources/Example_taskloop_simd_reduction.1.c b/data_environment/sources/taskloop_simd_reduction.1.c similarity index 77% rename from sources/Example_taskloop_simd_reduction.1.c rename to data_environment/sources/taskloop_simd_reduction.1.c index 55b0ec8..6ea9261 100644 --- a/sources/Example_taskloop_simd_reduction.1.c +++ b/data_environment/sources/taskloop_simd_reduction.1.c @@ -4,8 +4,11 @@ * @@compilable: yes * @@linkable: yes * @@expect: success -* @@version: omp_5.0 +* @@version: omp_5.1 */ +#if _OPENMP < 202011 +#define masked master +#endif #include #define N 100 @@ -17,35 +20,35 @@ int main(){ // taskloop reductions - #pragma omp parallel master + #pragma omp parallel masked #pragma omp taskloop reduction(+:asum) //taskloop 1 for(i=0;i @@ -22,7 +25,7 @@ int main(){ #pragma omp parallel { - #pragma omp master + #pragma omp masked #pragma omp target teams distribute parallel for nowait \ map(to: v1[0:n/2]) \ map(to: v2[0:n/2]) \ diff --git a/sources/Example_async_target.3.f90 b/devices/sources/async_target.3.f90 similarity index 84% rename from sources/Example_async_target.3.f90 rename to devices/sources/async_target.3.f90 index 7b2122f..ca2a05a 100644 --- a/sources/Example_async_target.3.f90 +++ b/devices/sources/async_target.3.f90 @@ -1,9 +1,13 @@ ! @@name: async_target.3f ! @@type: F-free ! @@compilable: yes +! @@requires: preprocessing ! @@linkable: no ! @@expect: success -! @@version: omp_4.5 +! 
@@version: omp_5.1 +#if _OPENMP < 202011 +#define masked master +#endif program concurrent_async use omp_lib @@ -15,13 +19,13 @@ program concurrent_async !$omp parallel - !$omp master + !$omp masked !$omp target teams distribute parallel do nowait & !$omp& map(to: v1(1:n/2)) & !$omp& map(to: v2(1:n/2)) & !$omp& map(from: vxv(1:n/2)) do i = 1,n/2; vxv(i) = v1(i)*v2(i); end do - !$omp end master + !$omp end masked !$omp do schedule(dynamic,chunk) do i = n/2+1,n; vxv(i) = v1(i)*v2(i); end do diff --git a/sources/Example_async_target.4.c b/devices/sources/async_target.4.c similarity index 100% rename from sources/Example_async_target.4.c rename to devices/sources/async_target.4.c diff --git a/sources/Example_async_target.4.f90 b/devices/sources/async_target.4.f90 similarity index 100% rename from sources/Example_async_target.4.f90 rename to devices/sources/async_target.4.f90 diff --git a/sources/Example_declare_target.1.c b/devices/sources/declare_target.1.c similarity index 100% rename from sources/Example_declare_target.1.c rename to devices/sources/declare_target.1.c diff --git a/sources/Example_declare_target.1.f90 b/devices/sources/declare_target.1.f90 similarity index 100% rename from sources/Example_declare_target.1.f90 rename to devices/sources/declare_target.1.f90 diff --git a/sources/Example_declare_target.2.f90 b/devices/sources/declare_target.2.f90 similarity index 75% rename from sources/Example_declare_target.2.f90 rename to devices/sources/declare_target.2.f90 index 061e24a..2025667 100644 --- a/sources/Example_declare_target.2.f90 +++ b/devices/sources/declare_target.2.f90 @@ -6,7 +6,12 @@ ! 
@@version: omp_4.0 program my_fib integer :: N = 8 -!$omp declare target(fib) +interface + subroutine fib(N) + !$omp declare target + integer :: N + end subroutine fib +end interface !$omp target call fib(N) !$omp end target diff --git a/devices/sources/declare_target.2a.cpp b/devices/sources/declare_target.2a.cpp new file mode 100644 index 0000000..f788cdb --- /dev/null +++ b/devices/sources/declare_target.2a.cpp @@ -0,0 +1,67 @@ +/* +* @@name declare_target.2a +* @@type: C++ +* @@compilable yes +* @@linkable: yes +* @@expect: success +* @@version: omp_5.1 +*/ + +#include +using namespace std; + + #pragma omp begin declare target // declare target -- class and function + class XOR1 + { + int a; + public: + XOR1(int arg): a(arg) {}; + int foo(); + }; + int XOR1::foo() { return a^0x01;} + #pragma omp end declare target + + + #pragma omp begin declare target // declare target -- class, not function + class XOR2 + { + int a; + public: + XOR2(int arg): a(arg) {}; + int foo(); + }; + #pragma omp end declare target + + int XOR2::foo() { return a^0x01;} + + + class XOR3 // declare target -- neither class nor function + { + int a; + public: + XOR3(int arg): a(arg) {}; + int foo(); + }; + int XOR3::foo() { return a^0x01;} + + +int main (){ + + XOR1 my_XOR1(3); + XOR2 my_XOR2(3); + XOR3 my_XOR3(3); + int res1, res2, res3; + + #pragma omp target map(tofrom:res1) + res1=my_XOR1.foo(); + + #pragma omp target map(tofrom:res2) + res2=my_XOR2.foo(); + + #pragma omp target map(tofrom:res3) + res3=my_XOR3.foo(); + + cout << res1 << endl; // OUT1: 2 + cout << res2 << endl; // OUT2: 2 + cout << res3 << endl; // OUT3: 2 +} diff --git a/devices/sources/declare_target.2b_classes.hpp b/devices/sources/declare_target.2b_classes.hpp new file mode 100644 index 0000000..cdde89c --- /dev/null +++ b/devices/sources/declare_target.2b_classes.hpp @@ -0,0 +1,9 @@ + #pragma omp begin declare target + class XOR1 + { + int a; + public: + XOR1(int arg): a(arg) {}; + int foo(); + }; + #pragma omp end 
declare target diff --git a/devices/sources/declare_target.2b_functions.cpp b/devices/sources/declare_target.2b_functions.cpp new file mode 100644 index 0000000..85cb396 --- /dev/null +++ b/devices/sources/declare_target.2b_functions.cpp @@ -0,0 +1,11 @@ +/* +* @@name declare_target.2c +* @@type: C++ +* @@compilable yes +* @@linkable: no +* @@expect: failure +* @@version: omp_5.1 +*/ + +#include "classes.hpp" +int XOR1::foo() { return a^0x01;} diff --git a/devices/sources/declare_target.2b_main.cpp b/devices/sources/declare_target.2b_main.cpp new file mode 100644 index 0000000..a76c28a --- /dev/null +++ b/devices/sources/declare_target.2b_main.cpp @@ -0,0 +1,23 @@ +/* +* @@name declare_target.2b +* @@type: C++ +* @@compilable yes +* @@linkable: no +* @@expect: failure +* @@version: omp_5.1 +*/ +#include +using namespace std; + +#include "classes.hpp" + +int main (){ + + XOR1 my_XOR1(3); + int res1; + + #pragma omp target map(from: res1) + res1=my_XOR1.foo(); + + cout << res1 << endl; // OUT1: 2 +} diff --git a/sources/Example_declare_target.2.cpp b/devices/sources/declare_target.2c.cpp similarity index 100% rename from sources/Example_declare_target.2.cpp rename to devices/sources/declare_target.2c.cpp diff --git a/sources/Example_declare_target.3.c b/devices/sources/declare_target.3.c similarity index 100% rename from sources/Example_declare_target.3.c rename to devices/sources/declare_target.3.c diff --git a/sources/Example_declare_target.3.f90 b/devices/sources/declare_target.3.f90 similarity index 100% rename from sources/Example_declare_target.3.f90 rename to devices/sources/declare_target.3.f90 diff --git a/sources/Example_declare_target.4.c b/devices/sources/declare_target.4.c similarity index 100% rename from sources/Example_declare_target.4.c rename to devices/sources/declare_target.4.c diff --git a/sources/Example_declare_target.4.f90 b/devices/sources/declare_target.4.f90 similarity index 100% rename from sources/Example_declare_target.4.f90 rename to 
devices/sources/declare_target.4.f90 diff --git a/sources/Example_declare_target.5.c b/devices/sources/declare_target.5.c similarity index 100% rename from sources/Example_declare_target.5.c rename to devices/sources/declare_target.5.c diff --git a/sources/Example_declare_target.5.f90 b/devices/sources/declare_target.5.f90 similarity index 100% rename from sources/Example_declare_target.5.f90 rename to devices/sources/declare_target.5.f90 diff --git a/sources/Example_declare_target.6.c b/devices/sources/declare_target.6.c similarity index 100% rename from sources/Example_declare_target.6.c rename to devices/sources/declare_target.6.c diff --git a/sources/Example_declare_target.6.f90 b/devices/sources/declare_target.6.f90 similarity index 100% rename from sources/Example_declare_target.6.f90 rename to devices/sources/declare_target.6.f90 diff --git a/sources/Example_device.1.c b/devices/sources/device.1.c similarity index 100% rename from sources/Example_device.1.c rename to devices/sources/device.1.c diff --git a/sources/Example_device.1.f90 b/devices/sources/device.1.f90 similarity index 100% rename from sources/Example_device.1.f90 rename to devices/sources/device.1.f90 diff --git a/sources/Example_device.2.c b/devices/sources/device.2.c similarity index 100% rename from sources/Example_device.2.c rename to devices/sources/device.2.c diff --git a/sources/Example_device.2.f90 b/devices/sources/device.2.f90 similarity index 100% rename from sources/Example_device.2.f90 rename to devices/sources/device.2.f90 diff --git a/sources/Example_device.3.c b/devices/sources/device.3.c similarity index 100% rename from sources/Example_device.3.c rename to devices/sources/device.3.c diff --git a/sources/Example_device.3.f90 b/devices/sources/device.3.f90 similarity index 100% rename from sources/Example_device.3.f90 rename to devices/sources/device.3.f90 diff --git a/sources/Example_device.4.c b/devices/sources/device.4.c similarity index 100% rename from 
sources/Example_device.4.c rename to devices/sources/device.4.c diff --git a/sources/Example_target.1.c b/devices/sources/target.1.c similarity index 100% rename from sources/Example_target.1.c rename to devices/sources/target.1.c diff --git a/sources/Example_target.1.f90 b/devices/sources/target.1.f90 similarity index 100% rename from sources/Example_target.1.f90 rename to devices/sources/target.1.f90 diff --git a/sources/Example_target.2.c b/devices/sources/target.2.c similarity index 100% rename from sources/Example_target.2.c rename to devices/sources/target.2.c diff --git a/sources/Example_target.2.f90 b/devices/sources/target.2.f90 similarity index 100% rename from sources/Example_target.2.f90 rename to devices/sources/target.2.f90 diff --git a/sources/Example_target.3.c b/devices/sources/target.3.c similarity index 100% rename from sources/Example_target.3.c rename to devices/sources/target.3.c diff --git a/sources/Example_target.3.f90 b/devices/sources/target.3.f90 similarity index 100% rename from sources/Example_target.3.f90 rename to devices/sources/target.3.f90 diff --git a/sources/Example_target.4.c b/devices/sources/target.4.c similarity index 100% rename from sources/Example_target.4.c rename to devices/sources/target.4.c diff --git a/sources/Example_target.4.f90 b/devices/sources/target.4.f90 similarity index 100% rename from sources/Example_target.4.f90 rename to devices/sources/target.4.f90 diff --git a/sources/Example_target.4b.f90 b/devices/sources/target.4b.f90 similarity index 100% rename from sources/Example_target.4b.f90 rename to devices/sources/target.4b.f90 diff --git a/sources/Example_target.5.c b/devices/sources/target.5.c similarity index 100% rename from sources/Example_target.5.c rename to devices/sources/target.5.c diff --git a/sources/Example_target.5.f90 b/devices/sources/target.5.f90 similarity index 100% rename from sources/Example_target.5.f90 rename to devices/sources/target.5.f90 diff --git a/sources/Example_target.6.c 
b/devices/sources/target.6.c similarity index 100% rename from sources/Example_target.6.c rename to devices/sources/target.6.c diff --git a/sources/Example_target.6.f90 b/devices/sources/target.6.f90 similarity index 100% rename from sources/Example_target.6.f90 rename to devices/sources/target.6.f90 diff --git a/devices/sources/target_associate_ptr.1.c b/devices/sources/target_associate_ptr.1.c new file mode 100644 index 0000000..8ec88b9 --- /dev/null +++ b/devices/sources/target_associate_ptr.1.c @@ -0,0 +1,66 @@ +/* +* @@name: target_associate_ptr.1 +* @@type: C +* @@compilable: yes +* @@linkable: yes +* @@expect: success +* @@version: omp_4.5 +*/ +#include +#include + +#define CS 50 +#define N (CS*2) + +int main() { + int arr[N]; + int *dev_ptr; + int dev; + + for (int i = 0; i < N; i++) + arr[i] = i; + + dev = omp_get_default_device(); + + // Allocate device memory + dev_ptr = (int *)omp_target_alloc(sizeof(int) * CS, dev); + + // Loop over chunks + for (int ioff = 0; ioff < N; ioff += CS) { + + // Associate device memory with one chunk of host memory + omp_target_associate_ptr(&arr[ioff], dev_ptr, sizeof(int) * CS, 0, dev); + + printf("before: arr[%d]=%d\n", ioff, arr[ioff]); + + // Update the device data + #pragma omp target update to(arr[ioff:CS]) device(dev) + + // Explicit mapping of arr to make sure that we use the allocated + // and associated memory. 
+ #pragma omp target map(tofrom : arr[ioff:CS]) device(dev) + for (int i = 0; i < CS; i++) { + arr[i+ioff]++; + } + + // Update the host data + #pragma omp target update from(arr[ioff:CS]) device(dev) + + printf("after: arr[%d]=%d\n", ioff, arr[ioff]); + + // Disassociate device pointer from the current chunk of host memory + // before next use + omp_target_disassociate_ptr(&arr[ioff], dev); + } + + // Free device memory + omp_target_free(dev_ptr, dev); + + return 0; +} +/* Outputs: + before: arr[0]=0 + after: arr[0]=1 + before: arr[50]=50 + after: arr[50]=51 + */ diff --git a/devices/sources/target_associate_ptr.1.f90 b/devices/sources/target_associate_ptr.1.f90 new file mode 100644 index 0000000..5a24029 --- /dev/null +++ b/devices/sources/target_associate_ptr.1.f90 @@ -0,0 +1,69 @@ +! @@name: target_associate_ptr.1 +! @@type: F-free +! @@compilable: yes +! @@linkable: yes +! @@expect: success +! @@version: omp_5.1 +program target_associate + use omp_lib + use, intrinsic :: iso_c_binding + implicit none + + integer, parameter :: CS = 50 + integer, parameter :: N = CS*2 + integer, target :: arr(N) + type(c_ptr) :: h_ptr, dev_ptr + integer(c_size_t) :: csize, dev_off + integer(c_int) :: dev + integer :: i, ioff, s + + do i = 1, N + arr(i) = i + end do + + dev = omp_get_default_device() + csize = c_sizeof(arr(1)) * CS + + ! Allocate device memory + dev_ptr = omp_target_alloc(csize, dev) + dev_off = 0 + + ! Loop over chunks + do ioff = 1, N, CS + + ! Associate device memory with one chunk of host memory + h_ptr = c_loc(arr(ioff)) + s = omp_target_associate_ptr(h_ptr, dev_ptr, csize, dev_off, dev) + + print *, "before: arr(", ioff, ")=", arr(ioff) + + ! Update the device data + !$omp target update to(arr(ioff:ioff+CS-1)) device(dev) + + ! Explicit mapping of arr to make sure that we use the allocated + ! and associated memory. + !$omp target map(tofrom: arr(ioff:ioff+CS-1)) device(dev) + do i = 0, CS-1 + arr(i+ioff) = arr(i+ioff) + 1 + end do + !$omp end target + + ! 
Update the host data + !$omp target update from(arr(ioff:ioff+CS-1)) device(dev) + + print *, "after: arr(", ioff, ")=", arr(ioff) + + ! Disassociate device pointer from the current chunk of host memory + ! before next use + s = omp_target_disassociate_ptr(h_ptr, dev) + end do + + ! Free device memory + call omp_target_free(dev_ptr, dev) + +end +! Outputs: +! before: arr( 1 )= 1 +! after: arr( 1 )= 2 +! before: arr( 51 )= 51 +! after: arr( 51 )= 52 diff --git a/sources/Example_target_data.1.c b/devices/sources/target_data.1.c similarity index 100% rename from sources/Example_target_data.1.c rename to devices/sources/target_data.1.c diff --git a/sources/Example_target_data.1.f90 b/devices/sources/target_data.1.f90 similarity index 100% rename from sources/Example_target_data.1.f90 rename to devices/sources/target_data.1.f90 diff --git a/sources/Example_target_data.2.c b/devices/sources/target_data.2.c similarity index 100% rename from sources/Example_target_data.2.c rename to devices/sources/target_data.2.c diff --git a/sources/Example_target_data.2.f90 b/devices/sources/target_data.2.f90 similarity index 100% rename from sources/Example_target_data.2.f90 rename to devices/sources/target_data.2.f90 diff --git a/sources/Example_target_data.3.c b/devices/sources/target_data.3.c similarity index 100% rename from sources/Example_target_data.3.c rename to devices/sources/target_data.3.c diff --git a/sources/Example_target_data.3.f90 b/devices/sources/target_data.3.f90 similarity index 100% rename from sources/Example_target_data.3.f90 rename to devices/sources/target_data.3.f90 diff --git a/sources/Example_target_data.4.c b/devices/sources/target_data.4.c similarity index 100% rename from sources/Example_target_data.4.c rename to devices/sources/target_data.4.c diff --git a/sources/Example_target_data.4.f90 b/devices/sources/target_data.4.f90 similarity index 100% rename from sources/Example_target_data.4.f90 rename to devices/sources/target_data.4.f90 diff --git 
a/sources/Example_target_data.5.cpp b/devices/sources/target_data.5.cpp similarity index 100% rename from sources/Example_target_data.5.cpp rename to devices/sources/target_data.5.cpp diff --git a/sources/Example_target_data.5.f90 b/devices/sources/target_data.5.f90 similarity index 96% rename from sources/Example_target_data.5.f90 rename to devices/sources/target_data.5.f90 index 79d0c45..490de6f 100644 --- a/sources/Example_target_data.5.f90 +++ b/devices/sources/target_data.5.f90 @@ -32,4 +32,5 @@ integer, parameter :: N=1024 real,allocatable, dimension(:) :: p, v1, v2 allocate( p(N), v1(N), v2(N) ) call foo(p,v1,v2,N) + deallocate( p, v1, v2 ) end program diff --git a/sources/Example_target_data.6.c b/devices/sources/target_data.6.c similarity index 100% rename from sources/Example_target_data.6.c rename to devices/sources/target_data.6.c diff --git a/sources/Example_target_data.6.f90 b/devices/sources/target_data.6.f90 similarity index 100% rename from sources/Example_target_data.6.f90 rename to devices/sources/target_data.6.f90 diff --git a/sources/Example_target_data.7.c b/devices/sources/target_data.7.c similarity index 100% rename from sources/Example_target_data.7.c rename to devices/sources/target_data.7.c diff --git a/sources/Example_target_data.7.f90 b/devices/sources/target_data.7.f90 similarity index 100% rename from sources/Example_target_data.7.f90 rename to devices/sources/target_data.7.f90 diff --git a/sources/Example_target_defaultmap.1.c b/devices/sources/target_defaultmap.1.c similarity index 100% rename from sources/Example_target_defaultmap.1.c rename to devices/sources/target_defaultmap.1.c diff --git a/sources/Example_target_defaultmap.1.f90 b/devices/sources/target_defaultmap.1.f90 similarity index 99% rename from sources/Example_target_defaultmap.1.f90 rename to devices/sources/target_defaultmap.1.f90 index 4b6d8da..69a9d59 100644 --- a/sources/Example_target_defaultmap.1.f90 +++ b/devices/sources/target_defaultmap.1.f90 @@ -102,4 +102,6 
@@ program defaultmap if(A(1)==0 .and. D%A(1)==0 .and. H(1)==0 .and. s1==3) & print*," PASSED 4 of 4" + deallocate(H) + end program diff --git a/devices/sources/target_fort_allocatable_map.1.f90 b/devices/sources/target_fort_allocatable_map.1.f90 new file mode 100644 index 0000000..d681ce6 --- /dev/null +++ b/devices/sources/target_fort_allocatable_map.1.f90 @@ -0,0 +1,50 @@ +! @@name: fort_allocatable_map.1f90 +! @@type: F-free +! @@compilable: yes +! @@linkable: yes +! @@expect: success +! @@version: omp_5.1 +program main + implicit none + integer :: i + + integer, save, allocatable :: d(:) + !$omp declare target(d) + + integer, allocatable :: a(:) + integer, allocatable :: b(:) + integer, allocatable :: c(:) + + allocate(a(4)) + !$omp target ! Target 1 + a(:) = 4 + !$omp end target + print *, a ! prints 4*4 + + allocate(b(4)) + !$omp target map(b) ! Target 2 + b(:) = 4 + !$omp end target + print *, b ! prints 4*4 + + !$omp target data map(c) + + allocate(c(4), source=[1,2,3,4]) + !$omp target map(always,tofrom:c) ! Target 3 + c(:) = 4 + !$omp end target + print *, c ! prints 4*4 + + deallocate(c) + + !$omp end target data + + allocate(d(4), source=[1,2,3,4]) + !$omp target map(always,tofrom:d) ! Target 4 + d(:) = d(:) + [ ( i,i=size(d),1,-1) ] + !$omp end target + print *, d ! prints 4*5 + + deallocate(a, b, d) + +end program diff --git a/devices/sources/target_fort_allocatable_map.2.f90 b/devices/sources/target_fort_allocatable_map.2.f90 new file mode 100644 index 0000000..144d0eb --- /dev/null +++ b/devices/sources/target_fort_allocatable_map.2.f90 @@ -0,0 +1,22 @@ +! @@name: fort_allocatable_map.2f90 +! @@type: F-free +! @@compilable: yes +! @@linkable: no +! @@expect: fail +! @@version: omp_5.1 +program main + implicit none + + integer, allocatable :: a(:,:), b(:) + integer :: x(10,2) + + allocate(a(2,10)) + + !$omp target ! Target 1 + a=x ! reshape (or resize) NOT ALLOWED (implicit change) + deallocate(a) ! 
allocation status change of a NOT ALLOWED + allocate(b(20)) ! allocation status change of b NOT ALLOWED + print*, "ERROR: status change and resize/shaping NOT ALLOWED in target rgn." + !$omp end target + +end program diff --git a/devices/sources/target_fort_allocatable_map.3.f90 b/devices/sources/target_fort_allocatable_map.3.f90 new file mode 100644 index 0000000..affc849 --- /dev/null +++ b/devices/sources/target_fort_allocatable_map.3.f90 @@ -0,0 +1,34 @@ +! @@name: fort_allocatable_map.3f90 +! @@type: F-free +! @@compilable: yes +! @@linkable: no +! @@expect: fail +! @@version: omp_5.1 +module corfu +contains + subroutine foo(ain,bout) + implicit none + integer, allocatable, intent( in) :: ain(:) + integer, allocatable, intent(out) :: bout(:) !"out" causes de/reallocate + !$omp declare target + bout = ain + end subroutine +end module + +program main + use corfu + implicit none + + integer, allocatable :: a(:) + integer, allocatable :: b(:) + allocate(a(10),b(10)) + a(:)=10 + b(:)=10 + + !$omp target + + call foo(a,b) !ERROR: b deallocation/reallocation not allowed in target region + + !$omp end target + +end program diff --git a/sources/Example_target_mapper.1.c b/devices/sources/target_mapper.1.c similarity index 100% rename from sources/Example_target_mapper.1.c rename to devices/sources/target_mapper.1.c diff --git a/sources/Example_target_mapper.1.f90 b/devices/sources/target_mapper.1.f90 similarity index 100% rename from sources/Example_target_mapper.1.f90 rename to devices/sources/target_mapper.1.f90 diff --git a/sources/Example_target_mapper.2.c b/devices/sources/target_mapper.2.c similarity index 100% rename from sources/Example_target_mapper.2.c rename to devices/sources/target_mapper.2.c diff --git a/sources/Example_target_mapper.2.f90 b/devices/sources/target_mapper.2.f90 similarity index 100% rename from sources/Example_target_mapper.2.f90 rename to devices/sources/target_mapper.2.f90 diff --git a/sources/Example_target_mapper.3.c 
b/devices/sources/target_mapper.3.c similarity index 100% rename from sources/Example_target_mapper.3.c rename to devices/sources/target_mapper.3.c diff --git a/sources/Example_target_mapper.3.f90 b/devices/sources/target_mapper.3.f90 similarity index 100% rename from sources/Example_target_mapper.3.f90 rename to devices/sources/target_mapper.3.f90 diff --git a/sources/Example_target_ptr_map.1.c b/devices/sources/target_ptr_map.1.c similarity index 100% rename from sources/Example_target_ptr_map.1.c rename to devices/sources/target_ptr_map.1.c diff --git a/sources/Example_target_ptr_map.2.c b/devices/sources/target_ptr_map.2.c similarity index 100% rename from sources/Example_target_ptr_map.2.c rename to devices/sources/target_ptr_map.2.c diff --git a/sources/Example_target_ptr_map.3a.c b/devices/sources/target_ptr_map.3a.c similarity index 100% rename from sources/Example_target_ptr_map.3a.c rename to devices/sources/target_ptr_map.3a.c diff --git a/sources/Example_target_ptr_map.3b.c b/devices/sources/target_ptr_map.3b.c similarity index 100% rename from sources/Example_target_ptr_map.3b.c rename to devices/sources/target_ptr_map.3b.c diff --git a/devices/sources/target_ptr_map.4.c b/devices/sources/target_ptr_map.4.c new file mode 100644 index 0000000..e209679 --- /dev/null +++ b/devices/sources/target_ptr_map.4.c @@ -0,0 +1,34 @@ +/* +* @@name: target_ptr_map_4.c +* @@type: C +* @@compilable: yes +* @@linkable: no +* @@expect: success +* @@version: omp_5.1 +*/ +#include +#include +#include + +void do_work(int *ptr, const int size); + +int main() +{ + const int n = 1000; + const int buf_size = sizeof(int) * n; + const int dev = omp_get_default_device(); + + int *ptr = (int *) malloc(buf_size); // possibly compiled on + // Unified Shared Memory system + const int accessible = omp_target_is_accessible(ptr, buf_size, dev); + + #pragma omp metadirective \ + when(user={condition(accessible)}: target firstprivate(ptr) ) \ + default( target map(ptr[:n]) ) + { + 
do_work(ptr, n); + } + + free(ptr); + return 0; +} diff --git a/devices/sources/target_ptr_map.5.c b/devices/sources/target_ptr_map.5.c new file mode 100644 index 0000000..ed1061a --- /dev/null +++ b/devices/sources/target_ptr_map.5.c @@ -0,0 +1,40 @@ +/* +* @@name: target_ptr_map_5.c +* @@type: C +* @@compilable: yes +* @@linkable: no +* @@expect: success +* @@version: omp_5.1 +*/ +#include +#include +#include + +typedef struct { + int *ptr; + int buf_size; +} T; + +#pragma omp declare mapper(deep_copy: T s) map(s, s.ptr[:s.buf_size]) + +void do_work(int *ptr, const int size); + +int main() +{ + const int n = 1000; + const int buf_size = sizeof(int) * n; + T s = { 0, buf_size }; + const int dev = omp_get_default_device(); + s.ptr = (int *)malloc(buf_size); + const int accessible = omp_target_is_accessible(s.ptr, s.buf_size, dev); + + #pragma omp metadirective \ + when(user={condition(accessible)}: target) \ + default( target map(mapper(deep_copy),tofrom:s) ) + { + do_work(s.ptr, n); + } + + free(s.ptr); + return 0; +} diff --git a/devices/sources/target_ptr_map.5.f90 b/devices/sources/target_ptr_map.5.f90 new file mode 100644 index 0000000..6a59927 --- /dev/null +++ b/devices/sources/target_ptr_map.5.f90 @@ -0,0 +1,44 @@ +! @@name: target_ptr_map_5.f90 +! @@type: F-free +! @@compilable: yes +! @@linkable: no +! @@expect: success +! 
@@version: omp_5.1 +program main + use omp_lib + + use, intrinsic :: iso_c_binding, only : c_loc, c_size_t, c_sizeof, c_int + implicit none + external :: do_work + + type T + integer,pointer :: ptr(:) + integer :: buf_size + end type + + !$omp declare mapper(deep_copy: T :: s) map(s, s%ptr(:s%buf_size)) + + integer,parameter :: n = 1000 + integer(c_int) :: dev, accessible + integer(c_size_t) :: buf_size + + type(T) s + + allocate(s%ptr(n)) + + buf_size = c_sizeof(s%ptr(1))*n + dev = omp_get_default_device() + + accessible = omp_target_is_accessible(c_loc(s%ptr(1)), buf_size, dev) + + !$omp begin metadirective & + !$omp& when(user={condition(accessible)}: target) & + !$omp& default( target map(mapper(deep_copy),tofrom:s) ) + + call do_work(s, n) + + !$omp end metadirective + + deallocate(s%ptr) + +end program diff --git a/sources/Example_target_reverse_offload.7.c b/devices/sources/target_reverse_offload.7.c similarity index 100% rename from sources/Example_target_reverse_offload.7.c rename to devices/sources/target_reverse_offload.7.c diff --git a/sources/Example_target_reverse_offload.7.f90 b/devices/sources/target_reverse_offload.7.f90 similarity index 100% rename from sources/Example_target_reverse_offload.7.f90 rename to devices/sources/target_reverse_offload.7.f90 diff --git a/sources/Example_target_struct_map.1.c b/devices/sources/target_struct_map.1.c similarity index 98% rename from sources/Example_target_struct_map.1.c rename to devices/sources/target_struct_map.1.c index b358afb..660178a 100644 --- a/sources/Example_target_struct_map.1.c +++ b/devices/sources/target_struct_map.1.c @@ -43,5 +43,7 @@ int main() printf(" %4.0f %4.0f\n", S.p[0], S.p[N-1]); // 4 202 <- output + + free(S.p); return 0; } diff --git a/sources/Example_target_struct_map.2.c b/devices/sources/target_struct_map.2.c similarity index 100% rename from sources/Example_target_struct_map.2.c rename to devices/sources/target_struct_map.2.c diff --git 
a/sources/Example_target_struct_map.2.cpp b/devices/sources/target_struct_map.2.cpp similarity index 100% rename from sources/Example_target_struct_map.2.cpp rename to devices/sources/target_struct_map.2.cpp diff --git a/devices/sources/target_struct_map.3.c b/devices/sources/target_struct_map.3.c new file mode 100644 index 0000000..361a82d --- /dev/null +++ b/devices/sources/target_struct_map.3.c @@ -0,0 +1,68 @@ +/* +* @@name: target_struct_map.3c +* @@type: C +* @@compilable: yes +* @@linkable: yes +* @@expect: failure +* @@version: omp_5.0 +*/ +#include +#include +#define N 100 +#define BAZILLION 2000000 + +struct foo { + char buffera[BAZILLION]; + char bufferb[BAZILLION]; + float x; + float a, b; + float *p; +}; + +#pragma omp declare target +void saxpyfun(struct foo *S) +{ + int i; + for(i=0; i<N; i++) S->p[i] = S->p[i] * S->a + S->b; // S->p[i] invalid +} +#pragma omp end declare target + +int main() +{ + struct foo S1, S2; + int i; + + // Case 1 + + S1.a = 2.0; + S1.b = 4.0; + S1.p = (float *)malloc(sizeof(float)*N); + for(i=0; i +#include +#define N 100 +#define BAZILLION 2000000 + +struct foo { + char buffera[BAZILLION]; + char bufferb[BAZILLION]; + float x; + float a, b; + float *p; +}; + +#pragma omp declare target +void saxpyfun(struct foo *S) +{ + int i; + for(i=0; i<N; i++) S->p[i] = S->p[i]*S->a + S->b; +} +#pragma omp end declare target + +int main() +{ + struct foo S1, S2, S3; + int i; + + // Case 1 + + S1.a = 2.0; + S1.b = 4.0; + S1.p = (float *)malloc(sizeof(float)*N); + for(i=0; i +#include +#define NT 4 +#define thrd_no omp_get_thread_num + +#pragma omp declare simd linear(i) simdlen(4) +#pragma omp declare simd linear(i) simdlen(8) +double P(int i){ return (double)i * (double)i; } + +[[ omp :: directive( declare simd linear(i) simdlen(4) ) ]] +[[ omp :: directive( declare simd linear(i) simdlen(8) ) ]] +double Q(int i){ return (double)i * (double)i; } + +int main(){ + + #pragma omp parallel for num_threads(NT) // PRAG 1 + for(int i=0; i +#include +#define NT 4
+#define thrd_no omp_get_thread_num + +int main(){ + #pragma omp parallel for num_threads(NT) // PRAG 1 + for(int i=0; i +#define NT 12 + +int main(){ +int error=0, A[NT],C[NT]; +for(int i = 0; i +#include +#include + +ompt_start_tool_result_t *ompt_start_tool( +unsigned int omp_version, +const char *runtime_version +){ + if(omp_version != _OPENMP) + printf("Warning: OpenMP runtime version (%i) " + "does not match the compile time version (%i)" + " for runtime identifying as %s\n", + omp_version, _OPENMP, runtime_version); + // Returning NULL will disable this as an OMPT tool, + // allowing other tools to be loaded + return NULL; +} + +int main(void){ + printf("Running with %i threads\n", omp_get_max_threads()); + return 0; +} diff --git a/openmp-example.tex b/openmp-example.tex deleted file mode 100644 index c795675..0000000 --- a/openmp-example.tex +++ /dev/null @@ -1,83 +0,0 @@ -% Welcome to openmp-examples.tex. -% This is the master LaTex file for the OpenMP Examples document. -% -% The files in this set include: -% -% openmp-examples.tex - this file, the master file -% Makefile - makes the document -% openmp.sty - the main style file -% Title_Page.tex - the title page -% openmplogo.png - the logo -% Introduction_Chapt.tex - unnumbered introductory chapter -% Examples_Chapt.tex - unnumbered chapter -% Examples_Sects.tex - examples -% sources/*.c, *.f - C/C++/Fortran example source files -% -% When editing this file: -% -% 1. To change formatting, appearance, or style, please edit openmp.sty. -% -% 2. Custom commands and macros are defined in openmp.sty. -% -% 3. Be kind to other editors -- keep a consistent style by copying-and-pasting to -% create new content. -% -% 4. We use semantic markup, e.g. (see openmp.sty for a full list): -% \code{} % for bold monospace keywords, code, operators, etc. -% \plc{} % for italic placeholder names, grammar, etc. -% -% 5. 
Other recommendations: -% Use the convenience macros defined in openmp.sty for the minor headers -% such as Comments, Syntax, etc. -% -% To keep items together on the same page, prefer the use of -% \begin{samepage}.... Avoid \parbox for text blocks as it interrupts line numbering. -% When possible, avoid \filbreak, \pagebreak, \newpage, \clearpage unless that's -% what you mean. Use \needspace{} cautiously for troublesome paragraphs. -% -% Avoid absolute lengths and measures in this file; use relative units when possible. -% Vertical space can be relative to \baselineskip or ex units. Horizontal space -% can be relative to \linewidth or em units. -% -% Prefer \emph{} to italicize terminology, e.g.: -% This is a \emph{definition}, not a placeholder. -% This is a \plc{var-name}. -% - -% The following says letter size, but the style sheet may change the size -\documentclass[10pt,letterpaper,twoside,makeidx,hidelinks]{scrreprt} - -% Text to appear in the footer on even-numbered pages: -\newcommand{\VER}{5.0.1} -\newcommand{\PVER}{\VER{}p1} -\newcommand{\VERDATE}{May 2020} -\newcommand{\footerText}{OpenMP Examples Version \PVER{} - \VERDATE} - -% Unified style sheet for OpenMP documents: -\input{openmp.sty} - - -\begin{document} - \pagenumbering{roman} - - \setcounter{page}{0} - \setcounter{tocdepth}{2} - - - % Uncomment the next line to enable line numbering on the main body text: - \linenumbers\pagewiselinenumbers - - \newpage\pagenumbering{arabic} - - \setcounter{chapter}{0} % start chapter numbering here - - % \input{Chap_Single} - \input{Example} - - %\setcounter{chapter}{0} % restart chapter numbering with "letter A" - %\renewcommand{\thechapter}{\Alph{chapter}}% - %\appendix - %\input{History} - -\end{document} - diff --git a/openmp-examples.tex b/openmp-examples.tex index 686e7d1..4503a30 100644 --- a/openmp-examples.tex +++ b/openmp-examples.tex @@ -1,9 +1,9 @@ % Welcome to openmp-examples.tex. -% This is the master LaTex file for the OpenMP Examples document. 
+% This is the main LaTex file for the OpenMP Examples document. % % The files in this set include: % -% openmp-examples.tex - this file, the master file +% openmp-examples.tex - this file, the main file % Makefile - makes the document % openmp.sty - the main style file % Title_Page.tex - the title page @@ -49,9 +49,9 @@ \documentclass[10pt,letterpaper,twoside,makeidx,hidelinks]{scrreprt} % Text to appear in the footer on even-numbered pages: -\newcommand{\VER}{5.0.1} +\newcommand{\VER}{5.1} \newcommand{\PVER}{\VER{}} -\newcommand{\VERDATE}{June 2020} +\newcommand{\VERDATE}{August 2021} \newcommand{\footerText}{OpenMP Examples Version \PVER{} - \VERDATE} % Unified style sheet for OpenMP documents: @@ -62,7 +62,7 @@ \pagenumbering{roman} \input{Title_Page} - \setcounter{page}{0} + \setcounter{page}{1} \setcounter{tocdepth}{2} \begin{spacing}{1.3} @@ -74,142 +74,27 @@ \input{Foreword_Chapt} - \newpage\pagenumbering{arabic} + \cleardoublepage + \pagenumbering{arabic} \input{Introduction_Chapt} \input{Examples_Chapt} + \input{Deprecated_Features_Chapt} \setcounter{chapter}{0} % start chapter numbering here + \input{Chap_directives} \input{Chap_parallel_execution} - \input{Examples_ploop} - \input{Examples_parallel} - \input{Examples_host_teams} - \input{Examples_nthrs_nesting} - \input{Examples_nthrs_dynamic} - \input{Examples_fort_do} - \input{Examples_nowait} - \input{Examples_collapse} - % linear Clause 475 - \input{Examples_linear_in_loop} - \input{Examples_psections} - \input{Examples_fpriv_sections} - \input{Examples_single} - \input{Examples_workshare} - \input{Examples_master} - \input{Examples_loop} - \input{Examples_pra_iterator} - \input{Examples_set_dynamic_nthrs} - \input{Examples_get_nthrs} - \input{Chap_affinity} - \input{Examples_affinity} - \input{Examples_task_affinity} - \input{Examples_affinity_display} - \input{Examples_affinity_query} - \input{Chap_tasking} - \input{Examples_tasking} - \input{Examples_task_priority} - \input{Examples_task_dep} - 
\input{Examples_taskgroup} - \input{Examples_taskyield} - \input{Examples_taskloop} - \input{Examples_parallel_master_taskloop} - \input{Chap_devices} - \input{Examples_target} - \input{Examples_target_defaultmap} - \input{Examples_target_pointer_mapping} - \input{Examples_target_structure_mapping} - \input{Examples_array_sections} - \input{Examples_array_shaping} - \input{Examples_target_mapper} - \input{Examples_target_data} - \input{Examples_target_unstructured_data} - \input{Examples_target_update} - \input{Examples_declare_target} - % Link clause 474 - \input{Examples_teams} - \input{Examples_async_target_depend} - \input{Examples_async_target_with_tasks} - %Title change of 57.1 and 57.2 - %New subsection - \input{Examples_async_target_nowait} - \input{Examples_async_target_nowait_depend} - % \input{Examples_array_sections} moved after struct_ptr_map - % Structure Element in map 487 no 579 - \input{Examples_device} - % MemoryRoutine and Device ptr 473 - \input{Chap_SIMD} - \input{Examples_SIMD} - % Forward Depend 370 - % simdlen 476 - % simd linear modifier 480 - \input{Examples_linear_modifier} - - + \input{Chap_loop_transformations} \input{Chap_synchronization} - \input{Examples_critical} - \input{Examples_worksharing_critical} - \input{Examples_barrier_regions} - \input{Examples_atomic} - \input{Examples_atomic_restrict} - \input{Examples_flush_nolist} - \input{Examples_acquire_release} - \input{Examples_ordered} - \input{Examples_depobj} - % Doacross loop 405 - \input{Examples_doacross} - \input{Examples_locks} - \input{Examples_init_lock} - \input{Examples_init_lock_with_hint} - \input{Examples_lock_owner} - \input{Examples_simple_lock} - \input{Examples_nestable_lock} - % % LOCK with Hints 478 - % % Hint Clause xxxxxx (included after init_lock) - % % Lock routines with hint - - \input{Chap_data_environment} - \input{Examples_threadprivate} - \input{Examples_default_none} - \input{Examples_private} - \input{Examples_fort_loopvar} - 
\input{Examples_fort_sp_common} - \input{Examples_fort_sa_private} - \input{Examples_carrays_fpriv} - \input{Examples_lastprivate} - \input{Examples_reduction} - % User UDR 287 - \input{Examples_udr} - \input{Examples_scan} - \input{Examples_copyin} - \input{Examples_copyprivate} - \input{Examples_cpp_reference} - % Fortran 2003 features 482 - \input{Examples_associate} %section--> subsection - \input{Chap_memory_model} - \input{Examples_mem_model} - \input{Examples_allocators} - \input{Examples_fort_race} - \input{Chap_program_control} - \input{Examples_cond_comp} - \input{Examples_icv} - % If multi-ifs 471 - \input{Examples_standalone} - \input{Examples_cancellation} - \input{Examples_requires} - \input{Examples_variant} - \input{Examples_metadirective} - % New Section Nested Regions - \input{Examples_nested_loop} - \input{Examples_nesting_restrict} - \input{Examples_target_offload} + \input{Chap_ompt_interface} \setcounter{chapter}{0} % restart chapter numbering with "letter A" diff --git a/openmp.sty b/openmp.sty index 7bf1fa8..541a220 100644 --- a/openmp.sty +++ b/openmp.sty @@ -1,5 +1,5 @@ % This is openmp.sty, the preamble and style definitions for the OpenMP specification. -% This is an include file. Please see the master file for more information. +% This is an include file. Please see the main file for more information. % % When editing this file: % @@ -184,7 +184,7 @@ \renewcommand{\headrulewidth}{0pt} % Left side on even pages: -% This requires that \footerText be defined in the master document: +% This requires that \footerText be defined in the main document: \fancyfoot[LE]{\bfseries \thepage \mdseries \hspace{2em} \footerText} \fancyhfoffset[E]{4em} @@ -196,28 +196,34 @@ % Section header format - we use four levels: \chapter \section \subsection \subsubsection. 
\usepackage{titlesec} % format headers with \titleformat{} +\usepackage{tocloft} % Format and spacing for chapter, section, subsection, and subsubsection headers: \setcounter{secnumdepth}{4} % show numbers down to subsubsection level -\titleformat{\chapter}[display]% +\titleformat{\chapter}[hang]% {\normalfont\sffamily\upshape\Huge\bfseries\fontsize{20}{20}\selectfont}% -{\normalfont\sffamily\scshape\large\bfseries \hspace{-0.7in} \MakeUppercase% - {\chaptertitlename} \thechapter}% -{0.8in}{}[\vspace{2ex}\hrule] -\titlespacing{\chapter}{0ex}{0em plus 1em minus 1em}{3em plus 1em minus 1em}[10em] +{\thechapter}{0.5em}{} +\titlespacing{\chapter}{0ex}{0em plus 1em minus 1em}{2em plus 1em minus 0em}[10em] -\titleformat{\section}[hang]{\huge\bfseries\sffamily\fontsize{16}{16}\selectfont}{\thesection}{1.0em}{} -\titlespacing{\section}{-5em}{5em plus 1em minus 1em}{1em plus 0.5em minus 0em}[10em] +\titleformat{\section}[hang]{\huge\bfseries\sffamily\fontsize{16}{16}\selectfont}{\thesection}{0.5em}{} +\titlespacing{\section}{0em}{3em plus 1em minus 1em}{1em plus 0.5em minus 0em}[10em] -\titleformat{\subsection}[hang]{\LARGE\bfseries\sffamily\fontsize{14}{14}\selectfont}{\thesubsection}{1.0em}{} -\titlespacing{\subsection}{-5em}{4em plus 1em minus 2.0em}{0.75em plus 0.5em minus 0em}[10em] +\titleformat{\subsection}[hang]{\LARGE\bfseries\sffamily\fontsize{14}{14}\selectfont}{\thesubsection}{0.5em}{} +\titlespacing{\subsection}{0em}{3em plus 1em minus 1em}{0.75em plus 0.5em minus 0em}[10em] \titleformat{\subsubsection}[hang]{\needspace{1\baselineskip}% -\Large\bfseries\sffamily\fontsize{12}{12}\selectfont}{\thesubsubsection}{1.0em}{} -\titlespacing{\subsubsection}{-5em}{3em plus 1em minus 1em}{0.5em plus 0.5em minus 0em}[10em] +\Large\bfseries\sffamily\fontsize{12}{12}\selectfont}{\thesubsubsection}{0.5em}{} +\titlespacing{\subsubsection}{0em}{3em plus 1em minus 1em}{0.5em plus 0.5em minus 0em}[10em] +\setlength{\cftbeforetoctitleskip}{1.0ex} 
+\setlength{\cftaftertoctitleskip}{3.0ex} +\renewcommand{\cftchapaftersnum}{} +\makeatletter +\renewcommand{\l@section}{\@dottedtocline{1}{1.5em}{2.6em}} +\renewcommand{\l@subsection}{\@dottedtocline{2}{4.1em}{3.4em}} +\makeatother %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Macros for minor headers: Summary, Syntax, Description, etc. @@ -240,6 +246,21 @@ \newcommand{\summary} {\littleheader{Summary}} \newcommand{\syntax} {\littleheader{Syntax}} +\usepackage{scrlayer} +\DeclareNewLayer[foreground,textarea,contents={ +\phantom{a} +\emph{This page intentionally left blank} + } +]{intentionally.text} +\DeclareNewPageStyleByLayers{intentionally}{intentionally.text} +\renewcommand{\cleardoublepage}{\cleardoubleoddpageusingstyle{intentionally}} +\newcommand{\chapdirname}{} +\newcommand{\cchapter}[2] {\cleardoublepage\chapter{#1}% + \renewcommand{\chapdirname}{#2}} +\newcommand{\bchapter}[1] {\chapter*{#1}% + \addcontentsline{toc}{chapter}{\protect\numberline{}#1}} +%\newcommand{\sinput}[1] {\input{\chapdirname/#1}} + %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Code and placeholder semantic tagging. 
@@ -264,11 +285,13 @@ \usepackage{alltt} % This sets the default \code{} font to tt (monospace) and bold: -\newcommand{\code}[1]{{\texttt{\textbf{#1}}}} -\newcommand{\nspace}[1]{{\textrm{\textmd{ }}}} +\newcommand\code[1]{\texttt{\textbf{#1}}} +\newcommand\scode[1]{\protect\textbf{\protect\texttt{\protect\detokenize{#1}}}} +\newcommand\nspace[1]{\textrm{\textmd{ }}} % This defines the \plc{} placeholder font to be tt normal slanted: -\newcommand{\plc}[1] {{\textrm{\textmd{\itshape{#1}}}}} +\newcommand\plc[1]{\textrm{\textmd{\itshape{#1}}}} +\newcommand\splc[1]{\protect\textit{\protect\textrm{\protect\detokenize{#1}}}} % Environment for a paragraph of literal code, single-spaced, no outline, no indenting: \newenvironment{codepar}[1] @@ -425,6 +448,7 @@ \usepackage{color,fancyvrb} % for \VerbatimInput \usepackage{toolbox} % for \toolboxMakeSplit +\usepackage{xargs} % for optional args \renewcommand\theFancyVerbLine{\normalfont\footnotesize\sffamily S-\arabic{FancyVerbLine}} @@ -437,7 +461,7 @@ \newcommand{\escstr}[1]{\myreplace{_}{\_}{#1}} -\def\exampleheader#1#2#3#4#5{% +\newcommand{\exampleheader}[6]{% \ifthenelse{ \equal{#1}{} }{ \def\cname{#2} \def\ename\cname @@ -446,63 +470,69 @@ % Use following line for old numbering % \def\ename{\thechapter.#2.#3} % Use following for mneumonics - \def\ename{\escstr{#1}.#2.#3} + \def\ename{\escstr{#1}.\escstr{#2}.#3} } \newcount\cnt \cnt=#4 - \ifthenelse{ \equal{#5}{} }{ + \ifthenelse{ \equal{#5}{0} }{}{\global\advance\cnt by #5} + + \ifthenelse{ \equal{#6}{} }{ \def\vername{} }{ \def\myver##1{\toolboxSplitAt{##1}{_}\lefttext\righttext \lefttext\toolboxIfElse{\ifx\righttext\undefined}% {\global\advance\cnt by 1}{\expandafter{\righttext}}} - \def\vername{\;\;(\code{\small{}omp\_\myver{#5}})} + \def\vername{\;\;(\code{\small{}omp\_\myver{#6}})} } \noindent \textit{Example \ename}\vername \def\fcnt{\the\cnt} %\vspace*{-3mm} \code{\VerbatimInput[numbers=left,numbersep=5ex,firstnumber=1,firstline=\fcnt,fontsize=\small]% - 
{sources/Example_\cname}} + {\chapdirname/sources/\cname}} } -\newcommand\cnexample[3][]{% - \exampleheader{#2}{#3}{c}{8}{#1} +\newcommandx*\cnexample[4][1=,4=0]{% + \exampleheader{#2}{#3}{c}{8}{#4}{#1} } -\newcommand\cppnexample[3][]{% - \exampleheader{#2}{#3}{cpp}{8}{#1} +\newcommandx*\cppnexample[4][1=,4=0]{% + \exampleheader{#2}{#3}{cpp}{8}{#4}{#1} } -\newcommand\fnexample[3][]{% - \exampleheader{#2}{#3}{f}{6}{#1} +\newcommandx*\fnexample[4][1=,4=0]{% + \exampleheader{#2}{#3}{f}{6}{#4}{#1} } -\newcommand\ffreenexample[3][]{% - \exampleheader{#2}{#3}{f90}{6}{#1} +\newcommandx*\ffreenexample[4][1=,4=0]{% + \exampleheader{#2}{#3}{f90}{6}{#4}{#1} } -\newcommand\cexample[3][]{% +\newcommandx*\srcnexample[5][1=,5=0]{% + \exampleheader{#2}{#3}{#4}{0}{#5}{#1} +} + +\newcommandx*\cexample[4][1=,4=0]{% \needspace{5\baselineskip}\ccppspecificstart -\cnexample[#1]{#2}{#3} +\cnexample[#1]{#2}{#3}[#4] \ccppspecificend } -\newcommand\cppexample[3][]{% +\newcommandx*\cppexample[4][1=,4=0]{% \needspace{5\baselineskip}\cppspecificstart -\cppnexample[#1]{#2}{#3} +\cppnexample[#1]{#2}{#3}[#4] \cppspecificend } -\newcommand\fexample[3][]{% +\newcommandx*\fexample[4][1=,4=0]{% \needspace{5\baselineskip}\fortranspecificstart -\fnexample[#1]{#2}{#3} +\fnexample[#1]{#2}{#3}[#4] \fortranspecificend } -\newcommand\ffreeexample[3][]{% +\newcommandx*\ffreeexample[4][1=,4=0]{% \needspace{5\baselineskip}\fortranspecificstart -\ffreenexample[#1]{#2}{#3} +\ffreenexample[#1]{#2}{#3}[#4] \fortranspecificend } diff --git a/Examples_collapse.tex b/parallel_execution/collapse.tex similarity index 79% rename from Examples_collapse.tex rename to parallel_execution/collapse.tex index 6986ae4..7591c6b 100644 --- a/Examples_collapse.tex +++ b/parallel_execution/collapse.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{collapse} Clause} +\section{\code{collapse} Clause} \label{sec:collapse} In the following example, the \code{k} and \code{j} loops are associated with @@ -74,5 +74,21 @@ The code prints 
\cexample[3.0]{collapse}{3} \fexample[3.0]{collapse}{3} +\clearpage +The following example illustrates the collapse of a non-rectangular loop nest, +a new feature in OpenMP 5.0. In a loop nest, a non-rectangular loop has a +loop bound that references the iteration variable of an enclosing loop. + +The motivation for this feature is illustrated +in the example below that creates a symmetric correlation matrix for a set of +variables. Note that the initial value of the second loop depends on the index +variable of the first loop for the loops to be collapsed. +Here the data are represented by a 2D array, each row corresponds to a variable +and each column corresponds to a sample of the variable -- the last two columns +are the sample mean and standard deviation (for Fortran, rows and columns are swapped). + +\cexample[5.0]{collapse}{4} + +\ffreeexample[5.0]{collapse}{4} diff --git a/Examples_fort_do.tex b/parallel_execution/fort_do.tex similarity index 100% rename from Examples_fort_do.tex rename to parallel_execution/fort_do.tex diff --git a/Examples_fpriv_sections.tex b/parallel_execution/fpriv_sections.tex similarity index 91% rename from Examples_fpriv_sections.tex rename to parallel_execution/fpriv_sections.tex index 5dd2f17..319586e 100644 --- a/Examples_fpriv_sections.tex +++ b/parallel_execution/fpriv_sections.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{firstprivate} Clause and the \code{sections} Construct} +\section{\code{firstprivate} Clause and \code{sections} Construct} \label{sec:fpriv_sections} In the following example of the \code{sections} construct the \code{firstprivate} diff --git a/Examples_get_nthrs.tex b/parallel_execution/get_nthrs.tex similarity index 91% rename from Examples_get_nthrs.tex rename to parallel_execution/get_nthrs.tex index c0eb26b..e11311c 100644 --- a/Examples_get_nthrs.tex +++ b/parallel_execution/get_nthrs.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{omp\_get\_num\_threads} Routine} 
+\section{\code{omp\_get\_num\_threads} Routine} \label{sec:get_nthrs} In the following example, the \code{omp\_get\_num\_threads} call returns 1 in diff --git a/Examples_host_teams.tex b/parallel_execution/host_teams.tex similarity index 100% rename from Examples_host_teams.tex rename to parallel_execution/host_teams.tex diff --git a/Examples_linear_in_loop.tex b/parallel_execution/linear_in_loop.tex similarity index 100% rename from Examples_linear_in_loop.tex rename to parallel_execution/linear_in_loop.tex diff --git a/Examples_loop.tex b/parallel_execution/loop.tex similarity index 92% rename from Examples_loop.tex rename to parallel_execution/loop.tex index 2c99180..11b6300 100644 --- a/Examples_loop.tex +++ b/parallel_execution/loop.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{loop} Construct} +\section{\code{loop} Construct} \label{sec:loop} The following example illustrates the use of the OpenMP 5.0 \code{loop} diff --git a/parallel_execution/masked.tex b/parallel_execution/masked.tex new file mode 100644 index 0000000..f421bed --- /dev/null +++ b/parallel_execution/masked.tex @@ -0,0 +1,13 @@ +\pagebreak +\section{\code{masked} Construct} +\label{sec:masked} + +The following example demonstrates the masked construct. In the example, the primary thread +keeps track of how many iterations have been executed and prints out a progress +report. The other threads skip the \code{masked} region without waiting. 
+ +\cexample[5.1]{masked}{1} + +\fexample[5.1]{masked}{1} + + diff --git a/Examples_nowait.tex b/parallel_execution/nowait.tex similarity index 96% rename from Examples_nowait.tex rename to parallel_execution/nowait.tex index a71b19d..ec97ab1 100644 --- a/Examples_nowait.tex +++ b/parallel_execution/nowait.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{nowait} Clause} +\section{\code{nowait} Clause} \label{sec:nowait} If there are multiple independent loops within a \code{parallel} region, you diff --git a/Examples_nthrs_dynamic.tex b/parallel_execution/nthrs_dynamic.tex similarity index 100% rename from Examples_nthrs_dynamic.tex rename to parallel_execution/nthrs_dynamic.tex diff --git a/Examples_nthrs_nesting.tex b/parallel_execution/nthrs_nesting.tex similarity index 100% rename from Examples_nthrs_nesting.tex rename to parallel_execution/nthrs_nesting.tex diff --git a/Examples_parallel.tex b/parallel_execution/parallel.tex similarity index 88% rename from Examples_parallel.tex rename to parallel_execution/parallel.tex index 142340c..8cdeee6 100644 --- a/Examples_parallel.tex +++ b/parallel_execution/parallel.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{parallel} Construct} +\section{\code{parallel} Construct} \label{sec:parallel} The \code{parallel} construct can be used in coarse-grain parallel programs. 
diff --git a/Examples_ploop.tex b/parallel_execution/ploop.tex similarity index 100% rename from Examples_ploop.tex rename to parallel_execution/ploop.tex diff --git a/Examples_pra_iterator.tex b/parallel_execution/pra_iterator.tex similarity index 100% rename from Examples_pra_iterator.tex rename to parallel_execution/pra_iterator.tex diff --git a/Examples_psections.tex b/parallel_execution/psections.tex similarity index 85% rename from Examples_psections.tex rename to parallel_execution/psections.tex index 08094de..b5a7f43 100644 --- a/Examples_psections.tex +++ b/parallel_execution/psections.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{parallel} \code{sections} Construct} +\section{\code{parallel} \code{sections} Construct} \label{sec:psections} In the following example routines \code{XAXIS}, \code{YAXIS}, and \code{ZAXIS} can diff --git a/Examples_set_dynamic_nthrs.tex b/parallel_execution/set_dynamic_nthrs.tex similarity index 96% rename from Examples_set_dynamic_nthrs.tex rename to parallel_execution/set_dynamic_nthrs.tex index e29f788..9bcf686 100644 --- a/Examples_set_dynamic_nthrs.tex +++ b/parallel_execution/set_dynamic_nthrs.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{omp\_set\_dynamic} and \\ +\section{\code{omp\_set\_dynamic} and \\ \code{omp\_set\_num\_threads} Routines} \label{sec:set_dynamic_nthrs} diff --git a/Examples_single.tex b/parallel_execution/single.tex similarity index 94% rename from Examples_single.tex rename to parallel_execution/single.tex index 4cabfa5..c434e88 100644 --- a/Examples_single.tex +++ b/parallel_execution/single.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{single} Construct} +\section{\code{single} Construct} \label{sec:single} The following example demonstrates the \code{single} construct. 
In the example, diff --git a/sources/Example_collapse.1.c b/parallel_execution/sources/collapse.1.c similarity index 100% rename from sources/Example_collapse.1.c rename to parallel_execution/sources/collapse.1.c diff --git a/sources/Example_collapse.1.f b/parallel_execution/sources/collapse.1.f similarity index 100% rename from sources/Example_collapse.1.f rename to parallel_execution/sources/collapse.1.f diff --git a/sources/Example_collapse.2.c b/parallel_execution/sources/collapse.2.c similarity index 100% rename from sources/Example_collapse.2.c rename to parallel_execution/sources/collapse.2.c diff --git a/sources/Example_collapse.2.f b/parallel_execution/sources/collapse.2.f similarity index 100% rename from sources/Example_collapse.2.f rename to parallel_execution/sources/collapse.2.f diff --git a/sources/Example_collapse.3.c b/parallel_execution/sources/collapse.3.c similarity index 100% rename from sources/Example_collapse.3.c rename to parallel_execution/sources/collapse.3.c diff --git a/sources/Example_collapse.3.f b/parallel_execution/sources/collapse.3.f similarity index 100% rename from sources/Example_collapse.3.f rename to parallel_execution/sources/collapse.3.f diff --git a/parallel_execution/sources/collapse.4.c b/parallel_execution/sources/collapse.4.c new file mode 100644 index 0000000..b43d525 --- /dev/null +++ b/parallel_execution/sources/collapse.4.c @@ -0,0 +1,40 @@ +/* +* @@name: collapse.4c +* @@type: C +* @@compilable: yes +* @@linkable: no +* @@expect: success +* @@version: omp_5.0 +*/ +#include +#define N 20 +#define M 10 + +// routine to calculate a +// For variable a[i]: +// a[i][0],...,a[i][n-1] contains the n samples +// a[i][n] contains the sample mean +// a[i][n+1] contains the standard deviation +extern void calc_a(int n,int m, float a[][N+2]); + +int main(){ + float a[M][N+2], b[M][M]; + + calc_a(N,M,a); + + #pragma omp parallel for collapse(2) + for (int i = 0; i < M; i++) + for (int j = i; j < M; j++) + { + float temp = 0.0f; 
+ for (int k = 0; k < N; k++) + temp += (a[i][k]-a[i][N])*(a[j][k]-a[j][N]); + + b[i][j] = temp / (a[i][N+1] * a[j][N+1] * (N - 1)); + b[j][i] = b[i][j]; + } + + printf("b[0][0] = %f, b[M-1][M-1] = %f\n", b[0][0], b[M-1][M-1]); + + return 0; +} diff --git a/parallel_execution/sources/collapse.4.f90 b/parallel_execution/sources/collapse.4.f90 new file mode 100644 index 0000000..c5a0653 --- /dev/null +++ b/parallel_execution/sources/collapse.4.f90 @@ -0,0 +1,45 @@ +! @@name: collapse.4f +! @@type: F-free +! @@compilable: yes +! @@linkable: no +! @@expect: success +! @@version: omp_5.0 +module calc_m + interface + subroutine calc_a(n, m, a) + integer n, m + real a(n+2,m) + ! routine to calculate a + ! For variable a(*,j): + ! a(1,j),...,a(n,j) contains the n samples + ! a(n+1,j) contains the sample mean + ! a(n+2,j) contains the standard deviation + end subroutine + end interface +end module + +program main + use calc_m + integer, parameter :: N=20, M=10 + real a(N+2,M), b(M,M) + real temp + integer i, j, k + + call calc_a(N,M,a) + + !$omp parallel do collapse(2) private(k,temp) + do i = 1, M + do j = i, M + temp = 0.0 + do k = 1, N + temp = temp + (a(k,i)-a(N+1,i))*(a(k,j)-a(N+1,j)) + end do + + b(i,j) = temp / (a(N+2,i) * a(N+2,j) * (N - 1)) + b(j,i) = b(i,j) + end do + end do + + print *,"b(1,1) = ",b(1,1),", b(M,M) = ",b(M,M) + +end program diff --git a/sources/Example_fort_do.1.f b/parallel_execution/sources/fort_do.1.f similarity index 100% rename from sources/Example_fort_do.1.f rename to parallel_execution/sources/fort_do.1.f diff --git a/sources/Example_fort_do.2.f b/parallel_execution/sources/fort_do.2.f similarity index 100% rename from sources/Example_fort_do.2.f rename to parallel_execution/sources/fort_do.2.f diff --git a/sources/Example_fpriv_sections.1.c b/parallel_execution/sources/fpriv_sections.1.c similarity index 100% rename from sources/Example_fpriv_sections.1.c rename to parallel_execution/sources/fpriv_sections.1.c diff --git 
a/sources/Example_fpriv_sections.1.f90 b/parallel_execution/sources/fpriv_sections.1.f90 similarity index 100% rename from sources/Example_fpriv_sections.1.f90 rename to parallel_execution/sources/fpriv_sections.1.f90 diff --git a/sources/Example_get_nthrs.1.c b/parallel_execution/sources/get_nthrs.1.c similarity index 100% rename from sources/Example_get_nthrs.1.c rename to parallel_execution/sources/get_nthrs.1.c diff --git a/sources/Example_get_nthrs.1.f b/parallel_execution/sources/get_nthrs.1.f similarity index 100% rename from sources/Example_get_nthrs.1.f rename to parallel_execution/sources/get_nthrs.1.f diff --git a/sources/Example_get_nthrs.2.c b/parallel_execution/sources/get_nthrs.2.c similarity index 100% rename from sources/Example_get_nthrs.2.c rename to parallel_execution/sources/get_nthrs.2.c diff --git a/sources/Example_get_nthrs.2.f b/parallel_execution/sources/get_nthrs.2.f similarity index 100% rename from sources/Example_get_nthrs.2.f rename to parallel_execution/sources/get_nthrs.2.f diff --git a/sources/Example_host_teams.1.c b/parallel_execution/sources/host_teams.1.c similarity index 100% rename from sources/Example_host_teams.1.c rename to parallel_execution/sources/host_teams.1.c diff --git a/sources/Example_host_teams.1.f90 b/parallel_execution/sources/host_teams.1.f90 similarity index 100% rename from sources/Example_host_teams.1.f90 rename to parallel_execution/sources/host_teams.1.f90 diff --git a/sources/Example_linear_in_loop.1.c b/parallel_execution/sources/linear_in_loop.1.c similarity index 100% rename from sources/Example_linear_in_loop.1.c rename to parallel_execution/sources/linear_in_loop.1.c diff --git a/sources/Example_linear_in_loop.1.f90 b/parallel_execution/sources/linear_in_loop.1.f90 similarity index 100% rename from sources/Example_linear_in_loop.1.f90 rename to parallel_execution/sources/linear_in_loop.1.f90 diff --git a/sources/Example_loop.1.c b/parallel_execution/sources/loop.1.c similarity index 100% rename from 
sources/Example_loop.1.c rename to parallel_execution/sources/loop.1.c diff --git a/sources/Example_loop.1.f90 b/parallel_execution/sources/loop.1.f90 similarity index 100% rename from sources/Example_loop.1.f90 rename to parallel_execution/sources/loop.1.f90 diff --git a/sources/Example_master.1.c b/parallel_execution/sources/masked.1.c similarity index 85% rename from sources/Example_master.1.c rename to parallel_execution/sources/masked.1.c index b11cb70..41e1e38 100644 --- a/sources/Example_master.1.c +++ b/parallel_execution/sources/masked.1.c @@ -1,15 +1,16 @@ /* -* @@name: master.1c +* @@name: masked.1c * @@type: C * @@compilable: yes * @@linkable: no * @@expect: success +* @@version: omp_5.1 */ #include extern float average(float,float,float); -void master_example( float* x, float* xold, int n, float tol ) +void masked_example( float* x, float* xold, int n, float tol ) { int c, i, toobig; float error, y; @@ -32,7 +33,7 @@ void master_example( float* x, float* xold, int n, float tol ) error = y - x[i]; if( error > tol || error < -tol ) ++toobig; } - #pragma omp master + #pragma omp masked { ++c; printf( "iteration %d, toobig=%d\n", c, toobig ); diff --git a/sources/Example_master.1.f b/parallel_execution/sources/masked.1.f similarity index 82% rename from sources/Example_master.1.f rename to parallel_execution/sources/masked.1.f index e27a4c3..ae293cc 100644 --- a/sources/Example_master.1.f +++ b/parallel_execution/sources/masked.1.f @@ -1,9 +1,10 @@ -! @@name: master.1f +! @@name: masked.1f ! @@type: F-fixed ! @@compilable: yes ! @@linkable: no ! @@expect: success - SUBROUTINE MASTER_EXAMPLE( X, XOLD, N, TOL ) +! @@version: omp_5.1 + SUBROUTINE MASKED_EXAMPLE( X, XOLD, N, TOL ) REAL X(*), XOLD(*), TOL INTEGER N INTEGER C, I, TOOBIG @@ -27,10 +28,10 @@ ERROR = Y-X(I) IF( ERROR > TOL .OR. 
ERROR < -TOL ) TOOBIG = TOOBIG+1 ENDDO -!$OMP MASTER +!$OMP MASKED C = C + 1 PRINT *, 'Iteration ', C, 'TOOBIG=', TOOBIG -!$OMP END MASTER +!$OMP END MASKED ENDDO !$OMP END PARALLEL - END SUBROUTINE MASTER_EXAMPLE + END SUBROUTINE MASKED_EXAMPLE diff --git a/sources/Example_nowait.1.c b/parallel_execution/sources/nowait.1.c similarity index 100% rename from sources/Example_nowait.1.c rename to parallel_execution/sources/nowait.1.c diff --git a/sources/Example_nowait.1.f b/parallel_execution/sources/nowait.1.f similarity index 100% rename from sources/Example_nowait.1.f rename to parallel_execution/sources/nowait.1.f diff --git a/sources/Example_nowait.2.c b/parallel_execution/sources/nowait.2.c similarity index 100% rename from sources/Example_nowait.2.c rename to parallel_execution/sources/nowait.2.c diff --git a/sources/Example_nowait.2.f90 b/parallel_execution/sources/nowait.2.f90 similarity index 100% rename from sources/Example_nowait.2.f90 rename to parallel_execution/sources/nowait.2.f90 diff --git a/sources/Example_nthrs_dynamic.1.c b/parallel_execution/sources/nthrs_dynamic.1.c similarity index 100% rename from sources/Example_nthrs_dynamic.1.c rename to parallel_execution/sources/nthrs_dynamic.1.c diff --git a/sources/Example_nthrs_dynamic.1.f b/parallel_execution/sources/nthrs_dynamic.1.f similarity index 100% rename from sources/Example_nthrs_dynamic.1.f rename to parallel_execution/sources/nthrs_dynamic.1.f diff --git a/sources/Example_nthrs_dynamic.2.c b/parallel_execution/sources/nthrs_dynamic.2.c similarity index 100% rename from sources/Example_nthrs_dynamic.2.c rename to parallel_execution/sources/nthrs_dynamic.2.c diff --git a/sources/Example_nthrs_dynamic.2.f b/parallel_execution/sources/nthrs_dynamic.2.f similarity index 100% rename from sources/Example_nthrs_dynamic.2.f rename to parallel_execution/sources/nthrs_dynamic.2.f diff --git a/sources/Example_nthrs_nesting.1.c b/parallel_execution/sources/nthrs_nesting.1.c similarity index 100% 
rename from sources/Example_nthrs_nesting.1.c rename to parallel_execution/sources/nthrs_nesting.1.c diff --git a/sources/Example_nthrs_nesting.1.f b/parallel_execution/sources/nthrs_nesting.1.f similarity index 100% rename from sources/Example_nthrs_nesting.1.f rename to parallel_execution/sources/nthrs_nesting.1.f diff --git a/sources/Example_parallel.1.c b/parallel_execution/sources/parallel.1.c similarity index 100% rename from sources/Example_parallel.1.c rename to parallel_execution/sources/parallel.1.c diff --git a/sources/Example_parallel.1.f b/parallel_execution/sources/parallel.1.f similarity index 100% rename from sources/Example_parallel.1.f rename to parallel_execution/sources/parallel.1.f diff --git a/sources/Example_ploop.1.c b/parallel_execution/sources/ploop.1.c similarity index 100% rename from sources/Example_ploop.1.c rename to parallel_execution/sources/ploop.1.c diff --git a/sources/Example_ploop.1.f b/parallel_execution/sources/ploop.1.f similarity index 100% rename from sources/Example_ploop.1.f rename to parallel_execution/sources/ploop.1.f diff --git a/sources/Example_pra_iterator.1.cpp b/parallel_execution/sources/pra_iterator.1.cpp similarity index 100% rename from sources/Example_pra_iterator.1.cpp rename to parallel_execution/sources/pra_iterator.1.cpp diff --git a/sources/Example_psections.1.c b/parallel_execution/sources/psections.1.c similarity index 100% rename from sources/Example_psections.1.c rename to parallel_execution/sources/psections.1.c diff --git a/sources/Example_psections.1.f b/parallel_execution/sources/psections.1.f similarity index 100% rename from sources/Example_psections.1.f rename to parallel_execution/sources/psections.1.f diff --git a/sources/Example_set_dynamic_nthrs.1.c b/parallel_execution/sources/set_dynamic_nthrs.1.c similarity index 100% rename from sources/Example_set_dynamic_nthrs.1.c rename to parallel_execution/sources/set_dynamic_nthrs.1.c diff --git a/sources/Example_set_dynamic_nthrs.1.f 
b/parallel_execution/sources/set_dynamic_nthrs.1.f similarity index 100% rename from sources/Example_set_dynamic_nthrs.1.f rename to parallel_execution/sources/set_dynamic_nthrs.1.f diff --git a/sources/Example_single.1.c b/parallel_execution/sources/single.1.c similarity index 100% rename from sources/Example_single.1.c rename to parallel_execution/sources/single.1.c diff --git a/sources/Example_single.1.f b/parallel_execution/sources/single.1.f similarity index 100% rename from sources/Example_single.1.f rename to parallel_execution/sources/single.1.f diff --git a/sources/Example_workshare.1.f b/parallel_execution/sources/workshare.1.f similarity index 100% rename from sources/Example_workshare.1.f rename to parallel_execution/sources/workshare.1.f diff --git a/sources/Example_workshare.2.f b/parallel_execution/sources/workshare.2.f similarity index 100% rename from sources/Example_workshare.2.f rename to parallel_execution/sources/workshare.2.f diff --git a/sources/Example_workshare.3.f b/parallel_execution/sources/workshare.3.f similarity index 100% rename from sources/Example_workshare.3.f rename to parallel_execution/sources/workshare.3.f diff --git a/sources/Example_workshare.4.f b/parallel_execution/sources/workshare.4.f similarity index 100% rename from sources/Example_workshare.4.f rename to parallel_execution/sources/workshare.4.f diff --git a/sources/Example_workshare.5.f b/parallel_execution/sources/workshare.5.f similarity index 100% rename from sources/Example_workshare.5.f rename to parallel_execution/sources/workshare.5.f diff --git a/sources/Example_workshare.6.f b/parallel_execution/sources/workshare.6.f similarity index 100% rename from sources/Example_workshare.6.f rename to parallel_execution/sources/workshare.6.f diff --git a/sources/Example_workshare.7.f b/parallel_execution/sources/workshare.7.f similarity index 100% rename from sources/Example_workshare.7.f rename to parallel_execution/sources/workshare.7.f diff --git 
a/Examples_workshare.tex b/parallel_execution/workshare.tex similarity index 98% rename from Examples_workshare.tex rename to parallel_execution/workshare.tex index ea1207f..4e15b1c 100644 --- a/Examples_workshare.tex +++ b/parallel_execution/workshare.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{workshare} Construct} +\section{\code{workshare} Construct} \fortranspecificstart \label{sec:workshare} diff --git a/Examples_cancellation.tex b/program_control/cancellation.tex similarity index 92% rename from Examples_cancellation.tex rename to program_control/cancellation.tex index 7ebcd60..f2ee3dd 100644 --- a/Examples_cancellation.tex +++ b/program_control/cancellation.tex @@ -6,7 +6,7 @@ The following example shows how the \code{cancel} directive can be used to termi an OpenMP region. Although the \code{cancel} construct terminates the OpenMP worksharing region, programmers must still track the exception through the pointer ex and issue a cancellation for the \code{parallel} region if an exception has -been raised. The master thread checks the exception pointer to make sure that the +been raised. The primary thread checks the exception pointer to make sure that the exception is properly handled in the sequential part. If cancellation of the \code{parallel} region has been requested, some threads might have executed \code{phase\_1()}. However, it is guaranteed that none of the threads executed \code{phase\_2()}. @@ -34,11 +34,11 @@ task group to control the effect of the \code{cancel taskgroup} directive. The \plc{level} argument is used to create undeferred tasks after the first ten levels of the tree. -\cexample[4.0]{cancellation}{2} +\cexample[5.1]{cancellation}{2} The following is the equivalent parallel search example in Fortran. 
-\ffreeexample[4.0]{cancellation}{2} +\ffreeexample[5.1]{cancellation}{2}[1] diff --git a/Examples_cond_comp.tex b/program_control/cond_comp.tex similarity index 100% rename from Examples_cond_comp.tex rename to program_control/cond_comp.tex diff --git a/Examples_icv.tex b/program_control/icv.tex similarity index 100% rename from Examples_icv.tex rename to program_control/icv.tex diff --git a/program_control/interop.tex b/program_control/interop.tex new file mode 100644 index 0000000..6902067 --- /dev/null +++ b/program_control/interop.tex @@ -0,0 +1,28 @@ +\pagebreak +\section{\code{interop} Construct} +\label{sec:interop} + +The \scode{interop} construct allows OpenMP to interoperate with foreign runtime environments. +In the example below, asynchronous cuda memory copies and a \splc{cublasDaxpy} routine are executed +in a cuda stream. Also, an asynchronous target task execution (having a \scode{nowait} clause) +and two explicit tasks are executed through OpenMP directives. Scheduling dependences (synchronization) are +imposed on the foreign stream and the OpenMP tasks through \scode{depend} clauses. + +First, an interop object, \splc{obj}, is initialized for synchronization by including the +\scode{targetsync} \splc{interop-type} in the interop \scode{init} clause +(\scode{init(}~\scode{targetsync,obj}~\scode{)}). +The object provides access to the foreign runtime. +The \scode{depend} clause provides a dependence behavior +for foreign tasks associated with a valid object. + +Next, the \scode{omp_get_interop_int} routine is used to extract the foreign +runtime id (\scode{omp_ipr_fr_id}), and a test in the next statement ensures +that the cuda runtime (\scode{omp_ifr_cuda}) is available. + +Within the block for executing the \splc{cublasDaxpy} routine, a stream is acquired +with the \scode{omp_get_interop_ptr} routine, which returns a cuda stream (\splc{s}). +The stream is included in the cublas handle, and used directly in the asynchronous memory +routines. 
The following \scode{interop} construct, with the \scode{destroy} clause, +ensures that the foreign tasks have completed. + +\cexample[5.1]{interop}{1} diff --git a/Examples_metadirective.tex b/program_control/metadirective.tex similarity index 69% rename from Examples_metadirective.tex rename to program_control/metadirective.tex index a3c1340..efcdb45 100644 --- a/Examples_metadirective.tex +++ b/program_control/metadirective.tex @@ -1,5 +1,5 @@ \pagebreak -\section{Metadirective Directive} +\section{Metadirectives} \label{sec:metadirective} A \code{metadirective} directive provides a mechanism to select a directive in @@ -51,8 +51,6 @@ any clauses, as prescribed by the \code{default} clause. \ffreeexample[5.0]{metadirective}{2} -\clearpage - %\pagebreak In the third example, a \plc{construct} selector set is specified in the \code{when} clause. Here, a \code{metadirective} directive is used within a function that is also @@ -86,3 +84,40 @@ as the \plc{variant} directive of the \code{metadirective} directive within the \cexample[5.0]{metadirective}{3} \ffreeexample[5.0]{metadirective}{3} + +The \code{user} selector set can be used in a metadirective +to select directives at execution time when the +\code{condition(}~\plc{boolean-expr}~\code{)} selector expression is not a constant expression. +In this case it is a \plc{dynamic} trait set, and the selection is made at run time, rather +than at compile time. + +In the following example the \plc{foo} function employs the \code{condition} +selector to choose a device for execution at run time. +In the \plc{bar} routine metadirectives are nested. +At the outer level a selection between serial and parallel execution is performed +at run time, followed by another run time selection on the schedule kind in the inner +level when the active \plc{construct} trait is \code{parallel}. 
+ +(Note, the variable \plc{b} in two of the ``selected'' constructs is declared private for the sole purpose +of detecting and reporting that the construct is used. Since the variable is private, its value +is unchanged outside of the construct region, whereas it is changed if the ``unselected'' construct +is used.) + +%(Note: The value of \plc{b} after the \code{parallel} region remains 0 for the +%\code{guided} scheduling case, because its \code{parallel} construct also contains +%the \code{private(}~\plc{b}~\code{)} clause. +%The variable \plc{b} is employed for the sole purpose of distinguishing which +%\code{parallel} construct is selected-- for testing.) + +%While there might be other ways to make these decisions at run time, such as using +%an \code{if} clause on a \code{parallel} construct, this mechanism is much more general. +%For instance, an input ``gpu\_type'' string could be used and tested in boolean expressions +%to select from one of several possible \code{target} constructs. +%Also, setting the scheduling variable (\plc{unbalanced}) within the execution through a +%``work balance'' function might be a more practical approach for setting the schedule kind. 
+ + +\cexample[5.1]{metadirective}{4} + +\ffreeexample[5.1]{metadirective}{4} + diff --git a/Examples_nested_loop.tex b/program_control/nested_loop.tex similarity index 100% rename from Examples_nested_loop.tex rename to program_control/nested_loop.tex diff --git a/Examples_nesting_restrict.tex b/program_control/nesting_restrict.tex similarity index 100% rename from Examples_nesting_restrict.tex rename to program_control/nesting_restrict.tex diff --git a/Examples_requires.tex b/program_control/requires.tex similarity index 96% rename from Examples_requires.tex rename to program_control/requires.tex index b53b399..a5c5276 100644 --- a/Examples_requires.tex +++ b/program_control/requires.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{requires} Directive} +\section{\code{requires} Directive} \label{sec:requires} The declarative \code{requires} directive can be used to diff --git a/sources/Example_cancellation.1.cpp b/program_control/sources/cancellation.1.cpp similarity index 100% rename from sources/Example_cancellation.1.cpp rename to program_control/sources/cancellation.1.cpp diff --git a/sources/Example_cancellation.1.f90 b/program_control/sources/cancellation.1.f90 similarity index 100% rename from sources/Example_cancellation.1.f90 rename to program_control/sources/cancellation.1.f90 diff --git a/sources/Example_cancellation.2.c b/program_control/sources/cancellation.2.c similarity index 94% rename from sources/Example_cancellation.2.c rename to program_control/sources/cancellation.2.c index 283ce7c..2170277 100644 --- a/sources/Example_cancellation.2.c +++ b/program_control/sources/cancellation.2.c @@ -4,8 +4,12 @@ * @@compilable: yes * @@linkable: no * @@expect: success -* @@version: omp_4.0 +* @@version: omp_5.1 */ +#if _OPENMP < 202011 +#define masked master +#endif + #include typedef struct binary_tree_s { @@ -49,7 +53,7 @@ binary_tree_t *search_tree_parallel(binary_tree_t *tree, int value) { binary_tree_t *found = NULL; #pragma omp parallel 
shared(found, tree, value) { -#pragma omp master +#pragma omp masked { #pragma omp taskgroup { diff --git a/sources/Example_cancellation.2.f90 b/program_control/sources/cancellation.2.f90 similarity index 92% rename from sources/Example_cancellation.2.f90 rename to program_control/sources/cancellation.2.f90 index 590ed04..2934724 100644 --- a/sources/Example_cancellation.2.f90 +++ b/program_control/sources/cancellation.2.f90 @@ -1,9 +1,14 @@ ! @@name: cancellation.2f ! @@type: F-free ! @@compilable: yes +! @@requires: preprocessing ! @@linkable: no ! @@expect: success -! @@version: omp_4.0 +! @@version: omp_5.1 +#if _OPENMP < 202011 +#define masked master +#endif + module parallel_search type binary_tree integer :: value @@ -56,11 +61,11 @@ contains found => NULL() !$omp parallel shared(found, tree, value) -!$omp master +!$omp masked !$omp taskgroup call search_tree(tree, value, 0, found) !$omp end taskgroup -!$omp end master +!$omp end masked !$omp end parallel end subroutine diff --git a/sources/Example_cond_comp.1.c b/program_control/sources/cond_comp.1.c similarity index 100% rename from sources/Example_cond_comp.1.c rename to program_control/sources/cond_comp.1.c diff --git a/sources/Example_cond_comp.1.f b/program_control/sources/cond_comp.1.f similarity index 100% rename from sources/Example_cond_comp.1.f rename to program_control/sources/cond_comp.1.f diff --git a/sources/Example_declare_variant.1.c b/program_control/sources/declare_variant.1.c similarity index 100% rename from sources/Example_declare_variant.1.c rename to program_control/sources/declare_variant.1.c diff --git a/sources/Example_declare_variant.1.f90 b/program_control/sources/declare_variant.1.f90 similarity index 100% rename from sources/Example_declare_variant.1.f90 rename to program_control/sources/declare_variant.1.f90 diff --git a/sources/Example_declare_variant.2.c b/program_control/sources/declare_variant.2.c similarity index 100% rename from sources/Example_declare_variant.2.c rename 
to program_control/sources/declare_variant.2.c diff --git a/sources/Example_declare_variant.2.f90 b/program_control/sources/declare_variant.2.f90 similarity index 100% rename from sources/Example_declare_variant.2.f90 rename to program_control/sources/declare_variant.2.f90 diff --git a/program_control/sources/display_env.1.c b/program_control/sources/display_env.1.c new file mode 100644 index 0000000..2519313 --- /dev/null +++ b/program_control/sources/display_env.1.c @@ -0,0 +1,20 @@ +/* +* @@name: display_env.1.c +* @@type: C +* @@compilable: yes +* @@linkable: yes +* @@expect: success +* @@version: omp_5.1 +*/ +#include + +//implementers: customize debug routines for app debugging +int debug(){ return 1; } +int debug_omp_verbose(){ return 0; } + +int main() +{ + if( debug() ) omp_display_env( debug_omp_verbose() ); + // ... + return 0; +} diff --git a/program_control/sources/display_env.1.f90 b/program_control/sources/display_env.1.f90 new file mode 100644 index 0000000..a0f813f --- /dev/null +++ b/program_control/sources/display_env.1.f90 @@ -0,0 +1,25 @@ +! @@name: display_env.1.f90 +! @@type: F-free +! @@compilable: yes +! @@linkable: yes +! @@expect: success +! @@version: omp_5.1 +!implementers: customize debug routines for app debugging +function debug() + logical :: debug + debug = .true. +end function + +function debug_omp_verbose() + logical :: debug_omp_verbose + debug_omp_verbose = .false. +end function + +program display_omp_environment + use omp_lib + logical :: debug, debug_omp_verbose + + if( debug() ) call omp_display_env( debug_omp_verbose() ) + !! ... 
+end program + diff --git a/program_control/sources/error.1.c b/program_control/sources/error.1.c new file mode 100644 index 0000000..ce7a980 --- /dev/null +++ b/program_control/sources/error.1.c @@ -0,0 +1,35 @@ +/* +* @@name: error.1c +* @@type: C +* @@compilable: yes +* @@linkable: yes +* @@expect: success +* @@version: omp_5.1 +*/ +#include +#include + +int main(){ + +#pragma omp metadirective \ + when( implementation={vendor(gnu)}: nothing ) \ + default(error at(compilation) severity(fatal) \ + message("GNU compiler required.")) + + if( omp_get_num_procs() < 3 ){ + #pragma omp error at(runtime) severity(fatal) \ + message("3 or more procs required.") + } + + #pragma omp parallel master + { + // Give notice about master deprecation at compile time and run time. + #pragma omp error at(compilation) severity(warning) \ + message("Notice: master is deprecated.") + #pragma omp error at(runtime) severity(warning) \ + message("Notice: masked used next release.") + + printf(" Hello from thread number 0.\n"); + } + +} diff --git a/program_control/sources/error.1.f90 b/program_control/sources/error.1.f90 new file mode 100644 index 0000000..2772e24 --- /dev/null +++ b/program_control/sources/error.1.f90 @@ -0,0 +1,34 @@ +! @@name: error.1f +! @@type: F-free +! @@compilable: yes +! @@linkable: yes +! @@expect: success +! @@version: omp_5.1 + +program main +use omp_lib + +!$omp metadirective & +!$omp& when( implementation={vendor(gnu)}: nothing ) & +!$omp& default( error at(compilation) severity(fatal) & +!$omp& message( "GNU compiler required." ) ) + + +if( omp_get_num_procs() < 3 ) then + !$omp error at(runtime) severity(fatal) & + !$omp& message("3 or more procs required.") +endif + + !$omp parallel master + +!! Give notice about master deprecation at compile time and run time. 
+ !$omp error at(compilation) severity(warning) & + !$omp& message("Notice: master is deprecated.") + !$omp error at(runtime) severity(warning) & + !$omp& message("Notice: masked to be used in next release.") + + print*," Hello from thread number 0." + + !$omp end parallel master + +end program diff --git a/program_control/sources/get_wtime.1.c b/program_control/sources/get_wtime.1.c new file mode 100644 index 0000000..d85bf95 --- /dev/null +++ b/program_control/sources/get_wtime.1.c @@ -0,0 +1,28 @@ +/* +* @@name: get_wtime.1c +* @@type: C +* @@compilable: yes +* @@linkable: yes +* @@expect: success +*/ +#include +#include +#include + +void work_to_be_timed() +{ + sleep(2); +} + +int main() +{ + double start, end; + + start = omp_get_wtime(); + work_to_be_timed(); // any parallel or serial codes + end = omp_get_wtime(); + + printf("Work took %f seconds\n", end - start); + printf("Precision of the timer is %f (sec)\n", omp_get_wtick()); + return 0; +} diff --git a/program_control/sources/get_wtime.1.f90 b/program_control/sources/get_wtime.1.f90 new file mode 100644 index 0000000..80e9580 --- /dev/null +++ b/program_control/sources/get_wtime.1.f90 @@ -0,0 +1,28 @@ +! @@name: get_wtime.1f +! @@type: F-free +! @@compilable: yes +! @@linkable: yes +! @@expect: success +subroutine work_to_be_timed + use, intrinsic :: iso_c_binding, only: c_int + interface + subroutine fsleep(sec) bind(C, name="sleep") + import c_int + integer(c_int), value :: sec + end subroutine + end interface + call fsleep(2) +end subroutine + +program do_work + use omp_lib + implicit none + double precision :: start, end + + start = omp_get_wtime() + call work_to_be_timed ! 
any parallel or serial codes + end = omp_get_wtime() + + print *, "Work took", end - start, "seconds" + print *, "Precision of the timer is", omp_get_wtick(), "(sec)" +end program diff --git a/sources/Example_icv.1.c b/program_control/sources/icv.1.c similarity index 100% rename from sources/Example_icv.1.c rename to program_control/sources/icv.1.c diff --git a/sources/Example_icv.1.f b/program_control/sources/icv.1.f similarity index 100% rename from sources/Example_icv.1.f rename to program_control/sources/icv.1.f diff --git a/program_control/sources/interop.1.c b/program_control/sources/interop.1.c new file mode 100644 index 0000000..73cf052 --- /dev/null +++ b/program_control/sources/interop.1.c @@ -0,0 +1,106 @@ +/* +* @@name: interop.1c +* @@type: C +* @@compilable: no +* @@linkable: no +* @@expect: success +* @@version: omp_5.1 +*/ +#include +#include +#include +#include +#include + +#define N 16384 + +void myVectorSet(int n, double s, double *x) +{ + for(int i=0; i +#include +#include + +void foo(int *a, int n, bool use_gpu) +{ + int b=0; // use b to detect if run on gpu + + #pragma omp metadirective \ + when( user={condition(use_gpu)}: \ + target teams distribute parallel for \ + private(b) map(from:a[0:n]) ) \ + default( parallel for ) + for (int i=0; i= 201811) printf("ERROR: OpenMP 5.0 implementation ignored MANDATORY policy.\n"); - printf("Target region executed on init dev %s\n", on_init_dev ? "TRUE":"FALSE"); + printf("Target region executed on init dev %s\n", + on_init_dev ? 
"TRUE":"FALSE"); return 0; } diff --git a/sources/Example_target_offload_control.1.f90 b/program_control/sources/target_offload_control.1.f90 similarity index 100% rename from sources/Example_target_offload_control.1.f90 rename to program_control/sources/target_offload_control.1.f90 diff --git a/Examples_standalone.tex b/program_control/standalone.tex similarity index 100% rename from Examples_standalone.tex rename to program_control/standalone.tex diff --git a/Examples_target_offload.tex b/program_control/target_offload.tex similarity index 100% rename from Examples_target_offload.tex rename to program_control/target_offload.tex diff --git a/program_control/utilities.tex b/program_control/utilities.tex new file mode 100644 index 0000000..16752ff --- /dev/null +++ b/program_control/utilities.tex @@ -0,0 +1,96 @@ +\pagebreak +\section{Utilities} +\label{sec:utilities} +This section contains examples of utility routines and features. + +%--------------------------- +\subsection{Timing Routines} +\label{subsec:get_wtime} + +The \scode{omp_get_wtime} routine can be used to measure the elapsed wall +clock time (in seconds) of code execution in a program. +The routine is thread safe and can be executed by multiple threads concurrently. +The precision of the timer can be obtained by a call to +the \scode{omp_get_wtick} routine. The following example shows a use case. + +\cexample{get_wtime}{1} + +\ffreeexample{get_wtime}{1} + + +%--------------------------- +\subsection{Environment Display} +\label{subsec:display_env} + +The OpenMP version number and the values of ICVs associated with the relevant +environment variables can be displayed at runtime by setting +the \scode{OMP_DISPLAY_ENV} environment variable to either +\code{TRUE} or \code{VERBOSE}. +The information is displayed once by the runtime. + +A more flexible or controllable approach is to call +the \scode{omp_display_env} API routine at any desired +point of a code to display the same information. 
+This OpenMP 5.1 API routine takes a single \plc{verbose} argument. +A value of 0 (C/C++) or .false. (Fortran) indicates that +the required OpenMP ICVs associated with environment variables are displayed, +and a value of 1 (C/C++) or .true. (Fortran) additionally includes +vendor-specific ICVs that can be modified by environment variables. + +The following example illustrates the conditional execution of the +\scode{omp_display_env} API routine. Typically it would be invoked in +various debug modes of an application. +An important use case is to have a single MPI process (e.g., rank = 0) +of a hybrid (MPI+OpenMP) code execute the routine, +instead of all MPI processes, as would be done by +setting the \scode{OMP_DISPLAY_ENV} environment variable to \code{TRUE} or \code{VERBOSE}. + +\cexample[5.1]{display_env}{1} + +\ffreeexample[5.1]{display_env}{1} +\clearpage + +\emph{Note}: +A sample output from the execution of the code might look like: +{\small\begin{verbatim} + OPENMP DISPLAY ENVIRONMENT BEGIN + _OPENMP='202011' + [host] OMP_AFFINITY_FORMAT='(null)' + [host] OMP_ALLOCATOR='omp_default_mem_alloc' + [host] OMP_CANCELLATION='FALSE' + [host] OMP_DEFAULT_DEVICE='0' + [host] OMP_DISPLAY_AFFINITY='FALSE' + [host] OMP_DISPLAY_ENV='FALSE' + [host] OMP_DYNAMIC='FALSE' + [host] OMP_MAX_ACTIVE_LEVELS='1' + [host] OMP_MAX_TASK_PRIORITY='0' + [host] OMP_NESTED: deprecated; max-active-levels-var=1 + [host] OMP_NUM_THREADS: value is not defined + [host] OMP_PLACES: value is not defined + [host] OMP_PROC_BIND: value is not defined + [host] OMP_SCHEDULE='static' + [host] OMP_STACKSIZE='4M' + [host] OMP_TARGET_OFFLOAD=DEFAULT + [host] OMP_THREAD_LIMIT='0' + [host] OMP_TOOL='enabled' + [host] OMP_TOOL_LIBRARIES: value is not defined + OPENMP DISPLAY ENVIRONMENT END +\end{verbatim}} + + +%--------------------------- +\subsection{\code{error} Directive} +\label{subsec:error} + +The \code{error} directive provides a consistent method for C, C++, and Fortran to emit a \plc{fatal} or +\plc{warning} message at 
\plc{compilation} or \plc{execution} time, as determined by a \code{severity} +or an \code{at} clause, respectively. When \code{severity(fatal)} is present, the compilation +or execution is aborted. Without any clauses the default behavior is as if \code{at(compilation)} +and \code{severity(fatal)} were specified. + +The C, C++, and Fortran examples below show all the cases for reporting messages. + +\cexample[5.1]{error}{1} +\ffreeexample[5.1]{error}{1} + + diff --git a/Examples_variant.tex b/program_control/variant.tex similarity index 100% rename from Examples_variant.tex rename to program_control/variant.tex diff --git a/sources/Example_affinity.5.c b/sources/Example_affinity.5.c deleted file mode 100644 index e894689..0000000 --- a/sources/Example_affinity.5.c +++ /dev/null @@ -1,17 +0,0 @@ -/* -* @@name: affinity.5c -* @@type: C -* @@compilable: yes -* @@linkable: yes -* @@expect: success -* @@version: omp_4.0 -*/ -void work(); -int main() -{ -#pragma omp parallel proc_bind(master) num_threads(4) - { - work(); - } - return 0; -} diff --git a/sources/Example_affinity.5.f b/sources/Example_affinity.5.f deleted file mode 100644 index 42e82ff..0000000 --- a/sources/Example_affinity.5.f +++ /dev/null @@ -1,11 +0,0 @@ -! @@name: affinity.5f -! @@type: F-fixed -! @@compilable: yes -! @@linkable: yes -! @@expect: success -! 
@@version: omp_4.0 - PROGRAM EXAMPLE -!$OMP PARALLEL PROC_BIND(MASTER) NUM_THREADS(4) - CALL WORK() -!$OMP END PARALLEL - END PROGRAM EXAMPLE diff --git a/sources/README b/sources/README new file mode 100644 index 0000000..193c0c8 --- /dev/null +++ b/sources/README @@ -0,0 +1,35 @@ +The source codes for examples in each chapter are in the sources directory +under the corresponding chapter directory: + +../SIMD/sources +../affinity/sources +../data_environment/sources +../devices/sources +../loop_transformations/sources +../memory_model/sources +../parallel_execution/sources +../program_control/sources +../synchronization/sources +../tasking/sources + +This directory contains the test_codes script that performs +a quick compilation test of all example codes (default). The +test results are stored in the test_codes.log file. The script +tries to automatically detect a compiler. Set the comp_c, comp_cpp +and comp_f variables if that doesn't work. + + The test script (test_codes) is + + ***** for reference purposes only and is NOT intended to validate ***** + + OpenMP compliance of a compiler. + +July 31 2021 Changes +* Automatic compiler determination (no need to specify comp_c, etc.) 
+* Automatic version (DATE/NUMBER) determination +* Automatic Backdown (replacement) for deprecation if compiler is not 5.1/5.0 + (replacements: masked->master,primary->master, *sync_hint*->*lock_hint*) +* Automatic -c compiler flag for "linkable=no" metadata +* Utilities file, test_utils, created for clean coding +* Command line now takes file name argument (just file name, not path) +* sourceme file: puts test_codes in PATH, creates tc alias, defines OMP_BASE_DI diff --git a/sources/sourceme b/sources/sourceme new file mode 100644 index 0000000..aef5535 --- /dev/null +++ b/sources/sourceme @@ -0,0 +1,3 @@ +export PATH=`pwd`:$PATH #put test_codes in path +export OMP_BASE_DIR=$( dirname `pwd` ) #directory above pwd is base +alias tc=test_codes diff --git a/sources/test_code_one b/sources/test_code_one deleted file mode 100644 index 2fb738b..0000000 --- a/sources/test_code_one +++ /dev/null @@ -1,15 +0,0 @@ -set opts="" -if ( $aopt == 1 ) then - set opts=" $copt" -else if ( $aopt != 0) then - set sstr=omp.h - if ( "$1" == "f" || "$1" == "f90" ) set sstr="use omp_lib|omp_lib.h" - egrep "$sstr" $f >& /dev/null - if ( $status == 0 ) set opts=" $copt" -endif -echo "### ${f}$opts" | tee -a $logf -$comp -c$opts $f >>& $logf -if ( $status != 0 ) then - echo ">>> $f FAILED" | tee -a $logf - @ cntfail++; -endif diff --git a/sources/test_codes b/sources/test_codes index e880f71..e24ad8a 100755 --- a/sources/test_codes +++ b/sources/test_codes @@ -1,55 +1,241 @@ -#!/bin/csh +#!/bin/bash -set aopt=2 #default(auto), 0:no-copt, 1:with-copt -if ( "$1" == "no-copt" || "$1" == "0" ) then - set aopt=0 -else if ( "$1" == "with-copt"i || "$1" == "1" ) then - set aopt=1 -else if ( "$1" == "auto" || "$1" == "2" ) then - set aopt=2 -else if ( "$1" == "-h" ) then - echo "usage: test_codes [no-copt | with-copt | auto]" - exit -endif +#******************************************************************************************* +# Author Henry Jin +# test example source-codes with C, C++ and 
Fortran compilers +# set in comp_* variables. -set copt="-fopenmp" -set cntc=0 -set cntcpp=0 -set cntf=0 -set cntffree=0 -set cntfail=0 -set logf=test_codes.log -set comp=gcc -foreach f (*.c) - @ cntc++; - source test_code_one c -end +# Additions 2021-07-30 Kent Milfeld +# Command line entry of files to be tested. +# Automatic compiler determination +# Automatic version (DATE/NUMBER) determination +# Automatic Backdown for deprecation if compiler is not 5.1 +# (masked->master,primary->master, *sync_hint*->*lock_hint*) +# Automatic -c compiler flag for "linkable=no" metadata +# save command line option to save preprocessed files (have exmpl_ prefix) +# Utilities file, test_utils, created for clean coding +#******************************************************************************************* -set comp=g++ -foreach f (*.cpp) - @ cntcpp++; - source test_code_one cpp -end +DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" > /dev/null && pwd )" #location of test_codes +BASE_DIR=$(dirname $DIR) #Top level directory -set comp=gfortran -foreach f (*.f) - @ cntf++; - source test_code_one f -end +source test_utils -set comp=gfortran -foreach f (*.f90) - @ cntffree++; - source test_code_one f90 -end +# Hard code Compiler names and flags here: comp_c, comp_cpp, comp_f, omp_flag, fpp_flag +# Keep comp_c="" to automatically determine compiler parameters +# Specifying compiler system (amd,cray,gnu,ibm,intel) on command line, or +# setting env. var. COMP (to AMD,CRAY,GNU,IBM,INTEL), +# will override specifying compiler here, and auto detection. 
-echo "" | tee -a $logf -echo "Total number of C examples: $cntc" | tee -a $logf -echo "Total number of C++ examples: $cntcpp" | tee -a $logf -echo "Total number of F-fixed examples: $cntf" | tee -a $logf -echo "Total number of F-free examples: $cntffree" | tee -a $logf -echo "Total number of failed examples: $cntfail" | tee -a $logf +comp_c="" +comp_cpp="" +comp_f="" +omp_flag="" +fpp_flag="" -\rm -f *.o *.mod +#COMP=GNU #(or AMD,CRAY,IBM,INTEL) forces one of these compiler systems (list in test_utils) +#Compiler version DATE (e.g. 202011) and version NUMBER (e.g. 5.1) +#Function below will set these for you. +#If you set them here, auto detection function will not be called. + +VER_DATE="" +VER_NUM="" +NA_VER_NUMS=() + +#export OMP_DEBUG=1 #you can set this outside of test_codes + +printf "\n **** test_codes ONLY tests compilation, and linking (when linkable=yes). ****" +printf "\n **** test_codes does NOT validate examples. It just compiles them. ****\n\n" + +#=========================== No need to change code below =================================== + +# command line option +aopt=2 #default(auto), 0:no-omp, 1:with-omp + +while (($#)); do + case "$1" in + no-omp | -no-omp ) aopt=0 ;; + with-omp | -with-omp) aopt=1 ;; + auto | -auto ) aopt=2 ;; + save | -save ) save_pp_files=on ;; + help | -h |-help) + echo "USAGE: test_codes [options] + + options: meaning + -no-omp : no OpenMP compilation flag + -with-omp : with OpenMP compilation flag + -auto : (default) auto-detection + based on include files + -save : save preprocessed files + : one or more example codes to test " + exit ;; + + amd | AMD | oacc | OACC ) COMP=AMD comp_c="" ;; #comp_c="" => call get_compiler_commands_and_omp_flag + cray | CRAY | cce | CCE ) COMP=CRAY comp_c="" ;; + gnu | GNU ) COMP=GNU comp_c="" ;; + ibm | IBM | pwr | PWR ) COMP=IBM comp_c="" ;; + intel | INTEL ) COMP=INTEL comp_c="" ;; + #These will override auto detection, and comp_x setting below. 
+ + *.f* ) FILES+=($1) ;; + *.c* ) FILES+=($1) ;; + esac + shift +done +# Set Compiler comp_c, comp_cpp, comp_f, omp_flag, fpp_flag +# *** If comp_c is set, it is assumed all above comp_x/x_flag are set +[[ -z $comp_c ]] && get_compiler_commands_and_omp_flag +[[ -z $VER_DATE ]] && get_compiler_version_date # can set as env. var. for testing +[[ -z $NA_VER_NUMS ]] && get_compiler_version_number # get VER_NUMs which compiler doesn't support (NA) + +[[ ! -z $NA_VER_NUMS ]] && REPLACE=ON + +echo -e " >>> TESTING: ${FILES[@]}\n" + +# function to test one code ($f) +test_one_code() { + local link_opt inc_suffix inc_opt comment fort_opts tested="compile & link" + + f_base=`basename $f` + ef=exmpl_${f_base} + ext=$1 + \cp -f $f $ef + opts="" + + if [ $aopt -eq 1 ]; then + opts=" $copt" + elif [ $aopt -ne 0 ]; then + sstr=omp.h + if [ "$ext" == "f" -o "$ext" == "f90" ]; then + sstr="use omp_lib|omp_lib.h" #Can non-header files require copt? + fi #Why not always have copt? KM + egrep "$sstr" $ef > /dev/null 2>&1 && opts=" $copt" + fi + + grep -ie '@@linkable\s*:\s*.*no' $ef &>/dev/null + [[ $? == 0 ]] && link_opt="-c" && tested="compile only" + + grep -ie '@@\s*requires\s*:\s*.*preprocessing' $ef &>/dev/null + [[ $? == 0 ]] && fort_opts="$fpp_flag" #JIC fpp needed for any reason + + if [[ $REPLACE == ON ]]; then + for no in ${NA_VER_NUMS[@]}; do + ver_date=${VNUM2DATE[$no]} + grep -e '\s*#if\s*_OPENMP\s*<\s*'${ver_date}'\s*' $ef &>/dev/null + if [[ $? 
== 0 ]]; then + [ "$ext" == "f" -o "$ext" == "f90" ] && fort_opts="$fpp_flag" + echo " -> Backing down from $no on $f_base code" + #uncomment lines if they are commented out-- What about unwanted version + #sed 's@\s*'$comment'\s*\(#\(if\|define\|endif\)\($\|\s\).*\)@\1@' $ef + fi + done + fi + + spaces=" " + eval EVAL=$f + short_f=$(echo $EVAL | sed s@$BASE_DIR@\$OMP_BASE_DIR@) + printf " >>> %-36s %-50s " \ + "$comp $omp_flag $inc_opt $fort_opts $link_opt" "$short_f${spaces:40}" | tee -a $logf + $comp $omp_flag $inc_opt $fort_opts $link_opt $ef >> $logf 2>&1 + + if [ $? -ne 0 ]; then + printf "***" >> $logf + printf " [FAILED]\n" | tee -a $logf + (( cntfail = cntfail + 1 )) + else + printf " %-25s\n" "[PASSED: $tested]" + fi + \rm -f *.o *.mod + [[ ! ( $REPLACE == ON && $save_pp_files == on) ]] && \rm -f $ef + printf " ------------------------------------------------------\n\n" +} + +# start testing +copt=$omp_flag + +cntc=0 +cntcpp=0 +cntf=0 +cntffree=0 +cntfail=0 +logf=test_codes.log + +# SELECTED FILE PROCESSING +if [[ ! 
-z $FILES ]]; then + + for ff in ${FILES[@]}; do + + f=$BASE_DIR/*/sources/$ff + + if [[ $f =~ .c$ ]]; then + comp=$comp_c + (( cntc = cntc + 1 )) + test_one_code c + fi + + if [[ $f =~ .cpp$ ]]; then + comp=$comp_cpp + (( cntcpp = cntcpp + 1 )) + test_one_code cpp + fi + + if [[ $f =~ .f$ ]]; then + comp=$comp_f + (( cntf = cntf + 1 )) + test_one_code f + fi + + if [[ $f =~ .f90$ ]]; then + comp=$comp_f + f=$BASE_DIR/*/sources/$ff + (( cntffree = cntffree + 1 )) + test_one_code f90 + fi + done + exit #No need for bulk processing information + +fi + +# BULK PROCESSING +if [[ -z $FILES ]]; then + + comp=$comp_c + for f in ../*/sources/*.c; do + (( cntc = cntc + 1 )) + test_one_code c + done + + comp=$comp_cpp + for f in ../*/sources/*.cpp; do + (( cntcpp = cntcpp + 1 )) + test_one_code cpp + done + + comp=$comp_f + for f in ../*/sources/*.f; do + (( cntf = cntf + 1 )) + test_one_code f + done + + for f in ../*/sources/*.f90; do + (( cntffree = cntffree + 1 )) + test_one_code f90 + done + +fi + +# print summary stats +echo " +C compiler = $comp_c +C++ compiler = $comp_cpp +Fortran compiler = $comp_f +OpenMP flag = $omp_flag + + +Total number of C examples : $cntc +Total number of C++ examples : $cntcpp +Total number of F-fixed examples: $cntf +Total number of F-free examples : $cntffree +Total number of failed examples : $cntfail" | tee -a $logf diff --git a/sources/test_utils b/sources/test_utils new file mode 100644 index 0000000..4087767 --- /dev/null +++ b/sources/test_utils @@ -0,0 +1,102 @@ + +VDATES=( 202011 201811 201511 201307 201107 200805 ) +VNUM2DATE=( [51]=202011 [50]=201811 [45]=201511 [40]=201307 [31]=201107 [30]=200805 ) +VDATE2NUM=( [202011]=51 [201811]=50 [201511]=45 [201307]=40 [201107]=31 [200805]=30 ) + +get_compiler_version_date(){ + # Uses CC and omp_flag (compiler options) to get the VER_DATE (e.g. 200805) from compiler + [[ ! 
-z $TEST_DEBUG ]] && echo -e "\n -> INSIDE get_compiler_version_date:" + + cat <<' EOF' >version.c + #include + #include + int main(){ printf("%d\n", _OPENMP);} + EOF + VER_DATE=$( $comp_c $omp_flag version.c; ./a.out; rm -f version.c a.out ) + VER_NUM=${VDATE2NUM[$VER_DATE]} + [[ ! -z $TEST_DEBUG ]] && echo " Finished, FOUND VER_DATE=$VER_DATE ($VER_NUM)" +} + +get_compiler_version_number(){ + # Uses VER_DATE (e.g. 200805) to determine not_available VER_NUMs, NA_VER_NUMS (e.g. 30) + [[ ! -z $TEST_DEBUG ]] && echo -e "\n -> INSIDE get_compiler_version_number:" + + for i in $(seq 0 $(( ${#VDATES[@]} - 1 )) ) ; do + if [[ "${VDATES[$i]}" -gt "$VER_DATE" ]]; then + #echo "HERE $i $VER_DATE ${VDATES[$i]}" + NA_VER_NUMS+=( ${VDATE2NUM[ ${VDATES[$i]} ]} ) + [[ ! -z $TEST_DEBUG ]] && echo " FOUND: ${VDATES[$i]} > $VER_DATE, Compiler earlier than $NA_VER_NUMS." + fi + done + + [[ ! -z $TEST_DEBUG ]] && echo -e " Finished, using Compiler $VER_DATE (${VDATE2NUM[$VER_DATE]}).\n" +} + + +get_compiler_commands_and_omp_flag(){ + # Determine compiler and return compiler commands + + local found="" + + [[ ! -z $TEST_DEBUG ]] && echo -e "\n -> INSIDE get_compiler_commands_and_omp_flag:" + + comp_c="" comp_cpp="" comp_f="" fixed_flag="" fpp_flag="" omp_flag="" + + #Force + [[ "$COMP" == 'GNU' ]] && comp_c=gcc comp_cpp=g++ comp_f=gfortran fixed_flag="-ffixed-form" fpp_flag="-cpp" omp_flag="-fopenmp" + [[ "$COMP" == 'IBM' ]] && comp_c=xlc_r comp_cpp="xlC_r" comp_f=xlf90_r fixed_flag="-qfixed" fpp_flag="-qpreprocess" omp_flag="-qsmp" + [[ "$COMP" == 'INTEL' ]] && comp_c=icc comp_cpp=icpc comp_f=ifort fixed_flag="-fixed" fpp_flag="-fpp" omp_flag="-qopenmp" + [[ "$COMP" == 'AMD' ]] && comp_c=clang comp_cpp=clang++ comp_f=flang fixed_flag="-Mfixed" fpp_flag="-cpp" omp_flag="-fopenmp" + [[ "$COMP" == 'CRAY' ]] && comp_c=cc comp_cpp=CC comp_f=ftn fixed_flag="-f fixed" fpp_flag="-e F" omp_flag="-homp" + #use reasonable names instead of AOCC(AMD), CCE(CRAY),... + + [[ ! 
-z $comp_c ]] && echo " Using $COMP compiler system as provided in the COMP environment variable" + + if [[ -z $comp_c ]]; then + + command -v gcc -v &> /dev/null + [[ $? == 0 ]] && comp_c=gcc comp_cpp=g++ comp_f=gfortran fixed_flag="-ffixed-form" fpp_flag="-cpp" omp_flag="-fopenmp" found+=" GNU" + + command -v flang -v &> /dev/null + [[ $? == 0 ]] && comp_c=clang comp_cpp=clang++ comp_f=flang fixed_flag="-Mfixed" fpp_flag="-cpp" omp_flag="-fopenmp" found+=" CLANG" + + command -v ftn -v &> /dev/null + [[ $? == 0 ]] && comp_c=cc comp_cpp=CC comp_f=ftn fixed_flag="-f fixed" fpp_flag="-e F" omp_flag="-homp" found+=" CRAY" + + command -v icc -v &> /dev/null + [[ $? == 0 ]] && comp_c=icc comp_cpp=icpc comp_f=ifort fixed_flag="-fixed" fpp_flag="-fpp" omp_flag="-qopenmp" found+=" INTEL" + + command -v xlc -v &> /dev/null + [[ $? == 0 ]] && comp_c=xlc_r comp_cpp="xlC_r" comp_f=xlf90_r fixed_flag="-qfixed" fpp_flag="-qpreprocess" omp_flag="-qsmp" found+=" IBM" + + [[ ! -z $comp_c ]] && printf " Detected compilers: $found If multiple compilers, will use last found.\n" + + fi + + if [[ -z $comp_c ]]; then + echo " >>> Auto detection of GNU, CLANG, CRAY, INTEL, and IBM SYSTEM compilers failed." + echo " (flang is used to detect CLANG compiler, SYSTEM must have Fortran compiler.)" + echo " >>> REQUIREMENT: Must set compiler info (comp_xxx variables,...) in test_codes:" + echo " ** comp_c, comp_cpp, comp_f, fixed_flag, fpp_flag and omp_flag" + echo " or set COMP env. var. to one of: GNU, AMD, CRAY, INTEL, or IBM." + echo " See COMP array in test_utils comp_xxx values." + exit 1 + fi + + + command -v lscpu &> /dev/null + if [[ $? == 0 ]]; then + # # AMD HPE INTEL IBM + ARCH=$(lscpu | grep Architecture) # x86_64 x86_64 ppc64le + [[ ! -z $TEST_DEBUG ]] && echo " lscpu FOUND Architecture: $ARCH" + + # # AMD HPE INTEL IBM + VENID=$(lscpu | grep -i "Vendor ID:") # AuthenticAMD GenuineIntel NA + [[ ! -z $TEST_DEBUG ]] && echo " lscpu FOUND Vendor ID: $VENID" + fi + + + [[ ! 
-z $TEST_DEBUG ]] && echo " Finished, FOUND CC: $comp_c CXX: $comp_cpp F90: $comp_f" + [[ ! -z $TEST_DEBUG ]] && echo " Finished, FOUND omp_flag: $omp_flag Fortran flags: Fixedform=$fixed_flag Preprocess=$fpp_flag" +} + diff --git a/Examples_acquire_release.tex b/synchronization/acquire_release.tex similarity index 100% rename from Examples_acquire_release.tex rename to synchronization/acquire_release.tex diff --git a/Examples_atomic.tex b/synchronization/atomic.tex similarity index 97% rename from Examples_atomic.tex rename to synchronization/atomic.tex index 620c53e..1449439 100644 --- a/Examples_atomic.tex +++ b/synchronization/atomic.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{atomic} Construct} +\section{\code{atomic} Construct} \label{sec:atomic} The following example avoids race conditions (simultaneous updates of an element diff --git a/Examples_atomic_restrict.tex b/synchronization/atomic_restrict.tex similarity index 100% rename from Examples_atomic_restrict.tex rename to synchronization/atomic_restrict.tex diff --git a/Examples_barrier_regions.tex b/synchronization/barrier_regions.tex similarity index 100% rename from Examples_barrier_regions.tex rename to synchronization/barrier_regions.tex diff --git a/Examples_critical.tex b/synchronization/critical.tex similarity index 90% rename from Examples_critical.tex rename to synchronization/critical.tex index f897e90..fdee1c0 100644 --- a/Examples_critical.tex +++ b/synchronization/critical.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{critical} Construct} +\section{\code{critical} Construct} \label{sec:critical} The following example includes several \code{critical} constructs. 
The example @@ -17,4 +17,4 @@ The following example extends the previous example by adding the \code{hint} cla \cexample{critical}{2} -\fexample[4.5]{critical}{2} +\fexample[4.5]{critical}{2}[1] diff --git a/Examples_depobj.tex b/synchronization/depobj.tex similarity index 98% rename from Examples_depobj.tex rename to synchronization/depobj.tex index 9ca962b..e16c36e 100644 --- a/Examples_depobj.tex +++ b/synchronization/depobj.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{depobj} Construct} +\section{\code{depobj} Construct} \label{sec:depobj} The stand-alone \code{depobj} construct provides a mechanism diff --git a/Examples_doacross.tex b/synchronization/doacross.tex similarity index 100% rename from Examples_doacross.tex rename to synchronization/doacross.tex diff --git a/Examples_flush_nolist.tex b/synchronization/flush_nolist.tex similarity index 82% rename from Examples_flush_nolist.tex rename to synchronization/flush_nolist.tex index d39e17e..e807fd0 100644 --- a/Examples_flush_nolist.tex +++ b/synchronization/flush_nolist.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{flush} Construct without a List} +\section{\code{flush} Construct without a List} \label{sec:flush_nolist} The following example distinguishes the shared variables affected by a \code{flush} diff --git a/Examples_init_lock.tex b/synchronization/init_lock.tex similarity index 81% rename from Examples_init_lock.tex rename to synchronization/init_lock.tex index ba5324e..3f4151c 100644 --- a/Examples_init_lock.tex +++ b/synchronization/init_lock.tex @@ -1,4 +1,4 @@ -\subsection{The \code{omp\_init\_lock} Routine} +\subsection{\code{omp\_init\_lock} Routine} \label{subsec:init_lock} The following example demonstrates how to initialize an array of locks in a \code{parallel} diff --git a/Examples_init_lock_with_hint.tex b/synchronization/init_lock_with_hint.tex similarity index 77% rename from Examples_init_lock_with_hint.tex rename to synchronization/init_lock_with_hint.tex index 
8541f6e..ac5a8de 100644 --- a/Examples_init_lock_with_hint.tex +++ b/synchronization/init_lock_with_hint.tex @@ -1,5 +1,5 @@ %\pagebreak -\subsection{The \code{omp\_init\_lock\_with\_hint} Routine} +\subsection{\code{omp\_init\_lock\_with\_hint} Routine} \label{subsec:init_lock_with_hint} The following example demonstrates how to initialize an array of locks in a \code{parallel} region by using \code{omp\_init\_lock\_with\_hint}. @@ -7,4 +7,4 @@ Note, hints are combined with an \code{|} or \code{+} operator in C/C++ and a \c \cppexample[4.5]{init_lock_with_hint}{1} -\fexample[4.5]{init_lock_with_hint}{1} +\fexample[4.5]{init_lock_with_hint}{1}[1] diff --git a/Examples_lock_owner.tex b/synchronization/lock_owner.tex similarity index 85% rename from Examples_lock_owner.tex rename to synchronization/lock_owner.tex index 08a274c..df2297c 100644 --- a/Examples_lock_owner.tex +++ b/synchronization/lock_owner.tex @@ -10,13 +10,13 @@ a task region must be owned by the same task region. This change in ownership requires extra care when using locks. The following program is conforming in OpenMP 2.5 because the thread that releases the lock \code{lck} in the parallel region is the same thread that acquired the lock in the sequential -part of the program (master thread of parallel region and the initial thread are +part of the program (primary thread of parallel region and the initial thread are the same). However, it is not conforming beginning with OpenMP 3.0, because the task region that releases the lock \code{lck} is different from the task region that acquires the lock. 
-\cexample{lock_owner}{1} +\cexample[5.1]{lock_owner}{1} -\fexample{lock_owner}{1} +\fexample[5.1]{lock_owner}{1}[1] diff --git a/Examples_locks.tex b/synchronization/locks.tex similarity index 100% rename from Examples_locks.tex rename to synchronization/locks.tex diff --git a/Examples_nestable_lock.tex b/synchronization/nestable_lock.tex similarity index 100% rename from Examples_nestable_lock.tex rename to synchronization/nestable_lock.tex diff --git a/Examples_ordered.tex b/synchronization/ordered.tex similarity index 91% rename from Examples_ordered.tex rename to synchronization/ordered.tex index 256a462..d0d6d92 100644 --- a/Examples_ordered.tex +++ b/synchronization/ordered.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{ordered} Clause and the \code{ordered} Construct} +\section{\code{ordered} Clause and \code{ordered} Construct} \label{sec:ordered} Ordered constructs are useful for sequentially ordering the output from work that diff --git a/Examples_simple_lock.tex b/synchronization/simple_lock.tex similarity index 64% rename from Examples_simple_lock.tex rename to synchronization/simple_lock.tex index cb59294..bb5a303 100644 --- a/Examples_simple_lock.tex +++ b/synchronization/simple_lock.tex @@ -3,16 +3,15 @@ In the following example, the lock routines cause the threads to be idle while waiting for entry to the first critical section, but to do other work while waiting -for entry to the second. The \code{omp\_set\_lock} function blocks, but the \code{omp\_test\_lock} +for entry to the second. The \code{omp\_set\_lock} function blocks, but the \scode{omp_test_lock} function does not, allowing the work in \code{skip} to be done. -Note that the argument to the lock routines should have type \code{omp\_lock\_t}, -and that there is no need to flush it. +Note that the argument to the lock routines should have type +\scode{omp_lock_t} (or \scode{omp_lock_kind} in Fortran), +and that there is no need to flush the lock variable (\plc{lck}). 
\cexample{simple_lock}{1} -Note that there is no need to flush the lock variable. - \fexample{simple_lock}{1} diff --git a/sources/Example_acquire_release.1.c b/synchronization/sources/acquire_release.1.c similarity index 100% rename from sources/Example_acquire_release.1.c rename to synchronization/sources/acquire_release.1.c diff --git a/sources/Example_acquire_release.1.f90 b/synchronization/sources/acquire_release.1.f90 similarity index 100% rename from sources/Example_acquire_release.1.f90 rename to synchronization/sources/acquire_release.1.f90 diff --git a/sources/Example_acquire_release.2.c b/synchronization/sources/acquire_release.2.c similarity index 100% rename from sources/Example_acquire_release.2.c rename to synchronization/sources/acquire_release.2.c diff --git a/sources/Example_acquire_release.2.f90 b/synchronization/sources/acquire_release.2.f90 similarity index 100% rename from sources/Example_acquire_release.2.f90 rename to synchronization/sources/acquire_release.2.f90 diff --git a/sources/Example_acquire_release.3.c b/synchronization/sources/acquire_release.3.c similarity index 100% rename from sources/Example_acquire_release.3.c rename to synchronization/sources/acquire_release.3.c diff --git a/sources/Example_acquire_release.3.f90 b/synchronization/sources/acquire_release.3.f90 similarity index 100% rename from sources/Example_acquire_release.3.f90 rename to synchronization/sources/acquire_release.3.f90 diff --git a/sources/Example_acquire_release_broke.4.c b/synchronization/sources/acquire_release_broke.4.c similarity index 100% rename from sources/Example_acquire_release_broke.4.c rename to synchronization/sources/acquire_release_broke.4.c diff --git a/sources/Example_acquire_release_broke.4.f90 b/synchronization/sources/acquire_release_broke.4.f90 similarity index 100% rename from sources/Example_acquire_release_broke.4.f90 rename to synchronization/sources/acquire_release_broke.4.f90 diff --git a/sources/Example_atomic.1.c 
b/synchronization/sources/atomic.1.c similarity index 100% rename from sources/Example_atomic.1.c rename to synchronization/sources/atomic.1.c diff --git a/sources/Example_atomic.1.f b/synchronization/sources/atomic.1.f similarity index 100% rename from sources/Example_atomic.1.f rename to synchronization/sources/atomic.1.f diff --git a/sources/Example_atomic.2.c b/synchronization/sources/atomic.2.c similarity index 100% rename from sources/Example_atomic.2.c rename to synchronization/sources/atomic.2.c diff --git a/sources/Example_atomic.2.f b/synchronization/sources/atomic.2.f similarity index 100% rename from sources/Example_atomic.2.f rename to synchronization/sources/atomic.2.f diff --git a/sources/Example_atomic.3.c b/synchronization/sources/atomic.3.c similarity index 100% rename from sources/Example_atomic.3.c rename to synchronization/sources/atomic.3.c diff --git a/sources/Example_atomic.3.f b/synchronization/sources/atomic.3.f similarity index 100% rename from sources/Example_atomic.3.f rename to synchronization/sources/atomic.3.f diff --git a/sources/Example_atomic_restrict.1.c b/synchronization/sources/atomic_restrict.1.c similarity index 100% rename from sources/Example_atomic_restrict.1.c rename to synchronization/sources/atomic_restrict.1.c diff --git a/sources/Example_atomic_restrict.1.f b/synchronization/sources/atomic_restrict.1.f similarity index 100% rename from sources/Example_atomic_restrict.1.f rename to synchronization/sources/atomic_restrict.1.f diff --git a/sources/Example_atomic_restrict.2.c b/synchronization/sources/atomic_restrict.2.c similarity index 100% rename from sources/Example_atomic_restrict.2.c rename to synchronization/sources/atomic_restrict.2.c diff --git a/sources/Example_atomic_restrict.2.f b/synchronization/sources/atomic_restrict.2.f similarity index 100% rename from sources/Example_atomic_restrict.2.f rename to synchronization/sources/atomic_restrict.2.f diff --git a/sources/Example_atomic_restrict.3.f 
b/synchronization/sources/atomic_restrict.3.f similarity index 100% rename from sources/Example_atomic_restrict.3.f rename to synchronization/sources/atomic_restrict.3.f diff --git a/sources/Example_barrier_regions.1.c b/synchronization/sources/barrier_regions.1.c similarity index 100% rename from sources/Example_barrier_regions.1.c rename to synchronization/sources/barrier_regions.1.c diff --git a/sources/Example_barrier_regions.1.f b/synchronization/sources/barrier_regions.1.f similarity index 100% rename from sources/Example_barrier_regions.1.f rename to synchronization/sources/barrier_regions.1.f diff --git a/sources/Example_critical.1.c b/synchronization/sources/critical.1.c similarity index 100% rename from sources/Example_critical.1.c rename to synchronization/sources/critical.1.c diff --git a/sources/Example_critical.1.f b/synchronization/sources/critical.1.f similarity index 100% rename from sources/Example_critical.1.f rename to synchronization/sources/critical.1.f diff --git a/sources/Example_critical.2.c b/synchronization/sources/critical.2.c similarity index 59% rename from sources/Example_critical.2.c rename to synchronization/sources/critical.2.c index 4e2040c..e603dda 100644 --- a/sources/Example_critical.2.c +++ b/synchronization/sources/critical.2.c @@ -5,6 +5,11 @@ * @@linkable: no * @@expect: success */ +#if _OPENMP < 201811 +#define omp_sync_hint_contended omp_lock_hint_contended +#define omp_sync_hint_speculative omp_lock_hint_speculative +#endif + #include int dequeue(float *a); @@ -16,11 +21,11 @@ void critical_example(float *x, float *y) #pragma omp parallel shared(x, y) private(ix_next, iy_next) { - #pragma omp critical (xaxis) hint(omp_lock_hint_contended) + #pragma omp critical (xaxis) hint(omp_sync_hint_contended) ix_next = dequeue(x); work(ix_next, x); - #pragma omp critical (yaxis) hint(omp_lock_hint_contended) + #pragma omp critical (yaxis) hint(omp_sync_hint_contended) iy_next = dequeue(y); work(iy_next, y); } diff --git 
a/sources/Example_critical.2.f b/synchronization/sources/critical.2.f similarity index 72% rename from sources/Example_critical.2.f rename to synchronization/sources/critical.2.f index ae4997e..7454695 100644 --- a/sources/Example_critical.2.f +++ b/synchronization/sources/critical.2.f @@ -1,9 +1,14 @@ ! @@name: critical.1f ! @@type: F-fixed ! @@compilable: yes +! @@requires: preprocessing ! @@linkable: no ! @@expect: success ! @@version: omp_4.5 +#if _OPENMP < 201811 +#define OMP_SYNC_HINT_CONTENDED OMP_LOCK_HINT_CONTENDED +#endif + SUBROUTINE CRITICAL_EXAMPLE(X, Y) USE OMP_LIB ! or INCLUDE "omp_lib.h" @@ -12,12 +17,12 @@ !$OMP PARALLEL SHARED(X, Y) PRIVATE(IX_NEXT, IY_NEXT) -!$OMP CRITICAL(XAXIS) HINT(OMP_LOCK_HINT_CONTENDED) +!$OMP CRITICAL(XAXIS) HINT(OMP_SYNC_HINT_CONTENDED) CALL DEQUEUE(IX_NEXT, X) !$OMP END CRITICAL(XAXIS) CALL WORK(IX_NEXT, X) -!$OMP CRITICAL(YAXIS) HINT(OMP_LOCK_HINT_CONTENDED) +!$OMP CRITICAL(YAXIS) HINT(OMP_SYNC_HINT_CONTENDED) CALL DEQUEUE(IY_NEXT,Y) !$OMP END CRITICAL(YAXIS) CALL WORK(IY_NEXT, Y) diff --git a/sources/Example_depobj.1.c b/synchronization/sources/depobj.1.c similarity index 100% rename from sources/Example_depobj.1.c rename to synchronization/sources/depobj.1.c diff --git a/sources/Example_depobj.1.f90 b/synchronization/sources/depobj.1.f90 similarity index 100% rename from sources/Example_depobj.1.f90 rename to synchronization/sources/depobj.1.f90 diff --git a/sources/Example_doacross.1.c b/synchronization/sources/doacross.1.c similarity index 100% rename from sources/Example_doacross.1.c rename to synchronization/sources/doacross.1.c diff --git a/sources/Example_doacross.1.f90 b/synchronization/sources/doacross.1.f90 similarity index 100% rename from sources/Example_doacross.1.f90 rename to synchronization/sources/doacross.1.f90 diff --git a/sources/Example_doacross.2.c b/synchronization/sources/doacross.2.c similarity index 100% rename from sources/Example_doacross.2.c rename to synchronization/sources/doacross.2.c 
diff --git a/sources/Example_doacross.2.f90 b/synchronization/sources/doacross.2.f90 similarity index 100% rename from sources/Example_doacross.2.f90 rename to synchronization/sources/doacross.2.f90 diff --git a/sources/Example_doacross.3.c b/synchronization/sources/doacross.3.c similarity index 100% rename from sources/Example_doacross.3.c rename to synchronization/sources/doacross.3.c diff --git a/sources/Example_doacross.3.f90 b/synchronization/sources/doacross.3.f90 similarity index 100% rename from sources/Example_doacross.3.f90 rename to synchronization/sources/doacross.3.f90 diff --git a/sources/Example_doacross.4.c b/synchronization/sources/doacross.4.c similarity index 100% rename from sources/Example_doacross.4.c rename to synchronization/sources/doacross.4.c diff --git a/sources/Example_doacross.4.f90 b/synchronization/sources/doacross.4.f90 similarity index 100% rename from sources/Example_doacross.4.f90 rename to synchronization/sources/doacross.4.f90 diff --git a/sources/Example_doacross.5.c b/synchronization/sources/doacross.5.c similarity index 100% rename from sources/Example_doacross.5.c rename to synchronization/sources/doacross.5.c diff --git a/sources/Example_doacross.5.f90 b/synchronization/sources/doacross.5.f90 similarity index 100% rename from sources/Example_doacross.5.f90 rename to synchronization/sources/doacross.5.f90 diff --git a/sources/Example_flush_nolist.1.c b/synchronization/sources/flush_nolist.1.c similarity index 100% rename from sources/Example_flush_nolist.1.c rename to synchronization/sources/flush_nolist.1.c diff --git a/sources/Example_flush_nolist.1.f b/synchronization/sources/flush_nolist.1.f similarity index 100% rename from sources/Example_flush_nolist.1.f rename to synchronization/sources/flush_nolist.1.f diff --git a/sources/Example_init_lock.1.cpp b/synchronization/sources/init_lock.1.cpp similarity index 100% rename from sources/Example_init_lock.1.cpp rename to synchronization/sources/init_lock.1.cpp diff --git 
a/sources/Example_init_lock.1.f b/synchronization/sources/init_lock.1.f similarity index 100% rename from sources/Example_init_lock.1.f rename to synchronization/sources/init_lock.1.f diff --git a/sources/Example_init_lock_with_hint.1.cpp b/synchronization/sources/init_lock_with_hint.1.cpp similarity index 56% rename from sources/Example_init_lock_with_hint.1.cpp rename to synchronization/sources/init_lock_with_hint.1.cpp index 1971c14..df17678 100644 --- a/sources/Example_init_lock_with_hint.1.cpp +++ b/synchronization/sources/init_lock_with_hint.1.cpp @@ -6,6 +6,11 @@ * @@expect: success * @@version: omp_4.5 */ +#if _OPENMP < 201811 +#define omp_sync_hint_contended omp_lock_hint_contended +#define omp_sync_hint_speculative omp_lock_hint_speculative +#endif + #include omp_lock_t *new_locks() @@ -17,8 +22,8 @@ omp_lock_t *new_locks() for (i=0; i<1000; i++) { omp_init_lock_with_hint(&lock[i], - static_cast(omp_lock_hint_contended | - omp_lock_hint_speculative)); + static_cast(omp_sync_hint_contended | + omp_sync_hint_speculative)); } return lock; } diff --git a/sources/Example_init_lock_with_hint.1.f b/synchronization/sources/init_lock_with_hint.1.f similarity index 66% rename from sources/Example_init_lock_with_hint.1.f rename to synchronization/sources/init_lock_with_hint.1.f index 56603c8..f7af0aa 100644 --- a/sources/Example_init_lock_with_hint.1.f +++ b/synchronization/sources/init_lock_with_hint.1.f @@ -1,9 +1,15 @@ ! @@name: init_lock.1f ! @@type: F-fixed ! @@compilable: yes +! @@requires: preprocessing ! @@linkable: no ! @@expect: success ! @@version: omp_4.5 +#if _OPENMP < 201811 +#define OMP_SYNC_HINT_CONTENDED OMP_LOCK_HINT_CONTENDED +#define OMP_SYNC_HINT_SPECULATIVE OMP_LOCK_HINT_SPECULATIVE +#endif + FUNCTION NEW_LOCKS() USE OMP_LIB ! 
or INCLUDE "omp_lib.h" INTEGER(OMP_LOCK_KIND), DIMENSION(1000) :: NEW_LOCKS @@ -13,7 +19,7 @@ !$OMP PARALLEL DO PRIVATE(I) DO I=1,1000 CALL OMP_INIT_LOCK_WITH_HINT(NEW_LOCKS(I), - & OMP_LOCK_HINT_CONTENDED + OMP_LOCK_HINT_SPECULATIVE) + & OMP_SYNC_HINT_CONTENDED + OMP_SYNC_HINT_SPECULATIVE) END DO !$OMP END PARALLEL DO diff --git a/sources/Example_lock_owner.1.c b/synchronization/sources/lock_owner.1.c similarity index 81% rename from sources/Example_lock_owner.1.c rename to synchronization/sources/lock_owner.1.c index b64a8fe..4bb90b6 100644 --- a/sources/Example_lock_owner.1.c +++ b/synchronization/sources/lock_owner.1.c @@ -4,7 +4,12 @@ * @@compilable: yes * @@linkable: yes * @@expect: success +* @@version: omp_5.1 */ +#if _OPENMP < 202011 +#define masked master +#endif + #include #include #include @@ -20,7 +25,7 @@ int main() #pragma omp parallel shared (x) { - #pragma omp master + #pragma omp masked { x = x + 1; omp_unset_lock (&lck); diff --git a/sources/Example_lock_owner.1.f b/synchronization/sources/lock_owner.1.f similarity index 77% rename from sources/Example_lock_owner.1.f rename to synchronization/sources/lock_owner.1.f index 17310a5..75590b5 100644 --- a/sources/Example_lock_owner.1.f +++ b/synchronization/sources/lock_owner.1.f @@ -1,8 +1,14 @@ ! @@name: lock_owner.1f ! @@type: F-fixed ! @@compilable: yes +! @@requires: preprocessing ! @@linkable: yes ! @@expect: success +! @@version: omp_5.1 +#if _OPENMP < 202011 +#define masked master +#endif + program lock use omp_lib integer :: x @@ -13,13 +19,14 @@ x = 0 !$omp parallel shared (x) -!$omp master +!$omp masked x = x + 1 call omp_unset_lock(lck) -!$omp end master +!$omp end masked ! Some more stuff. 
!$omp end parallel call omp_destroy_lock(lck) + end diff --git a/sources/Example_nestable_lock.1.c b/synchronization/sources/nestable_lock.1.c similarity index 100% rename from sources/Example_nestable_lock.1.c rename to synchronization/sources/nestable_lock.1.c diff --git a/sources/Example_nestable_lock.1.f b/synchronization/sources/nestable_lock.1.f similarity index 100% rename from sources/Example_nestable_lock.1.f rename to synchronization/sources/nestable_lock.1.f diff --git a/sources/Example_ordered.1.c b/synchronization/sources/ordered.1.c similarity index 100% rename from sources/Example_ordered.1.c rename to synchronization/sources/ordered.1.c diff --git a/sources/Example_ordered.1.f b/synchronization/sources/ordered.1.f similarity index 100% rename from sources/Example_ordered.1.f rename to synchronization/sources/ordered.1.f diff --git a/sources/Example_ordered.2.c b/synchronization/sources/ordered.2.c similarity index 100% rename from sources/Example_ordered.2.c rename to synchronization/sources/ordered.2.c diff --git a/sources/Example_ordered.2.f b/synchronization/sources/ordered.2.f similarity index 100% rename from sources/Example_ordered.2.f rename to synchronization/sources/ordered.2.f diff --git a/sources/Example_ordered.3.c b/synchronization/sources/ordered.3.c similarity index 100% rename from sources/Example_ordered.3.c rename to synchronization/sources/ordered.3.c diff --git a/sources/Example_ordered.3.f b/synchronization/sources/ordered.3.f similarity index 100% rename from sources/Example_ordered.3.f rename to synchronization/sources/ordered.3.f diff --git a/sources/Example_simple_lock.1.c b/synchronization/sources/simple_lock.1.c similarity index 100% rename from sources/Example_simple_lock.1.c rename to synchronization/sources/simple_lock.1.c diff --git a/sources/Example_simple_lock.1.f b/synchronization/sources/simple_lock.1.f similarity index 100% rename from sources/Example_simple_lock.1.f rename to 
synchronization/sources/simple_lock.1.f diff --git a/sources/Example_worksharing_critical.1.c b/synchronization/sources/worksharing_critical.1.c similarity index 100% rename from sources/Example_worksharing_critical.1.c rename to synchronization/sources/worksharing_critical.1.c diff --git a/sources/Example_worksharing_critical.1.f b/synchronization/sources/worksharing_critical.1.f similarity index 100% rename from sources/Example_worksharing_critical.1.f rename to synchronization/sources/worksharing_critical.1.f diff --git a/Examples_worksharing_critical.tex b/synchronization/worksharing_critical.tex similarity index 89% rename from Examples_worksharing_critical.tex rename to synchronization/worksharing_critical.tex index 7f7a7fb..04cabb5 100644 --- a/Examples_worksharing_critical.tex +++ b/synchronization/worksharing_critical.tex @@ -7,7 +7,7 @@ construct. This example is conforming because the worksharing \code{single} region is not closely nested inside the \code{critical} region. A single thread executes the one and only section in the \code{sections} region, and executes the \code{critical} region. The same thread encounters the nested \code{parallel} -region, creates a new team of threads, and becomes the master of the new team. +region, creates a new team of threads, and becomes the primary thread of the new team. One of the threads in the new team enters the \code{single} region and increments \code{i} by \code{1}. At the end of this example \code{i} is equal to \code{2}. 
diff --git a/tasking/parallel_masked_taskloop.tex b/tasking/parallel_masked_taskloop.tex new file mode 100644 index 0000000..458815a --- /dev/null +++ b/tasking/parallel_masked_taskloop.tex @@ -0,0 +1,31 @@ +\pagebreak +\section{Combined \code{parallel} \code{masked} and \code{taskloop} Constructs} +\label{sec:parallel_masked_taskloop} + +Just as the \code{for} and \code{do} constructs were combined +with the \code{parallel} construct for convenience, so too, the combined +\code{parallel}~\code{masked}~\code{taskloop} and +\code{parallel}~\code{masked}~\code{taskloop}~\code{simd} +constructs have been created for convenience when using the +\code{taskloop} construct. + +In the following example the first \code{taskloop} construct is enclosed +by the usual \code{parallel} and \code{masked} constructs to form +a team of threads, and a single task generator (primary thread) for +the \code{taskloop} construct. + +The same OpenMP operations for the first taskloop are accomplished by the second +taskloop with the \code{parallel}~\code{masked}~\code{taskloop} +combined construct. +The third taskloop uses the combined \code{parallel}~\code{masked}~\code{taskloop}~\code{simd} +construct to accomplish the same behavior as closely nested \code{parallel masked}, +and \code{taskloop simd} constructs. + +As with any combined construct the clauses of the components may be used +with appropriate restrictions. The combination of the \code{parallel}~\code{masked} construct +with the \code{taskloop} or \code{taskloop}~\code{simd} construct produces no additional +restrictions. 
+ +\cexample[5.1]{parallel_masked_taskloop}{1} + +\ffreeexample[5.1]{parallel_masked_taskloop}{1}[1] diff --git a/sources/Example_parallel_master_taskloop.1.c b/tasking/sources/parallel_masked_taskloop.1.c similarity index 65% rename from sources/Example_parallel_master_taskloop.1.c rename to tasking/sources/parallel_masked_taskloop.1.c index 7b2179b..2c12184 100644 --- a/sources/Example_parallel_master_taskloop.1.c +++ b/tasking/sources/parallel_masked_taskloop.1.c @@ -1,11 +1,14 @@ /* -* @@name: parallel_master_taskloop.1c +* @@name: parallel_masked_taskloop.1c * @@type: C * @@compilable: yes * @@linkable: yes * @@expect: success -* @@version: omp_5.0 +* @@version: omp_5.1 */ +#if _OPENMP < 202011 +#define masked master +#endif #include #define N 100 @@ -17,14 +20,14 @@ int main() for(int i=0; i void foo() diff --git a/sources/Example_task_dep.6.f90 b/tasking/sources/task_dep.6.f90 similarity index 99% rename from sources/Example_task_dep.6.f90 rename to tasking/sources/task_dep.6.f90 index 51d1a02..1ebc222 100644 --- a/sources/Example_task_dep.6.f90 +++ b/tasking/sources/task_dep.6.f90 @@ -4,8 +4,6 @@ ! @@linkable: yes ! @@expect: success ! @@version: omp_5.0 - - subroutine foo() implicit none integer :: x, y diff --git a/sources/Example_task_dep.7.c b/tasking/sources/task_dep.7.c similarity index 99% rename from sources/Example_task_dep.7.c rename to tasking/sources/task_dep.7.c index df6e016..c9f7649 100644 --- a/sources/Example_task_dep.7.c +++ b/tasking/sources/task_dep.7.c @@ -6,7 +6,6 @@ * @@expect: success * @@version: omp_5.0 */ - #include void foo() diff --git a/sources/Example_task_dep.7.f90 b/tasking/sources/task_dep.7.f90 similarity index 99% rename from sources/Example_task_dep.7.f90 rename to tasking/sources/task_dep.7.f90 index 8d2cd8a..db44628 100644 --- a/sources/Example_task_dep.7.f90 +++ b/tasking/sources/task_dep.7.f90 @@ -4,8 +4,6 @@ ! @@linkable: yes ! @@expect: success ! 
@@version: omp_5.0 - - subroutine foo() implicit none integer :: x, y diff --git a/sources/Example_task_dep.8.c b/tasking/sources/task_dep.8.c similarity index 99% rename from sources/Example_task_dep.8.c rename to tasking/sources/task_dep.8.c index 7fbc459..92c4f09 100644 --- a/sources/Example_task_dep.8.c +++ b/tasking/sources/task_dep.8.c @@ -6,7 +6,6 @@ * @@expect: success * @@version: omp_5.0 */ - #include void foo() diff --git a/sources/Example_task_dep.8.f90 b/tasking/sources/task_dep.8.f90 similarity index 99% rename from sources/Example_task_dep.8.f90 rename to tasking/sources/task_dep.8.f90 index 01aa212..c6fc10d 100644 --- a/sources/Example_task_dep.8.f90 +++ b/tasking/sources/task_dep.8.f90 @@ -4,8 +4,6 @@ ! @@linkable: yes ! @@expect: success ! @@version: omp_5.0 - - subroutine foo() implicit nonE integer :: x, y diff --git a/sources/Example_task_dep.9.c b/tasking/sources/task_dep.9.c similarity index 100% rename from sources/Example_task_dep.9.c rename to tasking/sources/task_dep.9.c diff --git a/sources/Example_task_dep.9.f90 b/tasking/sources/task_dep.9.f90 similarity index 100% rename from sources/Example_task_dep.9.f90 rename to tasking/sources/task_dep.9.f90 diff --git a/tasking/sources/task_detach.1.c b/tasking/sources/task_detach.1.c new file mode 100644 index 0000000..8e26450 --- /dev/null +++ b/tasking/sources/task_detach.1.c @@ -0,0 +1,34 @@ +/* +* @@name: task_detach.1c +* @@type: C +* @@compilable: yes +* @@linkable: no +* @@expect: success +* @@version: omp_5.0 +*/ +#include + +void async_work(void (*)(void*), void*); +void work(); + +int main() { + int async=1; + #pragma omp parallel + #pragma omp masked + { + + omp_event_handle_t event; + #pragma omp task detach(event) + { + if(async) { + async_work( (void (*)(void*)) omp_fulfill_event, (void*) event ); + } else { + work(); + omp_fulfill_event(event); + } + } + // Other work + #pragma omp taskwait + } + return 0; +} diff --git a/tasking/sources/task_detach.1.f90 
b/tasking/sources/task_detach.1.f90 new file mode 100644 index 0000000..38d2326 --- /dev/null +++ b/tasking/sources/task_detach.1.f90 @@ -0,0 +1,36 @@ +! @@name: task_detach.1f90 +! @@type: F-free +! @@compilable: yes +! @@linkable: no +! @@expect: success +! @@version: omp_5.0 +program main + use omp_lib + implicit none + + external :: async_work, work + + logical :: async=.true. + integer(omp_event_handle_kind) :: event + + !$omp parallel + !$omp masked + + !$omp task detach(event) + + if(async) then + call async_work(omp_fulfill_event, event) + else + call work() + call omp_fulfill_event(event) + endif + + !$omp end task + !! Other work + + !$omp taskwait + + !$omp end masked + !$omp end parallel + +end program diff --git a/tasking/sources/task_detach.2.c b/tasking/sources/task_detach.2.c new file mode 100644 index 0000000..b18d1fa --- /dev/null +++ b/tasking/sources/task_detach.2.c @@ -0,0 +1,88 @@ +/* +* @@name: task_detach.2c +* @@type: C +* @@compilable: yes +* @@linkable: yes +* @@expect: success +* @@version: omp_5.0 +*/ + +// use -lrt on loader line +#include +#include +#include +#include +#include +#include + +#include + +#define IO_SIGNAL SIGUSR1 // Signal used to notify I/O completion + + // Handler for I/O completion signal +static void callback_aioSigHandler(int sig, siginfo_t *si, void *ucontext) { + if (si->si_code == SI_ASYNCIO){ + printf( "OUT: I/O completion signal received.\n"); + omp_fulfill_event( (omp_event_handle_t)(si->si_value.sival_ptr) ); + } +} + +void work(int i){ printf("OUT: Executing work(%d)\n", i);} + +int main() { + // Write "Written Asynchronously." to file data, using POSIX asynchronous IO + // Error checking not included for clarity and simplicity. 
+ + char data[] = "Written Asynchronously."; + + struct aiocb cb; + struct sigaction sa; + + omp_event_handle_t event; + + int fd = open( "async_data", O_CREAT|O_RDWR|O_TRUNC,0664); + + // Setup async io (aio) control block (cb) + cb.aio_nbytes = sizeof(data)-1; + cb.aio_fildes = fd; + cb.aio_buf = data; + cb.aio_reqprio = 0; + cb.aio_offset = 0; + cb.aio_sigevent.sigev_notify = SIGEV_SIGNAL; + cb.aio_sigevent.sigev_signo = IO_SIGNAL; + + // Setup Signal Handler Callback + sigemptyset(&sa.sa_mask); + sa.sa_flags = SA_RESTART | SA_SIGINFO; + sa.sa_sigaction = callback_aioSigHandler; //callback + sigaction(IO_SIGNAL, &sa, NULL); + + #pragma omp parallel num_threads(2) + #pragma omp masked + { + + #pragma omp task detach(event) if(0) // TASK1 + { + cb.aio_sigevent.sigev_value.sival_ptr = (void *) event; + aio_write(&cb); + } + + #pragma omp task // TASK2 + work(1); + #pragma omp task // TASK3 + work(2); + + } // Parallel region barrier ensures completion of detachable task. + + // Making sure the aio operation completed. + // With OpenMP detachable task the condition will always be false: + while(aio_error(&cb) == EINPROGRESS){printf(" INPROGRESS\n");} //Safeguard + + close(fd); + return 0; +} +/* Any Order: +OUT: I/O completion signal received. 
+OUT: Executing work(1) +OUT: Executing work(2) +*/ diff --git a/sources/Example_task_priority.1.c b/tasking/sources/task_priority.1.c similarity index 100% rename from sources/Example_task_priority.1.c rename to tasking/sources/task_priority.1.c diff --git a/sources/Example_task_priority.1.f90 b/tasking/sources/task_priority.1.f90 similarity index 100% rename from sources/Example_task_priority.1.f90 rename to tasking/sources/task_priority.1.f90 diff --git a/sources/Example_taskgroup.1.c b/tasking/sources/taskgroup.1.c similarity index 100% rename from sources/Example_taskgroup.1.c rename to tasking/sources/taskgroup.1.c diff --git a/sources/Example_taskgroup.1.f90 b/tasking/sources/taskgroup.1.f90 similarity index 100% rename from sources/Example_taskgroup.1.f90 rename to tasking/sources/taskgroup.1.f90 diff --git a/sources/Example_tasking.1.c b/tasking/sources/tasking.1.c similarity index 100% rename from sources/Example_tasking.1.c rename to tasking/sources/tasking.1.c diff --git a/sources/Example_tasking.1.f90 b/tasking/sources/tasking.1.f90 similarity index 100% rename from sources/Example_tasking.1.f90 rename to tasking/sources/tasking.1.f90 diff --git a/sources/Example_tasking.10.c b/tasking/sources/tasking.10.c similarity index 100% rename from sources/Example_tasking.10.c rename to tasking/sources/tasking.10.c diff --git a/sources/Example_tasking.10.f90 b/tasking/sources/tasking.10.f90 similarity index 100% rename from sources/Example_tasking.10.f90 rename to tasking/sources/tasking.10.f90 diff --git a/sources/Example_tasking.11.c b/tasking/sources/tasking.11.c similarity index 100% rename from sources/Example_tasking.11.c rename to tasking/sources/tasking.11.c diff --git a/sources/Example_tasking.11.f90 b/tasking/sources/tasking.11.f90 similarity index 100% rename from sources/Example_tasking.11.f90 rename to tasking/sources/tasking.11.f90 diff --git a/sources/Example_tasking.12.c b/tasking/sources/tasking.12.c similarity index 100% rename from 
sources/Example_tasking.12.c rename to tasking/sources/tasking.12.c diff --git a/sources/Example_tasking.12.f90 b/tasking/sources/tasking.12.f90 similarity index 100% rename from sources/Example_tasking.12.f90 rename to tasking/sources/tasking.12.f90 diff --git a/sources/Example_tasking.13.c b/tasking/sources/tasking.13.c similarity index 100% rename from sources/Example_tasking.13.c rename to tasking/sources/tasking.13.c diff --git a/sources/Example_tasking.13.f90 b/tasking/sources/tasking.13.f90 similarity index 100% rename from sources/Example_tasking.13.f90 rename to tasking/sources/tasking.13.f90 diff --git a/sources/Example_tasking.14.c b/tasking/sources/tasking.14.c similarity index 100% rename from sources/Example_tasking.14.c rename to tasking/sources/tasking.14.c diff --git a/sources/Example_tasking.14.f90 b/tasking/sources/tasking.14.f90 similarity index 100% rename from sources/Example_tasking.14.f90 rename to tasking/sources/tasking.14.f90 diff --git a/sources/Example_tasking.15.c b/tasking/sources/tasking.15.c similarity index 100% rename from sources/Example_tasking.15.c rename to tasking/sources/tasking.15.c diff --git a/sources/Example_tasking.15.f90 b/tasking/sources/tasking.15.f90 similarity index 100% rename from sources/Example_tasking.15.f90 rename to tasking/sources/tasking.15.f90 diff --git a/sources/Example_tasking.16.c b/tasking/sources/tasking.16.c similarity index 100% rename from sources/Example_tasking.16.c rename to tasking/sources/tasking.16.c diff --git a/sources/Example_tasking.16.f90 b/tasking/sources/tasking.16.f90 similarity index 100% rename from sources/Example_tasking.16.f90 rename to tasking/sources/tasking.16.f90 diff --git a/sources/Example_tasking.17.c b/tasking/sources/tasking.17.c similarity index 100% rename from sources/Example_tasking.17.c rename to tasking/sources/tasking.17.c diff --git a/sources/Example_tasking.17.f90 b/tasking/sources/tasking.17.f90 similarity index 100% rename from sources/Example_tasking.17.f90 
rename to tasking/sources/tasking.17.f90 diff --git a/sources/Example_tasking.18.c b/tasking/sources/tasking.18.c similarity index 100% rename from sources/Example_tasking.18.c rename to tasking/sources/tasking.18.c diff --git a/sources/Example_tasking.18.f90 b/tasking/sources/tasking.18.f90 similarity index 100% rename from sources/Example_tasking.18.f90 rename to tasking/sources/tasking.18.f90 diff --git a/sources/Example_tasking.19.c b/tasking/sources/tasking.19.c similarity index 100% rename from sources/Example_tasking.19.c rename to tasking/sources/tasking.19.c diff --git a/sources/Example_tasking.19.f90 b/tasking/sources/tasking.19.f90 similarity index 100% rename from sources/Example_tasking.19.f90 rename to tasking/sources/tasking.19.f90 diff --git a/sources/Example_tasking.2.c b/tasking/sources/tasking.2.c similarity index 100% rename from sources/Example_tasking.2.c rename to tasking/sources/tasking.2.c diff --git a/sources/Example_tasking.2.f90 b/tasking/sources/tasking.2.f90 similarity index 100% rename from sources/Example_tasking.2.f90 rename to tasking/sources/tasking.2.f90 diff --git a/sources/Example_tasking.3.c b/tasking/sources/tasking.3.c similarity index 100% rename from sources/Example_tasking.3.c rename to tasking/sources/tasking.3.c diff --git a/sources/Example_tasking.3.f90 b/tasking/sources/tasking.3.f90 similarity index 100% rename from sources/Example_tasking.3.f90 rename to tasking/sources/tasking.3.f90 diff --git a/sources/Example_tasking.4.c b/tasking/sources/tasking.4.c similarity index 100% rename from sources/Example_tasking.4.c rename to tasking/sources/tasking.4.c diff --git a/sources/Example_tasking.4.f b/tasking/sources/tasking.4.f similarity index 100% rename from sources/Example_tasking.4.f rename to tasking/sources/tasking.4.f diff --git a/sources/Example_tasking.5.c b/tasking/sources/tasking.5.c similarity index 100% rename from sources/Example_tasking.5.c rename to tasking/sources/tasking.5.c diff --git 
a/sources/Example_tasking.5.f b/tasking/sources/tasking.5.f similarity index 100% rename from sources/Example_tasking.5.f rename to tasking/sources/tasking.5.f diff --git a/sources/Example_tasking.6.c b/tasking/sources/tasking.6.c similarity index 100% rename from sources/Example_tasking.6.c rename to tasking/sources/tasking.6.c diff --git a/sources/Example_tasking.6.f b/tasking/sources/tasking.6.f similarity index 100% rename from sources/Example_tasking.6.f rename to tasking/sources/tasking.6.f diff --git a/sources/Example_tasking.7.c b/tasking/sources/tasking.7.c similarity index 100% rename from sources/Example_tasking.7.c rename to tasking/sources/tasking.7.c diff --git a/sources/Example_tasking.7.f b/tasking/sources/tasking.7.f similarity index 100% rename from sources/Example_tasking.7.f rename to tasking/sources/tasking.7.f diff --git a/sources/Example_tasking.8.c b/tasking/sources/tasking.8.c similarity index 100% rename from sources/Example_tasking.8.c rename to tasking/sources/tasking.8.c diff --git a/sources/Example_tasking.8.f b/tasking/sources/tasking.8.f similarity index 100% rename from sources/Example_tasking.8.f rename to tasking/sources/tasking.8.f diff --git a/sources/Example_tasking.9.c b/tasking/sources/tasking.9.c similarity index 100% rename from sources/Example_tasking.9.c rename to tasking/sources/tasking.9.c diff --git a/sources/Example_tasking.9.f b/tasking/sources/tasking.9.f similarity index 100% rename from sources/Example_tasking.9.f rename to tasking/sources/tasking.9.f diff --git a/sources/Example_taskloop.1.c b/tasking/sources/taskloop.1.c similarity index 100% rename from sources/Example_taskloop.1.c rename to tasking/sources/taskloop.1.c diff --git a/sources/Example_taskloop.1.f90 b/tasking/sources/taskloop.1.f90 similarity index 100% rename from sources/Example_taskloop.1.f90 rename to tasking/sources/taskloop.1.f90 diff --git a/sources/Example_taskloop.2.c b/tasking/sources/taskloop.2.c similarity index 100% rename from 
sources/Example_taskloop.2.c rename to tasking/sources/taskloop.2.c diff --git a/sources/Example_taskloop.2.f90 b/tasking/sources/taskloop.2.f90 similarity index 100% rename from sources/Example_taskloop.2.f90 rename to tasking/sources/taskloop.2.f90 diff --git a/sources/Example_taskyield.1.c b/tasking/sources/taskyield.1.c similarity index 100% rename from sources/Example_taskyield.1.c rename to tasking/sources/taskyield.1.c diff --git a/sources/Example_taskyield.1.f90 b/tasking/sources/taskyield.1.f90 similarity index 100% rename from sources/Example_taskyield.1.f90 rename to tasking/sources/taskyield.1.f90 diff --git a/Examples_task_dep.tex b/tasking/task_dep.tex similarity index 100% rename from Examples_task_dep.tex rename to tasking/task_dep.tex diff --git a/tasking/task_detach.tex b/tasking/task_detach.tex new file mode 100644 index 0000000..5c6e802 --- /dev/null +++ b/tasking/task_detach.tex @@ -0,0 +1,60 @@ +\pagebreak +\section{Task Detachment} +\label{sec:task_detachment} + +% if used, then generated task must be completed. +% No definition of a detachable task + +The \code{detach} clause on a \code{task} construct provides a mechanism for an asynchronous +routine to be called within a task block, and for the routine's +callback to signal completion to the OpenMP runtime, through an +event fulfillment, triggered by a call to the \code{omp\_fulfill\_event} routine. +When a \code{detach} clause is used on a task construct, +completion of the \emph{detachable} task occurs when the task's structured +block is completed AND an \plc{allow-completion} event is +fulfilled by a call to the \code{omp\_fulfill\_event} +routine with the \plc{event-handle} argument. + +The first example illustrates the basic components used in a detachable task. +The second example is a program that executes asynchronous IO, and illustrates +methods that are also inherent in asynchronous messaging within MPI and asynchronous commands in +streams within GPU codes. 
+Interfaces to asynchronous operations found in IO, MPI and GPU parallel computing platforms +and their programming models are not standardized. + +------------------------- + +The first example creates a detachable task +that executes the asynchronous \plc{async\_work} routine, +passing the \plc{omp\_fulfill\_event} function and the (firstprivate) event handle +to the function. Here, the \code{omp\_fulfill\_event} function is +the ``callback'' function to be executed at the end of the \plc{async\_work} function's +asynchronous operations, +with the associated data, \plc{event}. + +\cexample[5.0]{task_detach}{1} + +\ffreeexample[5.0]{task_detach}{1} +\clearpage + +%ASYNCHRONOUS IO + +In the following example, text data is written asynchronously to the file \plc{async\_data}, +using POSIX asynchronous IO (aio). An aio ``control block'', \plc{cb}, is set up to +send a signal when IO is complete, and the \plc{sigaction} function registers +the signal action, a callback to \plc{callback\_aioSigHandler}. + +The first task (TASK1) starts the asynchronous IO and runs as a detachable task. +The second and third tasks (TASK2 and TASK3) perform synchronous IO to stdout with print statements. +The difference between the two types of tasks is that the thread for TASK1 is freed for +other execution within the parallel region, while the threads for TASK2 and TASK3 wait +on the (synchronous) IO to complete, and cannot perform other work while the +operating system is performing the synchronous IO. +The \code{if} clause ensures that the detachable task is launched +and the call to the \splc{aio_write} function returns +before TASK2 and TASK3 are generated (while the async IO occurs in the ``background'' and eventually +executes the callback function). The barrier at the end of the parallel region ensures that the +detachable task has completed. 
+ +\cexample[5.0]{task_detach}{2} + diff --git a/Examples_task_priority.tex b/tasking/task_priority.tex similarity index 100% rename from Examples_task_priority.tex rename to tasking/task_priority.tex diff --git a/Examples_taskgroup.tex b/tasking/taskgroup.tex similarity index 95% rename from Examples_taskgroup.tex rename to tasking/taskgroup.tex index 5b0579a..aebea36 100644 --- a/Examples_taskgroup.tex +++ b/tasking/taskgroup.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{taskgroup} Construct} +\section{\code{taskgroup} Construct} \label{sec:taskgroup} In this example, tasks are grouped and synchronized using the \code{taskgroup} diff --git a/Examples_tasking.tex b/tasking/tasking.tex similarity index 99% rename from Examples_tasking.tex rename to tasking/tasking.tex index ec44c0d..b575695 100644 --- a/Examples_tasking.tex +++ b/tasking/tasking.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{task} and \code{taskwait} Constructs} +\section{\code{task} and \code{taskwait} Constructs} \label{sec:task_taskwait} The following example shows how to traverse a tree-like structure using explicit diff --git a/Examples_taskloop.tex b/tasking/taskloop.tex similarity index 95% rename from Examples_taskloop.tex rename to tasking/taskloop.tex index 37bb06c..9b48990 100644 --- a/Examples_taskloop.tex +++ b/tasking/taskloop.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{taskloop} Construct} +\section{\code{taskloop} Construct} \label{sec:taskloop} The following example illustrates how to execute a long running task concurrently with tasks created diff --git a/Examples_taskyield.tex b/tasking/taskyield.tex similarity index 92% rename from Examples_taskyield.tex rename to tasking/taskyield.tex index 8e2ecc0..c687d1c 100644 --- a/Examples_taskyield.tex +++ b/tasking/taskyield.tex @@ -1,5 +1,5 @@ \pagebreak -\section{The \code{taskyield} Construct} +\section{\code{taskyield} Construct} \label{sec:taskyield} The following example illustrates the use of the \code{taskyield} 
directive.
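For readers without the renamed `taskyield.1.c` source at hand, the following is a minimal sketch of the kind of usage that sentence refers to. It is not the example shipped in the repository: the function name `run_tasks` and the constant `NTASKS` are hypothetical, chosen only for illustration. Each task marks a point where the runtime may suspend it in favor of another task, then atomically records that it finished.

```c
#include <stdio.h>

#define NTASKS 8   /* hypothetical task count, for illustration only */

/* Generates NTASKS explicit tasks and returns how many completed. */
int run_tasks(void)
{
    int done = 0;

    #pragma omp parallel
    #pragma omp single
    {
        for (int i = 0; i < NTASKS; i++) {
            #pragma omp task shared(done) firstprivate(i)
            {
                /* ... first phase of work for task i would go here ... */

                /* Task scheduling point: the runtime may suspend this
                   task here and run another one, but need not do so. */
                #pragma omp taskyield

                /* Second phase: record completion without a data race. */
                #pragma omp atomic
                done++;
            }
        }
    }   /* implicit barrier: all generated tasks are complete here */

    return done;
}
```

Calling `run_tasks()` should return `NTASKS` whether or not the compiler honors the OpenMP pragmas, because `taskyield` is only a scheduling hint and does not change the result.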