|
1 | 1 | # `OpenMP* Primes` Samples
|
2 |
| -The `OpenMP* Primes` sample is designed to illustrate how to use the OpenMP* API |
3 |
| -with the Intel® Fortran Compiler. |
| 2 | + |
| 3 | +The `OpenMP* Primes` sample is designed to illustrate how to use the OpenMP* API with the Intel® Fortran Compiler. |
4 | 4 |
|
5 | 5 | This program finds all primes in the first 40,000,000 integers, the number of
|
6 | 6 | 4n+1 primes, and the number of 4n-1 primes in the same range. The sample
|
7 |
| -illustrates two OpenMP* directives to help speed up code. |
| 7 | +demonstrates how to use two OpenMP* directives to help speed up code. |
| 8 | + |
8 | 9 |
|
9 | 10 | | Area | Description
|
10 | 11 | |:--- |:---
|
11 |
| -| What you will learn | How to build and run a Fortran OpenMP application using Intel® Fortran Compiler |
| 12 | +| What you will learn | How to build and run a Fortran OpenMP application using the Intel® Fortran Compiler |
12 | 13 | | Time to complete | 10 minutes
|
13 | 14 |
|
14 | 15 | ## Purpose
|
| 16 | + |
15 | 17 | This program finds all primes in the first 40,000,000 integers, the number of
|
16 |
| -4n+1 primes, and the number of 4n-1 primes in the same range. It illustrates two |
17 |
| -OpenMP* directives to help speed up the code. |
| 18 | +4n+1 primes, and the number of 4n-1 primes in the same range. It shows how to use |
| 19 | +two OpenMP directives to help speed up the code. |
18 | 20 |
|
19 |
| -First, a dynamic schedule clause is used with the OpenMP* for a directive. |
| 21 | +First, a dynamic schedule clause is used with the OpenMP `DO` directive. |
20 | 22 | Because the workload of the DO loop increases as its index gets bigger, the
|
21 | 23 | default static scheduling does not work well. Instead, dynamic scheduling
|
22 | 24 | accounts for the increased workload. Dynamic scheduling itself has more overhead
|
23 | 25 | than static scheduling, so a chunk size of 10 is used to reduce the overhead for
|
24 | 26 | dynamic scheduling.
|
25 | 27 |
|
26 |
| -Second, a reduction clause is used instead of an OpenMP* critical directive to |
| 28 | +Second, a reduction clause is used instead of an OpenMP critical directive to |
27 | 29 | eliminate lock overhead. Using a critical directive would cause excessive lock
|
28 | 30 | overhead due to the one-thread-at-a-time update of the shared variables each time
|
29 | 31 | through the DO loop. Instead, with the reduction clause, each thread accumulates into a private copy, and the
|
30 | 32 | shared variables are updated only once at the end of the loop.
|
31 | 33 |
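The two clauses described above can be sketched in a few lines of Fortran. This is a minimal illustration, not the sample's actual source: the program name, the variable names, and the reduced `limit` of 1,000,000 are assumptions chosen to keep the example short.

```fortran
! Hypothetical sketch; names and the smaller limit are not taken from the sample.
program primes_sketch
  implicit none
  integer, parameter :: limit = 1000000   ! the sample itself uses 40,000,000
  integer :: i, j, total, n4p1, n4m1
  logical :: is_prime

  total = 1     ! count the prime 2 up front; the loop tests odd numbers only
  n4p1 = 0      ! primes of the form 4n+1
  n4m1 = 0      ! primes of the form 4n-1

  ! schedule(dynamic, 10): threads grab chunks of 10 iterations as they finish,
  ! which balances the growing cost of testing larger candidates.
  ! reduction(+: ...): each thread sums into private copies that are combined
  ! once at the end, so no critical section is needed inside the loop.
  !$omp parallel do schedule(dynamic, 10) private(j, is_prime) reduction(+: total, n4p1, n4m1)
  do i = 3, limit, 2
     is_prime = .true.
     j = 3
     do while (j * j <= i)            ! trial division by odd numbers
        if (mod(i, j) == 0) then
           is_prime = .false.
           exit
        end if
        j = j + 2
     end do
     if (is_prime) then
        total = total + 1
        if (mod(i, 4) == 1) then
           n4p1 = n4p1 + 1
        else
           n4m1 = n4m1 + 1
        end if
     end if
  end do
  !$omp end parallel do

  print '(a, i0)', 'Primes found:  ', total
  print '(a, i0)', '4n+1 primes:   ', n4p1
  print '(a, i0)', '4n-1 primes:   ', n4m1
end program primes_sketch
```

Compiled without the OpenMP option, the directives are treated as comments and the same code runs serially, which makes it easy to compare timings.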
|
32 | 34 | ## Prerequisites
|
| 35 | + |
33 | 36 | | Optimized for | Description
|
34 | 37 | |:--- |:---
|
35 |
| -| OS | macOS* <br> Xcode* |
| 38 | +| OS | Linux*<br>Windows* |
36 | 39 | | Software | Intel® Fortran Compiler
|
37 | 40 |
|
38 | 41 | >**Note**: The Intel® Fortran Compiler is included in the [Intel® oneAPI HPC
|
39 |
| ->Toolkit (HPC |
40 |
| ->Kit)](https://www.intel.com/content/www/us/en/developer/tools/oneapi/hpc-toolkit.html). |
| 42 | +>Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/hpc-toolkit.html) or available as a |
| 43 | +[stand-alone download](https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html#fortran). |
41 | 44 |
|
42 | 45 | ## Key Implementation Details
|
43 |
| -The Intel® Fortran Compiler includes all libraries and headers necessary to |
44 |
| -compile and run OpenMP* enabled Fortran applications. |
45 | 46 |
|
46 |
| -You must use the following options to compile the program versions. |
47 |
| -- `-qopenmp` enables compiler recognition of OpenMP* directives. (Omitting this |
48 |
| - option results in a serial program.) |
49 |
| -- `-fpp` enables the Fortran preprocessor. |
| 47 | +The Intel Fortran Compiler includes all libraries and headers necessary to |
| 48 | +compile and run OpenMP-enabled Fortran applications. |
50 | 49 |
|
51 |
| -You can compile the program with all optimizations disabled using the `-O0` or |
52 |
| -at any level of optimization `-O1`, `-O2`, or `-O3`. |
| 50 | +Use the following options to compile the program versions. |
| 51 | + |
| 52 | +- [`-qopenmp` (Linux) or `/Qopenmp` (Windows)](https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/current/qopenmp-qopenmp.html) enables compiler recognition of OpenMP* directives. Omitting this |
| 53 | + option results in a serial program. |
| 54 | +- [`-O[n]` (Linux) or `/O[n]` (Windows)](https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/current/o-001.html) sets the optimization level from level 1 (`-O1`) to level 3 (`-O3`). You can disable all optimizations using `-O0` (Linux) or `/Od` (Windows). |
53 | 55 |
|
54 | 56 | >**Note**: You can find more information about these options in the *Compiler
|
55 | 57 | >Options* section of the [Intel® Fortran Compiler Developer Guide and
|
56 |
| ->Reference](https://www.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference). |
| 58 | +>Reference](https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/current/overview.html). |
57 | 59 |
|
58 | 60 | ## Set Environment Variables
|
| 61 | + |
59 | 62 | When working with the command-line interface (CLI), you should configure the
|
60 | 63 | oneAPI toolkits using environment variables. Set up your CLI environment by
|
61 | 64 | sourcing the `setvars` script every time you open a new terminal window. This
|
62 | 65 | practice ensures that your compiler, libraries, and tools are ready for
|
63 | 66 | development.
|
64 | 67 |
|
65 |
| -## Build the `OpenMP* Primes` Sample |
66 | 68 | > **Note**: If you have not already done so, set up your CLI environment by
|
67 | 69 | > sourcing the `setvars` script in the root of your oneAPI installation.
|
68 | 70 | >
|
69 |
| -> Linux and macOS*: |
70 |
| -> - For system wide installations: `. /opt/intel/oneapi/setvars.sh` |
| 71 | +> Linux: |
| 72 | +> - For system-wide installations in the default installation directory: `. /opt/intel/oneapi/setvars.sh` |
71 | 73 | > - For private installations: `. ~/intel/oneapi/setvars.sh`
|
72 |
| -> - For non-POSIX shells, like csh, use commands similar to the following: `bash |
73 |
| -> -c 'source <install-dir>/setvars.sh ; exec csh'` |
| 74 | +> |
| 75 | +> Windows: |
| 76 | +> - Under normal circumstances, you do not need to run the setvars.bat batch file. The terminal shortcuts |
| 77 | +> in the Windows Start menu, Intel oneAPI command prompt for <target architecture> for Visual Studio <year>, |
| 78 | +> set these variables automatically. |
| 79 | +> |
| 80 | +> For additional information, see [Use the Command Line on Windows](https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/current/use-the-command-line-on-windows.html). |
74 | 81 | >
|
75 | 82 | > For more information on configuring environment variables, see [Use the
|
76 |
| -> setvars Script with Linux* or |
77 |
| -> macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html). |
78 |
| -
|
79 |
| -### Use Visual Studio Code* (VS Code) (Optional) |
80 |
| -You can use Visual Studio Code* (VS Code) extensions to set your environment, |
81 |
| -create launch configurations, and browse and download samples. |
82 |
| - |
83 |
| -The basic steps to build and run a sample using VS Code include: |
84 |
| - 1. Configure the oneAPI environment with the extension **Environment |
85 |
| - Configurator for Intel® oneAPI Toolkits**. |
86 |
| - 2. Download a sample using the extension **Code Sample Browser for Intel® |
87 |
| - oneAPI Toolkits**. |
88 |
| - 3. Open a terminal in VS Code (**Terminal > New Terminal**). |
89 |
| - 4. Run the sample in the VS Code terminal using the instructions below. |
90 |
| - |
91 |
| -To learn more about the extensions and how to configure the oneAPI environment, |
92 |
| -see the [Using Visual Studio Code with Intel® oneAPI Toolkits User |
93 |
| -Guide](https://www.intel.com/content/www/us/en/develop/documentation/using-vs-code-with-intel-oneapi/top.html). |
94 |
| - |
95 |
| -### On macOS* |
| 83 | +> setvars Script with Linux and Windows](https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/current/specifying-the-location-of-compiler-components.html). |
| 84 | +
|
| 85 | +## Build the `OpenMP Primes` Sample |
| 86 | + |
96 | 87 | 1. Change to the sample directory.
|
97 |
| -2. Build release and debug versions of the program. |
| 88 | +2. Build debug and release versions of the program. |
| 89 | + |
| 90 | + Linux: |
| 91 | + |
98 | 92 | ```
|
99 | 93 | make clean
|
100 | 94 | make debug
|
| 95 | + make |
101 | 96 | ```
|
102 | 97 |
|
103 |
| -#### Troubleshooting |
104 |
| -If you receive an error message, troubleshoot the problem using the |
105 |
| -**Diagnostics Utility for Intel® oneAPI Toolkits**. The diagnostic utility |
106 |
| -provides configuration and system checks to help find missing dependencies, |
107 |
| -permissions errors, and other issues. See the [Diagnostics Utility for Intel® |
108 |
| -oneAPI Toolkits User |
109 |
| -Guide](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) |
110 |
| -for more information on using the utility. |
| 98 | + Windows: |
| 99 | + |
| 100 | + ``` |
| 101 | + build.bat |
| 102 | + ``` |
| 103 | + |
| 104 | +## Run the `OpenMP* Primes` Program |
111 | 105 |
|
112 |
| -## Run the `OpenMP* Primes` Programs |
113 | 106 | You can run different versions of the program to discover application runtime
|
114 | 107 | changes.
|
115 | 108 |
|
116 | 109 | ### Experiment 1: Run the Debug Version
|
| 110 | + |
117 | 111 | 1. Run the program.
|
| 112 | + |
| 113 | + Linux: |
| 114 | + |
118 | 115 | ```
|
119 | 116 | make debug_run
|
120 | 117 | ```
|
121 |
| - Notice the speed. |
122 | 118 |
|
123 |
| -### Experiment 2: Run the Optimized Version |
124 |
| -1. Build and run the release version. |
| 119 | + Windows: |
| 120 | + |
125 | 121 | ```
|
126 |
| - make |
| 122 | + debug_run.bat |
127 | 123 | ```
|
128 |
| -2. Run the program. |
| 124 | + |
| 125 | + Notice the timestamp. With multi-threaded applications, use Elapsed Time to measure the time. CPU time is the time |
| 126 | + accumulated for all threads. |
| 127 | + |
| 128 | +### Experiment 2: Run the Optimized Version |
| 129 | + |
| 130 | +1. Run the program. |
| 131 | + |
| 132 | + Linux: |
| 133 | + |
129 | 134 | ```
|
130 | 135 | make run
|
131 | 136 | ```
|
| 137 | + |
| 138 | + Windows: |
| 139 | + |
| 140 | + ``` |
| 141 | + run.bat |
| 142 | + ``` |
| 143 | + |
132 | 144 | Did the debug (unoptimized) version run slower?
|
133 | 145 |
|
134 | 146 | ### Experiment 3: Change the Number of Threads
|
| 147 | + |
135 | 148 | By default, an OpenMP application creates and uses as many threads as the number
|
136 |
| -of "processors" in a system. A "processor" is defined as the number of logical |
| 149 | +of "processors" in a system. Here, "processors" means logical |
137 | 150 | processors, which number twice the physical cores on systems with hyper-threading enabled.
|
138 | 151 |
|
139 | 152 | OpenMP uses the environment variable `OMP_NUM_THREADS` to set the number of
|
140 | 153 | threads to use.
|
141 | 154 |
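As a quick check that `OMP_NUM_THREADS` is being picked up, a tiny program can print the thread count the OpenMP runtime intends to use. This is a hypothetical helper and not part of the sample; it assumes compilation with the OpenMP option so that the `omp_lib` module and runtime are available.

```fortran
! Hypothetical helper, not part of the sample: prints the OpenMP thread limit.
program report_threads
  use omp_lib, only: omp_get_max_threads
  implicit none
  ! omp_get_max_threads returns the number of threads a parallel region
  ! would use by default, which reflects OMP_NUM_THREADS when it is set.
  print '(a, i0)', 'Max OpenMP threads: ', omp_get_max_threads()
end program report_threads
```

Running it after changing `OMP_NUM_THREADS` should show the new value before you time the sample itself.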
|
142 | 155 | 1. Experiment with a single thread.
|
| 156 | + |
| 157 | + Linux: |
| 158 | + |
143 | 159 | ```
|
144 |
| - export OMP_NUM_THREADS=1 |
| 160 | + export OMP_NUM_THREADS=1 |
145 | 161 | make run
|
146 | 162 | ```
|
| 163 | + |
| 164 | + Windows: |
| 165 | + |
| 166 | + ``` |
| 167 | + set OMP_NUM_THREADS=1 |
| 168 | + run.bat |
| 169 | + ``` |
| 170 | + |
147 | 171 | Notice the number of threads reported by the application.
|
148 | 172 |
|
149 | 173 | 2. Experiment with 2 threads.
|
| 174 | + |
| 175 | + Linux: |
| 176 | + |
150 | 177 | ```
|
151 | 178 | export OMP_NUM_THREADS=2
|
152 | 179 | make run
|
| 180 | + ``` |
| 181 | + |
| 182 | + Windows: |
| 183 | + |
| 184 | + ``` |
| 185 | + set OMP_NUM_THREADS=2 |
| 186 | + run.bat |
153 | 187 | ```
|
| 188 | + |
154 | 189 | Notice if the application ran faster with more threads.
|
155 | 190 |
|
156 |
| -3. Experiment with the number of threads, and see changing threads numbers |
| 191 | +3. Experiment with the number of threads and see how changing the number of threads |
157 | 192 | affects performance.
|
158 | 193 |
|
159 |
| -4. Clean the project files. |
| 194 | +4. On Linux, clean the object and executable files. |
| 195 | + |
160 | 196 | ```
|
161 | 197 | make clean
|
162 | 198 | ```
|
163 | 199 |
|
164 | 200 | ## Further Reading
|
165 |
| -Interested in learning more? Read about using OpenMP with the Intel® Fortran |
166 |
| -Compiler in the *OpenMP Support* section of the [Intel® Fortran Compiler |
| 201 | + |
| 202 | +Read about using OpenMP with the Intel® Fortran Compiler in the *OpenMP Support* section of the [Intel® Fortran Compiler |
167 | 203 | Developer Guide and
|
168 |
| -Reference](https://www.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference). |
| 204 | +Reference](https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/current/overview.html). |
169 | 205 |
|
170 | 206 | ## License
|
| 207 | + |
171 | 208 | Code samples are licensed under the MIT license. See
|
172 | 209 | [License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt)
|
173 | 210 | for details.
|
|