Commit 6547e0e

rwg - updates 3 README.md files to Joe O's outline
Author: Ron Green
1 parent ce77f3c, commit 6547e0e

6 files changed: +57 -41 lines changed

Binary file not shown.

DirectProgramming/Fortran/openmp_samples/README.md

Lines changed: 13 additions & 9 deletions
@@ -1,11 +1,20 @@
-# Fortran OpenMP sample
+# `Fortran OpenMP*` sample
 This sample is designed to illustrate how to use
 the OpenMP* API with the Intel® Fortran Compiler.

 This program finds all primes in the first 40,000,000 integers,
 the number of 4n+1 primes, and the number of 4n-1 primes in the same range.
 It illustrates two OpenMP* directives to help speed up the code.

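As a rough illustration of the computation described above (not the sample's Fortran source), the following Python sketch counts the primes up to a limit and splits them into the 4n+1 and 4n-1 classes. The sieve approach, the function name `count_primes_by_residue`, and the small limit of 100 are assumptions made for illustration. The sample itself covers 40,000,000 integers and relies on OpenMP* directives, which this serial sketch does not model.

```python
def count_primes_by_residue(limit):
    """Return (all primes, 4n+1 primes, 4n-1 primes) found in 2..limit.

    Hypothetical helper: uses a sieve of Eratosthenes, which may differ
    from the method used in the Fortran sample.
    """
    if limit < 2:
        return 0, 0, 0

    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False

    # Cross out multiples of each prime, starting from its square.
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1

    total = plus_one = minus_one = 0
    for n in range(2, limit + 1):
        if is_prime[n]:
            total += 1
            if n % 4 == 1:      # primes of the form 4n+1
                plus_one += 1
            elif n % 4 == 3:    # primes of the form 4n-1
                minus_one += 1
    return total, plus_one, minus_one


print(count_primes_by_residue(100))  # (25, 11, 13)
```

The prime 2 belongs to neither class, so the two class counts add up to one less than the total.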
+
+| Optimized for | Description
+|:--- |:---
+| OS | macOS* with Xcode* installed
+| Software | Intel® oneAPI Intel Fortran Compiler (Beta)
+| What you will learn | How to build and run a Fortran OpenMP application using Intel Fortran compiler
+| Time to complete | 10 minutes
+
+## Purpose

 This program finds all primes in the first 40,000,000 integers, the number of 4n+1 primes,
 and the number of 4n-1 primes in the same range. It illustrates two OpenMP* directives
@@ -31,19 +40,14 @@ This option can also be omitted, in which case the generated executable will be

 The option -fpp enables the Fortran preprocessor.
 Read the Intel® Fortran Compiler Documentation for more information about these options.
-
-| Optimized for | Description
-|:--- |:---
-| OS | macOS* with Xcode* installed
-| Software | Intel® oneAPI Intel Fortran Compiler (Beta)
-| What you will learn | How to build and run a Fortran OpenMP application using Intel Fortran compiler
-| Time to complete | 10 minutes

+## Key Implementation Details
+The Intel® oneAPI Intel Fortran Compiler (Beta) includes all libraries and headers necessary to compile and run OpenMP* enabled Fortran applications. Users simply use the -qopenmp compiler option to compile and link their OpenMP enabled applications.

 ## License
 This code sample is licensed under MIT license

-## How to Build
+## Building the `Fortran OpenMP*` sample

 ### Experiment 1: Unoptimized build and run
 * Build openmp_samples
Binary file not shown.

DirectProgramming/Fortran/optimize_samples/README.md

Lines changed: 18 additions & 16 deletions
@@ -1,7 +1,6 @@
-# Fortran Optimization Sample
+# `Fortran Optimization` sample

-This sample is designed to illustrate specific
-compiler optimizations, features, tools, and programming concepts.
+This sample is designed to illustrate compiler optimization features and programming concepts.

 This program computes the integral (area under the curve) of a user-supplied function
 over an interval in a stepwise fashion.
@@ -14,7 +13,16 @@ more closely approximating the true value.

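To make the stepwise integration concrete, here is a minimal Python sketch (not the sample's Fortran source) that approximates an integral by summing rectangles and shows the estimate approaching the true value as the step count grows. The midpoint rule, the function name `integrate`, and the choice of sin(x) on [0, pi] are assumptions for illustration. The README does not state which rule or integrand the sample uses.

```python
import math


def integrate(f, a, b, steps):
    """Approximate the integral of f over [a, b] with `steps` rectangles.

    Hypothetical helper: each rectangle is evaluated at its midpoint,
    which may differ from the rule used in the Fortran sample.
    """
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


# The exact integral of sin(x) over [0, pi] is 2.
for steps in (4, 16, 64, 256):
    approx = integrate(math.sin, 0.0, math.pi, steps)
    print(steps, approx, abs(approx - 2.0))
```

Each row prints the step count, the estimate, and the absolute error, so the shrinking error is visible directly.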
 The source for this program also demonstrates recommended Fortran coding practices.

-## Compile the sample several times using different optimization options:
+| Optimized for | Description
+|:--- |:---
+| OS | macOS* with Xcode* installed
+| Software | Intel® oneAPI Intel® Fortran Compiler (Beta)
+| What you will learn | Optimization using the Intel® Fortran compiler
+| Time to complete | 15 minutes
+
+## Purpose
+
+The Intel® Fortran Compiler can optimize applications for performance. The primary compiler option is -O followed by a numeric optimization "level", from 0, which requests no optimization, to 3, which requests all compiler optimizations for the application. The -O optimization levels are:

 * O0 - No optimizations
 * O1 - Enables optimizations for speed and disables some optimizations that increase code size and affect speed.
@@ -23,27 +31,21 @@ The source for this program also demonstrates recommended Fortran coding practic

 Read the [Intel® Fortran Compiler Developer Guide and Reference][1]
 [1]: https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top.html "Intel® Fortran Compiler Developer Guide and Reference"
-for more information about these options.
+for more information about these options.

-Some of these automatic optimizations use features and options that can
+Some of these compiler optimizations use features and options that can
 restrict program execution to specific architectures.

-| Optimized for | Description
-|:--- |:---
-| OS | macOS* with Xcode* installed
-| Software | Intel® oneAPI Intel Fortran Compiler (Beta)
-| What you will learn | Vectorization using Intel Fortran compiler
-| Time to complete | 15 minutes
-

 ## License
 This code sample is licensed under MIT license

-## How to Build
+## Building the `Fortran Optimization` sample
+
 Use one of the following compiler options:


-## macOS* : -O0 -O1, -O2, -O3
+### macOS* : -O0 -O1, -O2, -O3

 ### STEP 1: Build and run with -O0
 cd optimize_samples
@@ -152,7 +154,7 @@ This does vary by application but generally with Intel® Compilers
 O2 has most optimizations. Sometimes O3 can help, of course,
 but generally O2 is sufficient for most applications.

-### Extra Exploration
+### Further Exploration
 The Intel® Fortran Compiler has many options for optimization.
 If you have a genuine Intel® Architecture processor, try these additional options

Binary file not shown.

DirectProgramming/Fortran/vec_samples/README.md

Lines changed: 26 additions & 16 deletions
@@ -1,26 +1,23 @@
-# Fortran Vectorization Sample
-
-The Intel® Compiler has an auto-vectorizer that detects operations in the application
-that can be done in parallel and converts sequential operations
-to parallel operations by using the
-Single Instruction Multiple Data (SIMD) instruction set.
+# `Fortran Vectorization` sample

 In this sample, you will use the auto-vectorizer to improve the performance
 of the sample application. You will compare the performance of the
 serial version and the version that was compiled with the auto-vectorizer.

 | Optimized for | Description
 |:--- |:---
-| OS | macOS* with Xcode installed
+| OS | macOS* with Xcode* installed
+| Hardware | Intel-based Mac*
 | Software | Intel® oneAPI Intel Fortran Compiler (beta)
 | What you will learn | Vectorization using Intel Fortran compiler
 | Time to complete | 15 minutes


-## License
-This code sample is licensed under MIT license
-
-### Introduction to Auto Vectorization
+## Purpose
+The Intel® Compiler has an auto-vectorizer that detects operations in the application
+that can be done in parallel and converts sequential operations
+to parallel operations by using the
+Single Instruction Multiple Data (SIMD) instruction set.

 For the Intel® compiler, vectorization is the unrolling of a loop combined with the generation of packed SIMD instructions. Because the packed instructions operate on more than one data element at a time, the loop can execute more efficiently. It is sometimes referred to as auto-vectorization to emphasize that the compiler automatically identifies and optimizes suitable loops on its own.

@@ -39,7 +36,7 @@ Vectorization is enabled with the compiler at optimization levels of O2 (default

 4. improve performance using Interprocedural Optimization

-### Preparing the Sample Application
+## Key Implementation Details

 In this sample, you will use the following files:

@@ -48,7 +45,20 @@ In this sample, you will use the following files:
 matvec.f90


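The README describes matvec.f90 as a matrix-times-vector algorithm. As a rough, language-neutral sketch of that computation (in Python, not the sample's Fortran source), the product reduces to two nested loops. The function name `matvec`, the row-by-row loop order, and the 2x2 test data are assumptions for illustration. The inner multiply-and-accumulate loop is the kind of loop an auto-vectorizer typically targets.

```python
def matvec(a, b):
    """Return c = a * b for a matrix `a` (list of rows) and a vector `b`.

    Hypothetical helper: the loop order here is chosen for readability
    and may not match the one used in matvec.f90.
    """
    rows = len(a)
    cols = len(b)
    c = [0.0] * rows
    for i in range(rows):
        acc = 0.0
        # Inner loop: the same multiply-add applied across a run of elements.
        for j in range(cols):
            acc += a[i][j] * b[j]
        c[i] = acc
    return c


print(matvec([[1.0, 2.0], [3.0, 4.0]], [5.0, 6.0]))  # [17.0, 39.0]
```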
-### Establishing a Performance Baseline
+## License
+This code sample is licensed under MIT license
+
+
+## Building the `Fortran Vectorization` sample
+
+This sample contains 2 Fortran source files, in subdirectory 'src/' under the main sample root directory oneAPI-samples/DirectProgramming/Fortran/vec_samples
+
+1. matvec.f90 is a Fortran source file with a matrix-times-vector algorithm
+2. driver.f90 is a Fortran source file with the main program calling matvec
+
+## Running the `Fortran Vectorization` sample
+
+### Step 1 Establishing a Performance Baseline

 To set a performance baseline for the improvements that follow in this sample, compile your sources from the src directory with these compiler options:

@@ -60,7 +70,7 @@ Execute 'MatVector'
 and record the execution time reported in the output. This is the baseline against which subsequent improvements will be measured.


-### Generating a Vectorization Report
+### Step 2 Generating a Vectorization Report

 A vectorization report shows what loops in your code were vectorized and explains why other loops were not vectorized. To generate a vectorization report, use the **qopt-report-phase=vec** compiler options together with **qopt-report=1** or **qopt-report=2**.

@@ -149,7 +159,7 @@ For more information on the **qopt-report** and **qopt-report-phase** compiler o
 [3]: https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top/compiler-reference/compiler-options/alphabetical-list-of-compiler-options.html "Options"


-### Improving Performance by Aligning Data
+### Step 3 Improving Performance by Aligning Data

 The vectorizer can generate faster code when operating on aligned data. In this activity you will improve the vectorizer performance by aligning the arrays a, b, and c in **driver.f90** on a 16-byte boundary so the vectorizer can use aligned load instructions for all arrays rather than the slower unaligned load instructions and can avoid runtime tests of alignment. Using the ALIGNED macro will insert an alignment directive for a, b, and c in driver.f90 with the following syntax:

@@ -172,7 +182,7 @@ Recompile the program after adding the ALIGNED macro to ensure consistently alig
 ifort -real-size 64 -qopt-report=2 -qopt-report-phase=vec -D ALIGNED matvec.f90 driver.f90 -o MatVector


-### Improving Performance with Interprocedural Optimization
+### Step 4 Improving Performance with Interprocedural Optimization

 The compiler may be able to perform additional optimizations if it is able to optimize across source line boundaries. These may include, but are not limited to, function inlining. This is enabled with the **-ipo** option.
