Fortran misc edits to conform to dir struc and naming #55

Merged
merged 12 commits into from
Aug 25, 2020
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
Copyright 2020 Intel Corporation

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,37 @@
## =============================================================
## Copyright © 2020 Intel Corporation
##
## SPDX-License-Identifier: MIT
## =============================================================
##
##
##******************************************************************************
## Content:
##
## Build for openmp_sample
##******************************************************************************

FC = ifort

release: openmp_sample.exe

debug: openmp_sample_dbg.exe

run: release ; @export DYLD_LIBRARY_PATH="$(LIBRARY_PATH)" ; ./openmp_sample.exe

debug_run: debug ; @export DYLD_LIBRARY_PATH="$(LIBRARY_PATH)" ; ./openmp_sample_dbg.exe

openmp_sample.exe: openmp_sample.o
	$(FC) -O2 -fpp -qopenmp $^ -o $@

openmp_sample_dbg.exe: openmp_sample_dbg.o
	$(FC) -O0 -g -fpp -qopenmp $^ -o $@

%.o: src/%.f90
	$(FC) -O2 -c -fpp -qopenmp -o $@ $<

%_dbg.o: src/%.f90
	$(FC) -O0 -g -c -fpp -qopenmp -o $@ $<

clean:
	/bin/rm -f core.* *.o *.exe
@@ -0,0 +1,98 @@
# `OpenMP Primes`
This sample is designed to illustrate how to use
the OpenMP* API with the Intel® Fortran Compiler.

This program finds all primes in the first 40,000,000 integers,
the number of 4n+1 primes, and the number of 4n-1 primes in the same range.
It illustrates two OpenMP* directives to help speed up the code.


| Optimized for | Description
|:--- |:---
| OS | macOS* with Xcode* installed
| Software | Intel&reg; oneAPI Intel Fortran Compiler (Beta)
| What you will learn | How to build and run a Fortran OpenMP application using the Intel Fortran Compiler
| Time to complete | 10 minutes

## Purpose

This program finds all primes in the first 40,000,000 integers, the number of 4n+1 primes,
and the number of 4n-1 primes in the same range. It illustrates two OpenMP* directives
to help speed up the code.

First, a dynamic schedule clause is used with the OpenMP* DO directive.
Because the DO loop's workload increases as its index gets bigger,
the default static scheduling does not balance the work well across threads. Instead, dynamic scheduling
is used to account for the increasing workload.
Dynamic scheduling has more overhead than static scheduling, however,
so a chunk size of 10 is used to reduce that overhead.
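
The scheduling choice described above can be sketched in isolation. This is a minimal illustration, not the sample's source: the module name `prime_schedule_sketch`, the helper `is_odd_prime`, and the per-index `flags` array are assumptions made for the example. Each iteration writes only its own array element, so no reduction is needed and the `schedule(dynamic, 10)` clause is the only OpenMP feature on display.

```fortran
! Hypothetical sketch: names below are illustrative, not taken from the sample.
module prime_schedule_sketch
  implicit none
contains

  ! Trial division by odd factors; the cost grows with sqrt(n),
  ! so later loop iterations do more work than earlier ones.
  pure logical function is_odd_prime(n)
    integer, intent(in) :: n
    integer :: factor
    is_odd_prime = (n >= 3) .and. (mod(n, 2) == 1)
    factor = 3
    do while (is_odd_prime .and. factor * factor <= n)
      if (mod(n, factor) == 0) is_odd_prime = .false.
      factor = factor + 2
    end do
  end function is_odd_prime

  ! Counts odd primes in 3..upper. Threads grab 10 iterations at a time,
  ! so a thread that finishes cheap iterations early picks up more work.
  integer function count_odd_primes(upper) result(total)
    integer, intent(in) :: upper
    logical, allocatable :: flags(:)
    integer :: i
    allocate(flags(max(upper, 1)))
    flags = .false.
    !$omp parallel do schedule(dynamic, 10)
    do i = 3, upper, 2
      flags(i) = is_odd_prime(i)   ! each iteration owns one element
    end do
    !$omp end parallel do
    total = count(flags)
  end function count_odd_primes

end module prime_schedule_sketch

program schedule_demo
  use prime_schedule_sketch
  implicit none
  print *, 'Odd primes below 1000:', count_odd_primes(1000)
end program schedule_demo
```

Changing the clause to `schedule(static)` and timing both variants on a large range is one way to observe the load imbalance the paragraph describes.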

Second, a reduction clause is used instead of an OpenMP* critical directive
to eliminate lock overhead. A critical directive would cause excessive lock overhead
because the shared variables would be updated one thread at a time on each pass through the DO loop.
With the reduction clause, each thread accumulates into its own private copy, and the shared variables are updated only once, at the end of the loop.
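
To make the contrast concrete, here is a small sketch that counts 4n+1 and 4n-1 primes twice: once with a `reduction` clause and once with a `critical` section. It is an illustration under assumptions, not the sample's code: the module `prime_reduction_sketch` and both subroutine names are hypothetical. Both variants should report the same counts; only the synchronization cost differs.

```fortran
! Hypothetical sketch: names below are illustrative, not taken from the sample.
module prime_reduction_sketch
  implicit none
contains

  pure logical function is_odd_prime(n)
    integer, intent(in) :: n
    integer :: factor
    is_odd_prime = (n >= 3) .and. (mod(n, 2) == 1)
    factor = 3
    do while (is_odd_prime .and. factor * factor <= n)
      if (mod(n, factor) == 0) is_odd_prime = .false.
      factor = factor + 2
    end do
  end function is_odd_prime

  ! Each thread accumulates into private copies of c41 and c43;
  ! the copies are combined once when the loop ends.
  subroutine count_with_reduction(upper, n41, n43)
    integer, intent(in)  :: upper
    integer, intent(out) :: n41, n43
    integer :: i, c41, c43
    c41 = 0
    c43 = 0
    !$omp parallel do schedule(dynamic, 10) reduction(+:c41, c43)
    do i = 3, upper, 2
      if (is_odd_prime(i)) then
        if (mod(i, 4) == 1) then
          c41 = c41 + 1
        else
          c43 = c43 + 1
        end if
      end if
    end do
    !$omp end parallel do
    n41 = c41
    n43 = c43
  end subroutine count_with_reduction

  ! Same result, but every update of the shared counters takes a lock.
  subroutine count_with_critical(upper, n41, n43)
    integer, intent(in)  :: upper
    integer, intent(out) :: n41, n43
    integer :: i, c41, c43
    c41 = 0
    c43 = 0
    !$omp parallel do schedule(dynamic, 10)
    do i = 3, upper, 2
      if (is_odd_prime(i)) then
        !$omp critical
        if (mod(i, 4) == 1) then
          c41 = c41 + 1
        else
          c43 = c43 + 1
        end if
        !$omp end critical
      end if
    end do
    !$omp end parallel do
    n41 = c41
    n43 = c43
  end subroutine count_with_critical

end module prime_reduction_sketch

program reduction_demo
  use prime_reduction_sketch
  implicit none
  integer :: a41, a43, b41, b43
  call count_with_reduction(100000, a41, a43)
  call count_with_critical(100000, b41, b43)
  print *, 'reduction: 4n+1 =', a41, ' 4n-1 =', a43
  print *, 'critical : 4n+1 =', b41, ' 4n-1 =', b43
end program reduction_demo
```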

The sample can be compiled unoptimized (-O0), or at any level of
optimization (-O1 through -O3). In addition, the following compiler options are needed.

The option -qopenmp enables compiler recognition of OpenMP* directives.
This option can also be omitted, in which case the generated executable will be a serial program.
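
The serial fallback can be seen in a few lines. The sample itself relies on the preprocessor and the `_OPENMP` macro; the sketch below instead uses the OpenMP conditional compilation sentinel `!$`, which needs no preprocessor. Lines starting with `!$ ` are compiled only when OpenMP is enabled and are plain comments otherwise. The module and function names are hypothetical.

```fortran
! Hypothetical sketch: names below are illustrative, not taken from the sample.
module openmp_probe_sketch
  !$ use omp_lib, only: omp_get_max_threads
  implicit none
contains

  ! Reports .true. only when the file was compiled with OpenMP enabled.
  logical function built_with_openmp() result(enabled)
    enabled = .false.
    !$ enabled = .true.
  end function built_with_openmp

  ! Falls back to 1 in a serial build, where the OpenMP line is a comment.
  integer function available_threads() result(nthr)
    nthr = 1
    !$ nthr = omp_get_max_threads()
  end function available_threads

end module openmp_probe_sketch

program openmp_probe
  use openmp_probe_sketch
  implicit none
  if (built_with_openmp()) then
    print *, 'OpenMP build, threads available:', available_threads()
  else
    print *, 'Serial build, threads available:', available_threads()
  end if
end program openmp_probe
```

Compiling the same file once with -qopenmp and once without it should switch the program between the two messages.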

The option -fpp enables the Fortran preprocessor.
Read the Intel® Fortran Compiler Documentation for more information about these options.

## Key Implementation Details
The Intel&reg; oneAPI Intel Fortran Compiler (Beta) includes all libraries and headers necessary to compile and run OpenMP* enabled Fortran applications. Users simply use the -qopenmp compiler option to compile and link their OpenMP enabled applications.

## License
This code sample is licensed under the MIT license.

## Building the `Fortran OpenMP*` sample

### Experiment 1: Unoptimized build and run
* Build openmp_samples

      cd openmp_samples
      make clean
      make debug

* Run the program

      make debug_run

* What did you see?

Did the debug, unoptimized code run slower?

### Experiment 2: Default Optimized build and run

* Build openmp_samples

      make

* Run the program

      make run

### Experiment 3: Controlling number of threads
By default, an OpenMP application creates and uses as many threads as there are "processors" in the system. Here a "processor" means a logical processor; on cores with hyperthreading enabled, the number of logical processors is twice the number of physical cores.

OpenMP uses the environment variable `OMP_NUM_THREADS` to set the number of threads to use. Try this:

    export OMP_NUM_THREADS=1
    make run

Note the number of threads reported by the application. Now try 2 threads:

    export OMP_NUM_THREADS=2
    make run

Did this make the application run faster? Experiment with the number of threads and see how it affects performance.
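
For a quicker experiment than the full 40,000,000 range, a small timing harness along these lines may help. It is a sketch under assumptions: the module `thread_timing_sketch`, its functions, and the range of 2,000,000 are made up for illustration. It reports the thread limit that `OMP_NUM_THREADS` controls and the wall-clock time of one counting pass.

```fortran
! Hypothetical sketch: names and the test range are illustrative only.
module thread_timing_sketch
  !$ use omp_lib, only: omp_get_max_threads
  implicit none
contains

  pure logical function is_odd_prime(n)
    integer, intent(in) :: n
    integer :: factor
    is_odd_prime = (n >= 3) .and. (mod(n, 2) == 1)
    factor = 3
    do while (is_odd_prime .and. factor * factor <= n)
      if (mod(n, factor) == 0) is_odd_prime = .false.
      factor = factor + 2
    end do
  end function is_odd_prime

  integer function count_odd_primes(upper) result(total)
    integer, intent(in) :: upper
    integer :: i
    total = 0
    !$omp parallel do schedule(dynamic, 10) reduction(+:total)
    do i = 3, upper, 2
      if (is_odd_prime(i)) total = total + 1
    end do
    !$omp end parallel do
  end function count_odd_primes

  ! Upper bound on threads for the next parallel region; 1 in a serial build.
  integer function max_threads() result(nthr)
    nthr = 1
    !$ nthr = omp_get_max_threads()
  end function max_threads

end module thread_timing_sketch

program thread_timing
  use, intrinsic :: iso_fortran_env, only: int64
  use thread_timing_sketch
  implicit none
  integer(int64) :: t0, t1, rate
  integer :: total

  call system_clock(t0, rate)
  total = count_odd_primes(2000000)
  call system_clock(t1)

  print *, 'Threads available:', max_threads()
  print *, 'Odd primes found :', total
  print *, 'Elapsed seconds  :', real(t1 - t0) / real(rate)
end program thread_timing
```

Running it with different `OMP_NUM_THREADS` values should change the reported thread count; whether the elapsed time drops depends on the machine and the range chosen.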

### Clean up
* Clean the program

      make clean

## Further Reading
Interested in learning more? We have a wealth of information
on using OpenMP with the Intel Fortran Compiler in our
[OpenMP section of Developer Guide and Reference][1]

[1]: https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top/optimization-and-programming-guide/openmp-support.html "Developer Guide and Reference"
@@ -0,0 +1,30 @@
{
"name": "openmp-primes",
"categories": [ "Toolkit/Intel® oneAPI HPC Toolkit" ],
"description": "Fortran Tutorial - Using OpenMP",
"toolchain": [ "ifort" ],
"languages": [ { "fortran": {} } ],
"targetDevice": [ "CPU" ],
"os": [ "darwin" ],
"builder": [ "make" ],
"ciTests":{
"darwin": [
{
"id": "fort_release_cpu",
"steps": [
"make release",
"make run",
"make clean"
]
},
{
"id": "fort_debug_cpu",
"steps": [
"make debug",
"make debug_run",
"make clean"
]
}
]
}
}
@@ -0,0 +1,117 @@
! ==============================================================
! Copyright © 2020 Intel Corporation
!
! SPDX-License-Identifier: MIT
! =============================================================
!
! [DESCRIPTION]
! This code finds all primes in the first 40,000,000 integers, the number of
! 4n+1 primes, and the number of 4n-1 primes in the same range.
!
! This source illustrates two OpenMP directives to help speed up
! the code. First, a dynamic "schedule" clause is used with the OpenMP "do"
! directive. Because the "do" loop's workload increases as its index
! gets bigger, the default "static" scheduling does not work well.
! Instead dynamic scheduling is used to account for the increasing
! workload. But dynamic scheduling itself has more overhead than
! static scheduling, so a "chunk size" of 10 is used to reduce the
! overhead for dynamic scheduling. Second, a "reduction" clause is
! used instead of an OpenMP "critical" directive to eliminate lock overhead.
! A "critical" directive would cause excessive lock overhead due to
! the one-thread-at-a-time update of the shared variables each
! time through the "do" loop. Instead the reduction clause causes only
! one update of the shared variables, at the end of the loop.
!
! [COMPILE]
! Use the following compiler options to compile both multi- and
! single-threaded versions.
!
! Parallel compilation:
!
! Windows*: /Qopenmp /fpp
!
! Linux* and macOS*: -qopenmp -fpp
!
! Serial compilation:
!
! Use the same command, but omit the -qopenmp (Linux* and macOS*)
! or /Qopenmp (Windows) option.
!

program ompPrime

#ifdef _OPENMP
include 'omp_lib.h' !needed for OMP_GET_NUM_THREADS()
#endif

integer :: start = 1
integer :: end = 40000000
integer :: number_of_primes = 0
integer :: number_of_41primes = 0
integer :: number_of_43primes = 0
integer index, factor, limit, nthr
real rindex, rlimit
logical prime, print_primes

print_primes = .false.
nthr = 1 ! assume just one thread
print *, ' Range to check for Primes:',start,end

#ifdef _OPENMP
!$omp parallel

!$omp single
nthr = OMP_GET_NUM_THREADS()
print *, ' We are using',nthr,' thread(s)'
!$omp end single
!

!
!$omp do private(factor, limit, prime) &
!$omp& schedule(dynamic,10) &
!$omp& reduction(+:number_of_primes,number_of_41primes,number_of_43primes)
#else
print *, ' We are using',nthr,' thread(s)'
#endif

do index = start, end, 2 !workshared loop

limit = int(sqrt(real(index)))
prime = .true. ! assume number is prime
factor = 3

do
if(prime .and. factor .le. limit) then
if(mod(index,factor) .eq. 0) then
prime = .false.
endif
factor = factor + 2
else
exit ! we can jump out of non-workshared loop
endif
enddo

if(prime) then
if(print_primes) then
print *, index, ' is prime'
endif

number_of_primes = number_of_primes + 1

if(mod(index,4) .eq. 1) then
number_of_41primes = number_of_41primes + 1
endif

if(mod(index,4) .eq. 3) then
number_of_43primes = number_of_43primes + 1
endif

endif ! if(prime)
enddo
!$omp end do
!$omp end parallel

print *, ' Number of primes found:',number_of_primes
print *, ' Number of 4n+1 primes found:',number_of_41primes
print *, ' Number of 4n-1 primes found:',number_of_43primes
end program ompPrime
@@ -0,0 +1,7 @@
Copyright 2020 Intel Corporation

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,38 @@
## =============================================================
## Copyright © 2020 Intel Corporation
##
## SPDX-License-Identifier: MIT
## =============================================================
##
##
##******************************************************************************
## Content:
##
## Build for optimize_sample
##******************************************************************************
#
# >>>>> SET OPTIMIZATION LEVEL BELOW <<<<<
#
#Uncomment one of the following with which you wish to compile

FC = ifort -O0
#FC = ifort -O1
#FC = ifort -O2
#FC = ifort -O3

OBJS = int_sin.o

all: int_sin

run: int_sin
	./int_sin

int_sin: $(OBJS)
	ifort $^ -o $@

%.o: src/%.f90
	$(FC) $^ -c

clean:
	/bin/rm -f core.* $(OBJS) int_sin
