Channel: Intel® Fortran Compiler

-qopt-matmul with -mkl=sequential

With ifort 15.0.1.133, it appears that when -qopt-matmul is specified, the multi-threaded MKL is used regardless of whether -mkl=sequential is also specified.

Consider the following MATMUL benchmark code, loosely based on the matvec driver from the Intel Composer XE Fortran vec_samples:

program matmul_test
   use iso_fortran_env, only: int64, real64
   implicit none

   integer, parameter :: N=1024
   integer, parameter :: REPEAT_COUNT = 50

   integer :: i
   real(kind=real64), dimension(N,N) :: A, B, C
   real(kind=real64)   :: cputime1, cputime2
   integer(kind=int64) :: count_rate, walltime1, walltime2

   call RANDOM_NUMBER(A)
   call RANDOM_NUMBER(B)

   call cpu_time(cputime1)
   call system_clock(walltime1, count_rate)

   do i=1,REPEAT_COUNT
      C = MATMUL(A, B)
      B(1,1) = B(1,1) + 0.000001
   enddo

   call cpu_time(cputime2)
   call system_clock(walltime2, count_rate)
   write (*,'(A,1X,F8.3)') 'wall time:', (walltime2-walltime1)/REAL(count_rate, KIND=real64)
   write (*,'(A,1X,F8.3)') 'cpu time:', cputime2-cputime1
   write (*,*) 'SUM(c):', SUM(C)

end program matmul_test

If -qopt-matmul is specified, the multi-threaded MKL is used by default:

$ ifort -qopt-matmul matmul_test.f90
$ ./a.out
wall time:    0.551
cpu time:    8.399
 SUM(c):   268554303.414090

If -mkl=sequential is also specified, the multi-threaded MKL is still used:

$ ifort -qopt-matmul -mkl=sequential matmul_test.f90
$ OMP_DISPLAY_ENV=1 ./a.out

OPENMP DISPLAY ENVIRONMENT BEGIN
   _OPENMP='201307'
  [host] OMP_CANCELLATION='FALSE'
  [host] OMP_DISPLAY_ENV='TRUE'
  [host] OMP_DYNAMIC='FALSE'
  [host] OMP_MAX_ACTIVE_LEVELS='2147483647'
  [host] OMP_NESTED='FALSE'
  [host] OMP_NUM_THREADS: value is not defined
  [host] OMP_PLACES: value is not defined
  [host] OMP_PROC_BIND='false'
  [host] OMP_SCHEDULE='static'
  [host] OMP_STACKSIZE='4M'
  [host] OMP_THREAD_LIMIT='2147483647'
  [host] OMP_WAIT_POLICY='PASSIVE'
OPENMP DISPLAY ENVIRONMENT END


wall time:    0.545
cpu time:    8.375
 SUM(c):   268554303.414090
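
As a rough sanity check on the timings above, the ratio of CPU time to wall time approximates the average number of busy threads. A minimal sketch using awk, with the numbers copied from the two runs (ratio is a hypothetical helper name, not part of any Intel tool):

```shell
# ratio CPU WALL: print CPU time divided by wall time, to one decimal place.
# "ratio" is a hypothetical helper name used only for this illustration.
ratio() {
    awk -v cpu="$1" -v wall="$2" 'BEGIN { printf "%.1f\n", cpu / wall }'
}

ratio 8.399 0.551   # -qopt-matmul alone                -> 15.2
ratio 8.375 0.545   # -qopt-matmul with -mkl=sequential -> 15.4
```

Both builds come out at roughly 15, whereas a truly sequential run would be expected to give a ratio close to 1.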

I realize that setting the environment variable OMP_NUM_THREADS to 1 would effectively give sequential execution, but it would be preferable for -qopt-matmul to honor -mkl=sequential or, if the current behavior is intentional, for the man page to be updated to clarify the interaction between -qopt-matmul and -mkl=sequential.
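
For reference, the OMP_NUM_THREADS workaround might look like the sketch below. The wrapper name run_single_threaded is hypothetical, and ./a.out is assumed to be the binary built with the commands shown earlier; I have not included timings for this case.

```shell
# Run any command with the OpenMP runtime limited to a single thread.
# The assignment prefix scopes OMP_NUM_THREADS to that one command, so the
# calling shell's environment is left unchanged.
# "run_single_threaded" is a hypothetical helper name for this sketch.
run_single_threaded() {
    OMP_NUM_THREADS=1 "$@"
}

# Usage, assuming the benchmark binary is in the current directory:
#   run_single_threaded ./a.out
```

A plain `OMP_NUM_THREADS=1 ./a.out` does the same thing for a one-off run; the wrapper just keeps the setting from leaking into later commands in the same session.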

