Channel: Intel® Fortran Compiler

IFort not vectorizing loop in specific cases


I have a large code base that uses OpenMP 4.0 directives (target, simd, ...).

In one module the compiler throws lots of warnings about "loops not vectorized with simd", although the loops should be vectorizable.

I cut the code down to the bare minimum that still produces this behaviour:

   SUBROUTINE simdTest

      IMPLICIT NONE

      INTEGER ::  i, j, k, sr, tn,nzb,nzt,nxl,nxr,nys,nyn
      REAL    ::  s1, s2, s3, s4
      REAL, DIMENSION(:,:,:), ALLOCATABLE :: u,v,pt,rmask,sums_l
      REAL, DIMENSION(:,:), ALLOCATABLE :: usws,vsws,shf

      !$omp parallel do schedule(runtime) private(s1,s2,s3)
      DO  k = nzb, nzt+1
        !$omp simd collapse( 2 ) reduction( +: s1, s2, s3 )
        DO  i = nxl, nxr
           DO  j =  nys, nyn
              s1 = s1 + u(k,j,i)  * rmask(j,i,sr)
              s2 = s2 + v(k,j,i)  * rmask(j,i,sr)
              s3 = s3 + pt(k,j,i) * rmask(j,i,sr)
           ENDDO
        ENDDO
        sums_l(k,1,tn) = s1
        sums_l(k,2,tn) = s2
        sums_l(k,4,tn) = s3
      ENDDO

      !$omp parallel do reduction( +: s1, s2, s3, s4) schedule(runtime)
      DO  i = nxl, nxr
       DO  j =  nys, nyn
          s1 = s1 + usws(j,i) * rmask(j,i,sr)
          s2 = s2 + vsws(j,i) * rmask(j,i,sr)
          s3 = s3 + shf(j,i)  * rmask(j,i,sr)
          s4 = s4 + 0.0
       ENDDO
      ENDDO
      sums_l(nzb,12,tn) = s1
      sums_l(nzb,14,tn) = s2
      sums_l(nzb,16,tn) = s3

   END SUBROUTINE

If you compile this with "ifort -openmp -O2", the compiler warns about the first loop. If you remove literally anything (even from the second loop), the first loop vectorizes.

The message from vec-report is "subscript too complex".
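
For anyone trying to reproduce this, the report can be requested with something along these lines. This is only a sketch: the file name simdTest.f90 is a placeholder, and -vec-report2 assumes a compiler version that still accepts that option (newer releases moved to the -qopt-report family).

```shell
# Compile only (-c), since the reproducer is a bare subroutine with no main program.
# simdTest.f90 is a hypothetical file name holding the subroutine above.
# -vec-report2 asks for both vectorized and non-vectorized loops to be reported.
ifort -openmp -O2 -vec-report2 -c simdTest.f90
```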

Could you explain that? IMO not vectorizing the first loop would lead to significant performance loss.

