Channel: Intel® Fortran Compiler

AVX instruction using xmm?


Hi,

I have a question about AVX instructions. I compiled my code using ifort 13 with -O2 and -xHost. I want to use 256-bit wide AVX so that four 64-bit floating-point operations are performed at once.

Here is my first code fragment:

 623 !DIR$ SIMD
 624         do ii = 1, Nc
 625 ! diagonal components first
 626           StrnRt(ii,1) =   JAC(ii)   * (                &
 627                            MT1(ii,1) * VelGrad1st(ii,1) &
 628                          + MT1(ii,2) * VelGrad1st(ii,3) )
			...
 640          end do

The assembly files show that the following instructions were generated for line 627:

vmulsd    8(%r8,%r14,8), %xmm1, %xmm3                   #627.38
vmulpd    %xmm6, %xmm5, %xmm11                          #627.38
vmulpd    %ymm5, %ymm4, %ymm10                          #627.38

I understand why I got vmulsd. My question is why vmulpd %xmm6, %xmm5, %xmm11 was generated, and what does it mean? I thought vmulpd was an AVX instruction and should use ymm registers to get 256-bit wide vectorization.

Here is the second code fragment:

 643 !DIR$ SIMD
 644         do ii = 1, Nc
 645 ! diagonal components first
 646           StrnRt(ii,1) =   JAC(ii)   * (                &
 647                            MT1(ii,1) * VelGrad1st(ii,1) &
 648                          + MT1(ii,2) * VelGrad1st(ii,4) &
 649                          + MT1(ii,3) * VelGrad1st(ii,7) )
 			...
 685         end do

The assembly files show that the following instructions were generated for line 647:

vmulsd    (%r12), %xmm4, %xmm6                          #647.38
vmulpd    %xmm11, %xmm10, %xmm0                         #647.38

Here again I got vmulpd with xmm, and this time I did not get vmulpd with ymm at all. I am worried that this loop is only performing two 64-bit floating-point operations at a time, rather than four.
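To make this easier to reproduce, here is a minimal self-contained sketch of the first loop. The array shapes, the value of Nc, and the program name are assumptions of mine (the real declarations are not shown above); only the loop body and the directive come from the code in question. Building it with the same options (-O2 -xHost) plus -S should produce an assembly file that can be compared against the listings above. Note that the compiler may emit more than one version of the same loop for a single source line, which could be why several different vmul forms carry the same #627.38 annotation.

```fortran
! Minimal reproducer sketch for the first loop.
! ASSUMPTIONS: array shapes, Nc, and the program name are hypothetical;
! the original declarations and the rest of the loop body are not shown.
program strain_rate_repro
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! 64-bit floating point
  integer, parameter :: Nc = 1003          ! assumed; deliberately not a multiple of 4
  real(dp) :: JAC(Nc)
  real(dp) :: MT1(Nc,3)                    ! assumed second dimension
  real(dp) :: VelGrad1st(Nc,9)             ! assumed second dimension
  real(dp) :: StrnRt(Nc,6)                 ! assumed second dimension
  integer  :: ii

  ! Fill the inputs with arbitrary data so the loop cannot be folded away.
  call random_number(JAC)
  call random_number(MT1)
  call random_number(VelGrad1st)
  StrnRt = 0.0_dp

!DIR$ SIMD
  do ii = 1, Nc
! diagonal components first
    StrnRt(ii,1) =   JAC(ii)   * (                &
                     MT1(ii,1) * VelGrad1st(ii,1) &
                   + MT1(ii,2) * VelGrad1st(ii,3) )
  end do

  ! Use the result so the loop is not eliminated as dead code.
  print *, sum(StrnRt(:,1))
end program strain_rate_repro
```

Changing Nc to a multiple of 4, or changing how the arrays are declared, and then comparing the generated assembly may show whether the xmm forms belong to a separate part of the loop rather than to the main vectorized body.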

I truly appreciate your help.

Best regards,

     Wentao

