
Can this be made better?

 

Hello,

I wonder if anyone has the time and inclination to have a look at the code below for any possible improvements.

The extract included here is the heaviest user of CPU time in a large-ish simulation code.

A typical run takes 6-9 months of continuous running (24 hours a day, 7 days a week) with 6 threads on six cores.

The OpenMP part is working very well, and there cannot be much improvement left in the multithreading part.

The compiler call used for the whole code is:

ifort -O3 -r8  -openmp -fpp -parallel -mcmodel=medium -i-dynamic -shared-intel

Would there be a benefit if part or all of it were written in assembler?

Lots and lots of thanks for any suggestions.

--

 

! typical values

! N1 = 768
! N2 = N3 = 12
! M3 = M2 = 42

  Do KEL = 1, N3
  Do JEL = 1, N2

...
[address calculations]

!$OMP  PARALLEL DEFAULT(SHARED) PRIVATE( I, J, K, JA, KA, JJ, KK )
!$OMP DO
   Do K = 1, M3
      Do J = 1, M2
         JJ = (J-1)*NX32

! - copy into work arrays for later fft.

         Do I = 1, N1
            WK_1( JJ+I, K ) = U( J_Jump+J, K_Jump+K, I )
            WK_2( JJ+I, K ) = V( J_Jump+J, K_Jump+K, I )
            WK_3( JJ+I, K ) = W( J_Jump+J, K_Jump+K, I )
         End Do

         Do I = 1, N1, 2
! - du/dx
            WKX_1( JJ+I,   K ) = -Wv(i)*U( J_Jump+J, K_Jump+K, I+1 )
            WKX_1( JJ+I+1, K ) =  Wv(i)*U( J_Jump+J, K_Jump+K, I   )
! - dv/dx
            WKX_2( JJ+I,   K ) = -Wv(i)*V( J_Jump+J, K_Jump+K, I+1 )
            WKX_2( JJ+I+1, K ) =  Wv(i)*V( J_Jump+J, K_Jump+K, I   )
! - dw/dx
            WKX_3( JJ+I,   K ) = -Wv(i)*W( J_Jump+J, K_Jump+K, I+1 )
            WKX_3( JJ+I+1, K ) =  Wv(i)*W( J_Jump+J, K_Jump+K, I   )

         End Do

! - Y derivatives.

         Do JA =  1, M2
            Do I = 1, N1
               WK_4( JJ+I, K ) = WK_4( JJ+I, K ) + RDY*DYGL(J,JA)*U( J_jump+JA, K_jump+K, I )  ! du/dy
               WK_5( JJ+I, K ) = WK_5( JJ+I, K ) + RDY*DYGL(J,JA)*V( J_jump+JA, K_jump+K, I )  ! dv/dy
               WK_6( JJ+I, K ) = WK_6( JJ+I, K ) + RDY*DYGL(J,JA)*W( J_jump+JA, K_jump+K, I )  ! dw/dy
            End Do
         End Do

! - Z derivatives.

         Do KA = 1, M3
            Do I = 1, N1
               WK_7( JJ+I, K ) = WK_7( JJ+I, K ) + RDZ*DZGL(K,KA)*U( J_jump+J, K_jump+KA, I )   ! du/dz
               WK_8( JJ+I, K ) = WK_8( JJ+I, K ) + RDZ*DZGL(K,KA)*V( J_jump+J, K_jump+KA, I )   ! dv/dz
               WK_9( JJ+I, K ) = WK_9( JJ+I, K ) + RDZ*DZGL(K,KA)*W( J_jump+J, K_jump+KA, I )   ! dw/dz
            End Do
         End Do

      End Do
   End Do   ! eo single element loop.
!$OMP END DO
!$OMP END PARALLEL

...
[other stuff]

end do
end do
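
For anyone who wants to experiment, the Y-derivative accumulation can be isolated into a small standalone kernel. The following is only a sketch under stated assumptions: the module name yderiv_sketch and the subroutine y_derivative_line are hypothetical, the J_jump/K_jump offsets are left out, and U is assumed to have the layout U(J, K, I) as in the extract. It computes the loop-invariant factor RDY*DYGL(J,JA) once per JA instead of once per I; the compiler may well do this already at -O3, so any gain would have to be measured.

```fortran
module yderiv_sketch
   implicit none
   integer, parameter :: dp = kind(1.0d0)   ! double precision, matching -r8
contains
   ! Accumulate the y-derivative for one (J,K) line:
   !   wk(i) = wk(i) + rdy * sum over ja of dygl(j,ja) * u(ja,k,i)
   subroutine y_derivative_line(u, dygl, rdy, j, k, wk)
      real(dp), intent(in)    :: u(:,:,:)    ! assumed layout U(J, K, I)
      real(dp), intent(in)    :: dygl(:,:)   ! differentiation matrix DYGL(J, JA)
      real(dp), intent(in)    :: rdy
      integer,  intent(in)    :: j, k
      real(dp), intent(inout) :: wk(:)       ! one line of a work array, length N1
      integer  :: i, ja
      real(dp) :: c
      do ja = 1, size(dygl, 2)
         c = rdy*dygl(j, ja)                 ! invariant in I: computed once per JA
         do i = 1, size(wk)
            wk(i) = wk(i) + c*u(ja, k, i)    ! note: I is the slowest index of U
         end do
      end do
   end subroutine y_derivative_line
end module yderiv_sketch
```

Because Fortran stores arrays in column-major order, the innermost I loop walks U with a stride of (first extent)*(second extent) elements, so a layout with I as the first index may be worth testing in such a kernel.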

 

