Can this be made better?

Hello,

I wonder if anyone has the time and inclination to have a look at the code below for any possible improvements.

The extract included here is the heaviest user of CPU time in a large-ish simulation code.

A typical run takes 6-9 months, running around the clock (24 hours a day, 7 days a week) with six threads on six cores.

The OpenMP part is working very well, and I doubt there is much improvement left to be had in the multithreading itself.

The compiler call used for the whole code is:

ifort -O3 -r8 -openmp -fpp -parallel -mcmodel=medium -i-dynamic -shared-intel

Would there be a benefit if part or all of it were written in assembler?

Lots and lots of thanks for any suggestions.

! typical values

! N1 = 768
! N2 = N3 = 12
! M3 = M2 = 42

  Do KEL = 1, N3
  Do JEL = 1, N2

...
[address calculations]

!$OMP  PARALLEL DEFAULT(SHARED) PRIVATE( I, J, K, JA, KA, JJ, KK )
!$OMP DO
   Do K = 1, M3
      Do J = 1, M2
         JJ = (J-1)*NX32

! - copy into work arrays for later fft.

         Do I = 1, N1
            WK_1( JJ+I, K ) = U( J_Jump+J, K_Jump+K, I )
            WK_2( JJ+I, K ) = V( J_Jump+J, K_Jump+K, I )
            WK_3( JJ+I, K ) = W( J_Jump+J, K_Jump+K, I )
         End Do

         Do I = 1, N1, 2
! - du/dx
            WKX_1( JJ+I,   K ) = -Wv(i)*U( J_Jump+J, K_Jump+K, I+1 )
            WKX_1( JJ+I+1, K ) =  Wv(i)*U( J_Jump+J, K_Jump+K, I   )
! - dv/dx
            WKX_2( JJ+I,   K ) = -Wv(i)*V( J_Jump+J, K_Jump+K, I+1 )
            WKX_2( JJ+I+1, K ) =  Wv(i)*V( J_Jump+J, K_Jump+K, I   )
! - dw/dx
            WKX_3( JJ+I,   K ) = -Wv(i)*W( J_Jump+J, K_Jump+K, I+1 )
            WKX_3( JJ+I+1, K ) =  Wv(i)*W( J_Jump+J, K_Jump+K, I   )

         End Do

! - Y derivatives.

         Do JA =  1, M2
            Do I = 1, N1
               WK_4( JJ+I, K ) = WK_4( JJ+I, K ) + RDY*DYGL(J,JA)*U( J_jump+JA, K_jump+K, I )  ! du/dy
               WK_5( JJ+I, K ) = WK_5( JJ+I, K ) + RDY*DYGL(J,JA)*V( J_jump+JA, K_jump+K, I )  ! dv/dy
               WK_6( JJ+I, K ) = WK_6( JJ+I, K ) + RDY*DYGL(J,JA)*W( J_jump+JA, K_jump+K, I )  ! dw/dy
            End Do
         End Do

! - Z derivatives.

         Do KA = 1, M3
            Do I = 1, N1
               WK_7( JJ+I, K ) = WK_7( JJ+I, K ) + RDZ*DZGL(K,KA)*U( J_jump+J, K_jump+KA, I )   ! du/dz
               WK_8( JJ+I, K ) = WK_8( JJ+I, K ) + RDZ*DZGL(K,KA)*V( J_jump+J, K_jump+KA, I )   ! dv/dz
               WK_9( JJ+I, K ) = WK_9( JJ+I, K ) + RDZ*DZGL(K,KA)*W( J_jump+J, K_jump+KA, I )   ! dw/dz
            End Do
         End Do

      End Do
   End Do   ! end of single-element loop.
!$OMP END DO
!$OMP END PARALLEL

...
[other stuff]

end do
end do
