Channel: Intel® Fortran Compiler

Puzzle: changing the order of outer loops leads to significant performance increase

March 26, 2014, 7:43 pm

Hi,

I have a puzzling finding: changing the order of the two outermost loops led to a significant performance increase. I am experimenting with the following two versions of a small piece of code:

Version 1: ii, k, j, i

    do ii = iis, iie
      value = vals(ii)
      do k = ks, ke
        do j = js, je
          ind_offset = ( (k-1)*N2 + (j-1) ) * N1g
    !DIR$ SIMD
          do i = is, ie
            l0 = ind_offset + i
            dF(l0) = dF(l0) + value * F(l0 + ii)
          end do
        end do
      end do
    end do

Version 2: k, ii, j, i

    do k = ks, ke
      do ii = iis, iie
        value = vals(ii)
        do j = js, je
          ind_offset = ( (k-1)*N2 + (j-1) ) * N1g
    !DIR$ SIMD
          do i = is, ie
            l0 = ind_offset + i
            dF(l0) = dF(l0) + value * F(l0 + ii)
          end do
        end do
      end do
    end do

The ONLY difference between these two versions is the order of the two outermost loops: Version 1 has a loop order of ii, k, j, i, while Version 2 has a loop order of k, ii, j, i. The profiling results for the two versions are summarized below:

            CPU Time (s)   Load Instructions   L1 Cache Hits   L2 Cache Hits   L3 Cache Hits   Main Memory Hits
Version 1   11.282         1.36E+10            75.86%          3.46%           20.69%          0.00%
Version 2    7.372         1.36E+10            94.76%          1.24%            4.00%          0.00%

The results really surprised me in two ways:
(1) I observed a non-trivial speedup of 11.282/7.372 = 1.53 and a significant increase in L1 cache hits (from 75.86% to 94.76%), even though the number of load instructions is the same.
(2) The only change I made was rearranging the order of the two OUTER loops, i.e., do ii loop and do k loop.
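
To compare the two loop orders outside the full application, a small standalone program along the following lines could be used. It is only a sketch under several assumptions that the post does not state: double precision data, loops starting at 1, ii running from 1 to 5, no ghost cells (so N1g = 36 and N2 = 40), random input data, and a hypothetical repeat count nrep to make the run long enough to time. The loop bodies are copied from the two versions above.

```fortran
program loop_order_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)            ! assumed: double precision data
  integer, parameter :: i8 = selected_int_kind(18)  ! 64-bit clock counter
  ! Loop lengths are those quoted in the post; the start values are assumed.
  integer, parameter :: is = 1, ie = 36, js = 1, je = 40, ks = 1, ke = 48
  integer, parameter :: iis = 1, iie = 5
  ! Assumed strides: no ghost cells, so N1g and N2 equal the i and j lengths.
  integer, parameter :: N1g = 36, N2 = 40
  integer, parameter :: n = N1g * N2 * ke
  integer, parameter :: nrep = 200                  ! hypothetical repeat count
  real(dp), allocatable :: F(:), dF1(:), dF2(:)
  real(dp) :: vals(iis:iie), value
  integer :: i, j, k, ii, l0, ind_offset, rep
  integer(i8) :: t0, t1, t2, rate

  allocate(F(n + iie), dF1(n), dF2(n))   ! F is padded so F(l0 + ii) stays in bounds
  call random_number(F)
  call random_number(vals)
  dF1 = 0.0_dp
  dF2 = 0.0_dp

  call system_clock(t0, rate)
  ! Version 1: ii, k, j, i
  do rep = 1, nrep
    do ii = iis, iie
      value = vals(ii)
      do k = ks, ke
        do j = js, je
          ind_offset = ( (k-1)*N2 + (j-1) ) * N1g
          !DIR$ SIMD
          do i = is, ie
            l0 = ind_offset + i
            dF1(l0) = dF1(l0) + value * F(l0 + ii)
          end do
        end do
      end do
    end do
  end do
  call system_clock(t1)
  ! Version 2: k, ii, j, i
  do rep = 1, nrep
    do k = ks, ke
      do ii = iis, iie
        value = vals(ii)
        do j = js, je
          ind_offset = ( (k-1)*N2 + (j-1) ) * N1g
          !DIR$ SIMD
          do i = is, ie
            l0 = ind_offset + i
            dF2(l0) = dF2(l0) + value * F(l0 + ii)
          end do
        end do
      end do
    end do
  end do
  call system_clock(t2)

  print '(a,f8.3,a)', 'Version 1 (ii,k,j,i): ', real(t1 - t0, dp) / real(rate, dp), ' s'
  print '(a,f8.3,a)', 'Version 2 (k,ii,j,i): ', real(t2 - t1, dp) / real(rate, dp), ' s'
  ! Both versions should produce the same update; this guards against a typo.
  print '(a,es10.3)', 'max |dF1 - dF2| = ', maxval(abs(dF1 - dF2))
end program loop_order_demo
```

Timings from such a program will depend on the machine, the compiler version, and the flags, and a compiler may interchange loops on its own at higher optimization levels, so the numbers need not match the profile above.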

I have checked the vectorization report and found that the innermost loop (the do i loop) has been vectorized in both versions, so I really have no idea what is going on. I compiled the code using ifort 13.1.0 with -O2 -xHost. The loop bound (length) for each loop level is:

do ii loop: 5
do k loop: 48
do j loop: 40
do i loop: 36
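
One hypothesis that these loop lengths allow checking is a difference in cache reuse distance: how much data is touched between two successive updates of the same dF(l0). The sketch below estimates that footprint for both loop orders. It assumes 8-byte reals, that only dF and F matter, that there is no ghost-cell padding, and a 32 KiB L1 data cache; none of these are stated in the post, and the names in the program are made up for illustration.

```fortran
program working_set_estimate
  implicit none
  integer, parameter :: ni = 36, nj = 40, nk = 48   ! loop lengths from the post
  integer, parameter :: bytes_per_real = 8          ! assumed: double precision
  integer, parameter :: n_arrays = 2                ! dF and F
  integer, parameter :: l1_bytes = 32 * 1024        ! assumed L1 data cache size
  integer :: sweep_v1, sweep_v2

  ! Version 1 (ii outermost): a full k, j, i sweep over both arrays
  ! happens before the same dF(l0) is updated again for the next ii.
  sweep_v1 = ni * nj * nk * bytes_per_real * n_arrays

  ! Version 2 (k outermost): only one k-slab (a j, i sweep) is touched
  ! before the same dF(l0) is updated again for the next ii.
  sweep_v2 = ni * nj * bytes_per_real * n_arrays

  print '(a,i0,a)', 'Version 1 reuse footprint: ', sweep_v1, ' bytes'
  print '(a,i0,a)', 'Version 2 reuse footprint: ', sweep_v2, ' bytes'
  print '(a,l1)', 'Version 2 footprint fits in assumed L1: ', sweep_v2 <= l1_bytes
end program working_set_estimate
```

Under these assumptions the footprints come out to 1105920 bytes (about 1.1 MB) for Version 1 and 23040 bytes (about 22.5 KB) for Version 2. That would be consistent with Version 1 being served largely from L3 and Version 2 largely from L1, but it is an estimate, not a confirmed diagnosis; padding in N1g or N2 and the actual cache sizes of the machine would change the numbers.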

I truly appreciate your time and help.

Best regards,
Wentao
