Hee: parallel processing - Huge slow down using openmp -

Wednesday, 15 January 2014

I am trying to test the speed of a small piece of code, as follows:

    for(i=0;i<imgdim;i++)
    {
        x[0][i] = z[i] - u1[i] * rhoinv;
        x[1][i] = z[i] - u2[i] * rhoinv;
        x[2][i] = z[i] - u3[i] * rhoinv;
    }

There are around 200 iterations, and imgdim is 1000000. The total time for this piece of code is around 2 seconds, and the whole program takes 15 seconds. I then used OpenMP to parallelize this piece of code, like this:

    omp_set_num_threads(max_threads);
    #pragma omp parallel shared(x,z,u1,u2,u3,imgdim,rhoinv) private(i)
    {
        #pragma omp for schedule(dynamic)
        for(i=0;i<imgdim;i++)
        {
            x[0][i] = z[i] - u1[i] * rhoinv;
            x[1][i] = z[i] - u2[i] * rhoinv;
            x[2][i] = z[i] - u3[i] * rhoinv;
        }
    }

max_threads is 8. This piece of code now needs around 11 seconds, and the entire program takes around 27 seconds. The strange thing is that the time decreases to 6 seconds if I change max_threads to 1, which is still much longer than the sequential code.

This has cost me a lot of time and I cannot find the problem. I would appreciate it if someone could help me with this.

schedule(dynamic) introduces a huge run-time overhead. It should only be used for loops where each iteration can take a different amount of time and the improved load balancing justifies the overhead. For regular loops like yours, dynamic scheduling is overkill: it only adds unnecessary overhead, which slows down the computation.
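As a rough illustration of where that overhead comes from, the sketch below counts how many chunks the runtime has to hand out under each schedule. It relies on two rules from the OpenMP specification: schedule(dynamic) without a chunk size uses a chunk size of 1, and schedule(static) without a chunk size gives each thread at most one contiguous chunk. The helper names dynamic_chunks and static_chunks are made up for this example and are not part of OpenMP.

```c
#include <assert.h>

/* Hypothetical helper: number of chunks requested from the runtime for one
   execution of a loop with n iterations under schedule(dynamic, chunk).
   With plain schedule(dynamic) the chunk size is 1. */
long dynamic_chunks(long n, long chunk)
{
    assert(n >= 0 && chunk > 0);
    return (n + chunk - 1) / chunk;   /* ceiling of n / chunk */
}

/* Hypothetical helper: number of chunks for one execution of the same loop
   under schedule(static) with no chunk size, where every thread gets at
   most one contiguous block of iterations. */
long static_chunks(long n, long threads)
{
    assert(n >= 0 && threads > 0);
    return n < threads ? n : threads;
}
```

With the numbers from the question (imgdim of 1000000 and 8 threads), dynamic_chunks(1000000, 1) gives one million chunk requests per execution of the loop, while static_chunks(1000000, 8) gives 8. Assuming the "around 200" iterations in the question means the loop is executed about 200 times, that is roughly 200 million scheduling requests against 1600, each of which does only three multiply-subtract operations' worth of useful work in the dynamic case.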

Change the schedule type to static:

    #pragma omp parallel for schedule(static)
    for(i=0;i<imgdim;i++)
    {
        x[0][i] = z[i] - u1[i] * rhoinv;
        x[1][i] = z[i] - u2[i] * rhoinv;
        x[2][i] = z[i] - u3[i] * rhoinv;
    }

(Note: variables declared in outer scopes are shared by default, and the control variable of the parallel loop is implicitly private.)
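To make that note concrete, here is a minimal, self-contained sketch of the loop wrapped in a function, with no shared(...) or private(...) clauses at all. The function name update_x, the double element type, and the layout of x as three row pointers are assumptions made for this example; the question does not show the declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper around the loop from the question. The element type
   (double) and the layout of x (three row pointers) are assumptions. */
void update_x(double *x[3], const double *z,
              const double *u1, const double *u2, const double *u3,
              int imgdim, double rhoinv)
{
    int i;  /* declared outside the loop, as in the question */

    assert(x != NULL && z != NULL && u1 != NULL && u2 != NULL && u3 != NULL);

    /* No data-sharing clauses are needed: everything declared outside the
       construct is shared by default, and i becomes private because it is
       the iteration variable of the parallel loop. */
    #pragma omp parallel for schedule(static)
    for (i = 0; i < imgdim; i++) {
        x[0][i] = z[i] - u1[i] * rhoinv;
        x[1][i] = z[i] - u2[i] * rhoinv;
        x[2][i] = z[i] - u3[i] * rhoinv;
    }
}
```

With GCC, OpenMP is enabled by the -fopenmp option. Without it the pragma is ignored and the loop runs sequentially, which is a convenient way to check that both builds produce the same result.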

parallel-processing openmp
