Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I am using open MP to parallelize a part of code in HEVC. The basic structure of the code is as given below

Void funct() {

for(...)

{

#pragma OMP parallel for private(....)

for (...)

{

/// do some parallel work

} //end of inner for loop

//other tasks

} /// end of outer for loop

} //end of function

Now i have modified the inner for loop so that the code is parallelized and every thread perform task independently. I am not getting any errors but the overall processing time is increased with multiple threads than what it would have taken with single thread. I guess the main reason is that for every iteration of outer loop there is thread creation overhead for innner loop. Is there any way to avoid this issue or any way by which we can create thread only once. I cannot parallelize the outer for loop since i have made modifications in inner for loop to enable each thread to work independently. Please suggest any possible solutions.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
189 views
Welcome To Ask or Share your Answers For Others

1 Answer

You can use separate directives #pragma omp parallel and #pragma omp for.

#pragma omp parallel creates parallel threads, whereas #pragma omp for distributes the work between the threads. For sequential part of the outer loop you can use #pragma omp single.

Here is an example:

int n = 3, m = 10;
#pragma omp parallel
{
    for (int i = 0; i < n; i++){
        #pragma omp single
        {
            printf("Outer loop part 1, thread num = %d
", 
                    omp_get_thread_num());
        }
        #pragma omp for
        for(int j = 0; j < m; j++) {
            int thread_num = omp_get_thread_num();
            printf("j = %d, Thread num = %d
", j, thread_num);
        }
        #pragma omp single
        {
            printf("Outer loop part 2, thread num = %d
", 
                    omp_get_thread_num());
        }
    }
}

But I am not sure will it help you or not. To diagnose OpenMP performance issues, it would be better to use some profiler, such as Scalasca or VTune.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...