I am using OpenMP to parallelize part of the code in HEVC. The basic structure of the code is given below:
Void funct() {
    for (...)
    {
        #pragma omp parallel for private(....)
        for (...)
        {
            // do some parallel work
        } // end of inner for loop
        // other tasks
    } // end of outer for loop
} // end of function
Now I have modified the inner for loop so that it is parallelized and every thread performs its task independently. I am not getting any errors, but the overall processing time with multiple threads is higher than with a single thread. I guess the main reason is that every iteration of the outer loop incurs thread creation overhead for the inner loop. Is there any way to avoid this, or to create the threads only once? I cannot parallelize the outer for loop, since the modifications I made to let each thread work independently are in the inner for loop. Please suggest any possible solutions.
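For illustration, here is a minimal sketch of the "create the threads only once" idea: open a single `#pragma omp parallel` region around the outer loop, share the inner loop with `#pragma omp for`, and run the "other tasks" under `#pragma omp single`. The names `do_parallel_work`, `results`, `sums`, and the loop bounds are hypothetical placeholders, not taken from the HEVC code. It also assumes the "other tasks" may be run by one thread while the rest wait. Whether this actually helps is worth measuring, since many OpenMP runtimes already reuse their worker threads between parallel regions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the independent per-iteration work.
static int do_parallel_work(int outer, int inner) {
    return outer * 100 + inner;
}

// Sketch: one parallel region encloses the whole outer loop, so the
// thread team is set up once instead of once per outer iteration.
std::vector<long> funct(int outer_count, int inner_count) {
    assert(outer_count >= 0 && inner_count >= 0);
    std::vector<long> sums(outer_count);    // one result per outer iteration
    std::vector<int> results(inner_count);  // shared buffer for the inner loop

    #pragma omp parallel
    {
        // Every thread runs the outer loop; i is private because it is
        // declared inside the parallel region.
        for (int i = 0; i < outer_count; ++i) {
            // Work-sharing only: iterations are divided among the
            // threads that already exist.
            #pragma omp for
            for (int j = 0; j < inner_count; ++j) {
                results[j] = do_parallel_work(i, j);
            }
            // Implicit barrier here: all of results[] is written.

            // "Other tasks": executed by exactly one thread.
            #pragma omp single
            {
                long s = 0;
                for (int j = 0; j < inner_count; ++j) {
                    s += results[j];
                }
                sums[i] = s;
            }
            // Implicit barrier here: results[] may safely be overwritten.
        }
    }
    return sums;
}
```

Built with OpenMP enabled (for example `-fopenmp` with GCC), the team is formed once per call to `funct`. Without that flag the pragmas are ignored and the same code runs serially with the same result.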