c++ - Doing a section with one thread and a for-loop with multiple threads
I am using OpenMP and want to spawn threads such that one thread executes one piece of code and finishes, while in parallel N threads run the iterations of a parallel-for loop.
Execution should look like this:
    Section A (one thread)  ||  Section B (parallel-for, multiple threads)
              |             ||       | | | | | | | | | |
              |             ||       | | | | | | | | | |
              |             ||       | | | | | | | | | |
              |             ||       | | | | | | | | | |
              |             ||       | | | | | | | | | |
              v             ||       v v v v v v v v v v
I cannot just write a single parallel-for with one #pragma omp, because I do not want the thread that executes section A to also execute the for-loop.
I have tried this:
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            // Section A
        }
        #pragma omp section
        {
            // Section B
            #pragma omp parallel for
            for (int i = 0; i < x; ++i)
                something();
        }
    }
However, the parallel-for always executes with only one thread. I know this because I made the body of the loop print omp_get_thread_num(), and the values are all the same number, either 1 or 0 depending on which of the two threads executed the second section.
I have also tried
    #pragma omp sections
    {
        #pragma omp section
        {
            // Section A
        }
        #pragma omp section
        {
            // Section B
            #pragma omp parallel for
            for (int i = 0; i < x; ++i)
                something();
        }
    }
which allows the for-loop to execute with multiple threads, but makes the sections non-parallel: the first section is executed sequentially before the second section.
What I need is a combination of the two approaches, where each iteration of the for-loop and the first section all run in parallel.
Nested parallelism must be enabled explicitly, as it is disabled by default in most implementations. According to the OpenMP 4.0 standard, you must set the OMP_NESTED environment variable:
"The OMP_NESTED environment variable controls nested parallelism by setting the initial value of the nest-var ICV. The value of this environment variable must be true or false. If the environment variable is set to true, nested parallelism is enabled; if set to false, nested parallelism is disabled. The behavior of the program is implementation defined if the value of OMP_NESTED is neither true nor false."
The following line should work in bash (environment variable names are case-sensitive, so it must be upper case):
    export OMP_NESTED=true
Furthermore, as noted by @HristoIliev in the comment below, it is likely that you want to set the OMP_NUM_THREADS environment variable to tune performance. Quoting the standard:
"The value of this environment variable must be a list of positive integer values. The values of the list set the number of threads to use for parallel regions at the corresponding nested levels."
This means that one should set the value of OMP_NUM_THREADS to something similar to n,n-1, where n is the number of CPU cores. For instance:
    export OMP_NUM_THREADS=8,7
for an 8-core system (example copied from the comment below).