This is my test code:

#include <cstdio>
#include <omp.h>

void test_code() {
    omp_set_num_threads(4);
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();  // id of this thread within its team
        #pragma omp for
        for (int i = 0; i < 4; i++)
            printf("Hello World %d %d\n", tid, i);
    }
}
...
void CResizer::ThreadMain()
{
    test_code();
    MLINFO(ObjectName(), "%s started\n", __FUNCTION__);
    CBufferMonitor bufResize("Resizer (frames):", ObjectId());
...
The result should be this:
Hello World 1 1
Hello World 3 3
Hello World 2 2
Hello World 0 0
But my actual result is:
Hello World 0 0 Hello World 0 1
Hello World 0 0 Hello World 0 1
Hello World 0 2 Hello World 0 3
Hello World 0 0 Hello World 0 1
Hello World 0 2 Hello World 0 3
Hello World 0 0 Hello World 0 0
Hello World 0 1 Hello World 0 2
Hello World 0 3 Hello World 0 0
Hello World 0 1 Hello World 0 2
Hello World 0 3 Hello World 0 0
Hello World 0 1 Hello World 0 2
Hello World 0 1 Hello World 0 2
Hello World 0 3 Hello World 0 0
Hello World 0 1 Hello World 0 2
Hello World 0 3 Hello World 0 3
Hello World 0 2 Hello World 0 3
Do you have any idea why it works like this?
My system has an 8-core CPU and runs CentOS 7.5.

2 Answers
Looking at the OpenMP Quick Reference Card, you created "a team of OpenMP threads that execute the region." All the OpenMP specifications can be found here.
The output still looks wrong even if you were executing the parallel section multiple times. That is because i is shared across all threads. Also, it is unclear why you are surrounding a parallel section with a std::mutex, which identifies this as C++ code. You can also set the number of threads in the parallel section from the #pragma statement. Thus the following is probably sufficient:

First of all, I would like to emphasize that the information (code sample) you provided is not enough to answer the question, but it may be possible to guess what is going on.
I think the only possible explanation is that the #pragma omp parallel region in the test_code() function can only use one thread (e.g. because no more threads are available, or because your code is inside a nested parallel region and nested parallelism is not allowed), therefore tid is always 0. Moreover, test_code() (i.e. CResizer::ThreadMain()) was executed by 8 threads, so the printout is repeated 8 times. Based on your function name (ThreadMain) this seems to be a plausible explanation, but as I mentioned, without a minimal working example it is not possible to tell for sure. Anyway, I hope it helps to find the problem.

To give you an example, the following code will produce output similar to what you sent:
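A minimal sketch of that scenario, assuming the caller is itself an OpenMP parallel region of 8 threads and that nested parallelism is disabled. The name run_from_parallel_region is hypothetical and only stands in for whatever starts CResizer::ThreadMain() on several threads; the int return values are an addition so the result can be checked, and the fallback stubs are there only so the sketch still builds without OpenMP support.

```cpp
#include <cstdio>
#ifdef _OPENMP
#include <omp.h>
#else
// Fallbacks (assumption: only needed when built without OpenMP support).
static int omp_get_thread_num() { return 0; }
static void omp_set_num_threads(int) {}
static void omp_set_max_active_levels(int) {}
#endif

// Same loop as in the question. Returns how many iterations of this call
// were executed by a thread whose id is not 0.
int test_code() {
    int nonzero = 0;
    omp_set_num_threads(4);
    #pragma omp parallel reduction(+:nonzero)
    {
        int tid = omp_get_thread_num();
        #pragma omp for
        for (int i = 0; i < 4; i++) {
            std::printf("Hello World %d %d\n", tid, i);
            if (tid != 0) nonzero++;
        }
    }
    return nonzero;
}

// Hypothetical caller: an outer parallel region in which every thread
// calls test_code(), so the inner region is a nested one.
int run_from_parallel_region(int outer_threads) {
    int nonzero = 0;
    // Allow only one active level: nested regions then run with one thread.
    omp_set_max_active_levels(1);
    #pragma omp parallel num_threads(outer_threads) reduction(+:nonzero)
    {
        nonzero += test_code();
    }
    return nonzero;
}
```

Built with OpenMP enabled (for example with GCC's -fopenmp flag), calling run_from_parallel_region(8) from main typically prints 8 x 4 = 32 lines in an unpredictable order, every one with thread id 0, because each inner region is serialized and its single thread executes all four iterations.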