
This is my test code:

void test_code() {
  omp_set_num_threads(4);
  #pragma omp parallel
  {
    int tid = omp_get_thread_num();

    #pragma omp for
    for (int i = 0; i < 4; i++)
      printf("Hello World %d %d\n", tid, i);
  }
}
...
void CResizer::ThreadMain()
{
  test_code();

  MLINFO(ObjectName(), "%s started\n", __FUNCTION__);
  CBufferMonitor bufResize("Resizer (frames):", ObjectId());    
  ...

The result should look like this:

Hello World 1 1
Hello World 3 3
Hello World 2 2
Hello World 0 0

But the result I actually get is:

Hello World 0 0 Hello World 0 1
Hello World 0 0 Hello World 0 1
Hello World 0 2 Hello World 0 3
Hello World 0 0 Hello World 0 1
Hello World 0 2 Hello World 0 3
Hello World 0 0 Hello World 0 0
Hello World 0 1 Hello World 0 2
Hello World 0 3 Hello World 0 0
Hello World 0 1 Hello World 0 2
Hello World 0 3 Hello World 0 0
Hello World 0 1 Hello World 0 2
Hello World 0 1 Hello World 0 2
Hello World 0 3 Hello World 0 0
Hello World 0 1 Hello World 0 2
Hello World 0 3 Hello World 0 3
Hello World 0 2 Hello World 0 3

Do you have any idea why it works like this?
My system has an 8-core CPU and runs CentOS 7.5.

2 Answers


  1. Looking at the OpenMP Quick Reference Card, you created "a team of OpenMP threads that execute the region." All the OpenMP specifications can be found here.

    The output still looks wrong even if you were executing the parallel section multiple times. That is because i is shared across all threads. Also, it is unclear why you are surrounding a parallel section with a std::mutex which identifies this as C++ code. You can also set the number of threads in the parallel section from the #pragma statement. Thus the following is probably sufficient:

    #pragma omp parallel
    {
        const int tid = omp_get_thread_num();
    
        #pragma omp for
        for (int i = 0; i < 4; i++)
            printf("Hello World %d %d\n", tid, i);
    }
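The remark about setting the number of threads from the #pragma itself can be sketched as follows. This is only an illustration, not the asker's code: the function name hello_with_clause and the reduction counter are assumptions added so the loop can be checked, and the #ifdef _OPENMP fallback is there only so the sketch still builds without -fopenmp.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch still builds without -fopenmp. */
static int omp_get_thread_num(void) { return 0; }
#endif

/* Hypothetical helper: requests the team size with the num_threads clause
 * instead of calling omp_set_num_threads(), and returns how many loop
 * iterations were executed in total. */
int hello_with_clause(void) {
  int iterations = 0;

  /* num_threads(4) is an upper bound; the runtime may grant fewer threads. */
  #pragma omp parallel num_threads(4)
  {
    const int tid = omp_get_thread_num();

    /* The 4 iterations are divided among whatever team was granted. */
    #pragma omp for reduction(+:iterations)
    for (int i = 0; i < 4; i++) {
      printf("Hello World %d %d\n", tid, i);
      iterations++;
    }
  }
  return iterations;
}
```

Whatever team size is granted, the work-sharing loop should run each of the 4 iterations exactly once, so the function returns 4; only the tid values and the order of the lines vary.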
    
  2. First of all, I would like to emphasize that the information (code sample) you provided is not enough to answer the question, but it may be possible to guess what is going on.

    I think the only plausible explanation is that the #pragma omp parallel region in test_code() can use only one thread (e.g. because no more threads are available, or because your code is already inside a parallel region and nested parallelism is disabled), so tid is always 0. Moreover, test_code() (i.e. CResizer::ThreadMain()) was executed by 8 threads, so the printout is repeated 8 times. Given your function name (ThreadMain) this seems plausible, but as I mentioned, without a minimal working example it is not possible to tell for sure. Anyway, I hope it helps you find the problem.

    To give you an example, the following code produces output similar to what you posted:

    #include <stdio.h>
    #include <omp.h>
        
    void test_code() {
      omp_set_num_threads(4);
      #pragma omp parallel
      {
        int tid = omp_get_thread_num();
    
        #pragma omp for
        for (int i = 0; i < 4; i++)
          printf("Hello World %d %dn", tid, i);
      }
    }
    int main() {    
        #pragma omp parallel
        test_code();
    }
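One way to test the guess above is to ask the OpenMP runtime what it actually grants at the point where test_code() is called. The sketch below is an assumption-laden diagnostic, not part of the original code: probe_team_size and report_context are hypothetical helper names, while omp_in_parallel(), omp_get_level() and omp_get_num_threads() are standard OpenMP runtime routines. The #ifdef _OPENMP fallback is only there so the file still builds without -fopenmp.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch still builds without -fopenmp. */
static int omp_get_num_threads(void) { return 1; }
static int omp_in_parallel(void) { return 0; }
static int omp_get_level(void) { return 0; }
#endif

/* Hypothetical helper: opens a parallel region asking for `requested`
 * threads and returns the team size the runtime actually granted. */
int probe_team_size(int requested) {
  int granted = 1;
  #pragma omp parallel num_threads(requested)
  {
    /* Only one thread writes the shared result. */
    #pragma omp single
    granted = omp_get_num_threads();
  }
  return granted;
}

/* Hypothetical helper: prints whether the caller is already inside an
 * active parallel region and how deeply parallel regions are nested. */
void report_context(const char *where) {
  printf("%s: in_parallel=%d level=%d\n",
         where, omp_in_parallel(), omp_get_level());
}
```

Calling report_context("ThreadMain") and printing probe_team_size(4) at the top of CResizer::ThreadMain() would show whether the call already happens inside a parallel region and whether the inner region is limited to a single thread, which is the situation the explanation above assumes.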
    