I have a parallel code with OpenMP
which is a mix of Fortran
, C
and R
. The Fortran
and C
are the back-end or solver of the code (it’s a computational code) and the R
part is just an interface to launch a simulation, meaning that the simulation parameters and configuration is defined using an R
script that in turn calls the functions and methods in the back-end part of the the code (you know how it works).
The parallelism is implemented in the Fortran
part of the code and by default OpenMP
uses all the cores/threads. This is not the desired behavior I want the code run serially by default and give the user the possibility of defining the number of cores/threads to use.
To this end I defined a module as follows
module thread_control
use omp_lib
implicit none
contains
subroutine set_default_threads()
! Get the number of threads set by the environment
integer :: nthreads
nthreads = omp_get_max_threads()
! If no environment variable is set, default to 1 thread
if (nthreads == omp_get_num_procs()) then
call omp_set_num_threads(1)
endif
end subroutine set_default_threads
end module thread_control
Then I call to the subroutine of the above module in the files where I have parallel regions using use thread_control
and call set_default_threads()
.
Here is an example of the call to the subroutine and the type of OpenMP
directive I’m using:
call set_default_threads()
!$OMP PARALLEL PRIVATE(param1, param2, param3, param4, param5), &
!$OMP& PRIVATE(Num1, Num2, Num3, Num4)
!$OMP DO
! Loop over the mesh
! Here goes a huge for loop and conditionals (if statements)
!$OMP END DO
!$OMP END PARALLEL
Here goes a second example
!$OMP PARALLEL PRIVATE(Tmp)
!$OMP SECTIONS
!$OMP SECTION
! some stuff happens here....
!$OMP SECTION
! some stuff happens here....
!$OMP SECTION
! some stuff happens here....
!$OMP END SECTIONS NOWAIT
!$OMP END PARALLEL
I can’t share the whole code, it’s a big code base and it’s not open source so unfortunately there is no way I can share it with you.
Now the problem is when I run my R
script in an R
session (either RStudio
or an interactive R
session in my terminal) using the following command
> source("PATH-TO-MY-R-SCRIPT")
It gives me an error like
libgomp: Out of memory allocating 418040474304 bytes
I didn’t have this problem before introducing the thread_control.f90
module.
Now the interesting part is that if I use the following command
> system("OMP_NUM_THREADS=4 Rscript PATH-TO-MY-R-SCRIPT")
Things work smoothly.
Another interesting observation is that before introducing the thread_control.f90
module I experimented with my R
script to see if I can override the number of cores/threads using my R
script so I added the following lines to the beginning of my R
script
# R script: simulation.R
# Set the number of threads (e.g., 4)
num_threads <- 4
Sys.setenv(OMP_NUM_THREADS = num_threads)
Despite the introduction of the above lines my code used all the 16 existing threads on my system, as it does by default using OpenMP
. So it seems to me that there is a conflict between R
environment and Fortran
code.
I’m not sure if this amount of information could be enough for some geeks out here to help me out with my problem, but I hope it is. Although if you think you need some more details please feel free to ask, I’ll provide you with necessary details.
I asked this question from chatgpt
and it gave an answer along the following lines
Potential Causes and Solutions
Differences in Execution Environment:
When you use system
("OMP_NUM_THREADS=4 Rscript ...")
, a newR
process is created with its own environment, which can differ from the environment of an interactive R session. This new process might have different memory management or threading behavior.When running the script using
source()
, the code runs within the currentR
session, which might have different memory constraints or conflicting settings.Threading Issues:
Running
OpenMP
parallel regions in a shared environment (like an interactive R session) can cause threading issues if the environment isn’t properly isolated. This can lead to race conditions, deadlocks, or memory allocation issues.
OpenMP
may not always behave as expected within anR
session due to how R handles its memory and threading. This is especially true if other packages or R’s internal processes are also using threads or if memory is fragmented.
- Memory Management:
- The "Out of memory" error might be caused by trying to allocate an enormous amount of memory due to an issue with how memory is being requested or managed within the parallel regions.
- When running within the
R
session, memory fragmentation or other processes within the same session may cause unpredictable memory availability.Solutions
Here are some strategies to address these issues:
Run Code in a Separate Process
- Continue Using
system()
: Since running theR
script withsystem("OMP_NUM_THREADS=4 Rscript ...")
works correctly, this suggests that separating the execution from the interactive session resolves the issues. Consider using this method as the primary way to run your script.- Isolate Execution: If you need to execute the script within R but still want to isolate it, you could wrap the logic that runs your Fortran code in an R function and then run that function in a separate R process using
parallel::mclapply()
or a similar approach.Control Threading and Memory
- Limit Threads in Interactive Sessions: You can try limiting the number of threads even more conservatively when running interactively, by explicitly setting
OMP_NUM_THREADS=1
when usingsource()
. This can help avoid threading conflicts.- Check Memory Allocation: Review your Fortran code for any large memory allocations, especially within parallel regions. Ensure that these allocations are necessary and that they don’t inadvertently request too much memory.
Debugging
- Use Debugging Tools: Tools like
Valgrind
(for memory errors) orGDB
(for debugging segmentation faults) can be helpful if you want to track down the specific cause of the errors when running in source(). This might reveal issues related to memory access or threading that only appear in certain environments.- Add Debugging Information: Print statements or logging in both R and Fortran can help trace where the issue occurs. Start by identifying if the issue is in the Fortran code or in the interaction between R and Fortran.
P.S.
My code is compiled using Makefile
and I’m working on Ubuntu.
2
Answers
There is a logical problem in the fortran module used to give the user the possibility of defining the number of threads in an
R
script. The problem is the following :omp_get_max_threads()
to obtain the number of threads. However,omp_get_max_threads()
does not necessarily return the number of threads based on theOMP_NUM_THREADS
environment variable alone. By default (ifOMP_NUM_THREADS
is not set),omp_get_max_threads()
returns the maximum number of threads available based on the number of hardware cores or processors. This means thatomp_get_max_threads()
doesn’t provide any information about whetherOMP_NUM_THREADS
is set explicitly by the user. It just returns whatOpenMP
is going to use based on the system.omp_get_max_threads()
withomp_get_num_procs()
. This logic is also flawed because:omp_get_num_procs()
returns the number of physical processors or cores available on the machine, not the number of threadsOpenMP
is using. Even if the environment variableOMP_NUM_THREADS
is set, this comparison would still fail to properly detect the user's intention.So the analysis explained in the two above bullet points goes against the fact that the user is passing the number of desired threads using
OMP_NUM_THREADS
in the R script. The check forOMP_NUM_THREADS
is not explicitly addressed in the fortran module. Here is what the proper module should look like:The above version directly retrieves the value of
OMP_NUM_THREADS
using theget_environment_variable
function. This ensures that if the user setsOMP_NUM_THREADS
, the Fortran code captures that value and uses it.The
call get_environment_variable('OMP_NUM_THREADS', omp_env, status=ierror)
line fetches the environment variable, and if successful(ierror == 0)
, the value is read and converted to an integer usingread(omp_env, *) nthreads
.One should keep in mind that the environment variables are treated by the system as
string
so a conversion is necessary.After defining the module as explained above; in the Fortran files where there exist a parallel region you should add
use thread_control
once at the beginning of the routine or program (Fortran file in general) and then right before the parallel region you should addnthreads = get_default_threads()
and eventually the parallel region should containNUM_THREADS(nthreads)
like the followingNow you can simply add the following block at the very beginning of the
R
script before any call to the functions that use parallel processand this should be taken into account by the fortran code and
OpenMP
.You can pass the number of threads as an argument, say
my_omp
to the Fortran code and usecall omp_set_num_threads(my_omp)
in there. If we assume that your package is calledmesh
then you implement an interface such thatmesh.threads(n)
setsmy_omp=n
,mesh.threads()
setsmy_omp=-1
and ifmesh.threads
isn’t used, then my_omp=1. Ifmy_omp=-1
you skip thecall omp_set_num_threads(my_omp)
in Fortran and get all cores.I don’t know
R
so I might be off on the syntax, but the logic should hold.