I’m trying to implement parallel read and write of NetCDF-4 files.
For this purpose I have installed netcdf-fortran and mpi.
I have also installed pnetcdf (I don’t know if I need it).
All installations were done via the package manager (I am using Debian 10).
When I read the NetCDF files normally (without parallelization), everything works.
However, when I try to use mpi I receive the following message:
“NetCDF: Parallel operation on file opened for non-parallel access”
I also tried running the following example (in case my own program was at fault):
https://github.com/Unidata/netcdf-fortran/blob/master/examples/F90/simple_xy_par_rd.f90
and I received the same message …
What does it mean?
I am using gfortran, and in order to compile the program I type the following:
mpif90 -o executablename -I/usr/include/ mycode.f90 -lnetcdff -lnetcdf -lpnetcdf
and after:
mpirun ./executablename
Am I doing something wrong?
I have never used mpi before, so maybe I have messed up the installation.
Is there another way to parallel read/write netcdf files using Fortran?
2 Answers
Software for NetCDF Parallel I/O
There are several software packages available for NetCDF parallel I/O.
Unidata netcdf-c/netcdf-fortran
The canonical netCDF C and Fortran library from Unidata. See https://www.unidata.ucar.edu/software/netcdf/. The netcdf-fortran library consists of two Fortran APIs, one based on Fortran 77 and one based on Fortran 90. The Fortran APIs wrap the C API, so the C library is required for Fortran as well. When netcdf-fortran is built, the location of netcdf-c must be found or specified.
The netcdf-c library supports several binary formats:
* The original netCDF format and its variants (a.k.a. “classic” formats).
* The netCDF/HDF5 format introduced in netCDF 4.0 (a.k.a. netCDF-4/HDF5 format).
Parallel I/O is possible with both classic and HDF5 formats, but netcdf-c must be built correctly.
To get parallel I/O with classic formats, pnetcdf (a.k.a parallel-netcdf) must be installed.
To get parallel I/O with netCDF-4/HDF5 format, HDF5 must be installed, and it must be installed with parallel I/O features enabled.
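Whether an already installed netcdf-c was "built correctly" in this sense can usually be checked from the command line. The following is a minimal sketch, assuming the `nc-config` helper that ships with netcdf-c is on the PATH and that it is recent enough to know the `--has-parallel4` and `--has-pnetcdf` options (older releases may only offer `--has-parallel`). The function name `check_parallel` is made up for this example.

```shell
#!/bin/sh
# Ask a netcdf-c config helper which parallel I/O back ends it was built with.
# The helper defaults to nc-config; pass another name or path to override it.
check_parallel() {
    tool=${1:-nc-config}
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found"
        return 1
    fi
    # --has-parallel4: parallel I/O through HDF5 (netCDF-4 files)
    # --has-pnetcdf:   parallel I/O through pnetcdf (classic files)
    for flag in --has-parallel4 --has-pnetcdf; do
        # Fall back to "unknown" if this helper version rejects the option.
        answer=$("$tool" "$flag" 2>/dev/null) || answer="unknown"
        printf '%s: %s\n' "$flag" "$answer"
    done
}

check_parallel || true
```

If both lines report `no`, the library was most likely built without parallel support, which would be consistent with the "Parallel operation on file opened for non-parallel access" message in the question.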
pnetcdf
The pnetcdf package (sometimes called parallel-netcdf) is an independent library, a totally separate implementation of netCDF, from Argonne National Labs. From its webpage: “PnetCDF is a high-performance parallel I/O library for accessing Unidata’s NetCDF, files in classic formats, specifically the formats of CDF-1, 2, and 5.”
See https://parallel-netcdf.github.io/.
It’s great for high performance computing. It can be used stand-alone, without installing netCDF at all. pnetcdf has a netCDF-like API, but the function names are different. If pnetcdf is used in stand-alone mode (i.e. without the Unidata netCDF libraries), then user code must be written against the pnetcdf API. Such code will not run with the Unidata netCDF library; it will run with pnetcdf only.
Also, pnetcdf can only be used with netCDF classic formats. It cannot read/write HDF5 files.
HDF5
HDF5 is a well-known high performance data format. See https://portal.hdfgroup.org/display/support
HDF5 supports parallel I/O. HDF5 must be built with MPI compilers, and the --enable-parallel option must be specified at configure. This will cause HDF5 to build with parallel I/O features enabled.
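To see whether an existing HDF5 installation was configured this way, one option is to inspect the build summary printed by the HDF5 compiler wrapper. This is a rough sketch, assuming the wrapper is installed as `h5pcc` (the usual name for MPI builds; serial builds normally install `h5cc`) and that its `-showconfig` output contains a "Parallel HDF5:" line. The function name `hdf5_is_parallel` is hypothetical.

```shell
#!/bin/sh
# Report whether an HDF5 installation was built with parallel I/O enabled.
# Assumption: the compiler wrapper prints the build settings with -showconfig,
# including a line such as "Parallel HDF5: yes".
hdf5_is_parallel() {
    wrapper=${1:-h5pcc}
    if "$wrapper" -showconfig 2>/dev/null | grep -qi 'Parallel HDF5: *yes'; then
        echo "parallel"
    else
        # Either the build is serial or the wrapper could not be run.
        echo "serial or not found"
    fi
}

hdf5_is_parallel
```

Calling `hdf5_is_parallel h5cc` checks the serial wrapper name instead, which can help on systems where both variants are installed side by side.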
PIO (a.k.a. ParallelIO)
PIO is a C/Fortran library for parallel I/O on many processors.
PIO provides a netCDF-like API, and allows users to designate some subset of processors to perform IO. Computational code calls netCDF-like functions to read and write data, and PIO uses the IO processors to perform all necessary IO. See https://ncar.github.io/ParallelIO/.
PIO has its own API, but also supports using the netCDF native API, so PIO can be used with existing netCDF code. PIO also provides a good way to decompose data across processors and to handle that decomposition easily with netCDF calls. PIO can use Unidata netCDF, HDF5, and pnetcdf, so it can read/write all varieties of netCDF file.
How you proceed depends on your situation.
I/O on few processors (<10)
Using Unidata’s netcdf-c/netcdf-fortran libraries would be simplest. Build pnetcdf, HDF5, then netcdf-c, then netcdf-fortran, all with MPI compilers. Make sure you specify --enable-parallel when building HDF5. (This is not necessary for netcdf-c and netcdf-fortran; they automatically detect the parallel features of the HDF5 build.)
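The build order above could be sketched as follows. The script only prints one configure line per package rather than running anything, since each line is meant to be run from inside that package's unpacked source tree. The install prefix, the `build_plan` function name, and the use of `mpicc`/`mpif90` as wrapper names are assumptions for illustration, and the `--enable-pnetcdf` option is only needed if classic-format parallel I/O through pnetcdf is wanted. It is worth checking each package's `./configure --help` before relying on these lines.

```shell
#!/bin/sh
# Print (not run) one configure line per package, in dependency order.
build_plan() {
    prefix=${1:-$HOME/local}   # hypothetical install prefix shared by all packages
    # Let later packages find the headers and libraries installed earlier.
    flags="CPPFLAGS=-I$prefix/include LDFLAGS=-L$prefix/lib"
    echo "pnetcdf:        MPICC=mpicc ./configure --prefix=$prefix"
    echo "hdf5:           CC=mpicc ./configure --enable-parallel --prefix=$prefix"
    echo "netcdf-c:       CC=mpicc $flags ./configure --enable-pnetcdf --prefix=$prefix"
    echo "netcdf-fortran: CC=mpicc FC=mpif90 $flags ./configure --prefix=$prefix"
}

build_plan
```

Each configure step would normally be followed by `make` and `make install` before moving on to the next package, so that the next configure run can detect what was just installed.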
Once built, the netcdf C and Fortran APIs can do parallel I/O on any netCDF file. (And also on almost all HDF5 files.) Use nc_open_par()/nc_create_par() to get parallel I/O.
I/O on some processors (10 – 1000)
Use of pnetcdf may be simplest and give best performance for classic format files. It has a slightly different API and will not work for HDF5 files.
I/O on many processors (1K – 1M)
PIO will allow you to designate some subset of processors to do all I/O. That subset may use pnetcdf, HDF5, or Unidata code, depending on the underlying data format and the choices made by the user.
I was having the same problem. I solved it by using the netcdf4 and MPI library and linker flags reported by the following commands:
and for MPI also:
Note that the include and library directories are specific to my system and will differ on yours, but the commands
nf-config --cflags
and
pkg-config
will give you the right ones for your system. Once you have this information, you need to combine it to compile your code, which in your case would look something like:
Of course, you can simplify the above example with a Makefile.
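As a rough sketch of the approach described in this answer, the compile line can be assembled from whatever the netcdf-fortran helper reports instead of hard-coding paths. This assumes `mpif90` and `nf-config` are on the PATH and uses the helper's `--fflags` and `--flibs` options; the `mpif90` wrapper supplies MPI's own flags. The `compile_cmd` function and the `NF_CONFIG` override variable are hypothetical names introduced only for this example, and the file names are the ones from the question.

```shell
#!/bin/sh
# Assemble an MPI Fortran compile line from the flags nf-config reports.
# NF_CONFIG is a hypothetical override for pointing at a different helper.
compile_cmd() {
    src=$1
    exe=$2
    nf=${NF_CONFIG:-nf-config}
    fflags=$("$nf" --fflags)   # include and module search paths
    flibs=$("$nf" --flibs)     # link flags for netcdff, netcdf and dependencies
    echo "mpif90 -o $exe $fflags $src $flibs"
}

# Print the command only if the helper is actually installed.
if command -v "${NF_CONFIG:-nf-config}" >/dev/null 2>&1; then
    compile_cmd mycode.f90 executablename
fi
```

The printed line can be inspected and then run, followed by something like `mpirun -np 4 ./executablename` to start four processes. Getting the flags right does not help if the netCDF library being linked was itself built without parallel I/O, so it is worth confirming that first.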