I am running the following (minimal reproducing) code:
#include <stdio.h>
#include <sys/mman.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
int main(void) {
    int fd = open("file.data", O_RDONLY);
    void* ptr = mmap(0, (size_t)240 * 1024 * 1024 * 1024, PROT_READ, MAP_SHARED, fd, 0);
    printf("Result = %p\n", ptr);
    printf("Errno = %d\n", errno);
    return 0;
}
It outputs (compiled and run with gcc test.c && ./a.out):
Result = 0xffffffffffffffff
Errno = 9
file.data is a 243GiB file:
$ stat file.data
File: file.data
Size: 260165023654 Blocks: 508135088 IO Block: 4096 regular file
Device: 801h/2049d Inode: 6815790 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1001/ user) Gid: ( 1001/ user)
Access: 2023-05-08 09:22:07.314477587 -0400
Modify: 2023-06-16 07:53:12.275187040 -0400
Change: 2023-06-16 07:53:12.275187040 -0400
Birth: -
Other configuration (Debian stretch, Linux 5.2.21):
$ sysctl vm.overcommit_memory
vm.overcommit_memory = 1
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 768178
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 768178
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
$ free -m
              total        used        free      shared  buff/cache   available
Mem:         192105         671      189213           9        2220      190314
Swap:             0           0           0
Advice I have already followed:
- The mapping is not MAP_PRIVATE, and not PROT_WRITE
- ulimit -v is set to unlimited
- vm.overcommit_memory=1
To my understanding, I should be able to mmap this file. I am mapping it read-only, so the kernel can freely evict its pages and re-read them from disk whenever needed. There should be enough contiguous address space, as this is a 64-bit system.
How do I make the mmap call work?
Edit: The output of /proc/*/maps for the program:
2aaaaaa000-2aaaaab000 r-xp 00000000 08:01 6815754 /home/<username>/a.out
2aaacaa000-2aaacab000 r--p 00000000 08:01 6815754 /home/<username>/a.out
2aaacab000-2aaacac000 rw-p 00001000 08:01 6815754 /home/<username>/a.out
3ff7a3a000-3ff7bcf000 r-xp 00000000 08:01 5767499 /lib/x86_64-linux-gnu/libc-2.24.so
3ff7bcf000-3ff7dcf000 ---p 00195000 08:01 5767499 /lib/x86_64-linux-gnu/libc-2.24.so
3ff7dcf000-3ff7dd3000 r--p 00195000 08:01 5767499 /lib/x86_64-linux-gnu/libc-2.24.so
3ff7dd3000-3ff7dd5000 rw-p 00199000 08:01 5767499 /lib/x86_64-linux-gnu/libc-2.24.so
3ff7dd5000-3ff7dd9000 rw-p 00000000 00:00 0
3ff7dd9000-3ff7dfc000 r-xp 00000000 08:01 5767249 /lib/x86_64-linux-gnu/ld-2.24.so
3ff7fe8000-3ff7fea000 rw-p 00000000 00:00 0
3ff7ff8000-3ff7ffb000 r--p 00000000 00:00 0 [vvar]
3ff7ffb000-3ff7ffc000 r-xp 00000000 00:00 0 [vdso]
3ff7ffc000-3ff7ffd000 r--p 00023000 08:01 5767249 /lib/x86_64-linux-gnu/ld-2.24.so
3ff7ffd000-3ff7ffe000 rw-p 00024000 08:01 5767249 /lib/x86_64-linux-gnu/ld-2.24.so
3ff7ffe000-3ff7fff000 rw-p 00000000 00:00 0
3ffffde000-3ffffff000 rw-p 00000000 00:00 0 [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
2 Answers
The problem you are experiencing comes from the fact that even though your system is a 64-bit system, the kernel still has an addressing limit, which depends on your system’s architecture.
By default, Linux allocates half the addressable memory space to the kernel and half to the user. So, for a 64-bit system, this would be 2^63 bytes for the kernel and the same amount for user space.
However, the kernel doesn’t use this whole space. The kernel uses a range of the addressable memory for memory mapping, which is from mmap_min_addr to TASK_SIZE. TASK_SIZE is typically set in the kernel to a certain value, depending on the architecture of your system, which might be less than the maximum addressable space.
Your mmap request is likely failing because it’s trying to allocate more memory than your system’s TASK_SIZE. If you try to mmap 240GiB at once, it might exceed the TASK_SIZE on your system.
One solution would be to mmap smaller chunks of the file in a loop until you’ve mmapped the whole file. Here is an example of how to do that:
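The answer's own listing is not reproduced here. What follows is only a minimal sketch of the chunked approach, not the answerer's code. It assumes a hypothetical helper named sum_file_in_chunks, a chunk size that is a multiple of the page size, and byte summing as a stand-in for whatever processing the file really needs.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not from the original answer): maps `path` read-only
 * one chunk at a time and adds every byte into *sum.
 * Returns 0 on success, -1 as soon as any system call fails. */
int sum_file_in_chunks(const char *path, size_t chunk_size, uint64_t *sum)
{
    long page = sysconf(_SC_PAGESIZE);
    /* mmap offsets must be page-aligned, so the chunk size must be as well. */
    assert(page > 0 && chunk_size > 0 && chunk_size % (size_t)page == 0);

    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        perror("open");
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) == -1) {
        perror("fstat");
        close(fd);
        return -1;
    }

    *sum = 0;
    for (off_t offset = 0; offset < st.st_size; offset += (off_t)chunk_size) {
        /* The last chunk may be shorter than chunk_size. */
        off_t remaining = st.st_size - offset;
        size_t len = remaining < (off_t)chunk_size ? (size_t)remaining : chunk_size;

        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, offset);
        if (p == MAP_FAILED) {
            perror("mmap");
            close(fd);
            return -1;
        }

        for (size_t i = 0; i < len; i++)
            *sum += p[i];

        /* Release this window before mapping the next one. */
        if (munmap(p, len) == -1) {
            perror("munmap");
            close(fd);
            return -1;
        }
    }

    close(fd);
    return 0;
}
```

A call such as sum_file_in_chunks("file.data", (size_t)64 << 30, &sum) would walk the file in 64GiB windows. Because every call is checked and reported with perror(), a failing open() is not mistaken for a failing mmap().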
This code reads the file in chunks of 64GiB. It would be best to adapt the chunk size to your specific case. You should always check the return values of the system calls: that way errors are handled where they occur, and the error messages are more descriptive and informative.
Remember to call munmap() when you’re done with a section of the file before moving onto the next one.

I’m unable to reproduce your issue.
I modified your program to add an option to generate a sample/test file:
- It can just do a truncate to create a large file. This takes a fraction of a second.
- It can then fill it in with real data. This takes about 10 minutes to create a 243GB file on my system.
The result is the same in either mode. So, IMO, the quick mode is sufficient (i.e. the file has holes). In other words, anybody can run the program in a matter of seconds on their system.
I tried every combination of this and other options I could think of. Under no circumstances could I reproduce the failure. See below for a comparison of my system and yours.
After reading below, if you can think of any other idea, I’d be glad to try it on my system to reproduce your failure.
Here is the modified program:
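The listing itself is not reproduced here. As a rough sketch of the quick mode only (not the answerer's program), creating and mapping a file with holes could look like the following. The helper name map_sparse_file is hypothetical, and ftruncate() is assumed as the way to set the length.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates `path` as a sparse file of `size` bytes,
 * maps it read-only, and reads the first and last byte.
 * Returns 0 on success, -1 if any call fails. */
int map_sparse_file(const char *path, off_t size)
{
    if (size <= 0)
        return -1;

    /* O_TRUNC empties any existing file so that it ends up all holes. */
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        perror("open");
        return -1;
    }

    /* ftruncate() only sets the length; no data blocks are written. */
    if (ftruncate(fd, size) == -1) {
        perror("ftruncate");
        close(fd);
        return -1;
    }

    unsigned char *p = mmap(NULL, (size_t)size, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return -1;
    }

    /* Bytes in the extended region read back as zero. */
    int ok = (p[0] == 0 && p[size - 1] == 0);

    munmap(p, (size_t)size);
    close(fd);
    return ok ? 0 : -1;
}
```

Passing a size of (off_t)243 << 30 would approximate the file from the question without writing 243GB of data, provided the filesystem supports sparse files.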
Here is my configuration:
Slight differences:
- You have 192GB of RAM, but I only have 12GB. This difference should work in your favor, but it doesn’t: the program works on my system, which has less than 1/10 of the RAM.
- I have a 128GB swap disk. But I reran the program after doing swapoff -a to disable all swap disks. There was no difference in program operation.
- vm.overcommit_memory is 0. But I set it to 1 and there was no difference in program operation.
- On my system, vm.mmap_min_addr is 65536 (see TASK_SIZE below).
- My computer system is over ten years old.
- I’m [probably] running a much older kernel version.
At the time of the test, I had:
- gnome-terminal windows
- firefox with pages on SO
- thunderbird
Because of my much smaller RAM, I have to dispute neo-jgrec’s answer:
On an x86 (64 bit) system, TASK_SIZE can be either:
- 1ul << 47: 131,072 GB (128 TB)
- 1ul << 56: 67,108,864 GB (65,536 TB)
Even using the smaller address value, we are clearly not going beyond TASK_SIZE.
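The arithmetic behind those figures can be checked with a short sketch. The helper names bytes_to_gib and fits_below are made up for illustration. The comparison ignores whatever else is already mapped in the process, so it shows only that the requested length is far below either limit.

```c
#include <stdint.h>
#include <stdio.h>

/* Converts a byte count to whole GiB (1 GiB = 2^30 bytes). */
uint64_t bytes_to_gib(uint64_t bytes)
{
    return bytes >> 30;
}

/* Returns 1 if a mapping of `len` bytes is smaller than `limit`.
 * Simplification: existing mappings in the address space are ignored. */
int fits_below(uint64_t len, uint64_t limit)
{
    return len < limit;
}

/* Prints both candidate limits next to the 240 GiB request. */
void print_task_size_comparison(void)
{
    uint64_t request = UINT64_C(240) << 30;
    uint64_t limits[] = { UINT64_C(1) << 47, UINT64_C(1) << 56 };

    for (int i = 0; i < 2; i++) {
        printf("limit = %llu GiB, request = %llu GiB, fits = %d\n",
               (unsigned long long)bytes_to_gib(limits[i]),
               (unsigned long long)bytes_to_gib(request),
               fits_below(request, limits[i]));
    }
}
```

Here bytes_to_gib(1ull << 47) evaluates to 131072 and bytes_to_gib(1ull << 56) to 67108864, matching the figures listed above, and a 240 GiB request is below both.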
I’ve done mmap on many 100+GB files in the past without issue. For example, see my answer: read line by line in the most efficient way platform specific
Here is the stat of the file:
Here is the program output: