For educational purposes, I was figuring out how much virtual memory I can allocate on Linux. On x86_64, I can allocate 128TB of virtual memory, as indicated in the documentation. But on arm64, I manage to allocate only about 170TB of virtual memory, although the documentation says 256TB. I want to understand what prevents me from allocating 256TB of virtual memory.
So I wrote this program:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main() {
    char *chars = NULL;
    size_t nbytes = 0;

    // Grow the requested size by 1TiB per iteration until mmap() fails.
    while (chars != MAP_FAILED) {
        nbytes += 0x10000000000; // 1TiB
        chars = mmap(
            NULL,
            nbytes,
            PROT_READ | PROT_WRITE,
            MAP_SHARED | MAP_ANONYMOUS,
            -1,
            0
        );
        if (chars != MAP_FAILED)
            munmap(chars, nbytes);
    }
    printf("Allocated %zu total TB\n", nbytes/1024/1024/1024/1024);
    exit(EXIT_FAILURE);
}
Then I set the overcommit policy:
echo 1 > /proc/sys/vm/overcommit_memory
And got this result:
Allocated 171 total TB
I tried to increase the parameters:
sysctl -w vm.max_map_count=655300000
ulimit -l unlimited
But nothing helps.
My kernel params:
# grep CONFIG_ARM64_VA_BITS /boot/config-$(uname -r)
# CONFIG_ARM64_VA_BITS_39 is not set
CONFIG_ARM64_VA_BITS_48=y
CONFIG_ARM64_VA_BITS=48
# grep CONFIG_ARM64_PA_BITS /boot/config-$(uname -r)
CONFIG_ARM64_PA_BITS_48=y
CONFIG_ARM64_PA_BITS=48
My system:
ARM Cortex A53 (ARMv8) - 1GB RAM
5.15.0-1034-raspi #37-Ubuntu SMP PREEMPT
# free -m
total used free shared buff/cache available
Mem: 905 198 346 3 360 616
2 Answers
You can't just magically create virtual memory. The amount of virtual memory depends on the amount of physical memory available. Allocating 256TB of virtual memory on a Raspberry Pi 3 with 1GB of RAM is not going to happen: virtual memory relies on your physical RAM, so the usable amount is limited by what's actually available.
You’re allocating a contiguous chunk of memory, and that’s the biggest hole in the address space.
(Needless to say, the address space of your program is 48 bits, i.e. 256TiB.)
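(For the arithmetic: a 48-bit virtual address space spans 2^48 bytes = 2^8 × 2^40 bytes = 256 TiB, which is where the 256TB figure in the documentation comes from.)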
You can see this by looking at /proc/self/maps (or /proc/<pid>/maps) while your process is running. Here's a very lazy test case that just calls out to system():
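A minimal sketch of such a test case (assuming it just probes for the largest mapping the way the question's program does and then runs cat /proc/self/maps via system()) could look like this:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(void) {
    size_t largest = 0;
    const size_t step = 0x10000000000; // 1 TiB

    // Grow the requested size until mmap() refuses; remember the last success.
    for (size_t try = step; ; try += step) {
        void *p = mmap(NULL, try, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            break;
        munmap(p, try);
        largest = try;
    }
    printf("Largest contiguous mapping: %#zx bytes (%zu TiB)\n",
           largest, largest >> 40);

    // The "lazy" part: have the shell dump this process's address-space layout.
    return system("cat /proc/self/maps");
}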
If I compile it with cc -o t t.c -Wall -O3 and run it, I get the maps listing walked through below. For reference, I'm running this in an arm64 Debian VM under a macOS host, but that hardly matters here.
What you can see here is that the lowest address in use is 0xaaaad16a0000, where the segments of the main binary of the process itself are mapped. That is somewhere between 170 and 171 TiB from address 0x0 (0xaaaad16a0000 / 2^40 ≈ 170.7). Following that is some heap memory, and then there's another large gap until 0xffff88150000, where libc is mapped. That gap makes up another 85 to 86 TiB.

If you requested two separate allocations of 170 TiB and 85 TiB respectively, then both of those calls should succeed, bringing you to a total of 255 TiB. But you can't get a contiguous mapping because there's stuff sitting in the middle of the address space.
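As an illustration of that claim (a sketch based on the layout described above, not code from the original answer), splitting the request into two mappings could look like this; with that layout, both mmap() calls are expected to succeed where a single 255 TiB request fails:

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    const size_t tib = 1ULL << 40;
    size_t sizes[] = { 170 * tib, 85 * tib }; // example sizes, not exact limits

    for (int i = 0; i < 2; i++) {
        void *p = mmap(NULL, sizes[i], PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        printf("Got %zu TiB at %p\n", sizes[i] / tib, p);
        // Deliberately not unmapped: the second request has to fit around
        // both the first mapping and whatever sits in the middle of the
        // address space (binary, heap, libc).
    }
    return 0;
}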
That, of course, is the fault of ASLR. If you run the above code, the addresses will vary somewhat between runs, but the ones where the main binary is mapped should always start with 0xaaaa followed by another 8 digits. For consistent results, you can turn off ASLR for the test binary by invoking it with setarch $(uname -m) -R ./[binary_name]. Then your binary should always be mapped at 0xaaaaaaaa0000.
But it's still gonna sit in the middle of the address space. If you want to change that, you need to compile with -no-pie. If you do that to the above code and run it, you'll see that you can now get a much bigger contiguous allocation, with 0xff0000000000 being 255 TiB. You can again add setarch $(uname -m) -R on top of this to prevent heap address randomisation, but at this point it hardly matters anymore.

Of course these things depend on arbitrary default values of the kernel and the toolchain used, so it's most likely that nothing in this post is guaranteed – but at this point in time, this is how it works.