
My test code shows that after free() and before the program exits, the heap memory is returned to the OS. I use htop (top shows the same) to observe the behaviour. My glibc version, as reported by ldd, is (Ubuntu GLIBC 2.31-0ubuntu9.9) 2.31.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdint.h>

#define BUFSIZE 10737418240UL /* 10 GiB */

int main(){
    printf("start\n");
    uint32_t* p = (uint32_t*)malloc(BUFSIZE);
    if (p == NULL){
        printf("alloc 10GB failed\n");
        exit(1);
    }
    /* touch every page so the memory is really backed by RAM */
    memset(p, 0, BUFSIZE);
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        p[i] = 10;
    }
    printf("before free\n");
    free(p);
    sleep(1000); /* watch htop while the process sleeps */
    printf("exit\n");
}

Why does the question "Why does the free() function not return memory to the operating system?" observe the opposite behaviour to mine? That OP also uses Linux, and the question was asked in 2018. Am I missing something?

2 Answers


  1. Chosen as BEST ANSWER

    I did some experiments, read a chapter of The Linux Programming Interface, and arrived at an answer that satisfies me.

    First, my conclusions:

    • The library call malloc uses the system calls brk and mmap under the hood when allocating memory.
    • As @John Zwinck describes, a Linux process uses either brk or mmap to allocate memory, depending on how much you request.
    • If the memory was allocated by brk, the process probably does not return it to the OS before it terminates (though sometimes it does). If it was allocated by mmap, then in my simple test the process returns the memory to the OS before it terminates.

    Experiment code (examine memory stats in htop at the same time):

    code sample 1

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdint.h>
    
    #define BUFSIZE 1073741824 //1GiB
    
    // p_arr below is a 2 GiB array on the stack,
    // so run `ulimit -s unlimited` first
    
    int main(){
        printf("start\n");
        printf("%zu \n", sizeof(uint32_t));
        uint32_t* p_arr[BUFSIZE / 4]; 
        sleep(10); 
        // allocate 2^28 blocks of 4 bytes each
        for(size_t i = 0; i < (BUFSIZE / 4); i++){
            uint32_t* p = (uint32_t*)malloc(sizeof(uint32_t));
            if (p == NULL){
                printf("alloc failed\n");
                exit(1);
            }
            p_arr[i] = p;
        } 
        printf("alloc done\n"); 
        for(size_t i = 0; i < (BUFSIZE / 4); i++){
            free(p_arr[i]);
        }
        
        printf("free done\n");
        sleep(20); // check htop here
        printf("exit\n");
    }
    

    When the program reaches "free done" and the following sleep(), you can see that it still holds the memory and has not returned it to the OS. Running strace ./a.out shows that brk gets called many times.

    Note:

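    I am calling malloc in a loop to allocate memory. I expected it to take up only 1 GiB of RAM, but in fact it takes up 8 GiB in total, because malloc adds extra bytes to every block for bookkeeping and alignment. On 64-bit glibc the smallest chunk occupies 32 bytes, so 2^28 four-byte requests come to about 8 GiB. One should never allocate 1 GiB this way, in a loop like this.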
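The per-block overhead mentioned in the note can be inspected directly. This is a small sketch, assuming glibc, whose <malloc.h> provides malloc_usable_size(); the helper name usable_bytes_for is hypothetical.

```c
#include <malloc.h>   /* malloc_usable_size(): a glibc extension */
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of any library).
 * Returns how many bytes are actually usable in a block obtained from
 * malloc(request), or 0 if the allocation failed. */
size_t usable_bytes_for(size_t request) {
    void *p = malloc(request);
    if (p == NULL) {
        return 0;
    }
    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

On 64-bit glibc, usable_bytes_for(sizeof(uint32_t)) typically reports 24 rather than 4. Together with the chunk header, that matches a footprint of about 32 bytes per block.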

    code sample 2:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdint.h>
    
    #define BUFSIZE 1073741824 //1GiB
    
    // as in sample 1, run `ulimit -s unlimited` first
    
    int main(){
        printf("start\n");
        printf("%zu \n", sizeof(uint32_t));
        uint32_t* p_arr[BUFSIZE / 4]; 
        sleep(3); 
        for(size_t i = 0; i < (BUFSIZE / 4); i++){
            uint32_t* p = (uint32_t*)malloc(sizeof(uint32_t));
            if (p == NULL){
                printf("alloc failed\n");
                exit(1);
            }
            p_arr[i] = p;
        } 
        printf("%p\n", (void*)p_arr[0]);
        printf("alloc done\n"); 
        for(size_t i = 0; i < (BUFSIZE / 4); i++){
            free(p_arr[i]);
        }
        printf("free done\n");
        printf("allocate again\n");
        sleep(10);
        // second round: same number and size of blocks
        for(size_t i = 0; i < (BUFSIZE / 4); i++){
            uint32_t* p = malloc(sizeof(uint32_t));
            if (p == NULL){
                printf("alloc failed\n");
                exit(1);
            }
            p_arr[i] = p;
        } 
        printf("allocate again done\n");
        sleep(10);
        for(size_t i = 0; i < (BUFSIZE / 4); i++){
            free(p_arr[i]);
        }
        printf("%p\n", (void*)p_arr[0]);
        sleep(3);
        printf("exit\n");
    }
    

    This one is similar to sample 1, but it allocates again after the free. The second allocation doesn't increase memory usage; it reuses the freed but not yet returned memory.
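The same reuse can be checked at a much smaller scale without watching htop, by comparing the program break across two identical allocate-then-free rounds. This is a rough sketch, assuming glibc on Linux; the block count and the function names are arbitrary choices for illustration.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() under strict -std= modes */
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define N_BLOCKS 100000   /* arbitrary count, small enough to run quickly */

/* Hypothetical helper: allocate N_BLOCKS tiny blocks, note the program
 * break while they are all live, then free every block again. */
static char *alloc_then_free_round(void) {
    static uint32_t *blocks[N_BLOCKS];
    for (size_t i = 0; i < N_BLOCKS; i++) {
        blocks[i] = malloc(sizeof(uint32_t));
        if (blocks[i] == NULL) {
            abort();   /* keep the sketch simple: give up on failure */
        }
    }
    char *brk_while_live = sbrk(0);
    for (size_t i = 0; i < N_BLOCKS; i++) {
        free(blocks[i]);
    }
    return brk_while_live;
}

/* Bytes by which the break grew between the first and second round.
 * 0 means the second round fit entirely into memory freed by the first. */
long break_growth_on_second_round(void) {
    char *first = alloc_then_free_round();
    char *second = alloc_then_free_round();
    return (long)(second - first);
}
```

If the allocator recycles the freed blocks, as sample 2 suggests, the reported growth should not be positive.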

    code sample 3:

    #include <unistd.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <assert.h>
    
    #define MAX_ALLOCS 1000000
    
    int main(int argc, char* argv[]){
        int freeStep, freeMin, freeMax, blockSize, numAllocs, j;
        char* ptr[MAX_ALLOCS];
        printf("\n");
        if (argc < 3){
            fprintf(stderr, "Usage: %s num-allocs block-size [step [min [max]]]\n", argv[0]);
            exit(EXIT_FAILURE);
        }
        numAllocs = atoi(argv[1]);
        blockSize = atoi(argv[2]);
        assert(numAllocs <= MAX_ALLOCS);
        freeStep = (argc > 3) ? atoi(argv[3]) : 1;
        freeMin = (argc > 4) ? atoi(argv[4]) : 1;
        freeMax = (argc > 5) ? atoi(argv[5]) : numAllocs;
        assert(freeMax <= numAllocs);
    
        // sbrk(0) returns the current program break without changing it
        printf("Initial program break:   %10p\n", sbrk(0));
        printf("Allocating %d*%d bytes\n", numAllocs, blockSize);
        for(j = 0; j < numAllocs; j++){
            ptr[j] = malloc(blockSize);
            if(ptr[j] == NULL){
                perror("malloc return NULL");
                exit(EXIT_FAILURE);
            }
        }
    
        printf("Program break is now:    %10p\n", sbrk(0));
        printf("Freeing blocks from %d to %d in steps of %d\n", freeMin, freeMax, freeStep);
        for(j = freeMin - 1; j < freeMax; j += freeStep){
            free(ptr[j]);
        }
        printf("After free(), program break is : %10p\n", sbrk(0));
        printf("\n");
        exit(EXIT_SUCCESS);
    }
    

    This one is taken from The Linux Programming Interface; I simplified it a bit.

    Chapter 7:

    The first two command-line arguments specify the number and size of blocks to allocate. The third command-line argument specifies the loop step unit to be used when freeing memory blocks. If we specify 1 here (which is also the default if this argument is omitted), then the program frees every memory block; if 2, then every second allocated block; and so on. The fourth and fifth command-line arguments specify the range of blocks that we wish to free. If these arguments are omitted, then all allocated blocks (in steps given by the third command-line argument) are freed.

    Try running it with:

    • ./free_and_sbrk 1000 10240 2
    • ./free_and_sbrk 1000 10240 1 1 999
    • ./free_and_sbrk 1000 10240 1 500 1000

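    You will see that only in the last example does the program break decrease, i.e. the process returns some blocks of memory to the OS (if I understand correctly). That is the only run in which the freed blocks form a contiguous region at the top of the heap.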

    This sample code is evidence of

    "If allocating by brk, the process is probably not returning the memory to the OS before it terminates (sometimes it does)."


    Finally, here are some useful paragraphs quoted from the book. I suggest reading Chapter 7 (section 7.1) of TLPI; it is very helpful.

    In general, free() doesn’t lower the program break, but instead adds the block of memory to a list of free blocks that are recycled by future calls to malloc(). This is done for several reasons:

    • The block of memory being freed is typically somewhere in the middle of the heap, rather than at the end, so that lowering the program break is not possible.
    • It minimizes the number of sbrk() calls that the program must perform. (As noted in Section 3.1, system calls have a small but significant overhead.)
    • In many cases, lowering the break would not help programs that allocate large amounts of memory, since they typically tend to hold on to allocated memory or repeatedly release and reallocate memory, rather than release it all and then continue to run for an extended period of time.
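As a complement to the reasons quoted above, glibc offers malloc_trim(), which explicitly asks the allocator to hand free memory at the top of the heap back to the OS. The following is only a sketch, assuming glibc on Linux; the helper name, block count and block size are made up for illustration, and it is not something the book excerpt itself describes.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() under strict -std= modes */
#include <malloc.h>       /* malloc_trim(): a glibc extension */
#include <stdlib.h>
#include <unistd.h>

#define TRIM_BLOCKS 100000   /* arbitrary number of small blocks */

/* Hypothetical helper (not part of any library).
 * Allocates and frees many small blocks, then calls malloc_trim(0).
 * Returns the number of bytes by which the program break dropped
 * across the malloc_trim() call, or -1 if an allocation failed. */
long bytes_released_by_trim(void) {
    static void *blocks[TRIM_BLOCKS];
    for (size_t i = 0; i < TRIM_BLOCKS; i++) {
        blocks[i] = malloc(64);
        if (blocks[i] == NULL) {
            while (i > 0) {
                free(blocks[--i]);   /* undo what was allocated so far */
            }
            return -1;
        }
    }
    for (size_t i = 0; i < TRIM_BLOCKS; i++) {
        free(blocks[i]);
    }
    char *before = sbrk(0);   /* break after all the free() calls */
    malloc_trim(0);           /* ask glibc to release what it can */
    char *after = sbrk(0);
    return (long)(before - after);
}
```

A positive result would mean that free() alone left the break where it was and that the explicit trim lowered it. A result of 0 would mean there was nothing left to release at that point.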

    What the program break is (also from the book):

    [image: figure from the book, not reproduced here]

    Also: https://www.wikiwand.com/en/Data_segment


  2. On Linux, glibc's malloc treats allocations larger than MMAP_THRESHOLD differently. See Why does malloc rely on mmap starting from a certain threshold?

    The question you linked, where allocations may not appear to be fully reclaimed immediately, uses small allocations which are sort of pooled together by malloc() and not instantly returned to the OS on each small deallocation (that would be slow). Your single huge allocation definitely goes via the mmap() path, and so is a totally independent allocation which will be fully and immediately reclaimed.

    Think of it this way: if you ask someone to buy you eggs and milk, they will likely make a single trip and return with what you requested. But if you ask for eggs and a diamond ring, they will treat those as two totally separate requests, fulfilled using very different strategies. If you then say you no longer need the eggs and the ring, they may keep the eggs for when they get hungry, but they’ll probably try to get their money back for the ring right away.
