Magic of Malloc - Behind the Scenes

Dynamic memory allocation is a fundamental concept in C, and most of us are familiar with the malloc function. But did you know that malloc is not a system call itself? It is a C library function built on top of the brk() and mmap() system calls. In this blog post, we'll look at the internals of malloc and at how the C library's allocator avoids making a system call every time malloc is called.

Optimizing System Calls:
One interesting aspect of malloc is that it minimizes system calls when memory is allocated many times. Rather than invoking a system call for each request, the allocator in the C library (glibc on most Linux systems) asks the kernel for a larger region up front and then serves subsequent requests from that region in user space.

Compile the program below (gcc mallocTest.c -o mallocTest):

// mallocTest.c
#include <stdio.h> 
#include <stdlib.h> 

#define FORTY_KB (40 * (1 << 10))

int main(void) {
    // Loop 10 times
    for (int i = 0; i < 10; ++i)
    {
        // Allocate 40KB; the block is deliberately never freed so the heap keeps growing
        void *memory = malloc(FORTY_KB);
        if (memory == NULL)
        {
            fprintf(stderr, "Memory allocation failed\n");
            return 1; // Exit with an error code
        }
        printf("Allocated 40KB of memory, iteration: %d\n", i + 1);

        // Pause so the system calls made so far can be inspected in strace
        printf("Press Enter to continue...");
        getchar();
    } 
    return 0; // Exit successfully 
}

The program above allocates 40KB of memory in each of 10 loop iterations.

Here getchar acts like a breakpoint in every iteration of the loop, so that we can see which system calls each malloc call makes.

After compiling the program, simply run strace on it:

>> strace ./mallocTest

What you will notice is that malloc does not invoke the brk() system call every single time.
This is because the first time you request 40KB, malloc grows the heap by more than the requested size: glibc adds padding on top of the request, 128KB by default. Subsequent malloc calls are served from this region and do not trigger brk() until it is exhausted. As a result, brk() shows up only about every 3-4 iterations.

This optimization in the C library avoids a costly switch into the kernel on every allocation and improves overall program efficiency.

This 128KB padding is a default that can be changed: in glibc it is the M_TOP_PAD parameter, settable with mallopt() or through the glibc.malloc.top_pad tunable.
To further fine-tune memory management, developers can explore the malloc tunable parameters provided by the GNU C Library. These parameters let you customize how malloc interacts with the underlying system calls, giving greater control over memory allocation and deallocation. Understanding and experimenting with them can lead to better performance tailored to your application or your embedded system.