Lecture from: 07.10.2025 | Video: Videos ETHZ
The previous lecture demonstrated how to use the dynamic memory API (malloc, calloc, realloc, and free). This lecture opens the “black box” to examine how a memory allocator works, addressing the fundamental challenges and trade-offs in its design. This classic systems programming problem sits at the intersection of data structures, algorithms, and hardware reality.
The Role of a Dynamic Memory Allocator
A dynamic memory allocator serves as a software layer between the application and the operating system.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014103522.png)
- OS and Hardware Level: The kernel manages physical memory in large, fixed-size pages (typically 4 KB).
- Application Level: Apps require memory for objects much smaller than a page.
- Allocator’s Role: The allocator acts as a broker, subdividing large chunks from the OS into smaller blocks.
Recall: The malloc Package Contract
The API contract includes:
#include <stdlib.h>
void *malloc(size_t size);
void free(void *p);
void *realloc(void *p, size_t size);mallocreturns a pointer to a block of at leastsizebytes.- The returned pointer is aligned (address is a multiple of 8 or 16) for efficient CPU access.
freereturns a block to the resource pool.reallocresizes a previously allocated block.
Where Does Everything Go? A Memory Map Example
A concrete example illustrates the process address space.
char big_array[1<<24]; /* 16 MB, global initialized (implicitly to 0) */
char *p1, *p2, *p3, *p4; /* Global pointers */
int main() {
p1 = malloc(1 << 28); /* 256 MB */
p2 = malloc(1 << 8); /* 256 B */
// ...
}/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014103707.png)
big_array: Resides in the Data segment (.bss).p1,p2,p3,p4: Pointer variables (8 bytes each) in the Data segment.- The
malloc’d memory: Large blocks pointed to byp1andp2reside in the Heap, which grows upwards. main(): Machine code resides in the Text segment.$rsp(Stack Pointer): Points to the top of the Stack, which grows downwards.
Explicit vs. Implicit Allocators
Memory management strategies fall into two categories:
-
Explicit Allocators (C, C++): The programmer is responsible for both allocation (
malloc) and deallocation (free). This offers control but risks bugs./Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014103821.png)
-
Implicit Allocators (Java, Python): The application allocates memory, but a Garbage Collector (GC) automatically reclaims unreachable memory.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014103842.png)
This discussion focuses on building an explicit allocator.
The Core Problem: Managing the Heap
Writing malloc and free requires managing the heap as a large, contiguous block of memory to satisfy a sequence of requests.
Constraints on the Allocator
- Cannot control requests: Must handle arbitrary sequences of
malloc/free. - Must respond immediately: No reordering or buffering requests.
- Must use the heap: Allocated blocks must come from existing free memory.
- Must respect alignment: Addresses must be multiples of the alignment boundary.
- Cannot move allocated blocks: Once pointers are returned, data cannot be moved (no compaction).
Performance Goals
Two conflicting goals exist:
- Maximize Throughput: Operations per second should be maximized (ideally ).
- Maximize Memory Utilization: Minimize waste, specifically fragmentation.
Challenge: Fragmentation
Fragmentation prevents satisfying requests despite having enough total free memory.
Internal Fragmentation
Occurs when an allocated block exceeds the payload size.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014104000.png)
Causes:
- Overhead: Metadata storage (e.g., block size).
- Alignment: Padding to meet alignment requirements (e.g., 1 byte payload in a 16-byte block).
- Policy Decisions: Minimum block sizes.
External Fragmentation
Occurs when sufficient total free memory exists, but no single contiguous block is large enough.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014104015.png)
This depends on future request patterns and is the central challenge of allocator design.
Implementing a Simple Allocator: The Implicit Free List
The simplest allocator design solves two problems: knowing how much to free and tracking free blocks.
Knowing How Much to Free: The Header Field
The standard solution stores the block size in a header word immediately preceding the payload.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014104150.png)
free(p0) reads the word at p0 - sizeof(word) to determine the block size.
Tracking Free Blocks: The Implicit List
An implicit list links all blocks (allocated and free) by size. Since the heap is contiguous, knowing a block’s size locates the next block.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014105229.png)
The Header Trick: Size and Allocated Bit in One Word
Block sizes are multiples of 8 or 16, leaving the least significant bits unused (always 0). One bit can store the allocated/free flag.
Standard Optimization
...000: Size is a multiple of 8.- Use the last bit:
1= allocated,0= free.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014105303.png)
- True Size:
header & ~0x7 - Status:
header & 0x1
Finding a Free Block: Placement Policies
malloc must traverse the list to find a suitable free block.
First-Fit Policy
Search from the beginning and choose the first sufficient block.
p = start_of_heap;
while (p < end_of_heap) {
if ( (*p & 1) || ( (*p & -2) < requested_len) ) {
p = p + (*p & -2); // Go to next block
} else {
break; // Found
}
}- Pros: Simple.
- Cons: Linear time scan. Creates small fragments at the start.
Other Policies:
- Next-Fit: Starts search where the last one finished.
- Best-Fit: Searches the entire list for the tightest fit. Slower but minimizes fragmentation.
Allocating in a Free Block: Splitting
If a free block is larger than requested, it is split. The beginning is allocated, and the remainder becomes a new free block.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014105642.png)
Freeing a Block and the Need for Coalescing
Simply clearing the allocated flag creates “false fragmentation” where adjacent free blocks remain separate.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014105801.png)
Coalescing merges newly freed blocks with adjacent free neighbors.
Bidirectional Coalescing with Boundary Tags
Coalescing with the next block is straightforward. Coalescing with the previous block is difficult in a one-way list.
The Solution: Boundary Tags (Footers)
Replicating the header as a boundary tag (footer) at the end of the block allows bidirectional traversal.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014105854.png)
Checking the previous block involves reading the word before the current header (the previous footer).
This enables constant-time coalescing by checking both neighbors.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014105926.png)
Implicit List Summary
- Allocating: (linear search).
- Freeing: (constant time with boundary tags).
- Memory Overhead: Headers and footers cause internal fragmentation.
The linear allocation cost makes implicit lists impractical for high-performance allocators.
Explicit Free Lists
To optimize, an allocator can track only free blocks using an explicit free list (a linked list of free blocks).
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014111049.png)
The Clever Part: Where to Store the Pointers?
Pointers (next and prev) are stored in the payload area of free blocks, which is otherwise unused.
Overlaying Data
- Allocated: Payload holds application data.
- Free: Payload holds list pointers.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014111234.png)
This imposes a minimum block size (Header + Next + Prev + Footer).
Logical vs. Physical Order
The logical list order need not match the physical memory order. Pointers can jump across allocated blocks.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014111751.png)
Allocating from an Explicit Free List
- Search the free list.
- Splice the block out of the list.
- Split if necessary, adding the remainder back to the list.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014112302.png)
Freeing with an Explicit Free List
Freeing involves inserting the block back into the list.
Insertion Policies:
- LIFO (Last-In, First-Out): Insert at the root. but may fragment worse.
- Address-Ordered: Maintain sorted order. search but lower fragmentation.
Freeing with LIFO: The Four Cases
Coalescing is managed alongside list insertion.
Case 1: No Coalescing
Insert new block at root.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014112335.png)
Case 2: Coalesce with Next Block
Splice out next block, merge, and insert new larger block at root.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014112423.png)
Case 3: Coalesce with Previous Block
Symmetric to Case 2.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014112740.png)
Case 4: Coalesce with Both Blocks
Splice out both neighbors, merge all three, and insert at root.
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014112800.png)
Explicit List Summary
- Allocating: (faster than ).
- Freeing: (LIFO).
- Overhead: Larger minimum block size.
Segregated Free Lists: The Best of Both Worlds
Segregated free lists improve performance by maintaining multiple free lists, each for a specific size class of blocks (e.g., 16, 32, 64-128 bytes).
/Semester-3/Systems-Programming-and-Computer-Architecture/Lecture-Notes/attachments/Pasted-image-20251014113021.png)
Seglist Allocator Algorithm
-
Allocate: Determine size class, search suitable list. If empty, try larger classes. If all fail, request more heap memory.
-
Free: Coalesce, determine new size, insert into appropriate list.
-
Higher Throughput: Logarithmic time search for power-of-two classes.
-
Better Memory Utilization: Segregated lists approximate a best-fit search.
Practice: Allocator Mechanics
Designing an allocator requires careful bit manipulation and pointer arithmetic.
Exercise: Header Decoding
A block header in a 64-bit allocator has the value 0x0000000000000031. Assuming 8-byte alignment:
- What is the total block size?
- Answer: 48 bytes. The header stores the size and the allocated bit. The size is
0x30( in decimal).
- Answer: 48 bytes. The header stores the size and the allocated bit. The size is
- Is the block allocated or free?
- Answer: Allocated. The last bit is
1.
- Answer: Allocated. The last bit is
Exercise: Fragmentation Identification
Consider a heap with three free blocks of size 16, 32, and 16 bytes.
The user calls malloc(48).
- Can the allocator satisfy this request if it cannot move blocks?
- Answer: No. Even though total free space is 64 bytes, no single block is bytes. This is external fragmentation.
Continue here: 08 Garbage Collection and Software Engineering Principles