05 Pointers, Stack

Lecture from: 30.09.2025 | Video: Videos ETHZ

Welcome to one of the most foundational and powerful topics in C. Pointers are what give C its reputation for low-level control and performance. They are variables that hold memory addresses, allowing us to directly manipulate the layout and content of memory. To understand pointers, we must first understand the environment where they operate: the process address space.

The Process Address Space

When the operating system (OS) runs a program, it creates a process. A program is the static code on disk (the executable file); a process is that program in execution. The OS provides each process with its own private sandbox to play in, a concept known as the process address space.

This address space is a view of the computer’s memory from the process’s perspective. It has several key properties:

It contains the process’s virtual memory: This is a crucial concept we’ll explore in depth later in the course. For now, think of it as the process’s private map of memory. The addresses are not necessarily the physical RAM addresses but are managed and translated by the OS and CPU.
It is private: One process cannot (normally) see or access the address space of another. This isolation is a fundamental security and stability feature of modern operating systems.
It is byte-addressable: Every single byte in this memory map has a unique numerical address.
It is vast:
- On a 32-bit system, the address space contains $2^{32}$ bytes (4 GiB).
- On a 64-bit system, it contains $2^{64}$ bytes (16 EiB). This is a theoretical maximum far larger than any physical memory available today.

Loading a Program: Populating the Address Space

When you execute a program, the OS loader reads your executable file and populates this address space. It sets up several distinct regions, or segments, each with a specific purpose.

Here is a typical layout for a Linux process, starting from low memory addresses and going up:

Unused: The very lowest part of memory (around address 0) is intentionally left unmapped. This helps catch errors: dereferencing a NULL pointer touches this unmapped region, and the OS terminates the process with a segmentation fault.
Read-Only Segment (.text, .rodata): This is where the program’s machine code (the compiled instructions) and any read-only data (like string literals) are loaded. It’s marked as read-only by the hardware to prevent a program from accidentally or maliciously modifying its own instructions.
Read/Write Segment (.data, .bss): This segment holds global and static variables that are initialized (.data) or uninitialized (.bss).
Run-time Heap: This is a large, flexible area of memory for dynamic allocation. When your program needs memory at runtime (e.g., for a data structure whose size isn’t known at compile time), it requests it from the heap using functions like malloc(). The heap grows upwards from the top of the data segment.
Shared Libraries: Modern programs rely on shared libraries (like the standard C library). The OS maps these libraries into the middle of the address space so that multiple processes can share the same code in physical memory.
User Stack: This is where local variables, function arguments, and return information are stored. The stack is crucial for managing function calls. By convention, it starts at a high address and grows downwards.

The Stack

Languages that support recursion, like C, rely on a stack data structure to manage function calls. The code must be reentrant, meaning a single function can have multiple active instances at the same time (e.g., amI() calling amI()). Each instance needs its own private state.

Stack Frames: The stack is allocated in chunks called frames. Each time a function is called, a new frame is “pushed” onto the stack. When the function returns, its frame is “popped.”
Stack Discipline: This follows a strict Last-In, First-Out (LIFO) order. The callee always returns before the caller.

Stack Frames in Detail

A stack frame contains all the necessary information for a single function activation:

Local Variables: Variables declared inside the function.
Return Information: The address in the caller’s code to return to when the function finishes.
Temporary Space: For intermediate calculations.

The CPU uses two special registers to manage the stack:

Frame Pointer: Points to a fixed location within the current frame (often the beginning).
Stack Pointer: Points to the “top” of the stack (the lowest memory address currently in use by the stack).

Visualizing the Call Chain

Let’s trace a series of function calls to see the stack in action. Consider a program where yoo() calls who(), and who() calls the recursive function amI().

yoo() calls who(): When who() is called, a new frame for who is pushed onto the stack, on top of yoo’s frame.
who() calls amI(): A frame for the first activation of amI is pushed.
amI() calls amI(): A second amI frame is pushed. The stack now has four frames: yoo, who, amI, amI.
Functions Return: As each amI returns, its frame is popped. Eventually, who returns, its frame is popped, and finally yoo returns, leaving the stack empty.

Pointers in C

With our understanding of memory, we can now define a pointer precisely.

What is a Pointer?

A pointer is a variable whose value is the memory address of another variable.

It’s that simple. It’s not the data itself; it’s the location of the data.

Addresses and the `&` Operator

To get the memory address of a variable, we use the unary & (address-of) operator.

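#include <stdio.h>
 
int main(int argc, char *argv[]) {
    int x, y;
    int a[2];
 
    // Use %p in printf to print addresses (pointers) in hexadecimal.
    // %p expects a void *, so each address is cast explicitly.
    printf("x is at %p\n", (void *)&x);
    printf("y is at %p\n", (void *)&y);
    printf("a[0] is at %p\n", (void *)&a[0]);
    printf("a[1] is at %p\n", (void *)&a[1]);
    // Yes, functions have addresses too! Converting a function pointer to
    // void * is a common extension (fine on Linux), not strict ISO C.
    printf("main is at %p\n", (void *)&main);
    return 0;
}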

Running this code prints one address per line. The exact values differ between machines and runs, but the pattern reveals the memory layout.

Notice that the local variables (x, y, a) have high memory addresses. This is because they live on the stack, which grows downwards from the top of the address space. The main function’s address is low because it lives in the read-only code segment.

Declaring and Dereferencing Pointers

We use the asterisk * for two distinct pointer operations.

Declaration: To declare a pointer variable, we specify the type of data it will point to, followed by an asterisk.

type *name; // Declares a pointer named 'name' that can hold the address of a 'type'

Example:

int x = 42;
int *p;      // p is a pointer to an integer.
p = &x;      // p now stores the address of x. p "points to" x.

Dereferencing: To access the value at the address stored in a pointer, we use the unary * (dereference or indirection) operator.

v = *pointer; // v gets the value that 'pointer' points to.
*pointer = value; // The memory location that 'pointer' points to gets a new value.

Example:

int x = 42;
int *p = &x; // p points to x
 
printf("The value of x is %d\n", x); // Prints 42
 
*p = 99; // Go to the address stored in p (x's address) and write 99 there.
 
printf("The value of x is now %d\n", x); // Prints 99

We changed x without ever mentioning x by name! We did it indirectly through the pointer p.

Visualizing Pointers: Box and Arrow Diagrams

The best way to reason about pointers is with “box and arrow” diagrams. We draw boxes for memory locations, showing their address, name (if any), and value. If a value is an address, we draw an arrow from it to the location it points to.

Double Pointers

You can have a pointer to a pointer. This is declared with two asterisks (**). A double pointer stores the address of another pointer.

int x = 1;
int *p = &x;
int **dp = &p; // dp points to p, which points to x.

To get to the value of x from dp, you must dereference twice: **dp.

Address Space Layout Randomization (ASLR) and NULL

ASLR: You may notice that the addresses of your variables change every time you run your program. This is a security feature called Address Space Layout Randomization (ASLR). By randomizing the base addresses of the stack, heap, and libraries, it makes it much harder for attackers to exploit memory corruption bugs.
NULL: There is a special pointer value called NULL. It is guaranteed not to point to any valid object (on most systems it is represented as address 0). It’s incredibly useful as a sentinel value to indicate that a pointer “doesn’t point to anything.” Dereferencing a NULL pointer is undefined behavior; on Linux it typically causes a segmentation fault, because the lowest part of the address space is unmapped. NULL is commonly defined as ((void *)0), a null pointer of the generic pointer type void *.

Pointer Arithmetic

You can perform a limited set of arithmetic operations on pointers. The most common is adding an integer to a pointer. This is where pointer types become critical.

The Golden Rule of Pointer Arithmetic

When you add an integer n to a pointer p, the compiler does not add n to the raw address. Instead, it advances the pointer by n elements of the type it points to. The actual address calculation is: $\text{new\_address} = \text{old\_address} + n \times \texttt{sizeof(*p)}$

Example:

int arr[3] = {2, 3, 4};
int *p = &arr[1]; // p points to the '3'

Let’s trace what happens with arithmetic.

*p += 1;
- This is not pointer arithmetic. It dereferences p to get the value 3.
- It increments that value to 4.
- The memory at arr[1] now holds 4. The pointer p itself is unchanged.
p += 1;
- This is pointer arithmetic. We are adding 1 to the pointer p.
- p is an int *, and sizeof(int) is 4 bytes on typical platforms.
- The new address will be address_of_arr[1] + 1 * 4.
- p now points to the next integer in memory, which is arr[2].

The Importance of Type: `int *` vs. `char *`

Let’s see a detailed example of how type affects pointer arithmetic.

int arr[3] = {1, 2, 3};
int *int_ptr = &arr[0];
char *char_ptr = (char *) int_ptr; // A char pointer pointing to the same location

int_ptr += 1;
- sizeof(int) is 4 on typical platforms, so the address in int_ptr increases by 4.
- It now points to arr[1]. *int_ptr is now 2.
char_ptr += 1;
- sizeof(char) is always 1. The address in char_ptr increases by just 1.
- It now points to the second byte of the integer arr[0].
- On a little-endian machine, the integer 1 is stored as 01 00 00 00. The first byte is 1. The second byte is 0.
- *char_ptr is now 0.

Arrays and Pointers: A Close Relationship

Arrays and pointers are not the same, but they are deeply intertwined in C.

An array is a contiguous block of memory holding elements of the same type.
A pointer is a single variable that holds a memory address.

The connection is this: In most expressions, an array’s name “decays” into a pointer to its first element.

This means that if you have int a[10];, the following are true:

a is equivalent to &a[0].
a[i] is syntactic sugar for *(a + i).

This equivalence is why a[i], *(a + i), and even i[a] all access the same array element: each one is evaluated as "start address plus i elements."

Exceptions to Array Decay

There are three important situations where an array name is not treated as a pointer to its first element:

When it’s the operand of sizeof: sizeof(a) yields the size of the entire array in bytes (10 * sizeof(int)), not the size of a pointer.
When it’s an operand of &: &a gives you the address of the array itself. The value is the same as a (the starting address), but the type is different. It’s a “pointer to an array of 10 ints” (int (*)[10]), not a “pointer to an int” (int *).

When it’s a string literal used to initialize a char array:

char a[] = "Hello"; // 'a' is a 6-byte array on the stack.
char *b = "Hello";  // 'b' is a pointer on the stack pointing to a
                    // 6-byte read-only string literal in the code segment.

Arrays as Function Parameters

When you pass an array to a function, you are always passing a pointer. The array does not get copied. The function receives a pointer to the first element of the original array.

This means these three function signatures are precisely equivalent to the compiler:

int arrfun(int *myarray);
int arrfun(int myarray[]);
int arrfun(int myarray[42]); // The size is ignored by the compiler!

Inside arrfun, myarray is always treated as an int *. This is why sizeof(myarray) inside the function will return the size of a pointer, not the size of the original array.

Passing by Value vs. Passing by Reference

This leads to a final, critical topic: how C passes arguments to functions.

Pass-by-value (the default): C passes a copy of the argument’s value to the function. If the function modifies its parameter, it’s only modifying the local copy. The original variable in the caller is unaffected.

This is why the classic swap function fails:

void swap(int a, int b) { // a and b are copies
    int tmp = a;
    a = b;
    b = tmp;
} // The copies are swapped, then destroyed. The originals are untouched.

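Pass-by-reference (the C way): Strictly speaking, C has no pass-by-reference; every argument is passed by value. To allow a function to modify the caller’s variables, we simulate it: instead of passing the variables themselves, we pass pointers to them. The pointers are copied, but the copies still point to the originals.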

void swap(int *addr_a, int *addr_b) { // Takes pointers as arguments
    int tmp = *addr_a;       // Dereference to get the value
    *addr_a = *addr_b;   // Dereference to assign to the original location
    *addr_b = tmp;
}
 
// In main:
int x = 42, y = -7;
swap(&x, &y); // Pass the addresses of the variables

Now, swap receives the addresses of x and y. By dereferencing those addresses, it directly manipulates the variables in main’s stack frame, and the swap works as intended.

Continue here: 06 Pointers, Heap, Dynamic Memory, Structs
