Lecture from: 30.09.2025 | Video: Videos ETHZ

This lecture covers one of the most foundational and powerful topics in C: pointers. Pointers provide the low-level control and performance for which C is renowned. They allow direct manipulation of memory layout and content. Understanding pointers necessitates a prior understanding of the environment in which they operate: the process address space.

The Process Address Space

When the operating system (OS) executes a program, it creates a process. A program is the static code on disk; a process is that program in execution. The OS provides each process with its own private sandbox known as the process address space.

This address space represents the process’s view of the computer’s memory. It has several key properties:

  • It is virtual: The address space is the process’s private map of memory. Addresses are translated by the OS and CPU and do not necessarily correspond directly to physical RAM.
  • It is private: Processes are isolated; one process cannot normally access the address space of another. This isolation is fundamental to security and stability.
  • It is byte-addressable: Every byte in this map has a unique numerical address.
  • It is vast:
    • On a 32-bit system, the address space contains 2^32 bytes (4 Gigabytes).
    • On a 64-bit system, it contains 2^64 bytes (16 Exabytes), far exceeding current physical memory limits.
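From inside a program, the width of a pointer hints at how large the address space can be. The following is a minimal sketch; pointer_bits is a name chosen for illustration, and the result is an upper bound, since current 64-bit CPUs typically implement fewer address bits than the pointer width suggests.

```c
#include <limits.h>  /* CHAR_BIT: number of bits in one byte */
#include <stddef.h>  /* size_t */

/* Width of a data pointer in bits: typically 32 or 64. */
size_t pointer_bits(void) {
    return sizeof(void *) * CHAR_BIT;
}
```

A result of 64 corresponds to the 2^64-byte address space described above.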

Loading a Program

When a program is executed, the OS loader populates this address space with distinct regions, or segments.

A typical Linux process layout, from low to high memory addresses, includes:

  • Unused: The lowest memory (around address 0) remains unmapped to catch errors like dereferencing NULL pointers.
  • Read-Only Segment (.text, .rodata): Contains machine code and constants.
  • Read/Write Segment (.data, .bss): Holds global and static variables.
  • Run-time Heap: A flexible area for dynamic allocation. It grows upwards.
  • Shared Libraries: The OS maps common libraries here for efficient sharing.
  • User Stack: Stores local variables and function arguments. It grows downwards.
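The regions above can be observed by printing one address from each. The sketch below assumes a typical Linux toolchain; the names print_layout, global_counter and zeroed_buffer are illustrative, and the exact placement of each object is up to the compiler and linker.

```c
#include <stdio.h>
#include <stdlib.h>

int global_counter = 1;        /* initialized global: usually .data */
static int zeroed_buffer[16];  /* zero-initialized static: usually .bss */

/* Prints one address per region. Returns 0 on success, -1 if malloc fails. */
int print_layout(void) {
    const char *literal = "constant";        /* string literal: usually .rodata */
    int local = 0;                           /* local variable: stack */
    int *dynamic = malloc(sizeof *dynamic);  /* dynamic allocation: heap */
    if (dynamic == NULL)
        return -1;

    printf(".rodata  %p\n", (const void *)literal);
    printf(".data    %p\n", (void *)&global_counter);
    printf(".bss     %p\n", (void *)zeroed_buffer);
    printf("heap     %p\n", (void *)dynamic);
    printf("stack    %p\n", (void *)&local);

    free(dynamic);
    return 0;
}
```

On a typical Linux system the printed addresses increase roughly in the order listed, but the concrete values differ between machines and runs.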

The Stack

Languages supporting recursion, like C, utilize a stack to manage function calls. Code must be reentrant, allowing multiple active instances of a single function, each with private state.

  • Stack Frames: Memory is allocated in chunks called frames. Calling a function pushes a new frame; returning pops it.
  • Stack Discipline: Adheres to Last-In, First-Out (LIFO). The callee returns before the caller.
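To make "multiple active instances, each with private state" concrete, the following sketch lets every recursive call keep a small record in its own frame and link it to its caller's record. The names frame_record, descend and count_live_frames are hypothetical, and the chain is only a model of the real frames, not the layout the compiler actually uses.

```c
#include <stddef.h>

/* One record per active call, stored in that call's own stack frame. */
struct frame_record {
    const struct frame_record *caller;  /* record of the calling instance */
    int depth;                          /* private state of this instance */
};

/* Recurses until max_depth instances are active (max_depth >= 1),
   then walks the chain of records back towards the first call.
   Returns the number of live records, or -1 if a record was clobbered. */
static int descend(const struct frame_record *caller, int depth, int max_depth) {
    struct frame_record me = { caller, depth };
    if (depth + 1 < max_depth)
        return descend(&me, depth + 1, max_depth);

    /* Deepest call: every caller has not returned yet, so its record is live. */
    int live = 0;
    for (const struct frame_record *f = &me; f != NULL; f = f->caller) {
        if (f->depth != depth - live)
            return -1;
        live++;
    }
    return live;
}

int count_live_frames(int max_depth) {
    return descend(NULL, 0, max_depth);
}
```

Calling count_live_frames(5) finds five separate records, one per active instance of descend, each still holding its own depth value.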

Stack Frames in Detail

A stack frame contains:

  • Local Variables: Variables declared within the function scope.
  • Return Information: The address to return to upon completion.
  • Temporary Space: Space for intermediate calculations that don’t fit in registers.

The CPU manages the stack using two registers:

  • Frame Pointer: Points to a fixed location (typically the base) of the current frame.
  • Stack Pointer: Points to the “top” of the stack (the lowest used address).

Visualizing the Call Chain

Tracing a sequence where yoo() calls who(), which calls the recursive amI():

  1. yoo() calls who(): A frame for who is pushed onto yoo’s frame.
  2. who() calls amI(): A frame for amI is pushed.
  3. amI() calls amI(): A second amI frame is pushed. The stack holds four frames.
  4. Functions Return: Frames are popped in reverse order as functions return, eventually emptying the stack.
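The call chain can be modelled with a counter that is incremented on entry and decremented on return. This is a sketch only: the counters and the helpers enter, leave, max_active_frames and active_frames are assumptions for illustration, and the bodies of yoo, who and amI are reduced to the calls described above.

```c
static int depth = 0;      /* frames currently on the modelled stack */
static int max_depth = 0;  /* largest depth observed so far */

static void enter(void) { depth++; if (depth > max_depth) max_depth = depth; }
static void leave(void) { depth--; }

/* amI calls itself once more when `again` is nonzero. */
static void amI(int again) {
    enter();
    if (again)
        amI(0);
    leave();
}

static void who(void) { enter(); amI(1); leave(); }

void yoo(void) { enter(); who(); leave(); }

/* Runs the chain from an empty model and reports the peak depth. */
int max_active_frames(void) {
    depth = 0;
    max_depth = 0;
    yoo();
    return max_depth;
}

/* Frames still active in the model. */
int active_frames(void) { return depth; }
```

The peak is four (yoo, who, amI, amI), and the count is back at zero once yoo has returned.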

Pointers

A pointer is simply a variable whose value is the memory address of another variable. It represents the location of data rather than the data itself.

Addresses and the & Operator

The unary & (address-of) operator yields the memory address of a variable.

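#include <stdio.h>

int main(int argc, char *argv[]) {
    int x, y;
    int a[2];

    // Use %p in printf to print addresses (typically shown in hexadecimal).
    // %p expects a void *, so each address is cast explicitly.
    printf("x is at %p\n", (void *)&x);
    printf("y is at %p\n", (void *)&y);
    printf("a[0] is at %p\n", (void *)&a[0]);
    printf("a[1] is at %p\n", (void *)&a[1]);
    // Yes, functions have addresses too! Converting a function pointer to
    // void * is supported on POSIX systems but is not strictly ISO C.
    printf("main is at %p\n", (void *)&main);
    return 0;
}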

The output reveals the memory layout: local variables (x, y, a) have high addresses (stack), while main has a low address (read-only code segment). The exact values vary from run to run.

Declaring and Dereferencing Pointers

The asterisk * denotes two distinct operations:

  1. Declaration: Declares a pointer variable.

    int x = 42;
    int *p;      // p is a pointer to an integer.
    p = &x;      // p now holds the address of x.
  2. Dereferencing: Accesses the value at the pointer’s stored address.

    int x = 42;
    int *p = &x; // p points to x
     
    printf("The value of x is %d\n", x); // Prints 42
     
    *p = 99; // Writes 99 to the address stored in p (modifying x).
     
    printf("The value of x is now %d\n", x); // Prints 99

    This allows modifying x indirectly.

Visualizing Pointers

“Box and arrow” diagrams act as effective reasoning tools. Boxes represent memory locations (with address, name, value), and arrows represent pointers linking addresses.

Double Pointers

A pointer to a pointer is declared with **. It stores the address of another pointer.

int x = 1;
int *p = &x;
int **dp = &p; // dp points to p, which points to x.

Dereferencing twice (**dp) retrieves the value of x.
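A common use of a double pointer is letting a function change where the caller's pointer points. The following is a minimal sketch; point_to_larger is a hypothetical helper, not part of the lecture.

```c
/* Redirects the caller's pointer (*pp) to whichever of a or b
   holds the larger value. a and b must be valid pointers. */
void point_to_larger(int **pp, int *a, int *b) {
    *pp = (*a >= *b) ? a : b;  /* one dereference: modifies the pointer itself */
}
```

After int x = 1, y = 5; int *p = &x; point_to_larger(&p, &x, &y); the pointer p holds the address of y, and *p reads 5.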

ASLR and NULL

  • ASLR (Address Space Layout Randomization): Randomizes memory segment locations for security.
  • NULL: A sentinel value (usually 0) indicating that a pointer does not refer to any valid object. Dereferencing it is undefined behavior; on Linux it typically causes a segmentation fault.
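Because dereferencing NULL is an error, functions that accept pointers often check for it first. A minimal sketch, with read_or_default as an illustrative name:

```c
#include <stddef.h>  /* NULL */

/* Returns the value p points to, or `fallback` when p is NULL. */
int read_or_default(const int *p, int fallback) {
    if (p == NULL)
        return fallback;  /* never dereference a NULL pointer */
    return *p;
}
```

The check only guards against NULL. A pointer holding some other invalid address cannot be detected this way.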

Pointer Arithmetic

Limited arithmetic operations are possible on pointers, most notably adding an integer. Pointer types are crucial here.

The Golden Rule of Pointer Arithmetic

Adding an integer n to a pointer p advances the pointer by n elements of the pointed-to type, not n bytes.

Example:

int arr[3] = {2, 3, 4};
int *p = &arr[1]; // p points to the '3'
  1. *p += 1;

    • Dereferences p (value 3), increments it to 4. arr[1] becomes 4. p remains unchanged.
  2. p += 1;

    • Adds 1 to pointer p. Since p is an int * and an int typically occupies 4 bytes, p advances by 4 bytes.
    • p now points to arr[2].
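The golden rule can be checked by measuring the distance between p and p + n in bytes. The sketch below uses a hypothetical helper, int_stride_bytes, and views both pointers as char * so that the difference is counted in bytes.

```c
#include <stddef.h>

/* Number of bytes between p and p + n for an int pointer.
   n must be at most 8 so the result stays within (or one past) the array. */
size_t int_stride_bytes(size_t n) {
    int arr[8] = {0};
    int *p = arr;
    int *q = p + n;  /* advances by n elements, not n bytes */
    return (size_t)((char *)q - (char *)p);
}
```

For any valid n the result equals n * sizeof(int), which is 4 * n on platforms with 4-byte ints.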

The Importance of Type

Casting pointers changes arithmetic behavior.

int arr[3] = {1, 2, 3};
int *int_ptr = &arr[0];
char *char_ptr = (char *) int_ptr;
  • int_ptr += 1: Advances by 4 bytes (next integer). *int_ptr becomes 2.

  • char_ptr += 1: Advances by 1 byte. It now points to the second byte of arr[0]. Because arr[0] is 1, this byte is 0; on a little-endian machine the first byte holds the 1.
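Viewing an int through a byte-sized pointer is also how byte order can be observed. The following sketch assumes an int of at least 2 bytes; byte_of_int and is_little_endian are illustrative names. It uses unsigned char * because C permits inspecting any object's bytes through that type.

```c
#include <stddef.h>

/* Returns byte number `index` (0 = lowest address) of value's
   in-memory representation. index must be less than sizeof(int). */
unsigned char byte_of_int(int value, size_t index) {
    const unsigned char *bytes = (const unsigned char *)&value;
    return bytes[index];
}

/* 1 if the least significant byte is stored at the lowest address. */
int is_little_endian(void) {
    return byte_of_int(1, 0) == 1;
}
```

On x86-64 is_little_endian() returns 1, and byte_of_int(1, 1) is 0, matching the char_ptr example above.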

Arrays and Pointers

Arrays and pointers are distinct but closely related.

  • Array: Contiguous block of memory elements.
  • Pointer: Variable holding an address.

In most expressions, an array’s name “decays” into a pointer to its first element. For int a[10];, a is equivalent to &a[0]. Thus a[i] is equivalent to *(a + i).
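The equivalence of a[i] and *(a + i) means the same loop can be written in either notation. A short sketch, with sum_indexed and sum_pointer as illustrative names:

```c
#include <stddef.h>

/* Sums n ints using index notation. */
long sum_indexed(const int a[], size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* The same computation using explicit pointer arithmetic. */
long sum_pointer(const int *a, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += *(a + i);
    return total;
}
```

Both functions return the same result for the same input.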

Exceptions to Decay

Array names do not decay when:

  1. Used with sizeof(): Returns total array size in bytes.
  2. Used with &: Returns address of the array (type int (*)[10]), not int *.
  3. A string literal is used to initialize a character array: the literal, itself an array, is copied into the new array rather than decaying.
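Each exception can be observed through sizes. The sketch below uses hypothetical demo functions and an ARRAY_LEN macro; the macro is only meaningful where its argument is a real array rather than a pointer.

```c
#include <stddef.h>

/* Element count of a real array (not a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* 1. sizeof sees the whole array: 10 * sizeof(int) bytes. */
size_t demo_array_bytes(void) {
    int a[10] = {0};
    return sizeof a;
}

size_t demo_array_len(void) {
    int a[10] = {0};
    return ARRAY_LEN(a);
}

/* 2. &a has type int (*)[10], so adding 1 steps over the whole array. */
size_t demo_whole_array_step(void) {
    int a[10] = {0};
    int (*whole)[10] = &a;
    return (size_t)((char *)(whole + 1) - (char *)whole);
}

/* 3. The literal "hi" is copied into s, including the terminating '\0'. */
size_t demo_literal_init(void) {
    char s[] = "hi";
    return sizeof s;
}
```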

Arrays as Function Parameters

Passing an array to a function always passes a pointer to its first element. The array is not copied.

int arrfun(int *myarray);
int arrfun(int myarray[]); // Compiler treats myarray as int *
int arrfun(int myarray[42]); // The size is ignored by the compiler!

Inside arrfun, sizeof(myarray) returns the pointer size.
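Since the callee only receives a pointer, the length has to be passed separately. The following sketch uses illustrative names (size_seen_by_callee, sum_ints). The first function is declared with array syntax and defined with pointer syntax, which compiles because both spell the same parameter type.

```c
#include <stddef.h>

/* Declared with array syntax... */
size_t size_seen_by_callee(int myarray[42]);

/* ...and defined with pointer syntax: the parameter is int * either way. */
size_t size_seen_by_callee(int *myarray) {
    return sizeof myarray;  /* size of a pointer, not of 42 ints */
}

/* Because the length is lost, the caller passes it explicitly. */
long sum_ints(const int *values, size_t count) {
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

A typical call is sum_ints(arr, sizeof arr / sizeof arr[0]), evaluated in the scope where arr is still a real array.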

Pass-by-Value vs. Pass-by-Reference

  • Pass-by-value: The default. Functions receive copies of arguments. Modifying parameters affects only the local copy. This explains why a naive swap(int a, int b) fails.

    void swap(int a, int b) { // a and b are copies
        int tmp = a;
        a = b;
        b = tmp;
    } // The copies are swapped, then destroyed. The originals are untouched.

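  • Pass-by-reference: C passes every argument by value, so this effect is achieved by passing pointers. The pointer itself is copied, but the copy still refers to the caller’s variable.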

    void swap(int *addr_a, int *addr_b) { // Takes pointers as arguments
        int tmp = *addr_a;       // Dereference to get the value
        *addr_a = *addr_b;   // Dereference to assign to the original location
        *addr_b = tmp;
    }
     
    // In main:
    int x = 42, y = -7;
    swap(&x, &y); // Pass the addresses of the variables

    swap receives addresses. Dereferencing manipulates the original variables in the caller’s stack frame.
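The same mechanism lets a function hand back more than one result through "out parameters". The following is a sketch under that assumption; divide_with_remainder is a hypothetical helper, and it does not guard against the INT_MIN / -1 overflow case.

```c
/* Writes num / den to *quot and num % den to *rem.
   Returns 0 on success, or -1 (leaving the outputs untouched) if den is 0.
   quot and rem must be valid pointers. */
int divide_with_remainder(int num, int den, int *quot, int *rem) {
    if (den == 0)
        return -1;
    *quot = num / den;  /* stored in the caller's variable */
    *rem = num % den;
    return 0;
}
```

With int q, r; the call divide_with_remainder(17, 5, &q, &r) leaves q equal to 3 and r equal to 2.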

Practice: Pointers and Memory

Mastering pointers is about visualizing where data lives.

Exercise: Pointer Arithmetic

Consider the following:

int arr[4] = {10, 20, 30, 40};
int *p = arr; // p points to 10
p = p + 2;    // What does p point to?
  • Answer: p points to 30. Since p is an int *, p + 2 advances the address by 2 × sizeof(int) = 8 bytes (with 4-byte ints), which is the 3rd element.

Exercise: Pass-by-Reference

What will be printed?

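#include <stdio.h>

void mystery(int *a, int b) {
    *a = *a + b; // Writes through the pointer to the caller's variable
    b = 0;       // Changes only the local copy
}
int main(void) {
    int x = 5, y = 10;
    mystery(&x, y);
    printf("%d %d", x, y);
    return 0;
}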
  • Answer: 15 10. x is passed by reference and modified. y is passed by value, so the local change to b in mystery does not affect y.

Continue here: 06 C Pointers and Memory Management