Lecture from: 30.09.2025 | Video: Videos ETHZ
Welcome to one of the most foundational and powerful topics in C. Pointers are what give C its reputation for low-level control and performance. They are variables that hold memory addresses, allowing us to directly manipulate the layout and content of memory. To understand pointers, we must first understand the environment where they operate: the process address space.
The Process Address Space
When the operating system (OS) runs a program, it creates a process. A program is the static code on disk (the executable file); a process is that program in execution. The OS provides each process with its own private sandbox to play in, a concept known as the process address space.
This address space is a view of the computer’s memory from the process’s perspective. It has several key properties:
- It contains the process’s virtual memory: This is a crucial concept we’ll explore in depth later in the course. For now, think of it as the process’s private map of memory. The addresses are not necessarily the physical RAM addresses but are managed and translated by the OS and CPU.
- It is private: One process cannot (normally) see or access the address space of another. This isolation is a fundamental security and stability feature of modern operating systems.
- It is byte-addressable: Every single byte in this memory map has a unique numerical address.
- It is vast:
  - On a 32-bit system, the address space contains $2^{32}$ bytes (4 GiB).
  - On a 64-bit system, it contains $2^{64}$ bytes (16 EiB). This is a theoretical maximum far larger than any physical memory available today.
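The two sizes above follow from simple power-of-two arithmetic. The sketch below checks them; `address_space_size` is a hypothetical helper introduced only for this illustration, not something from the lecture.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the lecture): size of an address space with
 * `addr_bits` address bits, expressed in units of 2^unit_log2 bytes
 * (30 for GiB, 60 for EiB). Dividing powers of two means subtracting
 * exponents, which avoids overflowing 64-bit arithmetic when addr_bits is 64. */
uint64_t address_space_size(unsigned addr_bits, unsigned unit_log2) {
    assert(addr_bits >= unit_log2 && addr_bits - unit_log2 < 64);
    return UINT64_C(1) << (addr_bits - unit_log2);
}
```

Calling `address_space_size(32, 30)` gives 4 (GiB) and `address_space_size(64, 60)` gives 16 (EiB).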
Loading a Program: Populating the Address Space
When you execute a program, the OS loader reads your executable file and populates this address space. It sets up several distinct regions, or segments, each with a specific purpose.
Here is a typical layout for a Linux process, starting from low memory addresses and going up:
- Unused: The very lowest part of memory (around address 0) is intentionally left unmapped. This helps catch errors, as trying to use a NULL pointer will access this invalid region and cause an immediate crash (a segmentation fault).
- Read-Only Segment (.text, .rodata): This is where the program’s machine code (the compiled instructions) and any read-only data (like string literals) are loaded. It’s marked as read-only, and the hardware enforces this, to prevent a program from accidentally or maliciously modifying its own instructions.
- Read/Write Segment (.data, .bss): This segment holds global and static variables that are explicitly initialized (.data) or left without an initializer and therefore zero-initialized (.bss).
- Run-time Heap: This is a large, flexible area of memory for dynamic allocation. When your program needs memory at runtime (e.g., for a data structure whose size isn’t known at compile time), it requests it from the heap using functions like malloc(). The heap grows upwards from the top of the data segment.
- Shared Libraries: Modern programs rely on shared libraries (like the standard C library). The OS maps these libraries into the middle of the address space so that multiple processes can share the same code in physical memory.
- User Stack: This is where local variables, function arguments, and return information are stored. The stack is crucial for managing function calls. By convention, it starts at a high address and grows downwards.
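To see these regions on a real machine, one can print an address from each of them. This is a minimal sketch assuming a Linux-like system; `show_regions` and the two globals are hypothetical names chosen for the illustration, and the printed values differ between systems and runs.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 7;   /* has an initializer: expected in .data */
int uninitialized_global;     /* no initializer: expected in .bss */

/* Hypothetical demo function: prints one address from each region.
 * Returns 0 on success, -1 if the heap allocation fails. */
int show_regions(void) {
    int local = 0;                    /* lives in this function's stack frame */
    const char *literal = "hello";    /* points into read-only data */
    int *heap_block = malloc(sizeof *heap_block);
    if (heap_block == NULL) {
        return -1;
    }

    /* Assumption: converting a function pointer via uintptr_t for printing
     * is not strictly portable C, but it works on Linux. */
    printf("code    %p\n", (void *)(uintptr_t)&show_regions);
    printf("rodata  %p\n", (void *)literal);
    printf("data    %p\n", (void *)&initialized_global);
    printf("bss     %p\n", (void *)&uninitialized_global);
    printf("heap    %p\n", (void *)heap_block);
    printf("stack   %p\n", (void *)&local);

    free(heap_block);
    return 0;
}
```

On a typical Linux system the code and data addresses come out low and the stack address comes out high, matching the layout described above.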
The Stack
Languages that support recursion, like C, rely on a stack data structure to manage function calls. The code must be reentrant, meaning a single function can have multiple active instances at the same time (e.g., amI() calling amI()). Each instance needs its own private state.
- Stack Frames: The stack is allocated in chunks called frames. Each time a function is called, a new frame is “pushed” onto the stack. When the function returns, its frame is “popped.”
- Stack Discipline: This follows a strict Last-In, First-Out (LIFO) order. The callee always returns before the caller.
Stack Frames in Detail
A stack frame contains all the necessary information for a single function activation:
- Local Variables: Variables declared inside the function.
- Return Information: The address in the caller’s code to return to when the function finishes.
- Temporary Space: For intermediate calculations.
The CPU uses two special registers to manage the stack:
- Frame Pointer: Points to a fixed location within the current frame (often the beginning).
- Stack Pointer: Points to the “top” of the stack (the lowest memory address currently in use by the stack).
Visualizing the Call Chain
Let’s trace a series of function calls to see the stack in action. Consider a program where yoo() calls who(), and who() calls the recursive function amI().
- yoo() calls who(): When who() is called, a new frame for who is pushed onto the stack, on top of yoo’s frame.
- who() calls amI(): A frame for the first activation of amI is pushed.
- amI() calls amI(): A second amI frame is pushed. The stack now has four frames: yoo, who, amI, amI.
- Functions Return: As each amI returns, its frame is popped. Eventually, who returns, its frame is popped, and finally yoo returns, leaving the stack empty.
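The fact that each activation gets its own frame can be observed from within C. The following is a sketch only: this amI takes two parameters that the lecture's amI() does not have, added here so that each activation can compare the address of its own local variable with the caller's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of amI. Returns 1 if, in every activation, the local
 * variable sits at a different address than the caller's local, 0 otherwise.
 * `remaining` is the number of further recursive calls to make; pass NULL
 * as caller_local for the outermost call. */
int amI(int remaining, const int *caller_local) {
    int local = remaining;   /* private to this activation's frame */
    int distinct = (caller_local == NULL) || (&local != caller_local);

    assert(remaining >= 0);
    if (remaining > 0) {
        /* The callee gets its own frame while this one is still active. */
        return distinct && amI(remaining - 1, &local);
    }
    return distinct;
}
```

A call such as `amI(3, NULL)` creates four activations that are active at the same time, and it returns 1 because no two of them share storage for `local`.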
Pointers in C
With our understanding of memory, we can now define a pointer precisely.
What is a Pointer?
A pointer is a variable whose value is the memory address of another variable.
It’s that simple. It’s not the data itself; it’s the location of the data.
Addresses and the & Operator
To get the memory address of a variable, we use the unary & (address-of) operator.
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int x, y;
        int a[2];

        // Use %p in printf to print addresses (pointers) in hexadecimal.
        // %p expects a void *, so each address is cast explicitly.
        printf("x is at %p\n", (void *)&x);
        printf("y is at %p\n", (void *)&y);
        printf("a[0] is at %p\n", (void *)&a[0]);
        printf("a[1] is at %p\n", (void *)&a[1]);
        // Yes, functions have addresses too! Converting a function pointer
        // to void * is not strictly portable C, but it works on Linux.
        printf("main is at %p\n", (void *)&main);
        return 0;
    }

Running this code prints addresses that reveal the memory layout. The exact values depend on the system and can change from run to run.
Notice that the local variables (x, y, a) have high memory addresses. This is because they live on the stack, which grows downwards from the top of the address space. The main function’s address is low because it lives in the read-only code segment.
Declaring and Dereferencing Pointers
We use the asterisk * for two distinct pointer operations.
- Declaration: To declare a pointer variable, we specify the type of data it will point to, followed by an asterisk.

      type *name; // Declares a pointer named 'name' that can hold the address of a 'type'

  Example:

      int x = 42;
      int *p;  // p is a pointer to an integer.
      p = &x;  // p now stores the address of x. p "points to" x.

- Dereferencing: To access the value at the address stored in a pointer, we use the unary * (dereference or indirection) operator.

      v = *pointer;      // v gets the value that 'pointer' points to.
      *pointer = value;  // The memory location that 'pointer' points to gets a new value.

  Example:

      int x = 42;
      int *p = &x;  // p points to x
      printf("The value of x is %d\n", x);      // Prints 42
      *p = 99;  // Go to the address stored in p (x's address) and write 99 there.
      printf("The value of x is now %d\n", x);  // Prints 99

  We changed x without ever mentioning x by name! We did it indirectly through the pointer p.
Visualizing Pointers: Box and Arrow Diagrams
The best way to reason about pointers is with “box and arrow” diagrams. We draw boxes for memory locations, showing their address, name (if any), and value. If a value is an address, we draw an arrow from it to the location it points to.
Double Pointers
You can have a pointer to a pointer. This is declared with two asterisks (**). A double pointer stores the address of another pointer.
    int x = 1;
    int *p = &x;
    int **dp = &p; // dp points to p, which points to x.

To get to the value of x from dp, you must dereference twice: **dp.
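A common reason to use a double pointer is to let a function change where the caller's pointer points. The sketch below illustrates this; `retarget` is a hypothetical helper name, not part of the lecture.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: makes the caller's pointer (*pp) point at new_target.
 * pp is a pointer to a pointer, so writing through *pp changes the caller's
 * pointer variable itself, not the int it points to. */
void retarget(int **pp, int *new_target) {
    assert(pp != NULL);
    *pp = new_target;
}
```

With `int x = 1, y = 2; int *p = &x;`, the call `retarget(&p, &y)` leaves x and y untouched but makes p point to y, so `*p` then reads 2.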
Address Space Layout Randomization (ASLR) and NULL
- ASLR: You may notice that the addresses of your variables change every time you run your program. This is a security feature called Address Space Layout Randomization (ASLR). By randomizing the base addresses of the stack, heap, and libraries, it makes it much harder for attackers to exploit memory corruption bugs.
- NULL: There is a special pointer value called NULL. It is guaranteed not to point to any valid object or function (it is usually represented as address 0). It’s incredibly useful as a sentinel value to indicate that a pointer “doesn’t point to anything.” Dereferencing a NULL pointer is undefined behavior in C; on Linux it normally causes an immediate segmentation fault, because the lowest part of the address space is left unmapped. NULL is typically defined as ((void *)0), a constant of type void *, the generic pointer type.
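As a small illustration of NULL as a sentinel, consider a search function that returns NULL for "not found", paired with a caller-side check before dereferencing. Both function names are hypothetical and chosen only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns a pointer to the first element equal to
 * `wanted`, or NULL as a sentinel meaning "not found". */
const int *find_value(const int *values, size_t count, int wanted) {
    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (values[i] == wanted) {
            return &values[i];
        }
    }
    return NULL;
}

/* Hypothetical helper: dereferences p only after checking it against NULL. */
int value_or_default(const int *p, int fallback) {
    return (p != NULL) ? *p : fallback;
}
```

The important habit is the check in `value_or_default`: a pointer that may be NULL is compared against NULL before it is dereferenced.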
Pointer Arithmetic
You can perform a limited set of arithmetic operations on pointers. The most common is adding an integer to a pointer. This is where pointer types become critical.
The Golden Rule of Pointer Arithmetic
When you add an integer n to a pointer p, the compiler does not add n to the raw address. Instead, it advances the pointer by n elements of the type it points to. The actual address calculation is:

$$\text{new address} = \text{old address} + n \cdot \text{sizeof}(*p)$$
Example:
    int arr[3] = {2, 3, 4};
    int *p = &arr[1]; // p points to the '3'

Let’s trace what happens with arithmetic.
- *p += 1;
  - This is not pointer arithmetic. It dereferences p to get the value 3.
  - It increments that value to 4.
  - The memory at arr[1] now holds 4. The pointer p itself is unchanged.
- p += 1;
  - This is pointer arithmetic. We are adding 1 to the pointer p.
  - p is an int *, and sizeof(int) is 4 bytes on typical platforms.
  - The new address will be address_of_arr[1] + 1 * 4.
  - p now points to the next integer in memory, which is arr[2].
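The scaling by the element size can be measured directly by viewing the same two addresses as char pointers, whose element size is 1. This is a minimal sketch; `int_step_in_bytes` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes between p and p + n for an int
 * pointer. Both addresses are viewed as char pointers, so the difference is
 * counted in bytes. p + n must stay inside the array that p points into
 * (or one element past its end). */
ptrdiff_t int_step_in_bytes(const int *p, ptrdiff_t n) {
    assert(p != NULL);
    const int *q = p + n;   /* advances by n elements, not n bytes */
    return (const char *)q - (const char *)p;
}
```

For the array above, `int_step_in_bytes(&arr[1], 1)` equals `sizeof(int)`, which is the `1 * 4` from the trace when int is 4 bytes wide.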
The Importance of Type: int * vs. char *
Let’s see a detailed example of how type affects pointer arithmetic.
    int arr[3] = {1, 2, 3};
    int *int_ptr = &arr[0];
    char *char_ptr = (char *) int_ptr; // A char pointer pointing to the same location

- int_ptr += 1;
  - sizeof(int) is 4. The address in int_ptr increases by 4.
  - It now points to arr[1]. *int_ptr is now 2.
- char_ptr += 1;
  - sizeof(char) is 1. The address in char_ptr increases by just 1.
  - It now points to the second byte of the integer arr[0].
  - On a little-endian machine, the integer 1 is stored as 01 00 00 00. The first byte is 1. The second byte is 0.
  - *char_ptr is now 0.
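The byte-by-byte view of an int can be wrapped in a small function. This sketch uses unsigned char * rather than the char * from the example above, an assumption made here so that byte values are never negative; both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns byte number `index` (in memory order) of the
 * int that p points to. */
unsigned char byte_of_int(const int *p, size_t index) {
    assert(p != NULL && index < sizeof(int));
    const unsigned char *bytes = (const unsigned char *)p;  /* same address, 1-byte steps */
    return bytes[index];                                    /* same as *(bytes + index) */
}

/* Hypothetical helper: returns 1 if the lowest-addressed byte of the int 1
 * is 1 (little-endian), 0 otherwise. */
int is_little_endian(void) {
    int one = 1;
    return byte_of_int(&one, 0) == 1;
}
```

On a little-endian machine, `byte_of_int(&arr[0], 1)` for the array above reads the same byte that `*char_ptr` reads after `char_ptr += 1`.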
Arrays and Pointers: A Close Relationship
Arrays and pointers are not the same, but they are deeply intertwined in C.
- An array is a contiguous block of memory holding elements of the same type.
- A pointer is a single variable that holds a memory address.
The connection is this: In most expressions, an array’s name “decays” into a pointer to its first element.
This means that if you have int a[10];, the following are true:
- a is equivalent to &a[0].
- a[i] is syntactic sugar for *(a + i).
This equivalence is why a[i], *(a + i), *(i + a), and even i[a] all access the same array element.
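The equivalence between indexing and pointer arithmetic can be checked with three versions of the same loop. This is a sketch with hypothetical function names; all three are expected to return the same sum for the same input.

```c
#include <assert.h>
#include <stddef.h>

/* Sums the first n elements using array indexing. */
int sum_index(const int *a, size_t n) {
    assert(a != NULL || n == 0);
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}

/* Same loop, with a[i] written out as *(a + i). */
int sum_offset(const int *a, size_t n) {
    assert(a != NULL || n == 0);
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += *(a + i);
    }
    return total;
}

/* Same result, walking a pointer from the first element to one past the last. */
int sum_walk(const int *a, size_t n) {
    assert(a != NULL || n == 0);
    int total = 0;
    for (const int *p = a; p != a + n; p++) {
        total += *p;
    }
    return total;
}
```

Passing an array name such as `a` to any of these functions hands over `&a[0]`, which is the decay described above.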
Exceptions to Array Decay
There are three important situations where an array is not converted to a pointer to its first element:
- When it’s the operand of sizeof: sizeof(a) returns the size of the entire array in bytes (10 * sizeof(int)), not the size of a pointer.
- When it’s the operand of &: &a gives you the address of the array itself. The value is the same as a (the starting address), but the type is different. It’s a “pointer to an array of 10 ints” (int (*)[10]), not a “pointer to an int” (int *).
- When it’s a string literal used to initialize a char array:

      char a[] = "Hello"; // 'a' is a 6-byte array on the stack.
      char *b = "Hello";  // 'b' is a pointer on the stack pointing to a
                          // 6-byte read-only string literal in the read-only segment.
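These exceptions can be verified with a few assertions. The sketch below is illustrative only; the macro `ARRAY_LEN` and the function `check_decay_exceptions` are hypothetical names, and the macro gives a meaningful result only where its argument is still a true array.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array. Only valid where `arr` has not decayed. */
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical demo: asserts each of the three exceptions described above. */
void check_decay_exceptions(void) {
    int a[10];
    char s[] = "Hello";       /* array initialized from a string literal */
    int (*whole)[10] = &a;    /* &a has type int (*)[10] */
    int *first = a;           /* here a decays to &a[0] */

    assert(sizeof(a) == 10 * sizeof(int));      /* whole array, not a pointer */
    assert(ARRAY_LEN(a) == 10);
    assert(sizeof(s) == 6);                     /* 5 characters plus '\0' */
    assert((void *)whole == (void *)first);     /* same address, different type */
    assert(sizeof(*whole) == sizeof(a));        /* *whole is the entire array */
}
```

None of these assertions should fire on a conforming compiler, since each one restates a rule from the list.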
Arrays as Function Parameters
When you pass an array to a function, you are always passing a pointer. The array does not get copied. The function receives a pointer to the first element of the original array.
This means these three function signatures are precisely equivalent to the compiler:
    int arrfun(int *myarray);
    int arrfun(int myarray[]);
    int arrfun(int myarray[42]); // The size is ignored by the compiler!

Inside arrfun, myarray is always treated as an int *. This is why sizeof(myarray) inside the function will return the size of a pointer, not the size of the original array.
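Because the callee only sees a pointer, the usual idiom is to pass the element count as a separate argument. The following sketch shows this, along with the fact that writes go to the caller's array; `fill` is a hypothetical function name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sets the first `count` elements to v. The parameter
 * `values` is really an int *, so the length has to be passed separately. */
void fill(int values[], size_t count, int v) {
    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        values[i] = v;   /* writes into the caller's array, not a copy */
    }
}
```

After `int data[42] = {0}; fill(data, 42, 9);`, every element of `data` in the caller holds 9, because no copy of the array was made.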
Passing by Value vs. Passing by Reference
This leads to a final, critical topic: how C passes arguments to functions.
- Pass-by-value (the default): C passes a copy of the argument’s value to the function. If the function modifies its parameter, it’s only modifying the local copy. The original variable in the caller is unaffected.
This is why the classic swap function fails:
    void swap(int a, int b) { // a and b are copies
        int tmp = a;
        a = b;
        b = tmp;
    } // The copies are swapped, then destroyed. The originals are untouched.
- Pass-by-reference (the C way): C itself only ever passes by value. To allow a function to modify the caller’s variables, we don’t pass the variables themselves. We pass pointers to them; the pointers are copied, but the copies still refer to the original variables.

      void swap(int *addr_a, int *addr_b) { // Takes pointers as arguments
          int tmp = *addr_a;   // Dereference to get the value
          *addr_a = *addr_b;   // Dereference to assign to the original location
          *addr_b = tmp;
      }

      // In main:
      int x = 42, y = -7;
      swap(&x, &y); // Pass the addresses of the variables

  Now, swap receives the addresses of x and y. By dereferencing those addresses, it directly manipulates the variables in main’s stack frame, and the swap works as intended.
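The same pointer-passing technique is commonly used to return more than one result from a function. Below is a minimal sketch of such "output parameters"; `min_max` and its parameter names are hypothetical and not taken from the lecture.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: stores the smallest and largest element through the
 * two output pointers. Returns 0 on success, or -1 (leaving the outputs
 * untouched) if there are no elements. */
int min_max(const int *values, size_t count, int *min_out, int *max_out) {
    assert(min_out != NULL && max_out != NULL);
    if (values == NULL || count == 0) {
        return -1;
    }
    int lo = values[0];
    int hi = values[0];
    for (size_t i = 1; i < count; i++) {
        if (values[i] < lo) lo = values[i];
        if (values[i] > hi) hi = values[i];
    }
    *min_out = lo;   /* write into the caller's variables */
    *max_out = hi;
    return 0;
}
```

A caller declares `int lo, hi;` and passes `&lo` and `&hi`, just as `&x` and `&y` are passed to swap.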
Continue here: 06 Pointers, Heap, Dynamic Memory, Structs