14 Non-Local Jumps and Coroutines

Lecture from: 04.11.2025 | Video: Videos ETHZ

The course has thus far explored the world of C, its types, data structures, pointers, and memory layout. While structs allow for abstractions, these are always rooted in the physical layout of memory. C is not an object-oriented language; it is a low-level language fundamentally close to the hardware. C serves as a portable layer of abstraction over the machine’s operations.

This philosophy contrasts with languages like Rust, which present higher-level abstractions enforced by a powerful compiler. A Rust program is a specification of semantics, whereas a C program is a set of instructions to the compiler. Because C is so low-level, it provides the building blocks to implement almost any higher-level concept, which is why runtimes for languages like Haskell are themselves written in C.

This lecture utilizes C’s low-level capabilities to explore a form of control flow that breaks standard rules. Having covered loops, conditionals, and the strict hierarchy of procedure call and ret, the focus shifts to a standard library feature that provides a mechanism for unorthodox control flow.

`setjmp()` and `longjmp()`

This mechanism is provided by two functions declared in the standard header <setjmp.h>: setjmp() and longjmp(). While rarely used by application programmers, they are powerful tools.

`setjmp()`

#include <setjmp.h>
int setjmp(jmp_buf env);

The setjmp() function takes one argument, env, which is of the opaque type jmp_buf. The C standard defines jmp_buf as an array type, so passing env to a function effectively passes a pointer to the buffer where the execution environment is stored.

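Action: When setjmp() is called, it saves the current execution environment into the env buffer. This “environment” is a snapshot of the register state needed to resume at this point: the stack pointer, the return address, and the callee-saved registers. The contents of the stack itself are not copied.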
Return Value: When called directly, setjmp() returns 0.

Its utility is revealed when paired with longjmp().

`longjmp()`

#include <setjmp.h>
void longjmp(jmp_buf env, int val);

The longjmp() function is the counterpart to setjmp(). It is a function that never returns in the conventional sense.

Action: longjmp() takes a previously saved environment env and an integer val. It restores the machine state to exactly what was saved in env. This causes execution to resume as if the original setjmp() call had just returned for a second time.
The “New” Return: On this second return, setjmp() does not return 0. Instead, it returns the val that was passed to longjmp(). (If val=0 is passed, setjmp() returns 1 to ensure the second return is always non-zero).
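The return-value rule can be checked with a small sketch. The helper name second_return_value is hypothetical, and it stores the result of setjmp() in a variable in the same way the exercise later in these notes does.

```c
#include <setjmp.h>

/* Hypothetical helper: reports what setjmp() returns the second time
   when longjmp() is called with `val`. */
int second_return_value(int val) {
    jmp_buf env;
    int r = setjmp(env);      /* 0 on the direct call */
    if (r == 0) {
        longjmp(env, val);    /* does not return; control reappears at setjmp */
    }
    return r;                 /* val, or 1 if val was 0 */
}
```

Calling second_return_value(5) yields 5, while second_return_value(0) yields 1, because a second return of 0 would be indistinguishable from the direct call.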

Constraints on longjmp()

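A saved environment can be the target of longjmp() only while the function that called setjmp() is still active. Within that window the same env may be jumped to more than once (the exercise later in this lecture jumps to the same env three times).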

It is invalid to longjmp() to an environment if the function that called setjmp() has already returned. The stack frame containing the context for that setjmp would be gone, and jumping back would restore the machine to an invalid state.

A single line of code, the call to setjmp(), can return multiple times with different return values.
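A common pattern that respects these constraints is bailing out of deeply nested calls back to a frame that is still active. The following is a minimal sketch under that assumption; the names on_error, descend, and run are hypothetical.

```c
#include <setjmp.h>

static jmp_buf on_error;   /* filled in by run(), which stays active throughout */

/* Recurses up to depth 8 and aborts the whole call chain at `fail_at`. */
static void descend(int depth, int fail_at) {
    if (depth == fail_at) {
        longjmp(on_error, depth + 1);   /* +1 keeps the value non-zero at depth 0 */
    }
    if (depth < 8) {
        descend(depth + 1, fail_at);
    }
}

/* Returns the depth at which the abort happened, or -1 if none did. */
int run(int fail_at) {
    int code = setjmp(on_error);
    if (code != 0) {
        return code - 1;    /* arrived here via longjmp() */
    }
    descend(0, fail_at);    /* run() has not returned, so on_error is valid */
    return -1;
}
```

The jump is legal because run() is still on the stack whenever descend() executes. Saving the environment in one function, returning from it, and jumping later from somewhere else would violate the rule above.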

A Toy Example

This example illustrates the control flow.

#include <stdio.h>
#include <setjmp.h>
 
static jmp_buf buf;
 
void second(void) {
    printf("second\n");
    longjmp(buf, 1);
}
 
void first(void) {
    second();
    printf("first\n"); // This line is never reached!
}
 
int main() {
    if (!setjmp(buf)) {
        // Block A: Runs on the FIRST return from setjmp (which is 0)
        first();
    } else {
        // Block B: Runs on the SECOND return from setjmp (which is 1)
        printf("main\n");
    }
    return 0;
}

Execution Trace:

main starts. It calls setjmp(buf).
setjmp() saves the current state into buf and returns 0.
The condition !0 is true. Block A is executed.
main calls first().
first() calls second().
second() prints "second".
second() calls longjmp(buf, 1). longjmp does not return to second().
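Instead, longjmp() restores the machine state from buf. The stack pointer and callee-saved registers are reset to their saved values, which discards the frames of first() and second() without running any code in them, and execution jumps directly back to the point where setjmp() was called in main.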
The setjmp() call now returns for a second time, with the return value 1.
The condition !1 is false. Block B is executed.
printf("main\n") is executed.
main returns 0.

Output:

second
main

Notice that printf("first\n") is never executed. The longjmp performed a non-local jump, bypassing the normal function return stack.

Implementing `setjmp()` and `longjmp()`

This mechanism is a direct manipulation of the machine’s state.

The `jmp_buf` Environment

The jmp_buf is a structure that holds the snapshot of the machine state. On x86-64, it must store:

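The values of all callee-saved registers (%rbx, %rbp, %r12-%r15). Caller-saved registers are not stored: setjmp is an ordinary function call from the compiler’s point of view, so the caller already assumes those registers may be clobbered across it.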
The stack pointer (%rsp) of the function that called setjmp.
The instruction pointer (%rip), which is the address to return to.

X86-64 `setjmp()` Implementation (from Musl C library)

The instructions below follow the musl implementation for x86-64; the comments are explanatory. The listing uses AT&T syntax, in which # starts a comment.

setjmp:
    # %rdi holds the pointer to the jmp_buf 'env'
    # Save all callee-saved registers into the buffer at their respective offsets
    mov %rbx, (%rdi)
    mov %rbp, 8(%rdi)
    mov %r12, 16(%rdi)
    mov %r13, 24(%rdi)
    mov %r14, 32(%rdi)
    mov %r15, 40(%rdi)
    
    # Calculate and save the caller's stack pointer.
    # (%rsp) currently holds our return address, so %rsp + 8 is the value
    # the caller's stack pointer will have once we return.
    lea 8(%rsp), %rdx
    mov %rdx, 48(%rdi)
    
    # Get our own return address (from the top of our stack) and save it.
    # This is where longjmp will jump back to.
    mov (%rsp), %rdx
    mov %rdx, 56(%rdi)
    
    # Per the C standard, the direct call to setjmp must return 0.
    xor %eax, %eax
    ret
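The offsets used in the listing above imply a layout of eight 8-byte slots. As an illustration only, that layout can be modelled as a C struct. The name saved_context is hypothetical, and this is not the real definition of jmp_buf, which is opaque and may hold additional fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the eight slots written by the setjmp listing.
   This is NOT the C library's definition of jmp_buf. */
struct saved_context {
    uint64_t rbx;   /* offset  0 */
    uint64_t rbp;   /* offset  8 */
    uint64_t r12;   /* offset 16 */
    uint64_t r13;   /* offset 24 */
    uint64_t r14;   /* offset 32 */
    uint64_t r15;   /* offset 40 */
    uint64_t rsp;   /* offset 48: caller's stack pointer */
    uint64_t rip;   /* offset 56: address to resume at */
};

/* Compile-time checks that the model matches the offsets in the assembly. */
_Static_assert(offsetof(struct saved_context, rsp) == 48, "rsp slot");
_Static_assert(offsetof(struct saved_context, rip) == 56, "rip slot");
```

Viewed as an array of 64-bit words, the stack pointer sits in slot 6 and the instruction pointer in slot 7.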

X86-64 `longjmp()` Implementation

longjmp is the inverse operation.

longjmp:
    # %rdi holds the jmp_buf pointer, %rsi holds the 'val' to return
    
    # Ensure the return value is not 0. This implements the (val ? val : 1) logic:
    # cmp sets the carry flag only when val is 0, and adc adds that carry to val.
    xor %eax, %eax
    cmp $1, %esi
    adc %esi, %eax
    
    # Restore all the saved state FROM the buffer
    mov (%rdi), %rbx       # Restore all callee-saved registers
    mov 8(%rdi), %rbp
    mov 16(%rdi), %r12
    mov 24(%rdi), %r13
    mov 32(%rdi), %r14
    mov 40(%rdi), %r15
    mov 48(%rdi), %rsp     # CRITICAL: Restore the caller's stack pointer
    
    # Jump to the saved return address without altering the (now restored) stack.
    jmp *56(%rdi)

This code does not use the ret instruction. It manually restores the stack pointer and all the callee-saved registers. Then, it performs an indirect jump to the saved instruction pointer. This instantly unwinds the stack and resumes execution as if setjmp had just returned, but with the new value in %eax.
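The xor, cmp, adc sequence at the top of longjmp is a branch-free way of computing val ? val : 1. The following sketch mirrors the three instructions in C so the arithmetic can be followed step by step; the function names are hypothetical.

```c
#include <stdint.h>

/* Mirrors: xor %eax,%eax / cmp $1,%esi / adc %esi,%eax */
uint32_t adc_normalize(uint32_t val) {
    uint32_t eax = 0;              /* xor %eax, %eax */
    uint32_t carry = (val < 1u);   /* cmp $1, %esi: carry set iff val < 1 unsigned, i.e. val == 0 */
    eax = eax + val + carry;       /* adc %esi, %eax: add with carry */
    return eax;
}

/* The same rule written directly. */
uint32_t plain_normalize(uint32_t val) {
    return val ? val : 1u;
}
```

For val == 0 the carry is 1 and the result is 0 + 0 + 1 = 1. For any other val the carry is 0 and the result is val unchanged.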

Practice: Control Flow Trace

Tracing setjmp/longjmp requires keeping track of the return values and where the stack is.

Exercise: Predict the Output

What is the output of the following C program?

#include <stdio.h>
#include <setjmp.h>
 
jmp_buf env;
 
void f(int n) {
    printf("f(%d) enter\n", n);
    if (n > 0) longjmp(env, n);
    printf("f(%d) leave\n", n);
}
 
int main() {
    int r = setjmp(env);
    if (r <= 2) {
        printf("main r=%d\n", r);
        f(r + 1);
    }
    printf("done\n");
    return 0;
}

Solution:

setjmp returns 0. main prints main r=0.
f(1) is called. Prints f(1) enter.
longjmp(env, 1) jumps back to setjmp.
setjmp returns 1. main prints main r=1.
f(2) is called. Prints f(2) enter.
longjmp(env, 2) jumps back to setjmp.
setjmp returns 2. main prints main r=2.
f(3) is called. Prints f(3) enter.
longjmp(env, 3) jumps back to setjmp.
setjmp returns 3. if (r <= 2) is now false.
main prints done.

Total Output:

main r=0
f(1) enter
main r=1
f(2) enter
main r=2
f(3) enter
done

Why is this useful? Coroutines

This mechanism is key to implementing coroutines, a powerful programming paradigm. A coroutine is a generalization of a subroutine. While a subroutine has one entry point and returns to its caller, a coroutine can be suspended, transfer control to another coroutine, and later be resumed exactly where it left off.

The Producer-Consumer Problem

Imagine a decompressor that produces characters and a lexer that consumes them.

The decompressor is naturally written with emit(c).
The lexer is naturally written with c = getchar().

Their interfaces are incompatible. The conventional solution is to rewrite one as a complex state machine.
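To make the cost of the conventional solution concrete, here is a sketch of a decompressor rewritten as a state machine so that a consumer can pull one character at a time. The run-length format of (count, byte) pairs and the names rle_state and rle_getchar are assumptions chosen for illustration, not part of the lecture.

```c
#include <stddef.h>

/* Hypothetical input format: a sequence of (count, byte) pairs. */
struct rle_state {
    const unsigned char *in;   /* compressed input */
    size_t len;                /* length of the input in bytes */
    size_t pos;                /* index of the next unread pair */
    unsigned remaining;        /* copies of `current` still to hand out */
    int current;               /* byte being repeated */
};

/* Returns the next decompressed byte, or -1 at end of input.
   Everything that would be a loop variable in the natural version
   has to live in *s so it survives between calls. */
int rle_getchar(struct rle_state *s) {
    while (s->remaining == 0) {
        if (s->pos + 1 >= s->len) {
            return -1;                     /* no complete pair left */
        }
        s->remaining = s->in[s->pos];      /* count */
        s->current = s->in[s->pos + 1];    /* byte */
        s->pos += 2;
    }
    s->remaining--;
    return s->current;
}
```

Even for this tiny format, the two nested loops of the natural version have been flattened into explicit fields that are saved and reloaded on every call. A real decompressor has far more state to manage this way.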

The Coroutine Solution

Ideally, these two functions would run as peers, passing control back and forth. This is called cooperative multitasking, and setjmp/longjmp enable it.

This is the concept of a continuation:

The decompressor runs until it has a character. It saves its state and switches to the lexer.
The lexer continues where it left off, processes the character, and runs until it needs a new one.
It saves its state and switches back to the decompressor.
The decompressor continues exactly where it left off.

Implementing Coroutines

A minimal coroutine library can be built using setjmp and longjmp.

The coroutine struct:

struct coroutine {
    void *stack;       // A separate stack for this coroutine
    jmp_buf env;       // The saved context (registers, rip, rsp)
    co_start_fn *start;// The function this coroutine will run
    void *arg;         // The argument to that function
};

Each coroutine gets its own stack and its own jmp_buf to save its context.

The co_switchto function: This is the heart of the system.

void *co_switchto(struct coroutine *next, void *arg) {
    // 1. Save the context of the CURRENT coroutine
    if (setjmp(cur_co->env) == 0) {
        // This is the first return from setjmp. We are switching AWAY.
        cur_co = next;            // Update the global pointer to the next coroutine
        cur_co->arg = arg;        // Pass the argument to the next coroutine
        longjmp(cur_co->env, 1);  // Jump to the NEXT coroutine's saved context
    }
    // 2. This code is executed on the SECOND return from setjmp.
    // We have just been switched BACK TO.
    return cur_co->arg; // Return the argument that was passed to us
}

This is the context switch. When Coroutine A calls co_switchto(B, arg), it saves its state in its own jmp_buf and then longjmps to the state saved in B’s jmp_buf. Execution resumes in Coroutine B at the point where B itself last called co_switchto, and that call now returns arg. This symmetric handoff is the basic building block on which threads and other forms of concurrency are built.

Initialization (The Hard Bit)

Starting a new coroutine is tricky because it doesn’t have a saved jmp_buf to jump to. This is solved with a small, machine-dependent hack.

When creating a new coroutine in co_new:

calloc a new stack and a new coroutine struct.
Call setjmp on the new jmp_buf to fill it with valid placeholder values.
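Then manually overwrite two key fields in the jmp_buf:
- Set the saved stack pointer (__jmpbuf[6]) to the top of the newly allocated stack. Because the stack grows downward on x86-64, the top is the highest address of the allocation.
- Set the saved instruction pointer (__jmpbuf[7]) to the address of a special start_cl wrapper function.

Slots 6 and 7 are the 8-byte words at byte offsets 48 and 56, the same positions the assembly above uses for %rsp and %rip. This layout is private to the C library, which is what makes the step machine-dependent.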

Now, the first time longjmp jumps to this new coroutine, it will start executing the start_cl function on its own private stack. That wrapper then calls the user’s desired function.

Putting It All Together

With this library, the decompressor and lexer can be written in their natural style.

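emit(c) becomes a macro, DEC_PUTCHAR(c), which expands to a co_switchto() call that targets lexer_co and passes c as the argument.
getchar() becomes a macro, LEX_GETCHAR(), which expands to a co_switchto() call that targets decompressor_co and evaluates to the character handed back.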

The main function simply initializes the library, creates the two coroutines, and kicks off the process by switching to one of them. When control eventually returns to main, the work is done.
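The setjmp-based library depends on the private layout of jmp_buf, so it cannot be shown here in a portable, runnable form. As a stand-in, the following sketch gets the same ping-pong structure from a different mechanism, the POSIX ucontext functions (getcontext, makecontext, swapcontext), which are available in glibc. The toy run-length format, the word-counting lexer, and all names are assumptions made for illustration.

```c
#include <stddef.h>
#include <ucontext.h>

static ucontext_t lexer_ctx, decomp_ctx;   /* one saved context per coroutine */
static int channel;                        /* character in flight, -1 = end of input */

/* Producer side: "push" style. Hands over one character and suspends. */
static void emit(int c) {
    channel = c;
    swapcontext(&decomp_ctx, &lexer_ctx);  /* save decompressor, resume lexer */
}

/* Toy decompressor written as plain nested loops over (count, byte) pairs. */
static void decompressor(void) {
    static const unsigned char input[] = { 2, 'a', 1, ' ', 3, 'b' };
    for (size_t i = 0; i + 1 < sizeof input; i += 2)
        for (unsigned n = 0; n < input[i]; n++)
            emit(input[i + 1]);
    for (;;)
        emit(-1);                          /* keep signalling end of input */
}

/* Consumer side: "pull" style. Suspends until a character is available. */
static int next_char(void) {
    swapcontext(&lexer_ctx, &decomp_ctx);  /* save lexer, resume decompressor */
    return channel;
}

/* Toy lexer: counts space-separated words. Runs on the caller's stack. */
int count_words(void) {
    static char stack[64 * 1024];          /* private stack for the decompressor */
    int words = 0, in_word = 0, c;

    getcontext(&decomp_ctx);               /* fill the context with valid values */
    decomp_ctx.uc_stack.ss_sp = stack;
    decomp_ctx.uc_stack.ss_size = sizeof stack;
    decomp_ctx.uc_link = NULL;             /* decompressor() never returns */
    makecontext(&decomp_ctx, decompressor, 0);   /* set entry point and stack */

    while ((c = next_char()) != -1) {
        if (c == ' ') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            words++;
        }
    }
    return words;
}
```

Here makecontext plays the role that overwriting the saved stack pointer and instruction pointer plays in co_new, and swapcontext plays the role of co_switchto. The input decompresses to "aa bbb", so count_words() returns 2. Neither routine had to be turned into a state machine: the decompressor keeps its loop counters on its own stack between switches.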

Coroutines are Not Threads

It is crucial to understand what this coroutine package is and is not.

What it is: A generalization of subroutines, also known as lightweight threads, fibers, or cooperative multitasking. The context switches are directed and explicit.
What is missing:
- True Concurrency/Parallelism: This is all single-threaded. Only one coroutine runs at a time.
- Scheduling: There is no scheduler deciding who runs next. The programmer explicitly names the next coroutine to run.
- Blocking: If a coroutine makes a blocking system call (like reading from a file), the entire program blocks, because no other coroutine can run in the meantime.
- Preemption: A coroutine will run forever unless it voluntarily yields by calling co_switchto. There is no mechanism to interrupt it.

To get true pre-emptive threads, two more things are needed: a scheduler and processor exceptions (interrupts) to asynchronously trigger the scheduler and switch coroutines. Processor exceptions + coroutines = threads and processes. This combination is one of the most fundamental abstractions in systems software.

Continue here: 15 Linking
