10 X86 Architecture and Machine Level Programming

Lecture from: 21.10.2025 | Video: Videos ETHZ

The course has thus far operated at the level of C: writing code, compiling it, and running it. The next step is to examine the transformation process that turns C code, a human-readable text file, into something the processor can execute. This requires an understanding of the machine itself.

This chapter introduces the language of the hardware: assembly language and the underlying Instruction Set Architecture (ISA). The x86 architecture serves as the primary example. While x86 is neither the only nor the simplest architecture, its history has made it dominant in desktops and servers, and understanding it provides insight into how modern computers operate.

What is an Instruction Set Architecture?

Distinguishing between architecture and microarchitecture is crucial.

Architecture (or ISA): This is the programmer’s view of the processor. It is the abstract interface defining what the hardware can do. It includes the available instructions (e.g., add, mov), programmer-visible registers, and the memory model.
Microarchitecture: This is the implementation of the architecture. It is the specific arrangement of transistors, caches, and pipelines a particular chip uses to execute the instructions defined by the ISA.

The ISA as a Contract

The ISA acts as a contract between hardware and software. The software (like a compiler) promises to only generate instructions defined in the ISA. The hardware promises that any of its implementations (microarchitectures) will correctly execute those instructions. This separation allows a program compiled today to run on a processor built years later, provided they adhere to the same ISA.

There are many ISAs, each with its own history and design philosophy, such as x86, ARM, RISC-V, and MIPS.

The Great Debate: CISC vs. RISC

Processor architectures have historically been divided into two main camps.

CISC: Complex Instruction Set Computer

Dominant through the mid-80s, with x86 as its most famous example, CISC aimed to make hardware powerful and the compiler’s job easy.

Complex, Powerful Instructions: CISC ISAs have instructions performing multi-step operations. A single x86 instruction can read a value from a complex memory address, perform an arithmetic operation, and write the result back.
```
addl %eax, 12(%rbx,%rcx,4)
```
This involves a complex address calculation (%rbx + %rcx*4 + 12), a memory read, an addition, and a memory write.
Variable-Length Instructions: Instructions are encoded using a variable number of bytes to save memory.
Memory Operands: Arithmetic instructions can often operate directly on a memory operand without first loading it into a register. On x86, most instructions allow at most one memory operand.
Philosophy: Add instructions to perform “typical” programming tasks.
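
To make the work packed into the `addl %eax, 12(%rbx,%rcx,4)` example concrete, here is a rough C model of its four steps. The function name `addl_to_memory` and the byte array standing in for memory are assumptions made for illustration and do not come from the lecture.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of: addl %eax, 12(%rbx,%rcx,4)
 * `mem` stands in for memory; rbx and rcx are treated as byte offsets into it. */
void addl_to_memory(uint8_t *mem, uint64_t rbx, uint64_t rcx, uint32_t eax)
{
    uint64_t addr = rbx + rcx * 4 + 12;          /* 1. address calculation */
    uint32_t value;
    memcpy(&value, mem + addr, sizeof value);    /* 2. memory read */
    value += eax;                                /* 3. 32-bit addition (wraps) */
    memcpy(mem + addr, &value, sizeof value);    /* 4. memory write */
}
```

A load-store (RISC) machine would need a separate instruction for roughly each of these steps.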

RISC: Reduced Instruction Set Computer

Pioneered at IBM and popularized by researchers at Stanford (MIPS) and Berkeley (the RISC project, a predecessor of today's RISC-V), RISC aimed to make hardware simple and fast, leaving complex tasks to the compiler.

Fewer, Simpler Instructions: A small set of basic operations.
Fixed-Size Instructions: Every instruction is the same length (e.g., 32 bits), simplifying decoding.
Load-Store Architecture: Only dedicated load and store instructions can access memory. Arithmetic instructions operate only on registers.
More Registers: Simpler logic allows for more general-purpose registers.
Philosophy: A simple, uniform instruction set enables faster and more efficient hardware implementation.

CISC vs. RISC Today

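The debate has largely subsided: thanks to Moore's Law, transistor budgets grew large enough that the complexity of the instruction set stopped being the limiting factor.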

Desktops and Servers: The choice of ISA is no longer a primary technical issue. Modern x86 processors translate their CISC instructions into simpler, RISC-like micro-operations internally.
Embedded Processors: For low-power devices, RISC maintains an edge due to smaller, cheaper, and lower-power designs.
Non-Technical Factors: Factors like the software ecosystem, code compatibility, licensing models, and geopolitics are often more important than technical purity.

A Brief History of x86

The x86 architecture has a long history driven by Moore’s Law and backward compatibility.

1971: Intel 4004: The first commercial microprocessor, a 4-bit CPU.
1978: Intel 8086: Intel's first 16-bit processor and the origin of the x86 architecture. It had a 1 MB address space.
1985: Intel 80386 (i386): The first 32-bit x86 processor (IA32). It introduced “flat addressing” and could run modern OSs like Unix.
~2003: AMD Opteron / Intel Pentium 4F: The introduction of the 64-bit extension (x86-64). AMD developed it first, and Intel adopted it. The extension also doubled the number of general-purpose registers from 8 to 16.
Present Day: Modern processors have tens of billions of transistors and numerous instruction set extensions (MMX, SSE, AVX).

Remarkably, code written for the 8086 can often still run on a modern CPU.

Basics of x86 Machine Code

Bridging the gap between C and the machine involves understanding how a C function becomes executable bytes.

Compiling into Assembly

Consider a simple C function:

int sum(int x, int y) {
    int t = x + y;
    return t;
}

The GCC compiler can stop after generating the assembly language file:

gcc -O0 -S code.c

This produces a human-readable text file, code.s, containing x86 assembly code.

The assembly code consists of mnemonics (like pushq, movl, addl) and operands (like %rbp, %edi).

The Assembly Programmer’s View

An assembly programmer works with a simple machine model:

Programmer-Visible State:
- Program Counter (%rip on x86-64): Holds the address of the next instruction to execute.
- Register File: A small, fast set of storage locations inside the CPU. x86-64 has 16 general-purpose 64-bit integer registers.
- Condition Codes: Single-bit flags storing status information about the most recent arithmetic or logical operation. Conditional branches read them.
Memory: A large, byte-addressable array holding code, data, and the stack.
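
The condition codes in the list above can be made concrete with a small sketch. The following C model computes four flags as a 32-bit addition would leave them. The flag names (CF, ZF, SF, OF) are the standard x86 names but are not spelled out in the lecture text, and the `struct flags` type and `flags_after_add32` function are hypothetical helpers introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for four of the x86 condition codes. */
struct flags {
    bool cf;  /* carry: the unsigned result did not fit in 32 bits */
    bool zf;  /* zero: the result is 0 */
    bool sf;  /* sign: the top bit of the result is set */
    bool of;  /* overflow: the signed result did not fit in 32 bits */
};

/* Flags as a 32-bit addition a + b would leave them. */
struct flags flags_after_add32(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;                        /* wraps modulo 2^32 */
    struct flags f;
    f.cf = sum < a;                              /* wrapped around */
    f.zf = sum == 0;
    f.sf = (sum >> 31) != 0;
    /* Signed overflow: both inputs differ in sign from the result. */
    f.of = (((a ^ sum) & (b ^ sum)) >> 31) != 0;
    return f;
}
```

For example, adding 1 to 0x7fffffff sets SF and OF but not CF, while adding 1 to 0xffffffff sets CF and ZF.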

From Assembly to Object Code

The assembly text file (.s) is fed to an assembler, which translates mnemonics into binary machine code, producing an object file (.o).

Linking resolves references between files and combines them with libraries to create the final executable. A disassembler (like objdump -d or gdb’s disassemble) translates machine code back into assembly.

A Machine Instruction Example

Consider a single line of C, its assembly, and its object code.

C Code: int t = x+y;
Assembly: addl 8(%rbp), %eax
Object Code: 03 45 08

This 3-byte instruction performs the following:

addl means “add long” (32 bits).
The operands are 8(%rbp) (source) and %eax (destination).
It adds the 32-bit integer at address %rbp + 8 to the value in %eax, storing the result in %eax.
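
To see how the three bytes `03 45 08` encode the operands, it helps to split the middle byte into its bit fields. The sketch below decodes an x86 ModRM byte. The `struct modrm` type and `decode_modrm` function are hypothetical names chosen for this example, and the lecture itself does not go into the encoding.

```c
#include <stdint.h>

/* Fields of an x86 ModRM byte (hypothetical struct name). */
struct modrm {
    uint8_t mod;  /* bits 7..6: addressing form */
    uint8_t reg;  /* bits 5..3: register operand */
    uint8_t rm;   /* bits 2..0: base register or second register operand */
};

/* Split a ModRM byte into its three fields. */
struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m;
    m.mod = (byte >> 6) & 0x3;
    m.reg = (byte >> 3) & 0x7;
    m.rm  = byte & 0x7;
    return m;
}
```

For the byte 0x45 this yields mod = 1 (a one-byte displacement follows), reg = 0 (%eax), and rm = 5 (%rbp). The leading 0x03 is the opcode for adding a register-or-memory operand into a 32-bit register, and the trailing 0x08 is the displacement.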

x86-64 Architecture in Detail

Registers

The x86-64 architecture provides 16 general-purpose 64-bit integer registers.

The 64-bit registers are named %rax, %rbx, …, %r15.
The lower 32 bits are %eax, %ebx, …, %r15d.
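The lower 16 bits are %ax, %bx, …, %r15w.
The lower 8 bits are %al, %bl, …, %r15b. The legacy names %ah, %bh, %ch, and %dh refer instead to bits 8 to 15 of %rax, %rbx, %rcx, and %rdx.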
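
The sub-register names are views onto parts of one 64-bit register. As a minimal sketch, the C helpers below extract the same bit ranges from a 64-bit value. The function names are made up for this example.

```c
#include <stdint.h>

/* Each helper returns the bits that the named sub-register of %rax would expose. */
uint32_t low32(uint64_t r) { return (uint32_t)r; }          /* like %eax: bits 0..31 */
uint16_t low16(uint64_t r) { return (uint16_t)r; }          /* like %ax:  bits 0..15 */
uint8_t  low8(uint64_t r)  { return (uint8_t)r; }           /* like %al:  bits 0..7  */
uint8_t  second_byte(uint64_t r) { return (uint8_t)(r >> 8); } /* like %ah: bits 8..15 */
```

With `r = 0x1122334455667788`, these return 0x55667788, 0x7788, 0x88, and 0x77 respectively.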

Special roles by convention:

%rsp: Stack pointer.
%rbp: Base pointer (frame pointer).
%rdi, %rsi, %rdx, %rcx, %r8, %r9: Integer/pointer arguments.
%rax: Return value.

Moving Data: The `mov` Instruction

The mov instruction copies data between registers and memory. Suffixes indicate the operand size: movb (byte, 8 bits), movw (word, 16 bits), movl (long, 32 bits), movq (quad word, 64 bits).

Operand Types

Immediate: Constant integer value, prefixed with $ (e.g., movl $0x400, %eax).
Register: Value in a register (e.g., movq %rax, %rbx).
Memory: Value stored at a memory address, where the address is given by an addressing mode (e.g., movq (%rax), %rbx).

Memory-to-Memory Transfers

A single mov instruction cannot have both source and destination as memory locations. Data must be loaded into a register first.
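
In C terms, a copy such as `*dst = *src` therefore splits into a load followed by a store. The sketch below writes the two steps out explicitly. The function name `copy_int` is hypothetical, and the instructions in the comments are only what a compiler might plausibly emit, assuming `dst` arrives in %rdi and `src` in %rsi.

```c
#include <stdint.h>

/* Copies one 32-bit value from *src to *dst.
 * The local `tmp` plays the role of the intermediate register. */
void copy_int(int32_t *dst, const int32_t *src)
{
    int32_t tmp = *src;   /* load:  roughly movl (%rsi), %eax */
    *dst = tmp;           /* store: roughly movl %eax, (%rdi) */
}
```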

Simple Memory Addressing Modes

Normal: (%rcx)
- Uses value in %rcx as memory address.
- C analog: *p.
Displacement: 8(%rbp)
- Adds a constant offset to the register value.
- C analog: p->field or local variable.
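
The two C analogs above can be sketched with a small struct. The `struct record` type and its field names are hypothetical. The fixed-width types are used so that the second field sits at offset 8, matching the displacement in the `8(%rbp)` example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type, used only to illustrate the two modes. */
struct record {
    int64_t id;     /* at offset 0 */
    int32_t value;  /* at offset 8 */
};

/* Normal mode: the pointer itself is the address, as in (%rdi). */
int64_t read_id(const struct record *p) { return p->id; }

/* Displacement mode: pointer plus a constant offset, as in 8(%rdi). */
int32_t read_value(const struct record *p) { return p->value; }
```

Here `offsetof(struct record, value)` evaluates to 8, which is the constant a compiler would place in front of the parentheses.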

Understanding `swap`: A Complete Example

Tracing the unoptimized assembly for a simple C swap function:

void swap(int *xp, int *yp) {
    int t0 = *xp;
    int t1 = *yp;
    *xp = t1;
    *yp = t0;
}

With optimizer off (-O0), the compiler uses the stack for temporary variables.

Function Prologue:

pushq   %rbp
movq    %rsp, %rbp

This saves the old base pointer and sets up a new stack frame.

Body: Tracing int t0 = *xp;:

movq    -24(%rbp), %rax   # Move xp (from stack) into register %rax
movl    (%rax), %eax      # Dereference xp: load value from address in %rax into %eax
movl    %eax, -8(%rbp)    # Store that value into t0's location on the stack

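The compiler loads the pointer xp from its stack slot, dereferences it, and stores the result in the stack slot reserved for t0. At -O0 the compiler typically copies the register arguments to the stack first, which is why xp is read from -24(%rbp) rather than directly from %rdi.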

With the Optimizer On (-O2):

The compiler keeps everything in registers.

swap:
    movl    (%rdi), %edx   # t0 = *xp (xp is in %rdi)
    movl    (%rsi), %eax   # t1 = *yp (yp is in %rsi)
    movl    %eax, (%rdi)   # *xp = t1
    movl    %edx, (%rsi)   # *yp = t0
    retq

Temporary variables t0 and t1 reside in %edx and %eax without touching memory for storage.
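
As a reading aid, the optimized sequence can be written back in C with one statement per instruction. This is only a sketch: the function name `swap_like_o2` is made up, and the locals are named after the registers they stand for.

```c
/* C rendering of the -O2 instruction sequence, one statement per instruction. */
void swap_like_o2(int *xp, int *yp)
{
    int edx = *xp;   /* movl (%rdi), %edx */
    int eax = *yp;   /* movl (%rsi), %eax */
    *xp = eax;       /* movl %eax, (%rdi) */
    *yp = edx;       /* movl %edx, (%rsi) */
}
```

Both loads happen before either store, so the sequence also behaves correctly when xp and yp point to the same int.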

Complete Memory Addressing Modes

The general form is:

$D(R_b, R_i, S)$

This computes an address as:

$\text{Address} = \text{Reg}[R_b] + \text{Reg}[R_i] \times S + D$

$D$: Constant displacement.
$R_b$: Base register.
$R_i$: Index register.
$S$: Scale factor (1, 2, 4, or 8).
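
The address formula above translates directly into C. The helper below is a sketch with a hypothetical name, `effective_address`. Its parameters `base` and `index` stand for the current contents of the base and index registers.

```c
#include <stdint.h>

/* Address named by the operand D(Rb, Ri, S).
 * base and index are the current contents of Rb and Ri. */
uint64_t effective_address(int64_t d, uint64_t base, uint64_t index, unsigned scale)
{
    /* A negative displacement wraps modulo 2^64, which subtracts from the base. */
    return base + index * scale + (uint64_t)d;
}
```

For instance, `effective_address(12, 0x1000, 3, 4)` gives 0x1018, the address that `12(%rbx,%rcx,4)` would name if %rbx held 0x1000 and %rcx held 3.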

Why these scale factors?

Scale factors 1, 2, 4, and 8 correspond to sizes of common data types (char, short, int/float, long/double/pointer). This facilitates array element address calculation: &array[i] becomes address_of_array + i * sizeof(element).
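
The array-indexing rule can be checked by computing an element address by hand. This is a minimal sketch with a hypothetical function name. It assumes a flat address space in which pointers convert to integers in the obvious way, as on x86-64.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte address of array[i], computed as base + i * sizeof(element). */
uintptr_t element_address(const int32_t *array, size_t i)
{
    return (uintptr_t)array + i * sizeof array[0];   /* scale factor 4 for int32_t */
}
```

Under that assumption, the result equals `(uintptr_t)&array[i]`, which is the address a scaled-index operand such as `(%rdi,%rsi,4)` would name.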

The `lea` Instruction: Address Calculation as Arithmetic

The lea (Load Effective Address) instruction performs the address calculation of the general addressing mode but, instead of accessing memory at that address, stores the calculated address itself into the destination register.

lea Src, Dest

Main uses:

Computing addresses: Corresponds to C’s & operator (e.g., p = &x[i];).
Fast arithmetic: Computes x + k*y in a single instruction, where k is 1, 2, 4, or 8.
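
The arithmetic use can be illustrated with C expressions that have exactly the shape lea computes. The function names below are hypothetical, and whether a compiler actually emits lea for them depends on the compiler and its optimization settings. The instructions in the comments show what would fit, not guaranteed output.

```c
#include <stdint.h>

/* x * 5 written as x + 4*x, the shape leaq (%rdi,%rdi,4), %rax can compute. */
int64_t times5(int64_t x) { return x + 4 * x; }

/* x + 8*y + 16, the shape leaq 16(%rdi,%rsi,8), %rax can compute. */
int64_t affine(int64_t x, int64_t y) { return x + 8 * y + 16; }
```

In both cases the addition, the scaling, and the constant are folded into a single address-style calculation with no memory access.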

Practice: Data Movement and Addressing

Understanding which mov operations are legal is essential for reading assembly.

Exercise: Valid/Invalid `mov` Instructions

Consider a 64-bit system. Identify which of these are ILLEGAL and why.

movq $0x1, $0x2
movl %eax, (%rsp)
movb (%rdi), (%rsi)
movw %ax, %bx

Solutions:

movq $0x1, $0x2: Illegal. An immediate cannot be a destination; the destination must be a register or memory.
movl %eax, (%rsp): Legal. Moves a 32-bit register value into a memory location.
movb (%rdi), (%rsi): Illegal. Memory-to-memory transfers are not allowed in a single mov instruction. You must go through a register.
movw %ax, %bx: Legal. Moves a 16-bit register value to another 16-bit register.

Exercise: `lea` Arithmetic

What is the value of %rax after this instruction, if %rdx = 10?

leaq 5(%rdx, %rdx, 4), %rax

Solution:

Formula: Base + Index * Scale + Displacement
10 + 10 * 4 + 5 = 10 + 40 + 5 = 55.
Result: %rax = 55.

Continue here: 11 Assembly Control Flow and Conditionals
