Welcome to the world of finite automata, the simplest formal model of computation. Imagine a machine with strictly limited memory, capable only of remembering its current “state” from a finite set. (Think of a traffic light: it can only be red, yellow, or green, and its next color depends only on its current color and a timer.) It reads an input string, one symbol at a time, from left to right. Based on the symbol it reads and its current state, it transitions to a new state. It has no memory of how it got there, other than what the state itself represents.

This model, despite its simplicity, is incredibly powerful and forms the basis of many real-world applications, from the lexical analysis phase of a compiler to pattern matching in a text editor. In this chapter, we will use finite automata as a gentle introduction to the core concepts of computation: configurations, computation steps, determinism, nondeterminism, and the fundamental question of what a given computational model can and cannot do.

3.1 Aims

By the end of this chapter, you will be able to:

  1. Model Simple Computations: Understand how to formally define a computation using the model of a Deterministic Finite Automaton (DFA) and apply this model to solve simple decision problems.
  2. Master DFA Design: Develop a core design strategy for building DFAs that recognize specific patterns or “languages,” focusing on the “state as memory” principle.
  3. Grasp Key Computational Concepts: Build intuition for fundamental ideas like configurations, computation steps, determinism, nondeterminism, and simulation in a simple, tangible context, providing a foundation for more complex models.
  4. Explore Nondeterminism: Learn about Nondeterministic Finite Automata (NFAs), understand their relationship to DFAs via the subset construction, and appreciate the trade-offs between expressive power and size.
  5. Prove Limitations: Learn how to formally prove that certain problems cannot be solved by finite automata using powerful techniques like the Pumping Lemma for Regular Languages and the Kolmogorov Complexity Method.

3.2 Representations of Finite Automata

At its heart, a computational model is defined by the answers to a few basic questions: What are its elementary operations? How much memory does it have? How does it get input and produce output?

For a finite automaton, memory is its most constrained aspect: it possesses no explicit memory beyond its inherent program structure. The machine cannot use variables to store data. The only information it has is its current location in the program, its current “state.”

The most intuitive way to visualize finite automata is as a directed graph where nodes are states and edges are transitions. This visual representation often provides the quickest path to understanding how a DFA works before diving into the formal definitions.

3.2.1 Formal Definition of a Deterministic Finite Automaton (DFA)

To reason about these machines precisely, we move from the graphical intuition to a formal mathematical definition.

Definition 3.1 (Deterministic Finite Automaton)

A deterministic finite automaton (DFA) is a 5-tuple M = (Q, Σ, δ, q0, F), where:

  1. Q is a finite, non-empty set of states. These represent the machine’s limited memory.
  2. Σ is a finite, non-empty set called the input alphabet. These are the symbols the machine can read.
  3. δ: Q × Σ → Q is the transition function. It’s a total function that takes the current state and the current input symbol and determines the unique next state. (Think of it as a rule: “IF I am in state q AND I read symbol a, THEN I MUST go to state p = δ(q, a)”).
  4. q0 ∈ Q is the start state. The state the machine begins in before processing any input.
  5. F ⊆ Q is the set of accepting states (or final states). If the machine finishes processing input in one of these states, the input word is accepted.

To define what it means for a DFA to “run” and process an input, we introduce the following concepts:

  • A configuration of a DFA is a pair (q, w) ∈ Q × Σ*, where q is the current state and w is the unread portion of the input. (Imagine it as a snapshot of the machine’s current situation: “where am I now, and what’s left to read?”). The initial configuration for an input x is (q0, x).
  • A computation step is a relation ⊢ on configurations. We write (q, aw) ⊢ (p, w) if δ(q, a) = p. This describes how the machine moves from one configuration to the next by reading one symbol and changing state.
  • A computation on an input x is a finite sequence of computation steps starting from the initial configuration (q0, x) and ending in a configuration (q, ε) in which the input is empty.
  • A DFA M accepts a word x if its computation on x ends in an accepting state, i.e., in a configuration (q, ε) with q ∈ F. Otherwise, M rejects the word.

The language accepted by a DFA M, denoted L(M), is the set of all words that M accepts. Languages that can be accepted by a DFA are called regular languages.
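These definitions translate directly into code. The following sketch (the names `run_dfa`, `delta`, etc. are illustrative, not from the text) traces a DFA’s computation as the sequence of configurations (state, unread input):

```python
def run_dfa(delta, q0, accepting, word):
    """Trace a DFA computation as a list of configurations (state, unread input)."""
    configs = [(q0, word)]
    q, w = q0, word
    while w:
        # one computation step: (q, aw) |- (delta(q, a), w)
        q, w = delta[(q, w[0])], w[1:]
        configs.append((q, w))
    return configs, q in accepting  # accept iff the final state is in F

# Example DFA over {0, 1} accepting words with an even number of 1s.
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd",   ("odd", "1"): "even"}
trace, accepted = run_dfa(delta, "even", {"even"}, "1011")
```

For the input "1011" the trace starts at ("even", "1011") and ends at ("odd", ""), so the word is rejected.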

Definition 3.2 (Extended Transition Function)

To simplify notation and describe the state after processing an entire string, we define the extended transition function δ̂: Q × Σ* → Q. It is defined recursively:

  1. δ̂(q, ε) = q for any state q ∈ Q. (Processing the empty string doesn’t change the state).
  2. δ̂(q, wa) = δ(δ̂(q, w), a) for any state q ∈ Q, string w ∈ Σ*, and symbol a ∈ Σ. (Process w, then apply the transition for a).

Using this, the language of a DFA can be defined concisely as: L(M) = { w ∈ Σ* | δ̂(q0, w) ∈ F }.
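The recursion above can be written down almost verbatim. A minimal sketch (the helper names `delta_hat` and `accepts` are mine):

```python
def delta_hat(delta, q, w):
    """Extended transition function:
    delta_hat(q, '') = q;  delta_hat(q, w + a) = delta(delta_hat(q, w), a)."""
    if w == "":
        return q
    return delta[(delta_hat(delta, q, w[:-1]), w[-1])]

def accepts(delta, q0, F, w):
    """Membership test: w is in L(M) iff delta_hat(q0, w) is in F."""
    return delta_hat(delta, q0, w) in F

# Example: DFA over {0, 1} accepting words with an even number of 1s.
delta = {("e", "0"): "e", ("e", "1"): "o",
         ("o", "0"): "o", ("o", "1"): "e"}
```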

3.2.2 Designing DFAs: State as Memory

The key to designing a DFA is to determine what finite information the machine needs to remember as it reads the input. Each state effectively acts as a tiny memory cell, representing a specific property or summary of the prefix read so far. For example, a state might remember if an even or odd number of ‘0’s has been encountered, or the longest suffix of the input read so far that is also a prefix of a target pattern.

Design Strategy: The Meaning of States

  1. Identify Required Information: Determine the distinct properties or states of knowledge about the input prefix that you need to track to solve the problem (i.e., to decide whether the input belongs to the language).
  2. Assign States: Assign one unique state to each of these properties. These states collectively form the set Q.
  3. Define Transitions: For each state q and each input symbol a, determine how reading a changes the property represented by q. This defines the transition δ(q, a).
  4. Designate Start State: The start state q0 usually represents the property of having read the empty prefix.
  5. Designate Accepting States: The set F of accepting states consists of all states that represent properties corresponding to the acceptance condition of the language.

DFA to accept strings containing "001"

Let’s design a DFA for the language L = { w ∈ {0,1}* | w contains 001 as a substring }.

We need to remember how much of the “001” pattern we have just seen as a suffix.

  • State Meanings:
    • q0: The initial state. No useful prefix of “001” has been seen yet, or the last part seen doesn’t contribute to “001” (e.g., ends in 1).
    • q1: The suffix seen so far is “0”. We are looking for “01”.
    • q2: The suffix seen so far is “00”. We are looking for “1”.
    • q3: We have seen “001”. This is an accepting state, and once this pattern is found, any further input doesn’t change the acceptance outcome.
  • Transitions:
    • From q0:
      • If we read a 1, we are still looking for “001” from scratch. So, δ(q0, 1) = q0.
      • If we read a 0, we have “0”. Move to q1. So, δ(q0, 0) = q1.
    • From q1 (suffix is “0”):
      • If we read a 0, the suffix is “00”. Move to q2. So, δ(q1, 0) = q2.
      • If we read a 1, the suffix is “01”. This breaks the “001” pattern that started with “0”. We effectively reset back to the state where we’ve seen nothing productive, which is q0. So, δ(q1, 1) = q0.
    • From q2 (suffix is “00”):
      • If we read a 1, the pattern “001” is complete. Move to q3. So, δ(q2, 1) = q3.
      • If we read a 0, the suffix is “000”. This still ends in “00”, so we remain in q2. So, δ(q2, 0) = q2.
    • From q3 (seen “001”):
      • Any further input (0 or 1) means the string still contains “001”. Stay in q3. So, δ(q3, 0) = q3, δ(q3, 1) = q3.
  • Start State: q0.
  • Accepting States: F = {q3}.

The resulting state diagram elegantly captures this logic.
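The transition table above maps directly onto a dictionary. A minimal sketch of this particular DFA (the function name `contains_001` is illustrative):

```python
# Transition table of the DFA from the example (states q0..q3, F = {q3}).
DELTA = {("q0", "0"): "q1", ("q0", "1"): "q0",
         ("q1", "0"): "q2", ("q1", "1"): "q0",
         ("q2", "0"): "q2", ("q2", "1"): "q3",
         ("q3", "0"): "q3", ("q3", "1"): "q3"}

def contains_001(word):
    state = "q0"                      # start state
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state == "q3"              # accept iff the pattern "001" was seen
```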


3.3 Simulations and Closure Properties

A powerful technique in computer science is to show that one model can simulate another. We can use this idea to combine automata and prove important properties about the class of regular languages. Understanding closure properties is crucial because it tells us that if we have machines for two languages, we can build a new machine for their union, intersection, etc., without leaving the class of regular languages.

Lemma 3.2 (Closure under Set Operations)

Let L1 and L2 be two regular languages. Then the languages L1 ∪ L2, L1 ∩ L2, and L1 \ L2 are also regular.
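The usual proof of such closure lemmas is the product construction: run DFAs for L1 and L2 in lockstep on pairs of states, and choose the accepting pairs according to the desired set operation. A sketch, assuming each DFA is given as a (transition table, start state, accepting set) triple:

```python
def product_accepts(dfa1, dfa2, word, mode):
    """Decide membership in L1 ∪ L2, L1 ∩ L2, or L1 \\ L2 via the product DFA."""
    (d1, s1, F1), (d2, s2, F2) = dfa1, dfa2
    p, q = s1, s2
    for a in word:                    # both machines step on the same symbol
        p, q = d1[(p, a)], d2[(q, a)]
    if mode == "union":
        return p in F1 or q in F2
    if mode == "intersection":
        return p in F1 and q in F2
    if mode == "difference":          # L1 \ L2
        return p in F1 and q not in F2
    raise ValueError(mode)

# Example: L1 = even number of 0s, L2 = ends with 1.
D1 = ({("e", "0"): "o", ("e", "1"): "e",
       ("o", "0"): "e", ("o", "1"): "o"}, "e", {"e"})
D2 = ({("s", "0"): "s", ("s", "1"): "t",
       ("t", "0"): "s", ("t", "1"): "t"}, "s", {"t"})
```

Only the choice of accepting pairs changes between the operations; the pair-state machinery is identical.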


3.4 Proofs of Nonexistence

How can we be sure that a language like L = { 0^n 1^n | n ∈ ℕ } is not regular? We need a way to prove that no DFA, no matter how cleverly designed, can accept it. The core limitation of a DFA is its finite memory (its states). If a string is long enough, the DFA must repeat a state, creating a loop. This observation is the key to techniques like the Pumping Lemma and the application of Kolmogorov Complexity.

Lemma 3.3

Let M = (Q, Σ, δ, q0, F) be a finite automaton. Let x, y ∈ Σ*, x ≠ y, such that

δ̂(q0, x) = δ̂(q0, y) = p

for a p ∈ Q (i.e., x and y lead M to the same state). Then for every z ∈ Σ* there exists an r ∈ Q such that xz and yz lead to the same state r, i.e., in particular

xz ∈ L(M) ⟺ yz ∈ L(M).

3.4.1 The Pumping Lemma for Regular Languages

The Pumping Lemma for Regular Languages is a fundamental tool used to prove that certain languages are not regular. At its heart, it captures the idea that any finite automaton, when processing a sufficiently long string, must eventually repeat a state. This repetition implies a loop in its computation, and anything in that loop can be “pumped” (repeated or removed) without changing whether the string is accepted.

Lemma 3.4 (The Pumping Lemma for Regular Languages)

If L is a regular language, then there exists a constant n0 (called the pumping length, which is at least the number of states in a DFA for L) such that any string w ∈ L with |w| ≥ n0 can be divided into three parts, w = xyz, satisfying the following conditions:

  1. |xy| ≤ n0. (The “pumpable” part must occur within the first n0 symbols of w).

  2. |y| ≥ 1. (The middle part must not be empty, meaning there is actually a loop).

  3. For all i ≥ 0, the string x y^i z is also in L. (This is the “pumping” part: we can repeat the middle segment any number of times (including zero) and the resulting word will still be in the language).

How to Use the Pumping Lemma (Proof by Contradiction):

The Pumping Lemma is primarily used to prove that a language is not regular. The procedure is as follows:

  1. Assume for Contradiction: Assume that the language L is regular.
  2. Apply Pumping Lemma: By the Pumping Lemma, there must exist a pumping length n0.
  3. Choose a “Clever” String: Select a specific word w ∈ L such that |w| ≥ n0. This choice is critical and must be strategic, often picking a word that highlights the “non-regular” property (e.g., w = 0^(n0) 1^(n0) for L = { 0^n 1^n | n ∈ ℕ }).
  4. Consider All Decompositions: Consider all possible ways to decompose w into xyz that satisfy conditions (1) and (2) of the Pumping Lemma (|xy| ≤ n0 and |y| ≥ 1).
  5. Find a Contradiction: For each valid decomposition, find at least one value of i for which the “pumped” string x y^i z is not in L.
  6. Conclude: Since we found a contradiction to the Pumping Lemma, our initial assumption that L is regular must be false. Therefore, L is not a regular language.
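The loop behind the lemma can be made concrete: run a DFA, record the position at which each state is first visited, and split the word around the first repeated state. The sketch below (names are mine) finds such a decomposition w = xyz for the “contains 001” DFA of Section 3.2.2:

```python
def pump_decomposition(delta, q0, word):
    """Split word into (x, y, z) where y drives the DFA around a state loop."""
    seen = {q0: 0}           # state -> position at which it was first reached
    q = q0
    for i, a in enumerate(word, start=1):
        q = delta[(q, a)]
        if q in seen:        # state repeats => the DFA ran through a loop
            j = seen[q]
            return word[:j], word[j:i], word[i:]
        seen[q] = i
    return None              # word shorter than the number of states

# "contains 001" DFA from Section 3.2.2:
DELTA = {("q0", "0"): "q1", ("q0", "1"): "q0",
         ("q1", "0"): "q2", ("q1", "1"): "q0",
         ("q2", "0"): "q2", ("q2", "1"): "q3",
         ("q3", "0"): "q3", ("q3", "1"): "q3"}
x, y, z = pump_decomposition(DELTA, "q0", "0001")
# x, y, z == "00", "0", "1": every pumped word x + y*i + z still contains "001".
```

Since this language is regular, pumping keeps the word in the language, exactly as condition (3) promises; the non-regularity proofs below exploit languages where this must fail.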

Proving L = { 0^n 1^n | n ∈ ℕ } is Not Regular Using Lemma 3.3

Intuitively, L = { 0^n 1^n | n ∈ ℕ } should be hard for any DFA because a DFA would need to “count” the number of zeros to ensure it matches the number of ones. However, a DFA has a fixed, finite number of states, meaning it cannot count arbitrarily high. We can formalize this intuition using Lemma 3.3.

  1. Assume for Contradiction: Assume that L is regular.
  2. DFA Existence: Then there exists a DFA M that accepts L. Let m = |Q| be the number of states in M.
  3. Consider a Set of Words: Consider the m + 1 words 0^1, 0^2, …, 0^(m+1).
  4. Apply Pigeonhole Principle: When M processes these words, it must end in some state after reading each word. Since there are m + 1 words and only m states, by the Pigeonhole Principle, there must exist two distinct words 0^i and 0^j (where i < j) such that M ends in the same state after reading them. That is, δ̂(q0, 0^i) = δ̂(q0, 0^j) = p for some state p.
  5. Apply Lemma 3.3: According to Lemma 3.3, if δ̂(q0, 0^i) = δ̂(q0, 0^j), then for any suffix z ∈ Σ*, it must hold that 0^i z ∈ L ⟺ 0^j z ∈ L.
  6. Find a Contradiction: Let’s choose a specific suffix z = 1^i.
    • Consider the word 0^i 1^i. This word is clearly in L by definition.
    • Now consider the word 0^j 1^i. Since j > i, the number of zeros (j) is strictly greater than the number of ones (i). Therefore, 0^j 1^i ∉ L.
  7. Contradiction: We have 0^i 1^i ∈ L but 0^j 1^i ∉ L, which contradicts the conclusion from Lemma 3.3 that 0^i 1^i ∈ L ⟺ 0^j 1^i ∈ L.
  8. Conclusion: Our initial assumption that L is regular must be false. Therefore, L is not a regular language.

3.4.2 Proving Non-Regularity using Kolmogorov Complexity

Another powerful method for proving that a language is not regular stems from Kolmogorov complexity. Regular languages are processed by finite automata, which have finite memory. This implies that regular languages cannot “count” arbitrarily high or store unbounded information about the input prefix. Kolmogorov complexity provides a formal framework for this intuition.

Theorem 3.1 (Kolmogorov Complexity and Regular Languages)

If L is a regular language over Σ, then there exists a constant c such that for every x ∈ Σ* and every n ∈ ℕ, the Kolmogorov complexity of the n-th word y in L_x is bounded:

K(y) ≤ ⌈log₂(n + 1)⌉ + c,

where y is the n-th word in L_x = { z ∈ Σ* | xz ∈ L } (the sublanguage consisting of suffixes z such that xz (for the fixed prefix x) is in L), ordered canonically.

How to Use Kolmogorov Complexity to Prove Non-Regularity:

This method also proceeds by contradiction:

  1. Assume for Contradiction: Assume the language L is regular.
  2. Apply Theorem 3.1: Choose a sequence of prefixes x and consider the corresponding languages L_x.
  3. Find a Contradiction: Show that for some word y ∈ L_x, its Kolmogorov complexity grows faster than ⌈log₂(n + 1)⌉ + c (where n is the canonical index of y in L_x), thus contradicting Theorem 3.1. This is often done by choosing y to be an incompressible string or one whose information content is provably high.

Proving L = { 0^n 1^n | n ∈ ℕ } is Not Regular Using Kolmogorov Complexity

  1. Assume L = { 0^n 1^n | n ∈ ℕ } is regular. By Theorem 3.1, there exists a constant c such that K(y) ≤ ⌈log₂(n + 1)⌉ + c for the n-th word y in L_x.
  2. Consider the prefix x = 0^m for an arbitrarily large m. The language L_{0^m} = { z ∈ {0,1}* | 0^m z ∈ L } = { 0^k 1^(m+k) | k ∈ ℕ }. The shortest of these suffixes is 1^m (taking k = 0).
  3. So, 1^m ∈ L_{0^m}, and 1^m is the 1st word (n = 1) in L_{0^m} in canonical order.
  4. According to Theorem 3.1, K(1^m) ≤ ⌈log₂(1 + 1)⌉ + c = c + 1, a constant independent of m.
  5. However, any program that generates 1^m (e.g., by printing ‘1’ m times) must encode the number m itself, so K(1^m) ≥ K(m) − c′ for some constant c′; and for infinitely many m, K(m) ≥ log₂ m. Hence K(1^m) ≥ log₂ m − c′ for infinitely many m.
  6. Comparing the two bounds: for sufficiently large such m, log₂ m − c′ > c + 1. This is a contradiction.
  7. Therefore, L is not regular.

This method often feels more intuitive for some problems, especially those involving “counting” or “matching” patterns where finite memory proves insufficient.


3.5 Nondeterminism

What if we allow a machine to have multiple possible next states for the same input symbol? Or even to change state without reading any input at all (known as ε-transitions; some texts write λ-transitions)? This is the concept of nondeterminism. (Imagine the machine “guessing” the correct path to an accepting state, or exploring all paths simultaneously.)

3.5.1 Formal Definition of a Nondeterministic Finite Automaton (NFA)

Definition 3.3 (Nondeterministic Finite Automaton)

A nondeterministic finite automaton (NFA) is a 5-tuple M = (Q, Σ, δ, q0, F), where all components are defined similarly to a DFA, except for the transition function:

  1. Q is a finite, non-empty set of states.
  2. Σ is a finite, non-empty set called the input alphabet.
  3. δ: Q × Σ → P(Q) (or δ: Q × (Σ ∪ {ε}) → P(Q) for NFAs with ε-transitions) is the transition function. Here, P(Q) is the power set of Q, meaning the transition function maps to a set of possible next states.
  4. q0 ∈ Q is the start state.
  5. F ⊆ Q is the set of accepting states.

An NFA accepts a word if there is at least one path of transitions from the start state to an accepting state that consumes the entire word . If all possible computation paths for a word lead to a non-accepting state or get stuck (no transition defined), the word is rejected.

NFA for strings ending with "01"

Let’s design an NFA for L = { w ∈ {0,1}* | w ends with 01 }. This NFA can “guess” when it is about to see the “01” suffix.

  • q0: Initial state. Can stay here on any input, or nondeterministically guess that a 0 is the start of the “01” suffix.

    • δ(q0, 0) = {q0, q1} (can stay in q0, or move to q1 if it sees 0).
    • δ(q0, 1) = {q0} (must stay in q0 if it sees 1).
  • q1: We have seen a 0 that might be the start of “01”. We are looking for 1.

    • δ(q1, 1) = {q2} (move to the accepting state if 1 is seen).
    • δ(q1, 0) = {q1} (if 0 is seen, we still have a 0, so we can stay in q1).
  • q2: Accepting state. We have seen “01”. It has no outgoing transitions, so a computation path reaching it survives only if the word ends here.

  • Start State: q0.

  • Accepting States: F = {q2}.

This NFA is simpler to design than a DFA for the same language: the NFA simply guesses where the final “01” begins, whereas a DFA must deterministically track, after every symbol, how much of the suffix “01” it has just seen.
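Nondeterminism is easy to simulate by tracking the set of all states the NFA could currently be in. A sketch (names are mine; transitions absent from the table map to the empty set), using the “ends with 01” NFA above:

```python
def nfa_accepts(delta, start, accepting, word):
    """Accept iff at least one computation path ends in an accepting state."""
    current = {start}
    for a in word:
        # follow every possible transition from every currently possible state
        current = {p for q in current for p in delta.get((q, a), set())}
    return bool(current & accepting)

# The "ends with 01" NFA (q2 has no outgoing transitions).
DELTA = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"},
         ("q1", "0"): {"q1"},       ("q1", "1"): {"q2"}}
```

Tracking sets of states in this way is precisely the idea the subset construction of Section 3.5.2 turns into a static DFA.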

3.5.2 Equivalence of NFAs and DFAs (Subset Construction)

Surprisingly, nondeterminism does not add any fundamental computational power to finite automata. Any language that can be recognized by an NFA can also be recognized by an equivalent DFA. This is a crucial result: it means NFAs are a convenient tool for designing automata due to their flexibility, while DFAs are what we actually implement for unambiguous execution.

Theorem 3.2 (Equivalence of NFAs and DFAs)

For every NFA M, there exists an equivalent DFA M′ such that L(M) = L(M′).
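A sketch of the construction (without ε-transitions, for simplicity): each DFA state is a frozenset of NFA states, explored breadth-first from {q0}. The names below are illustrative.

```python
from collections import deque

def subset_construction(delta, start, accepting, alphabet):
    """Build an equivalent DFA whose states are sets of NFA states."""
    start_set = frozenset({start})
    dfa_delta, states = {}, {start_set}
    queue = deque([start_set])
    while queue:
        S = queue.popleft()
        for a in alphabet:
            # the DFA successor of S is the union of all NFA successors
            T = frozenset(p for q in S for p in delta.get((q, a), set()))
            dfa_delta[(S, a)] = T
            if T not in states:
                states.add(T)
                queue.append(T)
    dfa_accepting = {S for S in states if S & accepting}
    return dfa_delta, start_set, dfa_accepting

# Applied to the "ends with 01" NFA, this yields a three-state DFA.
NFA = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"},
       ("q1", "0"): {"q1"},       ("q1", "1"): {"q2"}}
d, s, F = subset_construction(NFA, "q0", {"q2"}, "01")
```

Only states reachable from {q0} are built, so the exponential blow-up of the worst case is often avoided in practice.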


3.6 Summary

  • Finite Automata are fundamental, simple computational models with finite memory, defined by a set of states and transitions based on input symbols.
  • DFAs are deterministic, meaning for any given state and input symbol, there’s always a single, unique next state. Their design relies on encoding necessary information about the input’s history into the states.
  • Regular Languages are the class of languages recognized by DFAs (and NFAs). This class is robustly closed under common set operations like union, intersection, and complement, provable using constructive techniques like the product construction.
  • The Pumping Lemma is a critical theoretical tool for proving that a language is not regular. It highlights the finite memory limitation of DFAs by showing that sufficiently long strings in a regular language must contain a “pumpable” segment.
  • The Kolmogorov Complexity Method offers an alternative, often more intuitive, approach to proving non-regularity. It exploits the fact that regular languages cannot “compress” or “count” arbitrary amounts of information about their words or their suffixes.
  • NFAs introduce nondeterminism, allowing multiple possible transitions from a state for a given input, or even -transitions. While NFAs can be significantly more concise and easier to design for certain languages, they do not increase the fundamental computational power beyond DFAs.
  • The Subset Construction (or power set construction) provides a constructive proof of the equivalence of NFAs and DFAs by showing how to build a deterministic machine that simulates all possible paths of a nondeterministic one. This construction, however, can lead to an exponential increase in the number of states.


Exercises

Exercise 3.1 (DFA Design)

Design a DFA that accepts the language L = { w ∈ {0,1}* | |w| is even } (i.e., the length of w is even).

Exercise 3.2 (DFA Design)

Design a DFA that accepts strings over {0, 1} representing binary numbers divisible by 3.

Exercise 3.3 (Closure Properties - Product Construction)

Let L1 and L2 be regular languages given by DFAs M1 and M2. Design a DFA that accepts L1 ∩ L2 via the product construction.

Exercise 3.4 (Pumping Lemma)

Use the Pumping Lemma to prove that L = { 0^n 1^n | n ∈ ℕ } is not regular.

Exercise 3.5 (Kolmogorov Complexity for Non-Regularity)

Use the Kolmogorov Complexity Method to prove that L = { w ∈ {0,1}* | w is a palindrome } is not regular. (A palindrome reads the same forwards and backwards, e.g., 0110, 101).

Exercise 3.6 (NFA Design)

Design an NFA that accepts strings over {0, 1} containing 010 as a subword. Provide its formal definition and a state diagram.

Exercise 3.7 (Subset Construction)

Convert the NFA from Exercise 3.6 (strings containing 010) into an equivalent DFA using the Subset Construction algorithm.

Exercise 3.8 (Pumping Lemma)

Use the Pumping Lemma to prove that is not regular.


Key Takeaways

  • Finite Automata are Memory-Limited: DFAs and NFAs represent computation with finite, fixed memory (states). This is their defining characteristic and limitation.
  • DFAs are Formal Language Processors: They provide a precise mathematical model for recognizing regular languages, forming the basis of lexical analysis and simple pattern matching.
  • State as Memory: The art of DFA design lies in encoding all necessary information about the input prefix into the finite set of states.
  • Closure Properties: Regular languages are closed under fundamental set operations (union, intersection, complement), meaning applying these operations to regular languages always yields another regular language. This is often proven constructively using techniques like the product construction.
  • Proving Non-Regularity: The Pumping Lemma and Kolmogorov Complexity provide powerful formal methods to demonstrate that a language cannot be recognized by any finite automaton, thereby establishing limits of computational models.
  • Nondeterminism’s Role: NFAs simplify language description by allowing multiple transitions. Crucially, they do not increase the computational power beyond DFAs; any NFA can be converted to an equivalent DFA (via Subset Construction), though this may involve an exponential increase in states. This highlights that nondeterminism can offer conciseness, but not fundamental power, for this class of automata.