Chapter 5 - Computability

In the previous chapters, we built up our understanding of computation, culminating in the Turing machine, a model that captures the full power of any conceivable algorithm. We have focused on what machines can do. Now, we pivot to a more profound question: What are the absolute limits of computation? Are there problems that no algorithm, no matter how clever, can ever solve? This question is fundamental because it defines the inherent boundaries of what computers, and by extension, we, can ever hope to achieve algorithmically.

The answer is a resounding yes. This chapter introduces computability theory, the field that classifies problems into those that are algorithmically solvable (decidable) and those that are not (undecidable). We will develop two of the most powerful proof techniques in computer science: diagonalization and reduction.

First, we will use a simple counting argument from set theory to show that there are fundamentally more problems than there are algorithms, which implies the existence of unsolvable problems. Then, we will use diagonalization to construct a concrete example of an uncomputable problem. Finally, we will use the method of reduction to prove that many other important problems, including the famous Halting Problem, are also undecidable.

5.1 Aims

This chapter delves into the profound question of what problems can and cannot be solved by algorithms. By exploring the theoretical boundaries of computation, we gain a deeper understanding of the inherent limitations of computers and the fundamental nature of decidability. The aims outlined below will guide our journey through the core concepts and powerful proof techniques that define computability theory.

By the end of this chapter, you will be able to:

Compare Infinite Sets: Use Cantor’s method of comparing cardinalities to understand why there are more languages than Turing machines, distinguishing between countable and uncountable infinities.
Master Diagonalization: Understand how to use the diagonalization proof technique to construct a language that is not recursively enumerable, demonstrating the existence of inherently unsolvable problems.
Master the Reduction Method: Learn how to prove a problem is undecidable by reducing a known undecidable problem to it, establishing a hierarchy of computational difficulty.
Understand Key Undecidable Problems: Grasp the nature and significance of the Universal Language ( $L_{U}$ ) and the Halting Problem ( $L_{H}$ ), two cornerstone undecidable problems.
Apply Rice’s Theorem: Use Rice’s Theorem to instantly prove that any non-trivial semantic property of programs (Turing machines) is undecidable, highlighting the limits of automatic program analysis.
Appreciate Kolmogorov Complexity in Undecidability: Understand how Kolmogorov complexity can be used to prove the algorithmic unsolvability of calculating the shortest program length, and its implications for the limits of mathematical proof.

5.2 The Method of Diagonalization

The diagonalization method, pioneered by Georg Cantor, is a proof technique for showing that some infinite sets are “larger” than others. In computability theory, we adapt it to prove that some problems cannot be solved by any algorithm. The core idea is to assume that all objects of some kind (e.g., all languages) can be arranged in a single list and then to construct a new object of the same kind that is guaranteed to differ from every item on the list. This contradicts the assumption that the list was complete. Applied to languages, the argument shows that they cannot all be listed, whereas Turing machines can, which establishes the existence of uncomputable problems.

Our first step is to show that unsolvable problems must exist. The argument is surprisingly simple: there are fundamentally “more” problems (languages) than there are algorithms (Turing machines). This numerical imbalance guarantees that some problems must be beyond algorithmic reach. To make this precise, we need a way to compare the sizes of infinite sets, a concept known as cardinality.

Definition 5.1 (Comparing Cardinalities)

Let $A$ and $B$ be two sets.

We say $|A| \leq |B|$ if there exists an injective (one-to-one) function $f : A \to B$ .
We say $|A| = |B|$ if there exists a bijective (one-to-one and onto) function $f : A \to B$ .
A set $A$ is countable if it is finite or if $|A| = |N|$ , where $N$ is the set of natural numbers. This means its elements can be put into a one-to-one correspondence with the natural numbers, allowing them to be listed. Otherwise, it is uncountable.

Examples of Countable Sets:

The set of natural numbers $N = \{0, 1, 2, \dots\}$ .
The set of integers $Z = \{\dots, -2, -1, 0, 1, 2, \dots\}$ . We can list them as $0, 1, -1, 2, -2, \dots$ .
The set of positive rational numbers $Q^{+} = \{p / q \mid p, q \in N^{+}\}$ . These can be enumerated using a zigzag pattern across a grid of $(p, q)$ pairs.
The set of all finite binary strings $Σ^{*} = \{λ, 0, 1, 00, 01, 10, 11, \dots\}$ . These can be listed in canonical order: shorter strings first, and strings of equal length in lexicographic order.
The set of all Turing machines. Since every Turing machine can be uniquely encoded as a finite binary string $⟨ M ⟩$ , and the set of all finite binary strings is countable, the set of all Turing machines is also countable.
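The canonical order of binary strings can be made concrete with a short generator. This is a minimal sketch; the function name `canonical_binary_strings` is chosen here for illustration and does not come from the text.

```python
from itertools import count, islice, product


def canonical_binary_strings():
    """Yield every finite binary string exactly once.

    Shorter strings come first; strings of equal length appear in
    lexicographic order. The empty string plays the role of lambda.
    """
    for length in count(0):
        for bits in product("01", repeat=length):
            yield "".join(bits)


# The first seven strings match the listing lambda, 0, 1, 00, 01, 10, 11.
first_seven = list(islice(canonical_binary_strings(), 7))
print(first_seven)
```

Because every string of length $n$ is reached after finitely many steps, each string receives a finite position in the list, which is exactly what countability requires.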

The Uncountability of Languages:

While the set of all Turing machines is countable, the set of all possible languages is much larger.

Theorem 5.3 (Cantor)

The set of all languages over an alphabet $Σ$ (i.e., the power set $P (Σ^{*})$ ) is uncountable.

Proof by Diagonalization

Assume for contradiction that the set of all languages over $Σ_{bool}$ (that is, the set of all subsets of $Σ_{bool}^{*}$ ) is countable. Then all languages can be arranged in one infinite list $L_{1}, L_{2}, L_{3}, \dots$ . Let $w_{1}, w_{2}, w_{3}, \dots$ be the canonical ordering of all binary strings. We can record membership in these languages in an infinite matrix $A$ , where $A_{ij} = 1$ if $w_{j} \in L_{i}$ and $A_{ij} = 0$ otherwise. For illustration, the upper-left corner of such a matrix might look like this:

	$w_{1}$	$w_{2}$	$w_{3}$	$w_{4}$	…
$L_{1}$	1	0	1	0	…
$L_{2}$	0	0	1	1	…
$L_{3}$	1	1	0	1	…
$L_{4}$	0	0	1	1	…
…	…	…	…	…	…

Now, we construct a new language, $L_{D}$ , by “flipping the diagonal.” Specifically, $L_{D}$ is defined such that for each $i$ , $w_{i} \in L_{D}$ if and only if $w_{i} \notin L_{i}$ . In terms of the matrix, $L_{D}$ is defined by taking the diagonal elements $A_{ii}$ and flipping their values.

By its very construction, $L_{D}$ cannot be any of the languages in our list. For any given language $L_{k}$ in the list, $L_{D}$ is guaranteed to differ from $L_{k}$ on the word $w_{k}$ (i.e., $w_{k} \in L_{D} ⟺ w_{k} \notin L_{k}$ ).

This is a contradiction. Our initial assumption that we could list all languages must be false. Therefore, the set of all languages is uncountable.
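The diagonal flip can be tried out on the finite corner of the matrix shown in the proof. The sketch below only illustrates the construction on a 4 by 4 excerpt; the real argument works on the infinite matrix, and the helper name `flip_diagonal` is an assumption made for this example.

```python
def flip_diagonal(matrix):
    """Return the row D with D[i] = 1 - matrix[i][i] for every index i."""
    return [1 - matrix[i][i] for i in range(len(matrix))]


# Upper-left 4x4 corner of the membership matrix from the proof.
A = [
    [1, 0, 1, 0],  # L_1
    [0, 0, 1, 1],  # L_2
    [1, 1, 0, 1],  # L_3
    [0, 0, 1, 1],  # L_4
]

D = flip_diagonal(A)
print(D)

# D disagrees with row k in column k, so D equals none of the rows.
for k, row in enumerate(A):
    assert D[k] != row[k]
```

Whatever the rows are, the flipped diagonal differs from row $k$ in position $k$, so it cannot appear anywhere in the list.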

Since there are uncountably many languages but only countably many Turing machines, there must be languages that are not recognized by any Turing machine. These languages are not just undecidable; they are not even recursively enumerable.

We can apply this same technique directly to Turing machines to construct a concrete language that is not recursively enumerable.

Definition 5.2 (The Diagonal Language)

Let $M_{1}, M_{2}, \dots$ be the canonical enumeration of all Turing machines (based on their encodings $⟨ M_{i} ⟩$ ) and $w_{1}, w_{2}, \dots$ be the canonical enumeration of all binary strings. We define the diagonal language as: $L_{diag} = \{w_{i} \mid M_{i} \text{ does not accept } w_{i}\}$ In simpler terms, $L_{diag}$ contains the $i$ -th word $w_{i}$ if and only if the $i$ -th Turing machine $M_{i}$ does not accept it. Note that $w_{i}$ is not the encoding of $M_{i}$ ; the machine and the word are paired only through their shared index $i$ , which corresponds to the diagonal of a matrix whose rows are machines and whose columns are words. This pairing is the key to proving that $L_{diag}$ is uncomputable.

Theorem 5.5

$L_{diag}$ is not recursively enumerable.

Proof by Diagonalization

Assume for contradiction that $L_{diag}$ is recursively enumerable. Then, by definition, there must exist some Turing machine, say $M_{k}$ , that accepts it, so $L (M_{k}) = L_{diag}$ .

Now, we consider the behavior of $M_{k}$ on the input $w_{k}$ , the $k$ -th word in the canonical enumeration, which shares its index with $M_{k}$ . We analyze two exhaustive possibilities:

Case 1: Assume $M_{k}$ accepts $w_{k}$ ( $w_{k} \in L (M_{k})$ ).

By the definition of $L_{diag}$ ( $w_{i} \in L_{diag}$ if and only if $M_{i}$ does not accept $w_{i}$ ), if $M_{k}$ accepts $w_{k}$ , then $w_{k}$ cannot be in $L_{diag}$ . So, $w_{k} \notin L_{diag}$ .

However, we assumed $L (M_{k}) = L_{diag}$ . If $w_{k} \in L (M_{k})$ , then it must also be that $w_{k} \in L_{diag}$ .

This leads to a contradiction: $w_{k} \notin L_{diag}$ and $w_{k} \in L_{diag}$ simultaneously.

Case 2: Assume $M_{k}$ does not accept $w_{k}$ ( $w_{k} \notin L (M_{k})$ ).

By the definition of $L_{diag}$ ( $w_{i} \in L_{diag}$ if and only if $M_{i}$ does not accept $w_{i}$ ), if $M_{k}$ does not accept $w_{k}$ , then $w_{k}$ must be in $L_{diag}$ . So, $w_{k} \in L_{diag}$ .

However, we assumed $L (M_{k}) = L_{diag}$ . If $w_{k} \notin L (M_{k})$ , then it must also be that $w_{k} \notin L_{diag}$ .

This also leads to a contradiction: $w_{k} \in L_{diag}$ and $w_{k} \notin L_{diag}$ simultaneously.

In both possible cases, we reach a logical impossibility. The only way out is that our initial assumption (that $L_{diag}$ is recursively enumerable) was wrong. Therefore, $L_{diag}$ is not recursively enumerable. This means no Turing machine can even recognize $L_{diag}$ , let alone decide it.
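The two cases can be condensed into a single chain of equivalences for the word $w_{k}$ . The first step is the definition of $L_{diag}$ , the second is the definition of $L (M_{k})$ , and the third uses the assumption $L (M_{k}) = L_{diag}$ :

$$
w_{k} \in L_{diag} \iff M_{k} \text{ does not accept } w_{k} \iff w_{k} \notin L (M_{k}) \iff w_{k} \notin L_{diag}
$$

The first and last statements contradict each other, which is exactly the contradiction reached in both cases.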

This result is profound: it proves the existence of problems that are fundamentally beyond the reach of any algorithm.

5.3 The Method of Reduction

Diagonalization gives us our first uncomputable problem, but it’s a bit abstract. To prove that more practical problems are undecidable, we use reduction. The logic is simple: “If we could solve problem A, we could use it as a subroutine to solve problem B. But we know B is unsolvable. Therefore, A must be unsolvable too.” This method allows us to transfer undecidability from one problem to another.

Definition 5.4 (Reducibility)

Let $L_{1}$ and $L_{2}$ be two languages. We say $L_{1}$ is reducible to $L_{2}$ , written $L_{1} \leq_{R} L_{2}$ , if the decidability of $L_{2}$ implies the decidability of $L_{1}$ . In practice, one shows this by building a decider for $L_{1}$ that uses a hypothetical decider for $L_{2}$ as a subroutine. An important special case is a mapping reduction: there exists a Turing machine $M_{red}$ that halts on every input $x$ and computes an output $y = M_{red} (x)$ such that $x \in L_{1} ⟺ y \in L_{2}$ . Running a decider for $L_{2}$ on $y$ then decides whether $x \in L_{1}$ .

Crucially, to prove a new problem $P_{new}$ is undecidable, we pick a known undecidable problem $P_{known}$ and show that $P_{known} \leq_{R} P_{new}$ . The logic is: if $P_{new}$ were decidable, then $P_{known}$ would also be decidable (by using the $P_{new}$ decider as a subroutine), which is a contradiction. Therefore, $P_{new}$ must be undecidable.
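The mechanics of a reduction can be seen on two deliberately simple, decidable toy languages. The sketch below is only an illustration of the pattern "translate the input, then call the other decider"; the languages and all function names are assumptions chosen for this example, not part of the text.

```python
def reduce_even_ones_to_even_length(x: str) -> str:
    """Translation step: a total, computable map that keeps only the 1s of x.

    x has an even number of 1s  <=>  the output has even length.
    """
    return x.replace("0", "")


def decide_even_length(y: str) -> bool:
    """Decider for the target language L2 = {y : len(y) is even}."""
    return len(y) % 2 == 0


def decide_even_ones(x: str) -> bool:
    """Decider for L1 = {x : x contains an even number of 1s}.

    It is obtained purely by composing the translation with the L2 decider.
    """
    return decide_even_length(reduce_even_ones_to_even_length(x))


print(decide_even_ones("1010"), decide_even_ones("111"))
```

In an undecidability proof the same composition is used in the contrapositive direction: since no decider for the source problem exists, no decider for the target problem can exist either.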

5.3.1 The Universal Language (The Acceptance Problem)

Our first “natural” undecidable problem is the problem of simulation itself. This is a crucial problem because it directly relates to whether one computer can perfectly predict the behavior of any other computer.

Definition 5.5 (The Universal Language)

The universal language is the set of all pairs $⟨ M, w ⟩$ where $M$ is a TM and $w$ is a string that $M$ accepts. $L_{U} = \{⟨ M, w ⟩ \mid M \text{ is a TM and } w \in L (M)\}$

Theorem 5.6

$L_{U}$ is recursively enumerable, but it is not recursive (it is undecidable).

Proof of $L_{U}$ Properties

$L_{U}$ is RE: We can build a Universal Turing Machine (UTM), $U$ , that takes $⟨ M, w ⟩$ as input. $U$ simulates $M$ on input $w$ . If the simulated $M$ accepts $w$ , $U$ accepts $⟨ M, w ⟩$ . If the simulated $M$ rejects $w$ , $U$ rejects $⟨ M, w ⟩$ . If the simulated $M$ loops on $w$ , $U$ also loops on $⟨ M, w ⟩$ . This machine $U$ recognizes $L_{U}$ , so $L_{U}$ is recursively enumerable.

$L_{U}$ is not Recursive: We prove this by reducing $\overline{L_{diag}}$ to $L_{U}$ . (Note: $\overline{L_{diag}} = \{w_{i} \mid M_{i} \text{ accepts } w_{i}\}$ ). Assume for contradiction that $L_{U}$ is recursive. Then there exists a TM $D_{U}$ that decides $L_{U}$ (i.e., $D_{U}$ halts on all inputs and accepts if $⟨ M, w ⟩ \in L_{U}$ , rejects otherwise). We can construct a TM $D_{\overline{diag}}$ that decides $\overline{L_{diag}}$ as follows:

On input $w_{i}$ :

Determine the index $i$ of the input word in the canonical order and construct the encoding $⟨ M_{i} ⟩$ of the $i$ -th Turing machine.

Form the input $⟨ M_{i}, w_{i} ⟩$ .

Run $D_{U}$ on $⟨ M_{i}, w_{i} ⟩$ .

If $D_{U}$ accepts, then $M_{i}$ accepts $w_{i}$ , so $w_{i} \in \overline{L_{diag}}$ . $D_{\overline{diag}}$ accepts.

If $D_{U}$ rejects, then $M_{i}$ does not accept $w_{i}$ , so $w_{i} \notin \overline{L_{diag}}$ . $D_{\overline{diag}}$ rejects.

This TM $D_{\overline{diag}}$ would decide $\overline{L_{diag}}$ , so $\overline{L_{diag}}$ would be recursive. By Lemma 5.4 (a language is recursive if and only if its complement is recursive), $L_{diag}$ would then be recursive as well, and in particular recursively enumerable. This contradicts Theorem 5.5. Therefore, $L_{U}$ cannot be recursive.

5.3.2 The Halting Problem

The most famous undecidable problem is the Halting Problem. It asks a seemingly simple question with profound implications for software development and verification.

Definition 5.6 (The Halting Problem)

The Halting Problem is to decide, for a given TM $M$ and input $w$ , whether $M$ will eventually halt (either accept or reject) on $w$ . The corresponding language is: $L_{H} = \{⟨ M, w ⟩ \mid M \text{ is a TM that halts on input } w\}$

Theorem 5.8

The Halting Problem is undecidable.

Proof by Reduction from $L_{U}$

We will show that $L_{U} \leq_{R} L_{H}$ . Assume for contradiction that we have an algorithm (a TM that always halts), let’s call it $D_{H}$ , that decides the Halting Problem. We can use $D_{H}$ to build an algorithm, $D_{U}$ , to decide the universal acceptance problem ( $L_{U}$ ).

Algorithm $D_{U}$ for input $⟨ M, w ⟩$ :

Run $D_{H}$ on $⟨ M, w ⟩$ . ( $D_{H}$ is guaranteed to halt).

If $D_{H}$ rejects (meaning $M$ loops on $w$ ), then we know $M$ does not accept $w$ . So, $D_{U}$ rejects.

If $D_{H}$ accepts (meaning $M$ halts on $w$ ), then we know $M$ will eventually halt. In this case, we can safely simulate $M$ on $w$ without fear of an infinite loop. Run the simulation of $M$ on $w$ .

If the simulation accepts, $D_{U}$ accepts. If it rejects, $D_{U}$ rejects.

This algorithm $D_{U}$ is guaranteed to halt on all inputs and correctly decides if $w \in L (M)$ . Thus, we have built an algorithm for $L_{U}$ . But we know $L_{U}$ is undecidable. The only flawed assumption was the existence of $D_{H}$ . Therefore, the Halting Problem is undecidable.
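The construction of $D_{U}$ from $D_{H}$ can be exercised in a toy setting where halting really is decidable. The sketch below is an illustration under strong assumptions, not a statement about Turing machines: a toy machine is a finite Python dictionary that maps each state to its successor, the start state stands in for the input $w$ , and `toy_halts` is a hypothetical helper that decides halting by cycle detection. No such decider exists for Turing machines; the point is only the shape of the reduction.

```python
from typing import Callable, Dict

ACCEPT, REJECT = "accept", "reject"
Machine = Dict[str, str]  # toy machine: state -> successor state


def toy_halts(machine: Machine, start: str) -> bool:
    """Toy stand-in for D_H: halting is decidable here by cycle detection."""
    seen = set()
    state = start
    while state not in (ACCEPT, REJECT):
        if state in seen:
            return False  # a state repeated, so the run cycles forever
        seen.add(state)
        state = machine[state]
    return True


def decide_accepts(machine: Machine, start: str,
                   halts: Callable[[Machine, str], bool]) -> bool:
    """Shape of D_U: ask the halting decider first, simulate only if safe."""
    if not halts(machine, start):
        return False  # the machine loops, so it does not accept
    state = start
    while state not in (ACCEPT, REJECT):  # guaranteed to terminate
        state = machine[state]
    return state == ACCEPT


accepting = {"a": "b", "b": ACCEPT}
rejecting = {"a": REJECT}
looping = {"a": "b", "b": "a"}
print([decide_accepts(m, "a", toy_halts) for m in (accepting, rejecting, looping)])
```

Passing the halting decider in as a parameter mirrors the proof: `decide_accepts` is correct for any correct `halts`, so the impossibility of deciding acceptance forces the impossibility of deciding halting.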

5.3.3 Undecidability of the Empty Language Problem

Another important undecidable problem is determining if a Turing machine accepts the empty language.

Definition 5.7 (The Empty Language Problem)

The empty language problem is to decide, for a given TM $M$ , whether $L (M) = \emptyset$ . The corresponding language is: $L_{EMPTY} = \{⟨ M ⟩ \mid L (M) = \emptyset\}$

Theorem 5.9

$L_{EMPTY}$ is undecidable.

Proof by Reduction from $L_{U}$

We will show that $L_{U} \leq_{R} L_{EMPTY}$ . Assume for contradiction that we have an algorithm $D_{EMPTY}$ that decides $L_{EMPTY}$ . We can use $D_{EMPTY}$ to build an algorithm $D_{U}$ that decides $L_{U}$ .

Algorithm $D_{U}$ for input $⟨ M, w ⟩$ :

Construct a new Turing machine $M^{'}$ from $M$ and $w$ . $M^{'}$ is designed to accept an input $x$ if and only if $M$ accepts $w$ . Specifically, $M^{'}$ on input $x$ does the following:

Ignores its own input $x$ .

Simulates $M$ on $w$ .

If $M$ accepts $w$ , then $M^{'}$ accepts $x$ .

If $M$ rejects $w$ or loops on $w$ , then $M^{'}$ rejects $x$ or loops on $x$ .

Now, consider the language accepted by $M^{'}$ :

If $M$ accepts $w$ , then $L (M^{'}) = Σ^{*}$ (since $M^{'}$ accepts any $x$ ). In this case, $L (M^{'}) \neq \emptyset$ .

If $M$ does not accept $w$ (i.e., $M$ rejects $w$ or loops on $w$ ), then $L (M^{'}) = \emptyset$ (since $M^{'}$ never accepts any $x$ ).

Run $D_{EMPTY}$ on $⟨ M^{'} ⟩$ .

If $D_{EMPTY}$ accepts $⟨ M^{'} ⟩$ (meaning $L (M^{'}) = \emptyset$ ), then $M$ does not accept $w$ . So, $D_{U}$ rejects.

If $D_{EMPTY}$ rejects $⟨ M^{'} ⟩$ (meaning $L (M^{'}) \neq \emptyset$ ), then $M$ accepts $w$ . So, $D_{U}$ accepts.

This algorithm $D_{U}$ is guaranteed to halt and correctly decides $L_{U}$ . Note that $D_{U}$ only writes down the description of $M^{'}$ and never runs it, so $D_{U}$ halts even if $M$ loops on $w$ . But we know $L_{U}$ is undecidable. This is a contradiction. Therefore, the emptiness problem is undecidable.
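The key step of this proof, building a machine $M^{'}$ that ignores its own input, corresponds to a closure in a programming language. The following sketch rests on loudly simplified assumptions: machines are modelled as total Python predicates (so nothing can loop), and `probe_is_empty` is a hypothetical stand-in for the emptiness decider that is valid only for input-ignoring machines such as $M^{'}$ . A real emptiness decider for Turing machines does not exist.

```python
from typing import Callable

Machine = Callable[[str], bool]  # toy model: a total predicate on strings


def build_m_prime(m: Machine, w: str) -> Machine:
    """Return M': a machine that ignores its input x and runs M on w."""
    def m_prime(x: str) -> bool:
        return m(w)  # x is deliberately unused
    return m_prime


def decide_acceptance(m: Machine, w: str,
                      is_empty: Callable[[Machine], bool]) -> bool:
    """Shape of D_U: M accepts w  <=>  L(M') is not empty."""
    return not is_empty(build_m_prime(m, w))


def probe_is_empty(machine: Machine) -> bool:
    """Hypothetical emptiness test, correct ONLY for machines that ignore
    their input, because such a machine accepts everything or nothing."""
    return not machine("")


ends_with_one = lambda s: s.endswith("1")
print(decide_acceptance(ends_with_one, "01", probe_is_empty))
print(decide_acceptance(ends_with_one, "10", probe_is_empty))
```

The closure captures `m` and `w` at construction time, just as the description of $M^{'}$ has $M$ and $w$ hard-wired into it.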

5.3.4 Relationship between a Language and its Complement

The relationship between a language and its complement is important for understanding decidability.

Lemma 5.4

A language $L$ is recursive if and only if its complement $\overline{L}$ is recursive.

Proof Idea

If $L$ is recursive, there exists a TM $M$ that decides $L$ (halts on all inputs, accepts if $x \in L$ , rejects if $x \notin L$ ). We can construct a TM $M^{'}$ for $\overline{L}$ by simply swapping the accepting and rejecting states of $M$ . Since $M$ always halts, $M^{'}$ will also always halt, and $L (M^{'}) = \overline{L}$ . Thus, $\overline{L}$ is recursive. The argument is symmetric for the other direction.

This lemma says that the recursive languages are closed under complement. The recursively enumerable languages are not: if both $L$ and $\overline{L}$ are recursively enumerable, then $L$ is recursive, because a decider can run the two recognizers in parallel and wait for one of them to accept. Consequently, if a language is recursively enumerable but not recursive, its complement cannot be recursively enumerable. For example, $\overline{L_{U}}$ is not recursively enumerable.
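The claim about languages that are recursively enumerable but not recursive rests on a parallel simulation: if recognizers for both $L$ and $\overline{L}$ existed, running them in alternation would decide $L$ . The sketch below models a recognizer as a Python generator that yields `False` once per step and `True` when it accepts; this modelling and all names are assumptions made for the example.

```python
from typing import Callable, Iterator

Recognizer = Callable[[str], Iterator[bool]]


def make_recognizer(accepts: Callable[[str], bool]) -> Recognizer:
    """Build a toy recognizer that may run forever on strings it rejects."""
    def recognizer(x: str) -> Iterator[bool]:
        for _ in range(len(x)):   # pretend to do some work first
            yield False
        while not accepts(x):     # outside the language: never accept
            yield False
        yield True
    return recognizer


def decide_by_dovetailing(rec_l: Recognizer, rec_co_l: Recognizer,
                          x: str) -> bool:
    """Alternate single steps of both recognizers until one accepts.

    Exactly one of them accepts x, so the loop always terminates.
    """
    run_l, run_co_l = rec_l(x), rec_co_l(x)
    while True:
        if next(run_l):
            return True
        if next(run_co_l):
            return False


rec_even = make_recognizer(lambda x: len(x) % 2 == 0)
rec_odd = make_recognizer(lambda x: len(x) % 2 == 1)
print(decide_by_dovetailing(rec_even, rec_odd, "0101"))
print(decide_by_dovetailing(rec_even, rec_odd, "010"))
```

Neither recognizer alone is safe to run to completion, but interleaving their steps guarantees an answer after finitely many steps.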

5.3.5 The Post Correspondence Problem (PCP)

Undecidable problems are not limited to those directly involving Turing machines. The Post Correspondence Problem (PCP) is a classic example of a natural undecidable problem from formal language theory.

Definition 5.8 (Post Correspondence Problem)

An instance of the Post Correspondence Problem (PCP) consists of two lists of non-empty words, $A = (w_{1}, \dots, w_{k})$ and $B = (x_{1}, \dots, x_{k})$ , over some alphabet $Σ$ . A solution to this instance is a sequence of indices $i_{1}, i_{2}, \dots, i_{m}$ (where $m \geq 1$ and $1 \leq i_{j} \leq k$ ) such that: $w_{i_{1}} w_{i_{2}} \dots w_{i_{m}} = x_{i_{1}} x_{i_{2}} \dots x_{i_{m}}$ The problem is to decide whether a given instance of PCP has a solution.
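A bounded brute-force search makes the definition concrete. The sketch below tries all index sequences up to a given length; because of the bound it is not a decision procedure, and by Theorem 5.10 no choice of bound can turn it into one. The function name and the sample instance are illustrative assumptions rather than material from the text.

```python
from itertools import product
from typing import List, Optional, Sequence


def find_pcp_solution(a: Sequence[str], b: Sequence[str],
                      max_len: int) -> Optional[List[int]]:
    """Search for a PCP solution with at most max_len indices.

    Returns the first solution found as a list of 1-based indices,
    trying shorter sequences first, or None if none exists within the bound.
    """
    k = len(a)
    for m in range(1, max_len + 1):
        for seq in product(range(k), repeat=m):
            top = "".join(a[i] for i in seq)
            bottom = "".join(b[i] for i in seq)
            if top == bottom:
                return [i + 1 for i in seq]
    return None


# Sample instance chosen for illustration.
A = ("1", "10111", "10")
B = ("111", "10", "0")
solution = find_pcp_solution(A, B, max_len=4)
print(solution)
```

For this instance both concatenations for the index sequence 2, 1, 1, 3 spell `101111110`. A result of `None` only means that no solution exists within the bound, not that the instance is unsolvable.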

Theorem 5.10

The Post Correspondence Problem is undecidable.

Proof Idea (Reduction from $L_{U}$ )

The proof is a technically involved reduction from the acceptance problem $L_{U}$ , usually carried out in two steps via an intermediate variant called the Modified PCP (MPCP), in which every solution must start with the first pair. Think of each pair $(w_{i}, x_{i})$ as a domino with $w_{i}$ on top and $x_{i}$ on the bottom. Given a Turing machine $M$ and an input $w$ , one constructs a set of dominoes such that a solution exists if and only if $M$ accepts $w$ . The dominoes are designed so that any match spells out the sequence of configurations of $M$ on $w$ , and the match can be completed only if $M$ reaches an accepting configuration. Hence, if PCP were decidable, $L_{U}$ would also be decidable, which is a contradiction.

5.4 Rice’s Theorem

Many undecidable problems concern properties of the language a TM accepts (e.g., “Is $L (M)$ empty?”, “Is $L (M)$ regular?”). Rice’s Theorem gives us a powerful shortcut to prove that almost all such properties are undecidable.

Definition 5.7 (Semantic and Non-Trivial Properties)

A property $P$ of a Turing machine is:

Semantic if it depends only on the language the TM accepts, not the TM’s specific implementation. That is, for any two TMs $M_{1}$ and $M_{2}$ , if $L (M_{1}) = L (M_{2})$ , then $M_{1}$ has property $P$ if and only if $M_{2}$ has property $P$ . (Think “what it does,” not “how it does it.”)
Non-trivial if there is at least one TM that has the property and at least one TM that does not have the property. (Meaning it’s not a property that all TMs have, nor one that no TMs have.)

Theorem 5.9 (Rice's Theorem)

Every non-trivial, semantic property of Turing machines is undecidable.

Examples of undecidable properties covered by Rice’s Theorem:

Is $L (M)$ empty? (This is $L_{EMPTY}$ , which we proved undecidable by reduction).
Is $L (M)$ finite?
Is $L (M)$ regular?
Does $L (M)$ contain the string “001”?
Is $L (M) = Σ^{*}$ ?
Is $L (M)$ recursive?

Rice’s Theorem is a sweeping statement about the limits of automatic program verification. It tells us that almost any interesting question about what a program does (its language) is undecidable. Any tool that analyzes what a program does, rather than how it is written (its syntax), cannot be both always terminating and always correct on all possible programs.

5.5 The Method of Kolmogorov Complexity in Computability

Kolmogorov complexity, introduced in Chapter 2, provides an alternative and often elegant way to prove undecidability results. It leverages the idea that if a problem were decidable, it would imply that certain strings have unexpectedly low Kolmogorov complexity, leading to a contradiction.

Theorem 5.11

The problem of calculating the Kolmogorov complexity $K (x)$ of $x$ for every $x \in \{0, 1\}^{*}$ is algorithmically unsolvable (undecidable).

Proof by Contradiction

Assume for contradiction that there exists an algorithm (a TM that always halts), let’s call it $A_{K}$ , that calculates $K (x)$ for any given $x \in \{0, 1\}^{*}$ .

We can then construct a new algorithm, $B$ , that takes an integer $n$ as input and outputs the first word $x_{n}$ (in canonical order) such that $K (x_{n}) \geq n$ .

Algorithm $B$ for input $n$ :

Initialize $x = λ$ (the empty string).

Loop indefinitely:

Use $A_{K}$ to calculate $K (x)$ .

If $K (x) \geq n$ , then $x$ is the desired word. Output $x$ and halt.

Otherwise, set $x$ to its canonical successor (e.g., if $x = 0$ , next is $1$ ; if $x = 1$ , next is $00$ ).

This algorithm $B$ is guaranteed to halt because, by Lemma 2.5, for every $n$ , there exists at least one word $x$ such that $K (x) \geq n$ .

Now, consider the Kolmogorov complexity of the output of $B$ . The algorithm $B$ itself can be encoded as a program. The only variable part of $B$ is the input $n$ . Thus, $x_{n}$ (the output of $B$ for input $n$ ) is generated by a program whose length is about $\log_{2} n$ (for encoding $n$ in binary) plus a constant $c$ (for the fixed part of $B$ , which includes $A_{K}$ ). So, $K (x_{n}) \leq \log_{2} n + c$ .

However, by the definition of $x_{n}$ , we chose it such that $K (x_{n}) \geq n$ .

Combining these, we get $n \leq K (x_{n}) \leq \log_{2} n + c$ .

Since $n$ grows much faster than $\log_{2} n + c$ , the inequality $n \leq \log_{2} n + c$ can hold for only finitely many values of $n$ . But $B$ halts for every $n$ , so the inequality would have to hold for all $n$ . This contradiction shows that the algorithm $A_{K}$ cannot exist.

Therefore, the problem of calculating Kolmogorov complexity is algorithmically unsolvable.
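The control flow of algorithm $B$ from the proof can be sketched as follows. Since no algorithm for $K$ exists, the complexity function is passed in as a parameter; the demonstration uses `len` as a stand-in, which is emphatically not Kolmogorov complexity and serves only to exercise the loop. The names `successor` and `algorithm_b` are assumptions made for this example.

```python
from typing import Callable


def successor(x: str) -> str:
    """Return the canonical successor of the binary string x."""
    if "0" not in x:
        # "", "1", "11", ... roll over to the first string of the next length
        return "0" * (len(x) + 1)
    # Otherwise add one in binary while keeping the length fixed.
    return f"{int(x, 2) + 1:0{len(x)}b}"


def algorithm_b(n: int, complexity: Callable[[str], int]) -> str:
    """Return the first word x in canonical order with complexity(x) >= n.

    `complexity` plays the role of the hypothetical algorithm A_K.
    """
    x = ""  # the empty string lambda
    while complexity(x) < n:
        x = successor(x)
    return x


# Stand-in complexity measure: string length (NOT Kolmogorov complexity).
print(algorithm_b(3, len))
```

The code for `algorithm_b` has fixed size, and only the value of `n` varies between runs, which is the observation behind the bound $K (x_{n}) \leq \log_{2} n + c$ .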

This result has profound implications, including for the limits of formal systems and mathematical proof. It implies that there are correct mathematical assertions (e.g., " $K (x) \geq n$ " for certain $x$ and $n$ ) for which no mathematical proof exists within a given formal system. This connects to Gödel’s incompleteness theorems, highlighting the inherent limitations of formal axiomatic systems.

Summary

Computability theory draws the fundamental line between problems that are algorithmically solvable and those that are inherently unsolvable.
The diagonalization method, originating from Cantor’s work on infinite sets, proves that there are uncountably many languages but only countably many Turing machines, implying the existence of unsolvable problems. It allows us to construct a concrete language ( $L_{diag}$ ) that is not even recursively enumerable.
The reduction method is the primary tool for proving that a problem is undecidable. It works by showing that if a new problem were decidable, a known undecidable problem could also be solved, leading to a contradiction.
Key undecidable problems include the Universal Language ( $L_{U}$ ) (the acceptance problem) and the Halting Problem ( $L_{H}$ ), which asks whether a given TM halts on a given input. These are central to understanding the limits of program analysis.
Rice’s Theorem provides a powerful generalization, stating that any non-trivial, semantic property of Turing machines (i.e., any property that depends only on the language a TM accepts and is not universally true or false) is undecidable. This has significant implications for automatic program verification.
The Post Correspondence Problem (PCP) is an example of a natural undecidable problem outside the direct realm of Turing machines, demonstrating the pervasive nature of undecidability.
The Kolmogorov Complexity Method offers an alternative approach to proving undecidability, showing that the problem of calculating the Kolmogorov complexity of an arbitrary string is algorithmically unsolvable. This result has deep connections to the limits of formal proof systems.
The ultimate consequence of computability theory is that there can be no general-purpose algorithm to verify the correctness or behavior of all computer programs, nor to solve many other fundamental problems.

Exercises

Exercise 5.1

Prove that the set of all integers $Z$ is countable.

Solution

We can establish a bijection $f : N \to Z$ . One way to do this is to list the integers in an alternating pattern: $0, 1, -1, 2, -2, 3, -3, \dots$ . Formally, define $f (n)$ as:

$f (0) = 0$

$f (n) = (n + 1) / 2$ if $n$ is odd.

$f (n) = - n / 2$ if $n$ is even and $n > 0$ .

This function is both injective and surjective, proving that $|Z| = |N|$ .
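The alternating listing $0, 1, -1, 2, -2, \dots$ can be checked mechanically on a finite range. The sketch below sends odd $n$ to the positive integers and even $n$ to zero and the negative integers, following that listing; the inverse `f_inverse` is an addition made for this example to make the bijection explicit.

```python
def f(n: int) -> int:
    """Return the n-th integer in the listing 0, 1, -1, 2, -2, ..."""
    if n % 2 == 1:
        return (n + 1) // 2   # odd positions hold 1, 2, 3, ...
    return -(n // 2)          # even positions hold 0, -1, -2, ...


def f_inverse(z: int) -> int:
    """Return the position of the integer z in the listing."""
    return 2 * z - 1 if z > 0 else -2 * z


listing = [f(n) for n in range(7)]
print(listing)
```

Having an explicit inverse shows at once that the map is injective and surjective, since every integer is hit at exactly one position.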

Exercise 5.2

Prove that the set of all positive rational numbers $Q^{+}$ is countable.

Solution

We can arrange all positive rational numbers $p / q$ (where $p, q \in N^{+}$ ) in an infinite grid, with $p$ as the row index and $q$ as the column index.

	1	2	3	4	…
1	1/1	1/2	1/3	1/4	…
2	2/1	2/2	2/3	2/4	…
3	3/1	3/2	3/3	3/4	…
4	4/1	4/2	4/3	4/4	…
…	…	…	…	…	…

We can then enumerate these numbers by following a diagonal path, skipping duplicates (e.g., 1/1, 1/2, 2/1, 1/3, 3/1, 1/4, 2/3, 3/2, 4/1, …). This systematic enumeration establishes a bijection between $N$ and $Q^{+}$ , proving that $Q^{+}$ is countable.
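The diagonal walk with duplicate skipping can be written as a generator. In this sketch each diagonal collects the fractions $p / q$ with the same sum $p + q$ , and a fraction is skipped when it is not in lowest terms; the function name is an assumption made for this example.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd


def positive_rationals():
    """Yield every positive rational exactly once, diagonal by diagonal."""
    for total in count(2):            # diagonal with p + q == total
        for p in range(1, total):
            q = total - p
            if gcd(p, q) == 1:        # skip duplicates such as 2/2
                yield Fraction(p, q)


first_nine = list(islice(positive_rationals(), 9))
print(", ".join(f"{r.numerator}/{r.denominator}" for r in first_nine))
```

Each diagonal is finite, so every fraction is reached after finitely many steps, and the lowest-terms test ensures that no value is listed twice.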

Exercise 5.3

Prove that the language $L_{EMPTY} = \{⟨ M ⟩ \mid L (M) = \emptyset\}$ is undecidable.

Solution

We can use Rice’s Theorem, which states that every non-trivial, semantic property of Turing machines is undecidable.

Semantic Property: The property " $L (M) = \emptyset$ " is semantic because it depends only on the language accepted by $M$ , not on the specific implementation of $M$ . If $L (M_{1}) = L (M_{2})$ , then $L (M_{1}) = \emptyset ⟺ L (M_{2}) = \emptyset$ .

Non-trivial Property:

There exists a TM that has the property: A TM that immediately rejects all inputs (e.g., by transitioning to $q_{reject}$ from $q_{0}$ on any input) accepts the empty language. So $L (M) = \emptyset$ is possible.

There exists a TM that does not have the property: A TM that accepts all inputs (e.g., by transitioning to $q_{accept}$ from $q_{0}$ on any input) accepts $Σ^{*}$ , which is not empty. So $L (M) \neq \emptyset$ is possible.

Since the property " $L (M) = \emptyset$ " is both semantic and non-trivial, by Rice’s Theorem, $L_{EMPTY}$ is undecidable.

Exercise 5.4

Is the language $L_{EQ} = \{⟨ M_{1}, M_{2} ⟩ \mid L (M_{1}) = L (M_{2})\}$ decidable?

Solution

No, the language $L_{EQ}$ is undecidable. We can prove this by reducing $L_{EMPTY}$ to $L_{EQ}$ .

Assume for contradiction that there exists an algorithm $D_{EQ}$ that decides $L_{EQ}$ . We can construct an algorithm $D_{EMPTY}$ that decides $L_{EMPTY}$ as follows:

Let $M_{empty}$ be a fixed Turing machine that accepts the empty language (e.g., a TM that immediately rejects all inputs). Its encoding $⟨ M_{empty} ⟩$ is a constant.

For an input $⟨ M ⟩$ to $D_{EMPTY}$ :

Construct the pair $⟨ M, M_{empty} ⟩$ .

Run $D_{EQ}$ on $⟨ M, M_{empty} ⟩$ .

If $D_{EQ}$ accepts, it means $L (M) = L (M_{empty})$ . Since $L (M_{empty}) = \emptyset$ , this implies $L (M) = \emptyset$ . So, $D_{EMPTY}$ accepts.

If $D_{EQ}$ rejects, it means $L (M) \neq L (M_{empty})$ . This implies $L (M) \neq \emptyset$ . So, $D_{EMPTY}$ rejects.

This algorithm $D_{EMPTY}$ would decide $L_{EMPTY}$ . However, we know from Exercise 5.3 (or Rice’s Theorem) that $L_{EMPTY}$ is undecidable. This is a contradiction. Therefore, $L_{EQ}$ must be undecidable.

Exercise 5.5

Prove that the problem of generating the first word $x_{n}$ with $K (x_{n}) \geq n$ for every number $n \in N^{+}$ is an algorithmically unsolvable problem.

Solution

The argument from the proof of Theorem 5.11 applies directly; it does not even need an algorithm for computing $K$ .

Assume for contradiction that an algorithm $G$ exists that, given $n$ , outputs the first word $x_{n}$ with $K (x_{n}) \geq n$ . Then $x_{n}$ is generated by a program consisting of $G$ together with a binary encoding of $n$ , whose length is about $\log_{2} n + c$ (where $c$ is the constant size of algorithm $G$ ). So, $K (x_{n}) \leq \log_{2} n + c$ .

But by definition of $x_{n}$ , $K (x_{n}) \geq n$ .

This leads to $n \leq \log_{2} n + c$ for every $n$ , which is false for sufficiently large $n$ . Thus, no such algorithm $G$ can exist.

Key Takeaways

The Limits of Algorithms: Computability theory establishes that there are fundamental problems that no algorithm can solve, regardless of computational power or time.
Countability and Uncountability: The diagonalization method demonstrates that the set of all possible problems (languages) is uncountable, whereas the set of all possible algorithms (Turing machines) is only countable, guaranteeing the existence of unsolvable problems.
Diagonalization as a Proof Technique: This method constructs a specific problem ( $L_{diag}$ ) that cannot be recognized by any Turing machine, serving as the initial benchmark for undecidability.
Reduction as a General Tool: Reduction is the primary technique to prove new problems are undecidable by showing they are “at least as hard” as a known undecidable problem.
Core Undecidable Problems: The Universal Language ( $L_{U}$ ) and the Halting Problem ( $L_{H}$ ) are central examples of undecidable problems, illustrating that even seemingly simple questions about program behavior are uncomputable.
Rice’s Theorem: This powerful theorem generalizes many undecidability results, stating that any non-trivial, semantic property of Turing machines is undecidable. It implies that automatic verification of what a program does is generally impossible.
Kolmogorov Complexity and Undecidability: The inability to algorithmically compute Kolmogorov complexity provides another avenue for proving undecidability and highlights the limits of formal systems, even suggesting the existence of true mathematical statements that are unprovable.
Profound Implications: The existence of undecidable problems has profound implications for computer science, mathematics, and philosophy, defining the inherent boundaries of what can be automated and formally proven.
