In the previous chapters, we built up our understanding of computation, culminating in the Turing machine, a model that captures the full power of any conceivable algorithm. We have focused on what machines can do. Now, we pivot to a more profound question: What are the absolute limits of computation? Are there problems that no algorithm, no matter how clever, can ever solve? This question is fundamental because it defines the inherent boundaries of what computers, and by extension, we, can ever hope to achieve algorithmically.

The answer is a resounding yes. This chapter introduces computability theory, the field that classifies problems into those that are algorithmically solvable (decidable) and those that are not (undecidable). We will develop two of the most powerful proof techniques in computer science: diagonalization and reduction.

First, we will use a simple counting argument from set theory to show that there are fundamentally more problems than there are algorithms, which implies the existence of unsolvable problems. Then, we will use diagonalization to construct a concrete example of an uncomputable problem. Finally, we will use the method of reduction to prove that many other important problems, including the famous Halting Problem, are also undecidable.

5.1 Aims

This chapter delves into the profound question of what problems can and cannot be solved by algorithms. By exploring the theoretical boundaries of computation, we gain a deeper understanding of the inherent limitations of computers and the fundamental nature of decidability. The aims outlined below will guide our journey through the core concepts and powerful proof techniques that define computability theory.

By the end of this chapter, you will be able to:

  • Compare Infinite Sets: Use Cantor’s method of comparing cardinalities to understand why there are more languages than Turing machines, distinguishing between countable and uncountable infinities.
  • Master Diagonalization: Understand how to use the diagonalization proof technique to construct a language that is not recursively enumerable, demonstrating the existence of inherently unsolvable problems.
  • Master the Reduction Method: Learn how to prove a problem is undecidable by reducing a known undecidable problem to it, establishing a hierarchy of computational difficulty.
  • Understand Key Undecidable Problems: Grasp the nature and significance of the Universal Language () and the Halting Problem (), two cornerstone undecidable problems.
  • Apply Rice’s Theorem: Use Rice’s Theorem to instantly prove that any non-trivial semantic property of programs (Turing machines) is undecidable, highlighting the limits of automatic program analysis.
  • Appreciate Kolmogorov Complexity in Undecidability: Understand how Kolmogorov complexity can be used to prove the algorithmic unsolvability of calculating the shortest program length, and its implications for the limits of mathematical proof.

5.2 The Method of Diagonalization

The diagonalization method, pioneered by Georg Cantor, is a powerful proof technique used to demonstrate that some infinite sets are “larger” than others. In computability theory, we adapt this technique to prove the existence of problems that cannot be solved by any algorithm. The core idea is to assume that we can list all possible solutions (e.g., all Turing machines or all languages) and then construct a new problem that is guaranteed to differ from every item on our list. This contradiction proves that our initial assumption (that we could list all problems) must be false, thereby establishing the existence of uncomputable problems.

Our first step is to show that unsolvable problems must exist. The argument is surprisingly simple: there are fundamentally “more” problems (languages) than there are algorithms (Turing machines). This numerical imbalance guarantees that some problems must be beyond algorithmic reach. To make this precise, we need a way to compare the sizes of infinite sets, a concept known as cardinality.

Definition 5.1 (Comparing Cardinalities)

Let and be two sets.

  • We say if there exists an injective (one-to-one) function .
  • We say if there exists a bijective (one-to-one and onto) function .
  • A set is countable if it is finite or if (the set of natural numbers). This means its elements can be put into a one-to-one correspondence with the natural numbers, allowing them to be listed. Otherwise, it is uncountable.

Examples of Countable Sets:

  • The set of natural numbers .
  • The set of integers . We can list them as .
  • The set of positive rational numbers . These can be enumerated using a zigzag pattern across a grid of pairs.
  • The set of all finite binary strings . These can be listed in canonical (lexicographical) order.
  • The set of all Turing machines. Since every Turing machine can be uniquely encoded as a finite binary string , and the set of all finite binary strings is countable, the set of all Turing machines is also countable.

The Uncountability of Languages:

While the set of all Turing machines is countable, the set of all possible languages is much larger.

Theorem 5.3 (Cantor)

The set of all languages over an alphabet (i.e., the power set ) is uncountable.

We can apply this same technique directly to Turing machines to construct a concrete language that is not recursively enumerable.

Definition 5.2 (The Diagonal Language)

Let be the canonical enumeration of all Turing machines (based on their encodings ) and be the canonical enumeration of all binary strings. We define the diagonal language as: In simpler terms, contains a word if and only if the -th Turing machine fails to accept its own encoding (when is interpreted as an input string). This clever self-referential definition is key to proving its uncomputability.

Theorem 5.5

is not recursively enumerable.

This result is profound: it proves the existence of problems that are fundamentally beyond the reach of any algorithm.


5.3 The Method of Reduction

Diagonalization gives us our first uncomputable problem, but it’s a bit abstract. To prove that more practical problems are undecidable, we use reduction. The logic is simple: “If we could solve problem A, we could use it as a subroutine to solve problem B. But we know B is unsolvable. Therefore, A must be unsolvable too.” This method allows us to transfer undecidability from one problem to another.

Definition 5.4 (Reducibility)

Let and be two languages. We say is reducible to , written , if the decidability of implies the decidability of . More formally, if there exists a Turing machine that, for any input for , computes an output such that . This must always halt.

Crucially, to prove a new problem is undecidable, we pick a known undecidable problem and show that . The logic is: if were decidable, then would also be decidable (by using the decider as a subroutine), which is a contradiction. Therefore, must be undecidable.

5.3.1 The Universal Language (The Acceptance Problem)

Our first “natural” undecidable problem is the problem of simulation itself. This is a crucial problem because it directly relates to whether one computer can perfectly predict the behavior of any other computer.

Definition 5.5 (The Universal Language)

The universal language is the set of all pairs where is a TM and is a string that accepts.

Theorem 5.6

is recursively enumerable, but it is not recursive (it is undecidable).

5.3.2 The Halting Problem

The most famous undecidable problem is the Halting Problem. It asks a seemingly simple question with profound implications for software development and verification.

Definition 5.6 (The Halting Problem)

The Halting Problem is to decide, for a given TM and input , whether will eventually halt (either accept or reject) on . The corresponding language is:

Theorem 5.8

The Halting Problem is undecidable.

5.3.3 Undecidability of the Empty Language Problem

Another important undecidable problem is determining if a Turing machine accepts the empty language.

Definition 5.7 (The Empty Language Problem)

The empty language problem is to decide, for a given TM , whether . The corresponding language is:

Theorem 5.9

is undecidable.

5.3.4 Relationship between a Language and its Complement

The relationship between a language and its complement is important for understanding decidability.

Lemma 5.4

A language is recursive if and only if its complement is recursive.

This lemma implies that if a language is recursive, its complement is also recursive. However, if a language is recursively enumerable but not recursive, its complement cannot be recursively enumerable. For example, is not recursively enumerable.

5.3.5 The Post Correspondence Problem (PCP)

Undecidable problems are not limited to those directly involving Turing machines. The Post Correspondence Problem (PCP) is a classic example of a natural undecidable problem from formal language theory.

Definition 5.8 (Post Correspondence Problem)

An instance of the Post Correspondence Problem (PCP) consists of two lists of non-empty words, and , over some alphabet . A solution to this instance is a sequence of indices (where and ) such that: The problem is to decide whether a given instance of PCP has a solution.

Theorem 5.10

The Post Correspondence Problem is undecidable.


5.4 Rice’s Theorem

Many undecidable problems concern properties of the language a TM accepts (e.g., “Is empty?”, “Is regular?”). Rice’s Theorem gives us a powerful shortcut to prove that almost all such properties are undecidable.

Definition 5.7 (Semantic and Non-Trivial Properties)

A property of a Turing machine is:

  • Semantic if it depends only on the language the TM accepts, not the TM’s specific implementation. That is, for any two TMs and , if , then has property if and only if has property . (Think “what it does,” not “how it does it.“)
  • Non-trivial if there is at least one TM that has the property and at least one TM that does not have the property. (Meaning it’s not a property that all TMs have, nor one that no TMs have.)

Theorem 5.9 (Rice's Theorem)

Every non-trivial, semantic property of Turing machines is undecidable.

Examples of undecidable properties covered by Rice’s Theorem:

  • Is empty? (This is , which we proved undecidable by reduction).
  • Is finite?
  • Is regular?
  • Does contain the string “001”?
  • Is ?
  • Is recursive?

Rice’s Theorem is a sweeping statement about the impossibility of automatic program verification. It tells us that almost any interesting question we might ask about what a program does (its language) is undecidable. Any tool that claims to analyze what a program does (its language) rather than how it does it (its syntax) will be unable to solve the problem for all possible programs.


5.5 The Method of Kolmogorov Complexity in Computability

Kolmogorov complexity, introduced in Chapter 2, provides an alternative and often elegant way to prove undecidability results. It leverages the idea that if a problem were decidable, it would imply that certain strings have unexpectedly low Kolmogorov complexity, leading to a contradiction.

Theorem 5.11

The problem of calculating the Kolmogorov complexity of for every is algorithmically unsolvable (undecidable).

This result has profound implications, including for the limits of formal systems and mathematical proof. It implies that there are correct mathematical assertions (e.g., "" for certain and ) for which no mathematical proof exists within a given formal system. This connects to Gödel’s incompleteness theorems, highlighting the inherent limitations of formal axiomatic systems.


Summary

  • Computability theory draws the fundamental line between problems that are algorithmically solvable and those that are inherently unsolvable.
  • The diagonalization method, originating from Cantor’s work on infinite sets, proves that there are uncountably many languages but only countably many Turing machines, implying the existence of unsolvable problems. It allows us to construct a concrete language () that is not even recursively enumerable.
  • The reduction method is the primary tool for proving that a problem is undecidable. It works by showing that if a new problem were decidable, a known undecidable problem could also be solved, leading to a contradiction.
  • Key undecidable problems include the Universal Language () (the acceptance problem) and the Halting Problem (), which asks whether a given TM halts on a given input. These are central to understanding the limits of program analysis.
  • Rice’s Theorem provides a powerful generalization, stating that any non-trivial, semantic property of Turing machines (i.e., any property that depends only on the language a TM accepts and is not universally true or false) is undecidable. This has significant implications for automatic program verification.
  • The Post Correspondence Problem (PCP) is an example of a natural undecidable problem outside the direct realm of Turing machines, demonstrating the pervasive nature of undecidability.
  • The Kolmogorov Complexity Method offers an alternative approach to proving undecidability, showing that the problem of calculating the Kolmogorov complexity of an arbitrary string is algorithmically unsolvable. This result has deep connections to the limits of formal proof systems.
  • The ultimate consequence of computability theory is that there can be no general-purpose algorithm to verify the correctness or behavior of all computer programs, nor to solve many other fundamental problems.

Previous Chapter: Chapter 4 - Turing Machines Next Up: Chapter 6 - Complexity Theory

Exercises

Exercise 5.1

Prove that the set of all integers is countable.

Exercise 5.2

Prove that the set of all positive rational numbers is countable.

Exercise 5.3

Prove that the language is undecidable.

Exercise 5.4

Is the language decidable?

Exercise 5.5

Prove that the problem of generating the first word with for every number is an algorithmically unsolvable problem.


Key Takeaways

  • The Limits of Algorithms: Computability theory establishes that there are fundamental problems that no algorithm can solve, regardless of computational power or time.
  • Countability and Uncountability: The diagonalization method demonstrates that the set of all possible problems (languages) is infinitely larger than the set of all possible algorithms (Turing machines), guaranteeing the existence of unsolvable problems.
  • Diagonalization as a Proof Technique: This method constructs a specific problem () that cannot be recognized by any Turing machine, serving as the initial benchmark for undecidability.
  • Reduction as a General Tool: Reduction is the primary technique to prove new problems are undecidable by showing they are “at least as hard” as a known undecidable problem.
  • Core Undecidable Problems: The Universal Language () and the Halting Problem () are central examples of undecidable problems, illustrating that even seemingly simple questions about program behavior are uncomputable.
  • Rice’s Theorem: This powerful theorem generalizes many undecidability results, stating that any non-trivial, semantic property of Turing machines is undecidable. It implies that automatic verification of what a program does is generally impossible.
  • Kolmogorov Complexity and Undecidability: The inability to algorithmically compute Kolmogorov complexity provides another avenue for proving undecidability and highlights the limits of formal systems, even suggesting the existence of true mathematical statements that are unprovable.
  • Profound Implications: The existence of undecidable problems has profound implications for computer science, mathematics, and philosophy, defining the inherent boundaries of what can be automated and formally proven.