In the previous chapter, we established the limits of what is computable. We now know that some problems, like the Halting Problem, are undecidable; no algorithm can solve them. But among the problems that are solvable, do they all present equivalent computational difficulty?
Consider sorting a list versus finding the optimal route for a traveling salesperson visiting thousands of cities. Both are solvable, but the latter feels intuitively harder. An algorithm for sorting might finish in seconds, while the best-known algorithm for the traveling salesperson problem could run for centuries on the fastest supercomputers.
Complexity theory addresses this gap. It provides a framework for classifying decidable problems based on the computational resources, primarily time and memory (space), required to solve them. Its central goal is to distinguish between problems that are “practically solvable” and those that are “intractable,” even if they are theoretically decidable. This chapter introduces the foundational concepts of time and space complexity, the crucial complexity classes P and NP, and the theory of NP-completeness, which provides a powerful tool for identifying the hardest problems in NP.
6.1 Aims
By the end of this chapter, you will be able to:
- Measure Complexity: Understand how time and space complexity are formally defined for Turing machines, emphasizing worst-case analysis and the robustness of these measures.
- Use Asymptotic Notation: Effectively use $O$, $\Omega$, and $\Theta$ notation to analyze and compare the growth rates of algorithms, abstracting away less significant details.
- Understand the Class P: Grasp the definition of the class P, its significance as the formal model for “efficiently solvable” problems, and its robustness across computational models.
- Understand the Class NP: Learn the two equivalent definitions of NP: via nondeterministic Turing machines and via polynomial-time verifiers, and appreciate its connection to proof verification.
- Grasp NP-Completeness: Understand the concepts of polynomial-time reduction, NP-hardness, and NP-completeness, and appreciate the significance of the P vs. NP problem and the Cook-Levin Theorem.
- Explore Complexity Hierarchy: Understand the relationships and hierarchy between fundamental complexity classes like DLOG, NLOG, P, NP, PSPACE, and EXPTIME.
6.2 Complexity Measures
To analyze the difficulty of a problem, we first need a formal way to measure the resources used by an algorithm. We use the multi-tape Turing machine as our standard model for abstract complexity theory. This model is simple enough for theoretical analysis yet robust enough to reflect the behavior of real computers.
6.2.1 Time Complexity
Definition 6.1 (Time Complexity)
Let $M$ be a multi-tape TM that halts on all inputs.
- The time complexity of $M$ on input $w$, denoted $\mathrm{Time}_M(w)$, is the number of computational steps in the computation of $M$ on $w$.
- The time complexity of $M$, denoted $\mathrm{Time}_M(n)$, is a function of the input length $n$: it is the maximum of $\mathrm{Time}_M(w)$ over all inputs $w$ of length $n$. This is known as worst-case analysis. We focus on the worst case because it yields a reliable upper bound on resource consumption: the algorithm is guaranteed never to take more than $\mathrm{Time}_M(n)$ steps on any input of length $n$. While average-case analysis can be useful, worst-case analysis is often simpler and provides stronger guarantees.
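To make worst-case counting concrete, here is a small illustration (hypothetical, not from the text): linear search performs at most $n$ comparisons on inputs of length $n$, and that bound is attained exactly when the key is absent.

```python
def linear_search_steps(items, key):
    """Scan left to right; return (found, number of comparisons used)."""
    steps = 0
    for item in items:
        steps += 1
        if item == key:
            return True, steps
    return False, steps

print(linear_search_steps([3, 1, 4, 1, 5], 4))  # (True, 3)
print(linear_search_steps([3, 1, 4, 1, 5], 9))  # (False, 5) -- the worst case
```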
6.2.2 Space Complexity
Definition 6.2 (Space Complexity)
Let $M$ be a multi-tape TM that halts on all inputs.
- The space complexity of $M$ on input $w$, denoted $\mathrm{Space}_M(w)$, is the maximum number of cells on any single work tape that are accessed by a head during the computation of $M$ on $w$.
- The space complexity of $M$, denoted $\mathrm{Space}_M(n)$, is the maximum of $\mathrm{Space}_M(w)$ over all inputs $w$ of length $n$. Note that space complexity refers only to the work tapes, excluding the read-only input tape; this convention is what makes sublinear (e.g., logarithmic) space bounds meaningful. The definition focuses on the maximum usage of a single tape, but this is equivalent, up to a constant factor, to counting the sum of all tape usages, since a multi-tape TM can be simulated by a single-tape TM with only a constant-factor increase in space (Lemma 6.1).
6.2.3 Robustness of Complexity Measures
The specific choice of the multi-tape Turing machine model and the definition of time/space complexity are robust. This means that fundamental results about complexity defined in this way also apply to the complexity of executing programs in arbitrary programming languages.
- Constant Factor Improvements: It’s possible to reduce time or space complexity by a constant factor by using a larger tape alphabet (Lemma 6.2, Task 6.1). This means that constant factors are generally ignored in complexity theory, as they don’t change the fundamental scaling behavior.
- Logarithmic vs. Uniform Cost Measures:
- Uniform Cost Measure: Each basic operation (e.g., arithmetic, comparison) costs 1 unit, regardless of the size of the operands. This is simple but can be unrealistic if operand sizes grow significantly.
- Logarithmic Cost Measure: The cost of an operation depends on the length of the binary representation of the operands. This is more realistic for arbitrary-precision arithmetic but can be more cumbersome to analyze. Complexity theory primarily uses the uniform cost measure for basic TM operations, but the robustness of the model ensures that the asymptotic classifications remain valid.
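The gap between the two measures is easy to exhibit. In the sketch below (an illustration, not part of the formal model), repeated squaring performs only $n$ "unit" operations, yet the operands double in bit-length at every step, so the logarithmic-cost total grows exponentially in $n$:

```python
x = 2
for step in range(1, 6):
    x = x * x  # one operation under the uniform cost measure
    print(f"after {step} squarings: {x.bit_length()} bits")
# The bit-length roughly doubles each step (3, 5, 9, 17, 33 bits here), so
# the logarithmic cost of step k is about 2**k: n squarings cost n units
# under the uniform measure but Theta(2**n) under the logarithmic measure.
```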
6.2.4 Asymptotic Notation
Exact complexity functions can be messy. In complexity theory, we are usually more interested in the rate of growth of the complexity as the input size increases. Asymptotic notation allows us to abstract away constant factors and lower-order terms, focusing on the dominant behavior for large inputs.
For functions $f, g : \mathbb{N} \to \mathbb{N}$:
- Big-O ($O$): $f(n) \in O(g(n))$ means $f$ grows no faster than $g$. Formally, there exist constants $c > 0$ and $n_0 \in \mathbb{N}$ such that for all $n \geq n_0$, $f(n) \leq c \cdot g(n)$.
- Big-Omega ($\Omega$): $f(n) \in \Omega(g(n))$ means $f$ grows at least as fast as $g$. Formally, there exist constants $c > 0$ and $n_0 \in \mathbb{N}$ such that for all $n \geq n_0$, $f(n) \geq c \cdot g(n)$.
- Theta ($\Theta$): $f(n) \in \Theta(g(n))$ means $f$ grows at the same rate as $g$. Formally, $f(n) \in O(g(n))$ and $f(n) \in \Omega(g(n))$.
- Little-o ($o$): $f(n) \in o(g(n))$ means $f$ grows strictly slower than $g$. Formally, $\lim_{n \to \infty} f(n)/g(n) = 0$.
For example, an algorithm whose exact step count is $3n^2 + 5n$ is said to run in $O(n^2)$ time.
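As a worked check of the definition (with illustrative constants): for $f(n) = 3n^2 + 5n$ we have

$$3n^2 + 5n \le 3n^2 + n^2 = 4n^2 \quad \text{for all } n \ge 5,$$

so $f(n) \in O(n^2)$ with witnesses $c = 4$ and $n_0 = 5$. Since also $f(n) \geq 3n^2$ for all $n \geq 1$, we get $f(n) \in \Omega(n^2)$ and hence $f(n) \in \Theta(n^2)$.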
6.3 Complexity Classes and the Class P
For the definition of complexity classes, we use the model of the multi-tape Turing machine. We consider complexity classes here only as language classes, i.e., as sets of decision problems.
Definition 6.5 (Basic Complexity Classes)
For a function $f : \mathbb{N} \to \mathbb{N}$, let $\mathrm{TIME}(f)$ denote the class of languages decided by some MTM $M$ with $\mathrm{Time}_M(n) \in O(f(n))$, and $\mathrm{SPACE}(f)$ the class of languages decided by some MTM $M$ with $\mathrm{Space}_M(n) \in O(f(n))$. The central deterministic classes are:
- $\mathrm{P} = \bigcup_{c \in \mathbb{N}} \mathrm{TIME}(n^c)$ (Polynomial Time)
- $\mathrm{PSPACE} = \bigcup_{c \in \mathbb{N}} \mathrm{SPACE}(n^c)$ (Polynomial Space)
- $\mathrm{EXPTIME} = \bigcup_{c \in \mathbb{N}} \mathrm{TIME}(2^{n^c})$ (Exponential Time)
6.3.1 The Class P: Efficiently Solvable Problems
The most important distinction in complexity theory is between algorithms that run in polynomial time and those that run in exponential time.
| $n$ | $n^2$ | $n^3$ | $2^n$ |
|---:|---:|---:|---:|
| 10 | 100 | 1,000 | 1,024 |
| 20 | 400 | 8,000 | ~1 million |
| 50 | 2,500 | 125,000 | ~$10^{15}$ |
| 100 | 10,000 | 1,000,000 | ~$10^{30}$ |
An algorithm with exponential complexity quickly becomes unusable for even moderately sized inputs. This observation leads to the definition of the class of “efficiently solvable” problems.
P is the class of all languages that are decidable by a deterministic Turing machine in polynomial time. A problem is in P if there exists an algorithm that solves it in $O(n^c)$ time for some constant $c$. The class P is considered the formal counterpart to the intuitive notion of problems that are practically or efficiently solvable.
Robustness of P: The definition of P is robust. It remains the same whether we use single-tape, multi-tape, or other "reasonable" deterministic models, because each such model can simulate the others with at most polynomial overhead.
6.3.2 Relationships Between Deterministic Complexity Classes
There are fundamental relationships between these classes:
- $\mathrm{TIME}(f) \subseteq \mathrm{SPACE}(f)$: Any TM that runs in time $f(n)$ can visit at most $f(n)$ tape cells, so it uses at most $f(n)$ space (Lemma 6.3).
- $\mathrm{P} \subseteq \mathrm{PSPACE}$: This follows directly from the above.
- $\mathrm{SPACE}(f) \subseteq \mathrm{TIME}(2^{O(f)})$: Any TM that uses space $f(n)$ can be simulated in exponential time (Theorem 6.2). This is because the number of possible configurations in space $f(n)$ is only exponential in $f(n)$, and a TM cannot repeat a configuration in a halting computation.
- $\mathrm{PSPACE} \subseteq \mathrm{EXPTIME}$: This follows directly from the above.
Combining these, we get the hierarchy $\mathrm{P} \subseteq \mathrm{PSPACE} \subseteq \mathrm{EXPTIME}$. For each of these inclusions individually, it is unknown whether it is proper. However, some proper inclusions must exist, as shown by the hierarchy theorems (e.g., $\mathrm{P} \subsetneq \mathrm{EXPTIME}$ and $\mathrm{NLOG} \subsetneq \mathrm{PSPACE}$). These theorems prove that with sufficiently larger resource bounds, strictly more problems can be solved.
6.3.3 Constructible Functions
The definition of complexity classes often relies on constructible functions, which are functions whose values can be computed efficiently.
Definition 6.6 (Space- and Time-Constructible Functions)
- A function $s : \mathbb{N} \to \mathbb{N}$ is space-constructible if there exists a 1-tape TM $M$ such that, for any input of length $n$, $M$ writes the value $s(n)$ (in unary, as $0^{s(n)}$) on its tape and halts, using $O(s(n))$ space.
- A function $t : \mathbb{N} \to \mathbb{N}$ is time-constructible if there exists an MTM $M$ such that, for any input of length $n$, $M$ writes the value $t(n)$ on its tape and halts, using $O(t(n))$ time.
Most common functions, such as $\log n$ (for space), $n^c$, $n \log n$, and $2^n$, are constructible. These functions are important because they allow us to define complexity classes with well-behaved bounds: a simulating machine can first mark off the allowed amount of tape or count down the allowed number of steps.
6.4 Nondeterministic Complexity Measures
Nondeterministic Turing machines (NTMs) can perform many different computations on an input. To define their complexity, we take an optimistic view: an NTM always chooses the best possible option.
Definition 6.7 (Nondeterministic Time and Space Complexity)
Let $M$ be an NTM or a nondeterministic MTM.
- For $x \in L(M)$, the nondeterministic time complexity of $M$ on $x$, $\mathrm{Time}_M(x)$, is the length of a shortest accepting computation of $M$ on $x$.
- The nondeterministic time complexity of $M$ is $\mathrm{Time}_M(n) = \max\{\mathrm{Time}_M(x) \mid x \in L(M), |x| = n\}$.
- For $x \in L(M)$, the nondeterministic space complexity of $M$ on $x$, $\mathrm{Space}_M(x)$, is the minimum space used by an accepting computation of $M$ on $x$.
- The nondeterministic space complexity of $M$ is $\mathrm{Space}_M(n) = \max\{\mathrm{Space}_M(x) \mid x \in L(M), |x| = n\}$.
Definition 6.8 (Nondeterministic Complexity Classes)
For functions $f, g : \mathbb{N} \to \mathbb{N}$, let $\mathrm{NTIME}(f)$ and $\mathrm{NSPACE}(g)$ denote the classes of languages accepted by nondeterministic MTMs with time complexity $O(f(n))$ and space complexity $O(g(n))$, respectively. Then:
- $\mathrm{NP} = \bigcup_{c \in \mathbb{N}} \mathrm{NTIME}(n^c)$ (Nondeterministic Polynomial Time)
- $\mathrm{NPSPACE} = \bigcup_{c \in \mathbb{N}} \mathrm{NSPACE}(n^c)$ (Nondeterministic Polynomial Space)
6.4.1 Relationships Between Deterministic and Nondeterministic Classes
- $\mathrm{P} \subseteq \mathrm{NP}$ and $\mathrm{PSPACE} \subseteq \mathrm{NPSPACE}$: Any deterministic TM is also a nondeterministic TM, so deterministic classes are subsets of their nondeterministic counterparts.
- $\mathrm{NTIME}(f) \subseteq \mathrm{NSPACE}(f)$: If an NTM runs in time $f(n)$, it can use at most $f(n)$ space.
- $\mathrm{NSPACE}(f) \subseteq \mathrm{TIME}(2^{O(f)})$: An NTM using space $f(n)$ can be simulated by a deterministic TM in exponential time (Lemma 6.6).
- Savitch's Theorem (Theorem 6.7): For any space-constructible function $s(n) \geq \log n$, $\mathrm{NSPACE}(s(n)) \subseteq \mathrm{SPACE}(s(n)^2)$. This is a powerful result showing that nondeterministic space can be simulated by deterministic space with only a quadratic (in particular, polynomial) overhead; a recursive sketch of the idea follows this list.
- Consequence: $\mathrm{PSPACE} = \mathrm{NPSPACE}$. Nondeterminism does not add power to polynomial space.
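The following is a minimal sketch (Python, hypothetical names) of the divide-and-conquer idea behind Savitch's theorem: reachability within $2^t$ steps of the configuration graph is decided by recursing on a midpoint configuration. The recursion depth is $t$ and each level stores only one candidate midpoint, which is where the quadratic deterministic space bound comes from.

```python
def reachable(configs, succ, c1, c2, t):
    """Can configuration c2 be reached from c1 in at most 2**t steps?

    Recurse through a midpoint m: c1 ->(2**(t-1)) m ->(2**(t-1)) c2.
    Space used is O(t * size-of-one-configuration), not the full graph.
    """
    if t == 0:
        return c1 == c2 or c2 in succ(c1)
    return any(reachable(configs, succ, c1, m, t - 1) and
               reachable(configs, succ, m, c2, t - 1)
               for m in configs)

# Tiny example: the path graph 0 -> 1 -> 2 -> 3.
configs = [0, 1, 2, 3]
succ = lambda c: {c + 1} if c < 3 else set()
print(reachable(configs, succ, 0, 3, 2))  # True: 3 steps <= 2**2
```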
The overall hierarchy of complexity classes, including nondeterministic ones, is: $\mathrm{DLOG} \subseteq \mathrm{NLOG} \subseteq \mathrm{P} \subseteq \mathrm{NP} \subseteq \mathrm{PSPACE} = \mathrm{NPSPACE} \subseteq \mathrm{EXPTIME}$. Again, for each of these inclusions (except $\mathrm{PSPACE} = \mathrm{NPSPACE}$), it is unknown whether it is proper.
6.5 The Class NP and Proof Verification
The relationship between P and NP is one of the most central and profound questions in theoretical computer science. It touches upon the very nature of problem-solving and the efficiency of verification versus discovery.
6.5.1 NP as Verifiability
Many problems for which we don’t know a polynomial-time algorithm share an interesting property: if provided with a potential solution, you can quickly verify if it’s correct.
Consider the Hamiltonian Path problem: given a graph, is there a path that visits every node exactly once? Finding such a path seems to require trying many possibilities. But if someone gives you a path, you can easily check in polynomial time if it’s a valid Hamiltonian path.
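To see how cheap the check is, here is a minimal verifier sketch (Python; the adjacency-set representation and names are illustrative assumptions):

```python
def is_hamiltonian_path(adj, path):
    """Verify in polynomial time that `path` is a Hamiltonian path.

    adj: dict mapping each vertex to the set of its neighbours.
    path: the certificate -- a proposed ordering of the vertices.
    """
    # The path must visit every vertex exactly once.
    if len(path) != len(adj) or set(path) != set(adj):
        return False
    # Every pair of consecutive vertices must be joined by an edge.
    return all(v in adj[u] for u, v in zip(path, path[1:]))

# A 4-cycle a-b-c-d contains the Hamiltonian path a, b, c, d:
adj = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
print(is_hamiltonian_path(adj, ["a", "b", "c", "d"]))  # True
print(is_hamiltonian_path(adj, ["a", "c", "b", "d"]))  # False: no edge a-c
```

The check runs in time roughly quadratic in the number of vertices, in sharp contrast to the apparent difficulty of finding such a path.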
This “guess and check” structure is the intuition behind the class NP. The class NP can be equivalently defined using the concept of a verifier.
Definition 6.9 (Polynomial-Time Verifier)
A verifier for a language $L$ is a deterministic algorithm $V$ that takes two inputs: a string $x$ (the problem instance) and a string $c$ (a "certificate" or "witness", i.e., a potential solution). The verifier accepts the pair $(x, c)$ if and only if $c$ is a valid certificate proving that $x \in L$. The verifier must run in polynomial time with respect to the length of $x$, and the length of the certificate must also be polynomial in the length of $x$.
The class VP is the set of all languages that have a polynomial-time verifier.
Theorem 6.8 (Equivalence of NP and VP)
A language is in NP if and only if it has a polynomial-time verifier.
Proof Idea
- $\mathrm{NP} \subseteq \mathrm{VP}$: If $L \in \mathrm{NP}$, then there exists a polynomial-time NTM $M$ that accepts $L$. The verifier $V$ for $L$ takes $(x, c)$ as input. The certificate $c$ is an encoding of the sequence of nondeterministic choices that $M$ makes to accept $x$. $V$ simulates $M$ on $x$, following the choices specified by $c$. If $M$ accepts, $V$ accepts. Since $M$ runs in polynomial time, $V$ also runs in polynomial time.
- $\mathrm{VP} \subseteq \mathrm{NP}$: If $L \in \mathrm{VP}$, then there exists a polynomial-time verifier $V$ for $L$. An NTM $M$ for $L$ works as follows: on input $x$, $M$ nondeterministically "guesses" a certificate $c$ (of polynomial length). Then, $M$ runs $V$ on $(x, c)$. If $V$ accepts, $M$ accepts. Since $V$ runs in polynomial time, $M$ also runs in polynomial time.
This theorem is fundamental. It provides two equivalent ways to think about NP: as problems solvable by a polynomial-time NTM, or as problems where a proposed solution can be verified in deterministic polynomial time.
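The verifier view also makes the naive deterministic simulation of an NTM concrete: enumerate every certificate up to the polynomial length bound and run the verifier on each. A hedged sketch (hypothetical names) follows; it runs in exponential time, matching the inclusion $\mathrm{NP} \subseteq \mathrm{EXPTIME}$.

```python
from itertools import product

def decide_by_search(x, verifier, max_cert_len):
    """Decide membership deterministically, given a poly-time verifier.

    Tries all binary certificates of length 0..max_cert_len -- about
    2**(max_cert_len + 1) candidates, hence exponential time overall.
    """
    for length in range(max_cert_len + 1):
        for bits in product("01", repeat=length):
            if verifier(x, "".join(bits)):
                return True
    return False

# Toy use: x is in L iff some certificate equals x reversed.
verifier = lambda x, c: c == x[::-1]
print(decide_by_search("01", verifier, 2))  # True (certificate "10")
```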
6.5.2 The P versus NP Problem
Clearly, if a problem is in P, it is also in NP (a deterministic TM is just a special case of an NTM, and a solution can be verified by simply computing it). This gives us the inclusion $\mathrm{P} \subseteq \mathrm{NP}$. (It's important to remember that "NP" stands for Nondeterministic Polynomial time, not "Not Polynomial" time. Problems in NP are not necessarily hard; they just have efficiently verifiable solutions.)
The most famous open question in computer science, and one of the Millennium Prize Problems, is whether this inclusion is proper, i.e., whether $\mathrm{P} \subsetneq \mathrm{NP}$. If P = NP, it would mean that any problem for which a solution can be efficiently verified can also be efficiently solved. This would have staggering consequences, revolutionizing fields from cryptography to medicine to artificial intelligence. However, the overwhelming consensus among computer scientists is that $\mathrm{P} \neq \mathrm{NP}$. This belief rests largely on the failure, despite decades of effort, to find polynomial-time algorithms for thousands of well-studied problems in NP.
P vs. NP and Proofs: The P vs. NP question can be rephrased in terms of mathematical proofs:
- P: Corresponds to problems where finding a proof (solution) is efficient.
- NP: Corresponds to problems where verifying a given proof (solution) is efficient.
The question "Does P = NP?" is equivalent to asking: "Is it as easy to find a mathematical proof as it is to verify one?" Most mathematicians and computer scientists believe that finding proofs is inherently harder than verifying them.
6.6 NP-Completeness
Within the vast landscape of NP, some problems are special: they are the “hardest” problems in the class. If we could solve any one of them efficiently, we could solve all problems in NP efficiently. These are the NP-complete problems.
To formalize this, we need the concept of a polynomial-time reduction.
Definition 6.10 (Polynomial-Time Reduction)
A language $L_1$ is polynomial-time reducible to a language $L_2$, written $L_1 \leq_p L_2$, if there exists a polynomial-time computable function $f$ that transforms instances of $L_1$ into instances of $L_2$ such that for all $x$: $x \in L_1 \iff f(x) \in L_2$. This means that if we have a "black box" solver for $L_2$, we can solve $L_1$ by transforming its input and feeding it to the solver. The reduction itself must be efficient (polynomial time).
Definition 6.11 (NP-Hard and NP-Complete)
- A language $L$ is NP-hard if every language in NP is polynomial-time reducible to it ($L' \leq_p L$ for all $L' \in \mathrm{NP}$).
- A language $L$ is NP-complete if it is NP-hard and it is also in NP itself.
The NP-complete problems form a crucial equivalence class. If any NP-complete problem is found to be in P, then it would follow that P = NP. Conversely, if $\mathrm{P} \neq \mathrm{NP}$, then no NP-complete problem can be solved in polynomial time.
6.6.1 The First NP-Complete Problem: SAT (Cook-Levin Theorem)
How do we find the first NP-complete problem? We need to show that every problem in NP can be reduced to it. This was the monumental achievement of Stephen Cook and Leonid Levin.
Theorem 6.9 (Cook-Levin Theorem)
The Boolean Satisfiability Problem (SAT) is NP-complete.
Proof Idea
The proof constructs a polynomial-time reduction from an arbitrary language $L \in \mathrm{NP}$ to SAT.
- Since $L \in \mathrm{NP}$, there exists a polynomial-time NTM $M$ that accepts $L$. Let $p(n)$ be the polynomial time bound for $M$.
- For any input $w$ to $M$, we construct a Boolean formula $\phi_w$ that is satisfiable if and only if $M$ accepts $w$. The formula encodes the entire computation of $M$ on $w$.
- The variables in $\phi_w$ represent the contents of $M$'s tape, the head position, and the internal state at each time step up to $p(|w|)$. For example:
  - $C_{i,s,t}$: true if the $i$-th tape cell contains symbol $s$ at time $t$.
  - $Q_{q,t}$: true if $M$ is in state $q$ at time $t$.
  - $H_{i,t}$: true if $M$'s head is at position $i$ at time $t$.
- The formula $\phi_w$ is a conjunction of groups of clauses that enforce:
  - Initial Configuration: The configuration of $M$ at time $t = 0$ corresponds to the start state and the input $w$.
  - Unique State/Position/Symbol: At any time $t$, $M$ is in exactly one state, its head is at exactly one position, and each tape cell contains exactly one symbol.
  - Valid Transitions: The configuration at time $t + 1$ must follow from the configuration at time $t$ according to $M$'s transition function. This is the most complex part, encoding all possible moves.
  - Accepting State: $M$ reaches an accepting state at some time $t \leq p(|w|)$.
- The size of $\phi_w$ is polynomial in $p(|w|)$, and thus polynomial in $|w|$. The construction of $\phi_w$ can be done in polynomial time.
This reduction shows that if we could solve SAT in polynomial time, we could solve any problem in NP in polynomial time. Since SAT is also in NP (a satisfying assignment can be verified in polynomial time), it is NP-complete.
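As a taste of how the clause groups look, here is a small sketch (a hypothetical encoding, not the text's exact construction) that emits CNF clauses forcing exactly one variable in a group to be true, as needed for the "Unique State/Position/Symbol" constraints. Note that the number of clauses is only quadratic in the group size, which keeps $\phi_w$ polynomial.

```python
from itertools import combinations

def exactly_one(variables):
    """CNF clauses asserting that exactly one of `variables` is true.

    Literals are variable names; a leading '-' marks negation.
    One 'at least one' clause plus C(n, 2) 'at most one' clauses.
    """
    at_least_one = [list(variables)]
    at_most_one = [[f"-{a}", f"-{b}"] for a, b in combinations(variables, 2)]
    return at_least_one + at_most_one

# "M is in exactly one of the states q0, q1, q_acc at time t = 3":
print(exactly_one([f"Q_{q}_3" for q in ("q0", "q1", "qacc")]))
# [['Q_q0_3', 'Q_q1_3', 'Q_qacc_3'], ['-Q_q0_3', '-Q_q1_3'],
#  ['-Q_q0_3', '-Q_qacc_3'], ['-Q_q1_3', '-Q_qacc_3']]
```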
Once we have one NP-complete problem, we can find others by reduction. Thousands of important problems have been proven to be NP-complete, including:
- 3-SAT: A restricted version of SAT where each clause has exactly three literals.
- CLIQUE: Given a graph $G$ and an integer $k$, does $G$ contain a complete subgraph on $k$ vertices?
- VERTEX-COVER: Given a graph $G$ and an integer $k$, does $G$ have a set of $k$ vertices that touches every edge?
- HAM-PATH: Given a graph $G$, does $G$ contain a Hamiltonian path (a path that visits every node exactly once)?
- TSP (Decision Version): Given a list of cities with pairwise distances and an integer $k$, is there a tour that visits every city exactly once and has a total cost of at most $k$?
6.6.2 NP-Hardness for Optimization Problems
The concept of NP-hardness can be extended to optimization problems. An optimization problem is typically NP-hard if its decision version (e.g., "is there a solution with cost at most $k$?") is NP-hard.
Definition 6.14 (NP-Hard Optimization Problem)
An optimization problem $U$ is NP-hard if its corresponding threshold language $\mathrm{Lang}_U$ is NP-hard, where:
- $\mathrm{Lang}_U = \{(x, k) \mid x$ has a feasible solution with cost at most $k\}$ (for minimization problems)
- $\mathrm{Lang}_U = \{(x, k) \mid x$ has a feasible solution with cost at least $k\}$ (for maximization problems)
This allows us to classify optimization problems as NP-hard, implying that finding an optimal solution is likely intractable if $\mathrm{P} \neq \mathrm{NP}$.
Summary
- Complexity Theory classifies solvable problems by the resources (time, space) required to solve them, distinguishing between practical and intractable problems.
- Time and Space Complexity are measured using multi-tape Turing machines, focusing on worst-case asymptotic behavior.
- The class P consists of problems solvable in deterministic polynomial time and is the formal model for “efficiently solvable” problems.
- The class NP consists of problems whose solutions can be verified in polynomial time (equivalently, solvable by a nondeterministic TM in polynomial time).
- The P vs. NP question, which asks if P = NP, is the most important open problem in computer science, with profound implications for many fields.
- NP-complete problems are the hardest problems in NP. The discovery of a polynomial-time algorithm for any one of them would imply P = NP.
- The Cook-Levin Theorem established SAT as the first NP-complete problem, providing a starting point for proving the NP-completeness of thousands of other problems.
- The hierarchy of complexity classes ($\mathrm{DLOG} \subseteq \mathrm{NLOG} \subseteq \mathrm{P} \subseteq \mathrm{NP} \subseteq \mathrm{PSPACE} \subseteq \mathrm{EXPTIME}$) provides a structured view of computational difficulty.
Exercises
Exercise 6.1 (CLIQUE is in NP)
A $k$-clique in a graph $G = (V, E)$ is a set of $k$ vertices in which every two vertices are connected by an edge. The language is defined as $\mathrm{CLIQUE} = \{(G, k) \mid G$ is an undirected graph that contains a $k$-clique$\}$. Show that CLIQUE is in NP by describing a polynomial-time verifier for it.
Solution
To show that CLIQUE is in NP, we describe a polynomial-time verifier $V$ that takes an instance $(G, k)$ and a certificate $c$.
- Input to Verifier: $((G, k), c)$.
- Certificate $c$: The certificate is a list of $k$ vertices, say $v_1, \ldots, v_k$, which are claimed to form a $k$-clique in $G$.
- Verification Algorithm $V$:
  a. Check that $c$ contains $k$ distinct vertices from the vertex set of $G$. If not, reject.
  b. For every pair of distinct vertices $v_i, v_j$ in $c$, check that the edge $\{v_i, v_j\}$ is in $E$.
  c. If all pairs of vertices in $c$ are connected by an edge in $G$, then $V$ accepts. Otherwise, $V$ rejects.
Runtime Analysis:
- Step (a) takes polynomial time (e.g., $O(k^2 + k \cdot |V|)$ with naive scans to check distinctness and membership).
- Step (b) involves checking $\binom{k}{2}$ pairs. Since $k \leq |V|$, this is at most $O(|V|^2)$ checks. Each check involves looking up an edge in $G$, which can be done in polynomial time (e.g., $O(1)$ with an adjacency matrix, or $O(|V|)$ with adjacency lists, depending on the implementation).
- Overall, the verification algorithm runs in time polynomial in the size of $(G, k)$ and $c$.
Since a polynomial-time verifier exists, CLIQUE is in NP.
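A compact sketch of this verifier (Python; the edge-set representation is an assumption for illustration):

```python
from itertools import combinations

def verify_clique(edges, vertices, k, cert):
    """Polynomial-time verifier for CLIQUE.

    edges: set of frozensets {u, v} (the edge set of G).
    vertices: the vertex set of G.
    cert: list of vertices claimed to form a k-clique.
    """
    # Step (a): k distinct vertices of G.
    if len(set(cert)) != k or not set(cert) <= set(vertices):
        return False
    # Step (b): every pair of certificate vertices must be an edge.
    return all(frozenset(p) in edges for p in combinations(cert, 2))

V = {1, 2, 3, 4}
E = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (3, 4)]}
print(verify_clique(E, V, 3, [1, 2, 3]))  # True: {1, 2, 3} is a triangle
print(verify_clique(E, V, 3, [2, 3, 4]))  # False: {2, 4} is not an edge
```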
Exercise 6.2 (P=NP and Cryptography)
Explain why the statement “If P = NP, then all modern public-key cryptography is broken” is likely true.
Solution
Modern public-key cryptography (e.g., RSA, elliptic curve cryptography) relies on the computational difficulty of certain mathematical problems. These problems typically have the characteristic that:
- Generating keys/encrypting/decrypting with the private key is efficient.
- Breaking the encryption (e.g., finding the private key from the public key) is believed to be computationally intractable.
Many of these “hard” problems, such as integer factorization (for RSA) or the discrete logarithm problem, are known to be in NP. This means that if you are given a potential solution (e.g., the prime factors of a large number), you can verify its correctness in polynomial time.
If P = NP, it would imply that any problem whose solution can be efficiently verified can also be efficiently solved. Therefore, if P = NP, there would exist polynomial-time algorithms for problems like integer factorization and discrete logarithm. An attacker could then efficiently:
- Factor large numbers: Break RSA by finding the private key from the public modulus.
- Solve discrete logarithms: Break Diffie-Hellman key exchange and other related cryptosystems.
This would effectively render most widely used public-key cryptographic systems insecure, as the underlying “hard” problems would become “easy.”
Exercise 6.3 (Reductions and Membership in P)
Show that if $L_1 \leq_p L_2$ and $L_2 \in \mathrm{P}$, then $L_1 \in \mathrm{P}$.
Solution
To show $L_1 \in \mathrm{P}$, we construct a deterministic polynomial-time algorithm for $L_1$.
- Since $L_2 \in \mathrm{P}$, there exists a deterministic Turing machine $M_2$ that decides $L_2$ in polynomial time. Let its time complexity be $O(n^{c_2})$ for some constant $c_2$.
- Since $L_1 \leq_p L_2$, there exists a polynomial-time computable function $f$ that transforms instances of $L_1$ into instances of $L_2$. Let the time complexity of computing $f$ be $O(n^{c_f})$ for some constant $c_f$. The length of the output of $f$ is then also polynomially bounded, $|f(x)| \in O(|x|^{c_f})$, since $f$ can write at most one symbol per step.
Now, we construct an algorithm $M_1$ to decide $L_1$ on input $x$:
a. Compute $y = f(x)$. This step takes $O(|x|^{c_f})$ time, and $|y| \in O(|x|^{c_f})$.
b. Run $M_2$ on input $y$. This step takes $O(|y|^{c_2}) = O(|x|^{c_f \cdot c_2})$ time.
c. $M_1$ accepts $x$ if and only if $M_2$ accepts $y$.
The total time complexity of $M_1$ is $O(|x|^{c_f} + |x|^{c_f \cdot c_2})$, which is a polynomial in $|x|$ since $c_f$ and $c_2$ are constants. Therefore, $L_1 \in \mathrm{P}$.
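In code, the whole argument is a two-line composition; a minimal sketch with hypothetical stand-ins `f` and `M2`:

```python
def decide_L1(x, f, M2):
    """Decide x in L1, given a poly-time reduction f (L1 <=p L2) and a
    poly-time decider M2 for L2. Time: time(f) + time(M2 on f(x)); the
    composition of two polynomials is again a polynomial."""
    return M2(f(x))

# Toy instance: L1 = binary strings of even length, L2 = even numbers,
# reduced by mapping a string to its length.
f = len
M2 = lambda n: n % 2 == 0
print(decide_L1("1010", f, M2))  # True
print(decide_L1("101", f, M2))   # False
```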
Exercise 6.4 (NP-Hardness of MAX-SAT)
Prove that MAX-SAT is NP-hard.
Solution
To prove that an optimization problem is NP-hard, we show that its corresponding threshold language is NP-hard. For MAX-SAT, the threshold language is $\mathrm{Lang}_{\text{MAX-SAT}} = \{(\phi, k) \mid \phi$ is a CNF formula for which some assignment satisfies at least $k$ clauses$\}$. We will show that $\mathrm{SAT} \leq_p \mathrm{Lang}_{\text{MAX-SAT}}$.
- Input to Reduction: An instance of SAT, i.e., a Boolean formula $\phi$ in CNF. Let $\phi$ have $m$ clauses.
- Reduction Function $f$: The reduction takes $\phi$ as input and outputs the pair $(\phi, m)$. This function is clearly computable in polynomial time (it just counts the clauses).
- Equivalence: We need to show that $\phi \in \mathrm{SAT} \iff (\phi, m) \in \mathrm{Lang}_{\text{MAX-SAT}}$.
  - ($\Rightarrow$): If $\phi \in \mathrm{SAT}$, then there exists a satisfying assignment for $\phi$. This assignment satisfies all $m$ clauses, so $(\phi, m) \in \mathrm{Lang}_{\text{MAX-SAT}}$.
  - ($\Leftarrow$): If $(\phi, m) \in \mathrm{Lang}_{\text{MAX-SAT}}$, then some assignment satisfies at least $m$ clauses. Since $\phi$ has only $m$ clauses, that assignment satisfies all of them. Therefore, $\phi$ is satisfiable, so $\phi \in \mathrm{SAT}$.
Since SAT is NP-complete (and thus NP-hard), and we have a polynomial-time reduction from SAT to $\mathrm{Lang}_{\text{MAX-SAT}}$, it follows that $\mathrm{Lang}_{\text{MAX-SAT}}$ is NP-hard. Therefore, MAX-SAT is NP-hard.
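The reduction itself is almost trivially computable; a sketch (with an assumed list-of-clauses CNF encoding):

```python
def sat_to_maxsat_threshold(cnf):
    """Reduction f from SAT to the MAX-SAT threshold language.

    cnf: CNF formula as a list of clauses, each a list of integer
    literals (negation marked by sign). Returns (cnf, m) with m the
    clause count: cnf is satisfiable iff >= m clauses can be satisfied.
    """
    return cnf, len(cnf)

# (x1 or not x2) and (x2 or x3):
phi = [[1, -2], [2, 3]]
print(sat_to_maxsat_threshold(phi))  # ([[1, -2], [2, 3]], 2)
```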
Key Takeaways
- Complexity Theory: Beyond Computability: While computability theory tells us what problems can be solved, complexity theory tells us how efficiently they can be solved, classifying them by resource requirements (time and space).
- Polynomial Time as “Efficient”: The class P (deterministic polynomial time) is the widely accepted formal definition of problems that are “efficiently solvable” or “practically solvable.”
- NP: Efficiently Verifiable Solutions: The class NP (nondeterministic polynomial time) encompasses problems for which a proposed solution can be verified in polynomial time. This is equivalent to problems solvable by a nondeterministic Turing machine in polynomial time.
- The P vs. NP Problem: The fundamental open question in computer science is whether P = NP. The prevailing belief is that $\mathrm{P} \neq \mathrm{NP}$, implying that finding solutions is generally harder than verifying them.
- NP-Completeness: The Hardest Problems in NP: NP-complete problems are those in NP to which all other NP problems can be efficiently reduced. If any NP-complete problem could be solved in polynomial time, then P would equal NP.
- Cook-Levin Theorem: This landmark result established the Boolean Satisfiability Problem (SAT) as the first NP-complete problem, providing a starting point for proving the NP-completeness of thousands of other problems.
- Complexity Hierarchy: The relationships between complexity classes form a hierarchy: $\mathrm{DLOG} \subseteq \mathrm{NLOG} \subseteq \mathrm{P} \subseteq \mathrm{NP} \subseteq \mathrm{PSPACE} \subseteq \mathrm{EXPTIME}$. Understanding this hierarchy helps to map the landscape of computational difficulty.
- Implications for Algorithm Design: Complexity theory guides algorithm design by identifying problems that are likely intractable, prompting the development of approximation algorithms, heuristics, or specialized solutions for restricted problem instances.