In Chapter 6, we delved into the realm of complexity theory, classifying problems based on the computational resources required to solve them. We identified NP-hard problems as those that are at least as difficult as any problem in NP, and NP-complete problems as the hardest problems in NP. Assuming the widely believed conjecture that $\mathrm{P} \neq \mathrm{NP}$, these problems cannot be solved by any polynomial-time algorithm.
This presents a significant challenge: many NP-hard problems, such as the Traveling Salesperson Problem (TSP), the Knapsack Problem, and various scheduling and routing problems, are of immense practical importance. If we cannot solve them efficiently in the worst case, what can we do in practice? This is where the trade-off between finding an absolutely optimal solution and finding a sufficiently good solution efficiently becomes critical.
This chapter explores various algorithmic strategies designed to tackle NP-hard problems. The core idea is to transition from computationally infeasible to manageable complexity by weakening the requirements for the solution. This might involve:
- Restricting the Problem Instances: Focusing on subclasses of inputs that are efficiently solvable, or on “typical” instances that are easier than the worst case.
- Accepting Superpolynomial Algorithms: Designing algorithms that are exponential, but with a base small enough to be practical for realistic input sizes (e.g., running in time $c^n$ for a constant $c$ close to 1, rather than $2^n$).
- Weakening Solution Quality: Instead of an optimal solution, accepting a “good enough” solution (approximation algorithms).
- Accepting Probabilistic Guarantees: Using randomization to achieve efficiency, with a small, controlled probability of error (randomized algorithms, covered in Chapter 8).
We will explore these concepts, demonstrating how clever algorithmic design can yield practical solutions even for theoretically “hard” problems.
7.1 Aims
By the end of this chapter, you will be able to:
- Understand Pseudopolynomial Algorithms: Recognize a class of algorithms that are efficient for number problems with small numerical values, and understand the concept of strongly NP-hard problems.
- Design Approximation Algorithms: Develop and analyze algorithms that find near-optimal solutions for optimization problems with provable quality guarantees, such as for Vertex Cover and Metric TSP.
- Apply Local Search Techniques: Understand the principles of local search, how neighborhoods are defined, and how it can be used to find local optima for problems like MAX-CUT.
- Grasp Simulated Annealing: Learn about a metaheuristic that extends local search to escape local optima and find better solutions, inspired by a physical process.
7.2 Pseudopolynomial Algorithms
Some NP-hard problems, particularly those involving numbers, can be solved by algorithms whose running time is polynomial in the numerical value of the input numbers, but exponential with respect to the length of their binary representation. Such algorithms are called pseudopolynomial.
We consider number problems, where inputs can be interpreted as sequences of numbers. For an input $x = (x_1, \dots, x_n)$, where $x_1, \dots, x_n$ are binary representations of numbers, let $\mathrm{Max}(x)$ denote the largest of these numbers.
Definition 7.1 (Pseudopolynomial Algorithm)
Let $U$ be a number problem and $A$ an algorithm that solves $U$. $A$ is a pseudopolynomial algorithm for $U$ if there exists a bivariate polynomial $p$ such that $\mathrm{Time}_A(x) \le p(|x|, \mathrm{Max}(x))$ for all problem instances $x$ of $U$.
If $\mathrm{Max}(x)$ is bounded by a polynomial in $|x|$ (e.g., if the numbers in the input are small), then a pseudopolynomial algorithm runs in polynomial time with respect to the input length. This means that for such "light" problem instances (where the numbers are small), these algorithms are efficient.
The Knapsack Problem
The Knapsack Problem is a classic NP-hard optimization problem:
- Input: $n$ objects, each with a weight $w_i$ and a benefit $b_i$ ($1 \le i \le n$), and a knapsack capacity $W$. All values are positive integers.
- Goal: Select a subset of the objects such that their total weight does not exceed $W$ and their total benefit is maximized.

We can design a pseudopolynomial algorithm for the Knapsack Problem using dynamic programming. The algorithm DPR (Dynamic Programming for Knapsack) computes, for each index $i$ and each achievable total benefit $k$, the minimum weight required to achieve benefit $k$ using a subset of the first $i$ objects.
Algorithm DPR (Dynamic Programming for Knapsack)
Input: An instance of the Knapsack Problem ($w_1, \dots, w_n$, $b_1, \dots, b_n$, $W$). Output: A subset $T \subseteq \{1, \dots, n\}$ of indices maximizing the total benefit within weight capacity $W$.
1. Initialize `TRIPLE(0) = {(0, 0, ∅)}`. (Each triple represents (current benefit, current weight, items included).)
2. For $i = 1$ to $n$ (consider adding the $i$-th object): a. `SET(i) = TRIPLE(i-1)` (solutions without object $i$). b. For each triple $(k, w, T)$ in `TRIPLE(i-1)`: if $w + w_i \le W$ (object $i$ can be added without exceeding the capacity), add $(k + b_i, w + w_i, T \cup \{i\})$ to `SET(i)` (solutions including object $i$). c. `TRIPLE(i)` is formed from `SET(i)` by keeping, for each distinct benefit value $k$, only the triple with the minimum weight $w$. (If there are several ways to achieve benefit $k$, keep the most efficient one.)
3. Find the triple $(k, w, T)$ in `TRIPLE(n)` with the maximum benefit $k$.
4. Output: $T$.
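The following is a minimal Python sketch of this dynamic program, assuming positive integer weights and benefits; the function name `dpr_knapsack` and the dictionary-based representation of the `TRIPLE` sets are illustrative choices, not part of the original algorithm statement.

```python
def dpr_knapsack(weights, benefits, capacity):
    """Pseudopolynomial dynamic program for the Knapsack Problem (DPR).

    Maintains, for each achievable benefit, the lightest subset of the
    objects considered so far (the TRIPLE sets of the algorithm).
    """
    # triple maps: benefit -> (minimum weight, frozenset of item indices)
    triple = {0: (0, frozenset())}
    for i, (w, b) in enumerate(zip(weights, benefits), start=1):
        new_triple = dict(triple)  # solutions without object i
        for benefit, (weight, items) in triple.items():
            if weight + w <= capacity:  # object i still fits
                cand_benefit = benefit + b
                cand_weight = weight + w
                # keep only the minimum-weight triple for each benefit value
                if (cand_benefit not in new_triple
                        or cand_weight < new_triple[cand_benefit][0]):
                    new_triple[cand_benefit] = (cand_weight, items | {i})
        triple = new_triple
    best_benefit = max(triple)  # maximum achievable benefit
    return best_benefit, set(triple[best_benefit][1])


# Instance used in Exercise 7.1: weights (1, 3, 5), benefits (6, 7, 4), capacity 8
print(dpr_knapsack([1, 3, 5], [6, 7, 4], 8))  # expected: (13, {1, 2})
```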
Theorem 7.2
For every instance $x$ of the Knapsack Problem, $\mathrm{Time}_{DPR}(x) \in O(|x|^2 \cdot \mathrm{Max}(x))$, making `DPR` a pseudopolynomial algorithm.

Proof
The maximum possible total benefit is $\sum_{i=1}^{n} b_i$. In the worst case, each $b_i$ can be as large as $\mathrm{Max}(x)$, so the maximum total benefit is at most $n \cdot \mathrm{Max}(x)$.

In each step $i$ (from $1$ to $n$), the set `TRIPLE(i)` stores one triple for each reachable benefit value. The number of distinct benefit values is at most $n \cdot \mathrm{Max}(x) + 1$, so $|\mathrm{TRIPLE}(i)|$ is at most $n \cdot \mathrm{Max}(x) + 1$.

Each step involves iterating through `TRIPLE(i-1)` and updating `SET(i)`, which takes $O(n \cdot \mathrm{Max}(x))$ time. Since there are $n$ such steps (for $i = 1$ to $n$), the total time complexity of Phase 2 is $O(n^2 \cdot \mathrm{Max}(x))$.

Phases 1 and 3 take negligible time in comparison. Since $n \le |x|$ (where $|x|$ is the length of the input encoding all objects and $W$), the overall time complexity is $O(|x|^2 \cdot \mathrm{Max}(x))$.

This algorithm is pseudopolynomial because its runtime depends polynomially on the numerical values of the input numbers (via $\mathrm{Max}(x)$), not only on their length.
Strongly NP-Hard Problems
Some NP-hard problems are so difficult that they don’t even admit pseudopolynomial algorithms (unless P=NP). These are called strongly NP-hard.
Definition 7.3 (Strongly NP-Hard)
A number problem $U$ is strongly NP-hard if there exists a polynomial $p$ such that the problem $U$ restricted to instances $x$ with $\mathrm{Max}(x) \le p(|x|)$ is NP-hard.
Theorem 7.3
If $U$ is a strongly NP-hard number problem, then (assuming $\mathrm{P} \neq \mathrm{NP}$) there exists no pseudopolynomial algorithm for $U$.
Proof
Assume for contradiction that $U$ is strongly NP-hard and has a pseudopolynomial algorithm $A$. By the definition of strong NP-hardness, there exists a polynomial $p$ such that $U$ restricted to instances $x$ with $\mathrm{Max}(x) \le p(|x|)$ is NP-hard. Let this restricted problem be $U_p$.

Since $A$ is a pseudopolynomial algorithm for $U$, its time complexity satisfies $\mathrm{Time}_A(x) \le q(|x|, \mathrm{Max}(x))$ for some bivariate polynomial $q$. For instances $x$ of $U_p$, we have $\mathrm{Max}(x) \le p(|x|)$. Substituting this into the runtime of $A$ gives $\mathrm{Time}_A(x) \le q(|x|, p(|x|))$. Since $q$ and $p$ are polynomials, $q(|x|, p(|x|))$ is also a polynomial in $|x|$. This means that $A$ solves $U_p$ in polynomial time.

However, $U_p$ is NP-hard. If $U_p$ can be solved in polynomial time, then by the definition of NP-hardness every problem in NP can be solved in polynomial time, implying $\mathrm{P} = \mathrm{NP}$. This contradicts our assumption that $\mathrm{P} \neq \mathrm{NP}$. Therefore, no pseudopolynomial algorithm for $U$ can exist if $U$ is strongly NP-hard and $\mathrm{P} \neq \mathrm{NP}$.
This concept helps classify NP-hard problems further. For example, the Traveling Salesperson Problem (TSP) is strongly NP-hard, implying that even if all edge weights are small, it remains NP-hard and thus does not admit a pseudopolynomial algorithm (unless P=NP).
7.3 Approximation Algorithms
For many NP-hard optimization problems, finding an optimal solution is intractable. However, a solution that is “good enough” might be found efficiently. Approximation algorithms aim to find such near-optimal solutions with provable guarantees on their quality.
Definition 7.4 (Approximation Quality)
Let $U$ be an optimization problem and $A$ an algorithm for $U$. For an instance $x$, let $\mathrm{cost}(A(x))$ be the cost of the solution found by $A$, and $\mathrm{Opt}_U(x)$ be the cost of an optimal solution. The approximation quality of $A$ on $x$ is
$$R_A(x) = \max\left\{ \frac{\mathrm{cost}(A(x))}{\mathrm{Opt}_U(x)},\ \frac{\mathrm{Opt}_U(x)}{\mathrm{cost}(A(x))} \right\}.$$
For a positive number $\delta$, $A$ is a $\delta$-approximation algorithm for $U$ if $R_A(x) \le \delta$ for all instances $x$. The definition ensures that $R_A(x) \ge 1$, with values closer to 1 indicating a better approximation.
Minimum Vertex Cover (MIN-VC)
The Vertex Cover problem is to find a minimum set of vertices in a graph such that every edge is incident to at least one vertex in the set.
Algorithm VCA (Vertex Cover Approximation)
Input: A graph $G = (V, E)$. Output: A vertex cover $C$ for $G$.

1. Initialize $C = \emptyset$.
2. Initialize $E' = E$ (the set of uncovered edges).
3. While $E' \neq \emptyset$: a. Pick an arbitrary edge $\{u, v\}$ from $E'$. b. Add $u$ and $v$ to $C$. c. Remove from $E'$ all edges incident to $u$ or $v$.
4. Output: $C$.
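A minimal Python sketch of `VCA` follows; the function name and the edge-tuple representation of the graph are illustrative assumptions.

```python
def vca_vertex_cover(edges):
    """2-approximation for Minimum Vertex Cover (algorithm VCA).

    Repeatedly picks an uncovered edge, adds both endpoints to the cover,
    and discards every edge incident to either endpoint.
    """
    cover = set()
    uncovered = set(edges)
    while uncovered:
        u, v = next(iter(uncovered))  # pick an arbitrary uncovered edge
        cover.update((u, v))          # add both endpoints to the cover
        # remove all edges now covered by u or v
        uncovered = {e for e in uncovered if u not in e and v not in e}
    return cover


# Example: the path a-b-c-d; the returned cover has at most twice the optimal size
print(vca_vertex_cover([("a", "b"), ("b", "c"), ("c", "d")]))
```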
Theorem 7.4
The algorithm `VCA` is a polynomial-time 2-approximation algorithm for MIN-VC.

Proof
- Polynomial Time: The algorithm iterates as long as there are uncovered edges. In each iteration, one edge is chosen and its two endpoints are added to the cover; all edges incident to these two vertices are removed. Since at least one edge is removed in each iteration, the loop runs at most $|E|$ times. The operations within the loop (picking an edge, adding to $C$, removing edges) can be implemented efficiently (e.g., using adjacency lists), so the total running time is polynomial, namely $O(|V| + |E|)$.
- Vertex Cover: When the algorithm terminates, $E'$ is empty, meaning every edge of the original graph is covered by at least one vertex in $C$. So $C$ is indeed a vertex cover.
- Approximation Quality: Let $M$ be the set of edges chosen in step 3a throughout the algorithm's execution. By construction, the edges in $M$ are pairwise disjoint (they form a matching). For each edge in $M$, both endpoints are added to $C$, so $|C| = 2|M|$. Now consider any optimal vertex cover $C_{opt}$. To cover all edges in $M$, $C_{opt}$ must include at least one endpoint of each edge in $M$; since the edges in $M$ are disjoint, $C_{opt}$ contains at least $|M|$ vertices, i.e., $|C_{opt}| \ge |M|$. Combining these, $\frac{|C|}{|C_{opt}|} \le \frac{2|M|}{|M|} = 2$. Thus, `VCA` is a 2-approximation algorithm.
Traveling Salesperson Problem (TSP)
For the general TSP, it can be proven that (assuming $\mathrm{P} \neq \mathrm{NP}$) no polynomial-time $\delta$-approximation algorithm exists for any constant $\delta \ge 1$. This means TSP is "too hard" for approximation in the general case.

However, for the Metric TSP ($\triangle$-TSP), where edge costs satisfy the triangle inequality ($c(\{u, w\}) \le c(\{u, v\}) + c(\{v, w\})$ for all vertices $u, v, w$), approximation algorithms exist.
Algorithm SB (Spanning Tree-Based Algorithm for $\triangle$-TSP)

Input: A complete graph $G = (V, E)$ with a cost function $c$ satisfying the triangle inequality. Output: A Hamiltonian cycle $H$.

1. Compute a Minimum Spanning Tree (MST) $T$ of $G$ with respect to the edge costs $c$.
2. Perform a depth-first search (DFS) traversal of $T$, starting from an arbitrary node $v_0$, and record the closed walk $W$ that the traversal follows. $W$ is an Eulerian tour of the multigraph obtained by doubling the edges of $T$, so every edge of $T$ is traversed exactly twice.
3. Construct a Hamiltonian cycle $H$ from $W$ by visiting the nodes in the order of their first appearance in $W$ and finally returning from the last node back to $v_0$. When a node is encountered that has already been visited, it is skipped (a "shortcut" is taken).
4. Output: $H$.
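The sketch below is one possible Python implementation of `SB` for a complete graph given as a symmetric cost matrix; it uses a simple Prim-style MST construction and a preorder walk of the tree. The function name and matrix representation are assumptions made for illustration.

```python
def sb_metric_tsp(cost):
    """MST-based 2-approximation for the metric TSP (algorithm SB).

    `cost` is a symmetric matrix of edge costs satisfying the triangle
    inequality; vertices are 0..n-1.  Returns the order in which vertices
    are visited; the tour closes by returning from the last vertex to the first.
    """
    n = len(cost)
    # 1. Minimum spanning tree via Prim's algorithm (O(n^3) but simple).
    in_tree = [False] * n
    in_tree[0] = True
    children = {v: [] for v in range(n)}
    for _ in range(n - 1):
        best = None
        for u in range(n):
            if not in_tree[u]:
                continue
            for v in range(n):
                if not in_tree[v] and (best is None or cost[u][v] < cost[best[0]][best[1]]):
                    best = (u, v)
        u, v = best
        in_tree[v] = True
        children[u].append(v)
    # 2./3. Depth-first (preorder) walk of the tree; listing each vertex only
    # at its first visit realizes the "shortcuts" implicitly.
    order, stack = [], [0]
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(reversed(children[u]))
    return order


# Tiny metric example: four points on a line with distances |i - j|
pts = range(4)
cost = [[abs(i - j) for j in pts] for i in pts]
print(sb_metric_tsp(cost))  # e.g. [0, 1, 2, 3]
```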
Theorem 7.5
The algorithm `SB` is a polynomial-time 2-approximation algorithm for $\triangle$-TSP.

Proof
- Polynomial Time: An MST can be computed in polynomial time (e.g., with Kruskal's or Prim's algorithm). A DFS traversal also takes polynomial time, and constructing the Hamiltonian cycle from the traversal is polynomial as well. Thus, `SB` is a polynomial-time algorithm.
- Approximation Quality: Let $H_{opt}$ be an optimal Hamiltonian cycle for $G$, with cost $c(H_{opt})$.
  1. If we remove any edge from $H_{opt}$, we obtain a Hamiltonian path, which is in particular a spanning tree of $G$; its cost is at most $c(H_{opt})$. Since $T$ is a Minimum Spanning Tree, $c(T) \le c(H_{opt})$.
  2. The walk $W$ (before shortcuts) traverses every edge of $T$ exactly twice (once in each direction), so its total cost is $c(W) = 2 \cdot c(T)$.
  3. From (1) and (2), $c(W) = 2 \cdot c(T) \le 2 \cdot c(H_{opt})$.
  4. The Hamiltonian cycle $H$ is constructed from $W$ by taking "shortcuts." For example, if $W$ visits nodes $u, v, w$ in sequence and $v$ has already been visited, we skip $v$ and go directly from $u$ to $w$. By the triangle inequality, $c(\{u, w\}) \le c(\{u, v\}) + c(\{v, w\})$, so taking shortcuts never increases the total cost. Therefore, $c(H) \le c(W)$.
  5. Combining these inequalities: $c(H) \le c(W) \le 2 \cdot c(H_{opt})$.

Thus, `SB` is a 2-approximation algorithm for $\triangle$-TSP.
7.4 Local Search
Local search is a general algorithmic paradigm for optimization problems. It starts with an initial feasible solution and iteratively tries to improve it by making small, local changes. The process continues until no further improvement can be made within the defined “neighborhood” of the current solution.
Definition 7.5 (Neighborhood)
For an optimization problem $U$ and an instance $x$, a neighborhood function $f_x$ assigns to each feasible solution $\alpha$ a set $f_x(\alpha)$ of "neighboring" solutions. A solution $\beta \in f_x(\alpha)$ is a neighbor of $\alpha$. A neighborhood function typically satisfies:

- Reflexivity: $\alpha \in f_x(\alpha)$ (a solution is its own neighbor; some variants exclude it when an improving neighbor is sought).
- Symmetry: If $\beta \in f_x(\alpha)$, then $\alpha \in f_x(\beta)$.
- Reachability: For any two solutions $\alpha, \beta$, there exists a sequence of solutions, each a neighbor of the previous one, leading from $\alpha$ to $\beta$.

An admissible solution $\alpha$ is a local optimum with respect to $f_x$ if no neighbor in $f_x(\alpha)$ is better than $\alpha$.
Algorithm LS (Local Search)
Input: An instance $x$ of an optimization problem $U$ and a neighborhood function $f_x$. Output: A local optimum $\alpha$.

1. Compute an initial admissible solution $\alpha$.
2. While $\alpha$ is not a local optimum with respect to $f_x$: a. Find a neighbor $\beta \in f_x(\alpha)$ that is better than $\alpha$ (i.e., $\mathrm{cost}(\beta) < \mathrm{cost}(\alpha)$ for minimization, or $\mathrm{cost}(\beta) > \mathrm{cost}(\alpha)$ for maximization). b. Set $\alpha = \beta$.
3. Output: $\alpha$.
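As a generic skeleton, the following sketch expresses the `LS` scheme for a minimization problem; `initial`, `neighbors`, and `cost` are caller-supplied callbacks and purely illustrative names.

```python
def local_search(initial, neighbors, cost):
    """Generic local search (scheme LS) for a minimization problem.

    Starts from `initial` and repeatedly moves to a strictly better
    neighbor until the current solution is a local optimum.
    """
    current = initial
    improved = True
    while improved:
        improved = False
        for candidate in neighbors(current):
            if cost(candidate) < cost(current):  # strictly better neighbor found
                current = candidate
                improved = True
                break                            # restart from the new solution
    return current


# Toy usage: minimize (x - 7)^2 over the integers, starting from 0
print(local_search(0, lambda x: (x - 1, x + 1), lambda x: (x - 7) ** 2))  # -> 7
```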
Example: MAX-CUT
The MAX-CUT problem is to partition the vertices of a graph into two sets such that the number of edges between the sets is maximized.
A simple neighborhood for MAX-CUT: for a given partition $(V_1, V_2)$, a neighbor is any partition obtained by moving a single vertex from one set to the other.
Algorithm LS-CUT (Local Search for MAX-CUT)
Input: A graph $G = (V, E)$. Output: A cut $(V_1, V_2)$.

1. Initialize $V_1 = V$, $V_2 = \emptyset$ (the initial cut is $(V, \emptyset)$).
2. While there exists a vertex $v$ such that moving $v$ to the other side of the cut improves the cut value: a. Move $v$ to the other side.
3. Output: The final cut $(V_1, V_2)$.
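A minimal Python sketch of this local search follows; the adjacency-list representation and the function name `ls_cut` are illustrative assumptions.

```python
def ls_cut(vertices, edges):
    """Local search for MAX-CUT (algorithm LS-CUT).

    Moves a single vertex to the other side whenever this strictly
    increases the number of edges crossing the cut.
    """
    side = {v: 0 for v in vertices}  # start with the cut (V, empty set)
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    improved = True
    while improved:
        improved = False
        for v in vertices:
            same = sum(1 for w in adj[v] if side[w] == side[v])
            other = len(adj[v]) - same
            if same > other:            # moving v gains (same - other) cut edges
                side[v] = 1 - side[v]
                improved = True
    part0 = {v for v in vertices if side[v] == 0}
    part1 = set(vertices) - part0
    return part0, part1


# Example: the 4-cycle a-b-c-d-a; the optimal cut has all 4 edges crossing
print(ls_cut("abcd", [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]))
```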
Theorem 7.6
`LS-CUT` is a polynomial-time 2-approximation algorithm for MAX-CUT.

Proof
- Polynomial Time: In each iteration, the algorithm checks all vertices to see whether moving one of them improves the cut. Checking a single vertex $v$ takes $O(\deg(v))$ time, so finding an improving move takes $O(|V| + |E|)$ time. Each time a vertex is moved, the cut value increases by at least 1. The maximum possible cut value is $|E|$, so the loop runs at most $|E|$ times. The total time complexity is $O(|E| \cdot (|V| + |E|))$, which is polynomial.
- Approximation Quality: Let $(V_1, V_2)$ be the local optimum found by `LS-CUT`. For any vertex $v$, moving $v$ to the other side would not increase the cut value. This implies that the number of edges connecting $v$ to its own side is at most the number of edges connecting $v$ to the other side. Summing this over all vertices in $V_1$ and $V_2$, the value of the cut is at least half the total number of edges, i.e., at least $|E|/2$. Since the optimal cut value is at most $|E|$, the approximation ratio is at most $\frac{|E|}{|E|/2} = 2$.
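To make the summation step explicit (a standard counting argument, filled in here rather than quoted from the text): for a locally optimal cut $(V_1, V_2)$, let $d_{\mathrm{same}}(v)$ and $d_{\mathrm{cross}}(v)$ denote the number of neighbors of $v$ on its own side and on the other side, respectively. Local optimality means $d_{\mathrm{same}}(v) \le d_{\mathrm{cross}}(v)$ for every vertex $v$, and summing over all vertices gives

$$\sum_{v \in V} d_{\mathrm{same}}(v) \ \le\ \sum_{v \in V} d_{\mathrm{cross}}(v),$$

where the left-hand side counts every non-cut edge twice and the right-hand side counts every cut edge twice. Hence the number of non-cut edges is at most the number of cut edges, so the cut contains at least $|E|/2$ edges.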
A major drawback of simple local search is that it can get stuck in a local optimum that is far from the global optimum.
7.5 Simulated Annealing
Simulated Annealing (SA) is a metaheuristic inspired by the physical process of annealing metals, designed to overcome the limitation of local search by allowing the search to “jump out” of local optima.
In metallurgy, annealing involves heating a metal to a high temperature and then slowly cooling it. At high temperatures, particles have enough energy to move freely, exploring many configurations. As the temperature slowly decreases, particles settle into a low-energy, stable (optimal) configuration.
Simulated Annealing applies this idea to optimization:
- System states correspond to admissible solutions.
- Energy of a state corresponds to the cost of a solution.
- Optimal state corresponds to the optimal solution.
- Temperature is a control parameter that decreases over time.
Algorithm SA (Simulated Annealing)
Input: A problem instance $x$, a neighborhood function $f_x$, an initial temperature $T_0$, and a cooling schedule. Output: An approximately optimal solution $\alpha$.

1. Compute an initial admissible solution $\alpha$.
2. Set the current temperature $T = T_0$.
3. While $T$ is not "very close to 0" (or another stopping criterion is met): a. Randomly choose a neighbor $\beta$ from $f_x(\alpha)$. b. Calculate $\Delta = \mathrm{cost}(\beta) - \mathrm{cost}(\alpha)$ (assuming minimization, so $\Delta \le 0$ is an improvement). c. If $\Delta \le 0$ ($\beta$ is better or equal): Set $\alpha = \beta$ (always accept improving moves). d. Else ($\beta$ is worse): Accept $\beta$ as the new solution with probability $e^{-\Delta / T}$. (This is the Metropolis criterion. It allows escaping local optima, as occasionally accepting a worse solution prevents the algorithm from getting stuck at a suboptimal point. The probability decreases for larger deteriorations and lower temperatures.) e. Update $T$ according to the cooling schedule (e.g., $T \leftarrow c \cdot T$ for a constant $0 < c < 1$).
4. Output: $\alpha$.
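The following is a minimal Python sketch of `SA` for a minimization problem with a geometric cooling schedule; the callback-based interface and all parameter values are illustrative assumptions, not prescribed by the text.

```python
import math
import random


def simulated_annealing(initial, random_neighbor, cost,
                        t0=100.0, cooling=0.95, t_min=1e-3):
    """Simulated annealing (scheme SA) for a minimization problem.

    Always accepts improving moves; accepts a worsening move with
    probability exp(-delta / T) (Metropolis criterion), where delta is
    the cost increase and T the current temperature.
    """
    current = initial
    t = t0
    while t > t_min:                      # stop when T is "very close to 0"
        candidate = random_neighbor(current)
        delta = cost(candidate) - cost(current)
        if delta <= 0 or random.random() < math.exp(-delta / t):
            current = candidate           # accept the move
        t *= cooling                      # geometric cooling schedule
    return current


# Toy usage: minimize (x - 3)^2 over the integers, starting from 10
result = simulated_annealing(10,
                             lambda x: x + random.choice((-1, 1)),
                             lambda x: (x - 3) ** 2)
print(result)  # typically ends up at or near 3
```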
The key feature of SA is the probabilistic acceptance of worse solutions. This probability decreases as $\Delta$ increases (a stronger deterioration is less likely to be accepted) and as $T$ decreases (worse solutions are rarely accepted at low temperatures). With a carefully chosen cooling schedule, SA can converge to a global optimum, though often without guarantees on the number of iterations required. It is widely used as a robust heuristic for many hard problems.
Summary
- Tackling Intractability: For NP-hard problems, which are theoretically intractable in the worst case (assuming $\mathrm{P} \neq \mathrm{NP}$), practical solutions are sought by relaxing strict requirements for optimality or efficiency.
- Pseudopolynomial Algorithms: These algorithms are efficient when numerical input values are small, even if they are exponential in the input’s bit length. The Knapsack Problem is a prime example. Strongly NP-hard problems, however, do not admit such algorithms (unless P=NP).
- Approximation Algorithms: These provide polynomial-time solutions for optimization problems with a provable guarantee on how close the solution is to the optimum (e.g., a 2-approximation for MIN-VC or Metric TSP). This offers a valuable trade-off between optimality and efficiency.
- Local Search: A heuristic approach that iteratively improves a solution by making small, local changes until a local optimum is reached. While simple and widely applicable, it can get trapped in suboptimal local optima.
- Simulated Annealing: A metaheuristic inspired by thermodynamics that enhances local search by allowing occasional moves to worse solutions. This probabilistic mechanism helps escape local optima, enabling a more thorough exploration of the solution space and often leading to better solutions for complex problems.
- Art of Algorithm Design: This chapter highlights that even for problems deemed “hard” by complexity theory, clever algorithmic design, often involving compromises on optimality or generality, can yield effective and practical solutions.
Exercises
Exercise 7.1 (Knapsack Problem Simulation)
Consider the Knapsack Problem instance with weights $(w_1, w_2, w_3) = (1, 3, 5)$, benefits $(b_1, b_2, b_3) = (6, 7, 4)$, and capacity $W = 8$. Simulate the `DPR` algorithm to find the optimal solution.

Solution
Input: $n = 3$, weights $(w_1, w_2, w_3) = (1, 3, 5)$, benefits $(b_1, b_2, b_3) = (6, 7, 4)$, capacity $W = 8$.

Phase 1: Initialize `TRIPLE(0)`

`TRIPLE(0) = {(0, 0, ∅)}`

Phase 2: Iterations

i = 1 (Object 1: $w_1 = 1$, $b_1 = 6$)

- `SET(1) = TRIPLE(0) = {(0, 0, ∅)}`
- From $(0, 0, \emptyset)$: add object 1. Weight $0 + 1 = 1 \le 8$. New triple $(6, 1, \{1\})$.
- `SET(1) = {(0, 0, ∅), (6, 1, {1})}`
- `TRIPLE(1)` (no duplicate benefit values, so same as `SET(1)`): `{(0, 0, ∅), (6, 1, {1})}`

i = 2 (Object 2: $w_2 = 3$, $b_2 = 7$)

- `SET(2) = TRIPLE(1) = {(0, 0, ∅), (6, 1, {1})}`
- From $(0, 0, \emptyset)$: add object 2. Weight $0 + 3 = 3 \le 8$. New triple $(7, 3, \{2\})$.
- From $(6, 1, \{1\})$: add object 2. Weight $1 + 3 = 4 \le 8$. New triple $(13, 4, \{1, 2\})$.
- `SET(2) = {(0, 0, ∅), (6, 1, {1}), (7, 3, {2}), (13, 4, {1, 2})}`
- `TRIPLE(2)` (no duplicate benefit values): `{(0, 0, ∅), (6, 1, {1}), (7, 3, {2}), (13, 4, {1, 2})}`

i = 3 (Object 3: $w_3 = 5$, $b_3 = 4$)

- `SET(3) = TRIPLE(2) = {(0, 0, ∅), (6, 1, {1}), (7, 3, {2}), (13, 4, {1, 2})}`
- From $(0, 0, \emptyset)$: add object 3. Weight $0 + 5 = 5 \le 8$. New triple $(4, 5, \{3\})$.
- From $(6, 1, \{1\})$: add object 3. Weight $1 + 5 = 6 \le 8$. New triple $(10, 6, \{1, 3\})$.
- From $(7, 3, \{2\})$: add object 3. Weight $3 + 5 = 8 \le 8$. New triple $(11, 8, \{2, 3\})$.
- From $(13, 4, \{1, 2\})$: add object 3. Weight $4 + 5 = 9 > 8$. (Cannot add.)
- `TRIPLE(3)` (no duplicate benefit values): `{(0, 0, ∅), (4, 5, {3}), (6, 1, {1}), (7, 3, {2}), (10, 6, {1, 3}), (11, 8, {2, 3}), (13, 4, {1, 2})}`

Phase 3: Find Max Benefit

The maximum benefit in `TRIPLE(3)` is 13, corresponding to the triple $(13, 4, \{1, 2\})$.

Output: The index set $\{1, 2\}$. (Objects 1 and 2 give benefit 13 with weight 4, which is $\le W = 8$.)
Exercise 7.2 (Strongly NP-Hard Proof)
Consider the Weighted Vertex Cover problem: given a graph $G = (V, E)$ with a weight $w(v)$ for each vertex $v \in V$, find a vertex cover $C$ such that $\sum_{v \in C} w(v)$ is minimized. Prove that this problem is strongly NP-hard.
Solution
To prove that Weighted Vertex Cover (WVC) is strongly NP-hard, we need to show that even if the numerical values of the weights are polynomially bounded by the input length, the problem remains NP-hard. This is typically done by reducing a known NP-hard problem (which is not a number problem, or whose numerical values are small) to WVC.
We know that the unweighted Vertex Cover (VC) problem is NP-hard. VC is a special case of WVC in which all vertex weights are 1. In this case, $\mathrm{Max}(x)$ (where $x$ encodes the graph and the weights) equals 1, which is trivially bounded by a polynomial in the input length.

Since VC is NP-hard and coincides with WVC restricted to instances with $\mathrm{Max}(x) = 1 \le p(|x|)$ (for, say, the constant polynomial $p \equiv 1$), WVC restricted to instances with polynomially bounded weights is NP-hard.
Therefore, by Definition 7.3, Weighted Vertex Cover is strongly NP-hard.
Exercise 7.3 (2-Exchange Neighborhood for TSP)
The 2-Exchange neighborhood for TSP involves taking a Hamiltonian cycle, removing two non-adjacent edges $\{u_1, v_1\}$ and $\{u_2, v_2\}$, and adding the edges $\{u_1, u_2\}$ and $\{v_1, v_2\}$ to form a new Hamiltonian cycle. Does this neighborhood satisfy the conditions of Definition 7.5 (reflexivity, symmetry, reachability)?
Solution
Let $M(x)$ be the set of all Hamiltonian cycles for a given TSP instance $x$.

Reflexivity ($\alpha \in f_x(\alpha)$): This condition is typically interpreted as "a solution is always in its own neighborhood." If the 2-Exchange operation is defined so that it must produce a different cycle, the neighborhood is not reflexive. However, if we allow the "new" edges to coincide with the "removed" edges (i.e., no actual change to the cycle), it is reflexive. In practice, local search algorithms only move to neighbors that are strictly better, so the current solution is implicitly retained when no better neighbor exists; for the formal definition, reflexivity is usually assumed.

Symmetry (if $\beta \in f_x(\alpha)$, then $\alpha \in f_x(\beta)$): Yes, this condition is satisfied. If a cycle $\beta$ is obtained from $\alpha$ by removing the edges $\{u_1, v_1\}$ and $\{u_2, v_2\}$ and adding $\{u_1, u_2\}$ and $\{v_1, v_2\}$, then $\alpha$ can be obtained from $\beta$ by removing $\{u_1, u_2\}$ and $\{v_1, v_2\}$ and adding $\{u_1, v_1\}$ and $\{u_2, v_2\}$. The operation is reversible.

Reachability (any $\beta$ reachable from any $\alpha$): Yes, this condition is satisfied. It is a known result that any Hamiltonian cycle in a complete graph can be transformed into any other Hamiltonian cycle by a sequence of 2-Exchange operations. This means the 2-Exchange neighborhood connects all Hamiltonian cycles, allowing local search to explore the entire solution space.
Exercise 7.4 (LS-CUT Polynomial Time Proof)
Prove that `LS-CUT` (Algorithm 7.6) is a polynomial-time algorithm.

Solution
The `LS-CUT` algorithm works by iteratively moving a vertex from one partition to the other whenever such a move improves the cut value.
- Initialization: Initializing the partition $(V_1, V_2)$ takes $O(|V|)$ time.
- Loop Iterations: The `while` loop continues as long as there exists a vertex whose move improves the cut. Each time a vertex is moved, the cut value (the number of edges between $V_1$ and $V_2$) strictly increases. Since the maximum possible cut value is $|E|$ (the total number of edges in the graph), the loop can run at most $|E|$ times.
- Finding an Improving Move: In each iteration of the `while` loop, the algorithm needs to check whether an improving move exists. This involves iterating through all vertices $v \in V$. For each vertex $v$, we calculate the change in the cut value if $v$ is moved to the other partition, by counting the edges connecting $v$ to vertices within its current partition and to vertices in the other partition; this takes $O(\deg(v))$ time. Summing over all vertices, finding an improving move (or the best one) takes $O(|V| + |E|)$ time.

Total Time Complexity: The total time complexity is the product of the maximum number of iterations and the time per iteration: $O(|E| \cdot (|V| + |E|))$. Since $|E| \le |V|^2$ for a simple graph, this is $O(|V|^4)$, a polynomial in the number of vertices.
Therefore, `LS-CUT` is a polynomial-time algorithm.