Complexity Theory: L-Completeness, NL-Completeness, and the Polynomial Hierarchy

Author

Your Name

Published

January 28, 2025

Introduction

This lecture delves into the realm of complete problems within various complexity classes. We initiate our exploration with L-complete and NL-complete problems, laying the groundwork for understanding the intricate relationships between a complexity class and its corresponding co-class. A key theorem establishes that a problem’s completeness for a class is intrinsically linked to its complement’s completeness for the co-class.

The concepts of reachability and unreachability in directed graphs are introduced, highlighting their profound connection to the 2-SAT problem. We rigorously prove that 2-SAT is NL-complete, showcasing the power of reductions in establishing completeness.

Furthermore, the lecture examines co-NP-complete problems, with a focus on unsatisfiability and validity. These problems provide a deeper understanding of the challenges posed by co-NP.

Finally, we introduce the polynomial hierarchy (PH), an infinite hierarchy of complexity classes nestled between NP and PSPACE. The definition of PH is presented through the lens of Turing machines equipped with oracles, offering a unique perspective on the structure of this hierarchy. We also mention that the classes of the polynomial hierarchy can be defined using a logical characterization with quantified boolean formulas, which will be explored in the next lectures. The polynomial hierarchy is defined recursively, with each level built upon the previous one using oracles. The base level, \(\Delta_0 = \Sigma_0 = \Pi_0\), is equivalent to the class P. Subsequent levels are defined as follows: \(\Delta_{i+1} = P^{\Sigma_i}\), \(\Sigma_{i+1} = NP^{\Sigma_i}\), and \(\Pi_{i+1} = \text{co-}NP^{\Sigma_i}\). The union of all these levels constitutes the polynomial hierarchy, denoted as \(PH = \bigcup_{i \in \mathbb{N}} \Sigma_i = \bigcup_{i \in \mathbb{N}} \Pi_i = \bigcup_{i \in \mathbb{N}} \Delta_i\). We will see that if P=NP then the polynomial hierarchy collapses to P, and if the polynomial hierarchy collapses to a level \(i\), it means that \(\Sigma_i = \Sigma_{i+1} = PH\).

L-Complete Problems

Theorem 1. If a language \(L\) is in the class L, and \(L\) is neither the empty set (\(\emptyset\)) nor the universal set (\(\Sigma^*\)), then \(L\) is L-complete.

This theorem states that any language in L that is not trivial (i.e., not empty or the universal language) is L-complete. This highlights the "simplicity" of the class L, where every non-trivial problem is as hard as any other.

Proof. Proof. Let \(L\) be a language in the class L such that \(L \neq \emptyset\) and \(L \neq \Sigma^*\). By definition, this means there exists at least one string \(X\) that belongs to \(L\) (\(X \in L\)) and at least one string \(Y\) that does not belong to \(L\) (\(Y \notin L\)).

Now, consider another language \(L'\) also in the class L. We aim to define a logarithmic-space reduction \(R\) from \(L'\) to \(L\). This reduction will map strings from the alphabet of \(L'\) (\(\Sigma_{L'}\)) to strings in the alphabet of \(L\) (\(\Sigma_L\)).

We define the reduction \(R\) as follows: \[R(Z) = \begin{cases} X & \text{if } Z \in L' \\ Y & \text{if } Z \notin L' \end{cases}\] where \(Z\) is a string in \(\Sigma_{L'}^*\).

This function \(R\) serves as a valid reduction because it satisfies the condition: \(R(Z) \in L \iff Z \in L'\). In other words, \(R\) maps a string \(Z\) to a string in \(L\) if and only if \(Z\) is in \(L'\).

To complete the proof, we need to demonstrate that \(R(Z)\) can be computed using only logarithmic space. Since \(L'\) is in L, there exists a deterministic Turing machine \(M'\) that decides \(L'\) in logarithmic space.

Given an input string \(Z\), we can use \(M'\) to determine whether \(Z \in L'\) in logarithmic space. If \(M'\) accepts \(Z\) (i.e., \(Z \in L'\)), our reduction machine outputs the pre-determined string \(X\). If \(M'\) rejects \(Z\) (i.e., \(Z \notin L'\)), the reduction machine outputs the pre-determined string \(Y\).

The crucial point here is that the space required to compute \(R(Z)\) is essentially the space used by \(M'\). The output strings \(X\) and \(Y\) are fixed and do not depend on the input \(Z\). Therefore, the machine computing the reduction can encode the output behavior (outputting \(X\) or \(Y\)) within its states, eliminating the need for additional memory to store \(X\) or \(Y\). The length of \(X\) and \(Y\) does not influence the space used by the reduction.

Thus, the reduction \(R\) can be computed in logarithmic space, establishing that \(L\) is L-complete. ◻

Complete Problems for NL

Theorem 2. A language \(L\) is complete for a complexity class \(C\) if and only if its complement \(\overline{L}\) is complete for the class co-\(C\).

This theorem establishes a fundamental relationship between the completeness of a language and its complement. It states that if a language is complete for a complexity class, then its complement is complete for the corresponding co-class.

Proof. Proof. Since applying the complement operation twice returns the original class (i.e., co-(co-\(C\)) = \(C\)), it is sufficient to prove one direction of the implication: if \(L\) is \(C\)-complete, then \(\overline{L}\) is co-\(C\)-complete.

Assume \(L\) is \(C\)-complete. This means that for any language \(L' \in C\), there exists a logarithmic-space reduction \(R\) from \(L'\) to \(L\). In other words, for all \(x\), \(x \in L' \iff R(x) \in L\).

Now, we want to show that \(\overline{L}\) is co-\(C\)-complete. To do this, we need to demonstrate that for any language \(L'' \in\) co-\(C\), there exists a logarithmic-space reduction \(R'\) from \(L''\) to \(\overline{L}\).

Since \(L''\) belongs to co-\(C\), its complement \(\overline{L''}\) must belong to \(C\). Let’s denote \(\overline{L''}\) as \(L'\). Since \(L\) is \(C\)-complete and \(L' \in C\), there exists a reduction \(R\) from \(L'\) to \(L\) such that \(x \in L' \iff R(x) \in L\).

However, we also know that \(x \in L' \iff x \notin L''\). Therefore, \(x \notin L'' \iff R(x) \in L\). Taking the contrapositive of this statement, we get \(x \in L'' \iff R(x) \notin L\), which is equivalent to \(x \in L'' \iff R(x) \in \overline{L}\).

This demonstrates that the same reduction \(R\) also serves as a reduction from \(L''\) to \(\overline{L}\). Since \(R\) is computable in logarithmic space, we have shown that for any \(L'' \in\) co-\(C\), there is a logarithmic-space reduction to \(\overline{L}\). Thus, \(\overline{L}\) is co-\(C\)-complete. ◻

Corollary 3. If a problem \(L\) is NL-complete, then its complement \(\overline{L}\) is also NL-complete, because NL = co-NL.

Reachability

Definition 1. Reachability: Given a directed graph \(G = (V, E)\) and two nodes \(u, v \in V\), determine whether there is a directed path from \(u\) to \(v\) in \(G\).

This definition introduces the reachability problem, a fundamental problem in graph theory that asks whether a path exists between two given nodes in a directed graph.

Theorem 4. Reachability is NL-complete.

This theorem establishes that the reachability problem is NL-complete, meaning it is both in NL and NL-hard. This makes reachability a canonical problem for the complexity class NL.

Proof. Proof. We need to show two things: (1) Reachability is in NL, and (2) Reachability is NL-hard.

(1) Reachability \(\in\) NL: We can solve Reachability non-deterministically in logarithmic space using the following algorithm:

\(current \gets u\) \(steps \gets 0\) accept Non-deterministically choose a neighbor \(next\) of \(current\) reject \(current \gets next\) \(steps \gets steps + 1\) reject

This algorithm starts at node \(u\) and iteratively guesses the next node to visit. It keeps track of the number of steps taken using a counter. If it reaches node \(v\) within \(|V|\) steps, it accepts; otherwise, it rejects.

The space used by this algorithm is logarithmic because we only need to store the current node, the counter (which requires \(\log |V|\) space), and potentially the chosen neighbor.

(2) Reachability is NL-hard: Let \(L\) be an arbitrary language in NL. By definition, there exists a non-deterministic Turing machine \(N\) that decides \(L\) in logarithmic space, say \(s(n) = O(\log n)\).

We will use the reachability method to construct a reduction from \(L\) to Reachability. For an input \(x\) of length \(n\), we construct the configuration graph \(G_{N,x}\) of \(N\) on \(x\). Recall that each node in \(G_{N,x}\) represents a configuration of \(N\) on \(x\), and there is an edge from configuration \(C_i\) to \(C_j\) if \(N\) can transition from \(C_i\) to \(C_j\) in one step.

The crucial property of this construction is that \(x \in L\) if and only if there is a path from the initial configuration of \(N\) on \(x\) (denoted by \(s\)) to an accepting configuration (denoted by \(t\)) in \(G_{N,x}\).

The reduction \(R\) from \(L\) to Reachability is then defined as \(R(x) = (G_{N,x}, s, t)\). We need to show that this reduction can be computed in logarithmic space.

The size of \(G_{N,x}\) is at most \(2^{O(s(n))} = 2^{O(\log n)}\), which is polynomial in \(n\). Each configuration can be represented using \(O(s(n))\) space. To construct \(G_{N,x}\), we can iterate through all possible pairs of configurations and check if there is a valid transition between them. This can be done in logarithmic space since we only need to store a constant number of configurations at any given time. The fact that the size of \(G_{N,x}\) is polynomial in \(n\) implies that we can compute it using logarithmic space.

Therefore, the reduction \(R\) can be computed in logarithmic space, and we have shown that Reachability is NL-hard.

Since Reachability is both in NL and NL-hard, it is NL-complete. ◻

Definition 2. Unreachability: Given a directed graph \(G = (V, E)\) and two nodes \(u, v \in V\), determine whether there is no directed path from \(u\) to \(v\) in \(G\).

This definition introduces the unreachability problem, which is the complement of the reachability problem. It asks whether there is no path between two given nodes in a directed graph.

Corollary 5. Unreachability is NL-complete.

This corollary states that the unreachability problem is NL-complete. This follows directly from the fact that reachability is NL-complete and that NL = co-NL.

Proof. Proof. Unreachability is the complement of Reachability. Since Reachability is NL-complete, by Corollary 3, Unreachability is also NL-complete. ◻

2-SAT

Definition 3. 2-SAT: Given a Boolean formula \(\phi\) in conjunctive normal form (CNF) where each clause has at most two literals, determine whether \(\phi\) is satisfiable.

This definition introduces the 2-SAT problem, which asks whether a given Boolean formula in CNF, where each clause has at most two literals, is satisfiable.

Example 1. Consider the formula \((x \lor x) \land (\neg x \lor y) \land (\neg y \lor z) \land (\neg z)\). This formula can be simplified to \(x \land (x \to y) \land (y \to z) \land \neg z\).

This example provides a concrete 2-SAT formula and demonstrates how it can be simplified using logical equivalences.

Theorem 6. 2-SAT is NL-complete.

This theorem establishes that the 2-SAT problem is NL-complete. This is a significant result, as it connects a problem in propositional logic to the complexity class NL.

Proof. Proof. We will show that 2-SAT is in NL and that it is NL-hard.

(1) 2-SAT \(\in\) NL: Given a 2-SAT formula \(\phi\), we can construct a directed graph \(G_\phi\), called the implication graph, as follows:

  • For each variable \(x\) in \(\phi\), create two nodes in \(G_\phi\): one labeled \(x\) and one labeled \(\neg x\).

  • For each clause \((l_1 \lor l_2)\) in \(\phi\), add two directed edges to \(G_\phi\): \((\neg l_1, l_2)\) and \((\neg l_2, l_1)\). These edges represent the implications \(\neg l_1 \to l_2\) and \(\neg l_2 \to l_1\).

The crucial observation is that \(\phi\) is unsatisfiable if and only if there exists a variable \(x\) such that there is a path from \(x\) to \(\neg x\) and a path from \(\neg x\) to \(x\) in \(G_\phi\).

We can check this condition non-deterministically in logarithmic space. For each variable \(x\), we can use the non-deterministic reachability algorithm (Algorithm [alg:reachability_NL]) to check for paths from \(x\) to \(\neg x\) and from \(\neg x\) to \(x\). Since we reuse the space for each variable, the total space used remains logarithmic.

(2) 2-SAT is NL-hard: We will reduce Unreachability to 2-SAT. Let \((G, u, v)\) be an instance of Unreachability, where \(G = (V, E)\) is a directed graph and \(u, v \in V\). We construct a 2-SAT formula \(\phi_G\) as follows:

  • For each node \(w \in V\), create a variable \(x_w\) in \(\phi_G\).

  • For each edge \((a, b) \in E\), add the clause \((x_a \to x_b)\), which is equivalent to \((\neg x_a \lor x_b)\).

  • Add the clause \((x_u)\), representing that the start node \(u\) must be true.

  • Add the clause \((\neg x_v)\), representing that the target node \(v\) must be false.

The resulting formula \(\phi_G\) is a conjunction of all these clauses.

We claim that \(\phi_G\) is satisfiable if and only if there is no path from \(u\) to \(v\) in \(G\).

  • If there is no path from \(u\) to \(v\), we can assign true to all variables corresponding to nodes reachable from \(u\) and false to all other variables. This assignment satisfies \(\phi_G\).

  • If \(\phi_G\) is satisfiable, the truth assignment must set \(x_u\) to true and \(x_v\) to false. The clauses corresponding to edges ensure that if \(x_a\) is true and there is an edge \((a, b)\), then \(x_b\) must also be true. Therefore, if there were a path from \(u\) to \(v\), \(x_v\) would have to be true, contradicting the satisfiability of \(\phi_G\).

The reduction can be performed in logarithmic space since we only need to iterate through the nodes and edges of \(G\) and create corresponding variables and clauses.

Since 2-SAT is both in NL and NL-hard, it is NL-complete. ◻

Co-NP-Complete Problems

Definition 4. Unsatisfiability (UNSAT): Given a Boolean formula \(\phi\), decide whether \(\phi\) is not satisfiable. In other words, determine if there is no truth assignment to the variables of \(\phi\) that makes the formula evaluate to true.

This definition introduces the unsatisfiability problem (UNSAT), which is the complement of the satisfiability problem (SAT). It asks whether a given Boolean formula is not satisfiable.

Theorem 7. UNSAT is co-NP-complete.

This theorem states that the unsatisfiability problem (UNSAT) is co-NP-complete. This is a fundamental result, as it establishes a canonical co-NP-complete problem.

Proof. Proof. We know that SAT (Satisfiability) is NP-complete. By Theorem 2, the complement of an NP-complete problem is co-NP-complete. Since UNSAT is the complement of SAT, it follows that UNSAT is co-NP-complete. ◻

Definition 5. Validity: Given a Boolean formula \(\phi\), decide whether \(\phi\) is valid. A formula is valid if it is true for all possible truth assignments to its variables. Valid formulas are also known as tautologies.

This definition introduces the validity problem, which asks whether a given Boolean formula is valid (i.e., a tautology).

Example 2. The formula \(x \lor \neg x\) is valid because it evaluates to true regardless of the truth value assigned to \(x\).

This example provides a concrete example of a valid formula (tautology), which is the formula \(x \lor \neg x\).

Theorem 8. Validity is co-NP-complete.

This theorem states that the validity problem is co-NP-complete. This result is important because it shows that determining whether a formula is a tautology is a co-NP-complete problem.

Proof. Proof. A formula \(\phi\) is valid if and only if its negation \(\neg \phi\) is unsatisfiable. We can express this as: \[\phi \text{ is valid} \iff \neg \phi \text{ is unsatisfiable}\] Since UNSAT is co-NP-complete (Theorem 7), and Validity can be reduced to UNSAT by negating the input formula, it follows that Validity is also co-NP-complete. ◻

Theorem 9. If there exists a language \(L\) such that \(L\) is co-NP-complete and \(L \in\) NP, then NP = co-NP.

This theorem states that if we find a co-NP-complete problem that can also be solved in non-deterministic polynomial time (i.e., it is in NP), it would imply a significant collapse in the complexity hierarchy, making NP equal to co-NP. This is a major open question in complexity theory.

Theorem 10. If P \(\neq\) NP, then there exists a language \(L\) such that \(L \in\) NP, \(L \notin\) P, and \(L\) is not NP-complete. This theorem is known as Ladner’s Theorem.

Ladner’s Theorem postulates that if P and NP are distinct, then there must exist problems in NP that are neither solvable in polynomial time (i.e., not in P) nor NP-complete. These problems would reside in an intermediate zone between P and NP-complete, often referred to as NPI (NP-intermediate).

The Polynomial Hierarchy

Definition 6. A Turing machine with an oracle is an augmented Turing machine equipped with an additional oracle tape and a special query state \(q_?\). During its computation, the machine can write a string \(w\) on the oracle tape and enter the query state. Upon entering the query state, the machine receives an answer in a single step, transitioning to either the state \(q_{yes}\) or \(q_{no}\) depending on whether \(w\) belongs to a predefined oracle language \(A\).

This definition introduces the concept of an oracle Turing machine, which is a Turing machine that can make queries to an oracle (a black box that can solve a specific problem).

Definition 7. Let \(C\) and \(C'\) be complexity classes. \(C^{C'}\) denotes the class of problems solvable by a Turing machine in class \(C\) that has access to an oracle for a language in class \(C'\).

This definition introduces the notation \(C^{C'}\), which represents the class of problems solvable by a Turing machine in class \(C\) with access to an oracle for a language in class \(C'\).

Example 3. P\(^P\) represents the class of problems solvable in polynomial time by a deterministic Turing machine with an oracle for problems in P. Since a polynomial-time machine can already solve problems in P without an oracle, having access to a P oracle does not provide any additional computational power. Therefore, P\(^P\) = P.

This example illustrates that a polynomial-time Turing machine with a P oracle (P\(^P\)) has the same computational power as a standard polynomial-time Turing machine (P), since the oracle does not add any computational power.

Example 4. NP\(^P\) denotes the class of problems solvable in non-deterministic polynomial time by a non-deterministic Turing machine with an oracle for problems in P. Similar to the previous example, the P oracle does not enhance the power of the non-deterministic machine. Hence, NP\(^P\) = NP.

This example demonstrates that a non-deterministic polynomial-time Turing machine with a P oracle (NP\(^P\)) has the same computational power as a standard non-deterministic polynomial-time Turing machine (NP), since the oracle does not add any computational power.

Example 5. P\(^{NP}\) represents the class of problems solvable in polynomial time by a deterministic Turing machine with an oracle for problems in NP. This class is potentially more powerful than NP. The machine can make multiple queries to the NP oracle during its computation, and the results of these queries can influence subsequentqueries. This ability to make adaptive queries based on previous oracle answers can potentially allow the machine to solve problems outside of NP.

This example illustrates that a polynomial-time Turing machine with an NP oracle (P\(^{NP}\)) is potentially more powerful than NP. The ability to make adaptive queries to the NP oracle gives the machine additional computational power.

In P\(^{NP}\), the generation of new queries can depend on the results of previous queries. For instance, the machine might make a query \(w_1\) to the NP oracle. If the oracle returns "yes" (\(w_1\) is in the NP language), the machine might then generate a new query \(w_2\) based on this information. If the oracle returns "no," the machine might generate a different query \(w_3\). This adaptive querying process, where each query can depend on the outcomes of prior queries, is a key feature that distinguishes P\(^{NP}\) from simply making a polynomial number of independent queries to an NP oracle.

Definition 8. The polynomial hierarchy (PH) is defined recursively as follows:

  • \(\Delta_0^P = \Sigma_0^P = \Pi_0^P = P\)

  • \(\Delta_{i+1}^P = P^{\Sigma_i^P}\)

  • \(\Sigma_{i+1}^P = NP^{\Sigma_i^P}\)

  • \(\Pi_{i+1}^P = \text{co-}NP^{\Sigma_i^P}\)

  • \(PH = \bigcup_{i \in \mathbb{N}} \Sigma_i^P = \bigcup_{i \in \mathbb{N}} \Pi_i^P = \bigcup_{i \in \mathbb{N}} \Delta_i^P\)

This definition introduces the polynomial hierarchy (PH), a hierarchy of complexity classes defined using oracle Turing machines. The hierarchy is built recursively, with each level defined in terms of the previous level.

The levels of the polynomial hierarchy can be visualized as follows:

  • Level 0: \(\Delta_0^P = \Sigma_0^P = \Pi_0^P = P\)

  • Level 1: \(\Delta_1^P = P^P = P\), \(\Sigma_1^P = NP^P = NP\), \(\Pi_1^P = \text{co-}NP^P = \text{co-}NP\)

  • Level 2: \(\Delta_2^P = P^{NP}\), \(\Sigma_2^P = NP^{NP}\), \(\Pi_2^P = \text{co-}NP^{NP}\)

  • Level 3: \(\Delta_3^P = P^{\Sigma_2^P} = P^{NP^{NP}}\), \(\Sigma_3^P = NP^{\Sigma_2^P} = NP^{NP^{NP}}\), \(\Pi_3^P = \text{co-}NP^{\Sigma_2^P} = \text{co-}NP^{NP^{NP}}\)

Remark. Remark 1. If P = NP, then the polynomial hierarchy collapses to P, meaning that all classes in PH are equal to P. It is an open question whether all classes in the polynomial hierarchy are distinct. It is possible that they are all different, or that the hierarchy collapses at some level \(i > 0\), i.e., \(\Sigma_i^P = \Sigma_{i+1}^P = PH\).

This remark highlights the potential for the polynomial hierarchy to collapse if P=NP, and it also points out the open question of whether the classes in PH are all distinct or if the hierarchy collapses at some level.

Conclusion

This lecture has traversed a significant landscape of complexity theory, encompassing L-complete and NL-complete problems, the intricate relationship between complexity classes and their complements, and the challenging realm of co-NP-complete problems. A key takeaway is the profound connection established between reachability in graphs and the satisfiability of 2-SAT formulas, culminating in the demonstration that 2-SAT is NL-complete.

Furthermore, we introduced the polynomial hierarchy (PH), a complex and fascinating hierarchy of classes residing between NP and PSPACE. We defined PH using the powerful tool of Turing machines equipped with oracles, providing a glimpse into its intricate structure.

**Follow-up Questions:**

The following questions, arising from the lecture’s content, will be explored in subsequent sessions:

1. **Examples of problems in \(\Sigma_2^P\) and \(\Pi_2^P\):** What are some concrete examples of problems that reside within the \(\Sigma_2^P\) and \(\Pi_2^P\) levels of the polynomial hierarchy? Understanding these examples will provide a more tangible grasp of these classes.

2. **Proving completeness in the polynomial hierarchy:** How can we rigorously prove that a problem is complete for a specific class within the polynomial hierarchy? This involves developing techniques for demonstrating both membership and hardness.

3. **Implications of PH collapse:** What are the broader implications if the polynomial hierarchy collapses at a certain level? This collapse would have significant ramifications for the structure of complexity classes and our understanding of computational complexity.

4. **Relationship with quantified Boolean formulas:** What is the precise relationship between the polynomial hierarchy and quantified Boolean formulas (QBFs)? This connection provides a logical characterization of PH and offers an alternative perspective on its structure.

5. **Adaptive queries in P\(^{NP}\) (Elaboration):** The lecture briefly touched upon the concept of adaptive queries in P\(^{NP}\). A more detailed explanation is needed to clarify how the generation of new queries in P\(^{NP}\) dynamically depends on the results of previous queries. This adaptiveness is a crucial aspect of the power of P\(^{NP}\).

In P\(^{NP}\), a polynomial-time Turing machine has access to an NP oracle. The machine can make a series of queries to this oracle, and the crucial point is that each query can be chosen based on the answers to all preceding queries.

Consider a scenario where the machine makes an initial query \(q_1\) to the NP oracle. Based on the oracle’s response (yes or no), the machine can then decide on the next query, \(q_2\). This \(q_2\) might be entirely different depending on whether \(q_1\) was answered affirmatively or negatively. This process can continue for a polynomial number of steps.

The ability to adaptively choose queries based on prior results is what gives P\(^{NP}\) its potential power. It’s not just about making a polynomial number of independent queries; it’s about using the information gained from each query to strategically formulate the next one. This dynamic interaction with the oracle allows for a more complex computation than simply making a fixed set of queries.

These questions will be addressed in the next lecture, along with a deeper dive into the logical characterization of the polynomial hierarchy and a discussion of complete problems within these classes. We will see that the polynomial hierarchy can be defined using a logical characterization with quantified boolean formulas, providing an alternative perspective on its structure.