Lecture Notes: Complexity Theory
Introduction
This lecture explores fundamental concepts in complexity theory, focusing on the hierarchy theorem, the simulation of non-deterministic Turing machines, the gap theorem, and the relationships between time and space complexity classes. The main objectives are to:
Understand the implications of the hierarchy theorem, particularly in demonstrating the separation between complexity classes such as \(\text{P}\) and \(\text{EXP}\).
Explore the simulation of non-deterministic Turing machines (NTMs) by deterministic Turing machines (DTMs), analyzing the time and space overhead involved.
Establish inclusions and potential separations between key complexity classes, including \(\text{P}\), \(\text{NP}\), and \(\text{EXP}\).
Examine the gap theorem, a somewhat counter-intuitive result that highlights the importance of proper functions in complexity theory.
Investigate the relationships and inclusions between time and space complexity classes, laying the groundwork for more advanced topics.
These topics are crucial for understanding the landscape of computational complexity and the inherent limitations and capabilities of different computational models.
Consequences of the Hierarchy Theorem
Theorem Description: This theorem presents a direct consequence of the Time Hierarchy Theorem. It demonstrates a separation between two specific deterministic time complexity classes.
Let \(\alpha(n)\) be a computable function such that \(\lfloor \alpha(n) \rfloor\) is a proper function. Then: \[\text{TIME}(n \cdot \lfloor \alpha(n) \rfloor) \subsetneq \text{TIME}(n^3)\]
Remark 1. The result stated in Theorem [thm:hierarchy_consequence] can also be expressed in a more general form directly derived from the Time Hierarchy Theorem: Let \(f(n)\) be a proper function. Then for any function \(t(n)\) such that \(t(n) \geq f(n) \cdot \omega(\log f(n))\), where \(\omega(\log f(n))\) denotes any function that grows asymptotically faster than \(\log f(n)\), we have: \[\text{TIME}(f(n)) \subsetneq \text{TIME}(t(n))\] The form in Theorem [thm:hierarchy_consequence] is a specific instance of this general result with \(f(n) = n \cdot \lfloor \alpha(n) \rfloor\) and \(t(n) = n^3\).
Corollary Description: This corollary demonstrates a significant separation between fundamental complexity classes: polynomial time (\(\text{P}\)) and exponential time (\(\text{EXP}\)). It is a direct consequence of the Time Hierarchy Theorem. \[\text{P} \subsetneq \text{EXP}\]
Proof. By definition, the complexity class \(\text{P}\) is the union of all deterministic time complexity classes with polynomial bounds: \[\text{P} = \bigcup_{h \in \mathbb{N}} \text{TIME}(n^h)\] The complexity class \(\text{EXP}\) is the union, over all polynomials \(p(n)\), of the deterministic time complexity classes \(\text{TIME}(2^{p(n)})\). Equivalently, \[\text{EXP} = \bigcup_{h \in \mathbb{N}} \text{TIME}(2^{n^h})\]
We can establish the strict inclusion as follows:
Inclusion of P in TIME(\(2^n\)): Since for any \(h \in \mathbb{N}\), \(n^h\) is asymptotically smaller than \(2^n\), we have \(\text{TIME}(n^h) \subseteq \text{TIME}(2^n)\). Therefore, \[\text{P} = \bigcup_{h \in \mathbb{N}} \text{TIME}(n^h) \subseteq \text{TIME}(2^n)\]
Strict Inclusion by Hierarchy Theorem: Applying the Time Hierarchy Theorem (in the general form of Remark 1), for a proper function \(f(n)\) and any \(g(n)\) that grows asymptotically faster than \(f(n) \cdot \log f(n)\), we have \(\text{TIME}(f(n)) \subsetneq \text{TIME}(g(n))\). Consider \(f(n) = 2^n\) and \(g(n) = 2^{3n}\). Since \(2^n\) is a proper function and \(2^n \cdot \log 2^n = n \cdot 2^n = o(2^{3n})\), the Hierarchy Theorem implies: \[\text{TIME}(2^n) \subsetneq \text{TIME}(2^{3n})\]
Further Inclusions within EXP: Since \(3n \leq n^2\) for all \(n \geq 3\), we can further embed this within the definition of \(\text{EXP}\): \[\text{TIME}(2^{3n}) \subseteq \text{TIME}(2^{n^2})\]
Final Inclusion in EXP: Since \(\text{TIME}(2^{n^2})\) is one of the classes whose union defines \(\text{EXP}\), we have: \[\text{TIME}(2^{n^2}) \subseteq \text{EXP}\]
Combining these inclusions, we get: \[\text{P} \subseteq \text{TIME}(2^n) \subsetneq \text{TIME}(2^{3n}) \subseteq \text{TIME}(2^{n^2}) \subseteq \text{EXP}\] The crucial strict inclusion \(\text{TIME}(2^n) \subsetneq \text{TIME}(2^{3n})\) established by the Hierarchy Theorem ensures that \(\text{P}\) is strictly contained within \(\text{EXP}\). ◻
Simulation of Non-Deterministic Turing Machines
Theorem Description: This theorem demonstrates that any language decidable by a non-deterministic Turing machine within a certain time bound can also be decided by a deterministic Turing machine, albeit potentially with a significant increase in computation time. Specifically, if a language is in \(\text{NTIME}(f(n))\), it is also in \(\text{TIME}(c^{f(n)})\) for some constant \(c\). This constant \(c\) is related to the degree of non-determinism of the non-deterministic machine. If \(L \in \text{NTIME}(f(n))\), then \(L \in \text{TIME}(c^{f(n)})\) for some constant \(c\).
Proof. Let \(L \in \text{NTIME}(f(n))\). This implies the existence of a non-deterministic Turing machine (NTM) \(N\) that decides \(L\) and operates within a time bound of \(f(n)\). Consider an input string \(x\). The computation of \(N\) on \(x\) can be represented as a tree, where each node corresponds to a configuration of \(N\), and each branch represents a possible non-deterministic choice. Since \(N\) operates in time \(f(n)\), the depth of this computation tree is at most \(f(|x|)\).
We can simulate \(N\) using a deterministic Turing machine (DTM) \(M\). The core idea of the simulation is to perform a depth-first search of the computation tree of \(N\). The deterministic machine \(M\) will have multiple tapes: one for the input, working tapes to simulate the tapes of \(N\), and an additional tape to keep track of the sequence of non-deterministic choices made at each step of the simulation.
Let \(d \ge 2\) be the degree of non-determinism of \(N\), meaning that at each step, \(N\) has at most \(d\) possible next moves. The machine \(M\) simulates \(N\) as follows:
Initialization: \(M\) starts with the initial configuration of \(N\) on the input \(x\). The additional tape for non-deterministic choices is initialized. At the beginning of the simulation of a path, this tape conceptually holds a sequence representing the first choice at each step (e.g., if choices are numbered \(1\) to \(d\), the initial sequence could represent always taking the first choice).
Simulation Step: \(M\) simulates the steps of \(N\) according to the current sequence of choices on the additional tape. The choices can be represented in binary. If the sequence dictates the \(k\)-th possible transition at a given step, \(M\) performs that transition.
Acceptance: If the simulation along a particular path leads to an accepting state of \(N\), then \(M\) accepts the input \(x\).
Rejection and Backtracking: If the simulation along a path leads to a rejecting state or a dead end, \(M\) needs to backtrack and explore other possible computation paths. This is done by updating the sequence of non-deterministic choices on the additional tape. For instance, \(M\) might increment the last choice in the sequence (in binary representation). If the last choice has reached its maximum value (\(d\)), \(M\) backtracks to the previous choice and increments it, and so on, effectively performing a depth-first traversal of the computation tree.
Restarting Computation: After updating the choice sequence, \(M\) restarts the simulation of \(N\) from the initial configuration, following the new sequence of non-deterministic choices. This ensures that all possible computation paths of \(N\) are explored.
The size of the computation tree is at most \(d^{f(|x|)}\), where \(d\) is the degree of non-determinism. In the worst case, \(M\) might have to explore all these paths. Each step of \(N\) can be simulated by \(M\) in a constant number of steps, and each restarted path has length at most \(f(|x|)\), so the total number of steps performed by \(M\) is \(O(f(|x|) \cdot d^{f(|x|)})\), which is \(O(c^{f(|x|)})\) for any constant \(c > d\). Thus, \(L \in \text{TIME}(c^{f(n)})\), where the constant \(c\) is determined by the degree of non-determinism \(d\) of the machine \(N\). ◻
During the simulation, the deterministic machine \(M\) needs space to store the current configuration of \(N\) and the sequence of non-deterministic choices. The space used by the working tapes of \(N\) can be reused during the depth-first search. The primary additional space required is for the tape that stores the non-deterministic choices. If the degree of non-determinism of \(N\) is \(d\), then each choice can be represented by \(\lceil \log_2 d \rceil\) bits. Since the computation time of \(N\) is at most \(f(n)\), the maximum number of choices in any computation path is \(f(n)\). Therefore, the space required for the additional tape is at most \(f(n) \cdot \lceil \log_2 d \rceil\), which is \(O(f(n))\).
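The restart-based depth-first search described above can be sketched in Python. This is a toy model under simplifying assumptions, not a real Turing-machine simulator: a machine is abstracted as a successor function `step` (returning at most `degree` next configurations) together with an `accept` predicate, and these names are invented for illustration.

```python
from itertools import product

def simulate_ntm(step, accept, start, depth, degree):
    """Deterministic simulation of a nondeterministic machine: try every
    choice sequence of length `depth` (at most degree**depth of them),
    restarting from `start` for each sequence, as in the proof above."""
    for choices in product(range(degree), repeat=depth):
        config = start
        for c in choices:
            succs = step(config)            # at most `degree` successors
            if not succs:                   # dead end: abandon this path
                break
            config = succs[c % len(succs)]
            if accept(config):
                return True
    return False

# Toy nondeterminism: guess a subset of `items` summing to `target`.
# A configuration is (next item index, running sum).
items, target = [3, 5, 7], 12
step = lambda cfg: ([(cfg[0] + 1, cfg[1]),
                     (cfg[0] + 1, cfg[1] + items[cfg[0]])]
                    if cfg[0] < len(items) else [])
accept = lambda cfg: cfg[0] == len(items) and cfg[1] == target

print(simulate_ntm(step, accept, (0, 0), len(items), 2))   # True: 5 + 7 = 12
```

Note that at any moment the loop stores only the current choice sequence and a single configuration, mirroring the \(O(f(n))\) space analysis above: the exponential cost is paid in time (the number of sequences tried), not in space.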
Corollary Description: This corollary applies the result of the previous theorem to the complexity classes NP and EXP, showing that the class of problems solvable in non-deterministic polynomial time is a subset of the class of problems solvable in exponential time. \(\text{NP} \subseteq \text{EXP}\)
Proof. Proof. By definition, the complexity class \(\text{NP}\) is the union of all languages decidable by non-deterministic Turing machines in polynomial time: \(\text{NP} = \bigcup_{h \in \mathbb{N}} \text{NTIME}(n^h)\).
Applying Theorem [thm:ntime_to_time], for any language \(L \in \text{NTIME}(n^h)\), there exists a constant \(c\) such that \(L \in \text{TIME}(c^{n^h})\). Choosing \(2\) as the base of the exponential, \(c^{n^h} = 2^{\log_2(c) \cdot n^h}\). Since \(\log_2(c)\) is a constant, \(\log_2(c) \cdot n^h \leq n^{h+1}\) for sufficiently large \(n\), so \(\text{TIME}(2^{\log_2(c) \cdot n^h}) \subseteq \text{TIME}(2^{n^{h+1}})\).
Thus, \(\text{NTIME}(n^h) \subseteq \text{TIME}(2^{k \cdot n^h})\) for some constant \(k\), which is included in \(\text{TIME}(2^{n^{h+1}})\). Since \(\text{EXP} = \bigcup_{k \in \mathbb{N}} \text{TIME}(2^{n^k})\), we conclude that: \[\text{NP} = \bigcup_{h \in \mathbb{N}} \text{NTIME}(n^h) \subseteq \bigcup_{h \in \mathbb{N}} \text{TIME}(2^{n^{h+1}}) \subseteq \text{EXP}\] Therefore, \(\text{NP} \subseteq \text{EXP}\). ◻
Corollary Description: Given the established inclusions \(\text{P} \subseteq \text{NP} \subseteq \text{EXP}\) and the separation \(\text{P} \subsetneq \text{EXP}\), this corollary deduces that at least one of the intermediate inclusions must be strict. Either \(\text{P} \neq \text{NP}\) or \(\text{NP} \neq \text{EXP}\).
Proof. We have the sequence of inclusions \(\text{P} \subseteq \text{NP} \subseteq \text{EXP}\). From Corollary [cor:p_not_equal_exp], we know that \(\text{P} \subsetneq \text{EXP}\).
Consider the case where \(\text{P} = \text{NP}\). If this were true, then the sequence of inclusions would become \(\text{P} = \text{NP} \subseteq \text{EXP}\). Since we know \(\text{P} \subsetneq \text{EXP}\), it must follow that \(\text{NP} \subsetneq \text{EXP}\).
Now consider the case where \(\text{NP} = \text{EXP}\). If this were true, the sequence of inclusions would become \(\text{P} \subseteq \text{NP} = \text{EXP}\). Since we know \(\text{P} \subsetneq \text{EXP}\), it must follow that \(\text{P} \subsetneq \text{NP}\).
In either scenario, at least one of the inclusions \(\text{P} \subseteq \text{NP}\) or \(\text{NP} \subseteq \text{EXP}\) must be strict. Therefore, either \(\text{P} \neq \text{NP}\) or \(\text{NP} \neq \text{EXP}\). It is also possible that both inequalities are strict. ◻
The Gap Theorem
Theorem Description: The Gap Theorem presents a surprising result in complexity theory: the existence of a function \(F\) that is not proper, for which the time complexity classes \(\text{TIME}(F(n))\) and \(\text{TIME}(2^{F(n)})\) are identical. This theorem underscores the necessity of the "proper function" condition in the Time Hierarchy Theorem, as without it, the hierarchy might collapse. There exists a non-proper function \(F: \mathbb{N} \to \mathbb{N}\) such that \(\text{TIME}(F(n)) = \text{TIME}(2^{F(n)})\).
Proof. To prove this theorem, it suffices to construct a non-proper function \(F\) such that \(\text{TIME}(2^{F(n)}) \subseteq \text{TIME}(F(n))\), since the reverse inclusion always holds. Consider an enumeration of all Turing machines \(M_0, M_1, M_2, \dots\). Such an enumeration is possible by encoding each Turing machine as a binary string.
We define a predicate \(P(i, k)\) which assesses the running times of the first \(i+1\) Turing machines on inputs of length \(i\). \(P(i, k)\) is true if and only if for every machine \(M_j\) with \(0 \le j \le i\), and for every input \(x\) of length \(|x| = i\), the computation of \(M_j\) on \(x\) satisfies one of the following conditions:
\(M_j\) terminates on \(x\) in at most \(k\) steps.
\(M_j\) terminates on \(x\) in more than \(2^k\) steps.
\(M_j\) does not terminate on \(x\).
The predicate \(P(i, k)\) is decidable. To check \(P(i, k)\), we simulate each of the machines \(M_0, \dots, M_i\) on all possible inputs of length \(i\), counting the steps of each simulation. If a machine halts within \(k\) steps, the condition is met; if it halts after more than \(k\) but at most \(2^k\) steps, \(P(i, k)\) is false. If it runs for more than \(2^k\) steps without halting, we can stop the simulation: either the second or the third condition is met. Non-terminating machines do not cause the predicate to be false, as they satisfy the third condition.
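The decision procedure just described can be sketched as follows. This is a schematic model: machines are treated abstractly, and `clock(M, x, bound)` is an assumed helper that runs \(M\) on \(x\) for at most `bound` steps and reports the halting time, or `None` if it has not halted by then.

```python
from itertools import product

def predicate_P(machines, i, k, clock):
    """P(i, k): every machine M_0..M_i, on every input of length i,
    either halts within k steps or does not halt within 2**k steps."""
    for M in machines[:i + 1]:
        for bits in product("01", repeat=i):   # all binary inputs of length i
            t = clock(M, "".join(bits), 2 ** k)
            # A halting time t with k < t <= 2**k falls in the forbidden gap.
            if t is not None and t > k:
                return False
    return True

# Toy machines modeled by their running-time functions (steps on input x).
machines = [lambda x: len(x) + 1,    # M_0 halts in |x|+1 steps
            lambda x: 2 * len(x)]    # M_1 halts in 2|x| steps
clock = lambda M, x, bound: t if (t := M(x)) <= bound else None

print(predicate_P(machines, 2, 4, clock))   # True: times 3 and 4 are <= 4
print(predicate_P(machines, 2, 3, clock))   # False: M_1 takes 4 steps, in (3, 2**3]
```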
Now, we define a sequence of values \(K_j\) for a given \(i\): \(K_1 = 2i\), \(K_2 = 2^{K_1 + 1}\), \(K_3 = 2^{K_2 + 1}\), and in general, \(K_{j+1} = 2^{K_j + 1}\).
For each \(i \in \mathbb{N}\), we define \(F(i)\) to be the smallest value \(K_\ell\) in the sequence \(K_1, K_2, \dots\) such that the predicate \(P(i, K_\ell)\) is true. We need to argue that such a \(K_\ell\) always exists. Consider the behavior of the machines \(M_0, \dots, M_i\) on inputs of length \(i\). For each machine and each input, the computation either terminates in some number of steps or does not terminate. Among the finitely many terminating computations there is a maximum number of steps. Since the sequence \(K_j\) grows very rapidly, eventually there is a \(K_\ell\) large enough that every terminating computation of \(M_0, \dots, M_i\) on inputs of length \(i\) takes at most \(K_\ell\) steps, so \(P(i, K_\ell)\) eventually becomes true.
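The search for \(F(i)\) can be sketched directly from the definition. This is a sketch assuming a decidable predicate `P` is supplied (for instance, the one from the previous sketch); the stand-in predicate used below is invented purely to exercise the loop.

```python
def F(i, P):
    """Walk the sequence K_1 = 2*i, K_{j+1} = 2**(K_j + 1) and return the
    first value K with P(i, K) true, as in the construction above."""
    K = 2 * i
    while not P(i, K):
        K = 2 ** (K + 1)
    return K

# With a stand-in predicate that becomes true once k >= 10:
print(F(1, lambda i, k: k >= 10))   # 512: the sequence is 2, 8, 512, ...
```

The doubly exponential jumps of the sequence are exactly what create the "gap": no machine among \(M_0, \dots, M_i\) halts between \(F(i)\) and \(2^{F(i)}\) steps on inputs of length \(i\).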
Claim: With \(F\) defined as above, \(\text{TIME}(2^{F(n)}) \subseteq \text{TIME}(F(n))\).
Let \(L \in \text{TIME}(2^{F(n)})\). This means there exists a Turing machine \(M\) that decides \(L\) and runs in time at most \(2^{F(n)}\). Since \(M\) is in our enumeration of Turing machines, there exists an index \(j\) such that \(M = M_j\).
Consider any input \(x\) of length \(n \ge j\). By the definition of \(F\), \(F(n) = K_\ell\) for the smallest \(\ell\) such that \(P(n, K_\ell)\) is true. Since \(n \ge j\), the machine \(M_j\) is one of the machines considered in the predicate \(P(n, F(n))\). The predicate \(P(n, F(n))\) being true implies that on each input of length \(n\), \(M_j\) either terminates in at most \(F(n)\) steps, or terminates in more than \(2^{F(n)}\) steps, or does not terminate.
Since \(L \in \text{TIME}(2^{F(n)})\), \(M_j\) decides \(L\) in at most \(2^{F(n)}\) steps. This means that for inputs of length \(n\), \(M_j\) must terminate within \(2^{F(n)}\) steps. Given that \(P(n, F(n))\) is true, the only possibility is that \(M_j\) terminates in at most \(F(n)\) steps on inputs of length \(n\).
This holds for all \(n \ge j\). For inputs of length less than \(j\), there are only finitely many inputs, so we can hard-wire their answers into the finite control of \(M_j\); these cases are then decided in constant time and do not affect the bound.
Therefore, the language \(L\) can be decided by a Turing machine (namely \(M_j\)) in time \(F(n)\), so \(L \in \text{TIME}(F(n))\). This proves that \(\text{TIME}(2^{F(n)}) \subseteq \text{TIME}(F(n))\), and since the reverse inclusion \(\text{TIME}(F(n)) \subseteq \text{TIME}(2^{F(n)})\) always holds (as \(F(n) \le 2^{F(n)}\)), we have \(\text{TIME}(F(n)) = \text{TIME}(2^{F(n)})\). The function \(F\) constructed in this way grows extremely rapidly and is not a proper function. ◻
The Gap Theorem illustrates a scenario where increasing the time bound exponentially does not enlarge the class of solvable problems. This counter-intuitive result stems from the nature of the constructed function \(F\), which creates "gaps" in the possible running times of Turing machines. It emphasizes that the Time Hierarchy Theorem relies crucially on the assumption that the bounding functions are proper. Without this condition, we can construct artificial scenarios where the hierarchy collapses, demonstrating that the properness condition is not just a technical detail but a fundamental requirement for the meaningful separation of complexity classes.
Relationships Between Complexity Classes
Theorem Description: This theorem establishes fundamental relationships between deterministic and non-deterministic time and space complexity classes, assuming \(f\) is a proper function. If \(f\) is a proper function, then:
\(\text{SPACE}(f(n)) \subseteq \text{NSPACE}(f(n))\)
\(\text{TIME}(f(n)) \subseteq \text{NTIME}(f(n))\)
\(\text{TIME}(f(n)) \subseteq \text{SPACE}(f(n))\)
\(\text{SPACE}(f(n)) \subseteq \text{TIME}(c^{f(n)+\log n})\) for some constant \(c\)
\(\text{NTIME}(f(n)) \subseteq \text{SPACE}(f(n))\)
\(\text{NSPACE}(f(n)) \subseteq \text{TIME}(c^{f(n)+\log n})\) for some constant \(c\)
Proof.
\(\text{SPACE}(f(n)) \subseteq \text{NSPACE}(f(n))\): This inclusion is immediate since any deterministic Turing machine is a special case of a non-deterministic Turing machine with only one choice at each step.
\(\text{TIME}(f(n)) \subseteq \text{NTIME}(f(n))\): Similarly, a deterministic Turing machine’s computation is a specific instance of a non-deterministic Turing machine’s computation where there is no branching.
\(\text{TIME}(f(n)) \subseteq \text{SPACE}(f(n))\): Consider a deterministic Turing machine \(M\) that decides a language \(L\) in time \(f(n)\). During its computation, \(M\) has a finite number of tapes. In each step of the computation, \(M\) can write at most one symbol on each of its tapes. Therefore, after \(f(n)\) steps, the maximum amount of space that \(M\) could have used on any single tape is \(f(n)\). Since the number of tapes is constant, the total space used by \(M\) is bounded by \(f(n)\) multiplied by a constant (the number of tapes), which is \(O(f(n))\). Thus, \(L\) can be decided by a machine using space \(O(f(n))\), and by the space speed-up theorem, we can assume the space used is \(f(n)\).
\(\text{SPACE}(f(n)) \subseteq \text{TIME}(c^{f(n)+\log n})\): This result follows from inclusion 6 and inclusion 1. If a language \(L\) is in \(\text{SPACE}(f(n))\), then by inclusion 1, \(L\) is also in \(\text{NSPACE}(f(n))\). By inclusion 6, \(\text{NSPACE}(f(n)) \subseteq \text{TIME}(c^{f(n)+\log n})\). Therefore, \(\text{SPACE}(f(n)) \subseteq \text{TIME}(c^{f(n)+\log n})\).
\(\text{NTIME}(f(n)) \subseteq \text{SPACE}(f(n))\): Let \(L \in \text{NTIME}(f(n))\), meaning there exists a non-deterministic Turing machine \(N\) that decides \(L\) in time \(f(n)\). We can simulate \(N\) with a deterministic Turing machine \(M\) to show that \(L\) is also in \(\text{SPACE}(f(n))\). The machine \(M\) performs a depth-first search on the computation tree of \(N\). \(M\) uses space to store the current configuration of \(N\) along the path being explored and an additional tape to keep track of the non-deterministic choices made along that path. The key to space efficiency is that when \(M\) backtracks, it can erase the working tapes and reuse the space. Specifically, when \(M\) explores a branch of the computation tree and reaches a rejecting state or a dead end, instead of trying to precisely undo the steps, \(M\) simply updates the choice tape to represent the next branch to explore and restarts the simulation of \(N\) from the initial configuration. Since \(N\) operates in time \(f(n)\), the depth of the computation tree is at most \(f(n)\), and the space needed to store the configuration along a path is \(O(f(n))\). The space for the non-deterministic choices is also \(O(f(n))\). Thus, the total space used by \(M\) is \(O(f(n))\), and by the space speed-up theorem, we can assume the space used is \(f(n)\).
\(\text{NSPACE}(f(n)) \subseteq \text{TIME}(c^{f(n)+\log n})\): The proof of this inclusion involves the reachability method and will be detailed in the subsequent lecture.
◻
The presence of the \(+\log n\) term in the exponent of inclusions 4 and 6 addresses the difference in the typical lower bounds for time and space complexity. Time complexity is generally assumed to be at least linear (to read the input), while space complexity can be sublinear. The term \(c^{\log n}\) ensures that the resulting time complexity is at least linear when \(f(n)\) is sublinear (e.g., a constant or logarithmic). Note that \(c^{\log n} = n^{\log c}\), which is a polynomial in \(n\).
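The identity \(c^{\log n} = n^{\log c}\) noted above (with both logarithms taken to the same base, here base 2) can be sanity-checked numerically; the sample values below are arbitrary.

```python
import math

# Both sides equal 2**(log2(c) * log2(n)), so they must agree.
c, n = 3, 1024
lhs = c ** math.log2(n)     # 3**10 = 59049
rhs = n ** math.log2(c)
print(math.isclose(lhs, rhs))   # True
```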
Conclusion
This lecture has provided a comprehensive overview of several fundamental concepts in complexity theory. We began by exploring the implications of the hierarchy theorem, which rigorously demonstrates the separation between complexity classes, notably establishing that \(\text{P}\) is strictly contained within \(\text{EXP}\). We then examined the process of simulating non-deterministic Turing machines (NTMs) using deterministic Turing machines (DTMs), a technique that allowed us to understand the relationship between non-deterministic and deterministic computation and to prove the inclusion of \(\text{NP}\) within \(\text{EXP}\). The gap theorem was presented as a counter-intuitive result, emphasizing the crucial role of the "proper function" condition in establishing meaningful hierarchies between complexity classes. Finally, we laid out several key relationships between time and space complexity classes, including the important inclusion of \(\text{NTIME}(f(n))\) within \(\text{SPACE}(f(n))\).
Key Takeaways:
The hierarchy theorem is a powerful tool for proving the existence of computational problems that are inherently harder than others, establishing formal separations between complexity classes.
The simulation of NTMs by DTMs, while incurring a potential exponential time overhead, provides a fundamental understanding of the power and limitations of non-determinism.
The gap theorem serves as a cautionary tale, highlighting that the properties we expect from complexity classes depend on the nature of the functions defining their boundaries, with proper functions being a critical requirement for many standard results.
The established relationships between time and space complexity classes reveal fundamental connections and trade-offs between these essential computational resources.
Follow-up Questions:
What is the detailed proof of the inclusion \(\text{NSPACE}(f(n)) \subseteq \text{TIME}(c^{f(n)+\log n})\)? This will be the focus of the next lecture, where we will introduce and utilize the reachability method.
How does the reachability method work in detail, and why is it a significant technique not only in complexity theory but also in other fields like model checking and formal verification, as mentioned by the instructor?
Building upon the inclusions discussed, can we identify further refinements or separations between complexity classes? Are there other important relationships or theorems that build upon these fundamental connections between time and space?
These questions will guide our subsequent discussions and further deepen our understanding of the intricate landscape of computational complexity.