Turing Machines and Complexity Theory

Author

Your Name

Published

January 28, 2025

Introduction

In the previous lecture, we introduced two distinct cost criteria for analyzing computational models: logarithmic cost and uniform cost. We highlighted the significance of the logarithmic cost criterion, particularly in scenarios where the computational model has the capacity to expand intermediate results in terms of the number of bits, as is the case with operations like multiplication. This distinction is crucial because it accurately reflects the actual resources consumed by such operations.

We will delve deeper into this concept by formalizing the relationship between these cost criteria when we explore the simulation of random access machines (RAM) using Turing machines. This formalization will also allow us to revisit and further discuss the Extended Church-Turing thesis, which posits that all reasonable models of computation are equivalent in terms of their computational power, up to polynomial time differences.

  • Logarithmic Cost: Cost of an operation depends on the number of bits involved.

  • Uniform Cost: Each operation has a constant cost.

  • Extended Church-Turing Thesis: All reasonable models of computation are equivalent up to polynomial time differences.
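As a concrete illustration of the two criteria, consider repeated squaring, where the intermediate result doubles in bit length at every step. The following sketch uses a cost model of our own (not part of the lecture's formal definitions): each multiplication is charged either one unit or the bit lengths of its operands.

```python
def repeated_squaring_costs(x, k):
    """Square x a total of k times, tracking both cost criteria."""
    uniform_cost = 0       # each multiplication counts as one unit
    logarithmic_cost = 0   # each multiplication counts its operands' bit lengths
    for _ in range(k):
        logarithmic_cost += 2 * x.bit_length()
        x = x * x
        uniform_cost += 1
    return uniform_cost, logarithmic_cost

u, l = repeated_squaring_costs(3, 10)
# The uniform cost stays at 10, while the logarithmic cost grows
# exponentially in k, because the bit length doubles at every squaring.
```

This is exactly the multiplication scenario mentioned above: the uniform criterion hides the exponential growth of the operands, while the logarithmic criterion exposes it.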

Review of Unlimited Register Machines (URM) and Turing Machines

To understand the nuances of cost criteria, let’s briefly review the Unlimited Register Machine (URM) and Turing Machine models. In URM, we have the flexibility to define either a uniform cost or a logarithmic cost for each operation. The choice depends on whether we assign a constant cost to each operation, regardless of the size of the operands, or a cost that is proportional to the number of bits required to represent the operands.

Uniform Cost in Turing Machines

In contrast, Turing machines operate at a much lower level. Each fundamental operation, such as reading a symbol, replacing a symbol, moving the tape head (left or right), and changing the machine’s state, is counted as a single step. This is considered a uniform cost model because each operation is treated as having the same cost, irrespective of the specific symbol being manipulated. This is because Turing machines operate on single digits (bits), and the cardinality of the alphabet is fixed. Consequently, for Turing machines, the distinction between logarithmic and uniform cost criteria becomes irrelevant, as they effectively become the same.

  • Uniform Cost: Each basic operation (read, write, move, change state) costs one unit.

  • Logarithmic Cost: Not applicable to standard Turing machines due to fixed alphabet size and bit-level operations.

Low-Level Nature of Turing Machines

The low-level nature of Turing machines stems from their manipulation of individual bits. This contrasts with higher-level models where operations might involve entire numbers or data structures. When analyzing the complexity of algorithms, such as sorting algorithms, we often use the length of the input array as a measure of input size. However, this is not always the most accurate representation of input size, especially when dealing with models like URM, where registers can hold arbitrarily large integers. In the case of Turing machines, the input size is directly and intuitively defined as the number of cells on the tape that are used to store the input. This direct correspondence between input size and tape usage simplifies the analysis of Turing machine complexity.

  • URM: Input size can be complex, depending on register contents.

  • Turing Machine: Input size is the length of the input on the tape.
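A quick sketch of why URM input size is subtle (an illustration of our own): a single register can hold an integer whose binary representation is arbitrarily long, so counting registers and counting bits give very different input sizes.

```python
# One URM register holds this value, so the register count is 1,
# while a Turing machine needs about a thousand tape cells for it.
n = 2 ** 1000
register_count = 1
tape_cells = n.bit_length()   # number of bits needed on the tape
```

For a Turing machine there is no such ambiguity: the input size is simply the number of tape cells occupied by the input.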

Formal Definition of Turing Machines

Definition 1 (Turing Machine). A Turing machine (TM) is a theoretical model of computation that consists of a finite-state control unit and an infinite tape. Formally, a Turing machine \(M\) is defined as a 5-tuple: \[M = (K, \Sigma, \Gamma, \delta, s)\] where:

  • \(K\) is a finite set of states.

  • \(\Sigma\) is the input alphabet, a finite set of symbols that can be used as input to the machine.

  • \(\Gamma\) is the tape alphabet, a finite set of symbols that can be written on the tape. It includes the input alphabet \(\Sigma\) and a special blank symbol, denoted by \(\sqcup\), i.e., \(\Sigma \subseteq \Gamma\) and \(\sqcup \in \Gamma\).

  • \(\delta\) is the transition function, which dictates the behavior of the machine.

  • \(s \in K\) is the initial state, the state in which the machine starts its computation.

The transition function \(\delta\) is a mapping that takes the current state and the symbol currently under the tape head as input and returns the next state, the symbol to be written on the tape, and the direction in which the tape head should move. It is formally defined as: \[\delta: K \times \Gamma \rightarrow K \times \Gamma \times \{L, R, -\}\] where:

  • \(L\) represents a move of the tape head one cell to the left.

  • \(R\) represents a move of the tape head one cell to the right.

  • \(-\) represents the tape head staying in the same position.

  • States (\(K\)): Finite set controlling the machine’s behavior.

  • Input Alphabet (\(\Sigma\)): Symbols used for input.

  • Tape Alphabet (\(\Gamma\)): Symbols that can be written on the tape, including \(\Sigma\) and the blank symbol \(\sqcup\).

  • Transition Function (\(\delta\)): Determines the next state, symbol to write, and tape head movement.

  • Initial State (\(s\)): Starting state of the machine.
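The 5-tuple can be written down directly as a lookup table for \(\delta\). The sketch below encodes a toy machine of our own (the state and symbol names are assumptions, not from the lecture) that scans its input to the right and accepts at the first blank:

```python
# Minimal encoding of a transition function: delta maps
# (state, scanned symbol) -> (next state, symbol to write, head move).
BLANK, START = '_', '>'

delta = {
    ('s', START): ('scan', START, 'R'),   # convention: leave > unchanged, move right
    ('scan', '0'): ('scan', '0', 'R'),
    ('scan', '1'): ('scan', '1', 'R'),
    ('scan', BLANK): ('yes', BLANK, '-'),  # reach the blank: accept
}
```

Because \(K\) and \(\Gamma\) are finite, \(\delta\) is always a finite table like this one.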

Initial State and Starting Symbol Convention

To ensure consistent behavior at the start of the computation, we impose a convention on the initial state and the starting symbol. We assume that when the machine is in the initial state \(s\) and is reading the starting symbol \(\triangleright\) (which is a special symbol not in \(\Sigma\) and is part of \(\Gamma\)), the only allowed action is to change the state, leave the starting symbol unchanged, and move the tape head to the right. This convention ensures that the machine does not overwrite the starting symbol or move to the left of the initial position.

When in initial state \(s\) and reading the starting symbol \(\triangleright\):

  • Change state.

  • Leave \(\triangleright\) unchanged.

  • Move tape head to the right.

Configurations

Definition 2 (Configuration). A *configuration* of a Turing machine provides a snapshot of the machine’s state at a particular point in its computation. It encapsulates all the information necessary to resume the computation from that point. A configuration includes the following:

  • The content of the tape, which represents the sequence of symbols currently written on the tape.

  • The current state of the machine, which indicates the internal state of the finite-state control unit.

  • The position of the tape head, which specifies the cell on the tape that is currently being read or written.

A configuration is represented as a string of the form \(UqW\), where:

  • \(U\) is a string representing the portion of the tape to the left of the tape head, including the starting symbol \(\triangleright\).

  • \(q\) is the current state of the machine, \(q \in K\).

  • \(W\) is a string representing the portion of the tape to the right of the tape head.

The tape head is considered to be positioned on the last symbol of the string \(U\).

  • Tape Content: Symbols written on the tape.

  • Current State (\(q\)): The machine’s current internal state.

  • Tape Head Position: Indicated by the end of string \(U\).

Represented as \(UqW\).

Initial Configuration

Definition 3 (Initial Configuration). The initial configuration of a Turing machine for a given input \(X\) is the configuration at the very beginning of the computation. It is represented as \(s\triangleright X\), where:

  • \(s\) is the initial state of the machine.

  • \(\triangleright\) is the starting symbol, which is placed at the beginning of the tape.

  • \(X\) is the input string that the machine is supposed to process.

This configuration indicates that the machine starts in the initial state \(s\), with the tape head positioned on the starting symbol \(\triangleright\), and the input string \(X\) written on the tape immediately after the starting symbol.

Represented as \(s\triangleright X\), where:

  • \(s\) is the initial state.

  • \(\triangleright\) is the starting symbol.

  • \(X\) is the input string.

Final Configurations

Definition 4 (Final Configuration). *Final configurations* are those from which the computation cannot proceed further. Since the transition function \(\delta\) is a total function (i.e., it is defined for all possible combinations of states and tape symbols), the only way for a computation to halt is to reach a final state. These final states are typically designated as yes, no, or halt. A final configuration is of the form \(hUW\), where:

  • \(h\) is a final state, which can be either yes, no, or halt.

  • \(U\) is the string to the left of the tape head.

  • \(W\) is the string to the right of the tape head.

When the machine reaches a final configuration, the computation terminates, and the result of the computation is determined by the final state reached.

Represented as \(hUW\), where:

  • \(h\) is a final state (yes, no, or halt).

  • \(U\) is the string to the left of the tape head.

  • \(W\) is the string to the right of the tape head.

Computation terminates upon reaching a final configuration.

Computation Steps

Definition 5 (Computation Step). A *computation step* represents a single transition of a Turing machine from one configuration to the next. This transition is determined by the application of the transition function \(\delta\). The precise details of how a configuration changes depend on the current state, the symbol under the tape head, and the specific rules defined by \(\delta\).

A single transition from one configuration to the next, determined by the transition function \(\delta\).

Computation as a Sequence of Configurations

Definition 6 (Computation Sequence). The entire computation of a Turing machine \(M\) on a given input \(X\) can be represented as a sequence of configurations. This sequence starts with the initial configuration and proceeds step-by-step until a final configuration is reached (if the machine halts). The sequence can be formally written as: \[s\triangleright X \rightarrow U_1q_1W_1 \rightarrow U_2q_2W_2 \rightarrow \dots \rightarrow hUW\] where:

  • \(s\triangleright X\) is the initial configuration of the machine with input \(X\).

  • \(U_i q_i W_i\) represents the configuration of the machine at step \(i\).

  • \(hUW\) is a final configuration, where \(h\) is a final state (yes, no, or halt).

  • The arrow \(\rightarrow\) denotes a computation step, where the configuration on the left transitions to the configuration on the right by applying the transition function \(\delta\).

This sequence of configurations provides a complete trace of the machine’s execution, showing how the tape content, the machine state, and the tape head position change over time. If the machine does not halt, the sequence of configurations will be infinite.

A sequence of configurations: \[s\triangleright X \rightarrow U_1q_1W_1 \rightarrow U_2q_2W_2 \rightarrow \dots \rightarrow hUW\]

  • Starts with the initial configuration \(s\triangleright X\).

  • Each step \(\rightarrow\) is a transition based on \(\delta\).

  • Ends with a final configuration \(hUW\) (if the machine halts).
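The configuration sequence can be generated mechanically. The sketch below runs the transition table of a toy machine (our own example, with state names that are assumptions) and records each configuration as a string with the state written immediately before the scanned symbol, matching the initial configuration \(s\triangleright X\):

```python
BLANK, START = '_', '>'
FINAL = {'yes', 'no', 'halt'}

# Toy transition table: scan the input to the right, accept at the blank.
delta = {
    ('s', START): ('scan', START, 'R'),
    ('scan', '0'): ('scan', '0', 'R'),
    ('scan', '1'): ('scan', '1', 'R'),
    ('scan', BLANK): ('yes', BLANK, '-'),
}

def run(word):
    """Return the sequence of configurations of the toy machine on word."""
    tape = list(START + word + BLANK)
    head, state = 0, 's'
    trace = []
    while True:
        # record the configuration, state placed before the scanned symbol
        trace.append(''.join(tape[:head]) + state + ''.join(tape[head:]))
        if state in FINAL:
            return trace
        state, write, move = delta[(state, tape[head])]
        tape[head] = write
        head += {'L': -1, 'R': 1, '-': 0}[move]

trace = run('01')   # starts at 's>01_' and halts in the final state 'yes'
```

If the machine never reaches a final state, this loop never returns, mirroring an infinite configuration sequence.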

Time Complexity

The time complexity of a Turing machine \(M\) quantifies the amount of time (number of computation steps) the machine takes to complete its computation as a function of the input size. Formally, the time complexity of a Turing machine \(M\) is a function \(f: \mathbb{N} \rightarrow \mathbb{N}\) such that for all inputs \(X\) of length \(n\), the machine \(M\) terminates after at most \(f(n)\) steps. This means that for any input of size \(n\), the number of computation steps required by the machine to reach a final configuration is bounded by \(f(n)\).

Worst-Case Complexity

Definition 7 (Worst-Case Time Complexity). The time complexity defined above represents a worst-case complexity. This is because the function \(f(n)\) must provide an upper bound on the number of steps for *all* possible inputs of size \(n\). In other words, the machine must terminate within \(f(n)\) steps for every input of length \(n\), regardless of the specific input string. This ensures that the time complexity analysis provides a guarantee on the maximum time the machine will take to complete its computation.

The time complexity \(f(n)\) is an upper bound on the number of steps for all inputs of size \(n\).

Notation

Definition 8 (Turing Machine Notation). To facilitate the discussion of Turing machine computations, we introduce the following notation:

  • \(M(X)\): Represents the Turing machine \(M\) working on input \(X\). This notation is used to denote the execution of the machine on a specific input.

  • \(M(X)\downarrow\): Indicates that the Turing machine \(M\) terminates (halts) on input \(X\). This means that the machine reaches a final configuration after a finite number of steps.

  • \(M(X)\uparrow\): Indicates that the Turing machine \(M\) does not terminate on input \(X\). This means that the machine enters an infinite loop and never reaches a final configuration.

  • \(M(X) = \text{yes}\): Indicates that the Turing machine \(M\) accepts the input \(X\). This means that the machine terminates in a final state designated as yes.

  • \(M(X) = \text{no}\): Indicates that the Turing machine \(M\) refuses the input \(X\). This means that the machine terminates in a final state designated as no.

These notations will be used throughout the course to concisely describe the behavior of Turing machines on various inputs.

  • \(M(X)\): Machine \(M\) on input \(X\).

  • \(M(X)\downarrow\): \(M\) terminates on \(X\).

  • \(M(X)\uparrow\): \(M\) does not terminate on \(X\).

  • \(M(X) = \text{yes}\): \(M\) accepts \(X\).

  • \(M(X) = \text{no}\): \(M\) refuses \(X\).

Decidable and Accepted Languages

In the context of Turing machines, we can define two important concepts related to how a machine interacts with a language: deciding and accepting a language.

  • A Turing machine \(M\) decides a language \(L\) if, for all input strings \(X\):

    • If \(X\) is in the language \(L\) (i.e., \(X \in L\)), then \(M(X) = \text{yes}\). This means that the machine halts and accepts the input.

    • If \(X\) is not in the language \(L\) (i.e., \(X \notin L\)), then \(M(X) = \text{no}\). This means that the machine halts and rejects the input.

  • A Turing machine \(M\) accepts a language \(L\) if, for all input strings \(X\):

    • If \(X\) is in the language \(L\) (i.e., \(X \in L\)), then \(M(X) = \text{yes}\). This means that the machine halts and accepts the input.

    • If \(X\) is not in the language \(L\) (i.e., \(X \notin L\)), then \(M(X)\uparrow\). This means that the machine does not halt (it enters an infinite loop).

The key difference between deciding and accepting a language is that a machine that decides a language must always halt, providing a definitive answer (either yes or no), while a machine that accepts a language only needs to halt and accept if the input is in the language; it can loop indefinitely if the input is not in the language.

Recursive and Enumerable Languages

Definition 9 (Recursive and Enumerable Languages). Based on the concepts of deciding and accepting languages, we can define two important classes of languages:

  • Recursive languages are those that can be *decided* by a Turing machine. These languages are also known as decidable languages. For a recursive language, there exists a Turing machine that will always halt and correctly determine whether a given input string is in the language or not.

  • Enumerable languages (also known as recursively enumerable languages) are those that can be *accepted* by a Turing machine. For an enumerable language, there exists a Turing machine that will halt and accept if the input string is in the language, but it may loop indefinitely if the input string is not in the language.

  • The set of enumerable languages is larger than the set of recursive languages. This means that there exist languages that can be accepted by a Turing machine but cannot be decided by any Turing machine. In other words, there are languages for which we can verify membership but cannot definitively determine non-membership.

The distinction between recursive and enumerable languages is fundamental in computability theory and highlights the limitations of what can be computed by Turing machines.

  • Recursive Languages: Decided by a Turing machine (always halts).

  • Enumerable Languages: Accepted by a Turing machine (halts on acceptance, may loop otherwise).

  • Enumerable languages form a larger set than recursive languages.

Time Complexity Classes

A language \(L\) is said to be decidable in time \(f(n)\) if there exists a Turing machine \(M\) that decides \(L\) and operates within a time complexity of \(f(n)\). This means that for any input of size \(n\), the machine \(M\) will halt in at most \(f(n)\) steps and correctly determine whether the input is in the language \(L\) or not.

We define the time complexity class \(\text{Time}(f(n))\) as the set of all languages that can be decided by a Turing machine in time \(f(n)\). Formally: \[\text{Time}(f(n)) = \{ L \mid L \text{ is decidable in time } f(n) \}\] This notation allows us to group languages based on the time complexity required to decide them.

Example: Palindromes

Definition 10 (Palindrome). A string is a *palindrome* if it reads the same forwards and backwards. For example, the string \(0110\) is a palindrome, while the string \(110\) is not.

Claim: The language of palindromes is in \(\text{Time}(f(n))\), where \(f(n)\) is a quadratic function. This means that there exists a Turing machine that can decide whether a given string is a palindrome in quadratic time.

Proof Idea: A single-tape Turing machine can decide the language of palindromes by repeatedly comparing the first and last symbols of the input string and moving inwards. The machine can achieve this by:

  1. Reading the first symbol and storing it in the state.

  2. Moving to the end of the string.

  3. Comparing the last symbol with the stored symbol.

  4. If they match, move inwards and repeat the process.

  5. If they don’t match, reject the string.

  6. If all symbols match, accept the string.

This process involves scanning the string multiple times, leading to a quadratic time complexity. The complexity can be roughly expressed as \(T(n) = T(n-2) + 2n + c\), where \(T(n)\) is the time complexity for a string of length \(n\), and \(c\) is a constant representing the overhead of the operations. This recurrence relation indicates a quadratic time complexity.

Detailed Analysis: Although the exact quadratic function was not derived in the lecture, a more detailed count shows that the running time of this algorithm is approximately \(4n^2 + 2n + 1\). This is because the machine needs to scan the tape multiple times, and each scan takes time proportional to the length of the remaining string.

Input: String \(X\) on the tape, with the starting symbol \(\triangleright\) marking its beginning. Initial state: \(s\).

  1. Read the first symbol and store it in the state.

  2. Move to the end of the string.

  3. Compare the last symbol with the stored symbol.

  4. If they do not match, reject and halt.

  5. Otherwise, erase the compared symbols, move inwards, and repeat.

  6. If all symbols have matched, accept and halt.

Time Complexity: \(O(n^2)\). The algorithm performs multiple scans of the tape, each taking \(O(n)\) time, resulting in a quadratic time complexity.
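The quadratic behaviour can be checked with a simple step-count model of the bouncing scans described above (a sketch of our own: only head movements are charged, and the constants are ours, not the \(4n^2 + 2n + 1\) of the detailed analysis):

```python
def single_tape_steps(word):
    """Model the bouncing scans of the single-tape palindrome machine."""
    tape = list(word)
    lo, hi, steps = 0, len(tape) - 1, 0
    while lo < hi:
        if tape[lo] != tape[hi]:
            return steps, False          # mismatch: reject
        steps += 2 * (hi - lo)           # walk to the far end and back inwards
        lo, hi = lo + 1, hi - 1
    return steps, True                   # all pairs matched: accept

# For a palindrome of length n the count grows like n^2/2: quadratic,
# matching the recurrence T(n) = T(n-2) + 2n + c from the proof idea.
steps, ok = single_tape_steps('0110')
```

Doubling the input length roughly quadruples the step count, which is the signature of the quadratic bound.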

Multi-Tape Turing Machines

Definition 11 (Multi-Tape Turing Machine). To enhance the capabilities of our computational model, we generalize the standard Turing machine to have multiple tapes. A *multi-tape Turing machine* has \(k\) tapes, where \(k\) is a fixed positive integer. These tapes allow the machine to store and manipulate data in parallel, potentially leading to more efficient computations. The first tape is designated as the input tape, where the input is initially placed. For functional problems, the last tape is used as the output tape, where the result of the computation is written. All other tapes are initially empty, filled with blank symbols.

  • \(k\) tapes, where \(k\) is a fixed positive integer.

  • First tape is the input tape.

  • Last tape is the output tape (for functional problems).

  • Other tapes are initially empty.

Formal Definition

Definition 12 (Formal Multi-Tape TM). A \(k\)-tape Turing machine \(M\) is formally defined as a 4-tuple: \[M = (K, \Gamma, \delta, s)\] where:

  • \(K\) is a finite set of states.

  • \(\Gamma\) is the tape alphabet, a finite set of symbols that can be written on any of the \(k\) tapes.

  • \(\delta\) is the transition function, which dictates the behavior of the machine.

  • \(s \in K\) is the initial state, the state in which the machine starts its computation.

The transition function \(\delta\) for a \(k\)-tape Turing machine is defined as: \[\delta: K \times \Gamma^k \rightarrow K \times \Gamma^k \times \{L, R, -\}^k\] This function takes the current state and the symbols under the tape heads of all \(k\) tapes as input. It then returns the next state, the symbols to be written on each of the \(k\) tapes, and the direction in which each of the \(k\) tape heads should move. The movement of each tape head is independent of the others.

  • States (\(K\)): Finite set controlling the machine’s behavior.

  • Tape Alphabet (\(\Gamma\)): Symbols that can be written on any of the \(k\) tapes.

  • Transition Function (\(\delta\)): Determines the next state, symbols to write, and tape head movements for each tape.

  • Initial State (\(s\)): Starting state of the machine.

Configuration for Multi-Tape Turing Machines

Definition 13 (Multi-Tape TM Configuration). A configuration of a \(k\)-tape Turing machine includes the state of the machine and the content of each of the \(k\) tapes. It is represented as: \[q(U_1, W_1), (U_2, W_2), \dots, (U_k, W_k)\] where:

  • \(q\) is the current state of the machine.

  • \((U_i, W_i)\) represents the content of the \(i\)-th tape, where \(U_i\) is the string to the left of the tape head (including the starting symbol), and \(W_i\) is the string to the right of the tape head. The tape head is considered to be positioned on the last symbol of \(U_i\).

This configuration provides a complete snapshot of the machine’s state at a given point in its computation, including the state and the content of all tapes.

Represented as: \[q(U_1, W_1), (U_2, W_2), \dots, (U_k, W_k)\]

  • \(q\) is the current state.

  • \((U_i, W_i)\) represents the content of the \(i\)-th tape.

Example: Palindromes with Two Tapes

Claim: The language of palindromes can be decided in linear time using a two-tape Turing machine. This is a significant improvement over the quadratic time complexity achieved with a single-tape Turing machine.

Proof Idea: The two-tape Turing machine decides the language of palindromes using the following steps:

  1. Copy the input to the second tape: The machine reads the input string from the first tape and simultaneously writes it onto the second tape. This step takes \(n+1\) steps, where \(n\) is the length of the input string. The extra step is to move the head to the first blank symbol after the input.

  2. Move the head of the first tape backward: The machine moves the tape head on the first tape back to the beginning of the input string. This step takes \(n+1\) steps.

  3. Compare the symbols on the two tapes: The machine compares the symbols on the two tapes by moving the tape head on the first tape to the right and the tape head on the second tape to the left. If all symbols match, the machine accepts the input; otherwise, it rejects the input. This step takes at most \(n+1\) steps.

Input: String \(X\) on tape 1; tape 2 is empty.

  1. Copy \(X\) from tape 1 to tape 2.

  2. Move the tape 1 head to the beginning of \(X\) and the tape 2 head to the end of \(X\).

  3. While the tape 1 head is not at the end of \(X\): if the symbol under the tape 1 head differs from the symbol under the tape 2 head, reject and halt; otherwise, move the tape 1 head right and the tape 2 head left.

  4. Accept and halt.

Complexity Analysis: The copying step takes \(n+1\) steps. Moving the first tape head back to the beginning takes \(n+1\) steps. The comparison step takes at most \(n+1\) steps. Therefore, the total number of steps is approximately \(3n + 3\), which is a linear function of the input size \(n\). Thus, the language of palindromes can be decided in linear time using a two-tape Turing machine.

Time Complexity: \(O(n)\). The algorithm performs a constant number of passes over the input, each taking \(O(n)\) time, resulting in a linear time complexity.
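The \(3n + 3\) count can be reproduced with the same kind of step-count sketch (again a model of our own, charging one step per head movement):

```python
def two_tape_steps(word):
    """Model the three phases of the two-tape palindrome machine."""
    n = len(word)
    tape2 = list(word)            # phase 1: copy the input to tape 2
    steps = (n + 1) + (n + 1)     # copy, then rewind the tape-1 head
    for i in range(n):            # phase 3: heads move in opposite directions
        steps += 1
        if word[i] != tape2[n - 1 - i]:
            return steps, False   # mismatch: reject
    return steps + 1, True        # final step reaches the end of the input

steps, ok = two_tape_steps('0110')   # exactly 3*n + 3 steps for a palindrome
```

The second tape removes the need for bouncing scans, which is where the quadratic-to-linear improvement comes from.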

Comparison of Single-Tape and Multi-Tape Turing Machines

Remark. Having introduced both single-tape and multi-tape Turing machines, it is natural to compare their computational power and efficiency. While multi-tape Turing machines can often solve problems more efficiently, it is important to understand how they relate to single-tape Turing machines in terms of time complexity. The following theorem establishes a fundamental relationship between these two models.

Theorem 1. Any \(k\)-tape Turing machine \(M\) that operates in time \(f(n)\) can be simulated by a single-tape Turing machine \(M'\) that operates in time \(O(f(n)^2)\), provided that \(f(n) \geq n\).

Proof Idea: The proof of this theorem involves constructing a single-tape Turing machine \(M'\) that simulates the behavior of a given \(k\)-tape Turing machine \(M\). The key idea is to store the contents of all \(k\) tapes of \(M\) on the single tape of \(M'\), separated by special symbols. The single-tape machine \(M'\) simulates each step of the multi-tape machine \(M\) by scanning its tape back and forth, updating the tape contents, and moving the simulated tape heads.

Simulation Details:

  1. Tape Representation: The single-tape machine \(M'\) represents the contents of the \(k\) tapes of \(M\) on its single tape. Each tape’s content is separated by a special delimiter symbol, and the current head position on each tape is marked by an underlined version of the symbol.

  2. Extended Alphabet: The single-tape machine \(M'\) uses an extended alphabet that includes the original alphabet \(\Gamma\), underlined versions of the symbols in \(\Gamma\) to mark the head positions, and special symbols to delimit the tape contents.

  3. Simulation Step: To simulate one step of the multi-tape machine \(M\), the single-tape machine \(M'\) performs the following:

    1. Scans its tape to determine the symbols under the simulated heads of \(M\).

    2. Stores these symbols in its state.

    3. Applies the transition function of \(M\) to determine the changes to be made.

    4. Updates the tape content, moves the simulated heads, and creates space if needed.

Key Points:

  • The single-tape machine \(M'\) uses an extended alphabet to mark the head positions on each of the \(k\) tapes of \(M\).

  • Simulating one step of the multi-tape machine \(M\) requires \(O(n + f(n))\) steps on the single-tape machine \(M'\). This is because the single-tape machine needs to scan its tape, which can have a length of \(O(n + f(n))\), to simulate the actions of the multi-tape machine.

  • The total time complexity of the simulation is \(O(f(n) \cdot (n + f(n)))\). Since we assume \(f(n) \geq n\), this simplifies to \(O(f(n)^2)\).

Detailed Analysis of Simulation Time: The single-tape machine \(M'\) simulates each step of the \(k\)-tape machine \(M\) by scanning its tape, which contains the contents of all \(k\) tapes of \(M\). The length of the tape of \(M'\) is at most \(O(n + k \cdot f(n))\), where \(n\) is the length of the input and \(f(n)\) is the number of steps performed by \(M\). Since \(k\) is a constant, the length of the tape of \(M'\) is \(O(n + f(n))\). To simulate one step of \(M\), \(M'\) needs to scan its tape, which takes \(O(n + f(n))\) steps. Since \(M\) performs \(f(n)\) steps, the total time complexity of \(M'\) is \(O(f(n) \cdot (n + f(n)))\). Given the condition that \(f(n) \geq n\), the total time complexity is \(O(f(n)^2)\).

Constant Factor: The exact constant factor in the simulation time depends on the number of tapes \(k\) and the need to shift tape contents when a tape head moves to a previously unvisited cell. In the worst case, the single-tape machine may need to shift the contents of all other tapes to create space for a new symbol on one of the tapes. This shifting operation contributes to the constant factor in the \(O(f(n)^2)\) complexity.

Time Complexity: \(O(f(n)^2)\). A \(k\)-tape TM running in time \(f(n)\) can be simulated by a single-tape TM in time \(O(f(n)^2)\), assuming \(f(n) \geq n\).
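The tape encoding used in the proof can be sketched concretely. Below, "underlining" a head position is modelled as a boolean flag on each cell (the representation details are our own); recovering all \(k\) scanned symbols takes one scan of the whole combined tape, which is why each simulated step costs \(O(n + f(n))\):

```python
def encode(tapes, heads):
    """Flatten k tapes into one tape of (symbol, is_head) cells."""
    cells = []
    for tape, head in zip(tapes, heads):
        for i, sym in enumerate(tape):
            cells.append((sym, i == head))   # mark the scanned cell
        cells.append(('#', False))           # delimiter between simulated tapes
    return cells

def scanned_symbols(cells):
    """One left-to-right scan recovers the k scanned symbols."""
    return [sym for sym, marked in cells if marked]

# Two simulated tapes: '>01_' with its head on '1', and '>_' with its
# head on the starting symbol.
cells = encode([list('>01_'), list('>_')], [2, 0])
```

After the scan, the single-tape machine holds the \(k\) scanned symbols in its state, applies \(\delta\) of the multi-tape machine, and makes a second pass to write the updates and move the markers.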

Extended Church-Turing Thesis

Remark. The Extended Church-Turing thesis is a fundamental principle in theoretical computer science that posits that all reasonable models of computation are polynomially related in terms of their time complexity. This means that if a problem can be solved in polynomial time on one reasonable model of computation, it can also be solved in polynomial time on any other reasonable model of computation, with at most a polynomial increase in the time complexity.

Theorem 1, which demonstrates that any \(k\)-tape Turing machine can be simulated by a single-tape Turing machine with at most a quadratic increase in time complexity, is a specific instance of the Extended Church-Turing thesis. This theorem shows that single-tape and multi-tape Turing machines are polynomially related, as the simulation incurs a polynomial overhead.

The Extended Church-Turing thesis is not a theorem that can be proven mathematically, but rather a hypothesis supported by extensive evidence. It suggests that the notion of polynomial time is a robust measure of computational efficiency, independent of the specific model of computation used. This thesis provides a strong foundation for the study of computational complexity, allowing us to focus on the inherent difficulty of problems rather than the specific details of the computational models used to solve them.

  • Core Idea: All reasonable models of computation are polynomially related in terms of time complexity.

  • Implication: If a problem is solvable in polynomial time on one model, it is solvable in polynomial time on any other reasonable model (with at most a polynomial overhead).

  • Status: A hypothesis supported by extensive evidence, not a provable theorem.

Importance of the Condition \(f(n) \geq n\)

Remark. The condition \(f(n) \geq n\) in Theorem 1 is not merely a technical detail; it is a crucial requirement for the quadratic relationship between the time complexity of multi-tape and single-tape Turing machines. This condition reflects a fundamental aspect of computation: that at least linear time is required to read the input.

In the context of Turing machines, at least one step is needed to read each symbol of the input. Therefore, any time complexity function \(f(n)\) must be at least linear in \(n\). The condition \(f(n) \geq n\) ensures that the simulation of a multi-tape Turing machine by a single-tape Turing machine does not lead to a time complexity that is less than linear.

This condition is also important because it allows us to simplify the time complexity analysis. Without this condition, the time complexity of the simulation would be \(O(f(n) \cdot (n + f(n)))\), which is more complex. By assuming \(f(n) \geq n\), we can simplify this expression to \(O(f(n)^2)\), which is a more manageable and intuitive result.

The condition \(f(n) \geq n\) highlights that the time complexity of a Turing machine is fundamentally tied to the size of the input. It emphasizes that the act of reading the input itself is a computational step that cannot be ignored. This is a key consideration when analyzing the time complexity of algorithms and computational models.

  • Fundamental Requirement: At least linear time is needed to read the input.

  • Ensures Linear Lower Bound: Prevents the simulation time complexity from being less than linear.

  • Simplifies Analysis: Allows the simulation time complexity to be simplified to \(O(f(n)^2)\).

  • Reflects Input Dependency: Highlights that time complexity is fundamentally tied to input size.

Future Topics

In the upcoming lectures, we will delve deeper into the implications of time complexities lower than linear and explore further into complexity theory. We will also discuss the limitations of Turing machines and introduce more advanced models of computation. This will enable us to understand the landscape of computational complexity and the inherent difficulty of various computational problems.

  • Implications of time complexities lower than linear.

  • Further exploration of complexity theory.

  • Limitations of Turing machines.

  • Introduction of more advanced models of computation.