Lecture Notes: Quantified Boolean Formulas (QBF) and First-Order Logic

Author

Your Name

Published

February 5, 2025

Introduction

This lecture focuses on Quantified Boolean Formulas (QBF), their complexity, and an introduction to first-order logic. We will explore the model checking and satisfiability problems for QBF, highlighting the role of quantifier alternation. We will also discuss the polynomial hierarchy and demonstrate how QBF can be used to solve reachability problems more succinctly than propositional logic. Finally, we will introduce first-order logic (FO) as a generalization of propositional logic and, in some sense, QBF.

Quantified Boolean Formulas (QBF)

Model Checking and Satisfiability

We revisit the model checking and satisfiability problems for QBF, which are both PSPACE-complete.

Model Checking: Given a QBF $\phi$ and a model $M$, determine if $M \models \phi$.
Satisfiability: Given a QBF $\phi$, determine if there exists a model $M$ such that $M \models \phi$.

The complexity of both model checking and satisfiability for QBF arises from the alternation between existential and universal quantifiers.

This alternation can be viewed as a game between two players:

The green player (existential) tries to find a valuation that makes the formula true.
The red player (universal) tries to find a valuation that makes the formula false.

Analogy with Mathematical Proofs: The process of proving a theorem with alternating quantifiers (e.g., $\forall x, \exists y, \dots$) can be seen as a game where the opponent chooses an arbitrary $x$, and the prover must construct a $y$ that satisfies the given condition. For example, the Pumping Lemma for regular languages involves such alternating quantifications.

Prenex Normal Form

Definition 1 (Prenex Normal Form). A QBF is in prenex normal form if all quantifiers are at the beginning of the formula, followed by a quantifier-free propositional formula. It has the form: \[Q_1 x_1 Q_2 x_2 \dots Q_n x_n \phi\] where $Q_i \in \{\exists, \forall\}$ and $\phi$ is a propositional formula.

Definition 1 defines the structure of a QBF in prenex normal form, where all quantifiers precede a quantifier-free propositional formula.

Lemma 1 (Prenex Normal Form Transformation). Every QBF can be transformed into an equivalent formula in prenex normal form in linear time.

Lemma 1 states that any QBF can be converted into an equivalent prenex normal form in linear time.

Proof. Proof. The transformation involves pulling quantifiers out while renaming variables to avoid capturing free variables.

For example, consider the formula $(\exists p \, \phi_1) \lor \phi_2$. If $p$ occurs free in $\phi_2$, we rename $p$ to a fresh variable $q$ in $\phi_1$: \[(\exists p \, \phi_1) \lor \phi_2 \equiv \exists q \, (\phi_1[p/q] \lor \phi_2)\] where $q$ is a fresh variable not occurring in $\phi_2$.
Similarly, for $(\forall p \, \phi_1) \lor \phi_2$, we have: \[(\forall p \, \phi_1) \lor \phi_2 \equiv \forall q \, (\phi_1[p/q] \lor \phi_2)\]

This process is repeated until all quantifiers are pulled to the front. Each step takes linear time, and the total number of steps is linear in the size of the formula. ◻

Quantifier Alternation and Complexity

The number of alternations between blocks of existential and universal quantifiers is a crucial complexity parameter for QBF.

Example: In the formula $\exists p \exists q \forall r \exists s \forall t \forall u \, \phi$, there are three quantifier alternations.

Polynomial Hierarchy

The polynomial hierarchy is a hierarchy of complexity classes that generalizes NP and co-NP.

$\Sigma_1^P = \text{NP}$: Problems solvable in polynomial time with an existential guess (certificate). Formally, $L \in \Sigma_1^P$ if there exists a polynomial-time decidable relation $R$ and a polynomial $p$ such that $x \in L \iff \exists y, |y| \leq p(|x|) \land R(x, y)$.
$\Pi_1^P = \text{co-NP}$: Problems solvable in polynomial time with a universal guess. Formally, $L \in \Pi_1^P$ if there exists a polynomial-time decidable relation $R$ and a polynomial $p$ such that $x \in L \iff \forall y, |y| \leq p(|x|) \land R(x, y)$.
$\Sigma_2^P$: Problems solvable with an existential guess followed by a universal guess: $\exists x \forall y \, P(x, y)$, where $P$ is a polynomial-time predicate. Formally, $L \in \Sigma_2^P$ if there exists a polynomial-time decidable relation $R$ and polynomials $p, q$ such that $x \in L \iff \exists y, |y| \leq p(|x|) \land \forall z, |z| \leq q(|x|) \land R(x, y, z)$.
$\Pi_2^P$: Problems solvable with a universal guess followed by an existential guess: $\forall x \exists y \, P(x, y)$. Formally, $L \in \Pi_2^P$ if there exists a polynomial-time decidable relation $R$ and polynomials $p, q$ such that $x \in L \iff \forall y, |y| \leq p(|x|) \land \exists z, |z| \leq q(|x|) \land R(x, y, z)$.

In general, $\Sigma_n^P$ and $\Pi_n^P$ involve $n$ alternating quantifiers, starting with an existential quantifier for $\Sigma_n^P$ and a universal quantifier for $\Pi_n^P$. The polynomial hierarchy is believed to be strictly contained in PSPACE.

Theorem 2. Model checking and satisfiability for QBF in prenex normal form with a bounded number of quantifier alternations ($n$) are in $\Sigma_n^P$ if the first quantifier is existential and in $\Pi_n^P$ if the first quantifier is universal.

Theorem 2 describes the relationship between QBF with bounded quantifier alternations and the polynomial hierarchy. Specifically, the complexity of model checking and satisfiability for such QBFs falls within the $\Sigma_n^P$ or $\Pi_n^P$ classes based on the initial quantifier.

Bounding quantifier alternations reduces the complexity from PSPACE to a level within the polynomial hierarchy. Each alternation corresponds to a level in the hierarchy.

Reachability Problem with QBF

We can use QBF to solve reachability problems more succinctly than with propositional logic.

Problem Setup:

A machine with $k$ bits representing its state.
$\phi_{init}(s)$: A formula representing initial states.
$\phi_{target}(s)$: A formula representing target states.
$\phi_{step}(s, s')$: A formula representing valid transitions between states $s$ and $s'$.

Goal: Determine if there is a sequence of $N = 2^k$ steps from an initial state to a target state.

Propositional Logic Approach: The formula $\phi_{reach}$ is constructed as: \[\exists p_1, \dots, p_N (\phi_{init}(p_1) \land \phi_{target}(p_N) \land \bigwedge_{i=1}^{N-1} \phi_{step}(p_i, p_{i+1}))\] This formula has exponential size ($O(N)$), since $N=2^k$.

QBF Approach: We define $\phi_{run}^N(p, q)$ inductively, where $\phi_{run}^N(p,q)$ means that state $q$ is reachable from state $p$ in exactly $N$ steps:

$\phi_{run}^1(p, q) = \phi_{step}(p, q)$
$\phi_{run}^N(p, q) = \exists r (\phi_{run}^{N/2}(p, r) \land \phi_{run}^{N/2}(r, q))$

This can be rewritten using universal quantification to avoid duplicating $\phi_{run}^{N/2}$: \[\label{eq:phi_run_qbf} \phi_{run}^N(p, q) = \exists r \forall s \forall t (((s, t) = (p, r) \lor (s, t) = (r, q)) \to \phi_{run}^{N/2}(s, t))\] where $(s, t) = (p, r)$ is a shorthand for the conjunction of equivalences between corresponding bits of the tuples. For example, if $s = (s_1, s_2, s_3)$ and $p = (p_1, p_2, p_3)$, then $(s = p)$ is equivalent to $(s_1 \leftrightarrow p_1) \land (s_2 \leftrightarrow p_2) \land (s_3 \leftrightarrow p_3)$.

The formula $\phi_{run}^N(p, q)$ states that there exists an intermediate state $r$ such that either $(s, t)$ is equivalent to $(p, r)$ or $(s, t)$ is equivalent to $(r, q)$. In both cases, $\phi_{run}^{N/2}(s, t)$ must hold. This cleverly uses universal quantification to avoid explicitly writing two copies of $\phi_{run}^{N/2}$.

Result: The QBF formula has polynomial size ($O(k^2)$) and uses quantifier alternation. The reachability problem can be solved in polynomial space. The size is polynomial because each inductive step adds a constant number of symbols, and there are $\log N = k$ steps.

Propositional Logic: Exponential size formula ($O(N)$), NP-complete satisfiability.
QBF: Polynomial size formula ($O(k^2)$), PSPACE-complete satisfiability.

QBF provides a more succinct representation, but the problem becomes PSPACE-complete instead of NP-complete. However, since the original reachability problem is PSPACE-complete, using QBF is more efficient in this case.

Introduction to First-Order Logic

First-order logic (FO) extends propositional logic by introducing several key components:

Variables: These represent objects in a domain of discourse (e.g., numbers, strings, people). Unlike propositional variables, they do not stand for truth values but for elements of a set.
Quantifiers: $\forall$ (for all) and $\exists$ (there exists) are used to make statements about all or some objects in the domain. They quantify over objects, not truth values as in QBF.
Relational Symbols: These represent relations between objects. For example, $R(x, y)$ might represent "$x$ is related to $y$", and $S(x, y, z)$ might represent a three-way relationship between $x$, $y$, and $z$.
Function Symbols: These represent functions that map objects to objects. For instance, $f(x)$ could represent the successor of $x$, and $g(x,y)$ could represent the sum of $x$ and $y$.
Constants: These represent specific, individual objects in the domain. Examples include 0, 1 (in the domain of numbers), or "Socrates" (in the domain of people).

Signature: The set of relational, function, and constant symbols used in a first-order formula is called its signature. It defines the vocabulary for describing a particular domain.

Example: A simple first-order formula: \[\forall x \exists y (R(x, y) \land S(y, x))\] This formula states that for every object $x$, there exists an object $y$ such that $R(x, y)$ and $S(y, x)$ are both true. The interpretation of $R$ and $S$ depends on the specific domain and the meaning assigned to these relational symbols.

Note: First-order logic generalizes propositional logic and, in a sense, QBF. In FO, quantifiers range over objects in a domain, whereas in QBF, quantifiers range over Boolean values. This difference in the nature of quantification makes FO more expressive in many contexts. We will explore the connection between FO and QBF later in the course.

Conclusion

In this lecture, we delved deeper into Quantified Boolean Formulas (QBF), exploring their complexity through the lens of quantifier alternation and the polynomial hierarchy. We demonstrated how QBF can provide a more succinct representation for solving reachability problems compared to propositional logic, albeit at the cost of increased computational complexity (PSPACE-completeness).

The key takeaway is the trade-off between succinctness and complexity when using QBF. While QBF can express certain problems (like reachability) more compactly, the computational cost of solving QBF instances is generally higher than that of propositional logic.

We also introduced first-order logic (FO) as a powerful generalization of propositional logic, incorporating variables, quantifiers over objects, relational symbols, function symbols, and constants. This sets the stage for future discussions on more expressive logical formalisms.

Follow-up Questions:

How does the choice of signature affect the expressiveness of first-order logic?
Can we find other examples where QBF provides a more succinct representation than propositional logic?
What are the practical applications of first-order logic in computer science? (e.g., databases, program verification, artificial intelligence)

Next Lecture Preview: In the next lecture, we will delve deeper into the syntax and semantics of first-order logic, exploring concepts such as interpretations, models, and satisfiability. We will also discuss the relationship between first-order logic and other formalisms, such as relational algebra and database query languages. We will examine how first-order logic serves as a foundation for many areas of computer science.

--- title: "Lecture Notes: Quantified Boolean Formulas (QBF) and First-Order Logic" author: "Your Name" date: "2025-02-05" format: html: toc: true # Table of Contents toc-depth: 2 code-tools: true theme: cosmo # Or "journal" for Distill-like minimalism --- # Introduction This lecture focuses on Quantified Boolean Formulas (QBF), their complexity, and an introduction to first-order logic. We will explore the model checking and satisfiability problems for QBF, highlighting the role of quantifier alternation. We will also discuss the polynomial hierarchy and demonstrate how QBF can be used to solve reachability problems more succinctly than propositional logic. Finally, we will introduce first-order logic (FO) as a generalization of propositional logic and, in some sense, QBF. # Quantified Boolean Formulas (QBF) ## Model Checking and Satisfiability We revisit the model checking and satisfiability problems for QBF, which are both PSPACE-complete. - **Model Checking**: Given a QBF $\phi$ and a model $M$, determine if $M \models \phi$. - **Satisfiability**: Given a QBF $\phi$, determine if there exists a model $M$ such that $M \models \phi$. ::: tcolorbox The complexity of both model checking and satisfiability for QBF arises from the alternation between existential and universal quantifiers. ::: This alternation can be viewed as a game between two players: - The **green player** (existential) tries to find a valuation that makes the formula true. - The **red player** (universal) tries to find a valuation that makes the formula false. **Analogy with Mathematical Proofs**: The process of proving a theorem with alternating quantifiers (e.g., $\forall x, \exists y, \dots$) can be seen as a game where the opponent chooses an arbitrary $x$, and the prover must construct a $y$ that satisfies the given condition. For example, the Pumping Lemma for regular languages involves such alternating quantifications. ## Prenex Normal Form ::: {.def:prenex_normal_form .definition} **Definition 1** (Prenex Normal Form). *A QBF is in **prenex normal form** if all quantifiers are at the beginning of the formula, followed by a quantifier-free propositional formula. It has the form: $$Q_1 x_1 Q_2 x_2 \dots Q_n x_n \phi$$ where $Q_i \in \{\exists, \forall\}$ and $\phi$ is a propositional formula.* ::: ::: tcolorbox **Definition [1](#def:prenex_normal_form){reference-type="ref" reference="def:prenex_normal_form"}** defines the structure of a QBF in prenex normal form, where all quantifiers precede a quantifier-free propositional formula. ::: ::: {.lemma:pnf .lemma} **Lemma 1** (Prenex Normal Form Transformation). *Every QBF can be transformed into an equivalent formula in prenex normal form in linear time.* ::: ::: tcolorbox **Lemma [1](#lemma:pnf){reference-type="ref" reference="lemma:pnf"}** states that any QBF can be converted into an equivalent prenex normal form in linear time. ::: ::: proof *Proof.* The transformation involves pulling quantifiers out while renaming variables to avoid capturing free variables. - For example, consider the formula $(\exists p \, \phi_1) \lor \phi_2$. If $p$ occurs free in $\phi_2$, we rename $p$ to a fresh variable $q$ in $\phi_1$: $$(\exists p \, \phi_1) \lor \phi_2 \equiv \exists q \, (\phi_1[p/q] \lor \phi_2)$$ where $q$ is a fresh variable not occurring in $\phi_2$. - Similarly, for $(\forall p \, \phi_1) \lor \phi_2$, we have: $$(\forall p \, \phi_1) \lor \phi_2 \equiv \forall q \, (\phi_1[p/q] \lor \phi_2)$$ This process is repeated until all quantifiers are pulled to the front. Each step takes linear time, and the total number of steps is linear in the size of the formula. ◻ ::: ## Quantifier Alternation and Complexity The number of alternations between blocks of existential and universal quantifiers is a crucial complexity parameter for QBF. **Example**: In the formula $\exists p \exists q \forall r \exists s \forall t \forall u \, \phi$, there are three quantifier alternations. ## Polynomial Hierarchy The polynomial hierarchy is a hierarchy of complexity classes that generalizes NP and co-NP. - $\Sigma_1^P = \text{NP}$: Problems solvable in polynomial time with an existential guess (certificate). Formally, $L \in \Sigma_1^P$ if there exists a polynomial-time decidable relation $R$ and a polynomial $p$ such that $x \in L \iff \exists y, |y| \leq p(|x|) \land R(x, y)$. - $\Pi_1^P = \text{co-NP}$: Problems solvable in polynomial time with a universal guess. Formally, $L \in \Pi_1^P$ if there exists a polynomial-time decidable relation $R$ and a polynomial $p$ such that $x \in L \iff \forall y, |y| \leq p(|x|) \land R(x, y)$. - $\Sigma_2^P$: Problems solvable with an existential guess followed by a universal guess: $\exists x \forall y \, P(x, y)$, where $P$ is a polynomial-time predicate. Formally, $L \in \Sigma_2^P$ if there exists a polynomial-time decidable relation $R$ and polynomials $p, q$ such that $x \in L \iff \exists y, |y| \leq p(|x|) \land \forall z, |z| \leq q(|x|) \land R(x, y, z)$. - $\Pi_2^P$: Problems solvable with a universal guess followed by an existential guess: $\forall x \exists y \, P(x, y)$. Formally, $L \in \Pi_2^P$ if there exists a polynomial-time decidable relation $R$ and polynomials $p, q$ such that $x \in L \iff \forall y, |y| \leq p(|x|) \land \exists z, |z| \leq q(|x|) \land R(x, y, z)$. In general, $\Sigma_n^P$ and $\Pi_n^P$ involve $n$ alternating quantifiers, starting with an existential quantifier for $\Sigma_n^P$ and a universal quantifier for $\Pi_n^P$. The polynomial hierarchy is believed to be strictly contained in PSPACE. ::: {.thm:qbf_poly_hierarchy .theorem} **Theorem 2**. *Model checking and satisfiability for QBF in prenex normal form with a bounded number of quantifier alternations ($n$) are in $\Sigma_n^P$ if the first quantifier is existential and in $\Pi_n^P$ if the first quantifier is universal.* ::: ::: tcolorbox **Theorem [2](#thm:qbf_poly_hierarchy){reference-type="ref" reference="thm:qbf_poly_hierarchy"}** describes the relationship between QBF with bounded quantifier alternations and the polynomial hierarchy. Specifically, the complexity of model checking and satisfiability for such QBFs falls within the $\Sigma_n^P$ or $\Pi_n^P$ classes based on the initial quantifier. ::: ::: tcolorbox Bounding quantifier alternations reduces the complexity from PSPACE to a level within the polynomial hierarchy. Each alternation corresponds to a level in the hierarchy. ::: ## Reachability Problem with QBF We can use QBF to solve reachability problems more succinctly than with propositional logic. **Problem Setup**: - A machine with $k$ bits representing its state. - $\phi_{init}(s)$: A formula representing initial states. - $\phi_{target}(s)$: A formula representing target states. - $\phi_{step}(s, s')$: A formula representing valid transitions between states $s$ and $s'$. **Goal**: Determine if there is a sequence of $N = 2^k$ steps from an initial state to a target state. **Propositional Logic Approach**: The formula $\phi_{reach}$ is constructed as: $$\exists p_1, \dots, p_N (\phi_{init}(p_1) \land \phi_{target}(p_N) \land \bigwedge_{i=1}^{N-1} \phi_{step}(p_i, p_{i+1}))$$ This formula has exponential size ($O(N)$), since $N=2^k$. **QBF Approach**: We define $\phi_{run}^N(p, q)$ inductively, where $\phi_{run}^N(p,q)$ means that state $q$ is reachable from state $p$ in exactly $N$ steps: - $\phi_{run}^1(p, q) = \phi_{step}(p, q)$ - $\phi_{run}^N(p, q) = \exists r (\phi_{run}^{N/2}(p, r) \land \phi_{run}^{N/2}(r, q))$ This can be rewritten using universal quantification to avoid duplicating $\phi_{run}^{N/2}$: $$\label{eq:phi_run_qbf} \phi_{run}^N(p, q) = \exists r \forall s \forall t (((s, t) = (p, r) \lor (s, t) = (r, q)) \to \phi_{run}^{N/2}(s, t))$$ where $(s, t) = (p, r)$ is a shorthand for the conjunction of equivalences between corresponding bits of the tuples. For example, if $s = (s_1, s_2, s_3)$ and $p = (p_1, p_2, p_3)$, then $(s = p)$ is equivalent to $(s_1 \leftrightarrow p_1) \land (s_2 \leftrightarrow p_2) \land (s_3 \leftrightarrow p_3)$. ::: tcolorbox The formula $\phi_{run}^N(p, q)$ states that there exists an intermediate state $r$ such that either $(s, t)$ is equivalent to $(p, r)$ or $(s, t)$ is equivalent to $(r, q)$. In both cases, $\phi_{run}^{N/2}(s, t)$ must hold. This cleverly uses universal quantification to avoid explicitly writing two copies of $\phi_{run}^{N/2}$. ::: **Result**: The QBF formula has polynomial size ($O(k^2)$) and uses quantifier alternation. The reachability problem can be solved in polynomial space. The size is polynomial because each inductive step adds a constant number of symbols, and there are $\log N = k$ steps. ::: tcolorbox - **Propositional Logic**: Exponential size formula ($O(N)$), NP-complete satisfiability. - **QBF**: Polynomial size formula ($O(k^2)$), PSPACE-complete satisfiability. QBF provides a more succinct representation, but the problem becomes PSPACE-complete instead of NP-complete. However, since the original reachability problem is PSPACE-complete, using QBF is more efficient in this case. ::: # Introduction to First-Order Logic First-order logic (FO) extends propositional logic by introducing several key components: - **Variables**: These represent objects in a domain of discourse (e.g., numbers, strings, people). Unlike propositional variables, they do not stand for truth values but for elements of a set. - **Quantifiers**: $\forall$ (for all) and $\exists$ (there exists) are used to make statements about all or some objects in the domain. They quantify over objects, not truth values as in QBF. - **Relational Symbols**: These represent relations between objects. For example, $R(x, y)$ might represent \"$x$ is related to $y$\", and $S(x, y, z)$ might represent a three-way relationship between $x$, $y$, and $z$. - **Function Symbols**: These represent functions that map objects to objects. For instance, $f(x)$ could represent the successor of $x$, and $g(x,y)$ could represent the sum of $x$ and $y$. - **Constants**: These represent specific, individual objects in the domain. Examples include 0, 1 (in the domain of numbers), or \"Socrates\" (in the domain of people). ::: tcolorbox **Signature**: The set of relational, function, and constant symbols used in a first-order formula is called its signature. It defines the vocabulary for describing a particular domain. ::: **Example**: A simple first-order formula: $$\forall x \exists y (R(x, y) \land S(y, x))$$ This formula states that for every object $x$, there exists an object $y$ such that $R(x, y)$ and $S(y, x)$ are both true. The interpretation of $R$ and $S$ depends on the specific domain and the meaning assigned to these relational symbols. ::: tcolorbox **Note**: First-order logic generalizes propositional logic and, in a sense, QBF. In FO, quantifiers range over objects in a domain, whereas in QBF, quantifiers range over Boolean values. This difference in the nature of quantification makes FO more expressive in many contexts. We will explore the connection between FO and QBF later in the course. ::: # Conclusion In this lecture, we delved deeper into Quantified Boolean Formulas (QBF), exploring their complexity through the lens of quantifier alternation and the polynomial hierarchy. We demonstrated how QBF can provide a more succinct representation for solving reachability problems compared to propositional logic, albeit at the cost of increased computational complexity (PSPACE-completeness). ::: tcolorbox The key takeaway is the trade-off between succinctness and complexity when using QBF. While QBF can express certain problems (like reachability) more compactly, the computational cost of solving QBF instances is generally higher than that of propositional logic. ::: We also introduced first-order logic (FO) as a powerful generalization of propositional logic, incorporating variables, quantifiers over objects, relational symbols, function symbols, and constants. This sets the stage for future discussions on more expressive logical formalisms. **Follow-up Questions**: - How does the choice of signature affect the expressiveness of first-order logic? - Can we find other examples where QBF provides a more succinct representation than propositional logic? - What are the practical applications of first-order logic in computer science? (e.g., databases, program verification, artificial intelligence) **Next Lecture Preview**: In the next lecture, we will delve deeper into the syntax and semantics of first-order logic, exploring concepts such as interpretations, models, and satisfiability. We will also discuss the relationship between first-order logic and other formalisms, such as relational algebra and database query languages. We will examine how first-order logic serves as a foundation for many areas of computer science.