Logical Consequence and Resolution in Logic Programming
Introduction
This lecture introduces the concept of logical consequence, a cornerstone of logic programming. Our objective is to understand how to formally define and compute logical consequences within a given logical theory. We will cover the following key topics:
The Herbrand Theorem and its implications for logical consequence.
The concept of the Minimum Herbrand Model (MP) and its significance.
Two primary methods for computing MP:
Top-down approaches, based on resolution.
Bottom-up approaches, based on iterative rule application.
SLD-resolution, a specialized form of resolution tailored for definite clauses.
The crucial concept of unification and the most general unifier (MGU).
These concepts are essential for understanding the theoretical underpinnings of logic programming and its practical implementations.
Logical Consequence: What logically follows from a set of premises.
Herbrand Model: A specific interpretation used in logic programming.
Minimum Herbrand Model (MP): The smallest Herbrand model of a program.
SLD Resolution: A proof procedure for definite clauses.
Unification: The process of making two expressions identical by substitution.
Most General Unifier (MGU): The simplest substitution that makes two expressions identical.
We will explore these topics in detail, providing both theoretical foundations and practical insights.
Logical Consequence
The notion of logical consequence is central to understanding what we expect from a logic program or a logical theory. Intuitively, a logical consequence is a statement that is necessarily true if the premises (the program or theory) are true. We focus on ground atoms for simplicity, but this concept can be extended to more complex formulas.
Definition 1 (Logical Consequence). A ground atom \(A\) is a logical consequence of a program (or theory) \(P\) if \(A\) is true in every interpretation that is a model of \(P\).
In other words, if \(P\) is true, then \(A\) must also be true.
Formally, let \(Q\) be a predicate symbol and \(T_1, \dots, T_n\) be ground terms. If, in every model of \(P\), the tuple denoted by \((T_1, \dots, T_n)\) belongs to the relation interpreting \(Q\), then the ground atom \(Q(T_1, \dots, T_n)\) is a logical consequence of \(P\).
The Herbrand Theorem provides a crucial simplification for determining logical consequences. It states that instead of considering all possible interpretations, it is sufficient to verify the truth of \(A\) in all Herbrand models of \(P\). This significantly reduces the complexity of the problem.
Herbrand Theorem: To determine whether a ground atom is a logical consequence of a program, it suffices to verify its truth in all Herbrand models of the program.
This theorem allows us to work within a more constrained and manageable space of interpretations.
Herbrand Models and Minimum Herbrand Model
To understand the significance of the Herbrand Theorem, we need to define Herbrand models. These models provide a specific interpretation of our logical programs, simplifying the process of determining logical consequences.
Definition 2 (Herbrand Model). A Herbrand model is an interpretation where:
The universe (the domain of interpretation) is the set of all ground terms that can be formed from the constants and function symbols appearing in the program.
The mapping of terms to elements of the universe is fixed; each ground term is interpreted as itself.
We focus solely on the interpretation of predicates, determining which tuples of ground terms satisfy each predicate.
In essence, a Herbrand model provides a concrete way to interpret the symbols in our program, allowing us to evaluate the truth of atoms.
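The first ingredient of a Herbrand interpretation, the universe of ground terms, can be enumerated mechanically. The following is a minimal Python sketch (the string encoding of terms is an illustrative assumption, not the lecture's notation); levels are bounded by a depth parameter, since the universe is infinite as soon as a function symbol is present:

```python
from itertools import product

def herbrand_universe(constants, functions, depth):
    """Ground terms buildable from `constants` and `functions`
    (a dict mapping function symbol -> arity), up to nesting `depth`."""
    terms = set(constants)
    for _ in range(depth):
        new_terms = set(terms)
        for f, arity in functions.items():
            for args in product(terms, repeat=arity):
                new_terms.add(f"{f}({', '.join(args)})")
        terms = new_terms
    return terms

# With one constant and one unary function symbol the universe is
# infinite; we enumerate only the first nesting levels.
print(sorted(herbrand_universe({"0"}, {"s": 1}, depth=2)))
# ['0', 's(0)', 's(s(0))']
```

With no function symbols the universe is just the finite set of constants, which is the case in which bottom-up computation (discussed later) is guaranteed to terminate.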
Intersection Property: For programs consisting of definite clauses, the intersection of two Herbrand models of \(P\) is also a Herbrand model of \(P\). The same holds for the intersection of any family of Herbrand models, finite or infinite.
This property is crucial because it guarantees the existence of a unique smallest Herbrand model, called the Minimum Herbrand Model (MP).
Definition 3 (Minimum Herbrand Model (MP)). The Minimum Herbrand Model (MP) of a program \(P\) is the intersection of all Herbrand models of \(P\).
The MP is a fundamental concept in logic programming. It represents the set of all ground atoms that are true in every Herbrand model of the program.
Theorem: A ground atom \(A\) is a logical consequence of a program \(P\) if and only if \(A\) belongs to the Minimum Herbrand Model (MP) of \(P\).
This theorem provides a practical way to determine logical consequences: we only need to check if an atom is in the MP. The challenge then becomes how to compute this MP, which we will address in the following sections.
Computing the Minimum Herbrand Model
Computing the Minimum Herbrand Model (MP) is a key task in logic programming. There are two primary approaches to this problem, each with its strengths and limitations:
Top-Down Approach:
Method: This approach is based on resolution, a proof procedure that starts with a goal (the atom we want to prove) and tries to derive it from the program’s rules and facts.
Suitability: Top-down methods, such as SLD-resolution (which we will discuss later), are generally suitable for both finite and infinite Herbrand models.
Limitation: A major limitation is that top-down methods may not always terminate, especially when dealing with programs that can lead to infinite derivations.
Bottom-Up Approach:
Method: This approach starts with the facts in the program and iteratively applies the program’s rules to derive new facts until no new facts can be derived. This process is also known as forward chaining.
Suitability: Bottom-up methods are well-suited for programs with finite Herbrand models, where the process will eventually converge to the MP.
Limitation: For programs with infinite Herbrand models, bottom-up methods are impractical because the derivation process may never terminate.
Top-Down vs. Bottom-Up:
Top-Down: Goal-driven, may not terminate, suitable for finite and infinite models.
Bottom-Up: Data-driven, terminates for finite models, impractical for infinite models.
The choice between top-down and bottom-up methods depends on the specific characteristics of the program and the goals of the analysis. In the following sections, we will focus on the top-down approach, specifically SLD-resolution, which is widely used in logic programming.
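The bottom-up (forward-chaining) iteration described above can be sketched for the ground case. The following minimal Python example (the string encoding of atoms is an illustrative assumption) computes the least fixpoint of the immediate-consequence operator for a small definite program, mirroring the example used later in this lecture:

```python
def minimal_model(rules):
    """Least fixpoint of the immediate-consequence operator T_P for a
    ground definite program; `rules` is a list of (head, body) pairs,
    where body is a tuple of atoms (facts have an empty body)."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in model and all(b in model for b in body):
                model.add(head)   # the rule fires: derive a new fact
                changed = True
    return model

program = [
    ("r(a)", ()), ("r(b)", ()), ("p(a)", ()),
    ("q(a)", ("r(a)", "p(a)")),  # ground instance of q(X) <- r(X), p(X)
    ("q(b)", ("r(b)", "p(b)")),
]
print(sorted(minimal_model(program)))
# ['p(a)', 'q(a)', 'r(a)', 'r(b)']
```

Note that \(q(b)\) is not derived, because \(p(b)\) never becomes true; the loop terminates exactly because the Herbrand base here is finite.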
SLD Resolution
SLD-resolution is a specialized form of resolution, a proof procedure used in automated theorem proving, that is particularly well-suited for logic programs consisting of definite clauses. It is a top-down method for computing logical consequences. The acronym SLD stands for:
S: Selection rule - a rule for choosing which atom to resolve on next.
L: Linear - the resolution process is linear, meaning each step involves the previous goal and a program clause.
D: Definite clauses - the method is designed for programs consisting of definite clauses (rules with a single positive literal in the head).
SLD Resolution Rule
The core of SLD-resolution is a rule that specifies how to derive new goals from existing ones. A goal is the negation of a conjunction of atoms, written \(\neg(A_1, A_2, \dots, A_n)\) (equivalently, a clause containing only negative literals). Given a goal and a program \(P\), the SLD-resolution rule proceeds as follows:
Selection: Select an atom \(A_i\) from the current goal using a selection rule. Common selection rules include selecting the leftmost atom.
Rule Selection and Renaming: Find a rule \(H \leftarrow B_1, \dots, B_m\) in the program \(P\) such that the head \(H\) can be unified with the selected atom \(A_i\). Before unification, rename all variables in the rule to avoid conflicts with variables in the goal.
Unification: Compute the most general unifier (MGU) \(\theta\) for \(A_i\) and \(H\). If no unifier exists, this branch of the derivation fails.
Goal Reduction: Replace the selected atom \(A_i\) in the goal with the body of the rule \(B_1, \dots, B_m\), and apply the unifier \(\theta\) to the entire new goal. The new goal is now \(\neg((A_1, \dots, A_{i-1}, B_1, \dots, B_m, A_{i+1}, \dots, A_n)\theta)\).
Iteration: Repeat steps 1-4 with the new goal until one of the following occurs:
Success: The goal becomes empty (denoted by \(\Box\)), indicating that a proof has been found.
Failure: No rule can be applied to the selected atom, indicating that this branch of the derivation has failed.
Backtracking: If a derivation branch fails, the SLD-resolution process backtracks to the most recent choice point where an alternative rule could have been selected. This non-deterministic search continues until a proof is found or all possibilities have been exhausted.
The SLD-resolution rule provides a systematic way to explore the space of possible proofs for a given goal. The efficiency and success of this process depend on the selection rule and the order of rules in the program.
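For ground programs, unification degenerates to equality of atoms, so the derivation-with-backtracking loop can be sketched very compactly. The following Python snippet (atoms encoded as strings, a hypothetical encoding chosen for illustration) implements leftmost selection, goal reduction, and backtracking over rule choices:

```python
def sld_refute(goal, rules, depth=20):
    """Leftmost-selection SLD derivation for a ground definite
    program, with backtracking over rule choices; `rules` is a list
    of (head, body) pairs. Returns True if the empty goal (a
    refutation) is reached within `depth` resolution steps."""
    if not goal:
        return True              # empty goal: success
    if depth == 0:
        return False             # cut off runaway derivations
    selected, rest = goal[0], goal[1:]   # leftmost selection rule
    for head, body in rules:             # try rules in program order
        if head == selected:
            # goal reduction: replace the atom with the rule body
            if sld_refute(list(body) + rest, rules, depth - 1):
                return True
    return False                 # all alternatives exhausted: failure

rules = [("r(a)", ()), ("p(a)", ()),
         ("q(a)", ("r(a)", "p(a)"))]
print(sld_refute(["q(a)"], rules))  # True: a refutation exists
print(sld_refute(["q(b)"], rules))  # False: no rule head matches
```

The recursion stack plays the role of the choice points: returning from a failed call and trying the next matching rule is exactly the backtracking step described above.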
Algorithm (SLD derivation).
Input: goal \(G = \neg(A_1, A_2, \dots, A_n)\), program \(P\).
1. \(G_0 \gets G\); \(k \gets 0\).
2. If \(G_k\) is the empty goal \(\Box\), return Success.
3. Select an atom \(A_i\) from \(G_k\) using the selection rule.
4. Find a rule \(H \leftarrow B_1, \dots, B_m\) in \(P\) such that \(A_i\) and \(H\) unify with MGU \(\theta\), after renaming the variables in the rule. If no such rule remains, this branch ends in failure: backtrack to the most recent choice point, or return Failure if none is left.
5. \(G_{k+1} \gets \neg((A_1, \dots, A_{i-1}, B_1, \dots, B_m, A_{i+1}, \dots, A_n)\theta)\); \(k \gets k+1\); go to step 2.
Complexity: The complexity of SLD resolution depends on the size of the search space, which can grow exponentially with the number of rules and atoms. However, efficient implementations can mitigate this complexity.
Unification and Most General Unifier
Unification is a fundamental operation in logic programming and theorem proving. It is the process of finding a substitution that makes two terms (or atoms) syntactically identical. The concept of a most general unifier (MGU) is crucial because it provides the simplest possible substitution that achieves unification.
Definition 4 (Substitution). A substitution \(\theta\) is a mapping from a set of variables to terms. Applying a substitution \(\theta\) to a term \(R\), denoted as \(R\theta\), means replacing every variable \(X\) in \(R\) with the term \(\theta(X)\). This replacement is done simultaneously for all variables.
Definition 5 (Unifier). A unifier \(\theta\) of two terms (or atoms) \(S\) and \(T\) is a substitution such that \(S\theta = T\theta\). In other words, applying the substitution \(\theta\) to both \(S\) and \(T\) results in identical terms.
Definition 6 (Most General Unifier (MGU)). A unifier \(\sigma\) is more general than a unifier \(\theta\) if there exists a substitution \(\gamma\) such that for every term \(T\), \((T\sigma)\gamma = T\theta\). The most general unifier (MGU) is a unifier that is more general than all other unifiers. It is the simplest substitution that makes two terms identical.
Theorem 1. If two terms (or atoms) admit a unifier, then they admit a most general unifier (MGU), which is unique up to variable renaming.
This theorem guarantees that if two terms can be unified, there exists a unique (up to variable renaming) simplest substitution that achieves this.
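Definitions 4 through 6 can be made concrete with a small sketch. In the Python code below (terms encoded as nested tuples, an assumption made purely for illustration), substitutions are applied simultaneously, and a non-most-general unifier is shown to be an instance of the MGU, obtained by composing it with a further substitution:

```python
def apply(term, theta):
    """Apply substitution `theta` (dict: variable -> term) to a term.
    Variables are uppercase strings; compound terms are tuples
    (functor, arg1, ..., argn); other strings are constants.
    Replacement is simultaneous: theta is not re-applied to the
    substituted terms."""
    if isinstance(term, tuple):
        return (term[0],) + tuple(apply(a, theta) for a in term[1:])
    if isinstance(term, str) and term[0].isupper():
        return theta.get(term, term)
    return term  # constant

s = ("f", "X", "b")
t = ("f", ("g", "Y"), "b")
mgu = {"X": ("g", "Y")}                 # a most general unifier
other = {"X": ("g", "a"), "Y": "a"}     # a less general unifier
print(apply(s, mgu) == apply(t, mgu))       # True
print(apply(s, other) == apply(t, other))   # True
# `other` is an instance of `mgu`: composing mgu with {Y/a} yields it.
print(apply(apply(s, mgu), {"Y": "a"}) == apply(s, other))  # True
```

The last line is Definition 6 in action: the substitution \(\gamma = \{Y/a\}\) witnesses that `mgu` is more general than `other`.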
Unification Algorithm
The following algorithm describes how to compute the MGU of two terms. It operates on a set of equations \(E\), which are pairs of terms that we want to unify.
Remove Identity Equations: If an equation is of the form \(X = X\), where \(X\) is a variable, remove it from \(E\).
Orient Equations: If an equation is of the form \(T = X\), where \(X\) is a variable and \(T\) is not, rewrite it as \(X = T\).
Occur Check: If an equation is of the form \(X = T\), where \(X\) is a variable and \(X\) occurs in \(T\) (and \(T\) is not \(X\)), then fail. This is known as the occur check and prevents infinite terms.
Apply Substitution: If an equation is of the form \(X = T\), where \(X\) is a variable and \(X\) does not occur in \(T\), apply the substitution \(\{X/T\}\) to the rest of the equations in \(E\). This means replacing all occurrences of \(X\) with \(T\) in the remaining equations.
Function Symbol Mismatch: If an equation is of the form \(F(S_1, \dots, S_n) = G(T_1, \dots, T_m)\) and \(F \neq G\), where \(F\) and \(G\) are function symbols, then fail.
Decompose Function Terms: If an equation is of the form \(F(S_1, \dots, S_n) = F(T_1, \dots, T_n)\), replace it with the set of equations \(S_1 = T_1, \dots, S_n = T_n\).
Algorithm (unification).
Input: a set of equations \(E\).
While \(E\) is not empty, select an equation from \(E\) and apply the first matching case:
\(X = X\): remove the equation from \(E\).
\(T = X\), where \(T\) is not a variable: replace the equation with \(X = T\).
\(X = T\), where \(X\) occurs in \(T\) and \(T \neq X\): return Failure (occur check).
\(X = T\), where \(X\) does not occur in \(T\): apply the substitution \(\{X/T\}\) to the remaining equations in \(E\), record the binding, and remove the equation.
\(F(S_1, \dots, S_n) = G(T_1, \dots, T_m)\) with \(F \neq G\): return Failure (function symbol mismatch).
\(F(S_1, \dots, S_n) = F(T_1, \dots, T_n)\): replace the equation with \(S_1 = T_1, \dots, S_n = T_n\).
When \(E\) is exhausted, return the recorded set of bindings as the MGU.
Termination and Complexity: The unification algorithm always terminates. A naive implementation that copies terms during substitution can lead to exponential time complexity. However, using structure sharing, where terms are represented as pointers to shared data structures, can reduce the complexity to quadratic time. Further optimizations, such as using techniques similar to the union-find algorithm, can achieve linear time complexity.
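The six cases above translate almost directly into code. The following Python sketch (variables as uppercase strings and compound terms as tuples, an illustrative encoding) solves a set of equations in the style of the algorithm; as an implementation choice, bindings are kept in triangular form and resolved lazily by a `walk` helper, rather than eagerly substituting into the remaining equations:

```python
def walk(term, theta):
    """Follow variable bindings in `theta` until a non-bound term."""
    while isinstance(term, str) and term in theta:
        term = theta[term]
    return term

def occurs(var, term, theta):
    """Occur check: does `var` appear in `term` under `theta`?"""
    term = walk(term, theta)
    if term == var:
        return True
    return isinstance(term, tuple) and any(
        occurs(var, a, theta) for a in term[1:])

def unify(s, t, theta=None):
    """Return an MGU of `s` and `t` (as a binding dict), or None.
    Variables are uppercase strings; compound terms are tuples
    (functor, arg1, ..., argn); other strings are constants."""
    theta = dict(theta or {})
    stack = [(s, t)]
    while stack:
        a, b = stack.pop()
        a, b = walk(a, theta), walk(b, theta)
        if a == b:
            continue                         # remove trivial equation
        if isinstance(a, str) and a[0].isupper():
            if occurs(a, b, theta):
                return None                  # occur check fails
            theta[a] = b                     # bind variable
        elif isinstance(b, str) and b[0].isupper():
            stack.append((b, a))             # orient: variable left
        elif (isinstance(a, tuple) and isinstance(b, tuple)
              and a[0] == b[0] and len(a) == len(b)):
            stack.extend(zip(a[1:], b[1:]))  # decompose arguments
        else:
            return None                      # function symbol mismatch
    return theta

print(unify(("f", "X", ("g", "Y")), ("f", "a", ("g", "X"))))
# a most general unifier, in triangular form
print(unify("X", ("f", "X")))  # None: rejected by the occur check
```

This lazy (structure-sharing) representation is what avoids the exponential blow-up of the naive copying implementation mentioned above.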
Example: Top-Down Derivation
To illustrate the process of SLD-resolution, let’s consider a simple logic program and a goal. This example will demonstrate how the top-down derivation works in practice.
Program:
\(R(A)\).
\(R(B)\).
\(P(A)\).
\(Q(X) \leftarrow R(X), P(X)\).
This program consists of three facts and one rule. We want to determine if the goal \(Q(A)\) is a logical consequence of this program.
Goal: \(Q(A)\)
The top-down derivation using SLD-resolution proceeds as follows:
Initial Goal: Start with the goal \(\neg Q(A)\).
Rule Application: After renaming the variable in the rule to \(X_1\), giving \(Q(X_1) \leftarrow R(X_1), P(X_1)\), the goal atom \(Q(A)\) unifies with the head \(Q(X_1)\) with MGU \(\theta_1 = \{X_1/A\}\). Applying the substitution to the body, we get the new goal \(\neg (R(A), P(A))\).
First Atom Resolution: Select the first atom \(R(A)\) from the goal. This atom unifies with the fact \(R(A)\) with the empty substitution \(\theta_2 = \{\}\). The new goal becomes \(\neg P(A)\).
Second Atom Resolution: Select the atom \(P(A)\) from the goal. This atom unifies with the fact \(P(A)\) with the empty substitution \(\theta_3 = \{\}\). The new goal becomes empty, denoted by \(\Box\).
Derivation Steps:
Goal: \(\neg Q(A)\)
Apply the rule \(Q(X) \leftarrow R(X), P(X)\) with \(\theta_1 = \{X_1/A\}\): \(\neg (R(A), P(A))\)
Apply the fact \(R(A)\) with \(\theta_2 = \{\}\): \(\neg P(A)\)
Apply the fact \(P(A)\) with \(\theta_3 = \{\}\): \(\Box\) (empty goal)
Since we have reached an empty goal, the derivation is successful, and we conclude that \(Q(A)\) is a logical consequence of the program. This example demonstrates the basic steps of SLD-resolution: selecting an atom, finding a matching rule or fact, unifying, and reducing the goal until it becomes empty.
Figure 1 provides a visual representation of the derivation process. Each node represents a goal, and the arrows indicate the application of a rule or fact, along with the corresponding substitution.
Prolog and SLD Resolution
Prolog is a logic programming language that implements SLD-resolution as its core inference mechanism. However, Prolog makes specific choices regarding the selection rule and the order in which rules are tried, which can affect the behavior of the program.
Selection Rule: Prolog uses a leftmost atom selection rule. This means that when a goal has multiple atoms, Prolog always selects the leftmost atom for resolution.
Rule Order: Prolog tries rules in the order they appear in the program. If multiple rules have heads that unify with the selected atom, Prolog tries them sequentially.
Prolog’s Strategy:
Leftmost Atom Selection: Always selects the leftmost atom in the goal.
Rule Order: Tries rules in the order they appear in the program.
Depth-First Search: Explores one branch of the search tree completely before backtracking.
These choices can lead to situations where Prolog enters an infinite loop, even if a solution exists. This happens when the search strategy explores an infinite branch of the search space without finding a proof, while other branches would lead to a successful derivation.
Figure 2 illustrates a scenario where Prolog’s depth-first search strategy might lead to non-termination. If the leftmost branch of the search tree is infinite, Prolog will never reach the solution in the right branch.
Completeness Theorem:
Theorem 2. Let \(P\) be a definite program. Then for any ground atom \(A\), \(A \in MP\) if and only if there is an SLD-refutation of \(P \cup \{\neg A\}\).
This theorem states that if a ground atom \(A\) is a logical consequence of a definite program \(P\) (i.e., \(A\) is in the Minimum Herbrand Model of \(P\)), then there exists an SLD-refutation (a successful derivation) for the goal \(\neg A\). This is a fundamental result that guarantees the completeness of SLD resolution.
However, the theorem does not specify how to find the refutation. Prolog’s specific choices can lead to non-termination, but the underlying theory guarantees that a solution can be found with a suitable search strategy. For instance, a breadth-first search strategy, which explores all possible derivations at each level before going deeper, would guarantee that a solution is found if one exists. However, breadth-first search is generally less efficient than depth-first search in terms of memory usage.
In summary, while Prolog’s specific implementation choices can lead to non-termination in some cases, the underlying theory of SLD-resolution guarantees that if a solution exists, it can be found with a suitable search strategy. This highlights the importance of understanding both the theoretical foundations and the practical limitations of logic programming systems.
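The contrast between Prolog's depth-first strategy and a fair breadth-first strategy can be observed on a tiny ground program whose first clause is left-recursive. In the sketch below (a hypothetical string encoding; real Prolog's unbounded depth-first search would simply never terminate here), a depth bound is imposed so that the infinite branch eventually fails and the script terminates, while breadth-first search finds the refutation without any bound:

```python
from collections import deque

def bfs_refute(goal, rules, max_nodes=1000):
    """Breadth-first exploration of the SLD tree for a ground
    program: a fair strategy, so it reaches a refutation whenever
    one exists (here capped at max_nodes for safety)."""
    queue = deque([tuple(goal)])
    seen = 0
    while queue and seen < max_nodes:
        g = queue.popleft()
        seen += 1
        if not g:
            return True                  # empty goal reached
        selected, rest = g[0], g[1:]
        for head, body in rules:
            if head == selected:
                queue.append(tuple(body) + rest)
    return False

def dfs_refute(goal, rules, depth):
    """Depth-first search with Prolog's rule order; the depth bound
    stands in for the infinite left branch it would otherwise
    descend forever."""
    if not goal:
        return True
    if depth == 0:
        return False
    selected, rest = goal[0], goal[1:]
    for head, body in rules:
        if head == selected:
            if dfs_refute(list(body) + rest, rules, depth - 1):
                return True
    return False

rules = [("p", ("p",)),   # p <- p  (tried first: an infinite branch)
         ("p", ())]       # p       (the solution, in the right branch)
print(bfs_refute(["p"], rules))      # True
print(dfs_refute(["p"], rules, 50))  # True only thanks to the bound
```

Swapping the two clauses makes plain depth-first search succeed immediately, which is why clause order matters so much in practice when programming in Prolog.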
Conclusion
This lecture has provided a comprehensive overview of fundamental concepts in logic programming, focusing on logical consequence, Herbrand models, and methods for computing the Minimum Herbrand Model (MP). We have explored the following key topics:
Logical Consequence: The notion of a statement being necessarily true given a set of premises.
Herbrand Models: A specific interpretation of logical programs that simplifies the analysis of logical consequences.
Minimum Herbrand Model (MP): The smallest Herbrand model that satisfies a program, representing the set of all logical consequences.
SLD Resolution: A specialized form of resolution tailored for definite clauses, used as a top-down method for computing logical consequences.
Unification and Most General Unifier (MGU): The process of making two terms identical by substitution, and the simplest substitution that achieves this.
We have also discussed how Prolog implements SLD-resolution, highlighting its specific choices regarding selection rules and rule order, and the potential for non-termination. Despite these practical limitations, the theoretical foundations of SLD-resolution guarantee its completeness: if a solution exists, it can be found with a suitable search strategy.
Key Points:
Logical consequence is a core concept in logic programming.
Herbrand models simplify the analysis of logical consequences.
The Minimum Herbrand Model (MP) represents all logical consequences.
SLD-resolution is a practical method for computing logical consequences.
Unification and MGU are fundamental operations in SLD-resolution.
Prolog’s implementation of SLD-resolution has practical limitations, but the underlying theory guarantees completeness.
Further Considerations:
Selection Rule Impact: How does the choice of selection rule affect the efficiency and termination of SLD resolution? Can different selection rules lead to more efficient derivations or guarantee termination in more cases?
Search Strategy Modification: Can we modify Prolog’s search strategy (e.g., using breadth-first search) to guarantee termination in more cases, and what are the trade-offs in terms of efficiency?
Extending the Concepts: How can we extend these concepts to handle negation, disjunction, and other more complex logical constructs?
Relationship to Other Techniques: What is the relationship between SLD resolution and other theorem-proving techniques, such as resolution in first-order logic?
Applications: How can these concepts be applied in areas such as deductive databases, planning, and knowledge representation?
These questions provide a starting point for further exploration and research in the field of logic programming.
In the next lecture, we will begin our exploration of stable model semantics and answer set programming, which provide alternative approaches to logic programming that can handle negation and other complex features more naturally.