Lecture 7: Adversarial Search, Knowledge-Based Agents, and Introduction to Logic
Introduction
Course Announcements: Examination Dates and Format
This lecture will address remaining topics from the previous session and provide essential course information. The examination dates for the summer session, encompassing June, July, and September, have been set as July 5th, July 22nd, and September 26th. These dates are positioned later in the session due to scheduling constraints.
The intended examination format consists of a written exam in the morning at 9:30 AM, followed by an oral exam in the afternoon. The feasibility of conducting both on the same day is contingent upon the number of students registered for each exam session. Higher student turnout, especially in the initial exam sessions, may necessitate scheduling the oral examinations on subsequent days. Contingency plans for oral examinations will be communicated during the written exam session if needed. These dates are provided to facilitate your study planning.
Review of Lecture 6: Adversarial Search and Game Playing
Continuing from the previous lecture, we will revisit the concept of search within adversarial environments, focusing on game playing as a primary example. While games are the central paradigm, the principles extend to other competitive scenarios, such as autonomous driving where vehicles might compete to reach a destination, or in scenarios involving strategic resource allocation. Historically, the study of adversarial search originated in the context of game theory, with significant early advancements like Samuel’s checkers program developed in the 1950s.
Adversarial search differs fundamentally from standard search algorithms due to the presence of an opponent actively working against the agent’s objectives. In this lecture, we will concentrate on two key algorithms for adversarial search: the Minimax algorithm and alpha-beta pruning. We will also examine how Samuel’s checkers program implemented these concepts, offering a historical perspective on their practical application and early success. These methods are crucial for understanding how to develop rational agents in competitive environments.
Adversarial Search in Games
Zero-Sum Games
In the context of game theory, we focus on zero-sum games. While games in general can involve complex interactions including cooperation and alliance formation, we simplify our analysis by considering zero-sum scenarios. These games are characterized by a strict competition between two players, where one player’s gain is directly equivalent to the other player’s loss. This means the total sum of utilities for all players in any outcome of the game is zero. Common examples include chess, checkers, and tic-tac-toe. In these games, outcomes are typically win, lose, or draw.
In zero-sum games, every game state, particularly terminal states, can be assigned a utility value that represents the desirability of that state for each player. Consider chess: a checkmate position for player Max is assigned maximum positive utility for Max and, correspondingly, maximum negative utility (or minimum utility) for player Min. Conversely, a position where Max is checkmated has minimum utility for Max and maximum utility for Min. This inverse relationship of utilities is a defining feature of zero-sum games, simplifying the analysis as we only need to explicitly consider maximizing the utility for one player, as minimizing the opponent’s utility is implicitly achieved.
Minimax Algorithm
The foundational concept of the Minimax algorithm is rooted in the opposing objectives of the players in a zero-sum game. Player Max aims to select actions that lead to game states with the highest possible utility for themselves, while player Min seeks to choose actions that result in game states with the lowest possible utility for Max (thereby maximizing their own utility). This creates a competitive dynamic where each player attempts to anticipate and counteract the moves of the other.
Optimal Strategy and Minimax Value
The Minimax algorithm is designed to determine the optimal strategy for player Max, under the critical assumption that player Min also plays optimally. An optimal strategy in this context is one that maximizes Max’s utility in the worst-case scenario, considering all possible optimal counter-strategies by Min. The Minimax value of a game state is the expected utility for Max when both players follow optimal strategies from that state until the end of the game. It represents the best guaranteed outcome for Max, assuming perfect play from both sides. This value is computed recursively, starting from the terminal states and working backwards up the game tree.
Maximizing and Minimizing Players in Game Tree
The Minimax algorithm operates by traversing the game tree, which represents all possible game states and transitions. It assigns roles to each level of the tree, alternating between maximizing and minimizing levels.
Maximizing Player (Max Nodes): At levels where it is Max’s turn to move, the algorithm selects the action that leads to the successor state with the maximum Minimax value. Max aims to maximize their potential gain. These nodes are often visualized as upward-pointing triangles, symbolizing maximization.
Minimizing Player (Min Nodes): Conversely, at levels where it is Min’s turn, the algorithm chooses the action that leads to the successor state with the minimum Minimax value. Min seeks to minimize Max’s gain, which is equivalent to maximizing Min’s own outcome in a zero-sum game. These nodes are often visualized as downward-pointing triangles, symbolizing minimization.
This alternating maximization and minimization process propagates utility values from the terminal states up to the root of the game tree, allowing the initial state’s Minimax value and the optimal first move for Max to be determined.
Illustrative Examples: Pac-Man and Tic-Tac-Toe
To clarify the Minimax concept, let’s consider simplified examples using Pac-Man and Tic-Tac-Toe.
Pac-Man Example: Single Player vs. Two Players
Initially, consider Pac-Man in a simplified environment where the sole objective is to eat a pellet. Pac-Man can move towards or away from the pellet. In this single-player scenario, the optimal strategy is straightforward: always move towards the pellet.
Example 1 (Pac-Man Single Player Scenario).
Move Selection: Moving right, which reduces the distance to the pellet, is evaluated as superior to moving left, which increases the distance.
Terminal State: Reaching the pellet is a terminal state, representing the end of this simplified game with a high utility value for Pac-Man.
Utility Definition: We can quantify utility by assigning points for achieving the goal (e.g., +10 for eating a pellet) and costs for actions (e.g., -1 per move).
Utility Propagation: Terminal states are directly assigned utility values. For non-terminal states, the utility is derived based on optimal play. In a single-player game, this means choosing the action that leads to the state with the maximum possible utility.
Optimal Value: For any non-terminal state, the value is the maximum utility achievable from its possible successor states, assuming Pac-Man always chooses the action that maximizes their score.
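As a purely hypothetical numeric illustration (the distances below are invented for this example, not taken from a lecture figure): suppose the pellet lies two squares to Pac-Man’s right. With +10 for eating the pellet and -1 per move,
\[
\text{Value}(\text{move right}) = 10 - 2 = 8, \qquad \text{Value}(\text{move left, then turn back}) = 10 - 4 = 6,
\]
so in this single-player setting the maximizing choice is to move right.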
Now, let’s introduce an adversary: a ghost whose objective is to catch Pac-Man. This transforms the scenario into a two-player, zero-sum game.
Example 2 (Pac-Man vs. Ghost - Two Player Adversarial Game).
Adversarial Objective: With the ghost present, the simple strategy of always moving towards the pellet is no longer optimal. Approaching the pellet might increase the risk of encountering the ghost.
Turn-Based Movement: Assume a turn-based system where Pac-Man moves first, followed by the ghost.
Ghost’s Goal: The ghost’s objective is to minimize Pac-Man’s utility, for instance, by moving closer to Pac-Man and capturing it. Conversely, from the ghost’s perspective, it aims to maximize its own utility by capturing Pac-Man.
Expanded Terminal States: Terminal states now include not only Pac-Man eating the pellet but also Pac-Man being caught by the ghost. Each terminal state has a defined utility reflecting the outcome for Pac-Man (and implicitly for the ghost in a zero-sum game).
Ghost’s Optimal Moves: The ghost will choose moves that minimize Pac-Man’s utility. If faced with a choice, the ghost will select the move that leads to the game state least favorable for Pac-Man.
Pac-Man’s Anticipation: Pac-Man, using the Minimax principle, must anticipate the ghost’s optimal counter-moves. Pac-Man will evaluate possible moves by considering the subsequent moves the ghost will make to minimize Pac-Man’s utility. Pac-Man’s optimal move will be the one that maximizes its utility, considering the ghost’s best responses.
Tic-Tac-Toe (Triga) Example: A Complete Game Tree in Principle
Tic-Tac-Toe provides a more structured and complete example for illustrating Minimax. Starting with an empty 3x3 board, players Max (typically ‘X’) and Min (‘O’) take turns placing their marks. The game ends when one player gets three in a row, or all squares are filled (a draw).
Example 3 (Tic-Tac-Toe - Minimax Application).
Game States as Board Configurations: Each state in Tic-Tac-Toe is a unique configuration of the board, representing the placement of ’X’s and ’O’s.
Terminal States and Utilities: Terminal states are reached when a player wins, or the game is a draw. We can assign utilities: +1 for a Max win, -1 for a Min win, and 0 for a draw.
Minimax Value Propagation: To determine the optimal move from any non-terminal state, we apply Minimax. Starting from terminal states, we propagate utility values upwards through the game tree.
Min’s Move Selection: When it’s Min’s turn, Min will evaluate all possible moves and choose the one that leads to a state with the minimum utility value for Max (i.e., the best outcome for Min).
Max’s Move Selection: Conversely, when it’s Max’s turn, Max will choose the move that maximizes the utility value, anticipating Min’s subsequent minimizing moves.
Figure 1 provides a numerical illustration of Minimax value propagation. Nodes A, B, C, and D represent game states, with A being the root. Squares at the bottom represent terminal states with associated utility values. The algorithm proceeds bottom-up. For Min nodes B, C, and D, the Minimax value is the minimum of the utilities of their children. For Max node A, the Minimax value is the maximum of the Minimax values of its children (B, C, D). Thus, from state A, the optimal move for Max is A1, leading to state B, which guarantees a Minimax value of 3, assuming optimal play by Min subsequently.
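If the terminal utilities in Figure 1 follow the standard textbook layout (3, 12, 8 below B; 2, 4, 6 below C; 14, 5, 2 below D; of these, only B’s value of 3 and C’s first child of 2 are confirmed by the pruning discussion later in this lecture), the bottom-up propagation reads:
\[
\texttt{Minimax-Value}(A) = \max\bigl(\min(3,12,8),\; \min(2,4,6),\; \min(14,5,2)\bigr) = \max(3,\,2,\,2) = 3.
\]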
Formal Definition and Pseudo-code of Minimax Algorithm
The Minimax algorithm can be formally defined using a recursive function that computes the Minimax value of each state.
Definition 1 (Minimax Value Function). The Minimax value function, Minimax-Value(s), for a state \(s\) is defined as: \[\texttt{Minimax-Value}(s) =\begin{cases} \texttt{Utility}(s) & \text{if } \texttt{Terminal-Test}(s) \\ \max_{a \in \texttt{Actions}(s)} \texttt{Minimax-Value}(\texttt{Result}(s, a)) & \text{if } \texttt{To-Move}(s) = \text{Max} \\ \min_{a \in \texttt{Actions}(s)} \texttt{Minimax-Value}(\texttt{Result}(s, a)) & \text{if } \texttt{To-Move}(s) = \text{Min}\end{cases}\] Where:
\(\texttt{Actions}(s)\) is the set of legal actions available in state \(s\).
\(\texttt{Result}(s, a)\) is the state that results from performing action \(a\) in state \(s\).
\(\texttt{Terminal-Test}(s)\) is a function that returns true if \(s\) is a terminal state, and false otherwise.
\(\texttt{Utility}(s)\) is the utility function that assigns a numerical value to a terminal state \(s\).
\(\texttt{To-Move}(s)\) indicates which player’s turn it is in state \(s\).
Algorithm 1 (Minimax-Search).
return the \(a \in \texttt{Actions}(state)\) that maximizes \(\texttt{Min-Value}(\texttt{Result}(state, a))\)

Algorithm 2 (Max-Value).
if \(\texttt{Terminal-Test}(state)\) then return \(\texttt{Utility}(state)\); otherwise return \(\max_{a \in \texttt{Actions}(state)} \texttt{Min-Value}(\texttt{Result}(state, a))\)

Algorithm 3 (Min-Value).
if \(\texttt{Terminal-Test}(state)\) then return \(\texttt{Utility}(state)\); otherwise return \(\min_{a \in \texttt{Actions}(state)} \texttt{Max-Value}(\texttt{Result}(state, a))\)
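The same recursion can be written compactly in code. The sketch below is a minimal illustration, not the lecture’s reference implementation; the game interface (actions, result, is_terminal, utility, to_move) is an assumed abstraction mirroring the functions in Definition 1.

    # Minimal Minimax sketch in Python (hypothetical `game` interface assumed).
    def minimax_value(state, game):
        """Minimax value of `state`, assuming both players play optimally."""
        if game.is_terminal(state):
            return game.utility(state)
        values = [minimax_value(game.result(state, a), game) for a in game.actions(state)]
        return max(values) if game.to_move(state) == "Max" else min(values)

    def minimax_decision(state, game):
        """Action for Max that maximizes the Minimax value of the resulting state."""
        return max(game.actions(state),
                   key=lambda a: minimax_value(game.result(state, a), game))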
Computational Complexity of Minimax
The primary limitation of the Minimax algorithm is its computational demand. The algorithm performs a depth-first traversal of the game tree. In the worst case, it needs to explore the entire game tree to determine the optimal move. This results in a time complexity of \(O(b^m)\), where:
\(b\) is the branching factor, representing the average number of legal moves available at each game state.
\(m\) is the maximum depth of the game tree, which corresponds to the length of the longest possible game.
For games with large branching factors and deep game trees, such as chess (where \(b \approx 35\) and \(m \approx 100\)), the computational cost becomes prohibitively expensive. For chess, this results in exploring approximately \(35^{100}\) nodes, rendering a full Minimax search computationally infeasible in practice. The space complexity for a depth-first search like Minimax is \(O(bm)\) in the case of storing only the current path, or \(O(b^m)\) if the entire tree is stored. This exponential complexity underscores the need for optimization techniques to make adversarial search practical for complex games.
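To make the magnitude concrete: \(35^{100} = 10^{100 \log_{10} 35} \approx 10^{154}\), vastly more than the roughly \(10^{80}\) atoms in the observable universe, so no amount of hardware makes the full search feasible.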
Alpha-Beta Pruning: Enhancing Minimax Efficiency
Motivation for Optimization: Pruning the Game Tree
To overcome the computational bottleneck of the Minimax algorithm, especially in games with large game trees, optimization techniques are essential. One of the most effective methods is alpha-beta pruning. Alpha-beta pruning is a refinement of Minimax that significantly reduces the number of nodes that need to be evaluated in the game tree, without altering the final Minimax decision. It achieves this by eliminating branches of the search tree that are provably irrelevant to the final outcome.
Core Idea: Alpha and Beta Values for Pruning
Alpha-beta pruning enhances the Minimax algorithm by maintaining two additional values during the tree traversal:
\(\alpha\) (Alpha): Alpha is the best (highest) utility value that Max has found so far at any choice point along the current path of the search. It represents the minimum score that Max is assured of achieving, given the search path so far. Alpha is initialized to \(-\infty\) at the root and tends to increase as Max finds better moves.
\(\beta\) (Beta): Beta is the best (lowest) utility value that Min has found so far at any choice point along the current path for Min. It represents the maximum score that Min is assured of allowing Max to achieve. Beta is initialized to \(+\infty\) at the root and tends to decrease as Min finds moves that limit Max’s score.
Pruning Condition: Pruning occurs when, at any Min node, the current beta value becomes less than or equal to the alpha value of any Max node ancestor. Specifically, if at a Min node, any of its children nodes returns a value less than or equal to the current \(\alpha\), then the remaining children of this Min node need not be explored. This is because Min will always choose a value less than or equal to beta, and Max already has a guaranteed option (alpha) that is at least as good. Symmetrically, at a Max node, if any child returns a value greater than or equal to the current \(\beta\), then the remaining children of this Max node can be pruned.
In essence, alpha-beta pruning identifies and eliminates portions of the search tree that are guaranteed not to influence the final decision, thereby making the search process much more efficient.
Illustrative Example of Alpha-Beta Pruning in Minimax Tree Search
Consider again the Minimax example from Figure 1. Let’s walk through how alpha-beta pruning would optimize the search.
When evaluating node C (a Min node), we examine its children sequentially.
We visit the first child of C, which is a terminal node with a utility value of 2. Since C is a Min node, the beta value for C is initially set to \(\infty\), and after examining the first child, beta is updated to \(\min(\infty, 2) = 2\). So, we know that the value of node C will be at most 2.
Now, consider the context: Node C is a child of node A (a Max node). Assume that node B, another child of A, has already been evaluated and has a Minimax value of 3. This means that for node A, the alpha value is at least 3 (since Max can guarantee a value of 3 by choosing move A1 to reach node B).
When evaluating node C, we’ve established that its beta value is 2 (or less, if further children were to yield even lower values). Since the beta value of C (2) is less than the current alpha value of A (which is at least 3 due to node B), we can prune the remaining children of C (C2 and C3). There is no need to explore them because Max will always choose the path through node B (with value 3) over the path through node C (with a maximum value of 2).
By applying alpha-beta pruning, we avoid evaluating the subtrees rooted at nodes C2 and C3, significantly reducing the search effort without changing the optimal Minimax decision at node A. The optimal move from A remains A1, leading to node B, with a Minimax value of 3.
Example Demonstration and Pruning Cutoffs
A step-by-step walkthrough, annotating each node with its current alpha and beta values and highlighting the pruned branches, is the clearest way to see the mechanism in action. The key behaviour to watch is how the bounds tighten as the search progresses: alpha can only increase (it is updated at Max nodes), beta can only decrease (it is updated at Min nodes), and as soon as the two bounds cross (\(\beta \le \alpha\)) the remaining successors of the current node can be cut off without affecting the result. The amount of pruning depends heavily on move ordering: if the strongest moves are examined first, alpha-beta search examines roughly \(O(b^{m/2})\) nodes instead of \(O(b^m)\), effectively doubling the searchable depth for the same computational budget.
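A compact code sketch of this pruning logic follows; as with the Minimax sketch above, the game interface is an assumed abstraction for illustration rather than the lecture’s notation.

    import math

    def alpha_beta_value(state, game, alpha=-math.inf, beta=math.inf):
        """Minimax value of `state` with alpha-beta pruning.

        alpha: best value Max is already guaranteed on the current path.
        beta:  best value Min is already guaranteed on the current path.
        """
        if game.is_terminal(state):
            return game.utility(state)
        if game.to_move(state) == "Max":
            value = -math.inf
            for a in game.actions(state):
                value = max(value, alpha_beta_value(game.result(state, a), game, alpha, beta))
                alpha = max(alpha, value)
                if alpha >= beta:   # Min above will never let play reach this branch
                    break           # prune the remaining successors
            return value
        else:
            value = math.inf
            for a in game.actions(state):
                value = min(value, alpha_beta_value(game.result(state, a), game, alpha, beta))
                beta = min(beta, value)
                if beta <= alpha:   # Max above already has a better alternative
                    break           # prune the remaining successors
            return value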
Heuristic Evaluation Functions for Complex Games
The Necessity of Evaluation Functions for Practical Games
For many games of practical interest, such as chess, Go, and even moderately complex games like checkers, reaching terminal states within a computationally feasible timeframe is often impossible. The vast size of the game trees for these games makes complete Minimax search, even with alpha-beta pruning, impractical for real-time decision-making. To address this, we employ heuristic evaluation functions.
Heuristic evaluation functions are designed to estimate the utility of a game state without exploring the game to its conclusion. They provide an approximate value for non-terminal states, allowing the algorithm to assess the relative desirability of different positions. These functions are crucial for enabling depth-limited search, where the game tree is explored only to a certain depth, and the evaluation function is applied at the leaf nodes of this limited tree.
Depth-Limited Search: Truncating the Search Tree
Depth-limited search is a strategy used to manage the complexity of game tree search. Instead of searching all the way to terminal states, we impose a cutoff depth. The Minimax tree is expanded only up to this predetermined depth. At the cutoff depth, the search stops, and the heuristic evaluation function is applied to the leaf nodes at this depth. These heuristic values then serve as estimated utilities for these nodes, and the Minimax algorithm proceeds to propagate these values upwards, just as it does with terminal utilities.
Effectively, depth-limited search transforms the problem from finding exact Minimax values based on game completion to finding approximate Minimax values based on heuristic estimations at a certain depth. The choice of depth limit is a trade-off: deeper searches generally yield more accurate results but require more computation time.
H-MINIMAX Algorithm: Integrating Heuristics into Minimax
The H-MINIMAX algorithm is a modification of the standard Minimax algorithm that incorporates heuristic evaluation functions to enable depth-limited search. It is structurally similar to Minimax but replaces the utility function for terminal states with a heuristic evaluation function for non-terminal states at the cutoff depth.
Definition 2 (H-MINIMAX Value Function). The H-MINIMAX value function, H-Minimax-Value(s, d), for a state \(s\) at depth \(d\) is defined as: \[\texttt{H-Minimax-Value}(s, d) =\begin{cases} \texttt{Utility}(s) & \text{if } \texttt{Terminal-Test}(s) \\ \texttt{Eval}(s) & \text{if } \texttt{Cutoff-Test}(s, d) \\ \max_{a \in \texttt{Actions}(s)} \texttt{H-Minimax-Value}(\texttt{Result}(s, a), d+1) & \text{if } \texttt{To-Move}(s) = \text{Max} \\ \min_{a \in \texttt{Actions}(s)} \texttt{H-Minimax-Value}(\texttt{Result}(s, a), d+1) & \text{if } \texttt{To-Move}(s) = \text{Min}\end{cases}\] The Terminal-Test case is listed first so that genuine terminal states always receive their exact utility rather than a heuristic estimate. Where:
\(\texttt{Cutoff-Test}(s, d)\) is a function that determines if the search should be terminated at state \(s\) and depth \(d\). A simple cutoff test is to check if the current depth \(d\) has reached a predefined limit. More sophisticated cutoff tests might consider the stability of the evaluation function or the quiescence of the game state.
\(\texttt{Eval}(s)\) is the heuristic evaluation function that estimates the utility of a non-terminal state \(s\). A well-designed evaluation function is crucial for the performance of H-MINIMAX. It should capture the key features of the game state that are indicative of winning or losing.
The H-MINIMAX algorithm uses the Cutoff-Test to decide when to stop expanding the search tree along a particular path. If the cutoff test is triggered (e.g., maximum depth reached), the Eval function is called to estimate the value of the state. Otherwise, the algorithm proceeds with Minimax recursion, incrementing the depth \(d\) at each step.
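A depth-limited variant of the earlier sketch only needs the depth threaded through the recursion and the evaluation function substituted at the cutoff. The code below is again illustrative; the eval method, the fixed depth limit, and the game interface are assumptions, and a common concrete choice for Eval is a weighted linear combination of board features (exactly the kind of weights Samuel’s program learned, as discussed next).

    def h_minimax_value(state, game, depth=0, depth_limit=4):
        """Depth-limited Minimax using a heuristic evaluation at the cutoff depth."""
        if game.is_terminal(state):
            return game.utility(state)   # exact value at true terminal states
        if depth >= depth_limit:
            return game.eval(state)      # heuristic estimate once the cutoff is reached
        values = [h_minimax_value(game.result(state, a), game, depth + 1, depth_limit)
                  for a in game.actions(state)]
        return max(values) if game.to_move(state) == "Max" else min(values)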
Historical Significance: Samuel’s Checkers Program and Early Machine Learning
Arthur Samuel’s checkers program, developed in the 1950s and 1960s, is a landmark achievement in AI and game playing. It effectively utilized heuristic evaluation functions and depth-limited search to play checkers at a competent level, even surpassing Samuel himself. Crucially, Samuel’s program was not static; it incorporated pioneering machine learning techniques to improve its performance over time.
One of the key innovations was that the program learned from experience. It played numerous games against itself and against human opponents. This experience was used to refine the heuristic evaluation function. For instance, the program adjusted the weights of different features in the evaluation function based on game outcomes. This is an early example of reinforcement learning, where successful play (winning games) provided positive feedback, reinforcing the strategies and evaluation parameters that led to success. Samuel’s work not only demonstrated the practical power of heuristic search in game playing but also foreshadowed the importance of machine learning in artificial intelligence, and he is indeed credited with coining the term "Machine Learning" itself.
Summary of Adversarial Search Techniques
In this section, we have explored several key techniques for adversarial search in games:
Informed Search Algorithms: A* as a Foundation
Building upon our previous discussions of search algorithms, particularly informed search, we recall the A* algorithm. While A* is primarily designed for single-agent pathfinding and optimization problems, the concept of using heuristic information to guide search efficiently is a foundational principle that extends to adversarial search. A* uses a heuristic function to estimate the cost to reach the goal, guiding the search towards promising paths. In adversarial search, heuristic evaluation functions play a similar role, estimating the value of game states and guiding the Minimax search.
AND-OR Trees for Non-Deterministic Environments
In earlier lectures, we introduced AND-OR trees as a framework for problem-solving in non-deterministic environments. While Minimax and alpha-beta pruning are designed for deterministic, two-player zero-sum games, AND-OR trees provide a more general approach that can handle uncertainty and different types of problem decomposition. The concepts of maximizing and minimizing outcomes, and the recursive evaluation of states, share conceptual similarities between Minimax and the evaluation process in AND-OR trees, although they are applied in different contexts (adversarial vs. non-deterministic environments).
Minimax and Alpha-Beta Pruning: Core Algorithms for Adversarial Games
Minimax Algorithm: The fundamental algorithm for decision-making in two-player, zero-sum games with perfect information. It computes optimal moves by recursively evaluating game states and alternating between maximizing and minimizing player perspectives.
Alpha-Beta Pruning: An optimization technique that significantly enhances the efficiency of Minimax search by pruning branches of the game tree that are guaranteed to be suboptimal. It maintains alpha and beta bounds to detect and eliminate irrelevant search paths.
Heuristic Evaluation Functions and Depth-Limited Search: Essential for applying Minimax to complex games with large game trees. Heuristic evaluation functions estimate the utility of non-terminal states, enabling depth-limited search and practical game-playing agents.
These techniques provide a robust foundation for developing AI agents capable of strategic decision-making in competitive environments, and they represent key milestones in the history and development of artificial intelligence.
Knowledge-Based Agents and Logical Reasoning
Introduction to Knowledge-Based Agents: Mimicking Human Cognition
We now shift our focus to knowledge-based agents, a paradigm in AI that aims to create intelligent systems by explicitly representing and reasoning with knowledge. The fundamental idea is to emulate human cognitive processes, where knowledge and reasoning play a central role in decision-making and problem-solving. Humans possess a vast amount of knowledge about the world, and they use this knowledge to reason, infer, and ultimately choose actions. For example, a human knows that "if it is raining, then taking an umbrella is advisable." This piece of knowledge directly influences the action of whether or not to take an umbrella when going outside. Knowledge-based agents strive to replicate this capability by encoding knowledge in a structured format and employing reasoning mechanisms to derive appropriate actions. This approach contrasts with purely reactive or model-free agents, which rely on direct mappings from percepts to actions or learned associations without explicit knowledge representation.
Explicit and Declarative Knowledge Representation
A defining characteristic of knowledge-based agents is the explicit representation of knowledge. This means that the agent’s knowledge is not implicitly embedded within its code or learned parameters, but rather is stored in a form that is directly accessible and interpretable. During the 1970s and 1980s, a significant area of AI research was dedicated to exploring effective methods for knowledge representation. Various approaches were considered, including graphical notations, semantic networks, and frames. However, logic emerged as the dominant and most influential tool for explicit knowledge representation in artificial intelligence.
The adoption of logic as a representation language was driven by its formal nature, well-defined semantics, and the availability of inference mechanisms. Logic provides a precise and unambiguous way to express facts, rules, and relationships. This paradigm shift towards logic-based knowledge representation paved the way for the development of knowledge-based systems and, more specifically, expert systems. These systems were designed to capture and utilize domain-specific knowledge to solve complex problems in areas such as medical diagnosis, financial analysis, and engineering design.
Logic as the Lingua Franca for Knowledge Representation
Logic serves as a formal, unambiguous, and well-understood language for representing knowledge in AI systems. Its adoption was a pivotal moment in the field, providing a robust foundation for building intelligent agents.
Logic, in its various forms, offers a structured and mathematically grounded approach to knowledge representation. Within the realm of logic, several types exist, each with varying expressive power and suitability for different types of knowledge and reasoning tasks. These include:
Propositional Logic: The most basic form, dealing with simple statements (propositions) that are either true or false, and logical connectives like AND, OR, and NOT.
First-Order Logic (Predicate Logic): A more expressive logic that introduces predicates, objects, and quantifiers, allowing for representation of more complex relationships and generalizations.
Modal Logic: Extends propositional or first-order logic to include modalities such as necessity and possibility, useful for reasoning about beliefs, knowledge, and time.
Higher-Order Logic: Even more expressive, allowing quantification over predicates and functions, but also increasing complexity.
In our exploration of knowledge-based agents, we will begin with propositional logic due to its simplicity and its ability to illustrate fundamental concepts. We will then progress to first-order logic, which is essential for representing more complex and nuanced knowledge required for many AI applications. Understanding the distinctions and capabilities of these logical systems is crucial for designing intelligent agents that can effectively reason about the world.
Knowledge Base: The Repository of Agent’s Knowledge
At the heart of a knowledge-based agent lies the knowledge base (KB). The knowledge base is the central component that stores the agent’s accumulated knowledge about the world. It acts as a repository of facts, beliefs, and rules, all formally encoded in a chosen logical language. Typically, a KB is structured as a collection of sentences or formulas in a logical language. Each sentence represents a piece of knowledge. For instance, in propositional logic, a sentence might be "It is raining," represented symbolically as, say, \(R\). In first-order logic, a sentence could be "All humans are mortal," represented as \(\forall x (\text{Human}(x) \implies \text{Mortal}(x))\).
The knowledge base is not merely a passive data store; it is an active component that the agent uses to reason and make decisions. The effectiveness of a knowledge-based agent is heavily dependent on the quality, completeness, and organization of its knowledge base. Building and maintaining a knowledge base is a significant challenge in developing knowledge-based systems, often referred to as knowledge engineering.
Core Operations: Tell and Ask - Interacting with the Knowledge Base
Knowledge-based agents interact with their knowledge base through two fundamental operations: Tell and Ask. These operations define how the agent updates and queries its knowledge.
Tell (Knowledge Ingestion): The Tell operation is used to add new knowledge to the KB. This is how the agent learns about the world, either through perception or by being programmed with initial knowledge. When the agent perceives something new or is given new information, this information is translated into a sentence in the logical language and then told to the KB. For example, if a sensor detects rain, the agent uses Tell to add the sentence representing "It is raining" to its KB. Tell operations expand the KB, enriching the agent’s representation of the world.

Ask (Knowledge Retrieval and Inference): The Ask operation is used to query the KB. The agent uses Ask to retrieve existing knowledge or, more importantly, to pose queries that require logical inference. When an agent needs to decide on an action or answer a question, it formulates a query in the logical language and asks the KB. The Ask operation is not simply a database lookup; it involves logical reasoning. The inference engine within the KB processes the query, using the existing sentences in the KB to derive logical consequences and determine if the query is entailed by the KB. For instance, an agent might Ask "Should I take an umbrella?". The inference engine would then use the knowledge in the KB (e.g., "If it is raining, then take an umbrella") and the current facts (e.g., "It is raining") to infer the answer.
Remark 1 (Tell and Ask Operations). Tell and Ask are the two fundamental interfaces for a knowledge-based agent to interact with its knowledge base. Tell adds knowledge, while Ask queries and reasons with the knowledge.
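The overall control loop of a knowledge-based agent can be sketched in a few lines; the skeleton below is illustrative (the kb object and the three translation callables are assumptions standing in for a concrete logical language and inference engine):

    class KnowledgeBasedAgent:
        """Skeleton of the generic Tell/Ask agent loop (a sketch, not a full implementation)."""

        def __init__(self, kb, make_percept_sentence, make_action_query, make_action_sentence):
            # kb must expose tell(sentence) and ask(query); the three callables translate
            # percepts, queries, and chosen actions into sentences of the KB's logical language.
            self.kb = kb
            self.make_percept_sentence = make_percept_sentence
            self.make_action_query = make_action_query
            self.make_action_sentence = make_action_sentence
            self.t = 0  # time step counter

        def step(self, percept):
            self.kb.tell(self.make_percept_sentence(percept, self.t))  # Tell: ingest the percept
            action = self.kb.ask(self.make_action_query(self.t))       # Ask: infer a good action
            self.kb.tell(self.make_action_sentence(action, self.t))    # Tell: record the action taken
            self.t += 1
            return action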
Inference Engine: The Reasoning Mechanism
A critical component of a knowledge-based agent, beyond the knowledge base itself, is the inference engine. The inference engine is the reasoning mechanism that enables the agent to infer new knowledge from the explicitly represented facts and rules in the KB. It is the engine that powers the Ask operation, allowing the agent to go beyond simply retrieving stored facts and to deduce logical consequences.
The inference engine applies inference rules to the sentences in the KB to derive new sentences that are logically entailed by the existing knowledge. For example, given the sentences "Socrates is a man" and "All men are mortal," an inference engine using a rule like Modus Ponens can infer the new sentence "Socrates is mortal." This ability to derive implicit knowledge is what gives knowledge-based agents their reasoning power.
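The Socrates example can be reproduced with a few lines of naive forward chaining; in this toy sketch the universally quantified rule is shown already instantiated for Socrates, since plain propositional Modus Ponens cannot handle quantifiers, and the string encodings are illustrative only.

    # Toy forward chaining with Modus Ponens over ground facts and rules.
    facts = {"Man(Socrates)"}
    rules = [({"Man(Socrates)"}, "Mortal(Socrates)")]  # (premises, conclusion)

    changed = True
    while changed:                      # keep applying rules until nothing new is derived
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)   # Modus Ponens: all premises hold, so add the conclusion
                changed = True

    print(facts)  # {'Man(Socrates)', 'Mortal(Socrates)'}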
Historically, a major goal in the development of knowledge-based systems was to create a general-purpose inference engine. The vision was to build a single, domain-independent reasoning mechanism that could be coupled with different knowledge bases to create expert systems for various domains. The idea was that by simply changing the knowledge base—populating it with facts and rules relevant to a specific domain (e.g., medicine, finance, law)—the same inference engine could be used to perform expert-level reasoning in that domain. While this ambitious goal of a completely general-purpose inference engine faced challenges in practice, the concept of separating the inference mechanism from the domain-specific knowledge remains a fundamental principle in knowledge-based systems design.
Knowledge Level vs. Implementation Level: Declarative Programming
When designing and analyzing knowledge-based agents, it is crucial to distinguish between the knowledge level and the implementation level. Describing an agent at the knowledge level means focusing on what the agent knows and what reasoning it is capable of, without being concerned with the specific how of implementation. At the knowledge level, we specify the agent’s competence by describing the knowledge it possesses and the inferences it can draw, irrespective of the data structures or algorithms used to realize these capabilities.
This knowledge-level perspective leads to a declarative approach to agent programming. In declarative programming, we focus on describing what the agent should know and what goals it should achieve, rather than explicitly programming how to achieve them step-by-step. We provide the agent with a knowledge base and an inference engine, and the agent uses these components to figure out the necessary steps to achieve its goals. This is in stark contrast to procedural programming, where the programmer explicitly specifies the sequence of operations or algorithm that the agent must follow.
Declarative Programming (Knowledge Level): Focuses on what the agent knows and should achieve. Knowledge and goals are specified, and the agent reasons to find a solution.
Procedural Programming (Implementation Level): Focuses on how to achieve a goal. Explicit algorithms and step-by-step instructions are programmed.
Knowledge-based agents, through their declarative nature, offer a powerful abstraction. By programming at the knowledge level, we can design agents by focusing on the knowledge they need and the reasoning they should perform, leaving the inference engine to handle the procedural details of deriving solutions. This approach can simplify the development of complex intelligent systems and make them more understandable and maintainable.
The Wumpus World: A Practical Example for Knowledge-Based Agents
Introducing the Wumpus World: A Perceptual Reasoning Challenge
The Wumpus World is a widely recognized and utilized example in Artificial Intelligence, specifically designed to illustrate the principles of knowledge representation, reasoning, and agent design in a perceptually limited environment. It presents a scenario where an agent must navigate a dangerous cave to find gold, relying on limited sensory information to avoid deadly pitfalls and a fearsome monster called the Wumpus. The game is played on a discrete 4x4 grid, representing the cave, and is inherently single-player, focusing on the intelligent decision-making of the agent. The agent invariably begins its adventure in the bottom-left corner of this grid, at location [1,1].
Navigating the Perilous Cave: Environment and Agent’s Challenge
The Wumpus world is populated with several key elements that define the agent’s challenges and the nature of the game:
Bottomless Pits (Pits): Scattered randomly throughout the cave are hidden pits. Should the agent stumble into a square containing a pit, the outcome is fatal – the agent falls and the game ends. Crucially, the presence of pits is not directly perceivable. Instead, in a stroke of environmental storytelling, squares immediately adjacent (horizontally or vertically) to a pit emit a subtle breeze. This breeze is the agent’s only warning of nearby pits, requiring it to use inference to deduce potential pit locations.
Gleaming Gold (Gold): The objective of the game is to locate and retrieve a piece of gold hidden in one of the squares within the cave. Finding the gold is rewarding, but merely finding it is not enough to win. The agent must grab the gold, then navigate back to the starting square [1,1], and finally climb out of the cave to successfully complete the mission. The presence of gold is indicated by a perceptible glitter in the square where it is located.
The Wumpus: A Deadly Monster: The Wumpus is a fearsome monster lurking in a single, randomly chosen square within the cave. Like pits, the Wumpus’s location is unknown to the agent at the start. Encountering the Wumpus by entering its square results in immediate death for the agent, as it is devoured. To provide a clue to its location, the Wumpus emits a foul stench in all squares immediately adjacent to its location. This stench serves as a warning signal, allowing the agent to infer the potential presence of the Wumpus nearby and avoid fatal encounters.
Cave Walls (Walls): The 4x4 grid representing the Wumpus World is enclosed by impenetrable walls. These walls define the boundaries of the cave. If the agent attempts to move beyond the grid, into a wall, it will bump into it, perceiving a bump sensor reading, and its position will remain unchanged. These walls constrain the agent’s movement and define the limits of the explorable world.
Adding to the agent’s arsenal, it is equipped with a single arrow. This arrow can be used to attempt to eliminate the Wumpus. If the agent shoots the arrow in a straight line in its current facing direction, and the arrow’s path intersects with the Wumpus’s location, the Wumpus is killed. Upon successfully killing the Wumpus, a distinctive scream is emitted, which the agent can perceive, confirming the monster’s demise. However, the agent has only one arrow, making its use a strategic decision.
PEAS Description of the Wumpus World Agent
To formally define the Wumpus World problem and the agent operating within it, we can use the PEAS (Performance measure, Environment, Actuators, Sensors) framework. This framework helps to systematically analyze and design intelligent agents.
Performance Measure and Agent’s Goals
The agent’s performance in the Wumpus World is evaluated based on a scoring system designed to incentivize goal achievement and penalize undesirable outcomes:
Positive Reward:
- Climbing out with Gold: +1000 points are awarded if the agent successfully climbs out of the cave from square [1,1] while carrying the gold. This is the primary objective and yields the highest reward.
Negative Penalties:
Death: -1000 points are deducted if the agent dies, either by falling into a pit or being eaten by the Wumpus. This significant penalty emphasizes the importance of survival.
Action Cost: -1 point is subtracted for each action performed by the agent, including moving forward, turning (left or right), grabbing the gold, and climbing. This encourages efficiency and finding solutions with the minimum number of steps.
Arrow Shot Cost: -10 points are deducted each time the agent shoots its arrow. This substantial penalty for using the arrow encourages careful and strategic use of this limited resource.
The game’s duration is finite, ending when the agent either dies (falls into a pit or is eaten by the Wumpus) or successfully climbs out of the cave. The agent’s ultimate goal is to maximize its accumulated score, which necessitates finding the gold, escaping the cave alive, and doing so efficiently, while strategically considering the use of its single arrow.
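As a quick, hypothetical illustration of the scoring: an agent that reaches the gold, grabs it, returns to [1,1], and climbs out using 20 actions in total scores \(1000 - 20 = 980\) points, while one that falls into a pit on its third move finishes with \(-3 - 1000 = -1003\).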
Environment Characteristics: The Cave Layout
The Wumpus World environment is characterized by:
Grid-Based Cave: The cave is represented as a 4x4 grid of locations or rooms, surrounded by walls that form the outer boundary.
Starting Position: The agent always begins its exploration in square [1,1], located at the bottom-left of the grid, and is initially oriented to face East (to the right).
Random Placement of Gold and Wumpus: The gold and the Wumpus are each located in one square within the cave, with their positions chosen randomly at the start of each game. Importantly, neither the gold nor the Wumpus can be placed in the starting square [1,1], ensuring the agent always begins in a safe location. The distribution of their placement is uniform across the possible squares (excluding [1,1]).
Probabilistic Pit Distribution: Pits are also distributed randomly throughout the cave. Each square, except for the starting square [1,1], has a 20% probability of containing a pit. This probabilistic placement means that the number of pits can vary from game to game, and some games might have no pits at all, while others could be densely riddled with them.
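A small code sketch of how such a world could be generated follows; the grid size, the 20% pit probability, and the exclusion of square [1,1] come from the rules above, while the data layout (and the possibility of the gold and the Wumpus sharing a square) are illustrative assumptions.

    import random

    def generate_wumpus_world(size=4, pit_prob=0.2, seed=None):
        """Randomly place the gold, the Wumpus, and pits on a size x size grid.

        Square (1, 1), the agent's starting location, never contains any of them.
        """
        rng = random.Random(seed)
        squares = [(x, y) for x in range(1, size + 1) for y in range(1, size + 1)]
        candidates = [sq for sq in squares if sq != (1, 1)]
        gold = rng.choice(candidates)    # uniform over the non-start squares
        wumpus = rng.choice(candidates)  # uniform over the non-start squares
        pits = {sq for sq in candidates if rng.random() < pit_prob}
        return {"gold": gold, "wumpus": wumpus, "pits": pits}

    print(generate_wumpus_world(seed=0))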
Agent Actuators: Actions Available to the Agent
The agent can interact with the Wumpus World environment through a defined set of actions, or actuators:
Move Forward (Forward): This action attempts to move the agent one square in the direction it is currently facing. If the movement would cause the agent to collide with a wall (i.e., move outside the 4x4 grid), the action results in a bump perception, and the agent’s position remains unchanged.
Turn Left (Turn Left): This action rotates the agent 90 degrees counter-clockwise, changing its facing direction (e.g., from East to North). The agent remains in the same square.
Turn Right (Turn Right): This action rotates the agent 90 degrees clockwise, changing its facing direction (e.g., from East to South). The agent remains in the same square.
Grab (Grab): The agent can use this action to pick up the gold, but only if it is currently in the same square as the gold. If gold is present in the current square, executing Grab will result in the agent carrying the gold.
Shoot Arrow (Shoot): This action fires the agent’s single arrow in a straight line in the direction the agent is currently facing. The arrow travels until it hits a wall or the Wumpus. If the arrow hits the Wumpus, the Wumpus is killed, and a scream is perceived. The agent only has one arrow for the entire game, making this a one-time use action. Subsequent Shoot actions have no effect.
Climb (Climb): This action allows the agent to climb out of the cave, but only if it is in the starting square [1,1]. To win the maximum score, the agent must be in square [1,1], possess the gold, and then execute the Climb action.
Agent Sensors: Perceptions of the Wumpus World
The agent’s knowledge about the Wumpus World is derived from its limited set of sensors, which provide local perceptions of its immediate surroundings:
Stench Sensor (Stench): If there is a Wumpus in any of the squares immediately adjacent (North, South, East, or West) to the agent’s current square, the agent perceives a stench in its current location. The stench does not indicate the Wumpus’s exact location, only its proximity.
Breeze Sensor (Breeze): If any adjacent square contains a pit, the agent perceives a breeze in its current square. Similar to the stench, the breeze is a proximal indicator, not a precise location of the pit.
Glitter Sensor (Glitter): If the gold is located in the agent’s current square, the agent perceives a glitter. This is a direct indication of the gold’s presence in the current location.
Bump Sensor (Bump): If the agent attempts to move forward into a wall, the bump sensor is activated, providing feedback that the attempted move was blocked by a wall.
Scream Sensor (Scream): If the agent successfully shoots and kills the Wumpus, the scream sensor is activated in the turn immediately following the Shoot action. This confirms to the agent that the Wumpus has been eliminated.
These sensors provide the agent with crucial, albeit indirect and local, information about the dangers and rewards within the Wumpus World. The agent must intelligently process these perceptions to navigate safely, deduce the locations of the Wumpus and pits, and ultimately find and retrieve the gold.
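At every time step the agent thus receives a bundle of five boolean percepts. A convenient (purely illustrative) representation is a named five-tuple:

    from collections import namedtuple

    Percept = namedtuple("Percept", ["stench", "breeze", "glitter", "bump", "scream"])

    # Example: the agent currently smells a stench and feels a breeze, and nothing else.
    p = Percept(stench=True, breeze=True, glitter=False, bump=False, scream=False)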
Introduction to Propositional Logic
The Fundamental Role of Logic in Artificial Intelligence Systems
Logic is indispensable to Artificial Intelligence. It provides the bedrock for formalizing knowledge and reasoning processes, enabling AI systems to move beyond mere data processing to genuine understanding and inference.
Logic, in its essence, offers a structured and unambiguous system for representing information and deriving conclusions. Its importance in AI stems from its ability to provide:
Formal Knowledge Representation: Logic allows us to encode knowledge in a precise and unambiguous format, eliminating the vagueness and ambiguity inherent in natural language. This formal representation is crucial for AI systems to process and manipulate knowledge effectively.
Sound Reasoning and Inference: Logic provides a set of inference rules that guarantee the validity of deductions. Using logical inference, AI agents can derive new, sound conclusions from existing knowledge. This deductive capability is fundamental to intelligent behavior, allowing agents to answer questions, solve problems, and make predictions based on their knowledge.
Rational Decision Making: By representing goals, constraints, and the state of the world in logic, AI agents can use logical reasoning to determine optimal or rational actions. Logic enables agents to evaluate different courses of action, predict their outcomes, and choose actions that are logically consistent with their knowledge and objectives.
Basis for AI Paradigms: Logic underpins many important areas within AI, including knowledge representation, automated reasoning, planning, and problem-solving. It serves as a foundational tool for building intelligent systems that can reason, learn, and interact with the world in a coherent and rational manner.
In essence, logic empowers AI systems with the ability to think, in a formal and computational sense, mirroring aspects of human reasoning and providing a pathway towards creating truly intelligent machines.
Previewing Propositional Logic: The Logic of Simple Statements
Our exploration of logic in AI will commence with propositional logic, which is the foundational and simplest form of logic. Despite its simplicity, propositional logic provides a powerful starting point for understanding the core principles of logical representation and inference.
Propositional Logic is the logic of simple statements. It deals with propositions – declarative statements that can be definitively classified as either true or false – and how these propositions can be combined and manipulated using logical connectives.
In propositional logic, we focus on:
Propositions: These are basic declarative sentences that are either true or false. Examples include "The sun is shining," "It is raining," or "2 + 2 = 4." In propositional logic, we treat these statements as atomic units, without analyzing their internal structure.
Logical Connectives: These are operators that combine propositions to form more complex sentences. The primary logical connectives in propositional logic are:
AND (\(\land\)): Combines two propositions, true only if both are true.
OR (\(\lor\)): Combines two propositions, true if at least one is true.
NOT (\(\neg\)): Negates a proposition, true if the proposition is false, and vice versa.
IMPLICATION (\(\implies\)): Represents "if...then..." relationships. \(P \implies Q\) is false only if \(P\) is true and \(Q\) is false.
BICONDITIONAL (\(\iff\)): Represents "if and only if" relationships. \(P \iff Q\) is true if \(P\) and \(Q\) have the same truth value.
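The meaning of these connectives is fully determined by their truth tables, summarized here for arbitrary propositions \(P\) and \(Q\):
\[
\begin{array}{cc|ccccc}
P & Q & \neg P & P \land Q & P \lor Q & P \implies Q & P \iff Q \\
\hline
T & T & F & T & T & T & T \\
T & F & F & F & T & F & F \\
F & T & T & F & T & T & F \\
F & F & T & F & F & T & T
\end{array}
\]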
In the subsequent sections and lectures, we will delve into the details of propositional logic. We will learn:
Syntax and Semantics: How to formally construct sentences in propositional logic and how to determine their truth values.
Knowledge Representation: How to use propositional logic to represent knowledge about the world, including facts, rules, and constraints.
Logical Inference: Methods for performing logical inference in propositional logic, allowing us to derive new knowledge from existing sentences. This will include exploring concepts like entailment and inference rules, which are crucial for building reasoning agents.
By mastering propositional logic, we will establish a solid foundation for understanding more advanced logical systems and their applications in building intelligent AI agents.
Conclusion
Summary: Bridging Adversarial Search and Knowledge-Based Systems
In today’s lecture, we have traversed two significant areas within Artificial Intelligence. We began by revisiting adversarial search, focusing on the challenges of game playing and competitive environments. We explored the Minimax algorithm as a foundational approach for optimal decision-making in zero-sum games and examined alpha-beta pruning as a crucial optimization technique to enhance its efficiency. We also discussed the necessity of heuristic evaluation functions for tackling complex games where exhaustive search is infeasible, referencing Samuel’s checkers program as a historical example of these principles in action.
Subsequently, we transitioned to the realm of knowledge-based agents. We introduced the concept of agents that explicitly represent and reason with knowledge, drawing a contrast with purely reactive approaches. We emphasized the importance of logic as a formal language for knowledge representation and discussed the core components of knowledge-based agents: the knowledge base and the inference engine. We highlighted the Tell and Ask operations as the primary modes of interaction with a knowledge base and differentiated between the knowledge level and the implementation level of agent design, underscoring the declarative nature of knowledge-based programming. To ground these abstract concepts, we introduced the Wumpus World as a concrete and illustrative example environment that will serve as a running case study for applying knowledge representation and reasoning techniques in the lectures to come.
Looking Ahead: Propositional Logic and Reasoning in the Wumpus World
Building upon the introduction to knowledge-based agents, our next lectures will embark on a deeper exploration of propositional logic. We will systematically examine its syntax, defining how to construct well-formed logical sentences, and its semantics, establishing how to determine the truth value of these sentences in different possible worlds or interpretations. A key focus will be on inference rules and methods for performing logical deduction, enabling us to derive new knowledge from existing information.
Crucially, we will apply propositional logic to the Wumpus World. We will learn how to formally represent the agent’s knowledge about the Wumpus World environment, including the rules of the game, the agent’s perceptions (stench, breeze, glitter, bump, scream), and the consequences of its actions. Our goal is to equip our agent with the ability to reason logically about its percepts and the environment to make informed and safe decisions. We will investigate how to represent the dynamic aspects of the Wumpus World and the agent’s evolving state within a logical framework, ultimately aiming to design a knowledge-based agent capable of intelligently navigating the Wumpus World, finding the gold, and escaping unscathed. This practical application will solidify our understanding of propositional logic and its role in creating intelligent systems.