Formal Methods for AI and Cyber Security

Author

Your Name

Published

February 5, 2025

Introduction

This document provides an overview of the course Formal Methods for AI and Cyber Security, taught by Professors Angelo Montanari, Gabriele Puppis, and Luca Geatti. The course is designed for students in the Artificial Intelligence and Cyber Security degree program (12 credits) and the Italian Computer Science degree program (9 credits).

The primary goal of this course is to introduce students to the field of symbolic AI and formal methods, emphasizing their importance alongside machine learning in AI and cyber security. The course aims to equip students with the ability to model systems, specify properties, and verify them effectively.

The course will cover fundamental concepts and techniques in formal methods, including:

Logic
Automata Theory
Temporal Logic
Algorithms for Model Checking and Satisfiability

Furthermore, it explores advanced topics such as:

Automata Learning
Synthesis

demonstrating the practical application of these methods in AI and cyber security. The importance of symbolic AI, also known as model-driven AI, is highlighted, especially in providing interpretable and verifiable results, which complements the advancements in machine learning.

Modeling Systems: Using automata on infinite objects to represent systems with non-terminating behavior.
Specifying Properties: Employing temporal logic to describe system properties over time.
Verification Algorithms: Implementing algorithms for model checking and satisfiability to ensure system correctness.
Advanced Topics: Exploring automata learning and synthesis for automated system development.

The integration of symbolic AI with machine learning is presented as a promising direction for future research, particularly in addressing the interpretability problem in machine learning, especially in safety-critical applications.

Course Organization

Instructors

The course will be jointly taught by:

Professor Angelo Montanari
Professor Gabriele Puppis
Professor Luca Geatti

Course Structure

The course is divided into two parts:

First Semester (24 hours): Background knowledge on logic, automata, and games. This part is primarily for students in the Artificial Intelligence and Cyber Security degree but is also recommended for students in the Italian Computer Science degree as a refresher.
Second Semester (72 hours): Core topics including automata on infinite objects, temporal logic, algorithms for model checking and satisfiability, automata learning, synthesis, and applications in AI and cyber security.

Schedule

First Semester: One lesson per week.
Second Semester: Three lessons per week.

Office Hours

Weekly office hours will be held online for question answering and further discussion. Specific timing will be communicated through the course’s Teams channel.

Exams

There will be five exam sessions:

Two sessions in June/July (end of the course)
Two sessions in September
One session in January/February

The exam is primarily an oral examination.
Students scoring 26 or below out of 30 will have completed their exam.
Students scoring 27 or above can opt for a second part, which involves a seminar on a specific topic agreed upon with the professors. The second part can be taken in any of the following sessions and does not have to be in the one immediately following the oral examination.
Students who score 30 cum laude in the oral examination do not need to take the second part.

Note: There is a minimum threshold of knowledge required to pass the exam, focusing on basics of automata, logical formalisms, and algorithms. The exam is designed to assess both the depth and breadth of understanding of the course material.

Communication

All course materials, announcements, and communications will be handled through Microsoft Teams. Students are advised to check Teams regularly for updates, especially the day before each lesson.

Materials

Course materials, including lecture recordings, slides, and book PDFs, will be available on Microsoft Teams under the "File" section.

Course Content

The course content is structured into two main parts: background knowledge covered in the first semester, and core topics covered in the second semester.

Background Knowledge (First Semester)

The first semester will provide foundational knowledge in logic, automata theory, and game theory.

Logic:
- Propositional Logic: Fundamentals and basic concepts.
- First-Order Logic: Extending propositional logic with quantifiers and predicates.
- Tableau Systems: Techniques for automated reasoning and proving the validity of logical formulas.
Automata:
- Automata on Finite Words: Introduction to automata that process finite sequences of symbols.
- Deterministic and Non-deterministic Automata: Understanding the differences and conversions between deterministic and non-deterministic finite automata (DFA and NFA).
- Representation of Automata: Different ways to represent automata, including:
  - Graphs: Visual representation using nodes and edges.
  - Logical Formulas: Expressing automata behavior using logical statements.
  - Regular Expressions: Using regular expressions to define the language accepted by an automaton.
Games:
- Logical Games: Introduction to games used in computer science for modeling and solving problems.
- Introduction to Other Types of Games: Overview of other game types, such as parity games, which are used in formal verification.

Core Topics (Second Semester)

The second semester will delve into core topics, including modeling systems with automata on infinite objects, specifying properties using temporal logic, and developing verification algorithms.

Modeling Systems

Automata on Infinite Objects:
- Generalization of automata to handle infinite words and trees, which are essential for modeling non-terminating systems.
- Modeling systems with non-terminating behavior, such as protocols and continuous systems that operate indefinitely.
- Example: Modeling a printer system within a department, which manages infinite sequences of print requests and maintains operational states. This includes ensuring that each request is eventually satisfied and that print jobs are not mixed.

Specifying Properties

Temporal Logic:
- Formalism for describing properties of systems over time, capturing how system behavior evolves.
- Expressing properties like "sooner or later, all requests are satisfied," which is crucial for non-terminating systems.
- Ensuring properties are maintained in non-terminating systems, guaranteeing correct behavior indefinitely.

Verification Algorithms

Model Checking:
- Algorithms to verify if a system satisfies specified properties.
- Application in various domains including databases (querying as model checking), planning (finding a plan as model checking), and protocol verification.
Satisfiability Checking:
- Algorithms to determine if a given formula is satisfiable, ensuring there exists an assignment of variables that makes the formula true.

Advanced Topics

Automata Learning:
- Techniques to automatically learn the structure of an unknown automaton from examples.
- Building models from positive and negative examples, where positive examples are accepted by the automaton and negative examples are not.
Synthesis:
- Automatically building a system from a given specification, ensuring the system meets the desired properties.
- Application in building controllers for automatic plan execution, ensuring the system can execute a plan to reach a goal state from an initial state.
- Historical context: Church’s problem (1950s), which sought to automatically synthesize a system from a given specification.

Applications in AI and Cyber Security

Case studies demonstrating the application of formal methods in real-world scenarios.
Integration of monitoring tools with machine learning for detecting system failures, using past data to predict and prevent issues.
Potential topics for student seminars, allowing students to explore specific applications in depth.

Symbolic AI and Formal Methods

Importance of Symbolic AI

Symbolic AI, also known as model-driven AI, is a crucial part of artificial intelligence that complements machine learning. While machine learning, particularly deep learning, has gained prominence, symbolic AI remains essential for its ability to provide interpretable and verifiable results. This is especially important in contexts where understanding the reasoning behind an AI’s decision is as crucial as the decision itself.

Interpretability: Symbolic AI systems are inherently more interpretable than black-box models like deep neural networks. The logical rules and knowledge representation in symbolic AI allow for a clear understanding of how a system arrives at a particular conclusion.
Verifiability: Formal methods, a cornerstone of symbolic AI, enable rigorous verification of system properties. This ensures that a system meets its specifications and behaves correctly under all circumstances.
Explainability: The explicit knowledge representation in symbolic AI facilitates the generation of explanations for the system’s behavior, which is crucial for building trust and understanding in AI systems.

Integration with Machine Learning

The future of AI likely involves integrating machine learning with formal methods. This integration can address the interpretability problem in machine learning, especially in safety-critical applications like medicine, autonomous driving, and aerospace. By combining the strengths of both approaches, we can create AI systems that are both powerful and trustworthy.

Addressing Interpretability: Integrating formal methods with machine learning can help explain the decisions made by complex machine learning models, making them more transparent and understandable.
Enhancing Reliability: Formal verification techniques can be used to ensure the reliability and safety of machine learning systems, particularly in critical applications.
Improving Robustness: Combining symbolic AI and machine learning can lead to more robust AI systems that are less susceptible to adversarial attacks and unexpected inputs.
Monitoring: Integrating monitoring tools with machine learning allows for the detection of system failures by analyzing past data to predict and prevent issues.

Formal Methods in Cyber Security

In cyber security, formal methods are used to model and verify the properties of systems such as protocols, access control mechanisms, and cryptographic algorithms. These methods ensure that systems behave correctly and securely under all conditions, including potential attacks.

Protocol Verification: Formal methods are used to verify the correctness and security of communication protocols, ensuring they are free from vulnerabilities and can withstand attacks.
System Modeling: Formal methods provide a rigorous way to model security-critical systems, allowing for the identification of potential weaknesses and vulnerabilities.
Property Specification: Temporal logic and other formal languages are used to specify security properties, such as confidentiality, integrity, and availability.
Automated Verification: Model checking and other automated techniques are employed to verify that systems satisfy specified security properties.

Practical Applications

Formal methods and symbolic AI have a wide range of practical applications across various domains:

Automated Planning: Used in various industries for optimizing operations, resource allocation, and scheduling. For example, in logistics, automated planning can optimize delivery routes and schedules.
Database Querying: Formalized as a model checking problem, where querying a database can be seen as verifying whether a certain property (the query) holds in a model (the database).
Protocol Verification: Ensuring the correctness and security of communication protocols, such as those used in network security and distributed systems. This is crucial for preventing attacks and ensuring reliable communication.

Conclusion

This course provides a comprehensive introduction to formal methods and their applications in AI and cyber security. By covering topics from basic logic and automata to advanced algorithms and practical applications, it aims to equip students with the necessary tools to understand, develop, and analyze complex systems. The integration of symbolic AI with machine learning is highlighted as a promising direction for future research, offering the potential to create AI systems that are both powerful and trustworthy.

Formal methods and symbolic AI are essential for creating interpretable, verifiable, and explainable AI systems.
The course covers a wide range of topics, from foundational concepts (logic, automata) to advanced techniques (model checking, automata learning) and applications (automated planning, protocol verification).
Integration of formal methods with machine learning is crucial for future advancements in AI and cyber security, particularly in addressing the interpretability and reliability challenges of machine learning models.

What are some specific examples of systems that can be modeled using automata on infinite objects, beyond the printer example mentioned in the course introduction?
How can temporal logic be used to specify properties of a cyber security protocol, such as ensuring secure key exchange or preventing denial-of-service attacks?
What are the main challenges in integrating formal methods with machine learning, and how can they be addressed? For instance, how can we handle the computational complexity of formal verification when applied to large machine learning models?
How can we automatically learn an automaton from examples of accepted and rejected words?

--- title: "Formal Methods for AI and Cyber Security" author: "Your Name" date: "2025-02-05" format: html: toc: true # Table of Contents toc-depth: 2 code-tools: true theme: cosmo # Or "journal" for Distill-like minimalism --- # Introduction This document provides an overview of the course **Formal Methods for AI and Cyber Security**, taught by Professors Angelo Montanari, Gabriele Puppis, and Luca Geatti. The course is designed for students in the **Artificial Intelligence and Cyber Security** degree program (12 credits) and the **Italian Computer Science** degree program (9 credits). ::: tcolorbox The primary goal of this course is to introduce students to the field of **symbolic AI** and **formal methods**, emphasizing their importance alongside machine learning in AI and cyber security. The course aims to equip students with the ability to **model systems**, **specify properties**, and **verify them effectively**. ::: The course will cover fundamental concepts and techniques in formal methods, including: - Logic - Automata Theory - Temporal Logic - Algorithms for Model Checking and Satisfiability Furthermore, it explores advanced topics such as: - Automata Learning - Synthesis demonstrating the practical application of these methods in AI and cyber security. The importance of symbolic AI, also known as model-driven AI, is highlighted, especially in providing interpretable and verifiable results, which complements the advancements in machine learning. ::: tcolorbox - **Modeling Systems**: Using automata on infinite objects to represent systems with non-terminating behavior. - **Specifying Properties**: Employing temporal logic to describe system properties over time. - **Verification Algorithms**: Implementing algorithms for model checking and satisfiability to ensure system correctness. - **Advanced Topics**: Exploring automata learning and synthesis for automated system development. ::: The integration of symbolic AI with machine learning is presented as a promising direction for future research, particularly in addressing the interpretability problem in machine learning, especially in safety-critical applications. # Course Organization ## Instructors The course will be jointly taught by: - Professor Angelo Montanari - Professor Gabriele Puppis - Professor Luca Geatti ## Course Structure The course is divided into two parts: - **First Semester (24 hours)**: Background knowledge on logic, automata, and games. This part is primarily for students in the Artificial Intelligence and Cyber Security degree but is also recommended for students in the Italian Computer Science degree as a refresher. - **Second Semester (72 hours)**: Core topics including automata on infinite objects, temporal logic, algorithms for model checking and satisfiability, automata learning, synthesis, and applications in AI and cyber security. ## Schedule - **First Semester**: One lesson per week. - **Second Semester**: Three lessons per week. ## Office Hours Weekly office hours will be held online for question answering and further discussion. Specific timing will be communicated through the course's Teams channel. ## Exams There will be five exam sessions: - Two sessions in June/July (end of the course) - Two sessions in September - One session in January/February ::: tcolorbox - The exam is primarily an oral examination. - Students scoring 26 or below out of 30 will have completed their exam. - Students scoring 27 or above can opt for a second part, which involves a seminar on a specific topic agreed upon with the professors. The second part can be taken in any of the following sessions and does not have to be in the one immediately following the oral examination. - Students who score 30 *cum laude* in the oral examination do not need to take the second part. ::: **Note**: There is a minimum threshold of knowledge required to pass the exam, focusing on basics of automata, logical formalisms, and algorithms. The exam is designed to assess both the depth and breadth of understanding of the course material. ## Communication All course materials, announcements, and communications will be handled through **Microsoft Teams**. Students are advised to check Teams regularly for updates, especially the day before each lesson. ## Materials Course materials, including lecture recordings, slides, and book PDFs, will be available on Microsoft Teams under the **\"File\"** section. # Course Content The course content is structured into two main parts: background knowledge covered in the first semester, and core topics covered in the second semester. ## Background Knowledge (First Semester) The first semester will provide foundational knowledge in logic, automata theory, and game theory. - **Logic**: - Propositional Logic: Fundamentals and basic concepts. - First-Order Logic: Extending propositional logic with quantifiers and predicates. - Tableau Systems: Techniques for automated reasoning and proving the validity of logical formulas. - **Automata**: - Automata on Finite Words: Introduction to automata that process finite sequences of symbols. - Deterministic and Non-deterministic Automata: Understanding the differences and conversions between deterministic and non-deterministic finite automata (DFA and NFA). - Representation of Automata: Different ways to represent automata, including: - **Graphs**: Visual representation using nodes and edges. - **Logical Formulas**: Expressing automata behavior using logical statements. - **Regular Expressions**: Using regular expressions to define the language accepted by an automaton. - **Games**: - Logical Games: Introduction to games used in computer science for modeling and solving problems. - Introduction to Other Types of Games: Overview of other game types, such as parity games, which are used in formal verification. ## Core Topics (Second Semester) The second semester will delve into core topics, including modeling systems with automata on infinite objects, specifying properties using temporal logic, and developing verification algorithms. ### Modeling Systems - **Automata on Infinite Objects**: - Generalization of automata to handle infinite words and trees, which are essential for modeling non-terminating systems. - Modeling systems with non-terminating behavior, such as protocols and continuous systems that operate indefinitely. - Example: Modeling a printer system within a department, which manages infinite sequences of print requests and maintains operational states. This includes ensuring that each request is eventually satisfied and that print jobs are not mixed. ### Specifying Properties - **Temporal Logic**: - Formalism for describing properties of systems over time, capturing how system behavior evolves. - Expressing properties like \"sooner or later, all requests are satisfied,\" which is crucial for non-terminating systems. - Ensuring properties are maintained in non-terminating systems, guaranteeing correct behavior indefinitely. ### Verification Algorithms - **Model Checking**: - Algorithms to verify if a system satisfies specified properties. - Application in various domains including databases (querying as model checking), planning (finding a plan as model checking), and protocol verification. - **Satisfiability Checking**: - Algorithms to determine if a given formula is satisfiable, ensuring there exists an assignment of variables that makes the formula true. ### Advanced Topics - **Automata Learning**: - Techniques to automatically learn the structure of an unknown automaton from examples. - Building models from positive and negative examples, where positive examples are accepted by the automaton and negative examples are not. - **Synthesis**: - Automatically building a system from a given specification, ensuring the system meets the desired properties. - Application in building controllers for automatic plan execution, ensuring the system can execute a plan to reach a goal state from an initial state. - Historical context: Church's problem (1950s), which sought to automatically synthesize a system from a given specification. ### Applications in AI and Cyber Security - Case studies demonstrating the application of formal methods in real-world scenarios. - Integration of monitoring tools with machine learning for detecting system failures, using past data to predict and prevent issues. - Potential topics for student seminars, allowing students to explore specific applications in depth. # Symbolic AI and Formal Methods ## Importance of Symbolic AI Symbolic AI, also known as model-driven AI, is a crucial part of artificial intelligence that complements machine learning. While machine learning, particularly deep learning, has gained prominence, symbolic AI remains essential for its ability to provide **interpretable** and **verifiable** results. This is especially important in contexts where understanding the reasoning behind an AI's decision is as crucial as the decision itself. ::: tcolorbox - **Interpretability**: Symbolic AI systems are inherently more interpretable than black-box models like deep neural networks. The logical rules and knowledge representation in symbolic AI allow for a clear understanding of how a system arrives at a particular conclusion. - **Verifiability**: Formal methods, a cornerstone of symbolic AI, enable rigorous verification of system properties. This ensures that a system meets its specifications and behaves correctly under all circumstances. - **Explainability**: The explicit knowledge representation in symbolic AI facilitates the generation of explanations for the system's behavior, which is crucial for building trust and understanding in AI systems. ::: ## Integration with Machine Learning The future of AI likely involves integrating machine learning with formal methods. This integration can address the **interpretability problem** in machine learning, especially in safety-critical applications like medicine, autonomous driving, and aerospace. By combining the strengths of both approaches, we can create AI systems that are both powerful and trustworthy. ::: tcolorbox - **Addressing Interpretability**: Integrating formal methods with machine learning can help explain the decisions made by complex machine learning models, making them more transparent and understandable. - **Enhancing Reliability**: Formal verification techniques can be used to ensure the reliability and safety of machine learning systems, particularly in critical applications. - **Improving Robustness**: Combining symbolic AI and machine learning can lead to more robust AI systems that are less susceptible to adversarial attacks and unexpected inputs. - **Monitoring**: Integrating monitoring tools with machine learning allows for the detection of system failures by analyzing past data to predict and prevent issues. ::: ## Formal Methods in Cyber Security In cyber security, formal methods are used to model and verify the properties of systems such as protocols, access control mechanisms, and cryptographic algorithms. These methods ensure that systems behave correctly and securely under all conditions, including potential attacks. ::: tcolorbox - **Protocol Verification**: Formal methods are used to verify the correctness and security of communication protocols, ensuring they are free from vulnerabilities and can withstand attacks. - **System Modeling**: Formal methods provide a rigorous way to model security-critical systems, allowing for the identification of potential weaknesses and vulnerabilities. - **Property Specification**: Temporal logic and other formal languages are used to specify security properties, such as confidentiality, integrity, and availability. - **Automated Verification**: Model checking and other automated techniques are employed to verify that systems satisfy specified security properties. ::: ## Practical Applications Formal methods and symbolic AI have a wide range of practical applications across various domains: - **Automated Planning**: Used in various industries for optimizing operations, resource allocation, and scheduling. For example, in logistics, automated planning can optimize delivery routes and schedules. - **Database Querying**: Formalized as a model checking problem, where querying a database can be seen as verifying whether a certain property (the query) holds in a model (the database). - **Protocol Verification**: Ensuring the correctness and security of communication protocols, such as those used in network security and distributed systems. This is crucial for preventing attacks and ensuring reliable communication. # Conclusion This course provides a comprehensive introduction to formal methods and their applications in AI and cyber security. By covering topics from basic logic and automata to advanced algorithms and practical applications, it aims to equip students with the necessary tools to understand, develop, and analyze complex systems. The integration of symbolic AI with machine learning is highlighted as a promising direction for future research, offering the potential to create AI systems that are both powerful and trustworthy. ::: tcolorbox - Formal methods and symbolic AI are essential for creating **interpretable**, **verifiable**, and **explainable** AI systems. - The course covers a wide range of topics, from foundational concepts (logic, automata) to advanced techniques (model checking, automata learning) and applications (automated planning, protocol verification). - Integration of formal methods with machine learning is crucial for future advancements in AI and cyber security, particularly in addressing the interpretability and reliability challenges of machine learning models. ::: ::: tcolorbox - What are some specific examples of systems that can be modeled using automata on infinite objects, beyond the printer example mentioned in the course introduction? - How can temporal logic be used to specify properties of a cyber security protocol, such as ensuring secure key exchange or preventing denial-of-service attacks? - What are the main challenges in integrating formal methods with machine learning, and how can they be addressed? For instance, how can we handle the computational complexity of formal verification when applied to large machine learning models? - How can we automatically learn an automaton from examples of accepted and rejected words? :::