Introduction to Deep Learning

Author

Your Name

Published

January 28, 2025

Introduction

This document serves as an introduction to the Deep Learning course, designed for students at both the University of Udine and the University of Klagenfurt. The course aims to provide a comprehensive understanding of deep learning, encompassing its theoretical underpinnings and practical applications.

The primary objectives of this introductory lecture are to:

Introduce the fundamental concepts of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL).
Emphasize the significance of deep learning across various domains and its impact on modern technology.
Outline the course structure, including office hours, methods of content delivery, and recommended resources.
Discuss the importance of supervised learning, with a focus on regression as a foundational concept.

This course will explore these topics in detail, providing both theoretical knowledge and practical skills.

Administrative Details

Instructor Information

The instructor for this course is [Professor’s Name]. Office hours are scheduled for Tuesdays, but the specific time is to be determined. Students are strongly encouraged to contact the instructor via email to arrange appointments, either in person or online, to ensure availability and address specific queries effectively.

Course Materials

All course materials, including lecture recordings and slides, will be accessible through Microsoft Teams. This platform will serve as the primary channel for communication and content distribution for all students from both Udine and Klagenfurt.

References

The following resources are recommended to enhance your understanding of deep learning:

Lecture Videos from the University of Amsterdam: These videos offer a valuable perspective on deep learning concepts. (Specific link will be provided on Teams)
Stanford’s Deep Learning Course: This course is renowned for its comprehensive approach to deep learning. (Specific link will be provided on Teams)
"Deep Learning" by Goodfellow, Bengio, and Courville: This book is a comprehensive reference, often considered a foundational text in the field. However, it may contain more detail than necessary for beginners.
"Dive into Deep Learning" by Zhang, Lipton, Li, and Smola: This is a more recent and freely available online resource. It is known for its clear explanations and excellent figures, making it a highly recommended resource for this course.

Exam Information

Detailed information regarding the exam format, requirements, and schedule will be provided in the coming weeks. A dedicated presentation outlining all exam-related details will be made available on Microsoft Teams. Please refer to this presentation for the most up-to-date information.

Personal Introduction and Background

The instructor’s academic background includes a master’s degree from the University of Florence, followed by a research visit to Carnegie Mellon University in Pittsburgh. Subsequently, the instructor completed a PhD at the University of Florence, with a research period at Telecom ParisTech in Paris. Postdoctoral research positions were held at the Universities of Florence and Modena. Currently, the instructor serves as an associate professor at the University of Udine and is the director of the Artificial Intelligence Laboratory at the same institution. Furthermore, the instructor has established a startup company focused on artificial intelligence applications.

Note: The instructor highlighted the value of PhD programs for students interested in pursuing research careers or positions in major technology companies. A PhD provides a dedicated period for focused research and is often a prerequisite for advanced roles in both academia and industry. The instructor also emphasized the significance of practical experience and networking, citing a visit to AI labs in Barcelona as an example of valuable opportunities for students.

This diverse background underscores the instructor’s extensive experience in both academic research and practical applications of artificial intelligence.

Lab Members and Contact Information

The Artificial Intelligence Laboratory at the University of Udine is composed of a dynamic team of PhD students and postdoctoral researchers, all actively engaged in cutting-edge research.

For the latest updates on our projects, publications, and activities, please visit our:

Laboratory Website: [Website Link - To be added]
LinkedIn Profile: [LinkedIn Profile Link - To be added]

These platforms are regularly updated with project details, relevant news, and opportunities for collaboration.

Terminology: Artificial Intelligence, Machine Learning, and Deep Learning

Artificial Intelligence (AI)

Definition 1. Artificial Intelligence (AI) encompasses any technique that enables computers to perform tasks that typically require human intelligence.

Historically, AI was defined as mimicking human behavior, but this definition is now considered outdated. Modern AI systems often surpass human capabilities in specific tasks, such as face detection, medical diagnosis, and complex data analysis.

Machine Learning (ML)

Definition 2. Machine Learning (ML) is a subset of AI that utilizes statistical methods to enable machines to improve their performance based on experience (data) without being explicitly programmed.

Example: A face detection system is trained on millions of images so that it can recognize faces in new, unseen images. The system learns from the data rather than being explicitly programmed to identify faces.

Deep Learning (DL)

Definition 3. Deep Learning (DL) is a specialized subset of machine learning that is based on artificial neural networks with multiple layers (deep neural networks).

Deep learning has become increasingly dominant within the field of machine learning due to its flexibility and effectiveness across various domains and data types. Its ability to automatically learn complex features from data has led to significant advancements in areas such as image recognition, natural language processing, and speech synthesis.

Figure 1: Hierarchical Relationship between AI, ML, and DL

Figure 1 illustrates the hierarchical relationship between these three concepts: Deep Learning is a subset of Machine Learning, which in turn is a subset of Artificial Intelligence.

Why Deep Learning?

The widespread adoption of deep learning is primarily due to its versatility and effectiveness across various subfields of machine learning. Unlike many traditional machine learning techniques, deep learning models can be effectively applied to supervised learning, unsupervised learning, and reinforcement learning paradigms. Moreover, deep learning models exhibit remarkable adaptability to different data types, including images, text, and audio, making them a powerful tool for a wide range of applications.

Figure 2: Deep Learning’s Applicability Across Machine Learning Subfields

As illustrated in Figure 2, deep learning models (represented by the orange circle) can be applied across the different subfields of machine learning, demonstrating their broad applicability and versatility. This contrasts with many traditional machine learning techniques, which are often specialized for a specific subfield or data type.

This adaptability and broad applicability are key factors contributing to the current prominence of deep learning in the field of artificial intelligence.

Supervised Learning

Supervised learning is a fundamental area within machine learning where models learn from labeled data. In this paradigm, the goal is to train a model to learn a function that accurately maps input data to corresponding desired outputs. This learning process is guided by the provided labels, which indicate the correct output for each input.

Definition 4. In supervised learning, the training data consists of pairs of inputs ($x$) and their corresponding desired outputs ($y$). The primary objective is to learn a function $f$ such that $f(x) \approx y$ for new, unseen inputs.
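Definition 4 can be made concrete with a very small sketch. The lecture does not prescribe any particular learning method here, so the nearest-neighbor rule below is only an assumed stand-in chosen for brevity, and the training pairs are hypothetical toy values. The helper name fit_nearest_neighbor is likewise illustrative.

```python
# Toy labeled pairs (x, y); the values are hypothetical and only for illustration.
training_pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]


def fit_nearest_neighbor(pairs):
    """Return a function f that predicts the y of the closest training x."""
    def f(x):
        # Pick the training pair whose input is closest to the query x.
        nearest_x, nearest_y = min(pairs, key=lambda pair: abs(pair[0] - x))
        return nearest_y
    return f


f = fit_nearest_neighbor(training_pairs)
prediction = f(2.2)  # 2.2 is a new input that does not appear in the training pairs
print(prediction)
```

The point of the sketch is the interface rather than the method: labeled pairs go in, and a function f comes out that can be evaluated on inputs it has never seen.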

Regression

Regression is a specific type of supervised learning where the output variable is a continuous real number. This is in contrast to classification tasks, where the output is a discrete category.

Example: Predict the price of an apartment from features such as square footage, the number of bedrooms, and the number of parking spots.

Input ($x$): Features of an apartment, such as square footage, number of bedrooms, and parking spots. These features are typically represented as numerical values.
Output ($y$): The actual price of the apartment, which is a real number. The model's prediction of this price is written $\hat{y}$.

Training Data: To train a regression model, we need a dataset of multiple apartments, each with known features and its actual price. This dataset is the ‘experience’ from which the model learns.

Mathematical Formulation: Let $x$ represent the input features of an apartment, and let $y$ represent the actual price of that apartment. The goal of regression is to find a function $f$ such that: \[f(x) = \hat{y} \approx y\] where $\hat{y}$ is the predicted price of the apartment, which should be close to the actual price $y$. The function $f$ is learned from the training data and is then used to predict the price of new, unseen apartments.

Table 1: Example of Training Data for Apartment Price Prediction
Apartment	Square Footage	Bedrooms	Parking Spots	Price
1	1500	3	2	300,000
2	1200	2	1	250,000
3	1800	4	2	350,000
...	...	...	...	...

Table 1 provides a sample of the training data used to train a regression model for predicting apartment prices. Each row represents a different apartment with its corresponding features and price.
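As a rough illustration of how a function $f$ could be learned from the rows in Table 1, the sketch below fits a straight line to a single feature (square footage) by ordinary least squares. Both the choice of a linear model and the restriction to one feature are assumptions made here for simplicity; the lecture does not specify a model. The 1600-square-foot apartment is a hypothetical query, and the names fit_line, w, and b are illustrative.

```python
# Square footage and price copied from the three rows of Table 1.
square_footage = [1500.0, 1200.0, 1800.0]
prices = [300_000.0, 250_000.0, 350_000.0]


def fit_line(xs, ys):
    """Closed-form least-squares fit of y = w * x + b for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    w = covariance / variance   # slope: price change per square foot
    b = mean_y - w * mean_x     # intercept
    return w, b


w, b = fit_line(square_footage, prices)


def f(x):
    """Predicted price (y-hat) for an apartment with x square feet."""
    return w * x + b


predicted_price = f(1600.0)  # hypothetical, unseen apartment
print(round(w, 2), round(b, 2), round(predicted_price, 2))
```

The three sample rows happen to lie exactly on a line, so this fit reproduces their prices with no error. Real data would not behave this way, and a practical model would normally use all of the available features.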

Conclusion

This introductory lecture provided an overview of the fundamental concepts of artificial intelligence (AI), machine learning (ML), and deep learning (DL). We emphasized the significance of deep learning and its wide-ranging applicability across various domains. Additionally, we introduced supervised learning, with a particular focus on regression, as a crucial area within machine learning.

Key Takeaways:

Deep learning is a powerful subset of machine learning based on artificial neural networks with multiple layers.
Deep learning models can be applied to diverse data types and across various machine learning subfields, including supervised, unsupervised, and reinforcement learning.
Supervised learning involves training models using labeled data to learn a mapping from inputs to desired outputs.
Regression is a type of supervised learning where the output is a continuous real number.

Follow-up Questions:

How can we effectively evaluate the performance of a regression model?
What are some other examples of supervised learning tasks beyond regression?
What are the limitations of supervised learning and when might other approaches be more suitable?

These follow-up questions are designed to encourage further thought and exploration of the topics covered in this introductory lecture.

Next Lecture: The next lecture will explore the architecture of neural networks in greater detail. We will discuss different types of activation functions and their impact on network behavior. Furthermore, we will introduce the concept of backpropagation and its critical role in training deep learning models. The specific date and time will be announced on Microsoft Teams.
