AI Lab Projects Overview

Author

Your Name

Published

January 28, 2025

Introduction

This document provides an overview of the projects undertaken by the AI Lab at the University of Udine. The projects are grouped into clusters by topic: computer vision, multimedia retrieval, agriculture, healthcare, and digital humanities. The aim is to showcase the diverse applications of AI techniques and the interdisciplinary nature of the research conducted at the lab, and to illustrate how key concepts and techniques from the course are applied to solve concrete problems across these domains.

  • To present a comprehensive overview of the AI Lab’s projects.

  • To demonstrate the practical application of AI techniques in real-world scenarios.

  • To highlight the interdisciplinary nature of the research conducted at the AI Lab.

  • To showcase the diverse applications of AI in fields such as computer vision, multimedia, agriculture, healthcare, and digital humanities.

  • To provide students with insights into potential research and publication opportunities.

Computer Vision and Multimedia

Cross-Modal Retrieval for 3D Scenes

This project focuses on using text queries to rank 3D scenes, analogous to how search engines rank web pages or videos. The 3D scenes include detailed models of apartments and museums. These scenes are populated with various elements such as furniture, paintings, and videos, each of which can influence the relevance of the scene to specific user queries. The core idea is to retrieve and rank these complex 3D environments based on their relevance to textual descriptions, effectively creating a search engine for virtual spaces.

  • Objective: Develop a system to rank 3D scenes based on their relevance to text queries.

  • Data: 3D scenes representing apartments and museums, including associated multimedia elements.

  • Techniques: Cross-modal retrieval, relevance ranking, multimedia content analysis.

  • Team Members: Alex Falcon, Ali.

  • Status: Ongoing, with significant progress and opportunities for publication.

Cross-Modal Retrieval System: Text queries are used to rank 3D scenes.
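To make the ranking step concrete, below is a minimal sketch of cross-modal ranking under the assumption that a text encoder and a scene encoder produce embeddings in a shared space; the encoders, embedding size, and data are placeholders, not the project's actual models.

```python
# Minimal sketch: rank 3D scenes against a text query by cosine similarity.
# Assumes each scene is summarized by an embedding aggregated from its
# elements (furniture, paintings, videos); the encoders are placeholders.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return b @ a

def rank_scenes(query_emb: np.ndarray, scene_embs: np.ndarray) -> np.ndarray:
    """Return scene indices sorted from most to least relevant."""
    scores = cosine_similarity(query_emb, scene_embs)
    return np.argsort(-scores)

# Toy example with random embeddings standing in for real encoder outputs.
rng = np.random.default_rng(0)
query_emb = rng.normal(size=512)          # e.g. output of a text encoder
scene_embs = rng.normal(size=(10, 512))   # one embedding per 3D scene
print(rank_scenes(query_emb, scene_embs)) # ranked scene indices
```

In practice the scene embedding would aggregate the embeddings of the scene's elements (furniture, paintings, videos) rather than being drawn at random.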

Video Ranking and Generation

This project extends the concept of video ranking, similar to popular platforms like YouTube, but with a deeper focus on understanding the semantic content of both the query and the video. The goal is to improve the accuracy of video retrieval by analyzing the meaning and context of user queries and matching them with the most relevant videos. Additionally, the project involves generating artistic videos to augment the dataset, specifically focusing on videos related to Renaissance paintings. This addresses the scarcity of such content and enhances the diversity of the training data.

  • Objective: Rank videos based on semantic understanding of queries and content, and generate artistic videos to enrich the dataset.

  • Data: A collection of videos, with a particular emphasis on artistic videos related to Renaissance paintings.

  • Techniques: Semantic analysis, video generation using AI, video content evaluation, user needs understanding.

  • Team Members: Alex Falcon, Ali, Gianluca.

  • Status: Ongoing, with opportunities for participation in international challenges and publications at top-tier conferences like CVPR.

Video Ranking System: Text queries are semantically analyzed to rank videos.

3D Model Generation

This project focuses on generating 3D models of complex scenes using a novel approach that leverages the advancements in two-dimensional image generation techniques. Instead of creating 3D models from scratch, which is computationally expensive and often yields suboptimal results, this method starts with generating a 2D image of the desired scene. It then estimates the depth information from this image and uses it to construct a 3D model. By combining multiple such models, the project aims to create detailed and realistic 3D representations of complex environments, such as rooms or entire buildings.

  • Objective: Generate 3D models of complex scenes efficiently and realistically.

  • Techniques: 2D image generation, depth estimation from 2D images, merging of multiple 3D models, leveraging state-of-the-art 2D generation models.

  • Status: Ongoing, with a focus on improving computational efficiency and the quality of the generated 3D models by capitalizing on the superior results of 2D image generation.

3D Model Generation Process: From 2D image generation to 3D model merging.
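The depth-to-3D step can be illustrated with a short back-projection routine: given an estimated depth map and pinhole camera intrinsics, each pixel is lifted to a 3D point. The intrinsics and the constant depth map below are illustrative assumptions, not values from the project.

```python
# Back-project a depth map to a 3D point cloud with a pinhole camera model.
# The intrinsics (fx, fy, cx, cy) are illustrative assumptions.
import numpy as np

def depth_to_point_cloud(depth: np.ndarray, fx: float, fy: float,
                         cx: float, cy: float) -> np.ndarray:
    """Convert an HxW depth map (metres) into an (H*W, 3) point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Toy depth map standing in for the output of a monocular depth estimator.
depth = np.full((480, 640), 2.0)
cloud = depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=320.0, cy=240.0)
print(cloud.shape)  # (307200, 3)
```

Merging several such point clouds into a room- or building-scale model additionally requires registering them in a common coordinate frame, for example from known camera poses.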

Applications in Agriculture

Carbon Sequestration and Storage Estimation

This project aims to estimate carbon sequestration and storage in forests using remote sensing data. The data includes weather patterns, geomorphological characteristics, and spectral data obtained from satellites. The project combines vision-based techniques (for analyzing spectral images) with tabular data analysis to develop a comprehensive understanding of carbon dynamics in forest ecosystems.

  • Objective: Estimate carbon sequestration and storage in forests accurately.

  • Data: Remote sensing data, including weather data (temperature, rainfall), geomorphological data (slope, elevation), and spectral data from satellites.

  • Techniques: Combination of vision and tabular information processing, convolutional neural networks (CNNs) for image analysis, and potentially other machine learning models for tabular data. The CNNs are used to extract relevant features from the spectral images, which are then combined with tabular data for a holistic analysis.

  • Team Members: Alex Falcon, Beatrice, Maddie.

  • Status: Ongoing, with a recent publication in the journal Ecological Informatics, demonstrating the project’s impact and relevance.
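As a rough illustration of the vision-plus-tabular fusion described above, the following PyTorch sketch feeds a spectral image patch through a small CNN, encodes the tabular weather and geomorphological features with an MLP, and concatenates the two before a regression head. The layer sizes, number of spectral bands, and feature count are illustrative assumptions, not the project's actual architecture.

```python
# Minimal sketch of a two-branch model: a CNN for spectral image patches and
# an MLP for tabular weather/geomorphological features, fused for regression.
# Layer sizes and the number of spectral bands are illustrative assumptions.
import torch
import torch.nn as nn

class CarbonFusionNet(nn.Module):
    def __init__(self, n_bands: int = 4, n_tabular: int = 8):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(n_bands, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.tabular = nn.Sequential(nn.Linear(n_tabular, 32), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(32 + 32, 64), nn.ReLU(),
                                  nn.Linear(64, 1))  # carbon estimate

    def forward(self, image: torch.Tensor, tabular: torch.Tensor):
        feats = torch.cat([self.cnn(image), self.tabular(tabular)], dim=1)
        return self.head(feats)

model = CarbonFusionNet()
image = torch.randn(2, 4, 64, 64)   # batch of spectral patches
tabular = torch.randn(2, 8)         # weather + geomorphological features
print(model(image, tabular).shape)  # torch.Size([2, 1])
```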

Grapevine Phenological Stage Prediction

This project focuses on predicting the phenological stages of grapevines, which is crucial for optimizing vineyard management practices. The prediction is based on a combination of weather data, geomorphological information, and satellite imagery. The main challenge lies in the limited availability of labeled data, as obtaining accurate phenological stage labels requires expert assessment and is thus expensive and time-consuming.

Problem 1 (Phenological Stage Prediction). Predict the phenological stages of grapevines from weather, geomorphological, and satellite data.

  • Objective: Predict grapevine phenological stages (e.g., flowering, fruit set, ripening) accurately and efficiently.

  • Data: Weather data (temperature, precipitation, solar radiation), geomorphological data (soil type, slope), and satellite data (spectral indices related to vine growth).

  • Techniques: Semi-supervised learning to leverage the limited labeled data, pseudo-labeling to generate synthetic labels for unlabeled data, and potentially deep learning models for integrating different data modalities.

  • Status: Ongoing, with a paper submitted to a reputable journal, indicating the novelty and potential impact of the research.

Algorithm for Pseudo-Labeling:

Input: limited labeled data \(D_L = \{(x_i, y_i)\}_{i=1}^l\); unlabeled data \(D_U = \{x_j\}_{j=1}^u\)

Output: augmented labeled data \(D_{aug}\) and a model \(M'\) trained on it

  1. Train an initial model \(M\) on \(D_L\).

  2. Predict pseudo-labels for \(D_U\): \(\hat{y}_j = M(x_j)\) for each \(x_j \in D_U\).

  3. Form the pseudo-labeled set \(D_P = \{(x_j, \hat{y}_j)\}_{j=1}^u\).

  4. Combine labeled and pseudo-labeled data: \(D_{aug} = D_L \cup D_P\).

  5. Train a new model \(M'\) on \(D_{aug}\) and return \(M'\).
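A minimal scikit-learn sketch of this loop is shown below, with the common refinement of keeping only confident pseudo-labels; the logistic-regression base model, the 0.9 threshold, and the synthetic data are illustrative choices, not the project's.

```python
# Minimal sketch of pseudo-labeling with scikit-learn. The base classifier
# and the confidence threshold are illustrative choices, not the project's.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_l, y_l = X[:50], y[:50]        # small labeled set D_L
X_u = X[50:]                     # unlabeled set D_U (labels withheld)

# 1. Train an initial model M on D_L.
model = LogisticRegression(max_iter=1000).fit(X_l, y_l)

# 2. Predict pseudo-labels for D_U, keeping only confident predictions.
proba = model.predict_proba(X_u)
confident = proba.max(axis=1) >= 0.9
y_pseudo = proba.argmax(axis=1)[confident]

# 3.-4. Combine D_L with the pseudo-labeled subset D_P.
X_aug = np.vstack([X_l, X_u[confident]])
y_aug = np.concatenate([y_l, y_pseudo])

# 5. Train the new model M' on D_aug.
model_aug = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
print(f"pseudo-labeled samples used: {confident.sum()}")
```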

Gene Discovery for Iron Management in Plants

This project aims to discover new gene sequences related to iron absorption, maintenance, and utilization in plants. This is a crucial area of research for improving crop yield and nutritional value, particularly in iron-deficient soils. The project employs AI techniques to analyze large datasets of gene sequences and identify novel sequences that play a significant role in iron management.

  • Objective: Discover new gene sequences involved in iron absorption, maintenance, and transport in plants.

  • Data: Gene sequences from various plant species, potentially combined with experimental data on gene expression under different iron conditions.

  • Techniques: Clustering to group similar gene sequences, classification to identify genes with specific functions related to iron management, and potentially other bioinformatics techniques like sequence alignment and motif discovery.

  • Team Members: Ali.

  • Status: Recently started (three weeks ago), indicating an exploratory phase with significant potential for groundbreaking discoveries.

Plant Coverage Estimation and Annotation Tool

This project involves developing an innovative tool for estimating plant coverage in agricultural fields using image segmentation techniques. Accurate plant coverage estimation is essential for monitoring crop health, optimizing resource allocation, and assessing the impact of weeds or other competing species. The tool not only provides automated plant coverage estimates but also supports manual annotation with AI assistance, significantly improving the efficiency of data labeling.

  • Objective: Develop an automated tool for accurate plant coverage estimation and create an AI-assisted annotation tool for efficient data labeling.

  • Data: Images of agricultural fields, potentially including different crop types, growth stages, and varying lighting conditions.

  • Techniques: Image segmentation using state-of-the-art deep learning models, specifically large segmentation models that can generalize well to unseen data. The AI assistance in the annotation tool leverages these models to suggest annotations, reducing manual effort.

  • Team Members: Alex Falcon, Beatrice.

  • Status: Ongoing, with promising qualitative results indicating that approximately 90% of the pixels in the annotated images were annotated by the AI, demonstrating the effectiveness of the AI assistance.

Plant Coverage Estimation and Annotation Tool Workflow
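Assuming the segmentation model outputs a binary plant/background mask, both the coverage estimate and the AI-assistance figure quoted above reduce to simple pixel ratios, as in the sketch below; the masks are toy placeholders for real model and annotation outputs.

```python
# Minimal sketch: plant coverage from a binary segmentation mask, plus the
# share of pixels in which a human annotator kept the AI-proposed labels.
# The masks here are toy placeholders for real model/annotation outputs.
import numpy as np

def plant_coverage(mask: np.ndarray) -> float:
    """Fraction of pixels labeled as plant (mask values in {0, 1})."""
    return float(mask.mean())

def ai_assist_share(ai_mask: np.ndarray, final_mask: np.ndarray) -> float:
    """Fraction of pixels where the final annotation matches the AI proposal."""
    return float((ai_mask == final_mask).mean())

rng = np.random.default_rng(0)
ai_mask = rng.integers(0, 2, size=(256, 256))        # AI-proposed mask
final_mask = ai_mask.copy()
flip = rng.random(ai_mask.shape) < 0.1               # ~10% human corrections
final_mask[flip] = 1 - final_mask[flip]

print(f"coverage: {plant_coverage(final_mask):.2%}")
print(f"pixels kept from AI: {ai_assist_share(ai_mask, final_mask):.2%}")
```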

Fish Health Monitoring and Prediction

This project focuses on monitoring the health of fish in aquaculture farms using a combination of sensor data and image analysis. The primary goal is to develop a predictive system that can identify early signs of diseases and help fish farmers take preventive measures. This is particularly important for minimizing economic losses and ensuring the sustainability of fish farming. The project specifically addresses the Red Mark Syndrome, a disease that is not yet well-studied in the literature.

  • Objective: Monitor fish health in real-time, predict the onset of diseases, and develop preventive strategies based on environmental conditions.

  • Data: Sensor data (water temperature, salinity, oxygen levels), environmental data (weather conditions), and image data of fish, particularly focusing on detecting Red Mark Syndrome.

  • Techniques: Object detection using YOLO for identifying red marks on fish in images, machine learning models like Random Forests and SVMs for analyzing sensor and environmental data, and potentially time-series analysis for predicting disease outbreaks.

  • Team Members: Beatrice, Alex Falcon.

  • Status: Ongoing, facing challenges due to the limited availability of image data (only 40 images of fish with Red Mark Syndrome and no images of healthy fish). This highlights the need for data augmentation and potentially transfer learning from other fish diseases.

  • Note: The project underscores the importance of traditional machine learning techniques when deep learning is not feasible due to data limitations. It also emphasizes the need for creative data acquisition strategies, such as finding similar datasets or collecting images of healthy fish from online sources.
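As an illustration of the traditional-ML route mentioned in the note above, the sketch below trains a Random Forest on tabular sensor and environmental features to flag at-risk conditions; the synthetic features and labels are placeholders, not project data.

```python
# Minimal sketch: a Random Forest on tabular sensor/environmental features
# (water temperature, salinity, oxygen, ...) to flag at-risk conditions.
# The synthetic data below is a placeholder, not project data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))                 # temp, salinity, oxygen, weather
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=400) > 1).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```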

Soil Texture Estimation

This project focuses on estimating soil texture, specifically the proportions of sand, clay, and silt, based on measurements of gamma-ray radiation and electromagnetic impedance. Accurate soil texture estimation is crucial for various agricultural applications, including irrigation management, fertilizer application, and crop suitability assessment.

Problem 2 (Soil Texture Estimation). Estimate the proportions of sand, clay, and silt in soil from gamma-ray radiation and electromagnetic impedance measurements.

  • Objective: Estimate soil texture (sand, clay, silt content) accurately and efficiently.

  • Data: 340 soil samples with corresponding gamma-ray radiation and electromagnetic impedance measurements.

  • Techniques: A multi-layer perceptron (MLP) with a single hidden layer of 100 neurons was found to be the most effective model, outperforming a linear regressor; a minimal sketch of this setup follows the list. This suggests that the relationship between the input measurements and soil texture is non-linear.

  • Team Members: Beatrice.

  • Status: Completed, with the MLP model demonstrating good performance in predicting soil texture.
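Below is a minimal scikit-learn sketch of the described setup: an MLP with a single 100-neuron hidden layer predicting the three texture fractions. The synthetic features stand in for the real gamma-ray and impedance measurements of the 340 samples.

```python
# Minimal sketch of the described model: an MLP with one hidden layer of 100
# neurons predicting sand/clay/silt fractions from gamma-ray and impedance
# features. Synthetic data stands in for the 340 real soil samples.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(340, 6))                    # gamma-ray + impedance features
raw = rng.dirichlet(alpha=[2, 2, 2], size=340)   # sand, clay, silt proportions
y = raw + 0.05 * X[:, :3]                        # weak synthetic dependence on X

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(100,), max_iter=2000,
                                   random_state=0))
model.fit(X_tr, y_tr)
print("R^2 on held-out samples:", model.score(X_te, y_te))
```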

Crop Yield Prediction

This project involves predicting crop yields based on an exceptionally comprehensive dataset spanning 50 years of daily field management practices. This dataset, collected since 1974, includes detailed information on irrigation, fertilization, and other treatments applied to six different fields. The project represents a unique opportunity to analyze long-term trends and develop highly accurate predictive models for crop yields.

  • Objective: Predict crop yields based on historical field management data.

  • Data: A unique and extensive dataset covering 50 years of daily field management data for six fields, including information on irrigation, fertilization, and other treatments. The data is stored in a well-structured database with around 10 tables.

  • Techniques: Given the tabular nature and size of the data, the modeling techniques have not yet been decided.

  • Team Members: Beatrice.

  • Status: Ongoing, with access to a remarkable historical dataset that has the potential to revolutionize crop yield prediction.

Healthcare Applications

Nutritional Information Prediction from Food Images

This project aims to develop a tool capable of predicting nutritional information, such as calories, fat, protein, and carbohydrate content, directly from images or videos of food. This tool is envisioned to support dietary assessment and potentially aid in the management of diet-related health conditions.

  • Objective: Accurately predict nutritional information (calories, fat, protein, carbs) from food images or videos.

  • Data: A dataset of images and videos specifically focusing on Italian food, addressing the nuances and differences in Italian cuisine compared to American food habits and nutritional databases.

  • Techniques:

    • Transfer Learning: Leveraging pre-trained models on large datasets (e.g., ImageNet) and fine-tuning them on the Italian food dataset to adapt to the specific characteristics of Italian dishes.

    • Convolutional Neural Networks (CNNs): Employing CNNs as the backbone architecture for image feature extraction and processing. Different architectures like ResNet, InceptionNet, and Vision Transformers are being considered.

    • Depth-Supported Prediction: Incorporating depth information, either estimated from images or provided as additional input, to improve the accuracy of volume and mass estimation, which in turn enhances the accuracy of nutritional content prediction.

  • Team Members: Alex Falcon.

  • Status: Ongoing, with active experimentation on transfer learning strategies and the development of novel modeling approaches to improve prediction accuracy.

  • Note: The project specifically addresses the differences between US and Italian nutritional databases and food habits, ensuring the tool’s relevance and accuracy for Italian users.

Nutritional Information Prediction System
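A minimal PyTorch/torchvision sketch of the transfer-learning setup described above: a pre-trained ResNet backbone with a new four-output regression head (calories, fat, protein, carbohydrates). Freezing the backbone, the choice of ResNet-18, and the random tensors are illustrative assumptions, not the project's actual configuration.

```python
# Minimal sketch of transfer learning for nutrition regression: a pre-trained
# ResNet backbone with a new 4-output head (calories, fat, protein, carbs).
# Freezing the backbone and the head size are illustrative choices.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in backbone.parameters():
    p.requires_grad = False                    # fine-tune only the new head
backbone.fc = nn.Linear(backbone.fc.in_features, 4)

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
criterion = nn.MSELoss()

# One toy training step on random tensors standing in for food images.
images = torch.randn(8, 3, 224, 224)
targets = torch.rand(8, 4)                     # normalized nutrition values
loss = criterion(backbone(images), targets)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```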

Fact Checking and Fake News Detection

This project focuses on the critical task of detecting fake news and unverifiable claims, particularly in the context of multimodal content where both text and images are used to convey information. The project aims to develop sophisticated models that can analyze the complex interplay between textual and visual elements to identify inconsistencies, manipulations, and potentially harmful misinformation.

  • Objective: Develop robust methods for detecting fake news and unverifiable claims, with a specific focus on multimodal content that combines text and images.

  • Data: Multimodal content (text and images) collected from various social media platforms, encompassing a wide range of topics and potential misinformation.

  • Techniques:

    • Multimodal Analysis: Developing techniques to analyze both the textual and visual components of social media posts and their interrelationships. This includes identifying discrepancies between the message conveyed by the text and the image.

    • Graph Neural Networks (GNNs): Utilizing GNNs to incorporate social network information into the analysis. By modeling the relationships between users and their interactions, GNNs can help identify patterns of misinformation spread and potentially identify sources of fake news (a minimal GNN sketch follows this list).

    • Fine-grained Output: Creating models that provide nuanced outputs, categorizing different types of mismatches between text and images, and potentially identifying the intent behind the misinformation (e.g., satire, propaganda).

  • Team Members: Beatrice.

  • Status: Ongoing, with a focus on combining textual, visual, and social network information to improve the accuracy and robustness of fake news detection. The project also involves collaboration with Sapienza University of Rome to leverage their expertise in social network analysis.
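Below is a minimal PyTorch Geometric sketch of the GNN component referenced in the list above: a two-layer GCN that classifies post nodes from content features together with the user-interaction graph. The node features, edges, and labels are random placeholders, not the project's data.

```python
# Minimal sketch of the GNN component: a two-layer GCN classifying post nodes
# (fake vs. real) from content features plus the user-interaction graph.
# The node features, edges and labels below are random placeholders.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class PostGCN(torch.nn.Module):
    def __init__(self, in_dim: int, hidden: int = 32, classes: int = 2):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

num_nodes, in_dim = 100, 64
x = torch.randn(num_nodes, in_dim)                  # text + image features
edge_index = torch.randint(0, num_nodes, (2, 400))  # user-interaction edges
labels = torch.randint(0, 2, (num_nodes,))

model = PostGCN(in_dim)
logits = model(x, edge_index)
loss = F.cross_entropy(logits, labels)
print(logits.shape, float(loss))
```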

Digital Humanities

Automatic Classification of Legislative Documents

This project addressed the challenge of automatically classifying a large corpus of legislative documents using the EuroVoc thesaurus, a multilingual thesaurus maintained by the Publications Office of the European Union. The project was carried out at the request of the Italian Chamber of Deputies to improve the efficiency and consistency of their document classification process.

  • Objective: Develop a tool to automatically classify legislative documents according to the EuroVoc thesaurus, replacing the previous manual classification system.

  • Data: European Parliament documents were used as the primary data source. However, due to the lack of readily available data from the Italian Chamber of Deputies, data scraping techniques were employed to collect relevant documents from the European Parliament’s website.

  • Techniques:

    • Data Scraping: Web scraping techniques were used to extract legislative documents from the European Parliament’s website, as no pre-existing dataset was provided.

    • Data Balancing: The scraped dataset presented challenges related to class imbalance, with many EuroVoc tags being underrepresented. Data balancing techniques were applied to address this issue and ensure the model’s performance across all categories.

    • Fine-tuning Transformer (BERT) Models: The core of the classification system relies on fine-tuning a pre-trained BERT (Bidirectional Encoder Representations from Transformers) model. BERT’s ability to understand the context of words in a sentence made it well suited for this task. The model was fine-tuned on the balanced dataset of European Parliament documents to classify documents according to the EuroVoc tags (a minimal fine-tuning sketch follows this list).

  • Team Members: Alessandro, Gianluca.

  • Status: Successfully completed. The developed tool is currently in production at the Italian Chamber of Deputies, demonstrating the project’s real-world impact and effectiveness. The system is integrated into their website, and the automatically generated EuroVoc classifications are visible on the document pages.
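Below is a minimal Hugging Face Transformers sketch of the fine-tuning step described above, framed as multi-label classification since a document can carry several EuroVoc tags. The checkpoint, tag count, and sample text are illustrative assumptions, not the production configuration.

```python
# Minimal sketch of BERT fine-tuning for multi-label EuroVoc classification
# (a document can carry several EuroVoc tags). The checkpoint, tag count and
# sample text are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

NUM_TAGS = 20  # placeholder; the real EuroVoc descriptor set is far larger
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased",
    num_labels=NUM_TAGS,
    problem_type="multi_label_classification",
)

texts = ["Proposal for a regulation on agricultural subsidies."]
labels = torch.zeros(1, NUM_TAGS)
labels[0, [3, 7]] = 1.0  # this document carries two EuroVoc tags

batch = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
outputs = model(**batch, labels=labels)   # BCE loss via problem_type
outputs.loss.backward()                   # one illustrative training step
print(float(outputs.loss))
```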

Conclusion

The AI Lab at the University of Udine is actively engaged in a wide spectrum of research projects that leverage the power of Artificial Intelligence across diverse domains, including computer vision, multimedia, agriculture, healthcare, and digital humanities. These projects underscore the interdisciplinary nature of AI research, showcasing its potential to address complex real-world challenges and drive innovation in various sectors. The lab’s commitment to applying fundamental AI techniques learned in the course, such as convolutional neural networks, image segmentation, transfer learning, and natural language processing, is evident in the breadth and depth of the projects undertaken. Furthermore, the strong collaborations with other departments within the university and external institutions, both academic and industrial, significantly enrich the research environment and provide invaluable opportunities for students and researchers alike.

  • The AI Lab’s projects serve as compelling examples of the diverse applications of AI in various fields, demonstrating the versatility and transformative potential of AI techniques.

  • Many projects involve interdisciplinary collaborations, highlighting the importance of domain expertise and the synergy that arises from combining AI with other fields of study.

  • Data scarcity is a common challenge encountered in several projects, particularly in agriculture and healthcare. This is addressed through innovative techniques like semi-supervised learning, data augmentation, and transfer learning.

  • Traditional machine learning methods, such as Random Forests and Support Vector Machines (SVMs), remain highly relevant in scenarios where deep learning is not feasible due to data limitations or computational constraints.

  • The lab’s work has resulted in tangible outcomes, including publications in reputable journals and conferences, collaborations with industry partners like Bayer SPA and academic institutions like MIT, and the development of practical tools that are currently being used by external organizations, such as the Italian Chamber of Deputies.

Follow-up Questions/Topics for Next Lecture:

These questions delve into specific challenges and opportunities identified in the presented projects, encouraging further discussion and exploration of advanced AI techniques and their applications.

  • How can we improve the accuracy of nutritional information prediction from food images, especially for complex dishes with multiple ingredients and occlusions? This could involve exploring more sophisticated computer vision techniques, such as 3D reconstruction or incorporating depth information.

  • What are the ethical considerations when analyzing social media data for health-related tasks, particularly concerning user privacy, data security, and the potential for bias or misinterpretation of results? This discussion could delve into topics like informed consent, data anonymization, and the responsible use of AI in healthcare.

  • How can we better address the challenges of limited data in agricultural applications, where obtaining large labeled datasets is often expensive and time-consuming? This could involve exploring techniques like few-shot learning, active learning, or leveraging synthetic data generation.

  • What are the latest advancements in multimodal fact-checking and fake news detection, and how can we improve the robustness of these systems against adversarial attacks or evolving misinformation tactics? This could involve discussing novel architectures for multimodal fusion, incorporating contextual information, and addressing the challenges of explainability and interpretability.

  • How can graph neural networks be effectively used to incorporate social network information into fake news detection models, and what are the challenges in modeling complex social interactions and identifying influential actors or communities involved in spreading misinformation?