RMLDM - Research project
Representation and Machine Learning with Missing Data
PhD student: Richard SERRANO, ED SIS 488 (Science, Engineering, Health)
ABSTRACT
Attributed graphs are a powerful data structure for many image analysis tasks. After segmentation, the nodes of the graph represent the regions of the image and the links represent their relationships. The properties characterizing the regions (colour, texture, size, etc.) define the attributes of the nodes, and the types of relationships between regions (connectivity, distance, …) define the labels of the links. However, due to acquisition defects, many values of these features can be missing or noisy, which degrades the performance of graph-based mining and learning algorithms and of the whole image analysis process (Little, 2019). Deep learning methods, notably the family of Graph Neural Networks, have proven particularly effective at handling graph data and learning node representations, but they rely strongly on the availability and quality of node and link features. In the context of Graph Neural Networks, only a few works have focused on learning embeddings for graphs with missing attributes (Chen, 2020; Yoon, 2018).
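As an illustration of the data structure described above, the following Python sketch builds a small attributed graph in which some node attributes are missing. The class name `RegionGraph`, the attribute names `hue` and `size`, the link label `adjacent` and all values are hypothetical, chosen only for this example. The neighbour-mean fill is a naive baseline included to make the problem concrete; it is not the method developed in the thesis.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class RegionGraph:
    """Attributed graph: one node per segmented region, one labelled link per relationship."""
    # node id -> attribute name -> value (None marks a missing measurement)
    nodes: Dict[int, Dict[str, Optional[float]]] = field(default_factory=dict)
    # (node id, node id) -> relationship label (hypothetical labels)
    links: Dict[Tuple[int, int], str] = field(default_factory=dict)

    def neighbours(self, node: int) -> List[int]:
        """Return the nodes sharing a link with `node` (links are undirected)."""
        out = []
        for a, b in self.links:
            if a == node:
                out.append(b)
            elif b == node:
                out.append(a)
        return out

    def missing_rate(self) -> float:
        """Fraction of attribute values that are missing."""
        values = [v for attrs in self.nodes.values() for v in attrs.values()]
        return sum(v is None for v in values) / len(values)

    def impute_neighbour_mean(self) -> "RegionGraph":
        """Naive baseline: fill each gap with the mean of the values observed
        in linked regions. Gaps with no observed neighbour stay missing."""
        filled = {n: dict(attrs) for n, attrs in self.nodes.items()}
        for n, attrs in self.nodes.items():
            for name, value in attrs.items():
                if value is not None:
                    continue
                observed = [
                    self.nodes[m][name]
                    for m in self.neighbours(n)
                    if self.nodes[m].get(name) is not None
                ]
                if observed:
                    filled[n][name] = sum(observed) / len(observed)
        return RegionGraph(filled, dict(self.links))


# Three regions in a chain; two attribute values were lost at acquisition.
g = RegionGraph(
    nodes={
        0: {"hue": 0.10, "size": 120.0},
        1: {"hue": None, "size": 80.0},
        2: {"hue": 0.30, "size": None},
    },
    links={(0, 1): "adjacent", (1, 2): "adjacent"},
)
filled = g.impute_neighbour_mean()
print(g.missing_rate(), filled.nodes)
```

A fill of this kind uses only the graph structure and ignores the overall distribution of the attributes, which is one reason more principled completion methods are of interest.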
In this thesis, our objectives are twofold. First, we will study the impact of data incompleteness and bias on the algorithms. Second, we will develop methods that fill missing entries with plausible values, so that downstream machine-learning methods perform better on the completed data. In particular, for tabular data, optimal transport (OT) (Muzellec, 2020) has recently been shown to be more efficient than classical imputation methods based on low-rank assumptions (Hastie et al., 2015), iterative random forests (Stekhoven & Bühlmann, 2011) or variational autoencoders (Mattei & Frellsen, 2019; Ivanov et al., 2019). We propose to adapt this approach to graph data using the Fused Gromov-Wasserstein metric (Vayer, 2019).
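To give a sense of how an optimal-transport criterion can rank candidate imputations, here is a minimal Python sketch under strong simplifying assumptions: a single numeric feature and two batches of equal size with uniform weights. In that special case the squared 2-Wasserstein distance is obtained exactly by matching sorted values. The function names and data are hypothetical; this is neither the batch-based approach of the cited tabular work nor the Fused Gromov-Wasserstein adaptation proposed for graphs.

```python
def w2_squared_1d(xs, ys):
    """Exact squared 2-Wasserstein distance between two equal-size 1-D samples
    with uniform weights: the optimal plan matches sorted values."""
    if len(xs) != len(ys):
        raise ValueError("this shortcut assumes equal-size samples")
    return sum((x - y) ** 2 for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


def fill(batch, values):
    """Replace each None in `batch` with the next candidate value, in order."""
    it = iter(values)
    return [next(it) if x is None else x for x in batch]


# Hypothetical data: a fully observed reference batch and a batch with two gaps.
reference = [0.0, 1.0, 2.0, 3.0]
incomplete = [0.5, None, None, 2.5]

# Candidate 1: mean imputation (both gaps get the mean of the observed values).
observed = [x for x in incomplete if x is not None]
mean_value = sum(observed) / len(observed)
cost_mean = w2_squared_1d(fill(incomplete, [mean_value, mean_value]), reference)

# Candidate 2: fills that preserve the spread seen in the reference batch.
cost_spread = w2_squared_1d(fill(incomplete, [0.0, 3.0]), reference)

print(cost_mean, cost_spread)  # 0.25 0.125
```

The lower cost of the second candidate reflects the idea that a good completion should keep the completed batch close, in distribution, to other batches drawn from the same data, whereas mean imputation shrinks the spread.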
ABOUT the RMLDM project - Thesis certified
RESEARCH AXES
Axis #2
KEYWORDS
Image processing, Machine learning, Graph, Missing/noisy data, Data completion, Optimal transport
DURATION - STATUS
01/10/2022 - 30/09/2025 - Ongoing
PhD STUDENT
Richard SERRANO (LabHC)
PROJECT COORDINATOR
Christine LARGERON (LabHC)
COORDINATING LABORATORY
Hubert Curien Laboratory (LabHC)
PARTNER LABORATORIES
Alberta Machine Intelligence Institute (AMII) – University of Alberta, Canada
PARTNER RESEARCHERS
Baptiste JEUDY (LabHC)
Osmar ZAÏANE (AMII)