
Deep representation spaces

Abstract : In recent years, Deep Learning techniques have swept the state-of-the-art of many applications of Machine Learning, becoming the new standard approach for them. The architectures issued from these techniques have been used for transfer learning, which extended the power of deep models to tasks that did not have enough data to fully train them from scratch. This thesis' subject of study is the representation spaces created by deep architectures. First, we study properties inherent to them, with particular interest in dimensionality redundancy and precision of their features. Our findings reveal a strong degree of robustness, pointing the way to simple and powerful compression schemes. Then, we focus on refining these representations. We choose to adopt a cross-modal multi-task problem, and design a loss function capable of taking advantage of data coming from multiple modalities, while also taking into account different tasks associated with the same dataset. In order to correctly balance these losses, we also develop a new sampling scheme that only takes into account examples contributing to the learning phase, i.e., those having a positive loss. Finally, we test our approach on a large-scale dataset of cooking recipes and associated pictures. Our method achieves a 5-fold improvement over the state-of-the-art, and we show that the multi-task aspect of our approach promotes a semantically meaningful organization of the representation space, allowing it to perform subtasks never seen during training, like ingredient exclusion and selection. The results we present in this thesis open many possibilities, including feature compression for remote applications, robust multi-modal and multi-task learning, and feature space refinement.
For the cooking application, in particular, many of our findings are directly applicable in a real-world context, especially for the detection of allergens, finding alternative recipes due to dietary restrictions, and menu planning.
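The sampling idea mentioned in the abstract (averaging only over examples with a positive loss) can be illustrated with a minimal sketch. The abstract does not specify the loss itself, so the hinge-style triplet loss, the margin value, and all function names below are assumptions made for illustration, not the thesis's actual implementation.

```python
# Illustrative sketch only: the triplet hinge loss, the margin, and the
# function names are assumptions, not taken from the thesis.

def triplet_losses(pos_dist, neg_dist, margin=0.3):
    """Per-example hinge loss: max(0, margin + d(anchor, pos) - d(anchor, neg))."""
    return [max(0.0, margin + dp - dn) for dp, dn in zip(pos_dist, neg_dist)]


def mean_over_active(losses):
    """Average only over examples with a strictly positive loss."""
    active = [value for value in losses if value > 0.0]
    if not active:
        return 0.0  # no example contributes to learning
    return sum(active) / len(active)


def mean_over_all(losses):
    """Plain average over every example, including zero-loss ones."""
    return sum(losses) / len(losses) if losses else 0.0


# Hypothetical distances for four (anchor, positive, negative) triplets.
pos = [0.2, 0.9, 0.1, 0.5]
neg = [0.9, 0.4, 0.8, 0.6]
losses = triplet_losses(pos, neg)

print(mean_over_all(losses))     # diluted by the two zero-loss triplets
print(mean_over_active(losses))  # averaged over the two active triplets only
```

In this toy batch, two of the four triplets already satisfy the margin, so the plain average is half the size of the average over active examples. Normalizing by the number of active examples keeps the loss scale from shrinking as more triplets become satisfied during training.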

Cited literature : 94 references
Contributor : ABES STAR
Submitted on : Monday, March 9, 2020 - 5:56:10 PM
Last modification on : Saturday, July 9, 2022 - 3:18:52 AM
Long-term archiving on : Wednesday, June 10, 2020 - 4:32:30 PM


Version validated by the jury (STAR)


  • HAL Id : tel-02503198, version 1


Micael Carvalho. Deep representation spaces. Artificial Intelligence [cs.AI]. Sorbonne Université, 2018. English. ⟨NNT : 2018SORUS292⟩. ⟨tel-02503198⟩


