
Improving Latent Representations of ConvNets for Visual Understanding

Abstract : For a decade now, convolutional deep neural networks (ConvNets) have produced excellent results in computer vision. To do so, these models transform the input image into a series of latent representations. In this thesis, we work on improving the "quality" of the latent representations of ConvNets for different tasks. First, we regularize those representations to increase their robustness to intra-class variations and thus improve classification performance. To this end, we develop a loss based on information-theoretic measures that decreases the entropy conditional on the class. Then, we propose to structure the information into two complementary latent spaces, which resolves a conflict between the invariance of the representations and the reconstruction task. This structure relaxes the constraint imposed by classical architectures and yields better results in semi-supervised learning. Finally, we address the problem of disentangling, i.e. explicitly separating and representing the independent factors of variation of the dataset. We pursue our work on structuring the latent spaces and use adversarial costs to ensure an effective separation of the information. This improves the quality of the representations and enables semantic image editing.

Cited literature [196 references]
Contributor : ABES STAR
Submitted on : Friday, November 20, 2020 - 5:18:21 PM
Last modification on : Saturday, July 9, 2022 - 3:20:48 AM


Version validated by the jury (STAR)


  • HAL Id : tel-02309812, version 2


Thomas Robert. Improving Latent Representations of ConvNets for Visual Understanding. Artificial Intelligence [cs.AI]. Sorbonne Université, 2019. English. ⟨NNT : 2019SORUS343⟩. ⟨tel-02309812v2⟩


