Deep Inside Visual-Semantic Embeddings

Abstract: Nowadays, Artificial Intelligence (AI) is omnipresent in our society. The recent development of learning methods based on deep neural networks, also called "Deep Learning", has led to significant improvements in visual and textual representation models. In this thesis, we aim to further advance image representation and understanding. Revolving around Visual Semantic Embedding (VSE) approaches, we explore several directions: we present relevant background covering image and textual representation and existing multimodal approaches; we propose novel architectures that further improve the retrieval capability of VSE; and we extend VSE models to novel applications, leveraging embedding models to visually ground semantic concepts. Finally, we delve into the learning process, and in particular the loss function, by learning differentiable approximations of ranking-based metrics.
Submitted on: Monday, October 25, 2021
Version validated by the jury (STAR)


  • HAL Id: tel-03402492, version 1
Martin Engilberge. Deep Inside Visual-Semantic Embeddings. Machine Learning [cs.LG]. Sorbonne Université, 2020. English. ⟨NNT : 2020SORUS150⟩. ⟨tel-03402492⟩