
Gaze based weakly supervised localization for image classification : application to visual recognition in a food dataset

Xin Wang 1 
1 MLIA - Machine Learning and Information Access
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract : In this dissertation, we discuss how to use human gaze data to improve the performance of weakly supervised learning models in image classification. The work is motivated by the rapid growth of information technology, which has caused the amount of data to be analyzed to grow dramatically. Since the amount of data that humans can annotate cannot keep up with the amount of data itself, current well-developed supervised learning approaches may face bottlenecks in the future. In this context, the use of weak annotations in high-performance learning methods is worth studying. Specifically, we address the problem from two angles. The first is to propose a less time-consuming annotation, human eye-tracking gaze, as an alternative to traditional time-consuming annotations such as bounding boxes. The second is to integrate gaze annotation into a weakly supervised learning scheme for image classification. This scheme uses the gaze annotation to infer the regions containing the target object. A useful property of our model is that it exploits gaze only during training, while the test phase is gaze-free. This property further reduces the demand for annotations. Our models connect these two aspects and achieve competitive experimental results.

Cited literature : 221 references
Contributor : ABES STAR
Submitted on : Monday, November 5, 2018 - 5:05:09 PM
Last modification on : Saturday, July 9, 2022 - 3:25:40 AM
Long-term archiving on : Wednesday, February 6, 2019 - 3:41:35 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01912846, version 1


Xin Wang. Gaze based weakly supervised localization for image classification : application to visual recognition in a food dataset. Human-Computer Interaction [cs.HC]. Université Pierre et Marie Curie - Paris VI, 2017. English. ⟨NNT : 2017PA066577⟩. ⟨tel-01912846⟩


