
Embedded multimodal scene recognition (Reconnaissance de scènes multimodale embarquée)

Abstract : Context: This PhD work lies within the fields of Ambient Intelligence and (Mobile) Context/Scene Awareness. The project originated at the company ST-Ericsson, where it was framed as the need to develop and embed a “context server” on the smartphone that would collect context information and provide it to the applications that require it. One use case was given for illustration: when someone is in a meeting and receives a call, the smartphone understands the current scene (a meeting at work) and acts automatically, in this case by switching to vibrate mode so as not to disturb the meeting. The main problems are i) proposing a definition of what a scene is and which examples of scenes suit the use case, ii) acquiring a corpus of data that can be exploited with machine-learning approaches, and iii) proposing algorithmic solutions to the problem of scene recognition.

Data collection: A review of existing databases showed that none met the criteria I had set (long continuous recordings, synchronized multi-source recordings that necessarily include audio, relevant labels). I therefore developed an Android application for collecting data. The application, called RecordMe, has been successfully tested on 10+ devices running Android 2.3 and 4.0. It has been used in 3 different campaigns, including the one on scenes. This resulted in 500+ hours of recordings from 25+ volunteers, mostly in the Grenoble area but also abroad (Dublin, Singapore, Budapest). Both the application and the collection protocol include features that protect the volunteers' privacy: for instance, raw audio is not saved (MFCCs are saved instead), and sensitive strings (GPS coordinates, device IDs) are hashed on the phone.

Scene definition: The study of existing work on scene recognition, together with the analysis of the annotations provided by the volunteers during data collection, allowed me to propose a definition of a scene. A scene is defined as a generalisation of a situation, composed of a place and an action performed by one person (the smartphone owner). Examples of scenes include taking transportation, being in a work meeting, and walking in the street. This composition makes it possible to provide different kinds of information about the current scene. However, the definition is still too generic, and I think it could be completed with additional information, integrated as new elements of the composition.

Algorithmics: I performed experiments with both supervised and unsupervised machine-learning techniques. The supervised experiments address classification. The method is quite standard: first find relevant descriptors of the data using an attribute selection method, then train and test several classifiers (in my case, J48 decision trees, Random Forests, GMMs, HMMs, and DNNs). I also tried a two-stage system in which a first stage of classifiers is trained to identify intermediate concepts, and their predictions are then merged to estimate the most likely scene. The unsupervised part of the work aimed at extracting information from the data without labels. For this purpose, I applied a bottom-up hierarchical clustering based on the EM algorithm to acceleration and audio data, taken separately and together. One of the results is that acceleration data separate into groups according to the amount of agitation.
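
The two-stage idea described under Algorithmics can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the intermediate concepts are the place and the action from the scene definition, that each first-stage classifier outputs a probability per label, and that the predictions are merged by multiplying the two probabilities. The abstract does not specify the merging rule, and the scene table, label names, and numbers below are hypothetical.

```python
# Hypothetical scene inventory: each scene is a (place, action) pair,
# following the abstract's definition of a scene.
SCENES = {
    ("office", "meeting"): "work meeting",
    ("street", "walking"): "walking in the street",
    ("vehicle", "sitting"): "taking transportation",
}


def fuse(place_probs, action_probs):
    """Return the most likely scene and its score.

    Assumption: the two first-stage classifiers are treated as
    independent, so a scene's score is the product of the probability
    of its place and the probability of its action. This merging rule
    is illustrative only.
    """
    scores = {}
    for (place, action), scene in SCENES.items():
        # Labels missing from a classifier's output count as probability 0.
        scores[scene] = place_probs.get(place, 0.0) * action_probs.get(action, 0.0)
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up first-stage outputs (not results from the thesis).
place_probs = {"office": 0.7, "street": 0.2, "vehicle": 0.1}
action_probs = {"meeting": 0.6, "walking": 0.3, "sitting": 0.1}

scene, score = fuse(place_probs, action_probs)
print(scene, round(score, 2))
```

Keeping the first-stage outputs as probabilities rather than hard labels lets the merging step weigh an uncertain place estimate against a confident action estimate; other merging rules (for example a second trained classifier) would fit the same interface.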

Cited literature : 61 references
Contributor : Abes Star
Submitted on : Wednesday, April 6, 2016 - 3:16:07 PM
Last modification on : Thursday, November 19, 2020 - 1:01:55 PM
Long-term archiving on : Thursday, July 7, 2016 - 4:42:02 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01298709, version 1



David Blachon. Reconnaissance de scènes multimodale embarquée. Intelligence artificielle [cs.AI]. Université Grenoble Alpes, 2016. Français. ⟨NNT : 2016GREAM001⟩. ⟨tel-01298709⟩


