Event detection and analysis on short text messages

Amosse Edouard 1, 2
2 WIMMICS - Web-Instrumented Man-Machine Interactions, Communities and Semantics
CRISAM - Inria Sophia Antipolis - Méditerranée , Laboratoire I3S - SPARKS - Scalable and Pervasive softwARe and Knowledge Systems
Abstract : In recent years, the Web has shifted from a read-only medium where most users could only consume information to an interactive medium allowing every user to create, share and comment on information. A drawback of social media as an information source is that its texts are often short and informal and lack contextual information. On the other hand, the Web also contains structured Knowledge Bases (KBs) that could be used to enrich user-generated content. This dissertation investigates the potential of exploiting information from Linked Open Data (LOD) KBs to detect, classify and track events on social media, in particular Twitter. More specifically, we address three research questions: i) How to extract and classify messages related to events? ii) How to cluster events into fine-grained categories? and iii) Given an event, to what extent can user-generated content on social media contribute to the creation of a timeline of sub-events? We provide methods that rely on LOD KBs to enrich the context of social media content; we show that supervised models can achieve good generalisation capabilities through semantic linking, thus mitigating overfitting; and we rely on graph theory to model the relationships between named entities (NEs) and the other terms in tweets in order to cluster fine-grained events. Finally, we use in-domain ontologies and local gazetteers to identify relationships between actors involved in the same event, in order to create a timeline of sub-events. We show that enriching the NEs in the text with information provided by LOD KBs improves the performance of both supervised and unsupervised machine learning models.
Document type :
Theses
Cited literature [139 references]

https://tel.archives-ouvertes.fr/tel-01679673
Contributor : Abes Star
Submitted on : Wednesday, January 10, 2018 - 10:27:34 AM
Last modification on : Monday, November 5, 2018 - 3:52:10 PM

File

2017AZUR4079.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01679673, version 1

Citation

Amosse Edouard. Event detection and analysis on short text messages. Other [cs.OH]. Université Côte d'Azur, 2017. English. ⟨NNT : 2017AZUR4079⟩. ⟨tel-01679673⟩
