Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

Zide Meng 1
1 WIMMICS - Web-Instrumented Man-Machine Interactions, Communities and Semantics
CRISAM - Inria Sophia Antipolis - Méditerranée, Laboratoire I3S - SPARKS - Scalable and Pervasive softwARe and Knowledge Systems
Abstract : We propose an approach to detect topics, overlapping communities of interest, expertise, trends and activities in user-generated content sites, in particular in question-answering forums such as StackOverflow. We first describe QASM (Question & Answer Social Media), a system based on social network analysis that manages the two main resources of question-answering sites: users and content. We also introduce the QASM vocabulary, used to formalize both the level of interest and the expertise of users on topics. We then propose an efficient approach to detect communities of interest. It relies on another method that enriches questions with a more general tag when needed. We compared three detection methods on a dataset extracted from the popular Q&A site StackOverflow. Our method, based on topic modeling and user membership assignment, is shown to be much simpler and faster while preserving the quality of the detection. We then propose an additional method to automatically generate a label for a detected topic by analyzing the meaning and links of its bag of words. We conduct a user study to compare different algorithms for choosing the label. Finally, we extend our probabilistic graphical model to jointly model topics, expertise, activities and trends. We performed experiments with real-world data to confirm the effectiveness of our joint model, studying users' behaviors and topic dynamics.
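Illustration (not taken from the thesis): the idea of combining topic modeling with user membership assignment to obtain overlapping communities of interest can be sketched as below. This is a minimal sketch under stated assumptions, not the thesis's actual model. The users, tags, topic names, tag weights, the 0.2 threshold and the function names are all hypothetical, and the topic-tag weights are hard-coded here where a real system would learn them from data.

```python
from collections import Counter

# Hypothetical toy data: tags attached to each user's questions and answers.
user_tags = {
    "alice": ["python", "pandas", "numpy", "python"],
    "bob": ["html", "css", "javascript", "python"],
    "carol": ["javascript", "css", "html", "html"],
}

# Hypothetical topic-to-tag weights, standing in for a learned topic model.
topic_tag_weights = {
    "data-science": {"python": 0.5, "pandas": 0.3, "numpy": 0.2},
    "web": {"html": 0.4, "css": 0.3, "javascript": 0.3},
}


def user_memberships(tags, topics):
    """Return the share of a user's tag activity explained by each topic."""
    counts = Counter(tags)
    scores = {
        topic: sum(counts[tag] * weight for tag, weight in weights.items())
        for topic, weights in topics.items()
    }
    total = sum(scores.values())
    if total == 0:
        return {topic: 0.0 for topic in topics}
    return {topic: score / total for topic, score in scores.items()}


def overlapping_communities(users, topics, threshold=0.2):
    """Assign each user to every topic whose membership share reaches the threshold."""
    communities = {topic: set() for topic in topics}
    for user, tags in users.items():
        for topic, share in user_memberships(tags, topics).items():
            if share >= threshold:
                communities[topic].add(user)
    return communities


communities = overlapping_communities(user_tags, topic_tag_weights)
for topic, members in communities.items():
    print(topic, sorted(members))
```

Because a user joins every topic whose share reaches the threshold, one user can belong to several communities at once; in this toy data, "bob" ends up in both.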
Document type :
Theses

Cited literature : 61 references

https://hal.inria.fr/tel-01402612
Contributor : Abes Star
Submitted on : Thursday, February 9, 2017 - 10:45:07 AM
Last modification on : Monday, November 5, 2018 - 3:52:09 PM
Long-term archiving on : Wednesday, May 10, 2017 - 1:16:01 PM

File

2016AZUR4090.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01402612, version 3

Citation

Zide Meng. Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web. Other [cs.OH]. Université Côte d'Azur, 2016. English. ⟨NNT : 2016AZUR4090⟩. ⟨tel-01402612v3⟩
