Predicting information diffusion on microblogs

Experiments

2.3.5.1 Binary classification

Multi-class classification

Most important features

2.3.6.2 Multi-class classification

Correlations between features

Predicting the diffusion of brand stories on microblogs

2.4.1 Tweet representation

2.4.4 Experiments and results

4.2.1 Ontology-based information extraction

Knowledge base model: the geographical-festival ontology

Populating the domain ontology

Performance population

Inferring new knowledge

Conclusions

P. Agarwal, R. Vaithiyanathan, S. Sharma, and G. Shroff, Catching the long-tail: extracting local news events from Twitter, Sixth international AAAI conference on weblogs and social media, 2012.

Towards a metaanalysis-based user assistant for analysis processes, International conference on computer science and information technology, 2017.

H. Allcott and M. Gentzkow, Social media and fake news in the 2016 election, 2017.

W. Assaad and J. M. Gomez, Social network in marketing (social media marketing) opportunities and risks. International journal of managing public sector information and communication technologies, vol.2, p.13, 2011.

L. Backstrom, E. Sun, and C. Marlow, Find me if you can: improving geographical prediction with social and spatial proximity, Proceedings of the 19th international conference on world wide web, pp.61-70, 2010.

E. Benson, A. Haghighi, and R. Barzilay, Event discovery in social media feeds, Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, vol.1, pp.389-398, 2011.

B. Han, P. Cook, and T. Baldwin, Geolocation prediction in social media data by finding location indicative words, COLING 2012, 24th International conference on computational linguistics, pp.1045-1062, 2012.

K. Bontcheva, L. Derczynski, A. Funk, M. A. Greenwood, D. Maynard et al., TwitIE: An open-source information extraction pipeline for microblog text, Recent advances in natural language processing, vol.9, pp.83-90, 2013.

A. E. Cano Basave, A. Varga, M. Rowe, M. Stankovic et al., Making sense of microposts (#MSM2013) concept extraction challenge, 2013.

S. Chandra, L. Khan, and F. B. Muhaya, Estimating Twitter user location using social interactions-a content based approach, Privacy, security, risk and trust (PASSAT) and IEEE third international conference on social computing (SocialCom), pp.838-843, 2011.

N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, SMOTE: synthetic minority over-sampling technique, Journal of artificial intelligence research, vol.16, pp.321-357, 2002.

Z. Cheng, J. Caverlee, and K. Lee, You are where you tweet: a content-based approach to geo-locating twitter users, Proceedings of the 19th ACM international conference on information and knowledge management, 2010.

L. Ermakova, A method for short message contextualization: experiments at CLEF/INEX, International conference of the cross-language evaluation forum for european languages, pp.352-363, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01343032

O. Etzioni, M. Cafarella, D. Downey, A. Popescu, T. Shaked et al., Unsupervised named-entity extraction from the web: An experimental study, Artificial intelligence, vol.165, issue.1, pp.91-134, 2005.

C. Fink, C. D. Piatko, J. Mayfield, T. Finin, and J. Martineau, Geolocating blogs from their textual content, AAAI Spring symposium: social semantic web: Where Web 2, 2009.

J. R. Finkel, T. Grenager, and C. Manning, Incorporating non-local information into information extraction systems by gibbs sampling, Proceedings of the 43rd annual meeting on association for computational linguistics, pp.363-370, 2005.

S. Gensler, F. Völckner, Y. Liu-Thompkins, and C. Wiertz, Managing brands in the social media environment, Journal of interactive marketing, vol.27, 2013.

L. Goeuriot, J. Mothe, P. Mulhem, F. Murtagh, and E. Sanjuan, Overview of the CLEF 2016 cultural micro-blog contextualization workshop, International conference of the cross-language evaluation forum for European languages, pp.371-378, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01571620

M. Hall, E. Frank, G. R. Holmes, B. Pfahringer, P. Reutemann et al., The WEKA data mining software: an update, ACM SIGKDD explorations newsletter, vol.11, issue.1, pp.10-18, 2009.

C. Hennessy and A. F. Smeaton, Profiling, assessing and matching personalities active in social media, Irish conference on artificial intelligence and cognitive science, 2016.

J. Hltcoe, Semeval-2013 task 2: Sentiment analysis in Twitter, vol.312, 2013.

Building a knowledge base using microblogs: the case of cultural microblog contextualization collection, Conference and labs of the evaluation forum (CLEF 2016), 2016.

T. B. N. Hoang and J. Mothe, Building a knowledge base using microblogs: the case of festivals and location-based events, Rencontres jeunes chercheurs en recherche d'information (CORIA-RJCRI 2016), p.295, 2016.

T. B. N. Hoang, V. Moriceau, and J. Mothe, Location extraction from tweets (poster), Computational linguistics and intelligent text processing, pp.17-23, 2017.

T. B. N. Hoang and J. Mothe, Predicting information diffusion on Twitter: analysis of predictive features, Journal of Computational Science, 2017.

Can we predict locations in tweets? A machine learning approach (accepted), 2018.

Predicting the diffusion of brand stories in social network (regular paper), Computational linguistics and intelligent text processing, 2018.

T. B. N. Hoang and J. Mothe, Location extraction from tweets, Information processing & management, vol.54, pp.129-144, 2018.

T. B. N. Hoang and J. Mothe, Méthode d'apprentissage pour extraire les localisations dans les MicroBlog, EGC, Atelier extraction et gestion parallèles distribuées des connaissances, pp.22-26, 2018.

T. B. N. Hoang and J. Mothe, Extraction de localisations dans les microBlogs, Gestion et l'Analyse de données Spatiales et Temporelles, 2018.

J. R. Hobbs and F. Pan, An ontology of time for the semantic web, ACM transactions on Asian language information processing, vol.3, issue.1, pp.66-85, 2004.

L. Hong, O. Dan, and B. D. Davison, Predicting popular messages in Twitter, 2011.

Y. Hu, C. Hu et al., Predicting the popularity of viral topics based on time series forecasting, Neurocomputing, vol.210, pp.55-65.

Y. Huang, Z. Liu, and P. Nguyen, Location-based event search in social texts, Computing, networking and communications (ICNC), 2015 international conference on, pp.668-672, 2015.

Y. Ikawa, M. Enoki, and M. Tatsubori, Location inference using microblog messages, Proceedings of the 21st international conference on world wide web.

T.-M. Nguyen, T. Kawamura, H. Nakagawa, Y. Tahara, and A. Ohsuga, Building an earthquake evacuation ontology from Twitter, pp.306-311, 2011.

Z. Ji, A. Sun, G. Cong, and J. Han, Joint recognition and linking of fine-grained locations from tweets, Proceedings of the 25th international conference on world wide web, pp.1271-1281, 2016.

J. Kazama and K. Torisawa, Inducing gazetteers for named entity recognition by large-scale clustering of dependency relations, Proceedings annual meeting of the association of computational linguistics, pp.407-415, 2008.

E. Kontopoulos, C. Berberidis, T. Dergiades, and N. Bassiliades, Ontology-based sentiment analysis of Twitter posts, Expert systems with applications, vol.40, pp.4065-4074, 2013.

V. Krishnan and C. D. Manning, An effective two-stage model for exploiting non-local dependencies in named entity recognition, Proceedings of the 21st international conference on computational linguistics and the 44th annual meeting of the association for computational linguistics, pp.1121-1128, 2006.

O. Kummer and J. Savoy, Feature selection in sentiment analysis, CORIA, Citeseer, 2012.

H. Kwak, C. Lee, H. Park, and S. Moon, What is Twitter, a social network or a news media?, Proceedings of the 19th international conference on world wide web, pp.591-600, 2010.

J. H. Lau and T. Baldwin, An empirical evaluation of doc2vec with practical insights into document embedding generation, Proceedings of the 1st Workshop on representation learning for NLP, 2016.

Q. V. Le and T. Mikolov, Distributed representations of sentences and documents, International conference on machine learning, 2014.

R. Lee and K. Sumiya, Measuring geographical regularities of crowd behaviors for Twitter-based geo-social event detection, Proceedings of the 2nd ACM SIGSPATIAL international workshop on location based social networks, vol.14, pp.337-340, 2010.

C. Li, J. Weng, Q. He, Y. Yao, A. Datta, A. Sun, and B.-S. Lee, TwiNER: named entity recognition in targeted Twitter stream, Proceedings of the 35th international ACM SIGIR conference on research and development in information retrieval, pp.721-730, 2012.

C. Li and A. Sun, Fine-grained location extraction from tweets with temporal awareness, Proceedings of the 37th international ACM SIGIR conference on research & development in information retrieval, pp.43-52, 2014.

J. Lingad, S. Karimi, and J. Yin, Location extraction from disaster-related microblogs.

X. Liu, S. Zhang, F. Wei, and M. Zhou, Recognizing named entities in tweets, Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, vol.1, pp.359-367, 2011.

J. Mahmud, J. Nichols, and C. Drews, Home location identification of Twitter users, ACM transactions on intelligent systems and technology (TIST), vol.5.

W. G. Mangold and D. J. Faulds, Social media: The new hybrid element of the promotion mix, pp.357-365, 2009.

O. Peter and . Munro, Trends in social software. Collaboration and content strategies in-depth research overview, pp.68-77, 2006.

M. Nagarajan, H. Purohit, and A. P. Sheth, A qualitative examination of topical tweet and retweet practices, ICWSM, vol.2, issue.010, 2010.

S. Narayan, S. Prodanovic, M. Fazleh-elahi, and Z. Bogart, Population and Enrichment of Event Ontology using Twitter, Proceedings of the workshop on semantic personalized information management (SPIM) in conjunction with the 7th international conference on language resources and evaluation (LREC), 2010.

K. Nebhi, Ontology-based information extraction from Twitter, Proceedings of the Workshop on information extraction and entity analytics on social media data, COLING 2012, Mumbai (India), pp.17-22, 2012.

Q. Ngo, S. Doan, and W. Winiwarter, Using Wikipedia for extracting hierarchy and building geo-ontology. International journal of Web information systems, 2012.

O. Ozdikis, H. Oğuztüzün, and P. Karagoz, Evidential estimation of event locations in microblogs using the Dempster-Shafer theory, Information processing & management, vol.52, pp.1227-1246, 2016.

B. Pang and L. Lee, A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts, Proceedings of the ACL, 2004.

T. Quack, B. Leibe, and L. Van Gool, World-scale mining of objects and events from community photo collections, Proceedings of the 2008 international conference on Content-based image and video retrieval, pp.47-56, 2008.

L. Ratinov and D. Roth, Design challenges and misconceptions in named entity recognition, Proceedings of the thirteenth conference on computational natural language learning, pp.147-155, 2009.

W. Raynaut, C. Soulé-Dupuy, N. Vallés-Parlangeau, C. Dray, and P. Valet, Characterization of learning instances for evolutionary meta-learning, European conference on machine learning and principles and practice of knowledge discovery in databases (ECML-PKDD 2015), p.198, 2015.

C. Remy, N. Pervin, F. Toriumi, and H. Takeda, Information diffusion on Twitter: everyone has its chance, but all chances are not equal, Signal-image technology & Internet-based systems (SITIS), 2013 international conference on, pp.483-490, 2013.

X. Ren and Y. Zhang, Predicting information diffusion in social networks with users' social roles and topic interests, Information retrieval technology, pp.349-355, 2016.

A. Ritter, S. Clark, and O. Etzioni, Named entity recognition in tweets: an experimental study, Proceedings of the conference on empirical methods in natural language processing, pp.1524-1534, 2011.

A. Roberts, R. J. Gaizauskas, M. Hepple, and Y. Guo, Combining terminology resources and statistical methods for entity recognition: an evaluation, Proceedings of the international conference on language resources and evaluation, 2008.

M. Rogers, C. Chapman, and V. Giotsas, Measuring the diffusion of marketing messages across a social network, vol.14, pp.97-130, 2012.

F. Sabate, J. Berbegal-Mirabent, A. Cañabate, and P. R. Lebherz, Factors influencing popularity of branded content in Facebook fan pages, European management journal, vol.32, issue.6, pp.1001-1011, 2014.

T. Sahni, C. Chandak, N. R. Chedeti, and M. Singh, Efficient Twitter sentiment classification using subjective distant supervision, Communication systems and networks (COMSNETS), 2017 9th international conference on, pp.548-553, 2017.

T. Sakaki, M. Okazaki, and Y. Matsuo, Earthquake shakes Twitter users: real-time event detection by social sensors, 2010.

E. Sanjuan, V. Moriceau, X. Tannier, P. Bellot, and J. Mothe, Overview of the INEX 2012 tweet contextualization track, Conference on multilingual and multimodal information access evaluation, pp.148-160, 2010.

L. Sloan and J. Morgan, Who tweets with their location? Understanding the relationship between demographic characteristics and the use of geoservices and geotagging on Twitter, PLoS ONE, vol.10, issue.11, p.142209, 2015.

B. Suh, L. Hong, P. Pirolli, and E. H. Chi, Want to be retweeted? Large scale analytics on factors impacting retweet in Twitter network, pp.177-184, 2010.

L. Tamine, L. Soulier, L. Ben Jabeur, F. Amblard, C. Hanachi et al., Starbird and L. Palen, Microblogging during two natural hazards events: what Twitter may contribute to situational awareness, Proceedings of the 2003 conference of the North American chapter of the association for computational linguistics on human language technology, vol.1, pp.1079-1088, 2003.

M. Washha, A. Qaroush, and F. Sedes, Leveraging time for spammers detection on Twitter, Proceedings of the 8th international conference on management of digital ecoSystems, pp.109-116, 2016.

K. Watanabe, M. Ochi, M. Okabe, and R. Onai, Jasmine: a real-time local-event detection system based on geolocation information propagated to microblogs, Proceedings of the 20th ACM international conference on information and knowledge management, pp.2541-2544, 2011.

J. Weng and B.-S. Lee, Event detection in Twitter, Proceedings of the fifth international AAAI conference on weblogs and social media, vol.11, pp.401-408, 2011.

B. P. Wing and J. Baldridge, Simple supervised document geolocation with geodesic grids, Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, vol.1, pp.955-964, 2011.

F. Xiong, Y. Liu, Z. Zhang, J. Zhu, and Y. Zhang, An information diffusion model based on retweeting mechanism for online social media, Physics letters A, vol.376, issue.30, pp.2103-2108, 2012.

Z. Yang, J. Guo, K. Cai, J. Tang, J. Li et al., Understanding retweeting behaviors in social networks, Proceedings of the 19th ACM international conference on information and knowledge management, pp.1633-1636, 2010.

B. Yu, M. Chen, and L. Kwok, Toward predicting popularity of social marketing messages, International conference on social computing, behavioral-cultural modeling, and prediction, pp.317-324, 2011.

C. Zhai and J. Lafferty, Model-based feedback in the language modeling approach to information retrieval, Proceedings of the tenth international conference on information and knowledge management, pp.403-410, 2001.

J. Zhang, B. Liu, J. Tang, T. Chen, and J. Li, Social influence locality for modeling retweeting behaviors, IJCAI, vol.13, pp.2761-2767, 2013.

Q. Zhao, P. Mitra, and B. Chen, Temporal and information flow based event detection from social text streams, vol.7, pp.1501-1506, 2007.