
Brouillard de pollution en Chine. Analyse sémantique différentielle de corpus institutionnels, médiatiques et de microblogues

Abstract : Air pollution has become an increasingly serious problem in China. A growing number of news articles and microblog posts (weibo in Chinese, the equivalent of tweets), published on government and media websites, social networks, blogs, forums, etc., discuss the issue of «雾霾» (wumai, Chinese for smog) from several angles: political, ecological, economic, sociological, health-related, etc. The semantics of the themes addressed in these texts differ significantly according to textual genre. Our research has a twofold objective: on the one hand, to identify the different themes in a purpose-built digital corpus relating to wumai; on the other hand, to interpret the semantics of these themes differentially. First, we collect textual data written in Chinese and related to wumai. These news articles and weibo posts, drawn from three traditional Chinese sources and one social network, are divided into four sub-corpora by genre. Second, we build our corpus through a series of data processing steps: data cleaning, word segmentation, normalization, POS tagging, benchmarking and data organization. We study the characteristics of the four sub-corpora through a series of discriminating variables (hyperstructural, lexical, semiotic, rhetorical, modal and syntactic) distributed at the infratextual and intratextual levels. Then, based on the characteristics of each textual genre, we identify the main themes presented in each sub-corpus and analyze the semantics of these themes contrastively. Our results are interpreted from two angles: quantitative and qualitative. All statistical analyses are carried out with the aid of textometric tools, and the semantic interpretations draw on several fundamental concepts of SI (Sémantique interprétative, or Interpretive Semantics) proposed by Rastier (1987).
Contributor : Abes Star
Submitted on : Monday, November 23, 2020 - 3:26:15 PM
Last modification on : Thursday, December 31, 2020 - 11:27:53 AM
Long-term archiving on : Thursday, February 25, 2021 - 1:34:45 PM


Version validated by the jury (STAR)


  • HAL Id : tel-03019809, version 1



Qinran Dang. Brouillard de pollution en Chine. Analyse sémantique différentielle de corpus institutionnels, médiatiques et de microblogues. Linguistics. Institut National des Langues et Civilisations Orientales- INALCO PARIS - LANGUES O', 2020. French. ⟨NNT : 2020INAL0009⟩. ⟨tel-03019809⟩


