Text Mining Approaches for Semantic Similarity Exploration and Metadata Enrichment of Scientific Digital Libraries

Abstract : For scientists and researchers, it is critical that knowledge remains accessible for re-use and further development. Moreover, the way scientific articles and their metadata are stored and managed in digital libraries determines how many relevant articles can be discovered and accessed, depending on what a search query actually means. Yet, can we explore all semantically relevant scientific documents with existing keyword-based information retrieval systems? This is the primary question addressed in this thesis. The main purpose of our work is therefore to broaden the knowledge spectrum of researchers working in an interdisciplinary domain when they use the information retrieval systems of multidisciplinary digital libraries. The problem arises when such researchers use community-dependent search keywords while another research community uses different scientific names for the relevant concepts. To propose a solution to this semantic exploration task in multidisciplinary digital libraries, we applied several text mining approaches. First, we studied the semantic representation of words, sentences, paragraphs and documents for better semantic similarity estimation. In addition, we used the semantic information of words in lexical databases and knowledge graphs to enhance our semantic approach. Furthermore, the thesis presents a couple of use-case implementations of our proposed model.
Contributor : ABES STAR
Submitted on : Friday, February 28, 2020 - 2:41:31 PM
Last modification on : Thursday, April 30, 2020 - 10:12:02 AM
Long-term archiving on : Friday, May 29, 2020 - 3:40:45 PM


  • HAL Id : tel-02476157, version 2



Hussein Al-Natsheh. Text Mining Approaches for Semantic Similarity Exploration and Metadata Enrichment of Scientific Digital Libraries. Artificial Intelligence [cs.AI]. Université de Lyon, 2019. English. ⟨NNT : 2019LYSE2062⟩. ⟨tel-02476157v2⟩


