Towards the French Biomedical Ontology Enrichment

Juan Antonio Lossio-Ventura 1, 2
1 ADVANSE - ADVanced Analytics for data SciencE
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
2 SMILE - Système Multi-agent, Interaction, Langage, Evolution
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract : Big Data in the biomedical domain faces a major issue: the analysis of large volumes of heterogeneous data (e.g. video, audio, text, image). Ontologies, conceptual models of reality, can play a crucial role in biomedicine by automating data processing, querying, and the matching of heterogeneous data. Various English resources exist, but considerably fewer are available in French, and there is a strong lack of related tools and services to exploit them. Initially, ontologies were built manually. In recent years, a few semi-automatic methodologies have been proposed. The semi-automatic construction or enrichment of ontologies is mostly induced from texts using natural language processing (NLP) techniques. NLP methods have to take into account the lexical and semantic complexity of biomedical data: (1) lexical complexity refers to the complex phrases that must be handled, and (2) semantic complexity refers to inducing the sense and context of the terminology. In this thesis, we propose methodologies for the enrichment and construction of biomedical ontologies based on two main contributions that tackle these challenges. The first contribution concerns the automatic extraction of specialized biomedical terms (lexical complexity) from corpora. New ranking measures for single- and multi-word term extraction are proposed and evaluated. In addition, we present the BioTex software, which implements the proposed measures. The second contribution concerns concept extraction and semantic linkage of the extracted terminology (semantic complexity). This work seeks to induce the semantic concepts of new candidate terms and to find their semantic links, i.e. the relevant location of new candidate terms in an existing biomedical ontology. We propose a methodology that extracts new terms for the MeSH ontology. Experiments conducted on real data highlight the relevance of the contributions.
Document type : Thesis

Cited literature [300 references]

https://hal-lirmm.ccsd.cnrs.fr/tel-01385697
Contributor : Abes Star
Submitted on : Tuesday, March 5, 2019 - 3:25:09 PM
Last modification on : Thursday, March 7, 2019 - 1:13:18 AM

File

52807_LOSSIO_VENTURA_2015_arch...
Files produced by the author(s)

Identifiers

  • HAL Id : tel-01385697, version 2

Citation

Juan Antonio Lossio-Ventura. Towards the French Biomedical Ontology Enrichment. Other [cs.OH]. Université Montpellier, 2015. English. ⟨NNT : 2015MONTS220⟩. ⟨tel-01385697v2⟩

Metrics

Record views : 151
File downloads : 185