Extraction de relations en domaine de spécialité (Relation extraction in specialized domains)

Abstract : The volume of available scientific literature is constantly growing. For domain experts to access this information easily, it must be extracted and structured. Obtaining structured data requires detecting both the entities and the relations in the texts. Our research addresses two problems: the extraction of complex relations that represent experimental results, and the detection and classification of binary relations between biomedical entities. We are interested in experimental results presented in scientific papers. An experimental result is a quantitative result obtained through an experiment and linked to the information that describes that experiment. These results are important to biology experts, for example for modeling. In the domain of renal physiology, a database was created to centralize these experimental results, but it is populated manually, which takes a long time. We propose a solution that automatically extracts knowledge relevant to the database from scientific papers, namely experimental results represented as n-ary relations. The method proceeds in two steps: automatic extraction from the documents, then presentation of the extracted information to the experts, who approve or modify it through an interface. We also proposed a machine learning method for extracting and classifying binary relations in specialized domains. We focused on variations in how relations are expressed and on how to represent them in a machine learning system. We studied how to take the syntactic structure of the sentence into account, and how to simplify sentences in a way guided by the relation extraction task. In particular, we developed a simplification method based on machine learning that uses a series of classifiers.
Document type :
Theses

Cited literature [104 references]

https://tel.archives-ouvertes.fr/tel-00777749
Contributor : Abes Star
Submitted on : Friday, January 18, 2013 - 9:22:10 AM
Last modification on : Wednesday, October 14, 2020 - 3:41:48 AM
Long-term archiving on : Friday, April 19, 2013 - 4:01:16 AM

File

VD2_MINARD_ANNE-LYSE_07122012....
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-00777749, version 1

Citation

Anne-Lyse Minard. Extraction de relations en domaine de spécialité. Other [cs.OH]. Université Paris Sud - Paris XI, 2012. French. ⟨NNT : 2012PA112273⟩. ⟨tel-00777749⟩

Metrics

Record views : 664
File downloads : 891