
Connaissances a priori pour la Recherche d'Information textuelle basée sur l'apprentissage profond

Abstract : This thesis lies at the intersection of textual information retrieval (IR) and deep learning with neural networks. Its motivation is that neural networks have proven effective for textual IR under certain conditions, but their use still has several limitations that can greatly restrict their application in practice.

We study how incorporating prior knowledge can address three limitations of neural networks for textual IR: (1) the need for large amounts of labeled data, (2) a representation of text based only on statistical analysis, and (3) the lack of efficiency.

We focus on three types of prior knowledge to address these limitations: (1) knowledge from a semi-structured resource, Wikipedia; (2) knowledge from structured resources in the form of semantic resources such as ontologies or thesauri; and (3) knowledge from unstructured text.

First, we propose WIKIR, an open-access toolkit for automatically building IR collections from Wikipedia. Neural networks trained on these automatically created collections subsequently need less labeled data to achieve good performance. Second, we develop neural networks for IR that use semantic resources. Integrating semantic resources into neural networks allows them to achieve better performance for information retrieval in the medical field. Finally, we present neural networks that use knowledge from unstructured text to improve the performance and efficiency of non-learning baseline IR models.
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-03322649
Contributor : ABES STAR
Submitted on : Thursday, August 19, 2021 - 2:57:10 PM
Last modification on : Wednesday, July 6, 2022 - 4:12:33 AM
Long-term archiving on : Saturday, November 20, 2021 - 7:14:28 PM

File

FREJ_2021_archivage.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-03322649, version 1

Citation

Jibril Frej. Connaissances a priori pour la Recherche d'Information textuelle basée sur l'apprentissage profond. Recherche d'information [cs.IR]. Université Grenoble Alpes [2020-..], 2021. Français. ⟨NNT : 2021GRALM002⟩. ⟨tel-03322649⟩

Metrics

Record views : 115

File downloads : 200