Master thesis

Improving a Search Engine for Answering User Questions in Natural Language

Abdenour Chaoui 1
1 CEDAR - Rich Data Analytics at Cloud Scale
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France
Abstract : During this internship, we worked on improving an open-domain question answering system. We addressed the document selection component, which is framed as a text ranking task. We first explored classical methods, also known as sparse retrievers, and tested them on our evaluation dataset. These methods yielded only minor differences in performance. We then turned to deep language models, specifically BERT-based architectures, and considered a variety of techniques and designs. We first tackled the lack of French-language data for training such models, then the formulation of the problem (classification or ranking), and finally the limited input length of BERT-based models. The final results show a 12% improvement in performance over the original model.
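
As an illustration of the sparse retrieval family mentioned in the abstract, the following is a minimal sketch of BM25 ranking. The abstract does not say which sparse retrievers were evaluated, so the choice of BM25, the function names `tokenize` and `bm25_rank`, the whitespace tokenizer, and the parameter values `k1=1.5` and `b=0.75` are all assumptions made for this example and do not come from the thesis.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch);
    # a real system would use a tokenizer suited to the target language.
    return text.lower().split()


def bm25_rank(query, documents, k1=1.5, b=0.75):
    """Return document indices sorted by descending BM25 score."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = tokenize(query)

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed inverse document frequency, always positive.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturation with document length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)

    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "dogs chase cats in the park",
        "question answering systems rank documents",
    ]
    print(bm25_rank("rank documents", docs))
```

A retriever of this kind scores documents purely from term overlap statistics. A BERT-based ranker, by contrast, scores each question and document pair with a learned model.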
Document type : Master thesis

Contributor : Oana Balalau
Submitted on : Thursday, January 13, 2022 - 1:57:29 PM
Last modification on : Friday, February 4, 2022 - 3:12:27 AM
Long-term archiving on : Thursday, April 14, 2022 - 6:28:50 PM




  • HAL Id : hal-03524281, version 1


Abdenour Chaoui. Improving a Search Engine for Answering User Questions in Natural Language. Artificial Intelligence [cs.AI]. 2021. ⟨hal-03524281⟩


