Renforcements naturels pour la collaboration homme-machine

Esther Nicart 1
1 Equipe MAD - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen
Abstract : Information extraction (IE) is defined as the identification and extraction of elements of interest, such as named entities, their relationships, and their roles in events. For example, a web-crawler might collect open-source documents, which are then processed by an IE treatment chain to produce a summary of the information contained in them. We model such an IE document treatment chain as a Markov Decision Process, and use reinforcement learning to allow the agent to learn to construct custom-made chains "on the fly", and to continuously improve them. We build a platform, BIMBO (Benefiting from Intelligent and Measurable Behaviour Optimisation), which enables us to measure the impact on the learning of various models, algorithms, parameters, etc. We apply this in an industrial setting, specifically to a document treatment chain which extracts events from massive volumes of web pages and other open-source documents. Our emphasis is on minimising the burden on the human analysts, whose feedback on the extracted events guides the agent as it learns to improve. For this, we investigate different types of feedback, from numerical rewards, which require a lot of user effort and tuning, to partially and even fully qualitative feedback, which is much more intuitive, and demands little to no user intervention. We carry out experiments, first with numerical rewards, then demonstrate that intuitive feedback still allows the agent to learn effectively. Motivated by the need to rapidly propagate the rewards learnt at the final states back to the initial ones, even during exploration, we propose Dora: an improved version of Q-Learning.
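To make the modelling idea in the abstract concrete, here is a minimal sketch of standard tabular Q-Learning on a toy chain-building task. It is not the thesis's BIMBO platform and not the Dora algorithm. The service names, the TARGET chain, the simulated analyst feedback (reward 1 only for the preferred chain) and all parameter values are assumptions made purely for illustration.

```python
import random
from collections import defaultdict

# Hypothetical processing services; the names are illustrative only.
SERVICES = ("tokenise", "tag_entities", "extract_events")
STOP = "stop"
# Assumed chain that the simulated analyst rewards.
TARGET = ("tokenise", "tag_entities", "extract_events")


def actions(state):
    """Services not yet applied, plus the option to stop and deliver."""
    return [s for s in SERVICES if s not in state] + [STOP]


def step(state, action):
    """Return (next_state, reward, done); feedback arrives only on stop."""
    if action == STOP:
        return state, (1.0 if state == TARGET else 0.0), True
    return state + (action,), 0.0, False


def q_learning(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Standard tabular Q-Learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (state, action) to an estimated value
    for _ in range(episodes):
        state, done = (), False
        while not done:
            acts = actions(state)
            if rng.random() < epsilon:
                action = rng.choice(acts)  # explore
            else:
                action = max(acts, key=lambda a: q[(state, a)])  # exploit
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in actions(nxt))
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = nxt
    return q


def greedy_chain(q):
    """Follow the highest-valued action from the empty chain until stop."""
    state, done = (), False
    while not done:
        action = max(actions(state), key=lambda a: q[(state, a)])
        state, _, done = step(state, action)
    return state
```

Calling `greedy_chain(q_learning())` returns the chain the agent currently prefers. In this sketch the reward is observed only at the final state and reaches earlier states one update at a time, which is the slow backward propagation that the abstract cites as the motivation for Dora.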
Document type :
Theses

Cited literature [135 references]

https://tel.archives-ouvertes.fr/tel-01517109
Contributor : Abes Star
Submitted on : Tuesday, May 2, 2017 - 4:41:08 PM
Last modification on : Tuesday, April 2, 2019 - 1:34:11 AM
Long-term archiving on : Thursday, August 3, 2017 - 1:32:59 PM

File

2017-NICART-ESTHER-VO.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01517109, version 1

Citation

Esther Nicart. Renforcements naturels pour la collaboration homme-machine. Human-Computer Interaction [cs.HC]. Normandie Université, 2017. English. ⟨NNT : 2017NORMC206⟩. ⟨tel-01517109⟩
