Renforcements naturels pour la collaboration homme-machine

Esther Hoare Nicart

Thesis Year: 2017

Qualitative reinforcement for man-machine interactions

Esther Hoare Nicart

Role: Author

MAD Team - GREYC Laboratory - UMR6072

Abstract

Information extraction (IE) is defined as the identification and extraction of elements of interest, such as named entities, their relationships, and their roles in events. For example, a web-crawler might collect open-source documents, which are then processed by an IE treatment chain to produce a summary of the information contained in them. We model such an IE document treatment chain as a Markov Decision Process, and use reinforcement learning to allow the agent to learn to construct custom-made chains "on the fly", and to continuously improve them. We build a platform, BIMBO (Benefiting from Intelligent and Measurable Behaviour Optimisation), which enables us to measure the impact on the learning of various models, algorithms, parameters, etc. We apply this in an industrial setting, specifically to a document treatment chain which extracts events from massive volumes of web pages and other open-source documents. Our emphasis is on minimising the burden on the human analysts, from whom the agent learns to improve, guided by their feedback on the events extracted. For this, we investigate different types of feedback, from numerical rewards, which require a lot of user effort and tuning, to partially and even fully qualitative feedback, which is much more intuitive and demands little to no user intervention. We carry out experiments, first with numerical rewards, then demonstrate that intuitive feedback still allows the agent to learn effectively. Motivated by the need to rapidly propagate the rewards learnt at the final states back to the initial ones, even on exploration, we propose Dora: an improved version of Q-Learning.

We model a document treatment chain as a Markov decision process, and we use reinforcement learning to allow the agent to learn to construct tailored chains on the fly and to improve them continuously. We build a platform that lets us measure the impact on learning of various models, web services, algorithms, parameters, etc. We apply it in an industrial setting, specifically to a chain that aims to extract events from massive volumes of documents drawn from web pages and other open sources. We aim to reduce the burden on human analysts, with the agent learning to improve the chain guided by their feedback on the extracted events. To this end, we explore different types of feedback, from numerical feedback requiring substantial calibration to qualitative feedback, which is much more intuitive and requires little or no calibration. We carry out experiments, first with numerical feedback, then show that qualitative feedback still allows the agent to learn effectively.
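
As a rough illustration of the idea summarised in the abstract, the sketch below runs standard tabular Q-Learning on a toy chain-building problem. It is not the thesis's implementation, it is not the BIMBO platform, and it is not Dora. The service names, the target chain, the numerical reward and all parameter values are assumptions made up for this example.

```python
import random

# Hypothetical toy setting (names are NOT from the thesis): a state is the
# tuple of services applied so far; an action appends a service or stops.
SERVICES = ("extract_events", "tokenise", "detect_language")
STOP = "stop"
# Assumed "good" chain; in the thesis the signal comes from analyst feedback.
TARGET = ("detect_language", "tokenise", "extract_events")


def actions(state):
    """Services not yet in the chain, plus the option to stop and deliver."""
    return [s for s in SERVICES if s not in state] + [STOP]


def step(state, action):
    """Deterministic transition; reward is given only when the chain is delivered."""
    if action == STOP:
        reward = 1.0 if state == TARGET else 0.0  # stand-in numerical feedback
        return state, reward, True
    return state + (action,), 0.0, False


def q_learning(episodes=5000, alpha=0.5, gamma=0.9, epsilon=0.5, seed=0):
    """Tabular Q-Learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated return; missing entries count as 0.0
    for _ in range(episodes):
        state, done = (), False
        while not done:
            acts = actions(state)
            if rng.random() < epsilon:
                action = rng.choice(acts)  # explore
            else:
                action = max(acts, key=lambda a: q.get((state, a), 0.0))  # exploit
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(
                q.get((nxt, a), 0.0) for a in actions(nxt)
            )
            old = q.get((state, action), 0.0)
            # One-step Q-Learning update towards reward + discounted best next value.
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q


def greedy_chain(q):
    """Build a chain by always following the highest-valued action."""
    state, done = (), False
    while not done:
        action = max(actions(state), key=lambda a: q.get((state, a), 0.0))
        state, _, done = step(state, action)
    return state


if __name__ == "__main__":
    print(greedy_chain(q_learning()))
```

In this toy setting the reward is seen only at the final state and has to travel back one step per update to reach the first decision. That is the kind of slow propagation the abstract cites as the motivation for Dora, whose actual mechanism is not described on this page and is not reproduced here.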

Keywords

Artificial intelligence; Reinforcement learning; Extraction and knowledge management; Man-machine interaction; Open source intelligence (OSINT)

Knowledge extraction and management; Open source intelligence (Renseignement d’origine source ouverte, ROSO)

Domains

Human-Computer Interaction [cs.HC]

Main file

2017-NICART-ESTHER-VO.pdf (4.64 MB)

Origin: Version validated by the jury (STAR)

ABES STAR: Contact

https://theses.hal.science/tel-01517109

Submitted on: Tuesday, 2 May 2017, 16:41:08

Last modified on: Wednesday, 20 March 2024, 16:20:04

Long-term archiving on: Thursday, 3 August 2017, 13:32:59

Dates and versions

tel-01517109, version 1 (02-05-2017)

Identifiers

HAL Id: tel-01517109, version 1

Cite

Esther Hoare Nicart. Renforcements naturels pour la collaboration homme-machine. Human-Computer Interaction [cs.HC]. Normandie Université, 2017. English. ⟨NNT : 2017NORMC206⟩. ⟨tel-01517109⟩

Collections

CNRS STAR GREYC GREYC-MAD COMUE-NORMANDIE THESES-NU ENSICAEN UNICAEN

446 Views

396 Downloads
