Algorithmes de machine learning en assurance : solvabilité, textmining, anonymisation et transparence

Antoine Ly

Thesis, Year: 2019

Machine learning algorithms in insurance : solvency, textmining, anonymization and transparency

Role: Author

Laboratoire d'Analyse et de Mathématiques Appliquées

Abstract

The term "Big Data" emerged in the summer of 2013 and attracted considerable interest from companies. This thesis examines what the associated machine learning methods contribute to actuarial science. It addresses both theoretical and practical issues in high-potential areas such as Optical Character Recognition (OCR), text analysis, data anonymization and model interpretability. We begin by applying machine learning methods to the calculation of economic capital, then try to better illustrate the boundary that may exist between machine learning and statistics. Having highlighted certain advantages and a range of techniques, we study the application of deep neural networks to the optical analysis of documents and to the analysis of text once it has been extracted. The use of complex methods and the entry into application of the General Data Protection Regulation (GDPR) in 2018 led us to study the potential impacts on pricing models. Applying anonymization methods to pure premium models in non-life insurance, we explore different generalization approaches based on unsupervised learning. Finally, since regulations also impose requirements on model explanation, we conclude with a general survey of methods that now allow a better understanding of complex models such as neural networks.

Keywords

Machine Learning; Statistical Learning; Actuarial Science; OCR; Text Mining; Semantic Analysis; Anonymization; Transparency

Domains

General Mathematics [math.GM]

Main file

TH2019PESC2030.pdf (11.86 MB)

Origin: Version validated by the jury (STAR)

https://theses.hal.science/tel-02413664

Submitted on: Monday, 16 December 2019, 12:39:26

Last modified on: Thursday, 14 March 2024, 03:13:10

Dates and versions

tel-02413664, version 1 (16-12-2019)

tel-02413664, version 2 (16-12-2019)

Identifiers

HAL Id: tel-02413664, version 2

Cite

Antoine Ly. Algorithmes de machine learning en assurance : solvabilité, textmining, anonymisation et transparence. General Mathematics [math.GM]. Université Paris-Est, 2019. In French. ⟨NNT: 2019PESC2030⟩. ⟨tel-02413664v2⟩

Collections

CNRS STAR LAMA_UMR8050 UPEC UNIV-EIFFEL

1350 views

2740 downloads
