Master thesis

Automatic or semi-automatic detection of companies in difficulty or weakened by the crisis

Thomas Meunier 1, 2
1 CEDAR - Rich Data Analytics at Cloud Scale
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France
Abstract : In this report, we attempt to improve a failure prediction model currently used by the French Ministry of Economy and Finance. First, we studied several models and benchmarked them in order to compare our results with those reported in the articles we reviewed. This allowed us to select four models that stood out from the rest and that we set out to improve as much as possible. Second, we looked at the data itself. We realized that our dataset was static: each row in our table held data for a single time T only. We therefore added variables, which we call temporal features, so that the model takes temporality into account. This addition proved conclusive, since it yielded results that had not been achieved until then, and all subsequent experiments use this enriched dataset. To improve our results further, we then studied the data by sector of activity. We split our dataset into several datasets according to the business sector of the companies, and found that applying a different model to each sector gave much better results. Depending on operational needs, we conclude that this is an avenue to consider seriously. Finally, we examined the importance of the features in our models. Looking at the variable importances in the classifiers, we found that only a fraction of the input variables were actually useful, and that the temporal features we added were among them. It would therefore be appropriate to reduce the number of input variables and to go further in adding temporal information to the model on the smaller remaining dataset.
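The idea of temporal features mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual feature set: the column names (`company_id`, `year`, `revenue`) and the function `add_temporal_features` are hypothetical, and the two derived features (previous value and change since the previous observation) are only one plausible way to turn a static row at time T into one that carries history.

```python
from collections import defaultdict


def add_temporal_features(rows, key="company_id", time="year", value="revenue"):
    """Return copies of the rows enriched with a lag and a change feature.

    Assumption: `rows` is a list of dicts with one row per company and period.
    "Previous" means the previous available observation for that company,
    which is not necessarily the immediately preceding year.
    """
    # Group the observations by company.
    by_company = defaultdict(list)
    for row in rows:
        by_company[row[key]].append(row)

    enriched_rows = []
    for history in by_company.values():
        # Order each company's observations chronologically.
        history.sort(key=lambda r: r[time])
        previous = None
        for row in history:
            enriched = dict(row)  # copy so the input rows stay untouched
            if previous is None:
                # No earlier observation: the temporal features are undefined.
                enriched[f"{value}_prev"] = None
                enriched[f"{value}_delta"] = None
            else:
                enriched[f"{value}_prev"] = previous[value]
                enriched[f"{value}_delta"] = row[value] - previous[value]
            enriched_rows.append(enriched)
            previous = row
    return enriched_rows
```

A classifier trained on the enriched rows sees not only the state of a company at time T but also how that state has moved, which is the kind of information a static table cannot express.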
Document type : Master thesis

Contributor : Oana Balalau
Submitted on : Wednesday, January 12, 2022 - 3:06:59 PM
Last modification on : Thursday, February 3, 2022 - 11:16:32 AM
Long-term archiving on : Wednesday, April 13, 2022 - 11:00:31 PM


Article_SF (2).pdf
Files produced by the author(s)


Public Domain


  • HAL Id : hal-03523010, version 1


Thomas Meunier. Automatic or semi-automatic detection of companies in difficulty or weakened by the crisis. Artificial Intelligence [cs.AI]. 2021. ⟨hal-03523010⟩


