Amélioration de la qualité des données : correction sémantique des anomalies inter-colonnes (Improving data quality: semantic correction of inter-column anomalies)

Abstract : Data quality is a major challenge because anomalies can be very costly, especially in large enterprise databases that must exchange information between systems and integrate large volumes of data. Decisions based on erroneous data harm an organization's activities. As the quantity of data continues to grow, so does the risk of anomalies. The automatic correction of these anomalies is a topic of growing importance in both industry and academia. In this report, we propose an approach to better understand the semantics and structure of the data. Our approach helps to automatically correct both intra-column and inter-column anomalies. We aim to improve data quality by handling null values and the semantic dependencies between columns.

https://tel.archives-ouvertes.fr/tel-01636619
Contributor : Abes Star
Submitted on : Thursday, November 16, 2017 - 5:27:19 PM
Last modification on : Saturday, December 21, 2019 - 3:52:46 AM

File

These_HoudaZAIDI.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01636619, version 1

Citation

Houda Zaidi. Amélioration de la qualité des données : correction sémantique des anomalies inter-colonnes. Base de données [cs.DB]. Conservatoire national des arts et métiers - CNAM; École Nationale des Sciences de l'Informatique (La Manouba, Tunisie), 2017. Français. ⟨NNT : 2017CNAM1094⟩. ⟨tel-01636619⟩
