Theses

Découverte de schéma pour les données du Web sémantique (Schema discovery for Semantic Web data)

Abstract : An increasing number of linked data sources are published on the Web. However, their schema may be incomplete or missing. In addition, data do not necessarily follow their schema. This flexibility in describing the data eases their evolution, but makes their exploitation more complex. In our work, we have proposed an automatic and incremental approach enabling schema discovery from the implicit structure of the data. To complement the description of the types in a schema, we have also proposed an approach for finding the possible versions (patterns) for each of them. It proceeds online, without having to download or browse the source, which can be expensive or even impossible because sources may impose access limitations, either on the query execution time or on the number of queries. We have also addressed the problem of annotating the types in a schema, which consists in finding a set of labels capturing their meaning. We have proposed annotation algorithms which provide meaningful labels using external knowledge bases. Our approach can be used to find meaningful type labels during schema discovery, and also to enrich the description of existing types. Finally, we have proposed an approach to evaluate the gap between a data source and its schema. To this end, we have proposed a set of quality factors and the associated metrics, as well as a schema extension that reflects the heterogeneity among instances of the same type. Both the factors and the schema extension are used to analyze and improve the conformity between a schema and the instances it describes.
Cited literature [103 references]

https://tel.archives-ouvertes.fr/tel-01630962
Contributor : Abes Star
Submitted on : Wednesday, November 29, 2017 - 11:04:12 PM
Last modification on : Wednesday, October 14, 2020 - 4:20:51 AM

File

70792_KELLOU_2017_archivage.pd...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01630962, version 2

Citation

Kenza Kellou-Menouer. Découverte de schéma pour les données du Web sémantique. Web. Université Paris-Saclay, 2017. Français. ⟨NNT : 2017SACLV047⟩. ⟨tel-01630962v2⟩

Metrics

Record views : 915

File downloads : 708