Querying and extracting heterogeneous graphs from structured data and unstructured content

Abstract : This work introduces a set of solutions for extracting graphs from enterprise data and for facilitating information search on these graphs. First, we define a new graph model, the SPIDER-Graph, which models complex objects and makes it possible to define heterogeneous graphs. We then develop a set of algorithms that extract the content of an enterprise database and represent it in this new model. This representation reveals relations that exist in the data but remain hidden because they fit poorly into the classical relational model. To unify the representation of all enterprise data, we develop a second approach that extracts an enterprise ontology from unstructured data; this ontology contains the most important concepts and relations found in a given enterprise. Once the graphs have been extracted from the relational databases and from documents using the enterprise ontology, we propose an approach that lets users extract an interaction graph between a set of chosen enterprise objects. This approach is based on a set of relation patterns extracted from the graph and on the concepts and relations of the enterprise ontology. Finally, information retrieval is facilitated by a new visual graph query language, GraphVQL, which lets users query graphs by drawing the query pattern visually. The language covers a range of query types, from simple selection and aggregation queries to social network analysis queries.
Document type :
Theses

Cited literature [159 references]

https://tel.archives-ouvertes.fr/tel-00740663
Contributor : Abes Star
Submitted on : Wednesday, October 10, 2012 - 4:03:24 PM
Last modification on : Friday, October 23, 2020 - 4:49:45 PM
Long-term archiving on : Friday, January 11, 2013 - 3:41:57 AM

File

Thesis.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-00740663, version 1

Citation

Rania Soussi. Querying and extracting heterogeneous graphs from structured data and unstructured content. Other. Ecole Centrale Paris, 2012. English. ⟨NNT : 2012ECAP0030⟩. ⟨tel-00740663⟩

Metrics

Record views : 786

File downloads : 3786