Co-evolution pattern mining in dynamic attributed graphs

Abstract : This thesis was conducted within the ANR FOSTER project, "Spatio-Temporal Data Mining: application to the understanding and monitoring of erosion" (ANR-2010-COSI-012-02, 2011-2014). In this context, we are interested in modeling spatio-temporal data as enriched graphs, so that patterns computed on such data can be used to formulate interesting hypotheses about the phenomena under study. Specifically, we work on pattern mining in graphs that are relational (each vertex is uniquely identified), attributed (each vertex is described by numerical attributes) and dynamic (attribute values and relations between vertices may change over time). We propose a new pattern domain called co-evolution patterns. These are trisets of vertices, times and signed attributes, i.e., attributes associated with a trend (increasing or decreasing). The interest of these patterns is that they describe a subset of the data that exhibits a specific behaviour and is a priori interesting for non-trivial analysis. For this purpose, we define two types of constraints: a constraint on the structure of the graph and a constraint on the co-evolution of the values of the vertex attributes. To confirm the specificity of a pattern with regard to the rest of the data, we define three density measures that address three questions. How similar is the behaviour of the vertices outside the co-evolution pattern to that of the vertices inside it? How does the pattern behave over time: does it appear suddenly? Do the vertices of the pattern behave similarly only on the attributes of the pattern, or on other attributes as well? We propose using a hierarchy of attributes as a priori knowledge provided by the user to obtain more general patterns, and we adapt the set of constraints to this hierarchy.
Finally, to simplify the use of the algorithm by reducing the number of thresholds the user must set, and to extract only the most interesting patterns, we use the concept of "skyline", recently reintroduced in the data mining domain. We propose three constraint-based algorithms, called MINTAG, H-MINTAG and Sky-H-MINTAG, that are complete, i.e., they extract the set of all patterns that satisfy the different constraints. They use anti-monotonicity and piecewise monotonicity/anti-monotonicity properties to prune the search space and make the computation feasible in practice. To validate our method, we experiment on several datasets (graphs) built from real-world data.
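
To make the notion of a co-evolution pattern more concrete, the following is a minimal sketch of how the co-evolution constraint on a triset (vertices, times, signed attributes) might be checked. It is not the thesis's MINTAG algorithm. The data layout, the function names `trend` and `is_coevolution`, the toy attribute names, and the choice to read a trend at time t as the change between t and t+1 are all assumptions made for illustration. The graph-structure constraint and the density measures are left out.

```python
from itertools import product

def trend(values, t):
    """Return '+' if the value rises from t to t+1, '-' if it falls, '' if flat.

    Assumption: a trend at time t is the sign of the change between t and t+1.
    """
    delta = values[t + 1] - values[t]
    return "+" if delta > 0 else "-" if delta < 0 else ""

def is_coevolution(data, vertices, times, signed_attrs):
    """Check that every vertex follows every signed attribute at every time."""
    return all(
        trend(data[v][a], t) == s
        for v, t, (a, s) in product(vertices, times, signed_attrs)
    )

# Hypothetical toy data: data[vertex][attribute] holds the values at t = 0..3.
data = {
    "v1": {"rain": [1, 2, 4, 3], "soil": [9, 7, 6, 6]},
    "v2": {"rain": [0, 3, 5, 5], "soil": [8, 5, 2, 3]},
    "v3": {"rain": [2, 2, 1, 0], "soil": [4, 4, 5, 6]},
}

# Candidate triset: vertices, times, signed attributes.
pattern = ({"v1", "v2"}, {0, 1}, {("rain", "+"), ("soil", "-")})
holds = is_coevolution(data, *pattern)
```

Because the check is a conjunction over all elements of the triset, enlarging a triset that already fails cannot make it pass. This is the kind of anti-monotonic behaviour that the abstract says the algorithms exploit to prune the search space.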
Document type :
Theses

https://tel.archives-ouvertes.fr/tel-01127630
Contributor : Abes Star
Submitted on : Saturday, March 7, 2015 - 4:06:25 AM
Last modification on : Friday, May 17, 2019 - 10:17:33 AM
Long-term archiving on : Monday, June 8, 2015 - 4:50:51 PM

File

2014ISAL0071.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01127630, version 1

Citation

Elise Desmier. Co-evolution pattern mining in dynamic attributed graphs. Computer Science [cs]. INSA de Lyon, 2014. English. ⟨NNT : 2014ISAL0071⟩. ⟨tel-01127630⟩
