Management and exploitation of sensor data: an approach based on data reduction

Abstract: In many modern applications (in scientific fields, transport, energy, the environment, etc.), data are both a raw material and a product with high added value for decision-making. The deluge of data generated by these applications means that some classic processing paradigms are no longer fully suited to certain decision-making situations. As a result, researchers have shown renewed interest in certain data processing approaches. Recently, approaches based on the principle of data reduction have attracted particular enthusiasm. The principle is to reduce the amount of data given as input to the processing step. This makes data exploitation less expensive (in terms of computation and time) and yields approximate answers, or simply trends, for the target data. This is particularly desirable in contexts where an approximate answer provides enough information to be acceptable. There are many techniques for reducing data volume, among them data summary structures (or synopses). In this thesis, we are interested in a family of summary structures borrowed from the field of computational intelligence. These structures (such as non-classical mathematical quantifiers, typicality, linguistic labels/patterns, etc.) have two interesting features: (i) the intelligibility of the constructed summaries and (ii) the generation of summaries that describe the data at different levels of abstraction. The target data are real multi-sensor data from (i) aircraft flights, collected within the framework of the ADSB project, and (ii) Smart Cities, within the context of the neOCampus project. As the first contribution of the thesis, we proposed a method for extracting summaries using (i) non-classical quantifiers and (ii) the notion of typicality. We also defined measures that characterize the properties of the constructed summaries (veracity, representativeness, imprecision, etc.), knowing that these properties tend to conflict with one another. We then analyzed the different ways in which each summary can be used for decision-making. As the second contribution, we studied certain characteristics of data trends (in sensor data or time series), such as dynamic change, duration and variability. This study allowed us to select the best summary among those constructed with the non-classical quantifiers. The selection is formalized as a multi-objective optimization problem, which is solved using a suitably chosen genetic algorithm. Finally, a set of experiments was carried out on real data to validate and compare all our proposals.
