Addressing big data issues in scientific data infrastructure, Collaboration Technologies and Systems (CTS), 2013 International Conference on, pp.48-55, 2013. ,
Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing, Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2012, pp.15-28, 2012. ,
A density-based algorithm for discovering clusters in large spatial databases with noise, Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), pp.226-231, 1996. ,
Data Clustering: Algorithms and Applications, 2014. ,
Evolution-based technique for stream clustering, ADMA, pp.605-615, 2007. ,
Data streaming with affinity propagation, ECML/PKDD, pp.628-643, 2008. ,
DOI : 10.1007/978-3-540-87481-2_41
URL : https://hal.archives-ouvertes.fr/inria-00289679
Big Data: Principles and best practices of scalable realtime data systems, 2015. ,
Data mining: concepts and techniques, 2011. ,
Contributions to Large Scale Data Clustering and Streaming with Affinity Propagation. Application to Autonomic Grids, 2010. ,
Big data and the next wave of infrastress problems, solutions, opportunities, 1998. ,
Mining big data: current status, and forecast to the future, ACM SIGKDD Explorations Newsletter, vol.14, issue.2, pp.1-5, 2013. ,
3D data management: Controlling data volume, velocity, and variety, 2001. ,
Extracting value from chaos, IDC iview, vol.1142, pp.1-12, 2011. ,
The google file system, ACM SIGOPS operating systems review, vol.37, pp.29-43, 2003. ,
DOI : 10.1145/1165389.945450
The chubby lock service for loosely-coupled distributed systems, Proceedings of the 7th symposium on Operating systems design and implementation, pp.335-350, 2006. ,
The hadoop distributed file system: Architecture and design, vol.11, p.21, 2007. ,
Mapreduce: simplified data processing on large clusters, Communications of the ACM, vol.51, issue.1, pp.107-113, 2008. ,
Spark: Cluster computing with working sets, Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, HotCloud'10, pp.10-10, 2010. ,
Survey of real-time processing systems for big data, 18th International Database Engineering & Applications Symposium, pp.356-361, 2014. ,
Discretized streams: An efficient and fault-tolerant model for stream processing on large clusters, Proceedings of the 4th USENIX Conference on Hot Topics in Cloud Computing, HotCloud'12, pp.10-10, 2012.
Discretized streams: fault-tolerant streaming computation at scale, ACM SIGOPS 24th Symposium on Operating Systems Principles, SOSP '13, pp.423-438, 2013.
Fault-tolerance in the borealis distributed stream processing system, ACM Trans. Database Syst, vol.33, issue.1, 2008. ,
High-availability algorithms for distributed stream processing, Proceedings of the 21st International Conference on Data Engineering, pp.779-790, 2005. ,
Micro-batching growing neural gas for clustering data streams using spark streaming, INNS Conference on Big Data, pp.158-166, 2015. ,
DOI : 10.1016/j.procs.2015.07.290
URL : https://doi.org/10.1016/j.procs.2015.07.290
Matei Zaharia, and Ameet Talwalkar. Mllib: Machine learning in apache spark, 2015. ,
A framework for clustering evolving data streams, VLDB, pp.81-92, 2003. ,
StreamKM++: A clustering algorithm for data streams, ACM Journal of Experimental Algorithmics, vol.17, issue.1, 2012. ,
DOI : 10.1137/1.9781611972900.16
"All roads lead to Rome": optimistic recovery for distributed iterative data processing, 22nd ACM International Conference on Information and Knowledge Management, CIKM'13, pp.1919-1928, 2013. ,
MOA: massive online analysis, a framework for stream classification and clustering, Proceedings of the First Workshop on Applications of Pattern Analysis, WAPA 2010, pp.44-50, 2010. ,
The ClusTree: indexing micro-clusters for anytime stream mining, Knowledge and information systems, vol.29, issue.2, pp.249-272, 2011. ,
Density-based clustering over an evolving data stream with noise, SDM, pp.328-339, 2006. ,
Density-based clustering for real-time stream data, Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp.133-142, 2007. ,
SAMOA: scalable advanced massive online analysis, Journal of Machine Learning Research, vol.16, pp.149-153, 2015. ,
Algorithms for Clustering Data, 1988. ,
k-means++: the advantages of careful seeding, Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp.1027-1035, 2007. ,
Scalable k-means++, Proceedings of the VLDB Endowment, vol.5, pp.622-633, 2012. ,
Bibliography of self-organizing map (som) papers: 1981-1997, Neural computing surveys, vol.1, p.171, 1998. ,
Neural Networks: A Comprehensive Foundation, ISBN 0132733501, 1998. ,
Self-Organizing Maps
A "Neural-Gas" network learns topologies, Artificial Neural Networks, vol.I, pp.397-402, 1991. ,
Unsupervised clustering with growing cell structures, Proceedings of the International Joint Conference on Neural Networks, pp.531-536, IEEE, 1991.
A growing neural gas network learns topologies, NIPS, pp.625-632, 1994. ,
Online semi-supervised growing neural gas, Int. J. Neural Syst, vol.22, issue.5, 2012. ,
Clustering by passing messages between data points, Science, vol.315, pp.972-976, 2007. ,
On density-based data streams clustering algorithms: A survey, J. Comput. Sci. Technol, vol.29, issue.1, pp.116-141, 2014. ,
A survey of clustering data mining techniques, Grouping multidimensional data, pp.25-71, 2006. ,
How many clusters? which clustering method? answers via model-based cluster analysis, The computer journal, vol.41, issue.8, pp.578-588, 1998. ,
Maximum likelihood from incomplete data via the em algorithm, Journal of the royal statistical society. Series B (methodological), pp.1-38, 1977. ,
The EM algorithm and extensions, vol.382, 2007. ,
Robust estimation of a global gaussian mixture by decentralized aggregations of local models, Web Intelligence and Agent Systems, vol.11, issue.3, pp.245-262, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00794452
Estimation robuste des modèles de mélange sur des données distribuées. Theses, 2012. ,
Google's mapreduce programming model - revisited, Science of computer programming, vol.70, issue.1, pp.1-30, 2008. ,
Dbdc: Density based distributed clustering, Advances in Database Technology - EDBT 2004, pp.88-105, 2004. ,
SOM clustering using spark-mapreduce, 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, pp.1727-1734, 2014. ,
Parallel k-means clustering based on mapreduce, Cloud computing, pp.674-679, 2009. ,
The hadoop distributed file system, Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium on, pp.1-10, 2010. ,
Mrdbscan: a scalable mapreduce-based dbscan algorithm for heavily skewed data, Shengzhong Feng, and Jianping Fan, vol.8, pp.83-99, 2014. ,
Big data clustering: a review, Computational Science and Its Applications - ICCSA 2014, pp.707-720, 2014. ,
Google news personalization: scalable online collaborative filtering, Proceedings of the 16th international conference on World Wide Web, pp.271-280, 2007. ,
Parallel implementation of expectation-maximization for fast convergence ,
Mapreduce for bayesian network parameter learning using the em algorithm, Proc. of Big Learning: Algorithms, Systems and Tools, 2012. ,
Fast clustering using mapreduce, Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pp.681-689, 2011. ,
DOI : 10.1145/2020408.2020515
URL : http://arxiv.org/pdf/1109.1579.pdf
A survey of stream clustering algorithms, Data Clustering: Algorithms and Applications, pp.231-258, 2013. ,
Yew-Kwong Woon, and Wee Keong Ng. A survey on data stream clustering and classification, Knowl. Inf. Syst, vol.45, issue.3, pp.535-569, 2015. ,
Data stream clustering: Challenges and issues, CoRR, abs/1006.5261, 2010. ,
Clustering techniques for streaming data - a survey, Advance Computing Conference (IACC), 2013 IEEE 3rd International, pp.951-956, 2013. ,
DOI : 10.1109/iadcc.2013.6514355
Data stream clustering algorithms: A review, International Journal of Advances in Soft Computing & Its Applications, vol.7, issue.3, 2015. ,
Data stream clustering: A survey, ACM Comput. Surv, vol.46, issue.1, p.13, 2013. ,
Issues in data stream management, ACM Sigmod Record, vol.32, issue.2, pp.5-14, 2003. ,
DOI : 10.1145/776985.776986
URL : http://www.cs.virginia.edu/~son/cs851/stream/papers/p5-golab.pdf
Mining data streams: a review, ACM Sigmod Record, vol.34, issue.2, pp.18-26, 2005. ,
DOI : 10.1145/1083784.1083789
Statstream: Statistical monitoring of thousands of data streams in real time, Proceedings of the 28th international conference on Very Large Data Bases, pp.358-369, 2002. ,
Duplicate detection in click streams, Proceedings of the 14th international conference on World Wide Web, pp.12-21, 2005. ,
DOI : 10.1145/1060745.1060753
Data streams: models and algorithms, vol.31, 2007. ,
A framework for diagnosing changes in evolving data streams, Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pp.575-586, 2003. ,
Detecting change in data streams, Proceedings of the Thirtieth international conference on Very large data bases, vol.30, pp.180-191, 2004. ,
Dynamic histograms: Capturing evolving data sets, Proceedings of the international conference on data engineering, pp.86-86 ,
DOI : 10.1109/icde.2000.839394
URL : http://www.cs.wisc.edu/~donjerko/hist.pdf
Mining data streams under block evolution, ACM SIGKDD Explorations Newsletter, vol.3, issue.2, pp.1-10, 2002. ,
DOI : 10.1145/507515.507517
BIRCH: An efficient data clustering method for very large databases, SIGMOD Conference, pp.103-114, 1996. ,
Hue-stream: Evolution-based clustering technique for heterogeneous data streams with uncertainty, Advanced Data Mining and Applications - 7th International Conference, pp.27-40, 2011. ,
A framework for clustering uncertain data streams, Proceedings of the 24th International Conference on Data Engineering, ICDE 2008, pp.150-159, 2008. ,
Hclustream: A novel approach for clustering evolving heterogeneous data stream, Workshops Proceedings of the 6th IEEE International Conference on Data Mining (ICDM 2006), pp.682-688, 2006. ,
R-trees: A dynamic index structure for spatial searching, SIGMOD'84, Proceedings of Annual Meeting, pp.47-57, 1984. ,
Maintaining gaussian mixture models of data streams under block evolution, International Conference on Computational Science, pp.1071-1074, 2006. ,
A state-space approach to modeling functional time series application to rail supervision, 22nd European Signal Processing Conference, pp.1402-1406, 2014. ,
Dynamic classification and modeling of non-stationary temporal data. Theses, 2014. ,
SOStream: Self organizing density-based clustering over data stream, MLDM, pp.264-278, 2012. ,
DOI : 10.1007/978-3-642-31537-4_21
SVStream: A support vector-based algorithm for clustering data streams, Knowl. Data Eng, vol.25, issue.6, pp.1410-1424, 2013.
DOI : 10.1109/tkde.2011.263
Support vector clustering, Journal of Machine Learning Research, vol.2, pp.125-137, 2001. ,
Support vector domain description, Pattern Recognition Letters, vol.20, pp.1191-1199, 1999. ,
Growing neural gas for temporal clustering, 19th International Conference on Pattern Recognition (ICPR 2008), pp.1-4, 2008. ,
Fgng: A fast multi-dimensional growing neural gas implementation, Neurocomputing, vol.128, pp.328-340, 2014. ,
The growing neural gas and clustering of large amounts of data, Optical Memory and Neural Networks, vol.20, issue.4, pp.260-270, 2011. ,
A self-organizing network that can follow non-stationary distributions, Artificial Neural Networks - ICANN '97, 7th International Conference, pp.613-618, 1997. ,
A self-organising network that grows when required, Neural Networks, vol.15, issue.8-9, pp.1041-1058, 2002. ,
On-line novelty detection for autonomous mobile robots, Robotics and Autonomous Systems, vol.51, issue.2, pp.191-206, 2005. ,
An incremental growing neural gas learns topologies, Neural Networks, 2005. IJCNN'05. Proceedings. 2005 IEEE International Joint Conference on, vol.2, pp.1211-1216, 2005. ,
DOI : 10.1109/ijcnn.2005.1556026
Incremental classification of invoice documents, Pattern Recognition, 2008. ICPR 2008. 19th International Conference on, pp.1-4, 2008. ,
URL : https://hal.archives-ouvertes.fr/inria-00346942
A new incremental growing neural gas algorithm based on clusters labeling maximization: application to clustering of heterogeneous textual data, Trends in Applied Intelligent Systems, pp.139-148, 2010. ,
URL : https://hal.archives-ouvertes.fr/inria-00535942
Autonomous growing neural gas for applications with time constraint: optimal parameter estimation, Neural Networks, vol.32, pp.196-208, 2012. ,
A review of novelty detection, Signal Processing, vol.99, pp.215-249, 2014. ,
An adaptive incremental clustering method based on the growing neural gas algorithm, ICPRAM, pp.42-49, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00794354
Clustering data streams: Theory and practice. Knowledge and Data Engineering, IEEE Transactions on, vol.15, issue.3, pp.515-528, 2003. ,
DOI : 10.1109/tkde.2003.1198387
Stream data mining repository (web site), 2010. ,
UCI machine learning repository, 2013. ,
Cluster ensembles - a knowledge reuse framework for combining multiple partitions, Journal of Machine Learning Research, vol.3, pp.583-617, 2002. ,
stream: Infrastructure for Data Stream Mining, 2014. ,
Tracking clusters in evolving data streams over sliding windows, Knowl. Inf. Syst, vol.15, issue.2, pp.181-214, 2008. ,
G-stream: Growing neural gas over data stream, Neural Information Processing - 21st International Conference, pp.207-214, 2014. ,
Clustering over data streams based on growing neural gas, Advances in Knowledge Discovery and Data Mining - 19th Pacific-Asia Conference, PAKDD 2015, pp.134-145, 2015. ,
Fusion of big RDF data: A semantic entity resolution and query rewriting-based inference approach, Web Information Systems Engineering - WISE 2015 - 16th International Conference, pp.300-307, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01377590
The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2009. ,
rpart: Recursive Partitioning and Regression Trees, 2015. ,
A hierarchical ant based clustering algorithm and its use in three real-world applications, European Journal of Operational Research, vol.179, issue.3, pp.906-922, 2007.
URL : https://hal.archives-ouvertes.fr/hal-01020927
Growing self-organizing trees for knowledge discovery from data, The 2012 International Joint Conference on Neural Networks (IJCNN), pp.1-8, 2012. ,
Self-organizing trees for visualizing protein dataset, The 2013 International Joint Conference on Neural Networks, IJCNN 2013, pp.1-8, 2013. ,
Tulip : A huge graph visualisation framework, Graph Drawing Softwares, Mathematics and Visualization, pp.105-126, 2003. ,
URL : https://hal.archives-ouvertes.fr/hal-00307626
Clustering binary data streams with k-means, Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, pp.12-19, 2003. ,
Maintaining variance and k-medians over data stream windows, Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp.234-243, 2003.
Better streaming algorithms for clustering problems, Proceedings of the Thirty-fifth Annual ACM Symposium on Theory of Computing, pp.30-39, 2003. ,
DOI : 10.1145/780542.780548
Carte topologique pour données qualitatives: application à la reconnaissance automatique de la densité du trafic routier. Master's thesis, 2003.
Data Analysis, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00447855
Open challenges for data stream mining research, ACM SIGKDD explorations newsletter, vol.16, pp.1-10, 2014. ,
DOI : 10.1145/2674026.2674028
A survey on learning from data streams: current and future trends, Progress in AI, vol.1, issue.1, pp.45-55, 2012. ,
Knowledge Discovery from Data Streams. Chapman and Hall / CRC Data Mining and Knowledge Discovery Series, 2010. ,
DOI : 10.1201/ebk1439826119
Synopses for massive data: Samples, histograms, wavelets, sketches. Foundations and Trends in Databases, vol.4, pp.1-294, 2012. ,
DOI : 10.1561/1900000004
A single pass algorithm for clustering evolving data streams based on swarm intelligence, Data Min. Knowl. Discov, vol.26, issue.1, pp.1-26, 2013. ,
Objective criteria for the evaluation of clustering methods, Journal of the American Statistical Association, vol.66, issue.336, pp.846-850, 1971. ,