Segmentation and classification in digitized document images. Type: Doctoral thesis. Order number: 2012-ISAL0044. Doctoral school: InfoMaths. Specialty:
The images output by the scanner are processed without any a priori information or human intervention. To characterize them, we present a system for analyzing composite color documents that segments each image into colorimetrically homogeneous zones and adapts the text extraction algorithms to the local characteristics of each zone. The colorimetric and textual information provided by this system feeds a physical segmentation method for digitized press pages. The blocks resulting from this decomposition are then classified, which makes it possible, among other things, to detect advertising zones. Continuing and extending the classification work carried out in the first part, we present a new classification and ranking engine that is generic, fast, and easy to use.
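The abstract does not say which algorithm produces the colorimetrically homogeneous zones. As a purely illustrative sketch, assuming plain k-means on RGB values (a generic clustering technique, not necessarily the one used in the thesis), pixels can be grouped into color classes as shown here. The function names, the farthest-point initialization, and the toy pixel values are all hypothetical choices made for the example.

```python
from typing import List, Tuple

Color = Tuple[float, float, float]


def squared_distance(a: Color, b: Color) -> float:
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centers(pixels: List[Color], k: int) -> List[Color]:
    """Deterministic farthest-point initialization (an assumption of this sketch)."""
    centers = [pixels[0]]
    while len(centers) < k:
        # Pick the pixel whose nearest existing center is farthest away.
        farthest = max(
            pixels,
            key=lambda p: min(squared_distance(p, c) for c in centers),
        )
        centers.append(farthest)
    return centers


def assign(pixels: List[Color], centers: List[Color]) -> List[int]:
    """Label each pixel with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in pixels
    ]


def kmeans_colors(
    pixels: List[Color], k: int, iterations: int = 10
) -> Tuple[List[int], List[Color]]:
    """Cluster pixel colors into k classes with plain k-means."""
    centers = init_centers(pixels, k)
    for _ in range(iterations):
        labels = assign(pixels, centers)
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:  # leave an empty cluster's center unchanged
                centers[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    # Final assignment so the labels match the returned centers.
    return assign(pixels, centers), centers


if __name__ == "__main__":
    # Hypothetical flattened toy image: paper-like, ink-like, and red pixels.
    toy = [
        (250, 250, 245), (20, 20, 25), (248, 252, 250), (200, 30, 40),
        (15, 25, 20), (205, 25, 35), (252, 249, 251), (22, 18, 21),
    ]
    labels, centers = kmeans_colors(toy, k=3)
    print(labels)
    print(centers)
```

In a real pipeline the number of classes would have to be estimated per image, and spatial connectivity would still be needed to turn color classes into zones; neither step is covered by this sketch.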
Laboratoire d'InfoRmatique en Images et Systèmes d'information (LIRIS). Jury president: Jury composition: Pr