. Achanta, SLIC superpixels, École Polytechnique Fédérale de Lausanne (EPFL), Tech. Rep., 2010.

. Aldavert, Real-time object segmentation using a bag of features approach, Artificial Intelligence Research and Development, vol.35, pp.321-329, 2010.

. Asada, Cognitive Developmental Robotics: A Survey, IEEE Transactions on Autonomous Mental Development, vol.1, issue.1, p.18, 2009.
DOI : 10.1109/TAMD.2009.2021702

R. Baillargeon, Young infants' expectations about hidden objects: a reply to three challenges, Developmental Science, vol.2, issue.2, pp.115-132, 1999.
DOI : 10.1111/1467-7687.00061

. Bay, Speeded-Up Robust Features (SURF), Computer Vision and Image Understanding, vol.110, issue.3, pp.346-359, 2008.
DOI : 10.1016/j.cviu.2007.09.014

. Beale, Probabilistic models for robot-based object segmentation, Robotics and Autonomous Systems, vol.59, issue.12, pp.1080-1089, 2011.
DOI : 10.1016/j.robot.2011.08.003

. Benois-Pineau, Visual indexing and retrieval, pp.31-34, 2012.
DOI : 10.1007/978-1-4614-3588-4
URL : https://hal.archives-ouvertes.fr/hal-00695914

D. E. Berlyne, Conflict, arousal, and curiosity, p.20, 1960.
DOI : 10.1037/11164-000

B. I. Bertenthal and K. W. Fischer, Development of self-recognition in the infant, Developmental Psychology, vol.14, issue.1, pp.44-62, 1978.
DOI : 10.1037/0012-1649.14.1.44

J. Bigün, A structure feature for some image processing applications based on spiral functions, Computer Vision, Graphics, and Image Processing, vol.50, issue.2, pp.166-194, 1990.
DOI : 10.1016/0734-189X(90)90047-Y

E. Borenstein and S. Ullman, Class-specific, top-down segmentation, Computer Vision-ECCV 2002, pp.109-122, 2002.
DOI : 10.1007/3-540-47967-8_8
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.585.6495

. Bornstein, Color vision and hue categorization in young human infants., Journal of Experimental Psychology: Human Perception and Performance, vol.2, issue.1, pp.115-130, 1976.
DOI : 10.1037/0096-1523.2.1.115

G. Bouchard and B. Triggs, Hierarchical Part-Based Visual Object Categorization, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), pp.710-715, 2005.
DOI : 10.1109/CVPR.2005.174
URL : https://hal.archives-ouvertes.fr/inria-00548513

. Boureau, Learning mid-level features for recognition, Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pp.2559-2566, 2010.

. Brand, Evidence for 'motionese': modifications in mothers' infant-directed action, Developmental Science, vol.17, issue.1, pp.72-83, 2002.
DOI : 10.1111/1467-7687.00211

. Browatzki, Active object recognition on a humanoid robot, 2012 IEEE International Conference on Robotics and Automation, pp.2021-2028, 2012.
DOI : 10.1109/ICRA.2012.6225218

W. Burger and M. J. Burge, Digital image processing, pp.31-42, 2008.
DOI : 10.1007/978-1-4471-6684-9

. Carbonetto, Learning to Recognize Objects with Little Supervision, International Journal of Computer Vision, vol.73, issue.2, pp.1-3219, 2008.
DOI : 10.1007/s11263-007-0067-7
URL : https://hal.archives-ouvertes.fr/inria-00548668

G. Cauwenberghs and T. Poggio, Incremental and decremental support vector machine learning, Advances in neural information processing systems, pp.409-415, 2001.

. Chandrashekhariah, Let it learn: a curious vision system for autonomous object learning, Proceedings of the International Conference on Computer Vision Theory and Applications, p.28, 2013.

L. B. Cohen and C. H. Cashon, Infant perception and cognition. Handbook of psychology, p.15, 2003.

D. Comaniciu and P. Meer, Mean shift: A robust approach toward feature space analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.24, issue.5, pp.603-619, 2002.

D. J. Crandall and D. P. Huttenlocher, Weakly supervised learning of part-based spatial models for visual object recognition, Computer Vision-ECCV 2006, pp.16-29, 2006.

. Csurka, Visual categorization with bags of keypoints, Workshop on statistical learning in computer vision, ECCV, 2004.

. Dickscheid, Coding Images with Local Features, International Journal of Computer Vision, vol.59, issue.1, pp.154-174, 1997.
DOI : 10.1007/s11263-010-0340-z

. Fei-Fei, What do we perceive in a glance of a real-world scene?, Journal of Vision, vol.7, issue.1, p.12, 2007.
DOI : 10.1167/7.1.10

M. A. Fischler and R. A. Elschlager, The representation and matching of pictorial structures, IEEE Transactions on Computers, vol.100, pp.67-92, 1973.

P. Fitzpatrick and G. Metta, Grounding vision through experimental manipulation, Philosophical Transactions of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, issue.1811, 2003.

. Fitzpatrick, Shared challenges in object perception for robots and infants. Infant and Child Development, pp.7-24, 2008.

W. Förstner, A framework for low level feature extraction, European Conf. on Computer Vision (ECCV), pp.383-394, 1994.
DOI : 10.1007/BFb0028370

. Förstner, Detecting interpretable and accurate scale-invariant keypoints, 2009 IEEE 12th International Conference on Computer Vision, pp.2256-2263, 2009.
DOI : 10.1109/ICCV.2009.5459458

. Fritsch, Improving adaptive skin color segmentation by incorporating results from face detection, Proceedings. 11th IEEE International Workshop on Robot and Human Interactive Communication, pp.337-343, 2002.
DOI : 10.1109/ROMAN.2002.1045645

G. Guennebaud and B. Jacob, Eigen v3, 2010.

E. J. Gibson, Exploratory Behavior in the Development of Perceiving, Acting, and the Acquiring of Knowledge, Annual Review of Psychology, vol.39, issue.1, pp.1-42, 1988.
DOI : 10.1146/annurev.ps.39.020188.000245

K. Gold and B. Scassellati, Learning about the self and others through contingency, AAAI Spring Symposium on Developmental Robotics, p.99, 2005.

E. B. Goldstein, Sensation and perception, pp.28-29, 2010.

B. J. Grzyb and A. P. del Pobil, Developing a sense of bodily self. differentiation, p.99, 2008.

F. Guerin, Learning like a baby: a survey of artificial intelligence approaches, The Knowledge Engineering Review, vol.15, issue.02, pp.209-236, 2011.
DOI : 10.1016/S0378-4754(97)00057-8

M. Haith, Visual scanning in infants, Regional meeting of the Society for Research in Child Development, p.14, 1968.

. Han, Combined feature evaluation for adaptive visual object tracking, Computer Vision and Image Understanding, vol.115, issue.1, pp.69-80, 2011.
DOI : 10.1016/j.cviu.2010.09.004

J. W. Hart and B. Scassellati, Robotic self-models inspired by human development, Metacognition for Robust Social Systems, p.17, 2010.

J. Hérault, Vision : Images, Signals and Neural Networks : Models of Neural Processing in Visual Perception, World Scientific, vol.19, issue.11, p.65, 2010.
DOI : 10.1142/7311

O. Holland, Grey walter : the pioneer of real artificial life, Proceedings of the 5th international workshop on artificial life, pp.34-44, 1997.

. Hulse, Robotic hand-eye coordination without global reference: A biologically inspired learning scheme, 2009 IEEE 8th International Conference on Development and Learning, pp.1-6, 2009.
DOI : 10.1109/DEVLRN.2009.5175514

. Iriki, Coding of modified body schema during tool use by macaque postcentral neurones, Neuroreport, vol.7, issue.14, pp.2325-2330, 1996.

L. Itti and C. Koch, Computational modelling of visual attention, pp.194-203, 2001.
DOI : 10.1038/35058500

. Ivaldi, Computing robot internal/external wrenches by means of inertial, tactile and F/T sensors: theory and implementation on the iCub, Humanoid Robots (Humanoids), 11th IEEE-RAS International Conference on.

. Ivaldi, Perception and human interaction for developmental learning of objects and affordances, 2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), p.123, 2012.
DOI : 10.1109/HUMANOIDS.2012.6651528
URL : https://hal.archives-ouvertes.fr/hal-00755297

. Ivaldi, A cognitive architecture for developmental objects learning through active exploration, Humanoids, 2012. Proceedings. 2012 IEEE International Conference on, p.123, 2012.

. Jégou, Aggregating local descriptors into a compact image representation, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.3304-3311, 2010.
DOI : 10.1109/CVPR.2010.5540039

S. P. Johnson and J. Náñez Sr., Young infant's perception of object unity in two-dimensional displays, Infant Behavior and Development, vol.18, issue.2, pp.133-143, 1995.
DOI : 10.1016/0163-6383(95)90043-8

. Katz, Interactive Perception of Articulated Objects, 12th International Symposium of Experimental Robotics, pp.1-28, 2010.
DOI : 10.1007/978-3-642-28572-1_21

P. J. Kellman and M. E. Arterberry, The cradle of knowledge: Development of perception in infancy, p.16, 2000.

. Kestenbaum, Perception of objects and object boundaries by 3-month-old infants, British Journal of Developmental Psychology, vol.5, issue.4, pp.367-383, 1987.
DOI : 10.1111/j.2044-835X.1987.tb01073.x

I. Kokkinos and A. Yuille, Inference and Learning with Hierarchical Shape Models, International Journal of Computer Vision, vol.18, issue.2, pp.201-225, 2011.
DOI : 10.1007/s11263-010-0398-7
URL : https://hal.archives-ouvertes.fr/hal-00857538

. Kootstra, Exploring objects for recognition in the real world, 2007 IEEE International Conference on Robotics and Biomimetics (ROBIO), pp.429-434, 2007.
DOI : 10.1109/ROBIO.2007.4522200

. Kraft, Development of object and grasping knowledge by robot exploration, IEEE Transactions on Autonomous Mental Development, vol.2, issue.4, pp.368-383, 2010.

. Krüger, Early cognitive vision as a front-end for cognitive systems, ECCV 2010 Workshop on "Vision for Cognitive Tasks", 2010.

V. Kyrki and D. Kragic, Recent Trends in Computational and Robot Vision, Unifying Perspectives in Computational and Robot Vision, pp.1-10, 2008.
DOI : 10.1007/978-0-387-75523-6_1

S. J. Lederman and R. L. Klatzky, Hand movements: A window into haptic object recognition, Cognitive Psychology, vol.19, issue.3, pp.342-368, 1987.
DOI : 10.1016/0010-0285(87)90008-9

D. G. Lowe, Distinctive Image Features from Scale-Invariant Keypoints, International Journal of Computer Vision, vol.60, issue.2, pp.91-110, 2004.
DOI : 10.1023/B:VISI.0000029664.99615.94

M. S. Mahler, The psychological birth of the human infant : Symbiosis and individuation. Basic Books, p.17, 2000.

. Marjanovic, Self-taught visually-guided pointing for a humanoid robot, From Animals to Animats: Proceedings of, p.98, 1996.

. Matas, Robust wide baseline stereo from maximally stable extremal regions, British machine vision conference, pp.384-393, 2002.

G. Metta and P. Fitzpatrick, Early integration of vision and manipulation, Proceedings of the International Joint Conference on Neural Networks, 2003., pp.2703-99, 2003.
DOI : 10.1109/IJCNN.2003.1223994

. Metta, YARP: Yet Another Robot Platform, International Journal of Advanced Robotic Systems, vol.35, issue.2, pp.43-48, 2006.
DOI : 10.5772/5761

. Michel, Motion-based robotic self-recognition, Intelligent Robots and Systems (IROS) IEEE/RSJ International Conference on, pp.2763-2768, 2004.

B. Micusik and J. Kosecka, Semantic segmentation of street scenes by superpixel co-occurrence and 3D geometry, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, pp.625-632, 2009.
DOI : 10.1109/ICCVW.2009.5457645

J. Modayil and B. Kuipers, The initial development of object knowledge by a learning robot, Robotics and Autonomous Systems, vol.56, issue.11, pp.879-890, 2008.
DOI : 10.1016/j.robot.2008.08.004

. Mohan, Inference Through Embodied Simulation in Cognitive Robots, Cognitive Computation, vol.364, issue.4, pp.1-28, 2013.
DOI : 10.1007/s12559-013-9205-4

. Nagi, Max-pooling convolutional neural networks for vision-based hand gesture recognition, 2011 IEEE International Conference on Signal and Image Processing Applications (ICSIPA), pp.342-347, 2011.
DOI : 10.1109/ICSIPA.2011.6144164

A. Needham and R. Baillargeon, Effects of prior experience on 4.5-month-old infants' object segregation, Infant Behavior and Development, pp.1-24, 1998.

. Nguyen, Learning to recognize objects through curiosity-driven manipulation with the iCub humanoid robot, Development and Learning, 2013. Proceedings. 2013 International Conference on, 2013.

. Nguyen, Active choice of teachers, learning strategies and goals for a socially guided intrinsic motivation learner, pp.136-146.

D. Nister and H. Stewenius, Scalable Recognition with a Vocabulary Tree, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 2 (CVPR'06), pp.2161-2168, 2006.
DOI : 10.1109/CVPR.2006.264

L. M. Oakes and H. A. Baumgartner, Manual object exploration and learning about object features in human infants, 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL), pp.1-6, 2012.
DOI : 10.1109/DevLrn.2012.6400819

A. Oliva and A. Torralba, Chapter 2 Building the gist of a scene: the role of global image features in recognition, Progress in brain research, vol.155, pp.23-36, 2006.
DOI : 10.1016/S0079-6123(06)55002-2

. Orabona, Object-based Visual Attention: a Model for a Behaving Robot, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), Workshops, pp.89-89, 2005.
DOI : 10.1109/CVPR.2005.502

. Oudeyer, Intrinsic Motivation Systems for Autonomous Mental Development, IEEE Transactions on Evolutionary Computation, vol.11, issue.2, pp.265-286, 2007.
DOI : 10.1109/TEVC.2006.890271

L. Paletta and A. Pinz, Active object recognition by view integration and reinforcement learning, Robotics and Autonomous Systems, vol.31, issue.1-2, pp.71-86, 2000.
DOI : 10.1016/S0921-8890(99)00079-2
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.5843

A. Paternoster, Vision Science and the Problem of Perception, Cartographies of the Mind, pp.53-64, 2007.
DOI : 10.1007/1-4020-5444-0_4

J. Piaget, Play, dreams and imitation in childhood, Routledge, vol.16, pp.14-17, 1999.

A. Prest, Weakly supervised methods for learning actions and objects, p.28, 2012.
URL : https://hal.archives-ouvertes.fr/tel-00758797

Z. W. Pylyshyn, Visual indexes, preconceptual objects, and situated vision, Cognition, vol.80, issue.1-2, pp.127-158, 2001.
DOI : 10.1016/S0010-0277(00)00156-6

R. A. Rensink, Seeing, sensing, and scrutinizing, Vision Research, issue.10-12, p.1469, 2000.

P. Rochat, The infant's world.

. Rohlfing, How can multimodal cues from child-directed interaction reduce learning complexity in robots?, Advanced Robotics, vol.20, issue.10, pp.1183-1199, 2006.
DOI : 10.1163/156855306778522532

. Rouanet, An integrated system for teaching new visually grounded words to a robot for non-expert users using a mobile device, 2009 9th IEEE-RAS International Conference on Humanoid Robots, p.28, 2009.
DOI : 10.1109/ICHR.2009.5379540
URL : https://hal.archives-ouvertes.fr/inria-00420249

. Rudinac, Learning and recognition of objects inspired by early cognition, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp.4177-4184, 2012.
DOI : 10.1109/IROS.2012.6385895

. Saegusa, Body definition based on visuomotor correlation, IEEE Transactions on Industrial Electronics, vol.59, issue.8, pp.3199-3210, 2012.
DOI : 10.1109/tie.2011.2157280

. Saegusa, Action learning based on developmental body perception, 2013 IEEE International Conference on Industrial Technology (ICIT), p.103, 2013.
DOI : 10.1109/ICIT.2013.6505961

J. Shi and C. Tomasi, Good features to track, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp.593-600, 1994.

F. Y. Shih, Image processing and mathematical morphology : Fundamentals and applications, p.48, 2009.
DOI : 10.1201/9781420089448

. Shotton, Contour-based learning for object detection, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1, pp.503-510, 2005.
DOI : 10.1109/ICCV.2005.63

. Shotton, Semantic texton forests for image categorization and segmentation, 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2008.
DOI : 10.1109/CVPR.2008.4587503

C. Siagian and L. Itti, Rapid biologically-inspired scene classification using features shared with visual attention, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.29, issue.28, pp.300-312, 2007.

. Sivic, Discovering object categories in image collections, Proceedings of the International Conference on Computer Vision, p.41, 2005.

J. Sivic and A. Zisserman, Video Google: a text retrieval approach to object matching in videos, Proceedings Ninth IEEE International Conference on Computer Vision, pp.1470-1477, 2003.
DOI : 10.1109/ICCV.2003.1238663

. Slater, Form perception at birth: revisited, Journal of Experimental Child Psychology, vol.51, issue.3, pp.395-406, 1991.
DOI : 10.1016/0022-0965(91)90084-6

L. Smith and M. Gasser, The Development of Embodied Cognition: Six Lessons from Babies, Artificial Life, vol.45, issue.3, pp.13-29, 2005.
DOI : 10.1126/science.134.3491.1692

T. Southey and J. J. Little, Object discovery through motion, appearance and shape, AAAI Workshop on Cognitive Robotics, pp.9-28, 2006.

D. M. Tax, One-class classification, 2001.

A. Treisman and S. Gormican, Feature analysis in early vision: Evidence from search asymmetries, Psychological Review, vol.95, issue.1, pp.15-48, 1988.

. Ude et al., Making object learning and recognition an active process, International Journal of Humanoid Robotics.

S. Ullman, Three-dimensional object recognition based on the combination of views, Cognition, vol.67, issue.1-2, pp.21-44, 1998.
DOI : 10.1016/S0010-0277(98)00013-4

. Van-de-walle, Bases for Object Individuation in Infancy: Evidence From Manual Search, Journal of Cognition and Development, vol.12, issue.3, pp.249-280, 2000.
DOI : 10.1016/S0010-0277(99)00007-4

H. Van, Maximally informative interaction learning for scene exploration, Intelligent Robots and Systems (IROS), 2012 IEEE International Conference on, p.101, 2012.

P. Viola and M. J. Jones, Robust Real-Time Face Detection, International Journal of Computer Vision, vol.57, issue.2, pp.137-154, 2004.
DOI : 10.1023/B:VISI.0000013087.49260.fb
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.9805

F. C. Volkmann and M. V. Dobson, Infant responses of ocular fixation to moving visual stimuli, Journal of Experimental Child Psychology, vol.22, issue.1, pp.86-99, 1976.
DOI : 10.1016/0022-0965(76)90092-8

A. Vyshedskiy, On the origin of the human mind. Scientific American.

D. Walther and C. Koch, Modeling attention to salient proto-objects, Neural Networks, pp.1395-1407.

. Weng, Autonomous mental development by robots and animals, Science, issue.5504, 2001.

. Wersing, Online learning of objects in a biologically motivated visual architecture, International Journal of Neural Systems, vol.17, issue.04, pp.219-230, 2007.
DOI : 10.1142/S0129065707001081

. Zhou, Visual information abstraction for interactive robot learning, 2011 15th International Conference on Advanced Robotics (ICAR), pp.328-334, 2011.
DOI : 10.1109/ICAR.2011.6088626

. Zhu, Segmenting hands of arbitrary color, Automatic Face and Gesture Recognition Proceedings. Fourth IEEE International Conference on, pp.446-453, 2000.
