, 3) - original "basic" block

B(1, 3, 1) - with the same dimensionality of all convolutions

B(3, 1, 1) - Network-in-Network style block
