Simplifying the scale factor determination by normalizing DepthNet
6.5 Our proof of concept
Conclusions for this chapter
Multi range Real-time depth inference from a monocular stabilized footage using a Fully Convolutional Neural Network, 2017. This chapter studies the possibilities of using our network DepthNet to get real-time depth maps.
Perspectives and Conclusions
Contents
7.1 Limitations and future
7.2.2 Robust and evolutive passive depth sensing
In this work, we presented a working strategy to train a neural network to sense depth based on motion from a stabilized camera, for any kind of domain. We even showed that the next step in the general obstacle avoidance strategy, which was to train a neural network to avoid obstacles from a perfect depth map provided by a simulator before concatenating it with a real depth sensing solution, might be replaced by a differentiable MPC.
The strategy presented in the introduction (section 1.4.3) and in figure 1.8 is still valid and might be the subject of future work. However, thanks to a thorough study of the possible drawbacks of depth from vision algorithms, we could identify several drawbacks of DepthNet that might need to be solved.
On the other hand, due to the lack of other validation sets for depth sensing in the context of outdoor flight, we could only perform a subjective validation for the UAV use-case. It thus seems necessary to construct a validation set with reliable ground-truth depth and navigation data, in the same way KITTI was constructed. Several solutions can be considered:
• Construct a UAV with a calibrated Lidar, similarly to KITTI
Large-Scale Data for Multiple-View Stereopsis, International Journal of Computer Vision, pp.1-16, 2016.
Differentiable MPC for End-to-end Planning and Control, Advances in Neural Information Processing Systems, pp.8299-8310, 2018.
A computational framework and an algorithm for the measurement of visual motion, International Journal of Computer Vision, vol.2, issue.3, pp.283-310, 1989.
A database and evaluation methodology for optical flow, International Journal of Computer Vision, vol.92, pp.1-31, 2011.
A More General Robust Loss Function, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Curriculum learning, Proceedings of the 26th annual international conference on machine learning, pp.41-48, 2009.
Active, optical range imaging sensors, Machine vision and applications, vol.1, pp.127-152, 1988.
Large displacement optical flow: descriptor matching in variational motion estimation, IEEE transactions on pattern analysis and machine intelligence, vol.33, pp.500-513, 2011.
Large-scale machine learning with stochastic gradient descent, Proceedings of COMPSTAT'2010, pp.177-186, 2010.
A naturalistic open source movie for optical flow evaluation, European Conf. on Computer Vision (ECCV), Part IV, vol.7577, pp.611-625, 2012.
Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age, IEEE Transactions on Robotics, vol.32, pp.1309-1332, 2016.
Unsupervised Learning of Depth and Ego-Motion: A Structured Approach, p.19, 2019.
Decentred lens-systems, Monthly Notices of the Royal Astronomical Society, vol.79, pp.384-390, 1919.
The Cityscapes Dataset for Semantic Urban Scene Understanding, Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
Subjective contours and apparent depth, Psychological Review, vol.79, p.359, 1972.
A first-order primal-dual algorithm for convex problems with applications to imaging, Journal of mathematical imaging and vision, vol.40, pp.120-145, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00490826
Fast and accurate deep network learning by exponential linear units (ELUs), 2015.
Vestibulo-ocular reflex arc, Archives of Neurology & Psychiatry, vol.30, pp.245-291, 1933.
FlowNet: Learning Optical Flow with Convolutional Networks, IEEE International Conference on Computer Vision (ICCV), 2015.
Visual illusions and neurobiology, Nature Reviews Neuroscience, vol.2, p.920, 2001.
Depth map prediction from a single image using a multi-scale deep network, Advances in neural information processing systems, pp.2366-2374, 2014.
Two-frame motion estimation based on polynomial expansion, Scandinavian conference on Image analysis, pp.363-370, 2003.
Distance behaviour on motorways with regard to active safety: a comparison between adaptive cruise control (ACC) and driver, Proceedings: International Technical Conference on the Enhanced Safety of Vehicles, p.8, 2001.
RGBD Datasets: Past, Present and Future, CVPR Workshop on Large Scale 3D Data: Acquisition, Modelling and Analysis, 2016.
Article L6111-1, Code de l'aviation civile, Livre 1er, 2016.
Monocular visual-inertial SLAM-based collision avoidance strategy for fail-safe UAV using fuzzy logic controllers, Journal of Intelligent & Robotic Systems, vol.73, pp.513-533, 2014.
Deep Ordinal Regression Network for Monocular Depth Estimation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
URL : https://hal.archives-ouvertes.fr/hal-01741163
Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position, Biological cybernetics, vol.36, pp.193-202, 1980.
Unsupervised CNN for single view depth estimation: Geometry to the rescue, European Conference on Computer Vision, pp.740-756, 2016.
Vision meets robotics: The KITTI dataset, The International Journal of Robotics Research, vol.32, pp.1231-1237, 2013.
A machine learning approach to visual perception of forest trails for mobile robots, IEEE Robotics and Automation Letters, vol.1, issue.2, pp.661-667, 2016.
Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite, Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
Digging Into Self-Supervised Monocular Depth Estimation, 2018.
Unsupervised Monocular Depth Estimation with Left-Right Consistency, 2017.
Learning to fly by crashing, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.3948-3955, 2017.
Time-of-flight cameras: principles, methods and applications, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00725654
MatchNet: Unifying feature and metric learning for patch-based matching, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3279-3286, 2015.
Distance and velocity estimation using optical flow from a monocular camera, International Journal of Micro Air Vehicles, vol.9, pp.198-208, 2017.
Deep residual learning for image recognition, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.770-778, 2016.
Deep reinforcement learning that matters, Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
OctoMap: An Efficient Probabilistic 3D Mapping Framework Based on Octrees, Autonomous Robots, 2013.
Determining optical flow, Artificial intelligence, vol.17, pp.185-203, 1981.
Densely connected convolutional networks, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.4700-4708, 2017.
Distilling the Knowledge in a Neural Network, p.9, 2015.
FlowNet 2.0: Evolution of optical flow estimation with deep networks, 2016.
Occlusions, Motion and Depth Boundaries with a Generic Network for Disparity, Optical Flow or Scene Flow Estimation, European Conference on Computer Vision (ECCV), 2018.
Batch normalization: Accelerating deep network training by reducing internal covariate shift, 2015.
Unsupervised Learning of Multi-Frame Optical Flow with Occlusions, European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, vol.11220, pp.713-731, 2018.
The one hundred layers tiramisu: Fully convolutional densenets for semantic segmentation, Computer Vision and Pattern Recognition Workshops, pp.1175-1183, 2017.
Spatial transformer networks, Advances in neural information processing systems, pp.2017-2025, 2015.
Margini Quasi-Percettivi in Campi con Stimolazione Omogenea, Rivista di Psicologia, vol.49, pp.7-30, 1955.
Adam: A Method for Stochastic Optimization, 2014.
What uncertainties do we need in Bayesian deep learning for computer vision?, Advances in neural information processing systems, pp.5574-5584, 2017.
Multi-task learning using uncertainty to weigh losses for scene geometry and semantics, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.7482-7491, 2018.
A History of the Photographic Lens, pp.59-62, 1989.
Parallel tracking and mapping for small AR workspaces, 6th IEEE and ACM International Symposium on, pp.225-234, 2007.
ImageNet classification with deep convolutional neural networks, Advances in neural information processing systems, pp.1097-1105, 2012.
A large-scale hierarchical multi-view RGB-D object dataset, 2011 IEEE international conference on robotics and automation, pp.1817-1824, 2011.
OpenDR: An Approximate Differentiable Renderer, Computer Vision - ECCV 2014, vol.8695.
Gradient-based learning applied to document recognition, Proceedings of the IEEE, vol.86, pp.2278-2324, 1998.
Aggressive 3-d collision avoidance for high-speed navigation, 2017 IEEE International Conference on Robotics and Automation (ICRA), pp.5759-5765, 2017.
Differentiable Monte Carlo Ray Tracing through Edge Sampling, ACM Trans. Graph. (Proc. SIGGRAPH Asia), vol.37, p.11, 2018.
Feature Pyramid Networks for Object Detection, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.936-944, 2017.
Scale-space: A framework for handling image structures at multiple scales, 1996.
SSD: Single shot multibox detector, European conference on computer vision, pp.21-37, 2016.
DroNet: Learning to fly by driving, IEEE Robotics and Automation Letters, vol.3, pp.1088-1095, 2018.
Fully convolutional networks for semantic segmentation, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.3431-3440, 2015.
Some methods for classification and analysis of multivariate observations, Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol.1, pp.281-297, 1967.
ORB-SLAM: a Versatile and Accurate Monocular SLAM System, IEEE Transactions on Robotics, vol.31, pp.1147-1163, 2015.
Toward domain independence for learning-based monocular depth estimation, IEEE Robotics and Automation Letters, vol.2, issue.3, pp.1778-1785, 2017.
Is light in pictures presumed to come from the left side?, Perception, vol.33, issue.12, pp.1421-1436, 2004.
Concrete problems for autonomous vehicle safety: advantages of Bayesian deep learning, International Joint Conferences on Artificial Intelligence, 2017.
An Image-Based Approach to Three-Dimensional Computer Graphics, 1997.
Exposure fusion: A simple and practical alternative to high dynamic range photography, Computer graphics forum, vol.28, pp.161-171, 2009.
Human-level control through deep reinforcement learning, Nature, vol.518, p.529, 2015.
Off-road obstacle avoidance through end-to-end learning, Advances in neural information processing systems, pp.739-746, 2006.
Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints, 2018.
An investigation of smoothness constraints for the estimation of displacement vector fields from image sequences, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.5, pp.565-593, 1986.
Stereovision-based algorithm for obstacle avoidance, International Conference on Intelligent Robotics and Applications, pp.195-204, 2009.
DTAM: Dense tracking and mapping in real-time, Computer Vision (ICCV), pp.2320-2327, 2011.
A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
Automatic differentiation in PyTorch, 2017.
End-to-end depth from motion with stabilized monocular videos, ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol.IV-2/W3, pp.67-74, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01587652
Multi range Real-time depth inference from a monocular stabilized footage using a Fully Convolutional Neural Network, European Conference on Mobile Robotics, ENSTA ParisTech, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01587658
Learning structure-from-motion from motion, ECCV GMDL Workshop, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01995833
Scale-space and edge detection using anisotropic diffusion, IEEE Transactions, vol.12, pp.629-639, 1990.
Early Stopping - But When?, Neural Networks: Tricks of the Trade: Second Edition, pp.53-67, 2012.
Adversarial Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation, 2018.
You only look once: Unified, real-time object detection, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.779-788, 2016.
U-Net: Convolutional networks for biomedical image segmentation, International Conference on Medical image computing and computer-assisted intervention, pp.234-241, 2015.
A reduction of imitation learning and structured prediction to no-regret online learning, Proceedings of the fourteenth international conference on artificial intelligence and statistics, pp.627-635, 2011.
Learning representations by back-propagating errors, Cognitive modeling, vol.5, p.1, 1988.
Model predictive heuristic control, Automatica (Journal of IFAC), vol.14, pp.413-428, 1978.
Nonlinear total variation based noise removal algorithms, Physica D: nonlinear phenomena, vol.60, pp.259-268, 1992.
FitNets: Hints for Thin Deep Nets, Proceedings of ICLR, 2015.
Learning monocular reactive UAV control in cluttered natural environments, 2013 IEEE international conference on robotics and automation, pp.1765-1772, 2013.
The perceptron: a probabilistic model for information storage and organization in the brain, Psychological review, vol.65, p.386, 1958.
Mathematical models for local nontexture inpaintings, SIAM Journal on Applied Mathematics, vol.62, pp.1019-1043, 2002.
Deep learning in neural networks: An overview, Neural networks, pp.85-117, 2015.
Pixelwise View Selection for Unstructured Multi-View Stereo, European Conference on Computer Vision (ECCV), 2016.
3-d depth reconstruction from a single still image, International journal of computer vision, vol.76, pp.53-69, 2008.
AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles, Field and Service Robotics, 2017.
Orientation Maps of Subjective Contours in Visual Cortex, Science, vol.274, 1996.
Mastering the game of Go with deep neural networks and tree search, Nature, vol.529, p.28, 2016.
Optical flow based robot obstacle avoidance, International Journal of Advanced Robotic Systems, vol.4, p.2, 2007.
Large displacement optical flow computation without warping, Computer Vision, pp.1609-1614, 2009.
Monocular obstacle avoidance for blind people using probabilistic focus of expansion estimation, pp.1-9, 2016.
A taxonomy and evaluation of dense two-frame stereo correspondence algorithms, International journal of computer vision, vol.47, pp.7-42, 2002.
A Benchmark for the Evaluation of RGB-D SLAM Systems, Proc. of the International Conference on Intelligent Robot Systems (IROS).
An overview of free view-point depth-image-based rendering (DIBR), 2010.
PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014.
Going deeper with convolutions, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.1-9, 2015.
Neural network studies. 1. Comparison of overfitting and overtraining, Journal of chemical information and computer sciences, vol.35, pp.826-833, 1995.
Sparsity Invariant CNNs, International Conference on 3D Vision (3DV), 2017.
DeMoN: Depth and Motion Network for Learning Monocular Stereo, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
SfM-Net: Learning of Structure and Motion from Video, 2017.
Image quality assessment: from error visibility to structural similarity, IEEE transactions on image processing, vol.13, pp.600-612, 2004.
Learning from delayed rewards, 1989.
Learning Depth from Monocular Videos using Direct Methods, IEEE Conference on Computer Vision and Pattern Recognition, pp.2022-2030.
Maximizing acquisition functions for Bayesian optimization, Advances in Neural Information Processing Systems, pp.9884-9895, 2018.
An improved illumination model for shaded display, ACM SIGGRAPH, p.4, 2005.
Deep3D: Fully automatic 2d-to-3d video conversion with deep convolutional neural networks, European Conference on Computer Vision, pp.842-857, 2016.
SUN3D: A database of big spaces reconstructed using SfM and object labels, Proceedings of the IEEE International Conference on Computer Vision, pp.1625-1632, 2013.
GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose, 2018.
Sense and avoid technologies with applications to unmanned aircraft systems: Review and prospects, Progress in Aerospace Sciences, vol.74, pp.152-166, 2015.
, IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Is L2 a Good Loss Function for Neural Networks for Image Processing, 2015.
Unsupervised Learning of Depth and Ego-Motion from Video, 2017.
MAV navigation through indoor corridors using optical flow, 2010 IEEE International Conference on Robotics and Automation (ICRA), pp.3361-3368, 2010.
Learning to Compare Image Patches via Convolutional Neural Networks, Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
URL : https://hal.archives-ouvertes.fr/hal-01246261
Stereo matching by training a convolutional neural network to compare image patches, Journal of Machine Learning Research, vol.17, pp.1-32, 2016.