Sequence segmentation using joint RNN and structured prediction models, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.2422-2426, 2017.
Faruk Ahmed, Daniel Tarlow and Dhruv Batra. Optimizing expected intersection-over-union with candidate-constrained CRFs, Proceedings of the IEEE International Conference on Computer Vision, pp.1850-1858, 2015.
Higher Order Conditional Random Fields in Deep Neural Networks, European Conference on Computer Vision, pp.524-540, 2016.
DOI : 10.1109/CVPR.2014.119
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, arXiv CoRR, p.78, 2015.
DOI : 10.1109/TPAMI.2016.2644615
The fast bilateral solver, ECCV, pp.14-26, 2016.
Representation Learning: A Review and New Perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, pp.1798-1828, 2013.
DOI : 10.1109/TPAMI.2013.50
Optimization of the Jaccard index for image segmentation with the Lovász hinge, CoRR, vol.abs, 1705.
Convolutional Random Walk Networks for Semantic Image Segmentation, 2016.
Stephen Boyd and Lieven Vandenberghe. Convex optimization, 2004.
GPstruct: Bayesian Structured Prediction Using Gaussian Processes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.37, issue.7, pp.1514-1520, 2015.
DOI : 10.1109/TPAMI.2014.2366151
Leo Breiman. Random forests, Machine Learning, vol.45, issue.1, pp.5-32, 2001.
Segmentation and Recognition Using Structure from Motion Point Clouds, ECCV, 2008.
DOI : 10.1109/CVPR.2006.305
One-Shot Video Object Segmentation, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.565
Fully-Connected CRFs with Non-Parametric Pairwise Potentials, CVPR, 2013.
The devil is in the details: an evaluation of recent feature encoding methods, Proceedings of the British Machine Vision Conference 2011, 2011.
DOI : 10.5244/C.25.76
Return of the Devil in the Details: Delving Deep into Convolutional Nets, Proceedings of the British Machine Vision Conference 2014, 2014.
DOI : 10.5244/C.28.6
Semantic image segmentation with deep convolutional nets and fully connected CRFs, arXiv preprint arXiv:1412, pp.42-43, 2014.
Detect What You Can: Detecting and Representing Objects Using Holistic Models and Body Parts, 2014 IEEE Conference on Computer Vision and Pattern Recognition, p.63, 2014.
DOI : 10.1109/CVPR.2014.254
Learning Deep Structured Models, ICML, pp.14-25, 2015.
Weakly- and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation, ICCV, vol.14, pp.20-26, 2015.
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, pp.45-61, 2016.
Attention to Scale: Scale-aware Semantic Image Segmentation, CVPR, pp.40-63, 2016.
Rethinking Atrous Convolution for Semantic Image Segmentation, 2017.
The computational complexity of probabilistic inference using Bayesian belief networks, Artificial Intelligence, vol.42, issue.2-3, pp.393-405, 1990.
The Cityscapes dataset for semantic urban scene understanding, 2016.
Support-vector networks, Machine Learning, vol.20, issue.3, pp.273-297, 1995.
DOI : 10.1007/BF00994018
Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pp.886-893, 2005.
ImageNet: A large-scale hierarchical image database, 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009.
DOI : 10.1109/CVPR.2009.5206848
Efficient continuous relaxations for dense CRF, European Conference on Computer Vision, pp.818-833, 2016.
Learning to Rank using High-Order Information, ECCV, 2014.
Long-term recurrent convolutional networks for visual recognition and description, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.2625-2634, 2015.
Notes on Matrix Calculus, pp.35-56, 2005.
Scene parsing with Multiscale Feature Learning, Purity Trees, and Optimal Covers, ICML, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00715469
Learning Hierarchical Features for Scene Labeling, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, 2013.
DOI : 10.1109/TPAMI.2012.231
URL : https://hal.archives-ouvertes.fr/hal-00742077
Semantic Instance Segmentation via Deep Metric Learning, CoRR, 1703.
Stacked Deconvolutional Network for Semantic Segmentation, CoRR, vol.abs, 1708.
Semantic Video CNNs Through Representation Warping, 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
DOI : 10.1109/ICCV.2017.477
URL : http://arxiv.org/pdf/1708.03088
Y. Ganin and V. Lempitsky. N^4-Fields: Neural network nearest neighbor fields for image transforms, ACCV, 2014.
Simultaneous Detection and Segmentation, ECCV, 2014.
DOI : 10.1007/978-3-319-10584-0_20
Matrix Computations, vol.3, issue.39, p.38, 1996.
Generative adversarial nets, Advances in Neural Information Processing Systems, pp.2672-2680, 2014.
Random Walks for Image Segmentation, PAMI, 2006.
DOI : 10.1109/TPAMI.2006.233
Learning Rich Features from RGB-D Images for Object Detection and Segmentation, ECCV, 2014.
DOI : 10.1007/978-3-319-10584-0_23
Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit, Nature, vol.405, issue.6789, p.947, 2000.
Hypercolumns for object segmentation and fine-grained localization, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.6-61, 2015.
DOI : 10.1109/CVPR.2015.7298642
Konstantinos Derpanis and Iasonas Kokkinos. Deep networks for saliency detection via local estimation and global search, CVPR, 2015.
Derpanis and Iasonas Kokkinos. Learning Dense Convolutional Embeddings for Semantic Segmentation, ICCV, 2017.
Kaiming He, Xiangyu Zhang, Shaoqing Ren and Jian Sun. Deep Residual Learning for Image Recognition, CVPR, 2016.
Densely Connected Convolutional Networks, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.243
FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.179
Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift, International Conference on Machine Learning, pp.448-456, 2015.
FusionSeg: Learning to combine motion and appearance for fully automatic segmentation of generic objects in videos, arXiv preprint, 2017.
Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.4452-4461, 2016.
DOI : 10.1109/CVPR.2016.482
Video Propagation Networks, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
Regression Tree Fields - An Efficient, Non-parametric Approach to Image Labeling Problems, CVPR, 2012.
The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp.1175-1183, 2017.
DOI : 10.1109/CVPRW.2017.156
Caffe: Convolutional Architecture for Fast Feature Embedding, arXiv preprint, 2014.
Video Scene Parsing with Predictive Feature Learning, CoRR, 1612.
Large-Scale Video Classification with Convolutional Neural Networks, 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp.1725-1732, 2014.
DOI : 10.1109/CVPR.2014.223
Bayesian SegNet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding, arXiv CoRR, 2015.
Lucid Data Dreaming for Object Tracking, arXiv preprint, 2017.
Iasonas Kokkinos. Pushing the Boundaries of Boundary Detection using Deep Learning, ICLR, p.62, 2016.
Iasonas Kokkinos. UberNet: A Universal CNN for the joint treatment of Low-, Mid-, and High-Level Vision Problems, CVPR, p.64, 2017.
Probabilistic Graphical Models: Principles and Techniques, 2007.
Convergent Tree-Reweighted Message Passing for Energy Minimization, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.28, issue.10, pp.1568-1583, 2006.
DOI : 10.1109/TPAMI.2006.200
Minimizing Nonsubmodular Functions with Graph Cuts - A Review, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.29, issue.7, 2007.
DOI : 10.1109/TPAMI.2007.1031
Philipp Krähenbühl and Vladlen Koltun. Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, NIPS, pp.13-43, 2011.
ImageNet classification with deep convolutional neural networks, NIPS, 2012.
DOI : 10.1162/neco.2009.10-08-881
Feature Space Optimization for Semantic Video Segmentation, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.3168-3175, 2016.
DOI : 10.1109/CVPR.2016.345
Conditional random fields: Probabilistic models for segmenting and labeling sequence data, 2001.
Iro Laina, Christian Rupprecht, Vasileios Belagiannis, Federico Tombari and Nassir Navab. Deeper depth prediction with fully convolutional residual networks, 3D Vision, pp.239-248, 2016.
Gradient-based learning applied to document recognition, Proceedings of the IEEE, 1998.
Microsoft COCO: Common objects in context, ECCV, 2014.
Catalin Ionescu and Cristian Sminchisescu. Random Fourier approximations for skewed multiplicative histogram kernels, Joint Pattern Recognition Symposium, pp.262-271, 2010.
The Secrets of Salient Object Segmentation, 2014 IEEE Conference on Computer Vision and Pattern Recognition, p.64, 2014.
DOI : 10.1109/CVPR.2014.43
Visual saliency based on multiscale deep features, CVPR, p.64, 2015.
Deep Contrast Learning for Salient Object Detection, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.64, 2016.
DOI : 10.1109/CVPR.2016.58
URL : http://arxiv.org/pdf/1603.01976
Xiaoou Tang and Chen Change Loy. Video Object Segmentation with Re-identification, arXiv preprint, 2017.
, Bibliography
Semantic Object Parsing with Graph LSTM, European Conference on Computer Vision, pp.125-143, 2016.
DOI : 10.1162/neco.1997.9.8.1735
Semantic Object Parsing with Local-Global Long Short-Term Memory, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.63, 2016.
DOI : 10.1109/CVPR.2016.347
Efficient piecewise training of deep structured models for semantic segmentation, 2016.
Deep convolutional neural fields for depth estimation from a single image, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.26-32, 2015.
DOI : 10.1109/CVPR.2015.7299152
Semantic Image Segmentation via Deep Parsing Network, 2015 IEEE International Conference on Computer Vision (ICCV), pp.1377-1385, 2015.
DOI : 10.1109/ICCV.2015.162
Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.38, issue.10, pp.2024-2039, 2016.
DOI : 10.1109/TPAMI.2015.2505283
Fully convolutional networks for semantic segmentation, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.3431-3440, 2015.
DOI : 10.1109/CVPR.2015.7298965
Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, vol.60, issue.2, pp.91-110, 2004.
Multi-digit recognition using a space displacement neural network, Advances in Neural Information Processing Systems, pp.488-495, 1992.
Feedforward semantic segmentation with zoom-out features, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298959
Alejandro Newell and Jia Deng. Associative Embedding: End-to-End Learning for Joint Detection and Grouping, 1611.
Semantic Video Segmentation by Gated Recurrent Flow Propagation, CoRR, vol.abs, 1612.
Learning Deconvolution Network for Semantic Segmentation, 2015 IEEE International Conference on Computer Vision (ICCV), 2015.
DOI : 10.1109/ICCV.2015.178
Structured learning and prediction in computer vision, Foundations and Trends in Computer Graphics and Vision, vol.6, issue.3-4, pp.185-365, 2011.
Gaussian Sampling by Local Perturbations, Proc. Int. Conf. on Neural Information Processing Systems (NIPS), pp.1858-1866, 2010.
Perturb-and-MAP random fields: Using discrete optimization to learn and sample from energy models, 2011 International Conference on Computer Vision, pp.193-200, 2011.
DOI : 10.1109/ICCV.2011.6126242
A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.85
Learning Video Object Segmentation from Static Images, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.71, 2017.
DOI : 10.1109/CVPR.2017.372
URL : http://arxiv.org/pdf/1612.02646
Florent Perronnin, Jorge Sánchez and Thomas Mensink. Improving the Fisher kernel for large-scale image classification, European Conference on Computer Vision, pp.143-156, 2010.
Recurrent convolutional neural networks for scene labeling, ICML, 2014.
Alexander Sorkine-Hornung and Luc Van Gool. The 2017 DAVIS Challenge on Video Object Segmentation, 2017.
Numerical recipes in C, p.38, 1992.
On the hardness of approximate reasoning, Artificial Intelligence, vol.82, issue.1-2, pp.273-302, 1996.
DOI : 10.1016/0004-3702(94)00092-1
H. Rue and L. Held. Gaussian Markov random fields: Theory and applications, pp.14-39, 2005.
Deep Learning in Neural Networks: An Overview, Neural Networks, vol.61, pp.85-117, 2015.
Fully connected deep structured networks, arXiv preprint, 2015.
Soumith Chintala and Yann LeCun. Pedestrian detection with unsupervised multi-stage feature learning, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3626-3633, 2013.
Clockwork Convnets for Video Semantic Segmentation, CoRR, vol.abs, 1608.
An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, pp.34-52, 1994.
TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context, International Journal of Computer Vision, vol.62, issue.1-2, pp.2-23, 2009.
DOI : 10.1007/s11263-005-4635-4
Very deep convolutional networks for large-scale image recognition, ICLR, 2015.
Piecewise training for undirected models, arXiv preprint, 2012.
An Introduction to Conditional Random Fields, Machine Learning, pp.267-373, 2012.
DOI : 10.1561/2200000013
Learning Gaussian Conditional Random Fields for Low-Level Vision, CVPR, pp.14-29, 2007.
The logistic random field - A convenient graphical model for learning parameters for MRF-based labeling, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1-8, 2008.
Learning associative Markov networks, Twenty-first International Conference on Machine Learning, ICML '04, p.102, 2004.
DOI : 10.1145/1015330.1015444
URL : http://www.aicml.cs.ualberta.ca/banff04/icml/pages/papers/394.ps
Efficient Additive Kernels via Explicit Feature Maps, Proceedings of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2010.
DOI : 10.1109/cvpr.2010.5539949
URL : http://eprints.pascal-network.org/archive/00006964/01/vedaldi10.pdf
Deep Gaussian Conditional Random Field Network: A Model-Based Deep Network for Discriminative Denoising, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.519
Gaussian Conditional Random Field Network for Semantic Segmentation, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.29-31, 2016.
DOI : 10.1109/CVPR.2016.351
Improved Initialization and Gaussian Mixture Pairwise Terms for Dense Random Fields with Mean-field Inference, Proceedings of the British Machine Vision Conference 2012, 2013.
DOI : 10.5244/C.26.73
URL : http://www.bmva.org/bmvc/2012/BMVC/paper073/paper073.pdf
ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2016.
DOI : 10.1109/CVPRW.2016.60
URL : http://arxiv.org/pdf/1511.07053
Paul Voigtlaender and Bastian Leibe. Online Adaptation of Convolutional Neural Networks for Video Object Segmentation, BMVC, pp.71-78, 2017.
Context-aware CNNs for person head detection, ICCV, pp.2893-2901, 2015.
Graphical Models, Exponential Families, and Variational Inference, Found. Trends Mach. Learn., vol.1, issue.1-2, pp.136-138, 2008.
PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures With Edge-Preserving Coherence, IEEE Transactions on Image Processing, vol.24, issue.10, 2015.
DOI : 10.1109/TIP.2015.2432712
URL : http://arxiv.org/pdf/1505.03227
Deep networks for saliency detection via local estimation and global search, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.64, 2015.
DOI : 10.1109/CVPR.2015.7298938
URL : http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Wang_Deep_Networks_for_2015_CVPR_paper.pdf
Shenlong Wang, Sanja Fidler and Raquel Urtasun. Proximal deep structured models, Advances in Neural Information Processing Systems, pp.865-873, 2016.
Non-local Neural Networks, CoRR, 1711.
Postal address block location using a convolutional locator network, Advances in Neural Information Processing Systems, pp.745-752, 1994.
Zoom better to see clearer: Human part segmentation with auto zoom net, ECCV, p.63, 2016.
DOI : 10.1007/978-3-319-46454-1_39
URL : http://arxiv.org/pdf/1511.06881
Holistically-nested edge detection, ICCV, pp.1395-1403, 2015.
Multi-scale context aggregation by dilated convolutions, ICLR, 2016.
CASENet: Deep Category-Aware Semantic Edge Detection, arXiv preprint, 2017.
DOI : 10.1109/cvpr.2017.191
URL : http://arxiv.org/pdf/1705.09759
A support vector method for optimizing average precision, Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '07, pp.271-278, 2007.
DOI : 10.1145/1277741.1277790
URL : http://radlinski.org/papers/YueEtAl_SIGIR2007.pdf
Sergey Zagoruyko and Nikos Komodakis. Wide Residual Networks, BMVC, 2016.
Saliency detection by multicontext deep learning, CVPR, p.64, 2015.
DOI : 10.1109/cvpr.2015.7298731
Pyramid Scene Parsing Network, CoRR, vol.63, pp.71-82, 1105.
Conditional Random Fields as Recurrent Neural Networks, ICCV, 2015.
Deep Feature Flow for Video Recognition, CoRR, vol.abs, 1611.