Y. Adi, J. Keshet, E. Cibelli, and M. Goldrick, Sequence segmentation using joint RNN and structured prediction models, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.2422-2426, 2017.

F. Ahmed, D. Tarlow, and D. Batra, Optimizing expected intersection-over-union with candidate-constrained CRFs, Proceedings of the IEEE International Conference on Computer Vision, pp.1850-1858, 2015.

A. Arnab, S. Jayasumana, S. Zheng, and P. H. S. Torr, Higher Order Conditional Random Fields in Deep Neural Networks, European Conference on Computer Vision, pp.524-540, 2016.
DOI : 10.1109/CVPR.2014.119

V. Badrinarayanan, A. Kendall, and R. Cipolla, SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, arXiv CoRR, p.78, 2015.
DOI : 10.1109/TPAMI.2016.2644615

J. T. Barron and B. Poole, The fast bilateral solver, ECCV, pp.14-26, 2016.

Y. Bengio, A. Courville, and P. Vincent, Representation Learning: A Review and New Perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, pp.1798-1828, 2013.
DOI : 10.1109/TPAMI.2013.50

M. Berman and M. B. Blaschko, Optimization of the Jaccard index for image segmentation with the Lovász hinge, CoRR, vol.abs, 1705.

G. Bertasius, L. Torresani, S. X. Yu, and J. Shi, Convolutional Random Walk Networks for Semantic Image Segmentation, CoRR, 1605, 2016.

S. Boyd and L. Vandenberghe, Convex Optimization, 2004.

S. Bratieres, N. Quadrianto, and Z. Ghahramani, GPstruct: Bayesian Structured Prediction Using Gaussian Processes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.37, issue.7, pp.1514-1520, 2015.
DOI : 10.1109/TPAMI.2014.2366151

L. Breiman, Random forests, Machine Learning, vol.45, issue.1, pp.5-32, 2001.

G. Brostow, J. Shotton, J. Fauqueur, and R. Cipolla, Segmentation and Recognition Using Structure from Motion Point Clouds, ECCV, 2008.
DOI : 10.1109/CVPR.2006.305

S. Caelles, K. Maninis, J. Pont-Tuset, L. Leal-Taixé, D. Cremers et al., One-Shot Video Object Segmentation, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.565

N. D. F. Campbell, K. Subr, and J. Kautz, Fully-Connected CRFs with Non-Parametric Pairwise Potentials, CVPR, 2013.

K. Chatfield, V. Lempitsky, A. Vedaldi, and A. Zisserman, The devil is in the details: an evaluation of recent feature encoding methods, Proceedings of the British Machine Vision Conference 2011, 2011.
DOI : 10.5244/C.25.76

K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman, Return of the Devil in the Details: Delving Deep into Convolutional Nets, Proceedings of the British Machine Vision Conference 2014, 2014.
DOI : 10.5244/C.28.6

L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, Semantic image segmentation with deep convolutional nets and fully connected CRFs, arXiv preprint arXiv:1412, pp.42-43, 2014.

X. Chen, R. Mottaghi, X. Liu, S. Fidler, R. Urtasun et al., Detect What You Can: Detecting and Representing Objects Using Holistic Models and Body Parts, 2014 IEEE Conference on Computer Vision and Pattern Recognition, p.63, 2014.
DOI : 10.1109/CVPR.2014.254

L.-C. Chen, A. G. Schwing, A. L. Yuille, and R. Urtasun, Learning Deep Structured Models, ICML, pp.14-25, 2015.

L.-C. Chen, G. Papandreou, K. Murphy, and A. L. Yuille, Weakly- and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation, ICCV, vol.14, pp.20-26, 2015.

L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, pp.45-61, 2016.

L.-C. Chen, Y. Yang, J. Wang, W. Xu, and A. L. Yuille, Attention to Scale: Scale-aware Semantic Image Segmentation, CVPR, pp.40-63, 2016.

L.-C. Chen, G. Papandreou, F. Schroff, and H. Adam, Rethinking Atrous Convolution for Semantic Image Segmentation, CoRR, 1706, 2017.

G. F. Cooper, The computational complexity of probabilistic inference using Bayesian belief networks, Artificial Intelligence, vol.42, issue.2-3, pp.393-405, 1990.

M. Cordts, M. Omran, S. Ramos, T. Scharwächter, M. Enzweiler, R. Benenson et al., The Cityscapes dataset for semantic urban scene understanding, 2016.

C. Cortes and V. Vapnik, Support-vector networks, Machine Learning, vol.20, issue.3, pp.273-297, 1995.
DOI : 10.1007/BF00994018

N. Dalal and B. Triggs, Histograms of oriented gradients for human detection, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pp.886-893, 2005.

J. Deng, W. Dong, R. Socher, L. Li, K. Li et al., ImageNet: A large-scale hierarchical image database, 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009.
DOI : 10.1109/CVPR.2009.5206848

A. Desmaison, R. Bunel, P. Kohli, P. H. S. Torr, and M. P. Kumar, Efficient continuous relaxations for dense CRF, European Conference on Computer Vision, pp.818-833, 2016.

P. K. Dokania, A. Behl, C. V. Jawahar, and M. P. Kumar, Learning to Rank using High-Order Information, ECCV, 2014.

J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan et al., Long-term recurrent convolutional networks for visual recognition and description, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.2625-2634, 2015.

P. L. Fackler, Notes on Matrix Calculus, pp.35-56, 2005.

C. Farabet, C. Couprie, L. Najman, and Y. LeCun, Scene parsing with Multiscale Feature Learning, Purity Trees, and Optimal Covers, ICML, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00715469

C. Farabet, C. Couprie, L. Najman, and Y. LeCun, Learning Hierarchical Features for Scene Labeling, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.35, issue.8, 2013.
DOI : 10.1109/TPAMI.2012.231
URL : https://hal.archives-ouvertes.fr/hal-00742077

A. Fathi, Z. Wojna, V. Rathod, P. Wang, H. O. Song et al., Semantic Instance Segmentation via Deep Metric Learning. CoRR, 1703.

J. Fu, J. Liu, Y. Wang, and H. Lu, Stacked Deconvolutional Network for Semantic Segmentation, CoRR, vol.abs, 1708.

R. Gadde, V. Jampani, and P. V. Gehler, Semantic Video CNNs Through Representation Warping, 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
DOI : 10.1109/ICCV.2017.477
URL : http://arxiv.org/pdf/1708.03088

Y. Ganin and V. Lempitsky, N^4-Fields: Neural network nearest neighbor fields for image transforms, ACCV, 2014.

B. Hariharan, P. Arbeláez, R. Girshick, and J. Malik, Simultaneous Detection and Segmentation, ECCV, 2014.
DOI : 10.1007/978-3-319-10584-0_20

G. H. Golub and C. F. Van Loan, Matrix Computations, vol.3, issue.39, p.38, 1996.

I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley et al., Generative adversarial nets, Advances in Neural Information Processing Systems, pp.2672-2680, 2014.

L. Grady, Random Walks for Image Segmentation, PAMI, 2006.
DOI : 10.1109/TPAMI.2006.233

S. Gupta, R. Girshick, P. Arbeláez, and J. Malik, Learning Rich Features from RGB-D Images for Object Detection and Segmentation, ECCV, 2014.
DOI : 10.1007/978-3-319-10584-0_23

R. H. R. Hahnloser, R. Sarpeshkar, M. A. Mahowald et al., Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit, Nature, vol.405, issue.6789, p.947, 2000.

B. Hariharan, P. Arbeláez, R. Girshick, and J. Malik, Hypercolumns for object segmentation and fine-grained localization, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.6-61, 2015.
DOI : 10.1109/CVPR.2015.7298642

, Konstantinos Derpanis and Iasonas Kokkinos Deep networks for saliency detection via local estimation and global search, CVPR, p.2015, 2015.

A. W. Harley, K. G. Derpanis, and I. Kokkinos, Learning Dense Convolutional Embeddings for Semantic Segmentation, ICCV, 2017.

K. He, X. Zhang, S. Ren, and J. Sun, Deep Residual Learning for Image Recognition, CVPR, 2016.

G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten, Densely Connected Convolutional Networks, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.243

E. Ilg, N. Mayer, T. Saikia, M. Keuper, A. Dosovitskiy et al., FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
DOI : 10.1109/CVPR.2017.179

S. Ioffe and C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, International Conference on Machine Learning, pp.448-456, 2015.

S. Jain, B. Xiong, and K. Grauman, FusionSeg: Learning to combine motion and appearance for fully automatic segmention of generic objects in videos, arXiv preprint, 2017.

V. Jampani, M. Kiefel, and P. V. Gehler, Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.4452-4461, 2016.
DOI : 10.1109/CVPR.2016.482

V. Jampani, R. Gadde, and P. V. Gehler, Video Propagation Networks, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

J. Jancsary, S. Nowozin, T. Sharp, and C. Rother, Regression Tree Fields - An Efficient, Non-parametric Approach to Image Labeling Problems, CVPR, 2012.

S. Jégou, M. Drozdzal, D. Vazquez, A. Romero, and Y. Bengio, The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp.1175-1183, 2017.
DOI : 10.1109/CVPRW.2017.156

Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long et al., Caffe: Convolutional Architecture for Fast Feature Embedding, arXiv preprint, 2014.

X. Jin, X. Li, H. Xiao, X. Shen, Z. Lin et al., Video Scene Parsing with Predictive Feature Learning. CoRR, pp.119-2016, 1612.

A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar et al., Large-Scale Video Classification with Convolutional Neural Networks, 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp.1725-1732, 2014.
DOI : 10.1109/CVPR.2014.223

A. Kendall, V. Badrinarayanan, and R. Cipolla, Bayesian SegNet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding, arXiv CoRR, 2015.

A. Khoreva, R. Benenson, E. Ilg, T. Brox, and B. Schiele, Lucid Data Dreaming for Object Tracking. arXiv preprint, 2017.

I. Kokkinos, Pushing the Boundaries of Boundary Detection using Deep Learning, ICLR, p.62, 2016.

I. Kokkinos, UberNet: A Universal CNN for the joint treatment of Low-, Mid-, and High-Level Vision Problems, CVPR, p.64, 2017.

D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, 2007.

V. Kolmogorov, Convergent Tree-Reweighted Message Passing for Energy Minimization, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.28, issue.10, pp.1568-1583, 2006.
DOI : 10.1109/TPAMI.2006.200

V. Kolmogorov and C. Rother, Minimizing Nonsubmodular Functions with Graph Cuts-A Review, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.29, issue.7, 2007.
DOI : 10.1109/TPAMI.2007.1031

P. Krähenbühl and V. Koltun, Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, NIPS, pp.13-43, 2011.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, NIPS, 2012.
DOI : 10.1162/neco.2009.10-08-881

A. Kundu, V. Vineet, and V. Koltun, Feature Space Optimization for Semantic Video Segmentation, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.3168-3175, 2016.
DOI : 10.1109/CVPR.2016.345

J. Lafferty, A. McCallum, and F. C. N. Pereira, Conditional random fields: Probabilistic models for segmenting and labeling sequence data, 2001.

I. Laina, C. Rupprecht, V. Belagiannis, F. Tombari, and N. Navab, Deeper depth prediction with fully convolutional residual networks, 3D Vision, pp.239-248, 2016.

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, 1998.

T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona et al., Microsoft COCO: Common objects in context, ECCV, 2014.

F. Li, C. Ionescu, and C. Sminchisescu, Random Fourier approximations for skewed multiplicative histogram kernels, Joint Pattern Recognition Symposium, pp.262-271, 2010.

Y. Li, X. Hou, C. Koch, J. M. Rehg, and A. L. Yuille, The Secrets of Salient Object Segmentation, 2014 IEEE Conference on Computer Vision and Pattern Recognition, p.64, 2014.
DOI : 10.1109/CVPR.2014.43

G. Li and Y. Yu, Visual saliency based on multiscale deep features, CVPR, p.64, 2015.

G. Li and Y. Yu, Deep Contrast Learning for Salient Object Detection, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.64, 2016.
DOI : 10.1109/CVPR.2016.58
URL : http://arxiv.org/pdf/1603.01976

X. Li, Y. Qi, Z. Wang, K. Chen, Z. Liu et al., Video Object Segmentation with Re-identification, arXiv preprint, 2017.

, Bibliography

X. Liang, X. Shen, J. Feng, L. Lin, and S. Yan, Semantic Object Parsing with Graph LSTM, European Conference on Computer Vision, pp.125-143, 2016.
DOI : 10.1162/neco.1997.9.8.1735

X. Liang, X. Shen, D. Xiang, J. Feng, L. Lin et al., Semantic Object Parsing with Local-Global Long Short-Term Memory, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.63, 2016.
DOI : 10.1109/CVPR.2016.347

G. Lin, C. Shen, I. D. Reid, and A. van den Hengel, Efficient piecewise training of deep structured models for semantic segmentation, pp.2016-2042, 2016.

F. Liu, C. Shen, and G. Lin, Deep convolutional neural fields for depth estimation from a single image, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.26-32, 2015.
DOI : 10.1109/CVPR.2015.7299152

Z. Liu, X. Li, P. Luo, C. C. Loy, and X. Tang, Semantic Image Segmentation via Deep Parsing Network, 2015 IEEE International Conference on Computer Vision (ICCV), pp.1377-1385, 2015.
DOI : 10.1109/ICCV.2015.162

F. Liu, C. Shen, G. Lin, and I. Reid, Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.38, issue.10, pp.2024-2039, 2016.
DOI : 10.1109/TPAMI.2015.2505283

J. Long, E. Shelhamer, and T. Darrell, Fully convolutional networks for semantic segmentation, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.3431-3440, 2015.
DOI : 10.1109/CVPR.2015.7298965

D. G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, vol.60, issue.2, pp.91-110, 2004.

O. Matan, C. J. C. Burges, Y. LeCun, and J. S. Denker, Multi-digit recognition using a space displacement neural network, Advances in Neural Information Processing Systems, pp.488-495, 1992.

M. Mostajabi, P. Yadollahpour, and G. Shakhnarovich, Feedforward semantic segmentation with zoom-out features, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298959

A. Newell and J. Deng, Associative Embedding: End-to-End Learning for Joint Detection and Grouping, 1611.

D. Nilsson and C. Sminchisescu, Semantic Video Segmentation by Gated Recurrent Flow Propagation, CoRR, vol.abs, 1612.

H. Noh, S. Hong, and B. Han, Learning Deconvolution Network for Semantic Segmentation, 2015 IEEE International Conference on Computer Vision (ICCV), 2015.
DOI : 10.1109/ICCV.2015.178

S. Nowozin and C. H. Lampert, Structured learning and prediction in computer vision, Foundations and Trends in Computer Graphics and Vision, vol.6, issue.3-4, pp.185-365, 2011.

G. Papandreou and A. Yuille, Gaussian Sampling by Local Perturbations, Proc. Int. Conf. on Neural Information Processing Systems (NIPS), pp.1858-1866, 2010.

G. Papandreou and A. Yuille, Perturb-and-MAP random fields: Using discrete optimization to learn and sample from energy models, 2011 International Conference on Computer Vision, pp.193-200, 2011.
DOI : 10.1109/ICCV.2011.6126242

F. Perazzi, J. Pont-Tuset, B. McWilliams, L. Van Gool, M. Gross et al., A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.85

F. Perazzi, A. Khoreva, R. Benenson, B. Schiele, and A. Sorkine-Hornung, Learning Video Object Segmentation from Static Images, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.71, 2017.
DOI : 10.1109/CVPR.2017.372
URL : http://arxiv.org/pdf/1612.02646

F. Perronnin, J. Sánchez, and T. Mensink, Improving the Fisher kernel for large-scale image classification, European Conference on Computer Vision, pp.143-156, 2010.

P. Pinheiro and R. Collobert, Recurrent convolutional neural networks for scene labeling, ICML, 2014.

J. Pont-Tuset, F. Perazzi, S. Caelles, P. Arbeláez, A. Sorkine-Hornung, and L. Van Gool, The 2017 DAVIS Challenge on Video Object Segmentation, 2017.

W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C, p.38, 1992.

D. Roth, On the hardness of approximate reasoning, Artificial Intelligence, vol.82, issue.1-2, pp.273-302, 1996.
DOI : 10.1016/0004-3702(94)00092-1

H. Rue and L. Held, Gaussian Markov Random Fields: Theory and Applications, Chapman & Hall, pp.14-39, 2005.

J. Schmidhuber, Deep Learning in Neural Networks: An Overview, Neural Networks, vol.61, pp.85-117, 2015.

A. G. Schwing and R. Urtasun, Fully connected deep structured networks, arXiv preprint, 2015.

P. Sermanet, K. Kavukcuoglu, S. Chintala, and Y. LeCun, Pedestrian detection with unsupervised multi-stage feature learning, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3626-3633, 2013.

E. Shelhamer, K. Rakelly, J. Hoffman, and T. Darrell, Clockwork Convnets for Video Semantic Segmentation, CoRR, vol.abs, 1608.

J. R. Shewchuk, An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, pp.34-52, 1994.

J. Shotton, J. Winn, C. Rother, and A. Criminisi, TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context, International Journal of Computer Vision, vol.62, issue.1-2, pp.2-23, 2009.
DOI : 10.1007/s11263-005-4635-4

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, ICLR, 2015.

C. Sutton and A. McCallum, Piecewise training for undirected models, arXiv preprint, 2012.

C. Sutton and A. McCallum, An Introduction to Conditional Random Fields, Foundations and Trends in Machine Learning, pp.267-373, 2012.
DOI : 10.1561/2200000013

M. F. Tappen, C. Liu, E. H. Adelson, and W. T. Freeman, Learning Gaussian Conditional Random Fields for Low-Level Vision, CVPR, pp.14-29, 2007.

M. F. Tappen, K. G. G. Samuel, C. V. Dean et al., The logistic random field - A convenient graphical model for learning parameters for MRF-based labeling, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1-8, 2008.

B. Taskar, V. Chatalbashev, and D. Koller, Learning associative Markov networks, Twenty-first International Conference on Machine Learning, ICML '04, p.102, 2004.
DOI : 10.1145/1015330.1015444
URL : http://www.aicml.cs.ualberta.ca/banff04/icml/pages/papers/394.ps

A. Vedaldi and A. Zisserman, Efficient Additive Kernels via Explicit Feature Maps, Proceedings of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2010.
DOI : 10.1109/cvpr.2010.5539949
URL : http://eprints.pascal-network.org/archive/00006964/01/vedaldi10.pdf

R. Vemulapalli, O. Tuzel, and M. Liu, Deep Gaussian Conditional Random Field Network: A Model-Based Deep Network for Discriminative Denoising, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
DOI : 10.1109/CVPR.2016.519

R. Vemulapalli, O. Tuzel, M. Liu, and R. Chellappa, Gaussian Conditional Random Field Network for Semantic Segmentation, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.29-31, 2016.
DOI : 10.1109/CVPR.2016.351

V. Vineet, J. Warrell, P. Sturgess, and P. H. S. Torr, Improved Initialization and Gaussian Mixture Pairwise Terms for Dense Random Fields with Mean-field Inference, Proceedings of the British Machine Vision Conference 2012, 2013.
DOI : 10.5244/C.26.73
URL : http://www.bmva.org/bmvc/2012/BMVC/paper073/paper073.pdf

F. Visin, M. Ciccone, A. Romero, K. Kastner, K. Cho et al., ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2016.
DOI : 10.1109/CVPRW.2016.60
URL : http://arxiv.org/pdf/1511.07053

P. Voigtlaender and B. Leibe, Online Adaptation of Convolutional Neural Networks for Video Object Segmentation, BMVC, pp.71-78, 2017.

T.-H. Vu, A. Osokin, and I. Laptev, Context-aware CNNs for person head detection, ICCV, pp.2893-2901, 2015.

M. J. Wainwright and M. I. Jordan, Graphical Models, Exponential Families, and Variational Inference, Found. Trends Mach. Learn., vol.1, issue.1-2, pp.136-138, 2008.

K. Wang, L. Lin, J. Lu, C. Li, and K. Shi, PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures With Edge-Preserving Coherence, IEEE Transactions on Image Processing, vol.24, issue.10, 2015.
DOI : 10.1109/TIP.2015.2432712
URL : http://arxiv.org/pdf/1505.03227

L. Wang, H. Lu, X. Ruan, and M. Yang, Deep networks for saliency detection via local estimation and global search, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p.64, 2015.
DOI : 10.1109/CVPR.2015.7298938
URL : http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Wang_Deep_Networks_for_2015_CVPR_paper.pdf

S. Wang, S. Fidler, and R. Urtasun, Proximal deep structured models, Advances in Neural Information Processing Systems, pp.865-873, 2016.

X. Wang, R. B. Girshick, A. Gupta, and K. He, Non-local Neural Networks, CoRR, 1711.

R. Wolf and J. C. Platt, Postal address block location using a convolutional locator network, Advances in Neural Information Processing Systems, pp.745-752, 1994.

F. Xia, P. Wang, L. Chen, and A. L. Yuille, Zoom better to see clearer: Human part segmentation with auto zoom net, ECCV, p.63, 2016.
DOI : 10.1007/978-3-319-46454-1_39
URL : http://arxiv.org/pdf/1511.06881

S. Xie and Z. Tu, Holistically-nested edge detection, ICCV, pp.1395-1403, 2015.

F. Yu and V. Koltun, Multi-scale context aggregation by dilated convolutions, ICLR, 2016.

Z. Yu, C. Feng, M. Liu, and S. Ramalingam, CASENet: Deep Category-Aware Semantic Edge Detection, arXiv preprint, 2017.
DOI : 10.1109/cvpr.2017.191
URL : http://arxiv.org/pdf/1705.09759

Y. Yue, T. Finley, F. Radlinski, and T. Joachims, A support vector method for optimizing average precision, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '07, pp.271-278, 2007.
DOI : 10.1145/1277741.1277790
URL : http://radlinski.org/papers/YueEtAl_SIGIR2007.pdf

S. Zagoruyko and N. Komodakis, Wide Residual Networks, BMVC, 2016.

R. Zhao, W. Ouyang, H. Li, and X. Wang, Saliency detection by multicontext deep learning, CVPR, p.64, 2015.
DOI : 10.1109/cvpr.2015.7298731

H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, Pyramid Scene Parsing Network. CoRR, vol.63, pp.71-82, 1105.

S. Zheng, S. Jayasumana, B. Romera-Paredes, V. Vineet, Z. Su et al., Conditional Random Fields as Recurrent Neural Networks, ICCV, 2015.

X. Zhu, Y. Xiong, J. Dai, L. Yuan, and Y. Wei, Deep Feature Flow for Video Recognition, CoRR, vol.abs, 1611.