M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis et al., Tensorflow: A system for large-scale machine learning, 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp.265-283, 2016.

A. E. Abdel-hakim and A. A. Farag, Csift: A sift descriptor with color invariant characteristics, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), vol.2, pp.1978-1983, 2006.

R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua et al., Slic superpixels compared to state-of-the-art superpixel methods, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.34, issue.11, pp.2274-2282, 2012.

T. Ahonen, A. Hadid, and M. Pietikäinen, Face recognition with local binary patterns, European conference on computer vision, pp.469-481, 2004.

T. Ahonen, A. Hadid, and M. Pietikäinen, Face description with local binary patterns: Application to face recognition, IEEE Transactions on Pattern Analysis & Machine Intelligence, issue.12, pp.2037-2041, 2006.

B. Andres, A. Fuksová, and J. Lange, Lifting of multicuts, p.3, 2015.

B. Babenko, M. Yang, and S. Belongie, Visual tracking with online multiple instance learning, Computer Vision and Pattern Recognition, pp.983-990, 2009.

B. Babenko, M. Yang, and S. Belongie, Robust object tracking with online multiple instance learning, IEEE transactions on pattern analysis and machine intelligence, vol.33, pp.1619-1632, 2011.

O. Barkan, J. Weill, L. Wolf, and H. Aronowitz, Fast high dimensional vector multiplication face recognition, Proceedings of the IEEE International Conference on Computer Vision, pp.1960-1967, 2013.

P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, Eigenfaces vs. fisherfaces: Recognition using class specific linear projection, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.19, issue.7, pp.711-720, 1997.

Y. Y. Boykov and M. Jolly, Interactive graph cuts for optimal boundary & region segmentation of objects in nd images, Proceedings eighth IEEE international conference on computer vision. ICCV, vol.1, pp.105-112, 2001.

J. Bromley, J. W. Bentz, L. Bottou, I. Guyon, Y. Lecun et al., Signature verification using a "siamese" time delay neural network, IJPRAI, vol.7, issue.4, pp.669-688, 1993.

A. A. Butt and R. T. Collins, Multi-target tracking by lagrangian relaxation to min-cost network flow, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.1846-1853, 2013.

J. Canny, A computational approach to edge detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, issue.6, pp.679-698, 1986.

J. Carreira and C. Sminchisescu, Constrained parametric min-cuts for automatic object segmentation, Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pp.3241-3248, 2010.

G. Chechik, V. Sharma, U. Shalit, and S. Bengio, Large scale online learning of image similarity through ranking, Journal of Machine Learning Research, vol.11, pp.1109-1135, 2010.

W. Chen, X. Chen, J. Zhang, and K. Huang, A multi-task deep network for person re-identification, Thirty-First AAAI Conference on Artificial Intelligence, 2017.

Y. Chen, X. Zhu, and S. Gong, Person re-identification by deep learning multi-scale representations, Proceedings of the IEEE International Conference on Computer Vision, pp.2590-2600, 2017.

N. Chiba and T. Nishizeki, Arboricity and subgraph listing algorithms, SIAM Journal on computing, vol.14, issue.1, pp.210-223, 1985.

S. Chopra, R. Hadsell, and Y. Lecun, Learning a similarity metric discriminatively, with application to face verification, Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol.1, pp.539-546, 2005.

Q. Chu, W. Ouyang, H. Li, X. Wang, B. Liu et al., Online multiobject tracking using cnn-based single object tracker with spatial-temporal attention mechanism, Proceedings of the IEEE International Conference on Computer Vision, pp.4836-4845, 2017.

B. Cuan, K. Idrissi, and C. Garcia, Deep siamese network for multiple object tracking, IEEE 20th International Workshop on Multimedia Signal Processing (MMSP), pp.1-6, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01807982

N. Dalal and B. Triggs, Histograms of oriented gradients for human detection, Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol.1, pp.886-893, 2005.
URL : https://hal.archives-ouvertes.fr/inria-00548512

J. G. Daugman, Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters, JOSA A, vol.2, issue.7, pp.1160-1169, 1985.

A. Dehghan, Y. Tian, P. H. Torr, and M. Shah, Target identity-aware network flow for online multiple target tracking, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.1146-1154, 2015.

J. Deng, A. Berg, S. Satheesh, H. Su, A. Khosla et al., , 2012.

M. Everingham, L. Van-gool, C. K. Williams, J. Winn, and A. Zisserman, The pascal visual object classes (voc) challenge, International Journal of Computer Vision, vol.88, issue.2, pp.303-338, 2010.

P. Felzenszwalb, D. Mcallester, and D. Ramanan, A discriminatively trained, multiscale, deformable part model, 2008.

P. F. Felzenszwalb, R. B. Girshick, D. Mcallester, and D. Ramanan, Object detection with discriminatively trained part-based models, IEEE transactions on pattern analysis and machine intelligence, vol.32, pp.1627-1645, 2010.

C. Garcia and M. Delakis, Convolutional face finder: A neural architecture for fast and robust face detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.26, issue.11, pp.1408-1423, 2004.

M. Geng, Y. Wang, T. Xiang, and Y. Tian, Deep transfer learning for person re-identification, 2016.

N. Gheissari, T. B. Sebastian, and R. Hartley, Person reidentification using spatiotemporal appearance, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), vol.2, pp.1528-1535, 2006.

A. Gilbert and R. Bowden, Tracking objects across cameras by incrementally learning inter-camera colour calibration and patterns of activity, European conference on computer vision, pp.125-136, 2006.

R. Girshick, Fast r-cnn, Proceedings of the IEEE International Conference on Computer Vision, pp.1440-1448, 2015.

R. Girshick, J. Donahue, T. Darrell, and J. Malik, Rich feature hierarchies for accurate object detection and semantic segmentation, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.580-587, 2014.

R. Girshick, F. Iandola, T. Darrell, and J. Malik, Deformable part models are convolutional neural networks, Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp.437-446, 2015.

I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, 2016.

R. Hadsell, S. Chopra, and Y. Lecun, Dimensionality reduction by learning an invariant mapping, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), vol.2, pp.1735-1742, 2006.

K. He, G. Gkioxari, P. Dollár, and R. Girshick, Mask r-cnn, Proceedings of the IEEE international conference on computer vision, pp.2961-2969, 2017.

K. He and J. Sun, Convolutional neural networks at constrained time cost, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.5353-5360, 2015.

K. He, X. Zhang, S. Ren, and J. Sun, Spatial pyramid pooling in deep convolutional networks for visual recognition, European Conference on Computer Vision, pp.346-361, 2014.

K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.770-778, 2016.

M. Heikkilä, M. Pietikäinen, and C. Schmid, Description of interest regions with center-symmetric local binary patterns, Computer vision, graphics and image processing, pp.58-69, 2006.

A. Hermans, L. Beyer, and B. Leibe, In defense of the triplet loss for person re-identification, 2017.

G. Hinton, Neural networks for machine learning, Coursera video lectures, 2012.

G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, Improving neural networks by preventing co-adaptation of feature detectors, 2012.

K. Hornik, M. Stinchcombe, and H. White, Multilayer feedforward networks are universal approximators, Neural networks, vol.2, issue.5, pp.359-366, 1989.

W. Hu, W. Li, X. Zhang, and S. Maybank, Single and multiple object tracking using a multi-feature joint sparse representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.37, issue.4, pp.816-833, 2014.

M. Jaderberg, K. Simonyan, and A. Zisserman, Spatial transformer networks, Advances in neural information processing systems, pp.2017-2025, 2015.

K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. Lecun, What is the best multi-stage architecture for object recognition?, IEEE 12th International Conference on Computer Vision, pp.2146-2153, 2009.

O. Javed, K. Shafique, and M. Shah, Appearance modeling for tracking in multiple non-overlapping cameras, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), vol.2, pp.26-33, 2005.

Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long et al., Caffe: Convolutional architecture for fast feature embedding, 2014.

C. Kim, F. Li, A. Ciptadi, and J. M. Rehg, Multiple hypothesis tracking revisited, Proceedings of the IEEE International Conference on Computer Vision, pp.4696-4704, 2015.

A. Krizhevsky, I. Sutskever, and G. E. Hinton, Imagenet classification with deep convolutional neural networks, Advances in neural information processing systems, pp.1097-1105, 2012.

J. Lafferty, A. Mccallum, and F. Pereira, Conditional random fields: Probabilistic models for segmenting and labeling sequence data, Proceedings of the eighteenth international conference on machine learning, ICML, vol.1, pp.282-289, 2001.

L. Leal-taixé, C. Canton-ferrer, and K. Schindler, Learning by tracking: Siamese cnn for robust target association, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp.33-40, 2016.

F. Li, T. Kim, A. Humayun, D. Tsai, and J. M. Rehg, Video segmentation by tracking many figure-ground segments, Proceedings of the IEEE International Conference on Computer Vision, pp.2192-2199, 2013.

W. Li, R. Zhao, T. Xiao, and X. Wang, Deepreid: Deep filter pairing neural network for person re-identification, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.152-159, 2014.

W. Li, X. Zhu, and S. Gong, Person re-identification by deep joint learning of multi-loss classification, 2017.

S. Liao, Y. Hu, X. Zhu, and S. Z. Li, Person re-identification by local maximal occurrence representation and metric learning, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.2197-2206, 2015.

T. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan et al., Feature pyramid networks for object detection, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.2117-2125, 2017.

T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona et al., Microsoft coco: Common objects in context, European conference on computer vision, pp.740-755, 2014.

W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed et al., Ssd: Single shot multibox detector, European conference on computer vision, pp.21-37, 2016.

J. Long, E. Shelhamer, and T. Darrell, Fully convolutional networks for semantic segmentation, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.3431-3440, 2015.

D. G. Lowe, Perceptual organization and visual recognition, 1985.

D. G. Lowe, Object recognition from local scale-invariant features, The proceedings of the seventh IEEE international conference on computer vision, vol.2, pp.1150-1157, 1999.

A. Milan, L. Leal-taixé, I. Reid, S. Roth, and K. Schindler, Mot16: A benchmark for multi-object tracking, 2016.

A. Milan, S. Roth, and K. Schindler, Continuous energy minimization for multitarget tracking, IEEE transactions on pattern analysis and machine intelligence, vol.36, pp.58-72, 2013.

M. Minsky and S. Papert, Perceptrons: An introduction to computational geometry, 1969.

V. Nair and G. E. Hinton, Rectified linear units improve restricted boltzmann machines, Proceedings of the 27th international conference on machine learning (ICML-10), pp.807-814, 2010.

A. Ng, Machine learning and ai via brain simulations, 2013.

H. V. Nguyen and L. Bai, Cosine similarity metric learning for face verification, Asian Conference on Computer Vision, pp.709-720, 2010.

S. J. Pan and Q. Yang, A survey on transfer learning, IEEE Transactions on knowledge and data engineering, vol.22, issue.10, pp.1345-1359, 2010.

A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang et al., Automatic differentiation in pytorch, NIPS-W, 2017.

A. A. Perera, C. Srinivas, A. Hoogs, G. Brooksby, and W. Hu, Multi-object tracking through simultaneous long occlusions and split-merge conditions, Computer Vision and Pattern Recognition, vol.1, pp.666-673, 2006.

M. Piccardi, Background subtraction techniques: a review, Systems, man and cybernetics, vol.4, pp.3099-3104, 2004.

H. Pirsiavash, D. Ramanan, and C. C. Fowlkes, Globally-optimal greedy algorithms for tracking a variable number of objects, CVPR 2011, pp.1201-1208, 2011.

L. Pishchulin, E. Insafutdinov, S. Tang, B. Andres, M. Andriluka et al., Deepcut: Joint subset partition and labeling for multi person pose estimation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.4929-4937, 2016.

B. J. Prosser, S. Gong, and T. Xiang, Multi-camera matching using bidirectional cumulative brightness transfer functions, BMVC, vol.8, p.74, 2008.

B. J. Prosser, W. Zheng, S. Gong, and T. Xiang, Person reidentification by support vector ranking, BMVC, vol.2, p.6, 2010.

J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, You only look once: Unified, real-time object detection, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.779-788, 2016.

J. Redmon and A. Farhadi, Yolo9000: better, faster, stronger, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.7263-7271, 2017.

J. Redmon and A. Farhadi, Yolov3: An incremental improvement, arXiv, 2018.

S. Ren, K. He, R. Girshick, and J. Sun, Faster r-cnn: Towards real-time object detection with region proposal networks, Advances in neural information processing systems, pp.91-99, 2015.

S. Ren, K. He, R. Girshick, X. Zhang, and J. Sun, Object detection networks on convolutional feature maps, IEEE transactions on pattern analysis and machine intelligence, vol.39, pp.1476-1481, 2017.

O. Rippel, M. Paluri, P. Dollar, and L. Bourdev, Metric learning with adaptive density discrimination, 2015.

F. Rosenblatt, The perceptron: A probabilistic model for information storage and organization in the brain, Psychological review, vol.65, issue.6, p.386, 1958.

D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning representations by back-propagating errors, Cognitive modeling, vol.5, issue.3, p.1, 1988.

O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh et al., ImageNet Large Scale Visual Recognition Challenge, International Journal of Computer Vision (IJCV), vol.115, issue.3, pp.211-252, 2015.

F. Schroff, D. Kalenichenko, and J. Philbin, Facenet: A unified embedding for face recognition and clustering, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.815-823, 2015.

W. R. Schwartz and L. S. Davis, Learning discriminative appearance-based models using partial least squares, 2009 XXII Brazilian symposium on computer graphics and image processing, pp.322-329, 2009.

A. V. Segal and I. Reid, Latent data association: Bayesian model selection for multi-target tracking, Proceedings of the IEEE International Conference on Computer Vision, pp.2904-2911, 2013.

E. Simo-serra, E. Trulls, L. Ferraz, I. Kokkinos, P. Fua et al., Discriminative learning of deep convolutional feature point descriptors, Proceedings of the IEEE International Conference on Computer Vision, pp.118-126, 2015.
URL : https://hal.archives-ouvertes.fr/hal-02432714

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014.

L. Sirovich and M. Kirby, Low-dimensional procedure for the characterization of human faces, JOSA A, vol.4, pp.519-524, 1987.

I. Sobel and G. Feldman, A 3x3 isotropic gradient operator for image processing, a talk at the Stanford Artificial Intelligence Project, pp.271-272, 1968.

R. K. Srivastava, K. Greff, and J. Schmidhuber, Highway networks, 2015.

R. K. Srivastava, K. Greff, and J. Schmidhuber, Training very deep networks, Advances in neural information processing systems, pp.2377-2385, 2015.

C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed et al., Going deeper with convolutions, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.1-9, 2015.

S. Tang, B. Andres, M. Andriluka, and B. Schiele, Subgraph decomposition for multi-target tracking, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.5033-5041, 2015.

S. Tang, B. Andres, M. Andriluka, and B. Schiele, Multi-person tracking by multicut and deep matching, European Conference on Computer Vision, pp.100-111, 2016.

S. Tang, M. Andriluka, B. Andres, and B. Schiele, Multiple people tracking by lifted multicut and person re-identification, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.3539-3548, 2017.

J. R. Uijlings, K. E. Van-de-sande, T. Gevers, and A. W. Smeulders, Selective search for object recognition, International journal of computer vision, vol.104, issue.2, pp.154-171, 2013.

A. Veit, M. J. Wilber, and S. Belongie, Residual networks behave like ensembles of relatively shallow networks, Advances in Neural Information Processing Systems, pp.550-558, 2016.

P. Viola and M. Jones, Rapid object detection using a boosted cascade of simple features, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol.1, 2001.

X. Wang, Intelligent multi-camera video surveillance: A review, Pattern recognition letters, vol.34, issue.1, pp.3-19, 2013.

X. Wang, G. Doretto, T. Sebastian, J. Rittscher, and P. Tu, Shape and appearance context modeling, IEEE 11th International Conference on Computer Vision, pp.1-8, 2007.

K. Q. Weinberger and L. K. Saul, Distance metric learning for large margin nearest neighbor classification, Journal of Machine Learning Research, vol.10, pp.207-244, 2009.

P. Weinzaepfel, J. Revaud, Z. Harchaoui, and C. Schmid, Deepflow: Large displacement optical flow with deep matching, Proceedings of the IEEE International Conference on Computer Vision, pp.1385-1392, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00873592

P. Werbos, Beyond regression: New tools for prediction and analysis in the behavioral sciences, 1974.

N. Wojke and A. Bewley, Deep cosine metric learning for person reidentification, 2018 IEEE winter conference on applications of computer vision (WACV), pp.748-756, 2018.

F. Yang, W. Choi, and Y. Lin, Exploit all the layers: Fast and accurate cnn object detector with scale dependent pooling and cascaded rejection classifiers, Proceedings of the IEEE conference on computer vision and pattern recognition, pp.2129-2137, 2016.

A. Yilmaz, O. Javed, and M. Shah, Object tracking: A survey, ACM Computing Surveys (CSUR), vol.38, p.13, 2006.

W. Zajdel, Z. Zivkovic, and B. Krose, Keeping track of humans: Have i seen this person before?, Proceedings of the 2005 IEEE International Conference on Robotics and Automation, pp.2081-2086, 2005.

A. R. Zamir, A. Dehghan, and M. Shah, Gmcp-tracker: Global multi-object tracking using generalized minimum clique graphs, European Conference on Computer Vision, pp.343-356, 2012.

L. Zhang, Y. Li, and R. Nevatia, Global data association for multi-object tracking using network flows, 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp.1-8, 2008.

L. Zheng, Triangular Similarity Metric Learning: a Siamese Architecture Approach, 2016.
URL : https://hal.archives-ouvertes.fr/tel-01314392

L. Zheng, K. Idrissi, C. Garcia, S. Duffner, and A. Baskurt, Triangular similarity metric learning for face verification, 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), vol.1, pp.1-7, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01158908

L. Zheng, L. Shen, L. Tian, S. Wang, J. Wang et al., Scalable person re-identification: A benchmark, Proceedings of the IEEE International Conference on Computer Vision, pp.1116-1124, 2015.

L. Zheng, Y. Yang, and A. G. Hauptmann, Person re-identification: Past, present and future, 2016.

Z. Zhong, L. Zheng, D. Cao, and S. Li, Re-ranking person re-identification with k-reciprocal encoding, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.1318-1327, 2017.

Z. Zhong, L. Zheng, G. Kang, S. Li, and Y. Yang, Random erasing data augmentation, 2017.