Skip to Main content Skip to Navigation

Deep similarity metric learning for multiple object tracking

Abstract : Multiple object tracking, i.e. simultaneously tracking multiple objects in the scene, is an important but challenging visual task. Objects should be accurately detected and distinguished from each other to avoid erroneous trajectories. Since remarkable progress has been made in object detection field, “tracking-by-detection” approaches are widely adopted in multiple object tracking research. Objects are detected in advance and tracking reduces to an association problem: linking detections of the same object through frames into trajectories. Most tracking algorithms employ both motion and appearance models for data association. For multiple object tracking problems where exist many objects of the same category, a fine-grained discriminant appearance model is paramount and indispensable. Therefore, we propose an appearance-based re-identification model using deep similarity metric learning to deal with multiple object tracking in mono-camera videos. Two main contributions are reported in this dissertation: First, a deep Siamese network is employed to learn an end-to-end mapping from input images to a discriminant embedding space. Different metric learning configurations using various metrics, loss functions, deep network structures, etc., are investigated, in order to determine the best re-identification model for tracking. In addition, with an intuitive and simple classification design, the proposed model achieves satisfactory re-identification results, which are comparable to state-of-the-art approaches using triplet losses. Our approach is easy and fast to train and the learned embedding can be readily transferred onto the domain of tracking tasks. Second, we integrate our proposed re-identification model in multiple object tracking as appearance guidance for detection association. For each object to be tracked in a video, we establish an identity-related appearance model based on the learned embedding for re-identification. Similarities among detected object instances are exploited for identity classification. The collaboration and interference between appearance and motion models are also investigated. An online appearance-motion model coupling is proposed to further improve the tracking performance. Experiments on Multiple Object Tracking Challenge benchmark prove the effectiveness of our modifications, with a state-of-the-art tracking accuracy.
Document type :
Complete list of metadata

Cited literature [122 references]  Display  Hide  Download
Contributor : Abes Star :  Contact
Submitted on : Thursday, July 16, 2020 - 12:04:16 PM
Last modification on : Tuesday, June 1, 2021 - 2:08:09 PM
Long-term archiving on: : Tuesday, December 1, 2020 - 8:09:19 PM


Version validated by the jury (STAR)


  • HAL Id : tel-02900639, version 1


Bonan Cuan. Deep similarity metric learning for multiple object tracking. Artificial Intelligence [cs.AI]. Université de Lyon, 2019. English. ⟨NNT : 2019LYSEI065⟩. ⟨tel-02900639⟩



Record views


Files downloads