
Handling the speed-accuracy trade-off in deep-learning based pedestrian detection systems

Ujjwal Ujjwal 1
1 STARS - Spatio-Temporal Activity Recognition Systems
CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : The main objective of this thesis is to improve the detection performance of deep-learning based pedestrian detection systems without sacrificing detection speed. Detection speed and accuracy are traditionally known to trade off against each other. This thesis therefore aims to handle this trade-off in a way that yields both faster and more accurate pedestrian detection. To achieve this, we first conduct a systematic quantitative analysis of various deep learning techniques with respect to pedestrian detection. This analysis allows us to identify the optimal configuration of the deep learning components of a pedestrian detection pipeline. We then consider the important question of convolutional layer selection for pedestrian detection and propose a pedestrian detection system called Multiple-RPN, which utilizes multiple convolutional layers simultaneously. We propose Multiple-RPN in two configurations, early-fused and late-fused, and demonstrate that early fusion is a better approach than late fusion for detection across pedestrian scales and occlusion levels. This work furthermore provides a quantitative demonstration of the selectivity of various convolutional layers to pedestrian scale and occlusion level. Next, we integrate the early fusion approach with pseudo-semantic segmentation to reduce the number of processing operations. In this approach, pseudo-semantic segmentation is shown to reduce false positives and false negatives. Coupled with the reduced number of processing operations, this improves detection performance and speed (~20 fps) simultaneously, performing at the state-of-the-art level on the Caltech-reasonable (3.79% miss-rate) and CityPersons (7.19% miss-rate) datasets. The final contribution of this thesis is an anchor classification layer, which further reduces the number of processing operations for detection. The result is a doubling of detection speed (~40 fps) with a minimal loss in detection performance (3.99% and 8.12% miss-rate on the Caltech-reasonable and CityPersons datasets, respectively), which is still at the state-of-the-art standard.

Cited literature [194 references]
Contributor : Abes Star
Submitted on : Tuesday, July 7, 2020 - 4:05:10 PM
Last modification on : Wednesday, July 8, 2020 - 11:39:54 AM


Version validated by the jury (STAR)


  • HAL Id : tel-02416418, version 2



Ujjwal Ujjwal. Handling the speed-accuracy trade-off in deep-learning based pedestrian detection systems. Artificial Intelligence [cs.AI]. COMUE Université Côte d'Azur (2015 - 2019), 2019. English. ⟨NNT : 2019AZUR4087⟩. ⟨tel-02416418v2⟩


