
Weight parameterizations in deep neural networks

Abstract: Multilayer neural networks were first proposed more than three decades ago, and various architectures and parameterizations have been explored since. Recently, graphics processing units have enabled very efficient neural network training and made it possible to train much larger networks on larger datasets, dramatically improving performance on various supervised learning tasks. However, generalization is still far from human level, and it is difficult to understand what the decisions made are based on. To improve generalization and understanding, we revisit the problem of weight parameterization in deep neural networks. We identify the problems we consider most important in modern architectures: network depth, parameter efficiency, and learning multiple tasks at the same time, and we address them in this thesis. We start with one of the core problems of computer vision, patch matching, and propose to solve it with convolutional neural networks of various architectures instead of manually hand-crafted descriptors. We then address the task of object detection, where a network must simultaneously learn to predict both the class and the location of an object. In both tasks we find that the number of parameters in the network is the major factor determining its performance, and we explore this phenomenon in residual networks. Our findings show that their original motivation, training deeper networks for better representations, does not fully hold: wider networks with fewer layers can be as effective as deeper ones with the same number of parameters. Overall, we present an extensive study of architectures and weight parameterizations, and of ways of transferring knowledge between them.
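The claim that width can substitute for depth at a fixed parameter budget can be illustrated with simple parameter counting. The sketch below (plain Python, with hypothetical layer sizes not taken from the thesis) counts the weights of 3x3 convolutions, whose parameter count scales quadratically with channel width but only linearly with depth:

```python
def conv3x3_params(c_in, c_out):
    """Weights of a 3x3 convolution (ignoring biases): kh * kw * c_in * c_out."""
    return 3 * 3 * c_in * c_out

# A "deep" stack: 8 conv layers at 16 channels each.
deep = sum(conv3x3_params(16, 16) for _ in range(8))

# A "wide" stack: 2 conv layers at 32 channels (widening factor 2, 4x fewer layers).
wide = sum(conv3x3_params(32, 32) for _ in range(2))

# Both stacks hold the same number of weights: 18432.
print(deep, wide)
```

Doubling the width quadruples the per-layer parameter count, so a quarter of the layers suffice to match the deeper stack's budget; the thesis compares such configurations empirically.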

Cited literature: 172 references
Submitted on : Friday, March 29, 2019 - 12:54:22 PM
Last modification on : Saturday, January 15, 2022 - 3:58:50 AM
Long-term archiving on: Sunday, June 30, 2019 - 2:36:57 PM


Version validated by the jury (STAR)


  • HAL Id : tel-02084044, version 1



Sergey Zagoruyko. Weight parameterizations in deep neural networks. Neural and Evolutionary Computing [cs.NE]. Université Paris-Est, 2018. English. ⟨NNT : 2018PESC1129⟩. ⟨tel-02084044⟩


