Weight parameterizations in deep neural networks

Abstract: Multilayer neural networks were first proposed more than three decades ago, and various architectures and parameterizations have been explored since. Recently, graphics processing units have enabled very efficient neural network training, allowing much larger networks to be trained on larger datasets and dramatically improving performance on various supervised learning tasks. However, generalization is still far from human level, and it is difficult to understand what the networks' decisions are based on. To improve generalization and understanding, we revisit the problem of weight parameterization in deep neural networks. We identify what are, in our view, the most important problems in modern architectures: network depth, parameter efficiency, and learning multiple tasks at the same time, and we address them in this thesis. We start with one of the core problems of computer vision, patch matching, and propose to solve it with convolutional neural networks of various architectures instead of manually hand-crafted descriptors. We then address object detection, where a network must simultaneously learn to predict both the class of an object and its location. In both tasks we find that the number of parameters in the network is the major factor determining its performance, and we explore this phenomenon in residual networks. Our findings show that their original motivation, training deeper networks for better representations, does not fully hold: wider networks with fewer layers can be as effective as deeper ones with the same number of parameters. Overall, we present an extensive study of architectures and weight parameterizations, and of ways of transferring knowledge between them.
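The parameter-count argument in the abstract can be sketched with simple arithmetic (a hypothetical illustration with made-up layer sizes, not the thesis's actual configurations): a 3x3 convolution from c_in to c_out channels has 9 * c_in * c_out weights, so doubling the width of every layer quadruples the per-layer cost, and a network a quarter as deep matches the original parameter budget.

```python
def conv3x3_params(c_in: int, c_out: int) -> int:
    """Weight count of a 3x3 convolution (biases omitted)."""
    return 3 * 3 * c_in * c_out

def stack_params(width: int, depth: int) -> int:
    """Total weights in a stack of `depth` 3x3 convs at constant `width`."""
    return depth * conv3x3_params(width, width)

# Deep-and-thin vs. shallow-and-wide, at the same parameter budget
# (illustrative numbers only):
deep = stack_params(width=64, depth=16)   # 16 layers, 64 channels
wide = stack_params(width=128, depth=4)   # 4 layers, 2x wider

print(deep, wide)  # 589824 589824 -- equal budgets, 4x fewer layers
```

This only counts weights; it says nothing about accuracy, which is what the thesis investigates empirically for residual networks.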
Cited literature [20 references]

https://tel.archives-ouvertes.fr/tel-02084044
Contributor : Abes Star
Submitted on : Friday, March 29, 2019 - 12:54:22 PM
Last modification on : Saturday, May 18, 2019 - 12:27:26 AM

File

TH2018PESC1129.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-02084044, version 1

Citation

Sergey Zagoruyko. Weight parameterizations in deep neural networks. Neural and Evolutionary Computing [cs.NE]. Université Paris-Est, 2018. English. ⟨NNT : 2018PESC1129⟩. ⟨tel-02084044⟩
