
Secure, efficient automatic speaker verification for embedded applications

Abstract : This industrial CIFRE PhD thesis addresses automatic speaker verification (ASV) in the context of embedded applications. The first part of the thesis (Part A) focuses on more traditional problems and topics. The first work investigates the minimum enrolment data requirements for a practical, text-dependent, short-utterance ASV system. The contributions of Part A consist of a statistical analysis whose objective is to isolate text-dependent factors and show that they are consistent across different sets of speakers. For very short utterances, the influence of specific text content on system performance can be considered a speaker-independent factor. The second part (Part B) focuses on neural network-based solutions. While it was clear that neural networks and deep learning were becoming state of the art in several machine learning domains, their use in embedded solutions was hindered by their complexity. The contributions of Part B comprise blue-sky, experimental research that tackles two problems: replacing hand-crafted, traditional speaker features with direct operation on the raw audio waveform, and searching for optimal network architectures and weights by means of genetic algorithms. This work is the most fundamental contribution of the thesis: lightweight, neuro-evolved network structures that are able to learn from raw audio input.

Cited literature : 292 references
Contributor : ABES STAR
Submitted on : Thursday, November 12, 2020 - 12:15:26 PM
Last modification on : Sunday, June 26, 2022 - 10:01:32 AM
Long-term archiving on : Saturday, February 13, 2021 - 7:20:52 PM


Version validated by the jury (STAR)


  • HAL Id : tel-03001286, version 1


Giacomo Valenti. Secure, efficient automatic speaker verification for embedded applications. Artificial Intelligence [cs.AI]. Sorbonne Université, 2019. English. ⟨NNT : 2019SORUS471⟩. ⟨tel-03001286⟩


