Conference paper, Year: 2022

Direction-Aware Joint Adaptation of Neural Speech Enhancement and Recognition in Real Multiparty Conversational Environments

Abstract

This paper describes noisy speech recognition for an augmented reality headset that helps verbal communication in real multiparty conversational environments. A major approach that has been actively studied in simulated environments is to sequentially perform speech enhancement and automatic speech recognition (ASR) based on deep neural networks (DNNs) trained in a supervised manner. In our task, however, such a pretrained system fails to work because of the mismatch between the training and test conditions and because of the head movements of the user. To enhance only the utterances of a target speaker, we use beamforming based on a DNN-based speech mask estimator that can adaptively extract the speech components corresponding to a particular head-relative direction. We propose a semi-supervised adaptation method that jointly updates the mask estimator and the ASR model at run-time using clean speech signals with ground-truth transcriptions and noisy speech signals with highly confident estimated transcriptions. Comparative experiments using a state-of-the-art distant speech recognition system show that the proposed method significantly improves the ASR performance.
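As an illustration of the data selection implied by the semi-supervised adaptation described above, the following minimal Python sketch combines clean utterances that have ground-truth transcriptions with noisy utterances whose estimated transcriptions are highly confident. The abstract does not say how confidence is measured or what cutoff is used, so the `Utterance` type, the `build_adaptation_set` function, the `confidence` field, and the `threshold` value are all hypothetical names and assumptions chosen for this example, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Utterance:
    """Hypothetical container for one adaptation example."""
    audio_id: str
    transcript: str
    # None marks a ground-truth transcription; a float is an assumed
    # ASR confidence score in [0, 1] for an estimated transcription.
    confidence: Optional[float] = None


def build_adaptation_set(
    clean: List[Utterance],
    noisy_hypotheses: List[Utterance],
    threshold: float = 0.9,  # assumed cutoff, not taken from the paper
) -> List[Utterance]:
    """Keep all clean labeled utterances, plus only those noisy utterances
    whose estimated transcription meets the confidence threshold."""
    confident = [
        u for u in noisy_hypotheses
        if u.confidence is not None and u.confidence >= threshold
    ]
    return list(clean) + confident
```

In a run-time loop, the returned set would then be used to update both the mask estimator and the ASR model; that joint update is specific to the paper and is not sketched here.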
Main file
interspeech_2022.pdf (638.68 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03727181, version 1 (19-07-2022)

Identifiers

  • HAL Id: hal-03727181, version 1

Cite

Yicheng Du, Aditya Arie Nugraha, Kouhei Sekiguchi, Yoshiaki Bando, Mathieu Fontaine, et al. Direction-Aware Joint Adaptation of Neural Speech Enhancement and Recognition in Real Multiparty Conversational Environments. INTERSPEECH, 2022, Incheon, South Korea. ⟨hal-03727181⟩