
Spoken Language Understanding for Abstractive Meeting Summarization

Abstract: With the impressive progress that has been made in transcribing spoken language, it is becoming increasingly possible to exploit transcribed data for tasks that require comprehension of what is said in a conversation. The work in this dissertation, carried out in the context of a project devoted to the development of a meeting assistant, contributes to ongoing efforts to teach machines to understand multi-party meeting speech. We have focused on the challenge of automatically generating abstractive meeting summaries.

We first present our results on Abstractive Meeting Summarization (AMS), which aims to take a meeting transcription as input and produce an abstractive summary as output. We introduce a fully unsupervised framework for this task based on multi-sentence compression and budgeted submodular maximization. We also leverage recent advances in word embeddings and graph degeneracy applied to NLP to incorporate external semantic knowledge and to design custom diversity and informativeness measures.

Next, we discuss our work on Dialogue Act Classification (DAC), whose goal is to assign each utterance in a discourse a label representing its communicative intention. DAC yields annotations that are useful for a wide variety of tasks, including AMS. We propose a modified neural Conditional Random Field (CRF) layer that takes into account not only the sequence of utterances in a discourse, but also speaker information, in particular whether the speaker has changed from one utterance to the next.

The third part of the dissertation focuses on Abstractive Community Detection (ACD), a sub-task of AMS in which utterances in a conversation are grouped according to whether they can be jointly summarized by a common abstractive sentence.
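The budget constraint in such an unsupervised framework can be illustrated with the standard greedy heuristic for budgeted submodular maximization. The following is a minimal sketch only: the function names, the coverage-style objective used in the test, and the cost-scaling exponent `r` are illustrative assumptions, not the dissertation's actual objective.

```python
def greedy_budgeted(candidates, cost, gain, budget, r=1.0):
    """Greedy summary-sentence selection under a length budget.

    At each step, pick the candidate with the best marginal gain per
    (scaled) cost; accept it only if it fits the remaining budget.
    """
    selected, remaining = [], list(candidates)
    total_cost = 0.0
    while remaining:
        base = gain(selected)
        # Cost-scaled marginal gain; ties resolve to the earliest candidate.
        best = max(remaining,
                   key=lambda c: (gain(selected + [c]) - base) / cost(c) ** r)
        marginal = gain(selected + [best]) - base
        if marginal >= 0 and total_cost + cost(best) <= budget:
            selected.append(best)
            total_cost += cost(best)
        remaining.remove(best)
    return selected
```

With a submodular gain (e.g. word coverage) this greedy procedure carries a constant-factor approximation guarantee, which is what makes it attractive for summarization under a word budget.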
We provide a novel approach to ACD in which we first introduce a neural contextual utterance encoder featuring three types of self-attention mechanisms, and then train it using the Siamese and triplet energy-based meta-architectures. We further propose a general sampling scheme that enables the triplet architecture to capture subtle patterns (e.g., overlapping and nested clusters).
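The triplet meta-architecture mentioned above can be sketched as an energy-based margin loss over utterance embeddings. The vectors, margin value, and Euclidean energy below are illustrative assumptions, not the dissertation's exact encoder or training setup.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin loss: the anchor utterance should have lower energy
    (smaller distance) to a same-community utterance than to one
    from a different community, by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)  # energy to positive
    d_neg = np.linalg.norm(anchor - negative)  # energy to negative
    return max(0.0, margin + d_pos - d_neg)
```

The role of a sampling scheme in this setting is to choose which (anchor, positive, negative) triples the loss sees, which is how phenomena such as overlapping or nested communities can be reflected during training.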
Submitted on: Monday, March 15, 2021 - 5:13:09 PM
Last modification on: Tuesday, March 16, 2021 - 3:31:57 AM
Long-term archiving on: Wednesday, June 16, 2021 - 7:22:59 PM


Version validated by the jury (STAR)


  • HAL Id : tel-03169877, version 1



Guokan Shang. Spoken Language Understanding for Abstractive Meeting Summarization. Computation and Language [cs.CL]. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAX011⟩. ⟨tel-03169877⟩


