Theses

Few-Shot Intent Classification in User-Generated Short Texts : Application to Conversational Agents

Abstract : Classifying user intents requires rigorous annotation. To overcome the lack of annotated data, we use few-shot classification methods. As a first step, this thesis presents a new comparison of few-shot classification methods. In earlier work, these methods were compared using different text encoders, which biased the comparison. When each method is equipped with the same transformer-based sentence encoder (BERT), older few-shot classification methods come out ahead. Next, we study pseudo-labeling, i.e. the automatic assignment of pseudo-labels to unlabeled data. In this context, we introduce a new pseudo-labeling method inspired by hierarchical clustering. Our method has no hyper-parameters and can ignore unlabeled examples that are too far from the known distribution. We also show that it is complementary to other existing methods. As a final contribution, we introduce ProtAugment, a meta-learning architecture for the intent detection problem. This new extension trains the model to retrieve the original sentence based on prototypes computed from paraphrases. We also introduce our own method for generating paraphrases and show that the way these paraphrases are generated plays an important role. All the code used to run the experiments presented in this thesis is available on my GitHub account (https://github.com/tdopierre/).
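
The abstract refers to prototypes computed from sentence embeddings. As a rough illustration of that general idea only, here is a minimal sketch of nearest-prototype classification. It is not the thesis's implementation: the function names, the intent labels, and the 2-D vectors are all hypothetical, and the toy vectors stand in for what a sentence encoder such as BERT would produce. Euclidean distance is likewise an assumption made for the example.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Average a non-empty list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def build_prototypes(support: Dict[str, List[Vector]]) -> Dict[str, List[float]]:
    """One prototype per intent: the mean of that intent's support embeddings."""
    return {label: mean_vector(vecs) for label, vecs in support.items()}


def classify(query: Vector, prototypes: Dict[str, List[float]]) -> str:
    """Return the label whose prototype is closest to the query (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Toy 2-D "embeddings": hypothetical values standing in for encoder outputs.
support = {
    "book_flight": [[1.0, 0.0], [0.8, 0.2]],
    "play_music": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = build_prototypes(support)
predicted = classify([0.9, 0.3], prototypes)
print(predicted)
```

In a few-shot setting the support set holds only a handful of labeled sentences per intent, which is why a simple mean over their embeddings is a common choice of class representative.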

https://tel.archives-ouvertes.fr/tel-03722690
Contributor : ABES STAR
Submitted on : Wednesday, July 13, 2022 - 3:07:13 PM
Last modification on : Thursday, July 14, 2022 - 3:46:55 AM

File

These-Dopierre-Thomas-2021.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-03722690, version 1

Citation

Thomas Dopierre. Few-Shot Intent Classification in User-Generated Short Texts : Application to Conversational Agents. Computation and Language [cs.CL]. Université de Lyon, 2021. English. ⟨NNT : 2021LYSES044⟩. ⟨tel-03722690⟩
