Skip to Main content Skip to Navigation
Theses

Deep learning methods for style extraction and transfer

Abstract : One aspect of a successful human-machine interface (e.g. human-robot interaction, chatbots, speech, handwriting…,etc) is the ability to have a personalized interaction. This affects the overall human experience, and allow for a more fluent interaction. At the moment, there is a lot of work that uses machine learning in order to model such interactions. However, these models do not address the issue of personalized behavior: they try to average over the different examples from different people in the training set. Identifying the human styles (persona) opens the possibility of biasing the models output to take into account the human preference. In this thesis, we focused on the problem of styles in the context of handwriting.Defining and extracting handwriting styles is a challenging problem, since there is no formal definition for those styles (i.e., it is an ill-posed problem). Styles are both social - depending on the writer's training, especially in middle school - and idiosyncratic - depends on the writer's shaping (letter roundness, sharpness…,etc) and force distribution over time. As a consequence, there are no easy/generic metrics to measure the quality of style in a machine behavior.We may want to change the task or adapt to a new person. Collecting data in the human-machine interface domain can be quite expensive and time consuming. Although most of the time the new task has many things in common with the old task, traditional machine learning techniques fail to take advantage of this commonality, leading to a quick degradation in performance. Thus, one of the objectives of my thesis is to study and evaluate the idea of transferring knowledge about the styles between different tasks, within the machine learning paradigm.The objective of my thesis is to study these problems of styles, in the domain of handwriting. Available to us is IRONOFF dataset, an online handwriting datasets, with 410 writers, with ~25K examples of uppercase, lowercase letters and digits drawings. For transfer learning, we used an extra dataset, QuickDraw!, a sketch drawing dataset containing ~50 million drawing over 345 categories.Major contributions of my thesis are:1) Propose a work pipeline to study the problem of styles in handwriting. This involves proposing methodology, benchmarks and evaluation metrics.We choose temporal generative models paradigm in deep learning in order to generate drawings, and evaluate their proximity/relevance to the intended/ground truth drawings. We proposed two metrics, to evaluate the curvature and the length of the generated drawings. In order to ground those metics, we proposed multiple benchmarks - which we know their relative power in advance -, and then verified that the metrics actually respect the relative power relationship.2) Propose a framework to study and extract styles, and verify its advantage against the previously proposed benchmarks.We settled on the idea of using a deep conditioned-autoencoder in order to summarize and extract the style information, without the need to focus on the task identity (since it is given as a condition). We validate this framework to the previously proposed benchmark using our evaluation metrics. We also to visualize on the extracted styles, leading to some exciting outcomes!3) Using the proposed framework, propose a way to transfer the information about styles between different tasks, and a protocol in order to evaluate the quality of transfer.We leveraged the deep conditioned-autoencoder used earlier, by extract the encoder part in it - which we believe had the relevant information about the styles - and use it to in new models trained on new tasks. We extensively test this paradigm over a different range of tasks, on both IRONOFF and QuickDraw! datasets. We show that we can successfully transfer style information between different tasks.
Complete list of metadatas

Cited literature [157 references]  Display  Hide  Download

https://tel.archives-ouvertes.fr/tel-02488856
Contributor : Abes Star :  Contact
Submitted on : Monday, March 2, 2020 - 2:44:39 PM
Last modification on : Wednesday, October 14, 2020 - 3:51:47 AM
Long-term archiving on: : Wednesday, June 3, 2020 - 2:23:55 PM

File

MOHAMMED_2019_archivage.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-02488856, version 2

Collections

Citation

Omar Mohammed. Deep learning methods for style extraction and transfer. Signal and Image processing. Université Grenoble Alpes, 2019. English. ⟨NNT : 2019GREAT035⟩. ⟨tel-02488856v2⟩

Share

Metrics

Record views

216

Files downloads

462