Structured Data Modeling with Deep Kernel Machines and Applications in Computational Biology

Dexiong Chen 1, 2
2 Thoth - Apprentissage de modèles à partir de données massives
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann
Abstract : Developing efficient algorithms to learn appropriate representations of structured data, such as sequences or graphs, is a central challenge in machine learning. To this end, deep learning has become popular in structured data modeling. Deep neural networks have drawn particular attention in various scientific fields such as computer vision, natural language understanding, and biology. For instance, they provide computational tools that help biologists understand and uncover biological properties of, or relationships among, macromolecules within living organisms. However, most of the success of deep learning methods in these fields relies on the guidance of empirical insights as well as huge amounts of annotated data. Exploiting more data-efficient models is necessary because labeled data is often scarce.
Another line of research is kernel methods, which provide a systematic and principled approach for learning non-linear models from data of arbitrary structure. In addition to their simplicity, they offer a natural way to control regularization and thus to avoid overfitting. However, the data representations provided by traditional kernel methods are defined only by simply designed hand-crafted features, which makes them perform worse than neural networks when enough labeled data is available. More complex kernels, inspired by prior knowledge used in neural networks, have thus been developed to build richer representations and bridge this gap. Yet they are less scalable. By contrast, neural networks are able to learn a compact representation for a specific learning task, which allows them to retain the expressivity of the representation while scaling to large sample sizes. Combining the complementary views of kernel methods and deep neural networks to build new frameworks is therefore useful to benefit from both worlds.
In this thesis, we build a general kernel-based framework for modeling structured data by leveraging prior knowledge from classical kernel methods and deep networks. Our framework provides efficient algorithmic tools for learning representations without annotations as well as for learning more compact representations in a task-driven way. It can be used to efficiently model sequences and graphs while keeping predictions simple to interpret. It also offers new insights into designing more expressive kernels and neural networks for sequences and graphs.
Document type : Theses

https://tel.archives-ouvertes.fr/tel-03193220
Contributor : Abes Star
Submitted on : Thursday, April 8, 2021 - 4:03:27 PM
Last modification on : Friday, June 11, 2021 - 2:27:07 PM

File

CHEN_2020_archivage.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-03193220, version 1

Citation

Dexiong Chen. Structured Data Modeling with Deep Kernel Machines and Applications in Computational Biology. Bioinformatics [q-bio.QM]. Université Grenoble Alpes [2020-..], 2020. English. ⟨NNT : 2020GRALM070⟩. ⟨tel-03193220⟩

Metrics

Record views : 191
File downloads : 167