Learning from time-dependent streaming data with online stochastic algorithms

Abstract: In recent decades, intelligent systems such as machine learning and artificial intelligence have become mainstream in many parts of society. However, many of these methods operate in a batch or offline learning setting, where the model is re-trained from scratch whenever new data arrives. Such learning methods suffer from critical drawbacks, such as expensive re-training costs when dealing with new data, and thus scale poorly to large-scale, real-world applications. At the same time, these intelligent systems generate a practically infinite amount of large datasets, many of which arrive as a continuous stream of data, so-called streaming data. Therefore, first-order methods with low per-iteration computational costs have become predominant in the literature in recent years, in particular Stochastic Gradient (SG) descent. These SG methods have proven scalable and robust in many areas, ranging from smooth and strongly convex problems to complex non-convex ones, which makes them applicable to many learning tasks in real-world applications where data are large in size (and dimension) and arrive at high velocity. Such first-order methods have been intensively studied in theory and practice in recent years; nevertheless, there is still a lack of theoretical understanding of how dependence and biases affect these learning algorithms. A central theme of this thesis is learning from time-dependent streaming data and examining how changing data streams affect learning. To achieve this, we first construct the Stochastic Streaming Gradient (SSG) algorithm, which can handle streaming data; this includes several SG-based methods, such as the well-known SG descent and mini-batch methods, along with their Polyak-Ruppert averaged estimates. SSG combines the applicability and computational benefits of SG-based methods, the variance-reducing properties of mini-batching, and the accelerated convergence of Polyak-Ruppert averaging.
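The abstract does not spell out the SSG recursion itself; as a rough illustration of the ingredients it combines — mini-batch stochastic gradients over a stream with non-decreasing batch sizes, a decaying step size, and a Polyak-Ruppert average of the iterates — here is a minimal sketch on a toy least-squares problem. The names (`streaming_sgd`, `make_stream`) and all hyperparameters are illustrative assumptions, not the thesis's actual algorithm.

```python
import numpy as np

def streaming_sgd(stream, dim, lr0=0.5, decay=0.5):
    """Mini-batch SGD over a stream of batches, with Polyak-Ruppert averaging.

    `stream` yields (X, y) mini-batches; batch sizes may grow over time,
    mirroring the non-decreasing streaming batches discussed above.
    """
    theta = np.zeros(dim)        # current iterate
    theta_bar = np.zeros(dim)    # Polyak-Ruppert average of the iterates
    for t, (X, y) in enumerate(stream, start=1):
        grad = X.T @ (X @ theta - y) / len(y)    # mini-batch least-squares gradient
        theta -= lr0 * t ** (-decay) * grad      # decaying step size lr0 * t^(-decay)
        theta_bar += (theta - theta_bar) / t     # running average of iterates
    return theta, theta_bar

# Toy usage: noisy linear model y = X @ w + noise, batches of growing size.
rng = np.random.default_rng(0)
w = np.array([1.0, -2.0])

def make_stream(n_batches=200):
    for t in range(1, n_batches + 1):
        nt = 8 + t // 10                          # non-decreasing batch sizes
        X = rng.normal(size=(nt, 2))
        yield X, X @ w + 0.1 * rng.normal(size=nt)

theta, theta_bar = streaming_sgd(make_stream(), dim=2)
```

Both the last iterate and its average recover the true parameter on this toy problem; the averaged estimate is the one that enjoys accelerated convergence guarantees in the Polyak-Ruppert setting.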
Our analysis links the dependency and convexity levels, enabling us to improve convergence. Roughly speaking, SSG methods can converge using non-decreasing streaming batches, which break long- and short-term dependence, even when using biased gradient estimates. More surprisingly, these results yield a heuristic that can help increase the stability of SSG methods in practice. In particular, our analysis reveals how noise reduction and accelerated convergence can be achieved by processing the dataset in a specific pattern, which is beneficial for large-scale learning problems. Finally, we propose an online adaptive recursive estimation routine for Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) models called AdaVol. The AdaVol procedure relies on stochastic algorithms combined with Variance Targeting Estimation (VTE); AdaVol has computationally efficient properties, while VTE overcomes some convergence difficulties caused by the lack of convexity of the Quasi-Maximum Likelihood (QML) procedure. Empirical demonstrations show favorable trade-offs between AdaVol's stability and its ability to adapt to time-varying estimates.
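To make the VTE idea concrete: variance targeting ties the GARCH intercept to the unconditional variance, so that omega drops out of the optimization and only the remaining parameters need fitting. The sketch below shows a one-pass GARCH(1,1) filter with an online variance target; it is a simplified illustration under assumed fixed (alpha, beta), not the actual AdaVol recursion, and the function name `garch_vte` is hypothetical.

```python
import numpy as np

def garch_vte(returns, alpha=0.1, beta=0.8):
    """One-pass GARCH(1,1) variance filter with Variance Targeting.

    VTE fixes omega = target * (1 - alpha - beta), where `target` is a
    recursively updated estimate of the unconditional variance; in a full
    QML procedure only (alpha, beta) would remain to be fitted.
    """
    target = returns[0] ** 2 + 1e-8   # running estimate of the unconditional variance
    sigma2 = target                   # conditional variance state
    out = np.empty(len(returns))
    for t, r in enumerate(returns):
        out[t] = sigma2
        target += (r ** 2 - target) / (t + 1)            # online variance target
        omega = target * (1.0 - alpha - beta)            # VTE constraint on omega
        sigma2 = omega + alpha * r ** 2 + beta * sigma2  # GARCH(1,1) recursion
    return out

# Usage: filter a simulated i.i.d. return series; the fitted variance path
# should hover around the series' unconditional variance.
rng = np.random.default_rng(1)
rets = 0.05 * rng.standard_normal(1000)
vol_path = garch_vte(rets)
```

Because omega is pinned to the variance target, the filter stays anchored to the long-run variance even while (alpha, beta) govern the short-run dynamics — which is the convexity-related stabilization the VTE step provides to the QML fit.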
Document type: Thesis
Submitted on: Monday, October 24, 2022


Version validated by the jury (STAR)


  • HAL Id: tel-03827838, version 1


Nicklas Werge. Learning from time-dependent streaming data with online stochastic algorithms. Statistics [math.ST]. Sorbonne Université, 2022. English. ⟨NNT : 2022SORUS217⟩. ⟨tel-03827838⟩


