
Targeted learning in Big Data : bridging data-adaptive estimation and statistical inference

Abstract : This dissertation develops robust semiparametric methods for complex parameters that arise at the interface of causal inference and biostatistics, with applications to epidemiological and medical research in the era of Big Data. Specifically, we address two statistical challenges in bridging the gap between data-adaptive estimation and statistical inference. The first challenge is to maximize the information learned from randomized controlled trials (RCTs) through adaptive trial designs. We present a framework for constructing and analyzing group sequential covariate-adjusted response-adaptive (CARA) RCTs that admits data-adaptive approaches both in constructing the randomization schemes and in estimating the conditional response model. This framework adds to the existing literature on CARA RCTs by allowing flexible options in both design and analysis and by providing robust effect estimates even under model misspecification. The second challenge is to obtain a Central Limit Theorem when data-adaptive estimation is used for the nuisance parameters. We take as target parameter the marginal risk difference of the outcome under a binary treatment, and propose a Cross-Validated Targeted Minimum Loss Estimator (CV-TMLE), which augments the classical TMLE with a sample-splitting procedure. The proposed CV-TMLE inherits the double robustness and efficiency properties of the classical TMLE, and achieves asymptotic linearity under minimal conditions by avoiding the Donsker class condition.
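
As a reading aid, the target parameter and the asymptotic linearity property named in the abstract can be written out as below. The notation is an assumption made here for illustration and is not taken from the dissertation: each observation is O = (W, A, Y) with baseline covariates W, binary treatment A and outcome Y, P_0 denotes the true data distribution, \hat\psi_n is an estimator based on n independent observations, and D is its influence function.

```latex
% Marginal risk difference under a binary treatment (assumed notation)
\psi_0 \;=\; E_{P_0}\!\big[\, E_{P_0}(Y \mid A = 1, W) \;-\; E_{P_0}(Y \mid A = 0, W) \,\big]

% Asymptotic linearity of an estimator \hat\psi_n with influence function D
\hat\psi_n - \psi_0 \;=\; \frac{1}{n} \sum_{i=1}^{n} D(O_i) \;+\; o_P\!\big(n^{-1/2}\big)

% Consequence: a Central Limit Theorem for the estimator
\sqrt{n}\,\big(\hat\psi_n - \psi_0\big) \;\xrightarrow{d}\; N\!\big(0,\ \operatorname{Var}_{P_0} D(O)\big)
```

Under this reading, asymptotic linearity is what delivers the Central Limit Theorem, and hence confidence intervals, for the estimated risk difference.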

Cited literature : 51 references
Contributor : Abes Star
Submitted on : Tuesday, March 13, 2018 - 3:46:07 PM
Last modification on : Friday, August 21, 2020 - 5:14:27 AM
Long-term archiving on : Thursday, June 14, 2018 - 3:55:20 PM


Version validated by the jury (STAR)


  • HAL Id : tel-01730786, version 1


Wenjing Zheng. Targeted learning in Big Data : bridging data-adaptive estimation and statistical inference. General Mathematics [math.GM]. Université Sorbonne Paris Cité, 2016. English. ⟨NNT : 2016USPCB044⟩. ⟨tel-01730786⟩


