Targeted learning in Big Data : bridging data-adaptive estimation and statistical inference

Abstract : This dissertation develops robust semiparametric methods for complex parameters that arise at the interface of causal inference and biostatistics, with applications to epidemiological and medical research in the era of Big Data. Specifically, we address two statistical challenges in bridging the gap between data-adaptive estimation and statistical inference. The first challenge is to maximize the information learned from randomized controlled trials (RCTs) through the use of adaptive trial designs. We present a framework for constructing and analyzing group sequential covariate-adjusted response-adaptive (CARA) RCTs that admits data-adaptive approaches both in constructing the randomization schemes and in estimating the conditional response model. This framework adds to the existing literature on CARA RCTs by allowing flexible options in both design and analysis and by providing robust effect estimates even under model misspecification. The second challenge is to obtain a Central Limit Theorem when data-adaptive estimation is used to estimate the nuisance parameters. We take as target parameter the marginal risk difference of the outcome under a binary treatment, and propose a cross-validated targeted minimum loss estimator (CV-TMLE), which augments the classical targeted minimum loss estimator (TMLE) with a sample-splitting procedure. The proposed CV-TMLE inherits the double robustness and efficiency properties of the classical TMLE, and achieves asymptotic linearity under minimal conditions by avoiding the Donsker class condition.
Document type : Thesis

Cited literature : 51 references

https://tel.archives-ouvertes.fr/tel-01730786
Contributor : Abes Star
Submitted on : Tuesday, March 13, 2018 - 3:46:07 PM
Last modification on : Friday, September 20, 2019 - 4:34:03 PM
Long-term archiving on : Thursday, June 14, 2018 - 3:55:20 PM

File

va_zheng_wenjing.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01730786, version 1

Citation

Wenjing Zheng. Targeted learning in Big Data : bridging data-adaptive estimation and statistical inference. General Mathematics [math.GM]. Université Sorbonne Paris Cité, 2016. English. ⟨NNT : 2016USPCB044⟩. ⟨tel-01730786⟩

Metrics

Record views : 110
File downloads : 115