A Case Study of the Impact of Data-Adaptive Versus Model-Based Estimation of the Propensity Scores on Causal Inferences from Three Inverse Probability Weighting Estimators

Consistent estimation of causal effects with inverse probability weighting estimators is known to rely on consistent estimation of propensity scores. To alleviate the bias expected from incorrect model specification for these nuisance parameters in observational studies, data-adaptive estimation, and in particular an ensemble learning approach known as Super Learning, has been proposed as an alternative to the common practice of estimation based on arbitrary model specification. While the theoretical arguments against the latter, haphazard estimation strategy are evident, the extent to which data-adaptive estimation can improve inferences in practice is not. Some practitioners may view bias concerns over arbitrary parametric assumptions as academic considerations that are inconsequential in practice. They may also be wary of data-adaptive estimation of the propensity scores for fear of greatly increasing estimation variability due to extreme weight values. With this report, we aim to contribute to the understanding of the potential practical consequences of the choice of estimation strategy for the propensity scores in real-world comparative effectiveness research. We implement secondary analyses of electronic health record data from a large cohort of type 2 diabetes patients to evaluate the effects of four adaptive treatment intensification strategies for glucose control (dynamic treatment regimens) on subsequent development or progression of urinary albumin excretion. Three Inverse Probability Weighting estimators are implemented using both model-based and data-adaptive estimation strategies for the propensity scores. Their practical performance in adjusting for confounding and selection bias is compared and evaluated against results from previous randomized experiments. Results suggest that using Super Learning to implement Inverse Probability Weighting estimators to draw causal inferences can potentially both reduce bias and increase efficiency, at the cost of increased computing time.
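For readers unfamiliar with the role propensity scores play in inverse probability weighting, the following is a minimal illustrative sketch and not the estimators or data used in the study. It assumes a single binary treatment and one binary confounder (the study itself concerns dynamic treatment regimens over time), uses made-up toy records, and estimates the propensity score by simple stratum frequencies as a stand-in for either a parametric model or Super Learning. All function and variable names are hypothetical.

```python
from collections import defaultdict

def estimate_propensity(records):
    """Estimate P(A=1 | W=w) by the empirical treated fraction in each stratum w.

    Assumption: a stand-in for whatever propensity estimator is actually used
    (parametric model, Super Learning, ...).
    """
    counts = defaultdict(lambda: [0, 0])  # w -> [n_treated, n_total]
    for w, a, _ in records:
        counts[w][0] += a
        counts[w][1] += 1
    return {w: treated / total for w, (treated, total) in counts.items()}

def ipw_mean(records, propensity, arm):
    """Normalized IPW estimate of the mean outcome had everyone received `arm`."""
    num = den = 0.0
    for w, a, y in records:
        if a != arm:
            continue  # only units that actually received `arm` contribute
        p = propensity[w] if arm == 1 else 1.0 - propensity[w]
        num += y / p   # outcome weighted by inverse probability of observed treatment
        den += 1.0 / p
    return num / den

# Toy records (w, a, y): confounder, treatment, outcome. Invented for illustration only.
records = [
    (0, 1, 1), (0, 0, 0), (0, 0, 0), (0, 0, 1),
    (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 1),
]

g = estimate_propensity(records)
ate = ipw_mean(records, g, arm=1) - ipw_mean(records, g, arm=0)

# Unadjusted comparison of treated versus untreated outcome means.
treated = [y for _, a, y in records if a == 1]
untreated = [y for _, a, y in records if a == 0]
naive = sum(treated) / len(treated) - sum(untreated) / len(untreated)
```

In this toy example the weighted estimate `ate` differs from the unadjusted difference `naive` because treatment probability depends on the confounder. A stratum with a propensity score close to 0 or 1 would produce a very large weight `1 / p`, which is the source of the variability concern mentioned in the abstract.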

Authors: Neugebauer R; Schmittdiel JA; van der Laan MJ

Int J Biostat. 2016 05 01;12(1):131-55.
