
Software for Causal Inference Research


This R package automates the implementation of various estimators of the effects of time-varying static, dynamic, and stochastic treatment and monitoring interventions on time-to-event outcomes (e.g., counterfactual discrete-time survival curves or coefficients of marginal structural models). To adjust for both observed time-dependent confounding and informative right-censoring, the following estimation approaches are automated: inverse probability weighting, g-computation, and targeted minimum loss-based estimation. Nuisance parameters can be estimated using user-specified generalized linear models or H2O machine learning algorithms (including an H2O ensemble learning approach, also known as Super Learning). Analytic results can be automatically exported as standard HTML, MS Word, or PDF reports.
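As a rough illustration of the inverse probability weighting idea for discrete-time survival, the sketch below computes a weighted hazard at each interval and multiplies the complements to obtain a survival curve. It is written in Python for brevity and is not the package's interface: the function name `ipw_survival`, the `(t, event, weight)` row layout, and the assumption that the weights have already been estimated are all hypothetical.

```python
from collections import defaultdict

def ipw_survival(person_time, horizon):
    """Weighted discrete-time survival curve.

    person_time: iterable of (t, event, weight) rows, one per subject and
    interval t (1-based) in which the subject is uncensored and still
    following the intervention of interest. `weight` is assumed to be a
    previously estimated inverse probability weight.
    """
    num = defaultdict(float)  # weighted number of events per interval
    den = defaultdict(float)  # weighted number at risk per interval
    for t, event, weight in person_time:
        den[t] += weight
        num[t] += weight * event

    surv, curve = 1.0, []
    for t in range(1, horizon + 1):
        hazard = num[t] / den[t] if den[t] > 0 else 0.0
        surv *= 1.0 - hazard          # product of (1 - hazard) up to t
        curve.append(surv)
    return curve
```

With all weights equal to 1, this reduces to the usual unweighted product-limit estimate on the person-time data.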


This R package is a flexible tool for simulating complex longitudinal data using structural equations, with emphasis on problems in causal inference. The user interface is designed to facilitate the conduct of transparent and reproducible simulation studies, and allows concise expression of complex functional dependencies for a large number of time-varying nodes. In particular, the software facilitates the following steps of a standard data simulation workflow: specifying interventions, simulating from the intervened data-generating distributions, and defining and evaluating treatment-specific means, average treatment effects, and coefficients from working marginal structural models.
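The workflow described above can be sketched on a toy point-treatment example. The code below is a minimal Python illustration, not the package's syntax: the structural equations, the probabilities, and the function names `simulate`, `mean_outcome`, and `average_treatment_effect` are all made up for the example. An intervention replaces the equation for the treatment node with a constant, and the average treatment effect is the difference between the two treatment-specific means.

```python
import random

def simulate(n, intervention=None, seed=0):
    """Draw n observations from a toy structural equation model W -> A -> Y,
    where W also affects Y. If `intervention` is 0 or 1, the equation for A
    is replaced by that constant (a static intervention)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0      # baseline covariate
        u_a = rng.random()                      # exogenous noise for A
        u_y = rng.random()                      # exogenous noise for Y
        if intervention is None:
            a = 1 if u_a < (0.7 if w else 0.3) else 0  # observed mechanism
        else:
            a = intervention                    # intervened mechanism
        p_y = 0.2 + 0.3 * a + 0.2 * w           # P(Y = 1 | A, W)
        y = 1 if u_y < p_y else 0
        rows.append((w, a, y))
    return rows

def mean_outcome(rows):
    """Mean of Y over the simulated rows."""
    return sum(y for _, _, y in rows) / len(rows)

def average_treatment_effect(n=20000, seed=0):
    """Difference of treatment-specific means E[Y_1] - E[Y_0]."""
    return mean_outcome(simulate(n, 1, seed)) - mean_outcome(simulate(n, 0, seed))
```

In this toy model the true average treatment effect is 0.3 by construction, so `average_treatment_effect()` should return a value close to that.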

MSMstructure (zip file)


This SAS macro and R package automate the processing of longitudinal electronic health record data from an observational cohort study into a structured analytic dataset suitable for evaluating the effects of time-varying treatment and monitoring interventions on a survival outcome using, for example, inverse probability weighting or targeted minimum loss-based estimation. In particular, output from either software product can be used with the MSM macro, the ltmle R package, or the stremr R package described above. The R routine f_Long_to_Wide may be used to convert the output data in long format (generated by either the MSMstructure macro or the LtAtStructuR R package) into the wide format used by the ltmle R package.
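To make the long-versus-wide distinction concrete, the sketch below reshapes one-row-per-subject-per-interval data into one row per subject, with each time-varying variable suffixed by its interval index. This is a generic Python illustration and does not reproduce the behaviour or signature of f_Long_to_Wide; the function name `long_to_wide` and the column names `id`, `t`, `A`, and `Y` are assumptions made for the example.

```python
def long_to_wide(rows, id_key="id", time_key="t"):
    """Reshape long-format rows (one dict per subject and interval) into
    wide-format rows (one dict per subject, columns named <variable>_<t>)."""
    wide = {}
    for row in rows:
        # One output record per subject, created on first sight.
        rec = wide.setdefault(row[id_key], {id_key: row[id_key]})
        t = row[time_key]
        for name, value in row.items():
            if name in (id_key, time_key):
                continue
            rec[f"{name}_{t}"] = value
    return list(wide.values())

long_rows = [
    {"id": 1, "t": 0, "A": 0, "Y": 0},
    {"id": 1, "t": 1, "A": 1, "Y": 0},
    {"id": 2, "t": 0, "A": 1, "Y": 1},
]
wide_rows = long_to_wide(long_rows)
```

Subjects with fewer intervals simply have fewer columns here; a real conversion would also need a rule for filling values after an event or censoring.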

DSA: Data-Adaptive Estimation with Cross-Validation and the D/S/A Algorithm 

DSA_3.1.4.tar.gz (tar file) (zip file)

modelUtils_3.1.4.tar.gz (tar file) (zip file)

This combination of two R packages (modelUtils must be installed and loaded first) performs data-adaptive estimation through estimator selection based on cross-validation and the L2 loss function. Candidate estimators are defined with polynomial generalized linear models generated with the Deletion/Substitution/Addition (D/S/A) algorithm under user-specified constraints. This software may be used for prediction or for data-adaptive estimation of the nuisance parameters (e.g., propensity scores) involved in the estimation of causal estimands. 
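The selection principle described above, choosing among candidate estimators by cross-validated L2 risk, can be sketched as follows. This Python example is a simplification and does not implement the D/S/A search: the candidates are just polynomials of fixed degree in one covariate, and the function names `fit_polynomial`, `predict`, `cv_risk`, and `select_degree` are hypothetical.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest order first),
    obtained by solving the normal equations."""
    k = degree + 1
    # Augmented normal-equation matrix [X'X | X'y].
    m = [[sum(x ** (i + j) for x in xs) for j in range(k)]
         + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]

def predict(coefs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** p for p, c in enumerate(coefs))

def cv_risk(xs, ys, degree, folds=5):
    """V-fold cross-validated mean squared error for one candidate degree."""
    n = len(xs)
    total = 0.0
    for v in range(folds):
        train = [i for i in range(n) if i % folds != v]
        valid = [i for i in range(n) if i % folds == v]
        coefs = fit_polynomial([xs[i] for i in train],
                               [ys[i] for i in train], degree)
        total += sum((ys[i] - predict(coefs, xs[i])) ** 2 for i in valid)
    return total / n

def select_degree(xs, ys, degrees=(0, 1, 2, 3), folds=5):
    """Return the degree with the smallest cross-validated risk, plus all risks."""
    risks = {d: cv_risk(xs, ys, d, folds) for d in degrees}
    return min(risks, key=risks.get), risks
```

Each candidate is fit on the training folds only and scored on the held-out fold, so the selected degree reflects out-of-sample squared error rather than in-sample fit.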

For more information, contact Romain Neugebauer, PhD or Oleg Sofrygin, PhD.