Chapter 21. G-methods for time-varying treatments

By Catie Wiener in What if?

July 7, 2023

Hernán MA, Robins JM (2020). Causal Inference: What If. Boca Raton: Chapman & Hall/CRC.

21.1 The g-formula for time-varying treatments

Time-fixed

Suppose the data below arise from a sequentially randomized experiment. There are two time points \(k\). Treatment at time \(k=0\) is assigned with probability 0.5. Treatment at time \(k=1\) is assigned with probability 0.4 if \(L_1=0\) and 0.8 if \(L_1=1\). The outcome is a function of the covariate \(L_1\) and other clinical measures, with higher values signifying better health.

\(N\) \(A_0\) \(L_1\) \(A_1\) \(\bar{Y}\)
2400 0 0 0 84
1600 0 0 1 84
2400 0 1 0 52
9600 0 1 1 52
4800 1 0 0 76
3200 1 0 1 76
1600 1 1 0 44
6400 1 1 1 44

We are interested in the effect of the time-fixed treatment \(A_1\), i.e.

$$E[Y^{a_1=1}]-E[Y^{a_1=0}]$$

Under the identifiability conditions, each counterfactual mean is a weighted average of the mean outcome conditional on the time-fixed treatment and the confounders, with weights given by the distribution of the confounders.

For example, the g-formula for \(E[Y^{a_1=1}]\) is:

$$\sum_{l_1}{E[Y|A_1=1, L_1=l_1]\times f(l_1)}$$

which is just the weighted average over the levels of \(L_1\).
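As a rough check, the time-fixed g-formula can be evaluated directly on the table. The sketch below is only an illustration: the names `rows` and `g_formula_time_fixed` are made up here, and it assumes \(L_1\) is the only confounder of \(A_1\), as in the design described above.

```python
# Rows of the table above: (N, A0, L1, A1, mean Y)
rows = [
    (2400, 0, 0, 0, 84),
    (1600, 0, 0, 1, 84),
    (2400, 0, 1, 0, 52),
    (9600, 0, 1, 1, 52),
    (4800, 1, 0, 0, 76),
    (3200, 1, 0, 1, 76),
    (1600, 1, 1, 0, 44),
    (6400, 1, 1, 1, 44),
]


def g_formula_time_fixed(a1):
    """Sum over l1 of E[Y | A1=a1, L1=l1] * f(l1), treating A1 as time-fixed."""
    total_n = sum(n for n, *_ in rows)
    mean = 0.0
    for l1 in (0, 1):
        # Individuals with A1 = a1 and L1 = l1 (pooled over A0)
        stratum = [(n, y) for n, _, l, a, y in rows if l == l1 and a == a1]
        n_stratum = sum(n for n, _ in stratum)
        cond_mean = sum(n * y for n, y in stratum) / n_stratum
        # Marginal distribution of L1
        f_l1 = sum(n for n, _, l, _, _ in rows if l == l1) / total_n
        mean += cond_mean * f_l1
    return mean


effect = g_formula_time_fixed(1) - g_formula_time_fixed(0)
print(g_formula_time_fixed(1), g_formula_time_fixed(0), effect)
```

Both standardized means work out to 60 (up to floating-point rounding), so the estimated effect of \(A_1\) in these data is null.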

Time-varying

Deterministic treatment strategies

In order to calculate the counterfactual means for a time-varying treatment \(E[Y^{a_0,a_1}]\), we must generalize the above g-formula. It will still be a weighted average, but now conditional on the time-varying treatment as well as confounders.

The g-formula for two time points then is:

$$\sum_{l_1}{E[Y|A_0=a_0,A_1=a_1, L_1=l_1]\times f(l_1|a_0)}$$
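The two-time-point formula can be checked against the table in the same way. This is a minimal sketch with made-up names (`rows`, `g_formula_two_points`); it reads \(E[Y|A_0,A_1,L_1]\) off the table rows and computes \(f(l_1|a_0)\) from the counts.

```python
# Rows of the table above: (N, A0, L1, A1, mean Y)
rows = [
    (2400, 0, 0, 0, 84),
    (1600, 0, 0, 1, 84),
    (2400, 0, 1, 0, 52),
    (9600, 0, 1, 1, 52),
    (4800, 1, 0, 0, 76),
    (3200, 1, 0, 1, 76),
    (1600, 1, 1, 0, 44),
    (6400, 1, 1, 1, 44),
]


def g_formula_two_points(a0, a1):
    """Sum over l1 of E[Y | A0=a0, A1=a1, L1=l1] * f(l1 | a0)."""
    n_a0 = sum(n for n, a, _, _, _ in rows if a == a0)
    mean = 0.0
    for l1 in (0, 1):
        # Each (a0, l1, a1) combination is exactly one row of the table
        cond_mean = next(
            y for _, a, l, b, y in rows if (a, l, b) == (a0, l1, a1)
        )
        # Distribution of L1 among those with A0 = a0
        f_l1 = sum(n for n, a, l, _, _ in rows if a == a0 and l == l1) / n_a0
        mean += cond_mean * f_l1
    return mean


for a0 in (0, 1):
    for a1 in (0, 1):
        print((a0, a1), g_formula_two_points(a0, a1))
```

All four static strategies give a counterfactual mean of 60 in these data: for \(a_0=0\) the sum is \(84\times 0.25+52\times 0.75\), and for \(a_0=1\) it is \(76\times 0.5+44\times 0.5\).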

Generalizing further, the g-formula for \(K+1\) time points is:

$$\sum_{\bar{l}}{E[Y|\bar{A}=\bar{a},\bar{L}=\bar{l}]}\prod^K_{k=0}f(l_k|\bar{a}_{k-1},\bar{l}_{k-1})$$
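To see how the general expression relates to the two-time-point case, it may help to write it out for \(K=1\). The expansion below assumes the usual convention that histories indexed by \(-1\) are empty, so that \(f(l_0|\bar{a}_{-1},\bar{l}_{-1})=f(l_0)\):

$$\sum_{l_0,l_1}{E[Y|A_0=a_0,A_1=a_1,L_0=l_0,L_1=l_1]\times f(l_0)\times f(l_1|a_0,l_0)}$$

When there is no baseline covariate \(L_0\), as in the table above, the \(l_0\) terms drop out and this reduces to the two-time-point formula.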

With high-dimensional data, the components of the g-formula often cannot be computed directly and must instead be estimated, typically with models. Plugging these estimates into the g-formula gives the plug-in g-formula, which is called the parametric g-formula when the estimates come from parametric models.
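A plug-in version can be sketched on the same table. Everything about the modelling here is an assumption made for illustration, not something taken from the book: a main-effects linear model for \(E[Y|A_0,A_1,L_1]\) fit by weighted least squares with NumPy, and a saturated estimate (simple proportions) for \(f(l_1|a_0)\). The names `p_l1` and `plug_in_g_formula` are hypothetical.

```python
import numpy as np

# Rows of the table above: (N, A0, L1, A1, mean Y)
rows = np.array([
    (2400, 0, 0, 0, 84),
    (1600, 0, 0, 1, 84),
    (2400, 0, 1, 0, 52),
    (9600, 0, 1, 1, 52),
    (4800, 1, 0, 0, 76),
    (3200, 1, 0, 1, 76),
    (1600, 1, 1, 0, 44),
    (6400, 1, 1, 1, 44),
], dtype=float)
n, a0, l1, a1, y = rows.T

# Assumed outcome model: E[Y | A0, A1, L1] = b0 + b1*A0 + b2*A1 + b3*L1,
# fit by least squares weighted by the cell counts N.
X = np.column_stack([np.ones_like(a0), a0, a1, l1])
w = np.sqrt(n)
beta, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)


def p_l1(a0_val):
    """Estimated P(L1 = 1 | A0 = a0_val) from the cell counts."""
    mask = a0 == a0_val
    return n[mask & (l1 == 1)].sum() / n[mask].sum()


def plug_in_g_formula(a0_val, a1_val):
    """Plug model-based estimates into the two-time-point g-formula."""
    p1 = p_l1(a0_val)
    mean = 0.0
    for l1_val, f_l1 in ((0, 1 - p1), (1, p1)):
        predicted = beta @ np.array([1.0, a0_val, a1_val, l1_val])
        mean += predicted * f_l1
    return mean


print(beta)
print(plug_in_g_formula(1, 1), plug_in_g_formula(0, 0))
```

In this table the cell means happen to follow \(84-8A_0-32L_1\) exactly, so the assumed linear model fits without error and the plug-in estimates agree with the nonparametric ones. With real data the answer would depend on how well the models are specified.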

Random treatment strategies

Under sequential exchangeability, the g-formula can be used to compute the counterfactual mean of the outcome under a random treatment strategy \(f^{int}\).

Example: independently at each time \(k\), treat individuals with probability 0.3:

$$f^{int}(1|\bar{a}_{k-1},\bar{l}_k)=0.3$$

In plain English: regardless of treatment history and time-varying covariate history, the probability of treatment is 0.3.

Question: Does this mean the probability of treatment is independent of treatment and covariate history?

General form covering both deterministic and random strategies:

$$\sum_{\bar{l},\bar{a}}{E[Y|\bar{A}=\bar{a},\bar{L}=\bar{l}]}\prod^K_{k=0}f(l_k|\bar{a}_{k-1},\bar{l}_{k-1})\prod^K_{k=0}f^{int}(a_k|\bar{a}_{k-1},\bar{l}_k)$$
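The general form can be evaluated on the two-time-point table for the random strategy "treat with probability 0.3 at each time". In the sketch below (hypothetical names `rows` and `g_formula_random`), every term of the sum over \((a_0,l_1,a_1)\) multiplies a conditional mean by the covariate density and by the intervention densities, and the terms are then added up.

```python
# Rows of the table above: (N, A0, L1, A1, mean Y)
rows = [
    (2400, 0, 0, 0, 84),
    (1600, 0, 0, 1, 84),
    (2400, 0, 1, 0, 52),
    (9600, 0, 1, 1, 52),
    (4800, 1, 0, 0, 76),
    (3200, 1, 0, 1, 76),
    (1600, 1, 1, 0, 44),
    (6400, 1, 1, 1, 44),
]


def g_formula_random(p_treat):
    """g-formula under a strategy that treats with probability p_treat at
    each time, independently of treatment and covariate history."""
    mean = 0.0
    for a0 in (0, 1):
        n_a0 = sum(n for n, a, _, _, _ in rows if a == a0)
        f_int_a0 = p_treat if a0 == 1 else 1 - p_treat
        for l1 in (0, 1):
            # f(l1 | a0) estimated from the cell counts
            f_l1 = sum(
                n for n, a, l, _, _ in rows if a == a0 and l == l1
            ) / n_a0
            for a1 in (0, 1):
                f_int_a1 = p_treat if a1 == 1 else 1 - p_treat
                cond_mean = next(
                    y for _, a, l, b, y in rows if (a, l, b) == (a0, l1, a1)
                )
                # One term of the sum: conditional mean times all densities
                mean += cond_mean * f_l1 * f_int_a0 * f_int_a1
    return mean


print(g_formula_random(0.3))
```

Because every static strategy has mean 60 in these data and the weights in the sum add up to 1, the random strategy also gives 60 (up to floating-point rounding).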

Question: I'm starting to suspect that some of my misunderstanding arises because the products sit inside the sum, so these are not three distinct parts?