Chapter 21. G-methods for time-varying treatments
By Catie Wiener in What if?
July 7, 2023
Hernán MA, Robins JM (2020). Causal Inference: What If. Boca Raton: Chapman & Hall/CRC.
21.1 The g-formula for time-varying treatments
Time-fixed
Suppose the data below arise from a sequentially randomized experiment. There are two time points \(k\). Treatment at time \(k=0\) is assigned with probability 0.5. Treatment at time \(k=1\) is assigned with probability 0.4 if \(L_1=0\) and 0.8 if \(L_1=1\). The outcome is a function of the covariate \(L_1\) and other clinical measures, with higher values signifying better health.
\(N\) | \(A_0\) | \(L_1\) | \(A_1\) | \(\bar{Y}\) |
---|---|---|---|---|
2400 | 0 | 0 | 0 | 84 |
1600 | 0 | 0 | 1 | 84 |
2400 | 0 | 1 | 0 | 52 |
9600 | 0 | 1 | 1 | 52 |
4800 | 1 | 0 | 0 | 76 |
3200 | 1 | 0 | 1 | 76 |
1600 | 1 | 1 | 0 | 44 |
6400 | 1 | 1 | 1 | 44 |
We are interested in the effect of the time-fixed treatment \(A_1\), i.e.
$$E[Y^{a_1=1}]-E[Y^{a_1=0}]$$
Under the identifiability conditions, each counterfactual mean is a weighted average of the mean outcome conditional on the time-fixed treatment and the confounders, with weights equal to the probability of each confounder level.
For example, the g-formula for \(E[Y^{a_1=1}]\) is:
$$\sum_{l_1}{E[Y|A_1=1, L_1=l_1]\times f(l_1)}$$
which is just the weighted average.
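As a rough check, the time-fixed g-formula can be evaluated directly on the table above. This is only a sketch: the names `ROWS`, `cond_mean`, `f_l1`, and `g_formula_time_fixed` are made up for illustration, and the conditional means pool over \(A_0\) because the time-fixed formula conditions only on \(A_1\) and \(L_1\).

```python
# Rows of the table above: (N, A0, L1, A1, mean Y)
ROWS = [
    (2400, 0, 0, 0, 84),
    (1600, 0, 0, 1, 84),
    (2400, 0, 1, 0, 52),
    (9600, 0, 1, 1, 52),
    (4800, 1, 0, 0, 76),
    (3200, 1, 0, 1, 76),
    (1600, 1, 1, 0, 44),
    (6400, 1, 1, 1, 44),
]


def cond_mean(a1, l1):
    """E[Y | A1=a1, L1=l1]: size-weighted mean over matching rows (pools A0)."""
    matched = [(n, y) for n, _a0, l, a, y in ROWS if a == a1 and l == l1]
    return sum(n * y for n, y in matched) / sum(n for n, _ in matched)


def f_l1(l1):
    """Marginal probability f(l1) in the whole study population."""
    total = sum(n for n, *_ in ROWS)
    return sum(n for n, _a0, l, _a1, _y in ROWS if l == l1) / total


def g_formula_time_fixed(a1):
    """Sum over l1 of E[Y | A1=a1, L1=l1] * f(l1)."""
    return sum(cond_mean(a1, l1) * f_l1(l1) for l1 in (0, 1))


mean_treated = g_formula_time_fixed(1)
mean_untreated = g_formula_time_fixed(0)
print(round(mean_treated, 6), round(mean_untreated, 6),
      round(mean_treated - mean_untreated, 6))
```

With these data both standardized means work out to 60 (up to floating-point rounding), so the estimated effect of \(A_1\) is zero.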
Time-varying
Deterministic treatment strategies
In order to calculate the counterfactual means for a time-varying treatment \(E[Y^{a_0,a_1}]\), we must generalize the above g-formula. It will still be a weighted average, but now conditional on the time-varying treatment as well as confounders.
The g-formula for two time points then is:
$$\sum_{l_1}{E[Y|A_0=a_0,A_1=a_1, L_1=l_1]\times f(l_1|a_0)}$$
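The two-time-point formula can be sketched on the same table. The helper names (`MEAN_Y`, `f_l1_given_a0`, `g_formula`) are hypothetical; the only inputs are the stratum sizes and mean outcomes listed in section 21.1.

```python
# Rows of the table in section 21.1: (N, A0, L1, A1, mean Y)
ROWS = [
    (2400, 0, 0, 0, 84),
    (1600, 0, 0, 1, 84),
    (2400, 0, 1, 0, 52),
    (9600, 0, 1, 1, 52),
    (4800, 1, 0, 0, 76),
    (3200, 1, 0, 1, 76),
    (1600, 1, 1, 0, 44),
    (6400, 1, 1, 1, 44),
]

# E[Y | A0=a0, L1=l1, A1=a1]: the table has exactly one row per stratum.
MEAN_Y = {(a0, l1, a1): y for _n, a0, l1, a1, y in ROWS}


def f_l1_given_a0(l1, a0):
    """f(l1 | a0): share of individuals with L1=l1 among those with A0=a0."""
    n_a0 = sum(n for n, a, _l, _a1, _y in ROWS if a == a0)
    n_both = sum(n for n, a, l, _a1, _y in ROWS if a == a0 and l == l1)
    return n_both / n_a0


def g_formula(a0, a1):
    """Sum over l1 of E[Y | a0, a1, l1] * f(l1 | a0)."""
    return sum(MEAN_Y[(a0, l1, a1)] * f_l1_given_a0(l1, a0) for l1 in (0, 1))


for a0 in (0, 1):
    for a1 in (0, 1):
        print(f"E[Y^(a0={a0}, a1={a1})] = {g_formula(a0, a1):.2f}")
```

In this example all four static strategies give a counterfactual mean of 60, so no contrast between them shows a treatment effect.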
Generalizing further, the g-formula for \(K+1\) time points is:
$$\sum_{\bar{l}}{E[Y|\bar{A}=\bar{a},\bar{L}=\bar{l}]}\prod^K_{k=0}f(l_k|\bar{a}_{k-1},\bar{l}_{k-1})$$
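As a sanity check, writing out the case \(K=1\) (two time points) gives the expression below. This assumes the usual convention that \(\bar{a}_{-1}\) and \(\bar{l}_{-1}\) are empty, so the \(k=0\) factor is just \(f(l_0)\):

$$\sum_{l_0,l_1}{E[Y|A_0=a_0,A_1=a_1,L_0=l_0,L_1=l_1]\times f(l_0)\times f(l_1|a_0,l_0)}$$

When there is no baseline covariate \(L_0\), as in the table above, the \(l_0\) terms drop out and only the sum over \(l_1\) weighted by \(f(l_1|a_0)\) remains.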
When the data are high-dimensional, the components of the g-formula often cannot be computed directly from the observed strata, so each component must be estimated. The estimates can then be plugged into the g-formula (the plug-in g-formula, or, when based on parametric models, the parametric g-formula).
Random treatment strategies
Under sequential exchangeability, the g-formula can be used to compute the counterfactual mean of the outcome under a random treatment strategy \(f^{int}\).
Example: independently at each time \(k\), treat individuals with probability 0.3:
$$f^{int}(1|\bar{a}_{k-1},\bar{l}_k)=0.3$$
In plain English: no matter what the treatment history or time-varying covariate history, the probability of treatment is 0.3.
Question: Does this mean the probability of treatment is independent of treatment and covariate history?
General form covering both deterministic and random strategies:
$$\sum_{\bar{l},\bar{a}}{E[Y|\bar{A}=\bar{a},\bar{L}=\bar{l}]}\prod^K_{k=0}f(l_k|\bar{a}_{k-1},\bar{l}_{k-1})\prod^K_{k=0}f^{int}(a_k|\bar{a}_{k-1},\bar{l}_{k})$$
Question: I’m starting to suspect some of my misunderstanding is happening because the products are part of the sum -> it’s not 3 distinct parts?
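One way to see how the pieces fit together is to evaluate the general form on the table from section 21.1 under the "treat with probability 0.3" strategy. In this sketch (helper names such as `f_int` and `g_formula_random` are hypothetical), every term of the sum multiplies a conditional mean by a covariate probability and by the intervention probabilities, so the products sit inside the sum rather than being computed separately.

```python
from itertools import product

# Rows of the table in section 21.1: (N, A0, L1, A1, mean Y)
ROWS = [
    (2400, 0, 0, 0, 84),
    (1600, 0, 0, 1, 84),
    (2400, 0, 1, 0, 52),
    (9600, 0, 1, 1, 52),
    (4800, 1, 0, 0, 76),
    (3200, 1, 0, 1, 76),
    (1600, 1, 1, 0, 44),
    (6400, 1, 1, 1, 44),
]

# E[Y | A0=a0, L1=l1, A1=a1]: one table row per stratum.
MEAN_Y = {(a0, l1, a1): y for _n, a0, l1, a1, y in ROWS}


def f_l1_given_a0(l1, a0):
    """f(l1 | a0), estimated from the stratum sizes."""
    n_a0 = sum(n for n, a, _l, _a1, _y in ROWS if a == a0)
    n_both = sum(n for n, a, l, _a1, _y in ROWS if a == a0 and l == l1)
    return n_both / n_a0


def f_int(a_k):
    """Intervention distribution: treat with probability 0.3, whatever the history."""
    return 0.3 if a_k == 1 else 0.7


def g_formula_random():
    """Sum over (a0, l1, a1) of E[Y|.] * f(l1|a0) * f_int(a0) * f_int(a1)."""
    total = 0.0
    for a0, l1, a1 in product((0, 1), repeat=3):
        total += (MEAN_Y[(a0, l1, a1)]
                  * f_l1_given_a0(l1, a0)
                  * f_int(a0) * f_int(a1))
    return total


print(round(g_formula_random(), 6))
```

Because every static strategy in this example has a counterfactual mean of 60, the random strategy, which is a mixture of them, also gives 60 up to floating-point rounding.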