using Propensity score matching
9/2/24
RCTs are expensive:
May not be practical or ethically feasible
Real World Data (RWD):
Real World Evidence (RWE):
Subject | \(Y_i(T)\) | \(Y_i(C)\) | Trt. Effect: \(Y_i(T) - Y_i(C)\) |
---|---|---|---|
1 | 14 | ? | ? |
2 | 9 | ? | ? |
3 | 8 | ? | ? |
4 | ? | 5 | ? |
5 | ? | 10 | ? |
6 | ? | 7 | ? |
Mean | 10.33 | 7.33 | 3 |
In an RCT, \(\tau = \bar{Y}(T) - \bar{Y}(C)\) estimates a causal effect
In general, \(\tau\) is not causal for observational studies (OS)
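The bottom row of the table can be reproduced from the observed outcomes alone. The sketch below uses `None` for each subject's unobserved potential outcome; the variable names are illustrative and not taken from the slides.

```python
from statistics import mean

# Observed potential outcomes from the table; None marks the unobserved one.
y_treated = [14, 9, 8, None, None, None]   # Y_i(T)
y_control = [None, None, None, 5, 10, 7]   # Y_i(C)

mean_t = mean(y for y in y_treated if y is not None)
mean_c = mean(y for y in y_control if y is not None)
tau_hat = mean_t - mean_c                  # difference in group means

# An individual effect needs both outcomes for the same subject,
# so none of them can be computed from the observed data.
individual_effects = [
    t - c if t is not None and c is not None else None
    for t, c in zip(y_treated, y_control)
]

print(round(mean_t, 2), round(mean_c, 2), round(tau_hat, 2))
print(individual_effects)
```

Only the group means are available, which is why the interpretation of their difference depends on how subjects ended up in each group.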
“Probability of treatment assignment based on observed baseline covariates” (Rosenbaum and Rubin 1983)
\[ PS_i = Pr(X_i = 1 \mid \boldsymbol{Z}_i) = f(\boldsymbol{Z}_i) \qquad(1)\]
Logistic regression or machine learning to estimate Equation 1
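As a rough illustration of the logistic regression route, the sketch below fits a one-covariate logistic model by plain gradient ascent and returns the fitted probabilities as propensity scores. The data, the single covariate `z`, and the helper names `sigmoid` and `fit_logistic` are all made up for this example; a real analysis would normally rely on an established statistics package rather than a hand-rolled fit.

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def fit_logistic(z, x, lr=0.1, n_iter=5000):
    """Fit Pr(X = 1 | Z) = sigmoid(b0 + b1 * z) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(z)
    for _ in range(n_iter):
        resid = [xi - sigmoid(b0 + b1 * zi) for zi, xi in zip(z, x)]
        b0 += lr * sum(resid) / n
        b1 += lr * sum(r * zi for r, zi in zip(resid, z)) / n
    return b0, b1

# Toy data (hypothetical): z is a standardized baseline covariate, x is treatment (1) or control (0).
z = [-1.5, -1.0, -0.5, -0.2, 0.0, 0.3, 0.6, 1.0, 1.4, 2.0]
x = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(z, x)
ps = [sigmoid(b0 + b1 * zi) for zi in z]   # estimated propensity scores

for zi, xi, pi in zip(z, x, ps):
    print(f"z={zi:5.1f}  x={xi}  PS={pi:.3f}")
```

Subjects with larger `z` receive higher estimated scores here because, in this toy data set, treatment is more common at larger `z`.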
Confounder \(\boldsymbol{Z}\) causes both treatment \(\boldsymbol{X}\) and outcome \(\boldsymbol{Y}\)
The PS models the relation between \(\boldsymbol{X}\) and \(\boldsymbol{Z}\), so conditioning on it removes confounding by the observed covariates
https://sixsigmadsi.com/glossary/confounding/
Without a model for how treatments are assigned to units, formal causal inference is impossible (Little and Rubin 2000)
The PS is a balancing score: patients with similar PS should have similar baseline covariates (Austin 2011a)
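One common way to check the balancing property is the standardized mean difference (SMD) of a covariate between treated and control patients, computed before and after conditioning on the PS. The sketch below is a minimal version under stated assumptions: the ages are invented, and the function name `smd` and the choice of pooled standard deviation are illustrative rather than taken from the slides.

```python
from statistics import mean, variance

def smd(treated, control):
    """Standardized mean difference: mean difference divided by the pooled SD."""
    pooled_sd = ((variance(treated) + variance(control)) / 2) ** 0.5
    return (mean(treated) - mean(control)) / pooled_sd

# Hypothetical ages; the matched controls are a subset of all controls.
age_treated = [52, 60, 58, 65, 49]
age_control_all = [35, 41, 50, 38, 62, 57, 44, 30, 64, 48]
age_control_matched = [50, 62, 57, 64, 48]

smd_before = smd(age_treated, age_control_all)
smd_after = smd(age_treated, age_control_matched)

print(f"SMD before matching: {smd_before:.2f}")
print(f"SMD after matching:  {smd_after:.2f}")
```

A value near zero after matching suggests that treated and matched control patients are similar on that covariate.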
Recommendations from the simulations of Austin (2014a):
https://help.easymedstat.com/support/solutions/articles/77000538175-caliper-in-propensity-score-matching
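To make the caliper idea concrete, the sketch below implements greedy 1:1 nearest-neighbor matching without replacement on the logit of the PS. It is an illustration, not the algorithm of any particular package: the function name `greedy_match`, the caliper width of 0.2 standard deviations of the logit PS, the processing order of treated units, and the propensity scores are all assumptions made for this example.

```python
import math
from statistics import stdev

def logit(p):
    return math.log(p / (1 - p))

def greedy_match(ps_treated, ps_control, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    A pair is kept only if the logit-PS distance is within
    caliper_sd * SD(logit PS); otherwise the treated unit stays unmatched.
    Returns a list of (treated_index, control_index) pairs.
    """
    lt = [logit(p) for p in ps_treated]
    lc = [logit(p) for p in ps_control]
    caliper = caliper_sd * stdev(lt + lc)
    available = set(range(len(lc)))
    pairs = []
    for i, li in enumerate(lt):
        if not available:
            break
        # Closest control that has not been used yet.
        j = min(sorted(available), key=lambda k: abs(lc[k] - li))
        if abs(lc[j] - li) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs

# Hypothetical propensity scores.
ps_treated = [0.80, 0.55, 0.30, 0.95]
ps_control = [0.10, 0.28, 0.52, 0.57, 0.78, 0.20]

pairs = greedy_match(ps_treated, ps_control)
print(pairs)
```

In this toy example the treated unit with PS 0.95 has no control within the caliper and is dropped, which is how a caliper trades sample size for closer matches.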
ATE: \(E\big[Y_i(T) - Y_i(C)\big]\)
ATT: \(E\big[Y_i(T) - Y_i(C) \mid T\big]\)
ATC: \(E\big[Y_i(T) - Y_i(C) \mid C\big]\)
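The three estimands are linked by the law of total expectation. Writing \(p\) for the proportion of treated subjects in the population (a symbol introduced here for illustration, not used elsewhere in the slides), the ATE is a weighted average of the other two:

\[ ATE = p \cdot ATT + (1 - p) \cdot ATC \]

So the ATE equals the ATT when the average effect is the same in treated and control subjects (ATT = ATC); otherwise the two estimands generally differ, and the choice of method determines which one is estimated.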
Method | ATE | ATT/ATC |
---|---|---|
Matching | ❌ | ✅ |
Stratification | ✅ | ✅ |
Inverse probability weighting | ✅ | ✅ |
Covariate adjustment | ❌ | ❌ |
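For inverse probability weighting, the estimand is chosen through the weights. The sketch below uses the standard weight formulas for the ATE, ATT, and ATC and a normalized weighted difference in means. The outcomes are the six values from the potential outcomes table, the propensity scores are invented, and the helper names `ipw_weights` and `weighted_effect` are hypothetical.

```python
def ipw_weights(x, ps, estimand="ATE"):
    """Inverse probability weights from treatment indicators x (0/1) and propensity scores ps."""
    weights = []
    for xi, pi in zip(x, ps):
        if estimand == "ATE":
            w = 1 / pi if xi == 1 else 1 / (1 - pi)
        elif estimand == "ATT":
            w = 1.0 if xi == 1 else pi / (1 - pi)
        elif estimand == "ATC":
            w = (1 - pi) / pi if xi == 1 else 1.0
        else:
            raise ValueError(f"unknown estimand: {estimand}")
        weights.append(w)
    return weights

def weighted_effect(y, x, w):
    """Difference in normalized weighted outcome means (treated minus control)."""
    mean_t = (sum(wi * yi for yi, xi, wi in zip(y, x, w) if xi == 1)
              / sum(wi for xi, wi in zip(x, w) if xi == 1))
    mean_c = (sum(wi * yi for yi, xi, wi in zip(y, x, w) if xi == 0)
              / sum(wi for xi, wi in zip(x, w) if xi == 0))
    return mean_t - mean_c

x = [1, 1, 1, 0, 0, 0]
y = [14, 9, 8, 5, 10, 7]
ps = [0.8, 0.5, 0.4, 0.5, 0.2, 0.25]   # made-up propensity scores

w_ate = ipw_weights(x, ps, "ATE")
w_att = ipw_weights(x, ps, "ATT")

print("ATE estimate:", round(weighted_effect(y, x, w_ate), 2))
print("ATT estimate:", round(weighted_effect(y, x, w_att), 2))
```

Under the ATT weights the treated subjects keep weight 1 and the controls are reweighted to resemble them, which mirrors what matching does by selection.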
?distance gives many other options (e.g. LASSO, random forests, boosting, neural nets)
Patients | Control | Treated |
---|---|---|
Total | 429 | 185 |
Matched | 113 (26.3%) | 113 (61.1%) |
Unmatched | 316 (73.7%) | 72 (38.9%) |
Cutoffs:
trt.mean | ctrl.mean | diff | conf.low | conf.high | p.value |
---|---|---|---|---|---|
6510 | 4938 | 1572 | -365 | 3508 | 0.111 |
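The figures in this table can be checked for internal consistency. The sketch below assumes the interval is a 95% confidence interval and uses a normal approximation to back out a standard error and p-value; the original test was presumably t-based, so a small discrepancy from the reported p-value is expected.

```python
import math

trt_mean, ctrl_mean = 6510, 4938
conf_low, conf_high = -365, 3508

diff = trt_mean - ctrl_mean                 # should equal the reported diff
half_width = (conf_high - conf_low) / 2
se_approx = half_width / 1.96               # assumes a 95% interval, normal approximation
z_stat = diff / se_approx
p_approx = math.erfc(z_stat / math.sqrt(2)) # two-sided p-value under normality

print(diff, round(se_approx), round(p_approx, 3))
```

The interval includes zero, which agrees with a p-value above 0.05: the estimated difference is positive but not statistically distinguishable from no effect at that level.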
Stratification on PS uses all patients, but does not work well with survival data (Austin 2014b, 2016)
Inverse probability weighting and regression adjustment are very flexible, but, unlike matching, are generally not accepted by the FDA (Lu 2019)
Causal Random Forests look promising: https://grf-labs.github.io/grf/