Contamination Bias in Linear Regressions

Working Paper: NBER ID: w30108

Authors: Paul Goldsmith-Pinkham; Peter Hull; Michal Kolesar

Abstract: We study regressions with multiple treatments and a set of controls that is flexible enough to purge omitted variable bias. We show that these regressions generally fail to estimate convex averages of heterogeneous treatment effects—instead, estimates of each treatment’s effect are contaminated by non-convex averages of the effects of other treatments. We discuss three estimation approaches that avoid such contamination bias, including the targeting of easiest-to-estimate weighted average effects. A re-analysis of nine empirical applications finds economically and statistically meaningful contamination bias in observational studies; contamination bias in experimental studies is more limited due to smaller variability in propensity scores.
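The contamination described in the abstract can be illustrated with a small numerical sketch. This is not the paper's code or one of its applications: the binary control X, the three arms (0 = control, 1, 2), the assignment probabilities, and the helper name population_ols are all assumptions chosen for illustration. The sketch computes the population OLS coefficients of Y on indicators for treatments 1 and 2 plus X by weighting each (X, arm) cell by its probability. Treatment 1 is given a true effect of zero for everyone, so any non-zero coefficient on it comes purely from the effect of treatment 2.

```python
import numpy as np

# Hypothetical design (illustrative numbers, not taken from the paper):
# X is a binary control with P(X=1) = 0.5, and arm probabilities depend on X.
P_X = {0: 0.5, 1: 0.5}
P_ARM_GIVEN_X = {0: (0.8, 0.1, 0.1),   # P(arm 0), P(arm 1), P(arm 2) when X = 0
                 1: (0.1, 0.1, 0.8)}   # same, when X = 1


def population_ols(tau1, tau2):
    """Population coefficients on the treatment-1 and treatment-2 dummies.

    tau1[x] and tau2[x] are the effects of treatments 1 and 2 for units with X = x.
    The regression is Y on (1, D1, D2, X), solved as weighted least squares over
    the six (X, arm) cells, with cell probabilities as weights.
    """
    rows, outcomes, weights = [], [], []
    for x in (0, 1):
        for arm in (0, 1, 2):
            d1, d2 = float(arm == 1), float(arm == 2)
            rows.append([1.0, d1, d2, float(x)])
            # Baseline outcome 0.5 * x plus the effect of the assigned arm.
            outcomes.append(0.5 * x + tau1[x] * d1 + tau2[x] * d2)
            weights.append(P_X[x] * P_ARM_GIVEN_X[x][arm])
    Z, y, w = np.array(rows), np.array(outcomes), np.array(weights)
    beta = np.linalg.solve(Z.T @ (w[:, None] * Z), Z.T @ (w * y))
    return beta[1], beta[2]


# Treatment 1 has zero effect; treatment 2's effect differs across X.
b1_het, b2_het = population_ols(tau1=(0.0, 0.0), tau2=(1.0, 0.0))
# Treatment 1 has zero effect; treatment 2's effect is the same for all X.
b1_hom, b2_hom = population_ols(tau1=(0.0, 0.0), tau2=(1.0, 1.0))

print(f"heterogeneous tau2: coefficient on treatment 1 = {b1_het:.4f}")
print(f"homogeneous tau2:   coefficient on treatment 1 = {b1_hom:.4f}")
```

Under these assumed numbers, the coefficient on treatment 1 is about 0.15 when the effect of treatment 2 varies with X, even though treatment 1 has no effect on anyone. When the effect of treatment 2 is constant, the coefficient on treatment 1 is zero, which matches the claim that the bias is minimal when effect heterogeneity is unrelated to treatment propensities.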


JEL Codes: C14; C21; C22; C90


Causal Claims Network Graph

Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.


Causal Claims

Cause | Effect
Contamination Bias (Q53) | Misinterpretation of Regression Coefficients (C29)
Regression Coefficients for Treatments (C29) | Nonconvex Averages of Effects of Other Treatments (C32)
Nonlinearity of Treatment Assignment Probabilities (C22) | Contamination Bias (Q53)
Contamination Bias (Q53) | Bias Term is Linear Combination of Causal Effects of Other Treatments (C32)
Flexible Covariate Adjustment (C24) | Persistence of Contamination Bias (D91)
Contamination Bias (Q53) | Relevance in Instrumental Variable Regressions (C36)
Effect Heterogeneity Uncorrelated with Treatment Propensity Scores (C21) | Minimal Bias (D79)
Effect Heterogeneity Correlated with Treatment Propensity Scores (C21) | Significant Contamination Bias (C83)
