Working Paper: NBER ID: w15716
Authors: Patrick Kline; Andres Santos
Abstract: This paper develops methods for assessing the sensitivity of empirical conclusions regarding conditional distributions to departures from the missing at random (MAR) assumption. We index the degree of non-ignorable selection governing the missingness process by the maximal Kolmogorov-Smirnov (KS) distance between the distributions of missing and observed outcomes across all values of the covariates. Sharp bounds on minimum mean square approximations to conditional quantiles are derived as a function of the nominal level of selection considered in the sensitivity analysis and a weighted bootstrap procedure is developed for conducting inference. Using these techniques, we conduct an empirical assessment of the sensitivity of observed earnings patterns in U.S. Census data to deviations from the MAR assumption. We find that the well-documented increase in the returns to schooling between 1980 and 1990 is relatively robust to deviations from the missing at random assumption except at the lowest quantiles of the distribution, but that conclusions regarding heterogeneity in returns and changes in the returns function between 1990 and 2000 are very sensitive to departures from ignorability.
Keywords: missing data; sensitivity analysis; conditional quantiles; returns to schooling; U.S. wage structure
JEL Codes: C01; C12; J3
Cause | Effect |
---|---|
increase in returns to schooling between 1980 and 1990 (I26) | robustness to deviations from the MAR assumption (C20) |
deviations from the MAR assumption (B51) | sensitivity of conclusions regarding heterogeneity in returns and changes in the returns function between 1990 and 2000 (C22) |
modest deviations from MAR (C59) | robustness of the apparent convexification of the earnings-education profile (J79) |
changes in the wage structure at lower quantiles (J31) | susceptibility to selection biases (C83) |
KS distance (C49) | assessment of sensitivity without prior knowledge of the missing data mechanism (C52) |
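
The abstract indexes selection by the KS distance between the distributions of missing and observed outcomes. A minimal sketch of how that index translates into bounds on a quantile, for a single covariate cell only, follows. It is not the authors' code: the function names `ks_distance` and `quantile_bounds` and the arguments `y_obs`, `p_obs`, `tau`, and `k` are hypothetical. The bounds come from the mixture identity F = p·F_obs + (1 − p)·F_mis combined with the restriction |F_mis(c) − F_obs(c)| ≤ k for all c. The paper's minimum mean square approximations across covariates and its weighted bootstrap are not covered here.

```python
import numpy as np

def ks_distance(a, b):
    """Two-sample KS distance: max over c of |F_a(c) - F_b(c)| for empirical CDFs."""
    a, b = np.sort(np.asarray(a, float)), np.sort(np.asarray(b, float))
    grid = np.concatenate([a, b])  # the maximum is attained at a sample point
    F_a = np.searchsorted(a, grid, side="right") / a.size
    F_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(F_a - F_b)))

def quantile_bounds(y_obs, p_obs, tau, k):
    """Bounds on the tau-quantile of the full outcome distribution.

    y_obs : observed outcomes in one covariate cell (assumed input)
    p_obs : share of observations with a non-missing outcome, 0 < p_obs <= 1
    tau   : quantile of interest, 0 < tau < 1
    k     : assumed maximal KS distance between missing and observed outcomes
    """
    miss = 1.0 - p_obs
    # Levels of the observed distribution that bracket the full-data quantile.
    lo_level = max(tau - miss * k, (tau - miss) / p_obs)
    hi_level = min(tau + miss * k, tau / p_obs)
    # Outside (0, 1) the data are uninformative about that end of the interval.
    lower = float(np.quantile(y_obs, lo_level)) if lo_level > 0 else -np.inf
    upper = float(np.quantile(y_obs, hi_level)) if hi_level < 1 else np.inf
    return lower, upper
```

Setting `k = 0` reproduces the MAR point estimate, while `k = 1` imposes no restriction on the missing outcomes and gives the worst-case interval. Intermediate values of `k` trace out how quickly a conclusion breaks down as selection is allowed to grow.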