Working Paper: NBER w21961
Authors: Raj Chetty; John N. Friedman; Jonah Rockoff
Abstract: Value-added (VA) models measure the productivity of agents such as teachers or doctors based on the outcomes they produce. The utility of VA models for performance evaluation depends on the extent to which VA estimates are biased by selection, for instance by differences in the abilities of students assigned to teachers. One widely used approach for evaluating bias in VA is to test for balance in lagged values of the outcome, based on the intuition that today’s inputs cannot influence yesterday’s outcomes. We use Monte Carlo simulations to show that, unlike in conventional treatment effect analyses, tests for balance using lagged outcomes do not provide robust information about the degree of bias in value-added models for two reasons. First, the treatment itself (value-added) is estimated, rather than exogenously observed. As a result, correlated shocks to outcomes can induce correlations between current VA estimates and lagged outcomes that are sensitive to model specification. Second, in most VA applications, estimation error does not vanish asymptotically because sample sizes per teacher (or principal, manager, etc.) remain small, making balance tests sensitive to the specification of the error structure even in large datasets. We conclude that bias in VA models is better evaluated using techniques that are less sensitive to model specification, such as randomized experiments, rather than using lagged outcomes.
Keywords: Value-added models; Teacher effectiveness; Bias evaluation; Lagged outcomes; Randomized experiments
JEL Codes: C18; H75; I21; J01; J08; J45; M50; M54
In the original causal-graph visualization, edges evidenced by causal inference methods are shown in orange; the remaining edges are shown in light blue.
| Cause | Effect |
|---|---|
| Tests for balance using lagged outcomes (C22) | Bias in VA models (C52) |
| Correlated shocks affect both current VA estimates and lagged outcomes (C32) | Tests for balance using lagged outcomes do not provide robust information about the degree of bias in VA models (C22) |
| Estimation error due to small sample sizes per teacher (C21) | Tests for balance using lagged outcomes do not provide robust information about the degree of bias in VA models (C22) |
| The treatment itself (value-added) is estimated rather than exogenously observed (C22) | Identification of bias in VA models is complicated (C32) |
| Randomized experiments (C90) | Bias in VA models (C52) |
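The abstract's argument can be illustrated with a minimal Monte Carlo sketch in the spirit of the paper's simulations (all parameter values and the school-shock structure below are illustrative assumptions, not taken from the paper). True teacher value-added is assigned independently of students' lagged scores, so there is no bias; yet a persistent school-level shock enters both lagged and current scores, so a VA estimate that fails to remove it "fails" a lagged-outcome balance test, while a specification that demeans within schools "passes" it. Because class sizes stay small, the estimation error in each teacher's VA never vanishes, no matter how many teachers are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumptions, not values from the paper)
n_schools, teachers_per_school, n_students = 200, 5, 20
sigma_mu, sigma_school, sigma_eps = 0.10, 0.20, 0.50

# True teacher value-added; assignment is random, so there is no true bias
mu = rng.normal(0, sigma_mu, n_schools * teachers_per_school)
school = np.repeat(np.arange(n_schools), teachers_per_school)
s = rng.normal(0, sigma_school, n_schools)  # persistent school-level shock

# Lagged scores already reflect the school shock; current scores add true VA
lagged = s[school][:, None] + rng.normal(0, sigma_eps, (mu.size, n_students))
current = (mu[:, None] + s[school][:, None]
           + rng.normal(0, sigma_eps, (mu.size, n_students)))

# Naive VA estimate: teacher-level mean of current scores. With only
# n_students per teacher, its estimation error does not shrink as the
# number of teachers grows.
va_naive = current.mean(axis=1)

# "Balance test": correlate the VA estimate with students' lagged scores
mean_lagged = lagged.mean(axis=1)
r_naive = np.corrcoef(va_naive, mean_lagged)[0, 1]

# Alternative specification: demean current scores within schools, which
# absorbs the correlated shock before estimating VA
school_mean = np.array([current[school == k].mean() for k in range(n_schools)])
va_demeaned = va_naive - school_mean[school]
r_demeaned = np.corrcoef(va_demeaned, mean_lagged)[0, 1]

# True VA is balanced by construction, yet the naive balance test rejects
r_true = np.corrcoef(mu, mean_lagged)[0, 1]
print(f"correlation of TRUE VA with lagged scores:      {r_true:.3f}")
print(f"balance-test correlation, naive VA estimate:    {r_naive:.3f}")
print(f"balance-test correlation, demeaned VA estimate: {r_demeaned:.3f}")
```

The same unbiased assignment thus yields opposite balance-test verdicts under two reasonable specifications, which is the abstract's point: the test's outcome reflects how the correlated error structure is modeled, not the degree of bias in VA itself.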