Working Paper: NBER ID: w23925
Authors: Jonathan M.V. Davis; Jonathan Guryan; Kelly Hallberg; Jens Ludwig
Abstract: Most randomized controlled trials (RCTs) of social programs test interventions at modest scale. While the hope is that promising programs will be scaled up, we have few successful examples of this scale-up process in practice. Ideally we would like to know which programs will work at large scale before we invest the resources to take them to scale. But it would seem that the only way to tell whether a program works at scale is to test it at scale. Our goal in this paper is to propose a way out of this Catch-22. We first develop a simple model that helps clarify the type of scale-up challenge for which our method is most relevant. Most social programs rely on labor as a key input (teachers, nurses, social workers, etc.). We know people vary greatly in their skill at these jobs. So social programs, like firms, confront a search problem in the labor market that can lead to inelastically supplied human capital. The result is that as programs scale, either average costs must increase if program quality is to be held constant, or else program quality will decline if average costs are held fixed. Our proposed method for reducing the costs of estimating program impacts at large scale combines the fact that hiring inherently involves ranking inputs with the most powerful element of the social science toolkit: randomization. We show that it is possible to operate a program at modest scale n but learn about the input supply curves facing the firm at much larger scale (S × n) by randomly sampling the inputs the provider would have hired if they operated at scale (S × n). We build a simple two-period model of social-program decision making and use a model of Bayesian learning to develop heuristics for when scale-up experiments of the sort we propose are likely to be particularly valuable. We also present a series of results to illustrate the method, including one application to a real-world tutoring program that highlights an interesting observation: The noisier the program provider's prediction of input quality, the less pronounced is the scale-up problem.
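To make the sampling idea concrete, here is a minimal Python sketch of the design described in the abstract. The normal model of tutor quality, the pool size, the scale-up factor `S`, the noise level, and all variable names are illustrative assumptions, not the paper's calibration or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions (not from the paper): true tutor quality is standard
# normal, and at hiring time the provider only observes a noisy prediction of it.
n_applicants = 10_000   # size of the applicant pool
n = 100                 # tutors hired at the current, modest scale
S = 10                  # hypothetical scale-up factor, so S * n hires at scale
noise_sd = 1.0          # noisiness of the provider's prediction of quality

true_quality = rng.normal(size=n_applicants)
predicted_quality = true_quality + rng.normal(scale=noise_sd, size=n_applicants)

# The provider rank-orders applicants by predicted quality, as it would in hiring.
ranked = np.argsort(-predicted_quality)

# Business as usual at small scale: hire the top n candidates.
small_scale_hires = ranked[:n]

# Proposed scale-up experiment: identify the S * n candidates who WOULD be hired
# at the larger scale, then randomly sample n of them to hire and evaluate now.
would_hire_at_scale = ranked[: S * n]
scale_up_hires = rng.choice(would_hire_at_scale, size=n, replace=False)

print("Mean true quality, top-n hires (status quo):            ",
      round(true_quality[small_scale_hires].mean(), 3))
print("Mean true quality, random n of the top S*n (experiment):",
      round(true_quality[scale_up_hires].mean(), 3))
print("Mean true quality of the full S*n workforce at scale:   ",
      round(true_quality[would_hire_at_scale].mean(), 3))
```

Because the n experimental hires are a uniform random sample of the S × n candidates the provider would hire at scale, their average true quality matches, in expectation, that of the full scale-(S × n) workforce. That is the sense in which a modest-scale experiment can deliver an unbiased estimate of average input quality at the larger scale (the last edge in the table below).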
Keywords: Scaleup; Social Programs; Randomized Controlled Trials; Input Quality; Cost-Benefit Analysis
JEL Codes: D24; I2; J2; L25; L38; M5
Causal edges extracted from the paper are listed below (in the original graph rendering, edges evidenced by causal inference methods are shown in orange and the rest in light blue). A short simulation sketch illustrating the prediction-noise edges follows the table.
| Cause | Effect |
|---|---|
| input quality (L15) | program outcomes (A21) |
| scale increases (R12) | average costs increase, if program quality is held constant (J30) |
| scale increases (R12) | program quality declines, if average costs are held fixed (I21) |
| inelastic supply of skilled labor (J24) | higher costs (J32) |
| inelastic supply of skilled labor (J24) | lower returns as programs expand (I26) |
| provider's ability to predict worker effectiveness (J24) | degree of decline in quality (L15) |
| noisier predictions of input quality (L15) | less pronounced decline in program quality (L15) |
| provider's rank-ordering of tutors by predicted quality (A14) | value-added of tutors does not decline (I21) |
| scale-up experiments (C90) | unbiased estimates of average input quality at larger scale (C51) |
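The prediction-noise edges above lend themselves to a small simulation. Below is a minimal Python sketch of that comparative static; the standard-normal quality model, the pool size, and the noise levels are illustrative assumptions, not the paper's data or calibration. With a perfectly informative prediction, hiring deeper into the applicant pool pulls in markedly weaker candidates; the noisier the prediction, the flatter the decline in average quality as scale grows.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative assumptions (not the paper's calibration): standard-normal true
# quality, a noisy hiring signal, and hiring from the top of the signal ranking.
n_applicants = 100_000
n = 100    # hires at the current scale
S = 10     # scale-up factor

def quality_at_scales(noise_sd):
    """Average true quality of the top-n and top-(S*n) applicants when the
    provider ranks candidates on a noisy prediction of their quality."""
    true_quality = rng.normal(size=n_applicants)
    signal = true_quality + rng.normal(scale=noise_sd, size=n_applicants)
    ranked = np.argsort(-signal)
    return true_quality[ranked[:n]].mean(), true_quality[ranked[: S * n]].mean()

for noise_sd in (0.0, 0.5, 2.0):
    q_small, q_large = quality_at_scales(noise_sd)
    print(f"prediction noise sd = {noise_sd:3.1f}: "
          f"quality at scale n = {q_small:.2f}, at scale S*n = {q_large:.2f}, "
          f"decline from scaling = {q_small - q_large:.2f}")
```

In this toy setup, the decline in average hire quality from expanding n to S × n shrinks as the prediction noise grows, which mirrors the paper's observation that noisier provider predictions of input quality make the scale-up problem less pronounced.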