Using Digitized Newspapers to Refine Historical Measures: The Case of the Boll Weevil

Working Paper: NBER ID: w29808

Authors: Andreas Ferrara; Joung Yeob Ha; Randall Walsh

Abstract: This paper shows how to remove attenuation bias in regression analyses due to measurement error in historical data for a given variable of interest by using a secondary measure which can be easily generated from digitized newspapers. We provide three methods for using this secondary variable to deal with non-classical measurement error in a binary treatment: set identification, bias reduction via sample restriction, and a parametric bias correction. We demonstrate the usefulness of our methods by replicating two recent studies on the effect of the boll weevil. Relative to the initial analysis, our results yield markedly larger coefficient estimates.

Keywords: digitized newspapers; historical data; measurement error; boll weevil; economic impact

JEL Codes: N01


Causal Claims Network Graph

Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.


Causal Claims

Cause → Effect
newspaper-derived measure (C80) → reduction in measurement error (C83)
mismeasured treatment variable (x1) (C20) → lower bound on true effect (C51)
newspaper-derived measure (x2) (C80) → upper bound on true effect (C51)
agreement sample (x1 and x2 align) (Y20) → substantial reduction in OLS bias (C51)
parametric bias correction (C51) → more accurate estimate of true parameter (C51)
newspaper data (Y10) → larger coefficient estimates (C51)
parametric bias correction (C51) → largest coefficient (C46)
previous studies (C92) → underestimated economic impact of boll weevil (N11)
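
The bounding and agreement-sample claims above can be illustrated with a small simulation. This is a minimal sketch, not the authors' estimators, which this summary does not spell out. It assumes a binary treatment observed through two noisy measures, x1 and x2, whose misclassification errors are symmetric and independent of each other. It also assumes that the upper bound comes from an instrumental-variable style ratio that uses x2 as an instrument for x1. The function names, the 15% error rate, and the true effect of 1.0 are all hypothetical.

```python
import numpy as np


def simulate(n=200_000, beta=1.0, err=0.15, seed=0):
    """Simulate an outcome and two independently misclassified treatment measures.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    x = rng.random(n) < 0.5                       # true binary treatment
    flip1 = rng.random(n) < err                   # misclassification in x1
    flip2 = rng.random(n) < err                   # independent misclassification in x2
    x1 = np.logical_xor(x, flip1).astype(float)   # e.g. the original historical measure
    x2 = np.logical_xor(x, flip2).astype(float)   # e.g. the newspaper-derived measure
    y = beta * x + rng.normal(size=n)             # outcome depends on the TRUE treatment
    return y, x1, x2


def ols_slope(y, x):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    return np.cov(y, x)[0, 1] / np.var(x, ddof=1)


def estimates(y, x1, x2):
    """Return the three illustrative estimates discussed in the lead-in."""
    agree = x1 == x2                              # observations where both measures align
    return {
        # Attenuated estimate using the mismeasured x1 on the full sample.
        "ols_full": ols_slope(y, x1),
        # Same regression restricted to the agreement sample.
        "ols_agree": ols_slope(y[agree], x1[agree]),
        # IV-style ratio with x2 as instrument for x1 (assumed upper bound).
        "iv": np.cov(y, x2)[0, 1] / np.cov(x1, x2)[0, 1],
    }


if __name__ == "__main__":
    y, x1, x2 = simulate()
    for name, value in estimates(y, x1, x2).items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the full-sample OLS estimate falls below the true effect, the IV-style ratio lies above it, and the agreement-sample estimate sits between the full-sample estimate and the truth. Misclassification of a binary variable is inherently non-classical, because the error can only be negative when the true value is 1 and positive when it is 0, which is why the IV-style ratio overshoots rather than recovering the true effect.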
