Imputing Missing Values in the US Census Bureau’s County Business Patterns

Working Paper: NBER ID: w26632

Authors: Fabian Eckert; Teresa C. Fort; Peter K. Schott; Natalie J. Yang

Abstract: The County Business Patterns data published by the US Census Bureau track employment by county and industry from 1946 to the present. Two features of the data limit their usefulness to researchers: (1) employment for the majority of county-industry cells is suppressed to protect confidentiality, and (2) industry classifications change over time. We address both issues. First, we develop a linear programming method that exploits the large set of adding-up constraints implicit in the hierarchical arrangement of the data to impute missing employment. Second, we provide concordances to map all data to a consistent set of industry codes. Finally, we construct a user-friendly, 1975 to 2016 county-level panel that classifies industries according to a consistent set of 2012 NAICS codes in all years.
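The role of the adding-up constraints can be illustrated with a small sketch. It is not the authors' linear program. It is a simpler interval-tightening pass over a single parent cell, assuming each suppressed child cell comes with a known lower and upper bound. The function name `tighten_bounds`, the cell labels, and all numbers are hypothetical and chosen only for illustration.

```python
def tighten_bounds(parent_total, known_children, suppressed_bounds):
    """Narrow (low, high) ranges of suppressed child cells using one adding-up constraint.

    parent_total: published employment of the parent cell
    known_children: published employment of the unsuppressed child cells
    suppressed_bounds: dict mapping a suppressed cell to its assumed (low, high) range
    """
    # Employment that the suppressed cells must jointly account for.
    residual = parent_total - sum(known_children)
    bounds = {name: list(b) for name, b in suppressed_bounds.items()}

    changed = True
    while changed:  # repeat until no range can be narrowed further
        changed = False
        for name, (low, high) in bounds.items():
            others_low = sum(b[0] for n, b in bounds.items() if n != name)
            others_high = sum(b[1] for n, b in bounds.items() if n != name)
            # This cell equals the residual minus the other suppressed cells.
            new_low = max(low, residual - others_high)
            new_high = min(high, residual - others_low)
            if new_low > new_high:
                raise ValueError(f"bounds for {name!r} are inconsistent with the parent total")
            if (new_low, new_high) != (low, high):
                bounds[name] = [new_low, new_high]
                changed = True
    return {name: tuple(b) for name, b in bounds.items()}


# Hypothetical parent cell with 100 employees: one child is published (40),
# two are suppressed and known only up to a range.
result = tighten_bounds(100, [40], {"A": (0, 19), "B": (20, 99)})
print(result)
```

In this toy case the constraint alone narrows cell B from 20-99 to 41-60, while cell A stays at 0-19. A linear program of the kind the abstract describes would impose all such constraints across the geographic and industry hierarchy at once, rather than one parent cell at a time.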

Keywords: County Business Patterns; Data Imputation; Linear Programming; Employment Data

JEL Codes: E24; F16; J21; L6


Causal Claims Network Graph

[Figure: causal claims network graph. Edges evidenced by causal inference methods are shown in orange; all other edges are shown in light blue.]


Causal Claims

Cause -> Effect
Imputation Method (C36) -> Imputed Employment Values (J68)
Hierarchical Constraints (D10) -> Imputed Employment Values (J68)
Imputed Employment Values (J68) -> True Employment Figures (J68)
Employment Counts of Industrial Children (J82) -> Employment Counts of Industrial Parents (J39)
Imputation Method (C36) -> Closer Approximation to True Employment Figures (J60)
