Working Paper: NBER ID: w25818
Authors: Joseph Staudt; Yifang Wei; Lisa Singh; Shawn D. Klimek; J. Bradford Jensen; Andrew L. Baer
Abstract: Between the 2007 and 2012 Economic Censuses (EC), the count of franchise-affiliated establishments declined by 9.8%. One reason for this decline was a reduction in resources that the Census Bureau was able to dedicate to the manual evaluation of survey responses in the franchise section of the EC. Extensive manual evaluation in 2007 resulted in many establishments, whose survey forms indicated they were not franchise-affiliated, being recoded as franchise-affiliated. No such evaluation could be undertaken in 2012. In this paper, we examine the potential of using external data harvested from the web in combination with machine learning methods to automate the process of evaluating responses to the franchise section of the 2017 EC. Our method allows us to quickly and accurately identify and recode establishments that have been mistakenly classified as not being franchise-affiliated, increasing the unweighted number of franchise-affiliated establishments in the 2017 EC by 22%-42%.
Keywords: franchising; economic census; machine learning; data automation
JEL Codes: C81; L8
| Cause | Effect |
|---|---|
| external data (O36) | identification of franchise status (L14) |
| machine learning approach (C45) | increase in count of franchise-affiliated establishments (L26) |
| identification of franchise status (L14) | correction of survey misclassifications (C83) |