Working Paper: NBER ID: w31646
Authors: Daniel Greenwald; Sabrina T. Howell; Cangyuan Li; Emmanuel Yimfor
Abstract: When race is not directly observed, regulators and analysts commonly predict it using algorithms based on last name and address. In small business lending—where regulators assess fair lending law compliance using the Bayesian Improved Surname Geocoding (BISG) algorithm—we document large prediction errors among Black Americans. The errors bias measured racial disparities in loan approval rates downward by 43%, with greater bias for traditional vs. fintech lenders. Regulation using self-identified race would increase lending to Black borrowers, but also shift lending toward affluent areas because errors correlate with socioeconomics. Overall, using race proxies in policymaking and research presents challenges.
Keywords: race prediction algorithms; fair lending; discrimination; small business lending; regulatory compliance
JEL Codes: C81; G21; G23; G28; J15
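The BISG proxy discussed in the abstract combines a surname-based prior with a geography-based update via Bayes' rule. A minimal sketch of that update follows; all probability tables here are hypothetical illustrations (the real algorithm uses Census surname frequencies and census-tract or block-group composition), and the function name `bisg_posterior` is an assumption for this example.

```python
# Hypothetical P(race | surname), as would come from a Census surname list.
P_RACE_GIVEN_SURNAME = {
    "WASHINGTON": {"Black": 0.87, "White": 0.05, "Other": 0.08},
    "SMITH":      {"Black": 0.23, "White": 0.70, "Other": 0.07},
}

# Hypothetical P(race | geography) for two census tracts, plus the
# overall population shares P(race) used to avoid double-counting.
P_RACE_GIVEN_GEO = {
    "tract_A": {"Black": 0.60, "White": 0.30, "Other": 0.10},
    "tract_B": {"Black": 0.05, "White": 0.85, "Other": 0.10},
}
P_RACE = {"Black": 0.13, "White": 0.72, "Other": 0.15}

def bisg_posterior(surname: str, tract: str) -> dict:
    """P(race | surname, geo) ∝ P(race | surname) * P(race | geo) / P(race)."""
    prior = P_RACE_GIVEN_SURNAME[surname.upper()]
    geo = P_RACE_GIVEN_GEO[tract]
    unnormalized = {r: prior[r] * geo[r] / P_RACE[r] for r in prior}
    total = sum(unnormalized.values())
    return {r: p / total for r, p in unnormalized.items()}

# A surname with a mostly-White prior, placed in a majority-Black tract,
# shifts the posterior toward "Black" -- the geography update at work.
posterior = bisg_posterior("Smith", "tract_A")
```

The paper's point is that when these posteriors misclassify individuals (errors that, per the abstract, correlate with socioeconomics), disparity estimates built on them are biased.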
The cause-effect edges extracted from the paper are listed below (in the original graph figure, edges evidenced by causal inference methods appear in orange; the rest in light blue). Parenthetical codes are JEL classifications.
| Cause | Effect |
|---|---|
| Prediction errors in race classification (C52) | Bias in measured racial disparities in loan approval rates (J15) |
| Use of BISG (L86) | Significant prediction errors for Black Americans (J15) |
| Use of BISG (L86) | Underestimation of racial disparities in loan approval rates (J15) |
| Regulators using BISG (L51) | Illusion of better compliance with fair lending laws (G28) |
| Switching to self-identified race data (J15) | Increase in lending to Black borrowers (G21) |
| Switching to self-identified race data (J15) | Shift in lending toward affluent areas (G21) |