Refining Public Policies with Machine Learning: The Case of Tax Auditing

Working Paper: NBER ID: w30777

Authors: Marco Battaglini; Luigi Guiso; Chiara Lacava; Douglas L. Miller; Eleonora Patacchini

Abstract: We study the extent to which ML techniques can be used to improve tax auditing efficiency using administrative data, without the need for randomized audits. Using Italy's population data on sole proprietorship tax returns, audits, and their outcomes, we develop a new approach to address the so-called selective labels problem: the fact that an ML algorithm must necessarily be trained on endogenously selected data. We document the existence of substantial margins for raising revenue from audits by improving the selection of taxpayers to audit with ML. Replacing the 10% least productive audits with an equal number of taxpayers selected by our trained algorithm raises detected tax evasion by as much as 38%, and evasion that is actually paid back by 29%.
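
The replacement exercise in the abstract reduces to simple arithmetic, which the sketch below illustrates on toy numbers. It is not the authors' estimator. The function name `replacement_gain`, its arguments, and all figures are hypothetical. The sketch also assumes that a predicted yield is available for each unaudited candidate, which sidesteps the selective labels problem the paper sets out to address.

```python
def replacement_gain(audited_yields, candidate_predictions, share=0.10):
    """Percent change in total detected evasion when the lowest-yield
    `share` of audits is swapped for the top-predicted unaudited candidates.

    Hypothetical helper for illustration only; predictions stand in for
    outcomes that are never observed for unaudited taxpayers.
    """
    k = int(len(audited_yields) * share)  # number of audits to replace
    if k > len(candidate_predictions):
        raise ValueError("not enough candidates to replace the dropped audits")
    baseline = sum(audited_yields)
    if baseline == 0:
        raise ValueError("baseline detected evasion must be positive")
    kept = sorted(audited_yields)[k:]                           # drop the k least productive audits
    added = sorted(candidate_predictions, reverse=True)[:k]     # take the k highest predictions
    counterfactual = sum(kept) + sum(added)
    return 100.0 * (counterfactual - baseline) / baseline


# Toy data: ten audits with realized yields, three unaudited candidates.
audits = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
candidates = [45, 5, 15]
print(replacement_gain(audits, candidates))
```

With these made-up inputs, the single least productive audit (yield 0) is swapped for the candidate predicted to yield 45, so the computed gain reflects only that one substitution.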

Keywords: Tax enforcement; Tax evasion; Policy prediction problems

JEL Codes: H2; H20; H26

Causal Claims Network Graph

Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.

Causal Claims

Cause	Effect
Machine Learning Techniques (C45)	Detected Tax Evasion (H26)
Machine Learning Techniques (C45)	Evasion Paid Back (H26)
Poorly Performing Audits (M42)	Replacement with ML Selected Audits (C52)
Replacement with ML Selected Audits (C52)	Detected Tax Evasion (H26)
Replacement with ML Selected Audits (C52)	Evasion Paid Back (H26)
