Refining Public Policies with Machine Learning: The Case of Tax Auditing

Working Paper: CEPR ID: DP17796

Authors: Marco Battaglini; Luigi Guiso; Chiara Lacava; Douglas L. Miller; Eleonora Patacchini

Abstract: We study how ML techniques can be used to improve tax auditing efficiency using administrative data without the need for randomized audits. Using Italy’s population data on sole proprietorship tax returns and audits, our new approach addresses the challenge that predictions must be trained on human-selected data. There are substantial margins for raising revenue from audits by improving the selection of taxpayers to audit with ML. Replacing the 10% least promising audits with an equal number selected by our algorithm raises detected tax evasion by as much as 38%, and evasion that is actually paid back by 29%.

Keywords: Tax Enforcement; Tax Evasion; Policy Prediction Problems

JEL Codes: C55; H26

Causal Claims Network Graph

Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.

Causal Claims

Cause	Effect
ML algorithm (C45)	detected tax evasion (H26)
ML algorithm (C45)	recovered tax evasion (H26)
audit selection quality (M42)	revenue recovery (H27)
random selection (C90)	detected tax evasion (H26)
random selection (C90)	recovered tax evasion (H26)
audit selection method (M42)	audit yield (M42)
audit selection method (M42)	recovered evasion yield (H26)
