Working Paper: NBER ID: w25980
Authors: Samuel Bazzi; Robert A. Blair; Christopher Blattman; Oeindrila Dube; Matthew Gudgeon; Richard Merton Peck
Abstract: Policymakers can take actions to prevent local conflict before it begins, if such violence can be accurately predicted. We examine the two countries with the richest available sub-national data: Colombia and Indonesia. We assemble two decades of fine-grained violence data by type, alongside hundreds of annual risk factors. We predict violence one year ahead with a range of machine learning techniques. Models reliably identify persistent, high-violence hot spots. Violence is not simply autoregressive, as detailed histories of disaggregated violence perform best. Rich socio-economic data also substitute well for these histories. Even with such unusually rich data, however, the models poorly predict new outbreaks or escalations of violence. "Best case" scenarios with panel data fall short of workable early-warning systems.
Keywords: conflict prediction; machine learning; Colombia; Indonesia
JEL Codes: C52; C53; D74
Cause | Effect |
---|---|
historical violence incidents (N41) | future violence (D74) |
socioeconomic conditions (P36) | future violence (D74) |
terrain ruggedness in Colombia (R14) | violence (D74) |
religious/ethnic diversity in Indonesia (Z12) | violence (D74) |
detailed histories of violent incidents (Y50) | accurate predictions of violence (D74) |
severity metrics (deaths and property damage) (H84) | accurate predictions of violence (D74) |
models (C52) | identification of high-violence hot spots (R23) |
models (C52) | prediction of timing of violence (C41) |
sophisticated methods and rich dataset (C55) | prediction of escalation of violence (D74) |
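
The abstract contrasts identifying persistent hot spots with predicting new outbreaks one year ahead. The following is a minimal sketch of how that distinction could be scored, not the authors' method: it pairs each unit-year with the following year's outcome and reports recall separately for persistent cases and onset cases. The function names, the binary violence indicator, the naive "next year equals this year" forecast, and the toy panel are all assumptions made for illustration.

```python
def one_year_ahead_pairs(panel):
    """Pair each unit-year's violence indicator with the next year's outcome.

    panel: dict mapping (unit, year) -> 1 if any violence was recorded, else 0.
    Returns a list of (unit, year, violence_now, violence_next) tuples.
    """
    pairs = []
    for (unit, year), now in sorted(panel.items()):
        nxt = panel.get((unit, year + 1))
        if nxt is not None:  # skip the final observed year of each unit
            pairs.append((unit, year, now, nxt))
    return pairs


def recall_by_case(pairs, predict):
    """Share of next-year violent outcomes that `predict` flags, split by case.

    "persistent": the unit is violent now and violent next year.
    "onset": the unit is peaceful now and violent next year.
    """
    hits = {"persistent": 0, "onset": 0}
    totals = {"persistent": 0, "onset": 0}
    for _, _, now, nxt in pairs:
        if nxt != 1:
            continue  # recall only considers cases where violence occurs
        case = "persistent" if now == 1 else "onset"
        totals[case] += 1
        hits[case] += int(predict(now) == 1)
    return {c: hits[c] / totals[c] for c in totals if totals[c] > 0}


# Hypothetical toy panel (made-up values, not data from the paper).
toy_panel = {
    ("A", 2000): 1, ("A", 2001): 1, ("A", 2002): 1,
    ("B", 2000): 0, ("B", 2001): 1, ("B", 2002): 0,
    ("C", 2000): 0, ("C", 2001): 0, ("C", 2002): 0,
}

pairs = one_year_ahead_pairs(toy_panel)
# Naive persistence forecast: next year looks like this year.
result = recall_by_case(pairs, predict=lambda now: now)
print(result)
```

By construction, a persistence forecast flags every persistent case and no onset case, so scoring the two cases separately keeps strong hot-spot performance from masking weak performance on new outbreaks. Any other forecasting rule can be passed as `predict` and scored the same way.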