A Generalizable and Accessible Approach to Machine Learning with Global Satellite Imagery

Working Paper: NBER ID: w28045

Authors: Esther Rolf; Jonathan Proctor; Tamma Carleton; Ian Bolliger; Vaishaal Shankar; Miyabi Ishihara; Benjamin Recht; Solomon Hsiang

Abstract: Combining satellite imagery with machine learning (SIML) has the potential to address global challenges by remotely estimating socioeconomic and environmental conditions in data-poor regions, yet the resource requirements of SIML limit its accessibility and use. We show that a single encoding of satellite imagery can generalize across diverse prediction tasks (e.g. forest cover, house price, road length). Our method achieves accuracy competitive with deep neural networks at orders of magnitude lower computational cost, scales globally, delivers label super-resolution predictions, and facilitates characterizations of uncertainty. Since image encodings are shared across tasks, they can be centrally computed and distributed to unlimited researchers, who need only fit a linear regression to their own ground truth data in order to achieve state-of-the-art SIML performance.
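Illustration: the workflow described in the abstract, where one shared set of image encodings serves several prediction tasks through a linear fit, might look roughly like the sketch below. Everything in it is an assumption made for illustration. The feature matrix X is synthetic random data standing in for precomputed encodings, the two label columns are synthetic tasks, and the helper names fit_ridge and r_squared are hypothetical. A ridge penalty is used as one common form of regularized linear regression; the abstract itself says only "linear regression".

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for precomputed image encodings:
# one row per image tile, one column per feature (synthetic, not real imagery).
n_tiles, n_features = 500, 64
X = rng.normal(size=(n_tiles, n_features))

# Two synthetic "tasks" that depend on the same shared features.
true_w = rng.normal(size=(n_features, 2))
Y = X @ true_w + 0.1 * rng.normal(size=(n_tiles, 2))


def fit_ridge(X, Y, lam=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean = X.mean(axis=0)
    y_mean = Y.mean(axis=0)
    Xc = X - x_mean
    Yc = Y - y_mean
    # Solve (Xc'Xc + lam * I) W = Xc'Yc for all tasks at once.
    A = Xc.T @ Xc + lam * np.eye(X.shape[1])
    W = np.linalg.solve(A, Xc.T @ Yc)
    b = y_mean - x_mean @ W
    return W, b


def r_squared(y_true, y_pred):
    """Coefficient of determination, computed separately per task column."""
    ss_res = ((y_true - y_pred) ** 2).sum(axis=0)
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


# Fit on the first 400 tiles, evaluate on the remaining 100.
train, test = slice(0, 400), slice(400, None)
W, b = fit_ridge(X[train], Y[train])
r2 = r_squared(Y[test], X[test] @ W + b)
print("held-out R^2 per task:", r2)
```

The point of the sketch is that X is computed once and reused: adding another task only means adding another label column and solving the same linear system, which is why the per-task cost stays small.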

Keywords: machine learning; satellite imagery; global challenges; socioeconomic conditions; environmental conditions

JEL Codes: C02; C8; O13; O18; Q5; R1

Causal Claims Network Graph

Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.

Causal Claims

Cause	Effect
encoding of satellite imagery (Y90)	prediction tasks (C53)
encoding of satellite imagery (Y90)	improved predictions of ground conditions (C53)
SIML method (C59)	accurate predictions of socioeconomic and environmental conditions (R11)
shared features (Y80)	linear regression model to ground truth data (C51)
SIML method (C59)	reduced resource-intensive requirements (Q32)
encoding of satellite imagery (Y90)	accuracy competitive with deep neural networks (C45)
R-squared values (C29)	effectiveness of SIML method (C52)
