Classification, Detection and Consequences of Data Error: Evidence from the Human Development Index

Working Paper: NBER ID: w16572

Authors: Hendrik Wolff; Howard Chong; Maximilian Auffhammer

Abstract: We measure and examine data error in the health, education, and income statistics used to construct the Human Development Index. We identify three sources of data error: (i) data updating, (ii) formula revisions, and (iii) the thresholds used to classify a country's development status. We propose a simple statistical framework to calculate country-specific measures of data uncertainty and investigate how data error biases rank assignments. We find that up to 34% of countries are misclassified and, by replicating prior studies, show that key estimated parameters vary by up to 100% due to data error.
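
To make the threshold mechanism in the abstract concrete, the following is a minimal sketch of how small errors in an index score can flip a country's category when classification depends on fixed cutoffs. It is not the authors' statistical framework. The cutoff values (0.5 and 0.8), the function names, and all scores are assumptions chosen purely for illustration.

```python
def classify(hdi, low_cut=0.5, high_cut=0.8):
    """Assign a development category from an index score.

    The cutoffs are illustrative assumptions, not values taken from the paper.
    """
    if hdi < low_cut:
        return "low"
    if hdi < high_cut:
        return "medium"
    return "high"


def misclassification_rate(true_scores, observed_scores):
    """Share of countries whose category differs between the two score lists."""
    if len(true_scores) != len(observed_scores):
        raise ValueError("score lists must have the same length")
    flips = sum(
        classify(t) != classify(o)
        for t, o in zip(true_scores, observed_scores)
    )
    return flips / len(true_scores)


# Hypothetical scores for five countries: "true" values and values
# observed with a small measurement error (invented for this example).
true_scores = [0.45, 0.49, 0.62, 0.79, 0.85]
observed_scores = [0.46, 0.51, 0.60, 0.81, 0.84]

rate = misclassification_rate(true_scores, observed_scores)
print(rate)  # 0.4: two of the five countries cross a cutoff
```

In this toy example every error is at most 0.02 in absolute value, yet the two countries that sit close to a cutoff change category, which is the kind of sensitivity the abstract attributes to classification thresholds.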

Keywords: Human Development Index; data error; misclassification; statistics; development

JEL Codes: C43


Causal Claims Network Graph

Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.


Causal Claims

Cause -> Effect
data error (Y10) -> misclassification (J79)
data error (Y10) -> HDI scores (O15)
HDI scores (O15) -> ordinal rank assignments (C69)
data revision (C59) -> HDI scores (O15)
formula changes (C29) -> HDI scores (O15)
arbitrary cutoff values (C24) -> HDI scores (O15)
