Working Paper: NBER ID: w24855
Authors: M. Scott Taylor
Abstract: All empirical researchers know that having more sources of variation in a dataset is valuable. What is not known is how valuable, and if the marginal value of adding another source of variation diminishes or increases. This note provides explicit answers to these questions. It defines "valuable" as the number of independent questions the data can potentially answer, and provides a surprisingly simple and useful rule that tells the researcher not only when they have "emptied the tank" of their data's valuable implications, but also the marginal value of further data collection. An illustration using home heating costs is provided.
Keywords: data variation; empirical research; marginal value; independent questions
JEL Codes: A20; Q40; Q41
| Cause | Effect |
|---|---|
| sources of variation (C90) | independent questions (C12) |
| additional sources of variation (C39) | independent questions (C12) |
| m = 2^n - 1 (C30) | independent questions (C12) |
| marginal value of adding another source of variation (C59) | independent questions (C12) |
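
The rule m = 2^n - 1 listed above can be made concrete with a few lines of arithmetic. The sketch below assumes that n counts the sources of variation in a dataset and m counts the independent questions the data can potentially answer. That reading is inferred from the abstract and the table, not stated explicitly in this record. The function names are hypothetical and chosen for illustration only.

```python
def independent_questions(n: int) -> int:
    """Return m = 2^n - 1, read here (as an assumption) as the number of
    independent questions answerable with n sources of variation."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return 2**n - 1


def marginal_value(n: int) -> int:
    """Extra questions gained by adding one more source of variation,
    i.e. moving from n to n + 1 sources."""
    return independent_questions(n + 1) - independent_questions(n)


# Tabulate the rule for a handful of small n.
for n in range(1, 6):
    print(f"n={n}  m={independent_questions(n)}  gain from one more source={marginal_value(n)}")
```

Under this reading, the gain from one more source is (2^(n+1) - 1) - (2^n - 1) = 2^n, so the marginal value increases with n rather than diminishing.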