Data-Snooping Biases in Tests of Financial Asset Pricing Models

Working Paper: NBER ID: w3001

Authors: Andrew W. Lo; A. Craig MacKinlay

Abstract: We investigate the extent to which tests of financial asset pricing models may be biased by using properties of the data to construct the test statistics. Specifically, we focus on tests using returns to portfolios of common stock where portfolios are constructed by sorting on some empirically motivated characteristic of the securities such as market value of equity. We present both analytical calculations and Monte Carlo simulations that show the effects of this type of data-snooping to be substantial. Even when the sorting characteristic is only marginally correlated with individual security statistics, 5 percent tests based on sorted portfolio returns may reject with probability one under the null hypothesis. This bias is shown to worsen as the number of securities increases given a fixed number of portfolios, and as the number of portfolios decreases given a fixed number of securities. We provide an empirical example that illustrates the practical relevance of these biases.
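The mechanism described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design; it is a minimal illustration under assumptions chosen here: each security's estimated pricing error is an independent standard normal draw (so the null of zero true pricing error holds), the sorting characteristic has correlation `rho` with that estimate, and portfolios are equal-sized and equal-weighted. The function name `rejection_rate` and all parameter names are hypothetical. The 5 percent critical value is approximated by simulating chi-square draws rather than taken from a table.

```python
import numpy as np


def rejection_rate(n_securities, n_portfolios, rho, sort, n_trials=2000, seed=0):
    """Share of trials in which a nominal 5% chi-square test rejects a true null.

    Illustrative assumptions (not taken from the paper):
    - estimated pricing errors are i.i.d. N(0, 1) across securities;
    - the characteristic has correlation `rho` with those estimates;
    - portfolios are equal-sized, equal-weighted groups.
    """
    rng = np.random.default_rng(seed)
    n_per = n_securities // n_portfolios   # securities per portfolio
    n_used = n_per * n_portfolios          # drop any remainder

    # Approximate 5% critical value of a chi-square with n_portfolios d.o.f.
    crit = np.quantile(rng.chisquare(n_portfolios, size=200_000), 0.95)

    rejections = 0
    for _ in range(n_trials):
        alpha_hat = rng.standard_normal(n_used)   # pure estimation error
        noise = rng.standard_normal(n_used)
        # Characteristic correlated with the estimation error.
        x = rho * alpha_hat + np.sqrt(1.0 - rho**2) * noise

        # Data-snooped grouping (sort on x) versus random grouping.
        order = np.argsort(x) if sort else rng.permutation(n_used)
        port_alpha = alpha_hat[order].reshape(n_portfolios, n_per).mean(axis=1)

        # Under random grouping each portfolio mean is N(0, 1/n_per),
        # so this statistic is chi-square with n_portfolios d.o.f.
        stat = n_per * np.sum(port_alpha**2)
        rejections += int(stat > crit)

    return rejections / n_trials


if __name__ == "__main__":
    for n in (200, 1000):
        random_rate = rejection_rate(n, 10, rho=0.2, sort=False)
        sorted_rate = rejection_rate(n, 10, rho=0.2, sort=True)
        print(f"N={n}: random grouping {random_rate:.3f}, sorted {sorted_rate:.3f}")
```

Under these assumptions, random grouping should reject at roughly the nominal 5 percent rate, while sorting on the correlated characteristic should reject far more often, and increasingly so as the number of securities grows with the number of portfolios held fixed.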

Keywords: datasnooping; financial asset pricing models; CAPM; APT

JEL Codes: G12; C12

Causal Claims Network Graph

Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.

Causal Claims

Cause	Effect
biases in test statistics (C46)	distorted inferences in asset pricing models (G19)
data snooping (C52)	rejection of null hypothesis (C12)
number of securities (G12)	bias in rejection of null hypothesis (C12)
sorting based on empirical characteristics (C55)	biases in test statistics (C46)
