The Virtue of Complexity in Return Prediction

Working Paper: NBER ID: w30217

Authors: Bryan T. Kelly; Semyon Malamud; Kangying Zhou

Abstract: Much of the extant literature predicts market returns with “simple” models that use only a few parameters. Contrary to conventional wisdom, we theoretically prove that simple models severely understate return predictability compared to “complex” models in which the number of parameters exceeds the number of observations. We empirically document the virtue of complexity in US equity market return prediction. Our findings establish the rationale for modeling expected returns through machine learning.
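
To make the "complex" regime concrete, here is a minimal sketch of fitting a linear forecast when the number of parameters p exceeds the number of observations T. It assumes ridge regression as the estimator and uses synthetic random data. The function name `ridge_fit`, the sizes, and the shrinkage value are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def ridge_fit(X, y, shrinkage):
    """Ridge coefficients via the dual form, which is convenient when p > T.

    Computes beta = X' (X X' + shrinkage * I_T)^{-1} y, so only a T x T
    linear system is solved even though beta has p entries.
    """
    T = X.shape[0]
    gram = X @ X.T + shrinkage * np.eye(T)
    return X.T @ np.linalg.solve(gram, y)


rng = np.random.default_rng(0)
T, p = 60, 600                    # hypothetical sizes: p is ten times T
X = rng.standard_normal((T, p))   # synthetic predictors, not real signals
y = rng.standard_normal(T)        # synthetic target, not real return data

beta = ridge_fit(X, y, shrinkage=1.0)

# One-step forecast for a new (synthetic) predictor vector.
x_new = rng.standard_normal(p)
forecast = x_new @ beta
```

Without shrinkage, ordinary least squares has no unique solution when p > T. The positive shrinkage term keeps the system well-posed, which is why this sketch solves the T x T dual problem rather than the p x p normal equations.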

Keywords: Return Prediction; Machine Learning; Portfolio Construction

JEL Codes: C1; C45; G1

Causal Claims Network Graph

Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.

Causal Claims

Cause	Effect
model complexity (C52)	expected out-of-sample forecast accuracy (C53)
model complexity (C52)	portfolio performance (G11)
increased model complexity (C52)	improved performance (D29)
high-complexity regime (p > t) (P10)	expected out-of-sample forecast accuracy (C53)
high-complexity regime (p > t) (P10)	portfolio performance (G11)
model complexity (C52)	predictive outcomes (C52)
high-dimensional models (C52)	outperform simpler models (C52)
nontrivial shrinkage (C24)	enhance Sharpe ratio (G11)
