The Virtue of Complexity in Return Prediction

Working Paper: CEPR ID: DP17194

Authors: Semyon Malamud; Bryan Kelly; Kangying Zhou

Abstract: We theoretically characterize the behavior of return prediction models in the high complexity regime, i.e., when the number of parameters exceeds the number of observations. Contrary to conventional wisdom in finance, return prediction R² and optimal portfolio Sharpe ratio generally increase with model parameterization, even when minimal regularization is used. Empirically, we document this "virtue of complexity" in US equity market prediction. High complexity models deliver economically large and statistically significant out-of-sample portfolio gains relative to simpler models, due in large part to their remarkable ability to predict recessions.
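
The "high complexity regime" in the abstract can be made concrete with a small simulation. The following is a minimal sketch, not the paper's method: the synthetic data, the ridge estimator with a small penalty z standing in for "minimal regularization", the zero-forecast benchmark in the R² formula, and the helper names ridge_fit, oos_r2, and complexity_sweep are all assumptions introduced for illustration. It fits the same data with parameter counts P below and above the number of training observations T and reports the out-of-sample R² at each complexity ratio P/T.

```python
import numpy as np

def ridge_fit(S, r, z):
    """Ridge coefficients via the dual (T x T) system, cheap when P > T.

    Uses the identity (S'S + zT I)^{-1} S' r = S' (S S' + zT I)^{-1} r.
    Scaling the penalty by T is an illustrative choice.
    """
    T = S.shape[0]
    return S.T @ np.linalg.solve(S @ S.T + z * T * np.eye(T), r)

def oos_r2(r_true, r_pred):
    """Out-of-sample R^2 measured against a zero forecast (assumed benchmark)."""
    return 1.0 - np.sum((r_true - r_pred) ** 2) / np.sum(r_true ** 2)

def complexity_sweep(T=60, T_test=500, P_max=600, z=1e-3, seed=0):
    """Return {P/T: out-of-sample R^2} for models using the first P signals."""
    rng = np.random.default_rng(seed)
    # Synthetic data (assumption): returns load weakly on all P_max signals.
    S_all = rng.standard_normal((T + T_test, P_max))
    beta_true = rng.standard_normal(P_max) / np.sqrt(P_max)
    r_all = 0.3 * S_all @ beta_true + rng.standard_normal(T + T_test)
    results = {}
    for P in (6, 30, 120, 600):  # P < T and P > T cases
        S_train, S_test = S_all[:T, :P], S_all[T:, :P]
        beta_hat = ridge_fit(S_train, r_all[:T], z)
        results[P / T] = oos_r2(r_all[T:], S_test @ beta_hat)
    return results

if __name__ == "__main__":
    for c, r2 in complexity_sweep().items():
        print(f"complexity P/T = {c:5.1f}   out-of-sample R2 = {r2: .3f}")
```

The printed values depend entirely on the simulated data and the chosen penalty, so they should not be read as a reproduction of the paper's results. The dual form is used only because it keeps the linear system at size T x T when P is much larger than T.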

Keywords: Portfolio Choice; Machine Learning; Random Matrix Theory; Benign Overfit; Overparameterization

JEL Codes: C3; C58; C61; G11; G12; G14

Causal Claims Network Graph

Edges supported by causal inference methods are shown in orange; all other edges are shown in light blue.

Causal Claims

Cause	Effect
Increasing model complexity (C52)	Improved out-of-sample predictive performance (C52)
Increasing model complexity (C52)	Increased expected out-of-sample R² (C51)
Increasing model complexity (C52)	Increased Sharpe ratio (G40)
High complexity models (C59)	Generate substantial economic profits (D33)
High complexity models (C59)	Predict recessions accurately (E32)
High complexity models (C59)	Achieve significant economic gains (O57)
High complexity models (C59)	Low or negative predictive R² (C29)
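
The table lists both an increased Sharpe ratio and a low or negative predictive R² as effects of high complexity. One way the two can coexist is sketched below under stated assumptions: the simulated returns, the deliberately oversized forecast, and the timing rule (position proportional to the forecast) are illustrative choices, not taken from the paper. A forecast that gets the direction right but the magnitude badly wrong is penalized by R², which measures squared forecast error, while the strategy's Sharpe ratio is unaffected by the forecast's scale.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Assumed data-generating process: returns depend weakly on one signal.
signal = rng.standard_normal(n)
returns = 0.1 * signal + rng.standard_normal(n)

# A forecast with the correct sign but a far too large magnitude.
forecast = 5.0 * signal

# Predictive R^2 against a zero-forecast benchmark (assumed convention).
r2 = 1.0 - np.sum((returns - forecast) ** 2) / np.sum(returns ** 2)

# Timing strategy: hold a position proportional to the forecast.
strategy = forecast * returns
sharpe = strategy.mean() / strategy.std()

print(f"predictive R2 = {r2:.2f}, per-period Sharpe ratio = {sharpe:.3f}")
```

Rescaling the forecast by any positive constant changes the R² but leaves the Sharpe ratio unchanged, which is why the two measures can disagree in sign.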
