Working Paper: CEPR ID: DP7197
Authors: Vladimir Kuzin; Massimiliano Marcellino; Christian Schumacher
Abstract: This paper discusses pooling versus model selection for now- and forecasting in the presence of model uncertainty with large, unbalanced datasets. Empirically, unbalanced data is pervasive in economics and typically due to different sampling frequencies and publication delays. Two model classes suited to this context are factor models based on large datasets and mixed-data sampling (MIDAS) regressions with few predictors. The specification of these models requires several choices related to, among others, the factor estimation method, the number of factors, the lag length, and indicator selection. Thus, there are many sources of mis-specification when selecting a particular model, and an alternative is to pool over a large set of models with different specifications. We evaluate the relative performance of pooling and model selection for now- and forecasting quarterly German GDP, a key macroeconomic indicator for the largest country in the euro area, with a large set of about one hundred monthly indicators. Our empirical findings provide strong support for pooling over many specifications rather than selecting a specific model.
Keywords: Factor Models; Forecast Combination; Forecast Pooling; MIDAS; Mixed-Frequency Data; Model Selection; Nowcasting
JEL Codes: C53; E37
Cause | Effect |
---|---|
Pooling the entire set of MIDAS and factor models (C51) | forecast performance (G17) |
Pooling (C83) | model misspecification risks (C52) |
model selection based on information criteria (C52) | forecasting performance (C53) |
method of model selection (C52) | forecast accuracy (C53) |
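
The contrast between pooling and model selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `pool_forecasts` and `select_model`, the use of past mean squared error as the selection rule (the paper refers to information criteria), and all numbers are assumptions made up for illustration only.

```python
from statistics import mean, median


def pool_forecasts(forecasts, method="mean"):
    """Combine forecasts from several models into a single number.

    Hypothetical helper: equal-weight mean or median pooling.
    """
    if not forecasts:
        raise ValueError("need at least one forecast")
    if method == "mean":
        return mean(forecasts)
    if method == "median":
        return median(forecasts)
    raise ValueError(f"unknown method: {method}")


def select_model(past_errors):
    """Return the index of the model with the lowest past mean squared error.

    Hypothetical stand-in for a selection rule; the paper itself
    refers to selection based on information criteria.
    """
    mse = [mean(e * e for e in errors) for errors in past_errors]
    return min(range(len(mse)), key=mse.__getitem__)


# Toy numbers, invented for illustration only (not results from the paper).
past_errors = [
    [0.4, -0.3, 0.5],   # past forecast errors of model 0
    [0.1, -0.2, 0.1],   # past forecast errors of model 1
    [-0.6, 0.2, 0.3],   # past forecast errors of model 2
]
current_forecasts = [0.8, 0.2, 0.5]  # each model's current GDP growth forecast

# Model selection: commit to the single model that did best in the past.
best = select_model(past_errors)
selected = current_forecasts[best]

# Pooling: average over all specifications instead of picking one.
pooled_mean = pool_forecasts(current_forecasts)
pooled_median = pool_forecasts(current_forecasts, method="median")
```

Selection stakes the forecast on one specification, so any misspecification in that model carries through in full, whereas pooling spreads that risk across all candidate models.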