Few “discoveries” in finance and economics stand the test of time, and fewer still in the realm of forecasting. One of the rare exceptions is the seminal research by Bates and Granger (“The Combination of Forecasts”), which celebrates its 51st anniversary this year.
The paper’s central insight: combining forecasts from multiple methodologies tends to deliver better projections compared with one model’s estimate. Many researchers, along with practicing economists and investment analysts, have extended and retested the Bates and Granger study many times, in various applications, over the decades. It’s no trivial point that the paper’s main finding has weathered the years rather well. No wonder, then, that the research (and its ever-expanding family of descendants) is widely recognized as a robust foundation for improving forecasts.
A recent working paper adds to this research lineage, finding that forecasts tend to improve when they draw on multiple estimates rather than a lone projection. The authors (including economist Allan Timmermann at University of California, San Diego) explore the question: “Do Any Economists Have Superior Forecasting Skills?” Drawing on a “large set of Bloomberg survey forecasts of U.S. economic data,” the paper concludes “that there is very little evidence that any individual forecasters can beat a simple equal-weighted average of peer forecasts.”
The result is a hardy perennial in the literature that’s followed in the wake of the Bates and Granger study. As a 2015 article in the International Journal of Forecasting noted, “Since the seminal work of forecast combination by Bates & Granger (1969), thousands of research papers have been published on this topic with various combining schemes.”
The common theme: If you must forecast, don’t bet the farm on one model. One researcher advises: “When feasible, use five or more methods,” adding that “an equal-weights rule offers a reasonable starting point, and a trimmed mean is desirable if you combine forecasts resulting from five or more methods.”
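As a rough illustration of the two combining rules quoted above, here is a minimal Python sketch. The function names and the five forecast values are hypothetical, chosen only to show how a trimmed mean blunts the effect of one outlying forecast; they do not come from any of the cited studies.

```python
from statistics import mean


def equal_weight(forecasts):
    """Simple average of the individual forecasts."""
    return mean(forecasts)


def trimmed_mean(forecasts, trim=1):
    """Average after dropping the `trim` lowest and `trim` highest forecasts."""
    if len(forecasts) <= 2 * trim:
        raise ValueError("not enough forecasts to trim")
    kept = sorted(forecasts)[trim:len(forecasts) - trim]
    return mean(kept)


# Hypothetical forecasts (in percent) from five methods; the last is an outlier.
forecasts = [1.8, 2.0, 2.1, 2.3, 4.8]

ew = equal_weight(forecasts)          # pulled upward by the 4.8 outlier
tm = trimmed_mean(forecasts, trim=1)  # ignores 1.8 and 4.8

print(f"equal weight: {ew:.2f}, trimmed mean: {tm:.2f}")
```

With only a handful of forecasts, trimming one value from each end discards a large share of the information, which is presumably why the advice reserves the trimmed mean for cases with five or more methods.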
Why are ensemble forecasting methods more reliable than individual forecasts? At a basic level, it’s just common sense a la don’t put all your prediction eggs in one basket. No model is perfect and so hedging your forecasting bets by diversifying across methodologies tends to deliver superior results over time.
That’s not to say that combined forecasts are always superior. In fact, it’s reasonable to assume that there’s usually a better model for any given time period. The problem is that the best model tends to change through time and so beating, say, the average forecast via multiple models is a high bar.
The one constant is that calculating an estimate from several models, each bringing a different set of biases to the forecasting task, tends to be competitive with, if not superior to, selecting one model in advance.
As a toy example to illustrate the concept, let’s predict the rolling one-year percentage change in U.S. private payrolls with four basic econometric models: vector autoregression (var), autoregressive integrated moving average (aa), linear regression (lr) and a neural network (nn). We’ll also compute an equal-weight model (ew) of the four models’ forecasts. For a baseline, we’ll also generate naïve forecasts — using the last data point to predict the next-period result. In all cases, a one-step-ahead window is used.
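To make the naïve baseline and the equal-weight model concrete, here is a minimal Python sketch. It does not fit the four models; the per-model forecasts are hypothetical placeholder numbers standing in for whatever var, aa, lr and nn would produce, and the function names are my own.

```python
def naive_forecasts(series):
    """One-step-ahead naive forecasts: the prediction for period t is the value at t-1.

    Returns forecasts for periods 1..n-1 of `series` (the first period has no forecast).
    """
    return list(series[:-1])


def equal_weight_combo(model_forecasts):
    """Average the forecasts of all models, period by period."""
    columns = zip(*model_forecasts.values())
    return [sum(col) / len(col) for col in columns]


# Hypothetical one-year percentage changes in payrolls (not actual data).
actual = [1.0, 1.2, 1.1, 0.9]

# Hypothetical one-step-ahead forecasts for the last three periods.
model_forecasts = {
    "var": [1.1, 1.2, 1.0],
    "aa": [1.2, 1.1, 1.0],
    "lr": [1.0, 1.3, 0.9],
    "nn": [1.3, 1.0, 0.7],
}

naive = naive_forecasts(actual)           # forecasts for the last three periods
ew = equal_weight_combo(model_forecasts)  # one combined forecast per period

print("naive:", naive)
print("ew:", [round(x, 2) for x in ew])
```

The equal-weight model needs no estimation of its own: it only requires that every model produce a forecast for the same period.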
To get into the weeds a bit, the var and linear regression models employ multivariate frameworks—in this case I’m using payrolls in concert with industrial production, personal income, and personal consumption expenditures to generate forecasts. The other two models are univariate, i.e., using payrolls alone for computing predictions. Keep in mind, too, that for this test the selected parameters for each model are more or less default choices, which implies that a superior set-up is possible. But that’s a task for another day.
The models are trained on the rolling one-year percentage changes for private payrolls (using revised data) for the period April 2002 through March 2008. The out-of-sample forecasts begin in April 2008 and run through August 2012. To compare results, we’ll use root mean square error (RMSE), a standard measure of the average size of forecast errors.
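For readers who want to see the two calculations involved, here is a minimal Python sketch of the rolling one-year percentage change and of RMSE. It assumes monthly data (so one year is 12 observations); the function names and the sample numbers are hypothetical.

```python
import math


def yoy_pct_change(levels, periods=12):
    """Rolling percentage change versus the value `periods` observations earlier."""
    return [
        100.0 * (levels[i] / levels[i - periods] - 1.0)
        for i in range(periods, len(levels))
    ]


def rmse(actual, predicted):
    """Root mean square error: square root of the average squared forecast error."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical monthly levels: flat for a year, then 10% higher.
levels = [100.0] * 12 + [110.0]
changes = yoy_pct_change(levels)

# Hypothetical actual values and forecasts.
error = rmse([1.0, 2.0, 3.0], [1.5, 2.0, 2.5])

print(changes, round(error, 4))
```

Because the errors are squared before averaging, RMSE penalizes a few large misses more heavily than many small ones.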
The main question for this test: Is an equally weighted combination of all four models superior to any one model? As the chart below shows, the best model is the autoregressive integrated moving average (aa), which has the smallest error. (A perfect forecast would score a zero RMSE.) Note, however, that the equal-weighted model (ew) is nearly as good as aa. It’s also clear that ew’s error is significantly smaller than the naïve estimate, which is the worst model in this case.
No one should assume that combining forecasts automatically produces better forecasts. But just as a diversified investment portfolio offers a degree of protection from the higher uncertainty that typically relates to any one asset or asset class, there’s strength in numbers when using multiple models in the precarious business of developing perspective about the future.
There are limits to error reduction in forecasting via ensemble modeling methods. Note, too, that results very much depend on how wisely the models are selected. The best mix uses models that are complementary in terms of biases. Finding a robust mix isn’t always easy, and may be impossible, depending on the data you’re trying to anticipate. But whenever there’s an opportunity to draw on a wider set of complementary models, it’s usually wise to do so.
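A stylized Python sketch shows why complementary biases matter. It assumes two hypothetical models with constant biases of opposite sign, which is far tidier than anything found in real data; the numbers are invented for illustration only.

```python
def mean_error(actual, predicted):
    """Average forecast error (predicted minus actual); a rough measure of bias."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual values.
actual = [2.0, 1.5, 1.0, 0.5]

# Two hypothetical models with opposite, constant biases.
model_high = [a + 0.4 for a in actual]  # consistently over-predicts
model_low = [a - 0.3 for a in actual]   # consistently under-predicts

# Equal-weight combination of the two models.
combo = [(h + l) / 2 for h, l in zip(model_high, model_low)]

bias_high = mean_error(actual, model_high)
bias_low = mean_error(actual, model_low)
bias_combo = mean_error(actual, combo)

print(round(bias_high, 2), round(bias_low, 2), round(bias_combo, 2))
```

In this contrived case the combined bias is smaller in absolute terms than either model's bias because the errors partly offset. If both models leaned the same way, averaging would offer no such benefit, which is the sense in which model selection still matters.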
As George Box famously wrote in a 1976 paper in the Journal of the American Statistical Association, “all models are wrong…” but some are “useful.” By the same logic, history and research suggest that combining models is an imperfect solution, but one likely to produce predictions that are less imperfect than those of any one model.
By James Picerno, Director of Analytics