The Budget Forecasts. . .

And forecasts in general.

The Administration released its budget proposal yesterday. Others have dissected the implications [1] [2] [3]. Here I want to focus on the Administration forecast.

The forecasts, and comparisons against alternatives, are presented in Table 2-2 of the Analytical Perspectives of the Budget.

One important fact to remember is that the Administration forecast (from the Troika of Treasury, OMB and CEA) was “locked down” in mid-November. That would mean that the forecast was finalized before the disappointing advance release for 2011Q4 GDP, but also before the relatively positive January 2012 employment release. We can place the Administration forecast in context by plotting the implied GDP (log) levels associated with the Administration, January Blue Chip and Fed forecasts.


Figure 1: Log actual GDP (black), Administration forecast (red), Blue Chip (blue), and January Fed central tendency high and low (gray). Since only Q4 q4/q4 growth rates are reported, the q4/q4 growth rates are assumed to hold throughout the corresponding year. NBER defined recession dates shaded gray. Source: FY 2013 Budget Analytical Perspectives, NBER, author’s calculations.
The Administration forecast is above the Blue Chip forecast, but roughly at the top end of the Fed’s central tendency.
Figure 2: Log actual GDP (black), Administration forecast (red), Wall Street Journal mean (blue), and WSJ 20% trimmed high (Shephardson/Hi Frequency) and low (Sterne/Econ Analysis) (gray). Since only Q4 q4/q4 growth rates are reported, the q4/q4 growth rates are assumed to hold throughout the corresponding year, except for WSJ over 2012. Trimming is applied to only those respondents that provided 2012-2014 growth rates. NBER defined recession dates shaded gray. Source: FY 2013 Budget Analytical Perspectives, WSJ February survey, NBER and author’s calculations.
Figure 2 plots the Administration forecast as well as the mean and 20% trimmed high/low range from the WSJ February survey. Since the February forecast is conditioned on the 2011Q4 release, it starts at a lower level than the Administration forecast, which (conditioned on mid-November data) predicted a slightly higher level of output than estimated in the 2011Q4 advance release. Even so, the Administration's forecast level is below the trimmed high in 2013-2014 (clearly, it would have been farther below the actual high in the WSJ survey).
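For readers who want to reproduce the level paths, here is a minimal sketch of the interpolation described in the figure notes: each q4/q4 growth rate is assumed to hold evenly through its year, so log GDP rises by one quarter of the annual log change each quarter. The function name and the numbers in the example call are hypothetical placeholders, not the figures from Table 2-2.

```python
import math

def implied_log_levels(base_q4_level, q4_growth_pct):
    """Quarterly log GDP levels implied by q4/q4 growth rates (in percent).

    Assumption: growth is spread evenly within each year, so every quarter
    adds one quarter of that year's log change.
    """
    log_level = math.log(base_q4_level)
    path = []
    for growth in q4_growth_pct:
        quarterly_step = math.log(1.0 + growth / 100.0) / 4.0
        for _ in range(4):
            log_level += quarterly_step
            path.append(log_level)
    return path

# Hypothetical inputs: an index level of 100 in the base Q4 and three
# years of q4/q4 growth. Replace with the published forecast values.
example_path = implied_log_levels(100.0, [3.0, 3.0, 4.0])
print([round(x, 4) for x in example_path])
```

By construction, the fourth, eighth and twelfth entries reproduce the Q4 levels implied by compounding the annual rates; only the within-year profile is an assumption.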

Forecast Accuracy

Prakash Loungani had a timely piece on forecasting a couple of weeks ago (especially given the possibility of negative shocks emanating from Europe or the Persian Gulf). In it, he noted:

With my colleagues Jair Rodriguez and Hites Ahir, I’ve since looked at the record of forecasting recessions over the decade of the 2000s and during the Great Recession of 2007-09.

Let’s consider the 2000s first and restrict attention to forecasts for twelve large economies — the G7 plus the ‘E7’ (emerging market economies–Brazil, China, India, Korea, Mexico, Russia and Turkey), which together account for over three-quarters of world GDP.

There were a total of 26 recessions in this set of countries. Only two recessions were predicted a year in advance and one of those predictions came toward the turn of the year. Requiring recessions to be predicted a year ahead may seem like an unreasonably high bar to set.

Moreover, while forecasters increasingly started to recognize recessions in the year in which they occurred, the magnitude of the recession was underpredicted in the vast majority of cases. For instance, even as late as December of the year of the recession, the forecast was more optimistic than the outcome in 15 cases.

Figure 1 illustrates this point.

Figure 1 from Loungani (2012).

That being said, I think his take-away is relevant:

The failure to predict recessions does not mean that forecasts of economic growth have no value. But it does suggest that users of forecasts might be better served by paying greater attention to the description of the outlook and the associated risks than to just the central forecast itself.

Reassuringly, it is becoming more common to show how much uncertainty there is about whether the central forecast will come true. It is particularly useful to be explicit about the downside risks to a growth forecast as it can provide a wake-up call for policies and actions needed to keep those risks from materializing.

So, even though there are increasing signs of a durable, albeit slow, recovery, it would be a mistake to take the central tendency of any forecast too seriously. In particular, given the clearly evident downside risks, it would be a big mistake to withdraw stimulus too soon; the payroll tax reduction and the extension of unemployment insurance are critical in this regard. An assessment of what these types of measures can do is recounted here. As CBO notes, these types of measures have potentially the biggest per dollar impact.

Loungani’s other work on forecasting is also relevant, given the foreign sources of uncertainty. From Information Rigidity in Growth Forecasts: Some Cross-Country Evidence (with H. Stekler and N. Tamirisa):

First, there is considerable sluggishness in revisions of growth forecasts. This is consistent with the sticky information models of Mankiw and Reis (2002), the imperfect information models of Woodford (2002) and Sims (2003), and behavioral explanations for forecast smoothing (Nordhaus 1987, Nordhaus and Durlauf, 1984, Fildes and Stekler, 2002).

Second, the sluggishness in forecast revisions declines during recessions and banking crises. We find that forecasts in the year preceding a year of recession start to depart from the unconditional mean, and the pace of revision picks up over the course of the year of the recession. A similar pattern holds for banking crises. These findings support models with state-dependent acquisition of information (e.g. Gorodnichenko 2008).

Third, we confirm the finding of sluggish adjustment in a multivariate setting, by estimating a seven-country VAR model for forecast revisions. The seven economies are the so-called G-3 (U.S., Germany, Japan) and the BRICs (Brazil, Russia, India, China). Forecasters are somewhat slower in absorbing news from other countries than own-country (or domestic) news. Forecasts for non-U.S. countries, particularly those for Germany and Japan, are generally slow to absorb news from the U.S. There is also a tendency to absorb news from China at a very sluggish pace.
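To make "sluggish revisions" concrete, here is a rough sketch of a simple single-series check in the spirit of the forecast-smoothing literature cited above: regress each forecast revision on the previous revision for the same target. A positive slope means revisions are predictable from earlier revisions, which is what smoothing implies. This is my own illustration, not the estimation used in the paper (which works with a cross-country panel and a VAR); the function name and the forecast sequence are hypothetical.

```python
def revision_persistence(forecasts):
    """OLS slope from regressing each revision on the previous revision.

    `forecasts` holds successive forecasts of the same target, oldest first.
    A slope near zero suggests news is absorbed at once; a positive slope
    suggests revisions are smoothed over time.
    """
    revisions = [new - old for old, new in zip(forecasts, forecasts[1:])]
    lagged, current = revisions[:-1], revisions[1:]
    n = len(lagged)
    if n < 2:
        raise ValueError("need at least four forecasts")
    mean_x = sum(lagged) / n
    mean_y = sum(current) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    if sxx == 0:
        raise ValueError("lagged revisions have no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, current))
    return sxy / sxx

# Hypothetical forecaster who closes half the gap to a -1.0 outcome at
# each update, starting from an initial growth forecast of 3.0.
smoothed = [3.0, 1.0, 0.0, -0.5, -0.75, -0.875]
print(revision_persistence(smoothed))
```

In this made-up sequence every revision is exactly half the previous one, so the slope is 0.5 by construction. With real survey data the interesting question is whether that slope falls in recession years, as the second finding above suggests.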

Comparing Models in Forecasting

Volker Wieland and Maik Wolters provide some additional information regarding the relative accuracy of DSGE and old-style Keynesian macroeconometric models, in their Vox article “Macroeconomic model comparisons and forecast competitions”:

… we propose a comparative approach to macroeconomic policy analysis that is open to competing modelling paradigms. We have developed a database of macroeconomic models that enables a systematic comparative approach to macroeconomic modelling with the objective of identifying policy recommendations that are robust to model uncertainty. This comparative approach enables individual researchers to conduct model comparisons easily, frequently, at low cost, and on a large scale.

The macroeconomic model database is available to download and includes over 50 models. We have included models that are used at policy institutions like the IMF, the ECB, the Fed, and in academia. The database includes models of the US economy, the Eurozone, and several multi-country models. Some of the models are fairly small and focus on explaining output, inflation, and interest-rate dynamics. Many others are of medium scale and cover many key macroeconomic aggregates.

We use two small micro-founded New Keynesian models, two medium-size state-of-the-art New Keynesian business-cycle models — often referred to as DSGE models — and for comparison purposes an earlier-generation New Keynesian model (also with rational expectations and nominal rigidities but less strict microeconomic foundations) and a Bayesian VAR model. For each forecast we re-estimate all five models using exactly the data as they were available for professional forecasters when they submitted their forecasts to the SPF. Using these historical data vintages is crucial to ensure comparability to historical forecasts by professionals. We compute successive quarter-by-quarter forecasts up to five quarters ahead for all models.

Predicting the recession of 2008–09

Figure 1 shows forecasts for annualised quarterly real output growth for the recent financial crisis. The black line shows real-time data until the forecast starting point and revised data afterwards. The grey lines show forecasts collected in the SPF and the green line shows their mean. Model forecasts are shown in red. While data for real GDP become available with a lag of one quarter, professional forecasters can use within-quarter information from data series with a higher frequency. In contrast the models can process only quarterly data. To put the models on an equal footing in terms of information with the forecasts of experts, we condition their forecasts on the mean estimate of the current state of the economy from the SPF.

Here is Figure 1.

Figure 1 from Wieland and Wolters (2012). Mean SPF is mean of forecasts from Survey of Professional Forecasters.

Wieland and Wolters conclude that old style Keynesian models are not necessarily to be favored, assuming that their use predominates in the Survey of Professional Forecasters.

I would note that at least the mean SPF predicted a growth deceleration as of 2008Q2, while all the newer vintage models continued predicting acceleration. On the other hand, the newer vintage models did catch the growth rebound better. This characterization is even more true for forecasts starting in 2009Q1 or 2009Q2. However, for forecast horizons that extend to mid-2010, the SPF seems to me (eyeballing) to do better at matching the deceleration of growth (see their Figure 2).

Further, these are growth rate forecasts; perhaps these are the most important. But as Morley has pointed out, oftentimes we are interested in levels. The treatment of trends then becomes critically important, as discussed at length here (see specifically Morley (2010) and Tovar (2008)).
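One way to see why levels matter: modest but persistent errors in growth forecasts cumulate into a widening gap in the level. The sketch below converts annualised quarterly growth rates into quarterly log changes and tracks the running gap between the forecast and actual log levels. The function name and the numbers are hypothetical, chosen only for illustration.

```python
import math

def cumulative_level_gap(forecast_growth_pct, actual_growth_pct):
    """Running gap (forecast minus actual) in log levels.

    Inputs are annualised quarterly growth rates in percent. Each is
    converted to a quarterly log change, and the differences are summed.
    """
    gap = 0.0
    gaps = []
    for f, a in zip(forecast_growth_pct, actual_growth_pct):
        gap += (math.log(1.0 + f / 100.0) - math.log(1.0 + a / 100.0)) / 4.0
        gaps.append(gap)
    return gaps

# Hypothetical case: growth is overpredicted by one percentage point
# (annualised) in each of four consecutive quarters.
gaps = cumulative_level_gap([2.0] * 4, [1.0] * 4)
print([round(100 * g, 2) for g in gaps])  # level gap in log points x 100
```

Each quarter's growth miss looks small on its own, but after a year the forecast level sits roughly one percent above the actual level, and nothing in a growth-rate comparison penalises that.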

This post originally appeared at Econbrowser and is posted with permission.