GDP flash estimate. Timeliness at the expense of precision?
The latest announcement of the Central Statistical Bureau (CSB) on the full gross domestic product (GDP) data surprised and even baffled specialists and non-specialists alike. At the end of October 2015, the CSB published its so-called flash estimate, which indicated that GDP had grown by 2.5% year-on-year in the third quarter of 2015 (unadjusted data). A month later, at the end of November, the CSB published its full, or second, estimate, according to which GDP had grown by 3.3% in that same period. Fans of conspiracy theories usually find this the right moment to point out the imprecision of statistics and to quote the old joke that "there are lies, damned lies, and statistics". There is no doubt that a difference of 0.8 percentage points (year-on-year) between the first and second estimates is rather substantial. This blog entry is an attempt to put it in a slightly wider context.
Fig. 1. The difference between y-o-y GDP change in flash and full estimates; percentage points
Data source: Central Statistical Bureau data releases for the relevant periods; author's calculations
Briefly on the GDP flash estimate
What exactly is the GDP flash estimate? It is a semi-synthetic indicator based partly on operational information and partly on econometric models. The flash estimate used to appear around the 40th day after the end of the reference quarter. Beginning with the third quarter of 2014, in keeping with Eurostat requirements, the CSB has published it even earlier, around the 30th day after the end of the reference period.
The GDP flash estimate is based on operational information by branch received from businesses. In arriving at the flash estimate, information on industrial output and retail trade turnover is taken into account, as are preliminary information (that is, information not yet covering all enterprises, which can affect the results when the full GDP estimate is compiled) and information on construction output, the labour force and service-sector turnover. It is important to understand why differences between the flash and full estimates arise. The main reason is that the flash estimate is based on only partial information. At the moment the flash estimate is made, full information is available on only a few branches: retail and industry. By the time the full estimate is made, this information has been updated and supplemented with data that were not used for the flash estimate. The data on industry and trade are crucial in reflecting changes in the dynamics of the economy, and the fact that they are already included in the flash estimate is the reason why the flash estimate is relatively close to the full one. At the same time, surprises may occur in other economically important branches, for instance real estate, financial services, or transport and storage, for which only partial information is available when the flash estimate is made, or for which the figures rest on the predictions of econometric models.
In addition, econometric modelling is carried out for the flash estimate from three angles: a direct estimate of GDP, GDP from the expenditure side and GDP from the production side. In modelling each of these, the CSB uses a number of vector autoregressive models. Afterwards, the CSB compares the results obtained from the operational data with those from the models and, if the difference between them is not too big, the mean of the two is used. If, on the other hand, the difference is bigger, the result of the mathematical models is rejected. In short, producing the flash, or first, estimate is a rather complicated process.
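The comparison step described above can be illustrated with a minimal Python sketch. It is not the CSB's actual procedure: the function name, the tolerance of 0.5 percentage points and the input figures are all hypothetical, since the text does not say how large a difference counts as "too big". The sketch also assumes that rejecting the model result means falling back on the operational figure.

```python
def combine_estimates(operational: float, model: float, tolerance: float = 0.5) -> float:
    """Combine two y-o-y growth estimates (in %) into one flash figure.

    `tolerance` (percentage points) is a hypothetical threshold; the real
    cut-off used by the CSB is not stated in the text.
    """
    if abs(operational - model) <= tolerance:
        # The two results are close: use their mean.
        return (operational + model) / 2
    # The two results diverge: the model result is rejected, so this
    # sketch assumes the operational figure is used on its own.
    return operational


# Hypothetical inputs, for illustration only.
close_case = combine_estimates(operational=2.0, model=2.5)
divergent_case = combine_estimates(operational=2.0, model=4.0)
print(close_case, divergent_case)
```

The design point is simply that the models serve as a cross-check on the operational data rather than as the primary source.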
The reader may thus reasonably ask: why bother with the flash estimate if one can wait a little (another 30 days) and obtain a more reliable picture of the state of the economy? The answer is simple: faced with the trade-off between timeliness and precision, economic analysts, financiers, economic policy makers and other data users all prefer timeliness. Analysts will most often demand very timely economic data, even if the data are less precise or reliable. In Latvia this may be less pronounced, because the GDP flash estimate has almost no impact on the stock exchange index, government bond yields or consumer behaviour. If anything, the GDP figures might affect fiscal decisions (the drafting of the budget). But in countries with developed financial markets, the GDP flash estimate is the first piece of news about the actual state of the economy, and it can change sentiment on the stock exchange, affect the yields of debt instruments and even prompt changes in monetary policy. By the time the updated GDP figure comes out, it will attract attention, yet the focus will already be on other indicators characterizing the current quarter, for instance various labour market indicators and leading indicators (PMI, ESI, Sentix, IFO, etc.).
It can also be said that the flash estimate is more useful to those who are mainly interested in the short-term cyclical position of the economy, in other words, whether the economy is growing and at what rate. The full estimate will be of greater interest to those who are thinking about the long-term prospects of the economy. This is because the flash estimate usually consists of only one number, the rate of GDP growth. Another couple of figures may be mentioned in the release to explain the flash estimate result, but there is no reason to expect much detail. The full, or second, estimate, on the other hand, contains a great deal of structural information on how the various branches of the economy and the components of expenditure and income are developing.
Fig. 2. Flash, full and last estimate of the rate of GDP y-o-y changes, %
Data source: Central Statistical Bureau data releases for the relevant periods
How precise have we been up to now?
The GDP flash estimate has been available in Latvia since the fourth quarter of 2006. We thus have 36 quarters of information from which to draw certain conclusions. Figure 2 shows the year-on-year rate of change according to the first, or flash, estimate, the second, or full, estimate and the current GDP time series. There is not much theoretical literature exploring the precision of such estimates, and what is available usually focuses on these three time series. The graph suggests that the difference between the flash and full estimates is very small, yet this impression is partly due to the rather wide scale needed to cover the whole period.
What, to me, is most important in evaluating the precision of flash estimates? The main thing, of course, is that the absolute revision or error (i.e. disregarding the sign) should be as low as possible. It is also important that the revision between the flash and full estimates should average out close to zero in the long run. Figure 1 shows the difference in year-on-year change, expressed in percentage points. Over this period, the mean absolute error is 0.35 percentage points. That is, from the fourth quarter of 2006 to the third quarter of 2015, the flash estimate differed from the full estimate by 0.35 percentage points on average (the results are similar under both the old and the new flash estimate publication deadlines). The mean error over the same period was 0.11 percentage points, which, given the relatively short time series and the large standard deviation, should be considered a respectable result.
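For readers who want to reproduce measures of this kind, here is a minimal Python sketch of the two statistics: the mean revision (with sign) and the mean absolute revision. It assumes a revision is defined as the full estimate minus the flash estimate, so that a positive value means the flash estimate was too low. Apart from the third quarter of 2015 (2.5% and 3.3%), the sample figures are invented for illustration and are not CSB data.

```python
from statistics import mean


def revision_stats(flash, full):
    """Return (mean revision, mean absolute revision) in percentage points.

    A revision is taken here to be full minus flash, an assumed sign
    convention under which positive values mean the flash was too low.
    """
    revisions = [second - first for first, second in zip(flash, full)]
    mean_revision = mean(revisions)
    mean_abs_revision = mean([abs(r) for r in revisions])
    return mean_revision, mean_abs_revision


# Only the first pair (2015 Q3) comes from the text; the rest are made up.
flash_growth = [2.5, 1.0, 3.0]
full_growth = [3.3, 0.8, 3.0]

mean_rev, mean_abs_rev = revision_stats(flash_growth, full_growth)
print(f"mean revision: {mean_rev:.2f} pp, mean absolute revision: {mean_abs_rev:.2f} pp")
```

Positive and negative revisions cancel out in the mean revision, which is why it can be close to zero even when individual revisions are large; the mean absolute revision does not have this property.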
At the same time, I must note that in the "peaceful period" beginning with the first quarter of 2011 (that is, disregarding the crisis period, when the fluctuations in GDP growth were extraordinarily large), the mean revision has been +0.23 percentage points. This supports the recent observation that the flash estimate tends to underestimate economic growth. It may be related to the fact that the fastest-growing branches are precisely those for which full information is not available at the time of the flash estimate.
Yet we must keep in mind that here we are talking only about the differences between the flash and full estimates. That does not rule out the problem that future revisions may correct the GDP time series beyond recognition, in which case there is little to gain from the flash and full GDP estimates being close.
The third quarter of 2015 stands out against the overall background, yet there have been quarters with even greater revisions. Three times in recent years, the revision has amounted to 0.9 percentage points. At the same time, in 18 cases out of 36, or exactly one half, the revision was minuscule (less than 0.2 percentage points). In my opinion, these results indicate that the CSB flash estimate method has enjoyed notable success. The flash estimate gives a highly plausible result that will be close to the one in the full GDP estimate. Can it be said that the full estimate is more precise? That is a question from the realm of speculation, for what does "more precise" mean in this case? If we assume that "more precise" means the latest, then we must compare the flash and full estimates against the current GDP time series. The correlation between the flash and current estimates is 0.957, and the correlation between the full and current estimates is 0.963. There is an improvement, but it is rather insubstantial. It confirms that the impact of the regular or methodological data revisions that follow the publication of the full estimate is substantially greater. The real problem with the quality of the GDP time series is thus found not in the differences between the flash and full estimates but in the subsequent data revisions, which sometimes change the initial GDP profile quite substantially. Moreover, here we are talking only about the headline GDP time series: at the component or branch level, the problem is even more pronounced.
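The comparison against the current time series rests on an ordinary correlation coefficient. The sketch below assumes it is the Pearson coefficient (the text does not name the measure) and writes it out by hand so that it runs without third-party libraries. The three short series are invented placeholders, not the actual flash, full and current GDP series.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Invented y-o-y growth rates (%), for illustration only.
flash_series = [2.5, 1.0, -4.0, 3.0, 5.5]
full_series = [3.3, 0.8, -4.2, 3.0, 5.9]
current_series = [2.9, 2.0, -6.0, 4.1, 6.5]

print(f"flash vs current: {pearson(flash_series, current_series):.3f}")
print(f"full vs current:  {pearson(full_series, current_series):.3f}")
```

A high correlation only says that the series move together; it says nothing about the size of the gap between them, which is why the absolute revisions remain worth examining alongside it.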
Is a mean absolute error of 0.35 percentage points large compared with other countries? Analysis by the U.S. Bureau of Economic Analysis indicates that from 1993 to 2014, the mean absolute revision in the US was 0.5 percentage points and the mean revision 0.1 percentage points. The mean absolute revision of the current time series against the flash estimate time series is 1.2 percentage points. In the case of the United Kingdom, from the first quarter of 1993 to the second quarter of 2015, the mean GDP revision was zero and the mean absolute revision only 0.05 percentage points. In the longer term, however, subsequent revisions change the GDP estimate substantially: over time, the mean absolute revision against the flash estimate is 0.33 percentage points. In the case of Latvia, the mean absolute revision of the flash estimate against the current time series is substantially higher, at 1.76 percentage points. This is another confirmation that, in Latvia, the quality of the flash estimate compared with the full estimate is satisfactory, whereas the reliability of both the flash and the full estimates compared with the subsequent revisions is rather weak. That creates a dilemma for economic forecasters: which GDP should they forecast? The one that will be published in the flash and full data releases, or the one that will emerge after thorough revision?
Here I have touched upon only one small dimension of the precision of GDP estimates, whereas the potential field of action is much greater. In my opinion, it would be important for data users that measures be taken to reduce the size of the revisions made to the full estimate. That is, when the transition is made from quarterly national accounts data to annual national accounts data, the revision against the full GDP estimate should not be so large. I am aware of how daunting this task is, for the data quality requirements of data users (including Eurostat and the OECD) cause constant changes in methodology.
The attitude of the providers of the initial data is not always serious, which makes the CSB's work so much more difficult. A large part of the revisions results not from the activities of data compilers but from those of data providers. It must be kept in mind that it is the data of these providers (businesses, households, state administration and other economic participants) that the CSB deals with on a daily basis.
I also think that the time has come for users of economic data to get used to the idea that a revision of data is not bad news but an integral part of the work (even though technical errors cannot be excluded in some cases). To achieve this, it is worth learning from the experience of international institutions. Central banks and research institutes pay attention not only to the revisions of the main GDP time series but also to those of its separate components, and they take the results into account when building econometric models for GDP forecasts. For example, the website of the British Office for National Statistics provides so-called revision triangles, which contain very valuable information on how the relevant GDP time series has changed over time. It would be desirable for the Latvian CSB to publish revision triangles as well. That would give local researchers an opportunity to deal with this area more seriously, which in the future would allow for fewer forecasting errors and possibly improve the quality of the data.
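To give a rough idea of what a revision triangle records, here is a simplified Python sketch. It is not the format used by the Office for National Statistics: the data layout, the function name and the vintage labels are assumptions made for illustration. Each vintage (publication date) stores the growth rates known at that time, so the history of any one quarter can be read across vintages. Only the two 2015 Q3 figures from October and November 2015 come from this article; every other number is invented.

```python
def revision_path(vintages, quarter):
    """List (vintage, value) pairs for one reference quarter, oldest first.

    `vintages` maps a publication date ("YYYY-MM") to the growth rates
    (%, y-o-y) published on that date, keyed by reference quarter.
    """
    return [
        (vintage, series[quarter])
        for vintage, series in sorted(vintages.items())
        if quarter in series
    ]


# A toy triangle. Later vintages cover more quarters and may revise old ones.
vintages = {
    "2015-10": {"2015Q2": 2.7, "2015Q3": 2.5},  # flash for 2015 Q3 (from the text)
    "2015-11": {"2015Q2": 2.7, "2015Q3": 3.3},  # full estimate (from the text)
    "2016-09": {"2015Q2": 2.9, "2015Q3": 3.5},  # invented later revision
}

path = revision_path(vintages, "2015Q3")
total_revision = path[-1][1] - path[0][1]
print(path)
print(f"total revision since the flash estimate: {total_revision:.1f} pp")
```

With data organised this way, revision statistics can be computed for any pair of vintages, not only for the flash and full estimates.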
 Note that, as of the third quarter of 2014, the GDP flash estimate is published 10 days earlier than before. As a result, the concept of "flash estimate" includes similar, yet slightly different concepts.