Standard error of forecast

Walter
Posts: 3070
Joined: Tue May 02, 2017 11:13 am
Location: On a mission close to DRS, Germany

Standard error of forecast

Post by Walter »

For the statistics nerds here, this may be a dumb question, but it confuses me:

Assume you have a set of data points and fit a straight line y = a + b x through them the usual way. Then you can determine the standard errors of the parameters a and b as well, i.e. s(a) and s(b).

Using these fit results, you can compute a forecast y* = a + b x*. According to Gauss' error propagation, the standard error of this forecast should be

s(y*) = SQRT((dy*/da)² s(a)² + (dy*/db)² s(b)²) = SQRT(s(a)² + x*² s(b)²).

OTOH, there is a 'standard error of the estimate' s(y^) = sy SQRT((n - 1)(1 - r²)/(n - 2)) with sy being the standard deviation of the y-values. This formula is said to describe the scattering of y of the data points around the fit line. Hence s(y^) is independent of x.
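
As a quick numerical check of the formula above, here is a minimal Python sketch. The data and the function name std_error_of_estimate are made up for illustration. It compares sy SQRT((n - 1)(1 - r²)/(n - 2)) with the root of the residual sum of squares divided by n - 2. The two should agree, which is consistent with s(y^) describing only the scatter around the fit line.

```python
import math

# Made-up sample data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]

def std_error_of_estimate(x, y):
    """Return s(y^) computed two ways: via r and via the residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = my - b * mx                    # intercept
    r2 = sxy ** 2 / (sxx * syy)        # squared correlation coefficient
    sy = math.sqrt(syy / (n - 1))      # sample standard deviation of y
    via_r = sy * math.sqrt((n - 1) * (1 - r2) / (n - 2))
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    via_residuals = math.sqrt(sse / (n - 2))
    return via_r, via_residuals

via_r, via_residuals = std_error_of_estimate(x, y)
print(via_r, via_residuals)
```

Neither expression contains x*, so this quantity alone cannot describe how the uncertainty of a forecast changes with x*.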

What is the correct formula for the standard error of the forecast y*?
If it is s(y^), why is Gauss' error propagation not applicable here? Is it because a and b are not independent variables?
And I wonder why the error of the forecast should be constant over the entire range of x. I think it should increase when x* lies outside the range covered by the data points.

I looked through some statistics books but found none that really covers this topic (or I missed it).

Thanks in advance for enlightenment.
Last edited by Walter on Sat Sep 25, 2021 10:40 am, edited 1 time in total.
WP43 SN00000, 34S, and 31S for obvious reasons; HP-35, 45, ..., 35S, 15CE, DM16L S/N# 00093, DM42β SN:00041
rawi
Posts: 102
Joined: Sat Dec 28, 2019 4:50 am
Location: Bavaria, Germany

Re: Standard error of forecast (error propagation in fitting?)

Post by rawi »

Hi Walter,

the formula for the variance of a forecast value f is:
S(f)^2 = S(e)^2 * (1 + (1/N) + (x0 - mean(x))^2 / sum((xi - mean(x))^2))
where:
N = number of observations
mean(x) = arithmetic mean of x
x0 = value of x for which you want a forecast
S(e)^2 = sum((yi - (a + b*xi))^2) / (N - 2)
a is the constant, b the slope of the regression function.

The standard deviation is the root of that.

Your formula is not correct because the regression parameters a and b are correlated, so the covariance term cannot be left out of the error propagation.
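
To illustrate the role of the correlation, here is a minimal Python sketch with made-up data; the helper name fit_line and the chosen x0 are assumptions for illustration. It uses the standard least-squares expressions for var(a), var(b) and cov(a, b) and compares three things: error propagation without the covariance term, error propagation with it, and the closed form S(e)^2 * (1/N + (x0 - mean(x))^2 / sum((xi - mean(x))^2)). Adding S(e)^2 to the latter gives the forecast variance from the formula above.

```python
# Made-up sample data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]

def fit_line(x, y):
    """Least-squares fit y = a + b x with parameter (co)variances."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    se2 = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    var_a = se2 * (1 / n + mx ** 2 / sxx)
    var_b = se2 / sxx
    cov_ab = -mx * se2 / sxx           # a and b are correlated unless mean(x) = 0
    return n, mx, sxx, se2, var_a, var_b, cov_ab

n, mx, sxx, se2, var_a, var_b, cov_ab = fit_line(x, y)
x0 = 8.0                               # hypothetical forecast position

naive = var_a + x0 ** 2 * var_b                       # covariance ignored
full = var_a + x0 ** 2 * var_b + 2 * x0 * cov_ab      # covariance included
closed = se2 * (1 / n + (x0 - mx) ** 2 / sxx)         # variance of the fitted line at x0
forecast = se2 * (1 + 1 / n + (x0 - mx) ** 2 / sxx)   # variance of a new observation at x0

print(naive, full, closed, forecast)
```

With the covariance included, the propagated variance coincides with the closed form; the extra "1 +" in the forecast variance accounts for the scatter of a single new observation around the line.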

Attached you will find a small example.

Reference: Hill / Griffiths / Lim: Principles of Econometrics, Wiley 2008

If you need further information, please let me know.

Best

Raimund
Attachments
Forecast_var.xlsx
PierreMengisen
Posts: 305
Joined: Wed Nov 29, 2017 1:38 pm
Location: Neuchâtel CH

Re: Standard error of forecast (error propagation in fitting?)

Post by PierreMengisen »

According to the Microsoft Excel help:

The equation for the standard error of the predicted y-value is as follows: 

[Image: formule.jpg]

where x and y are the sample means AVERAGE(x_known) and AVERAGE(y_known), and n is the sample size.
Pierre
[TI59 with PC100C; TI-84 Plus CE-T; HP41CV with HP IL loop & 2*82161A DCD & 82162 TP; HP15C; HP28S; DM41; DM41L; DM42; DM41X]
rawi
Posts: 102
Joined: Sat Dec 28, 2019 4:50 am
Location: Bavaria, Germany

Re: Standard error of forecast (error propagation in fitting?)

Post by rawi »

Hi Pierre,

your formula delivers the standard error of all y-values of the regression, i.e. one result for all y-values together (see the example in Excel).
The formula I posted delivers the standard error for one single value, which may be in the data set of the regression analysis but may just as well be an additional value, not included in the regression analysis, for which a forecast is wanted. This is how I understood Walter's question.
Clearly, a formula for a forecast has to take into account explicitly the x0-value for which the forecast is made. The farther x0 is from the mean of the x-values on which the regression analysis is based, the greater the standard error gets.
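
A minimal Python sketch of this behaviour, using made-up data and a hypothetical helper forecast_se that evaluates the forecast formula posted earlier in this thread:

```python
import math

# Made-up sample data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]

def forecast_se(x, y, x0):
    """S(f) = S(e) * sqrt(1 + 1/N + (x0 - mean(x))^2 / sum((xi - mean(x))^2))."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    se2 = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    return math.sqrt(se2 * (1 + 1 / n + (x0 - mx) ** 2 / sxx))

mean_x = sum(x) / len(x)
for distance in (0.0, 2.0, 5.0, 10.0):
    # The standard error depends only on the distance of x0 from mean(x).
    print(distance, forecast_se(x, y, mean_x + distance))
```

The value is smallest at x0 = mean(x) and grows symmetrically on both sides.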

Best

Raimund
Walter
Posts: 3070
Joined: Tue May 02, 2017 11:13 am
Location: On a mission close to DRS, Germany

Re: Standard error of forecast (error propagation in fitting?)

Post by Walter »

Thank you, Raimund and Pierre, for confirming that Gauss can't be used here. With respect to the formula for S(f), I lean towards Raimund's since it also depends on x in a plausible way.

But I think S(e) should be replaced by sy there, shouldn't it? Please compare https://de.wikipedia.org/wiki/Lineare_E ... Vorhersage. I have observed more than once, though, that different authors state different (statistical) formulas because they write and condense terms differently, making it difficult to find out whether those are actually identical or not.
rawi
Posts: 102
Joined: Sat Dec 28, 2019 4:50 am
Location: Bavaria, Germany

Re: Standard error of forecast (error propagation in fitting?)

Post by rawi »

Hi Walter,

s(e) should not be replaced by s(y). In the Wikipedia article sigma is defined only implicitly. Under the headline "t-test" you can see that the error term is assumed to be normally distributed with mean 0 and variance sigma^2. From this it is clear that the sigma used later is the standard deviation of the error term, which I called s(e).

Likewise, in my reference "Principles of Econometrics", page 102, it is clear that it is the variance of the errors that is used in the formula I cited.

And this makes sense: if you used s(y) instead of s(e), the quality of the forecast would not depend on the quality of the regression.

BTW: I agree 100% with your comment on statistical formulas in different textbooks.

Best

Raimund
Walter
Posts: 3070
Joined: Tue May 02, 2017 11:13 am
Location: On a mission close to DRS, Germany

Re: Standard error of forecast (error propagation in fitting?)

Post by Walter »

rawi wrote:
Thu Sep 02, 2021 8:24 pm
the formula for the variance of a forecasting value f is:
S(f)^2 = S(e)^2 * (1 + (1/N) + (x0 - mean(x))^2 / sum((xi - mean(x))^2))
Whereas:
N = number of observations
mean(x) = arithmetic mean of x
x0 = value of x for which you want a forecast
S(e)^2 = sum((yi - (a + b*xi))^2) / (N - 2)
a is the constant, b the slope of the regression function.
Yes, in principle, but this is the standard error of the sample values. I think that if the forecast value lies on the regression line, then
S(f)^2 = t^(-1)(n-2, 0.683) * S(e)^2 * ((1/N) + (x0 - mean(x))^2 / sum((xi - mean(x))^2))
should be used (the standard error for estimating the mean response).
This formula can then be written more simply using Sx and S(b).

(EDIT: Added t...)
rawi
Posts: 102
Joined: Sat Dec 28, 2019 4:50 am
Location: Bavaria, Germany

Re: Standard error of forecast (error propagation in fitting?)

Post by rawi »

Hi Walter,

I fear I do not understand you well.

The formula I gave is not the standard error of the sample values but that for a value x0, which may or may not be a sample value.
I do not know what t is, and "2,0.683" is surely a typo, but even if I read it as 2 I cannot see why this multiplication makes sense here.
Sorry, but for the time being I cannot help further.

And BTW, I still believe that the formula I gave you is correct.

Best

Raimund
Walter
Posts: 3070
Joined: Tue May 02, 2017 11:13 am
Location: On a mission close to DRS, Germany

Re: Standard error of forecast (error propagation in fitting?)

Post by Walter »

This t^(-1) is the inverse of the t distribution for n-2 degrees of freedom and a probability of 0.683 (it returns roughly 0.5 ± 5%). I found this when looking at some old books and papers explaining confidence limits for the estimated forecast (Schätzwert). The probability 0.683 corresponds to the standard error. Your formula matches the confidence limits for the population (though there, too, it is multiplied by t^(-1)). The only difference is the term "1+" under the square root.

Some authors ditch the t^(-1) for unknown reasons. The errors then become about twice as large, so you can say that's a conservative assumption (and you don't have to remember n anymore).

Statistics can be a little messy, especially when authors refrain from deriving their results...
Walter
Posts: 3070
Joined: Tue May 02, 2017 11:13 am
Location: On a mission close to DRS, Germany

Re: Standard error of forecast

Post by Walter »

Found the errors I made: For calculating t^(-1) for the standard error of a forecast f on the regression line, the probability must be set to one minus half the error probability, i.e. 1 - (1 - 0.683)/2 = 0.8415. Then t^(-1) approx. equals 1, so ditching this factor in the formula

S(f) = t^(-1)(n-2, 0.8415) * S(e) * SQRT((1/N) + (x0 - mean(x))^2/sum(xi - mean(x))^2)

is legitimate. Just to save you from pondering that stuff.
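
For anyone who wants to check these numbers without statistical tables, here is a rough Python sketch. The functions t_pdf, t_cdf and t_quantile are my own illustrative helpers (Simpson integration plus bisection), not taken from any calculator firmware or library, and they only handle probabilities from 0.5 upwards.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """CDF for t >= 0: 0.5 plus Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Inverse CDF for 0.5 <= p < 1 by bracketing and bisection."""
    if not 0.5 <= p < 1:
        raise ValueError("p must satisfy 0.5 <= p < 1")
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:           # widen the bracket until it contains the quantile
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (3, 10, 30, 100):
    # Second column: the probability used at first; third: the corrected one.
    print(df, t_quantile(0.683, df), t_quantile(0.8415, df))
```

For the corrected probability 0.8415 the factor lies somewhat above 1 for small n and approaches 1 as n grows, so dropping it is an approximation that gets better with more data points.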

P.S.: And the forecast f means what HP called a forecast y^ of linear regression. Some textbooks use 'forecast' for other results in this context. Confusion guaranteed.