# Question on population regression model

coconut64

#1

Hi, after doing some research and reading the textbook I still don't fully get the terms used in the regression topic. There is a difference between the population regression model and the sample regression model. But since a regression is run on a sample, surely you can't describe the scatter plot generated as the population regression model?

Thanks


VannR


#2

In a nutshell, the *population* regression model is the theoretical model which you assert your data has, i.e. Y = g(X) + e, where e is some statistical error term. The *sample* regression model is the estimation of this model which you produce from your data using least-squares estimators.
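To make the distinction concrete, here is a minimal simulation sketch. It assumes a linear g(X) = a + bX and uses made-up values for a, b and sigma (the names `A_TRUE`, `B_TRUE`, `SIGMA` and both helper functions are illustrative, not from the thread). In real data you never see the population parameters; here we pick them ourselves so we can compare them with the estimates.

```python
import random

def simulate_sample(a, b, sigma, n, rng):
    """Draw n points from the assumed population model Y = a + b*x + e."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [a + b * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys

def least_squares(xs, ys):
    """Return the least-squares estimates (a_hat, b_hat) for a straight line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b_hat = s_xy / s_xx
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat

rng = random.Random(0)
A_TRUE, B_TRUE, SIGMA = 2.0, 0.5, 1.0  # hypothetical population parameters

xs, ys = simulate_sample(A_TRUE, B_TRUE, SIGMA, n=200, rng=rng)
a_hat, b_hat = least_squares(xs, ys)

print(f"population model: Y  = {A_TRUE} + {B_TRUE}x + e")
print(f"sample model:     Y^ = {a_hat:.3f} + {b_hat:.3f}x")
```

The two printed lines should be close but not identical: the first is the model we asserted, the second is what the data gave us.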

coconut64

#3

(Original post by **VannR**) In a nutshell, the *population* regression model is the theoretical model which you assert your data has, i.e. Y = g(X) + e, where e is some statistical error term. The *sample* regression model is the estimation of this model which you produce from your data using least-squares estimators.

So if you run a regression on two sets of data in linear regression, you call the scatter plot you construct the population regression model, since it shows the line with the equation Y = g(X) + e?

The regression statistics generated, which include values for R squared and SE, are described as the sample regression model?


VannR


#4

(Original post by **coconut64**) So if you run a regression on two sets of data in linear regression, you call the scatter plot you construct the population regression model, since it shows the line with the equation Y = g(X) + e? The regression statistics generated, which include values for R squared and SE, are described as the sample regression model?

The EQUATION that you get is called the sample regression model.

Everything else you have is a set of statistics *generated from* the sample regression model which can be used to check the quality of the regression.

EDIT: Y = g(X) + e is a description of the population model. We cannot know what the population model is for certain unless we have the entire population, which we don't. That is why we're using least-squares estimators. The term e is a normally distributed random variable with a mean of 0 and a constant variance (sigma)^2.


coconut64

#5

(Original post by **VannR**) The EQUATION that you get is called the sample regression model. Everything else you have is a set of statistics *generated from* the sample regression model which can be used to check the quality of the regression. EDIT: Y = g(X) + e is a description of the population model. We cannot know what the population model is for certain unless we have the entire population, which we don't. That is why we're using least-squares estimators. The term e is a normally distributed random variable with a mean of 0 and a constant variance (sigma)^2.

Oh okay, thank you! So the error term is only present in the population regression equation? Why is e_i not in the sample regression equation?

Thanks


VannR


#6

(Original post by **coconut64**) Oh okay, thank you! So the error term is only present in the population regression equation? Why is e_i not in the sample regression equation? Thanks

Suppose the population regression model is linear:

Y = a + bx + e

e ~ N(0, (sigma)^2). Finding the expectation and variance of Y for a fixed value x:

E(Y) = E(a + bx + e) = a + bx + E(e) = a + bx

Var(Y) = Var(a + bx + e) = Var(e) = (sigma)^2

So, the population regression model models Y ~ N(a + bx, (sigma)^2)
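If you want to convince yourself of the mean and variance above, a rough numerical check is easy. This is only a sketch: the values of `a`, `b`, `sigma` and `x` are made up for illustration and are not from the thread.

```python
import random
import statistics

rng = random.Random(1)
a, b, sigma = 2.0, 0.5, 1.5  # hypothetical population parameters
x = 4.0                      # the fixed value of x we condition on

# Draw many values of Y = a + b*x + e with e ~ N(0, sigma^2), x held fixed.
draws = [a + b * x + rng.gauss(0.0, sigma) for _ in range(50_000)]

mean_y = statistics.fmean(draws)
var_y = statistics.pvariance(draws)

print(f"simulated E(Y)   = {mean_y:.3f}   theory a + b*x = {a + b * x}")
print(f"simulated Var(Y) = {var_y:.3f}   theory sigma^2 = {sigma ** 2}")
```

The simulated values should land near a + bx and (sigma)^2, with the small gaps being sampling noise.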

The problem now is that we do not know what a and b are. Thus, we use least-squares estimators a^, b^ such that E(a^) = a and E(b^) = b.

The sample regression model is then an estimator of Y, Y^, for a given value of x, such that Y^ = a^ + b^ . x, and where if we take expectations:

E(Y^) = E(a^ + b^ . x) = E(a^) + E(b^ . x) = a + b.x

Since E(Y^) = a + b.x = E(Y), the sample regression line estimates the mean of Y at each x, which is our original model without the error:

E(Y) = a + b.x
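The claim that the least-squares estimators are right "on average" can be checked by repeating the sampling many times. The sketch below assumes the same kind of linear model with made-up parameters, a fixed set of x values, and a hypothetical helper `fit_line`; none of these specifics come from the thread.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares intercept and slope for a straight line."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b_hat = s_xy / s_xx
    return y_bar - b_hat * x_bar, b_hat

rng = random.Random(2)
a, b, sigma = 2.0, 0.5, 1.0        # hypothetical population parameters
xs = [0.5 * i for i in range(20)]  # x values held fixed across samples
x0 = 4.0                           # point at which we predict

a_hats, b_hats, y_hats = [], [], []
for _ in range(2000):
    # Each pass is a fresh sample from the same population model.
    ys = [a + b * x + rng.gauss(0.0, sigma) for x in xs]
    a_hat, b_hat = fit_line(xs, ys)
    a_hats.append(a_hat)
    b_hats.append(b_hat)
    y_hats.append(a_hat + b_hat * x0)

mean_a_hat = statistics.fmean(a_hats)
mean_b_hat = statistics.fmean(b_hats)
mean_y_hat = statistics.fmean(y_hats)

print(f"average a^ = {mean_a_hat:.3f} (a = {a})")
print(f"average b^ = {mean_b_hat:.3f} (b = {b})")
print(f"average Y^ at x0 = {mean_y_hat:.3f} (a + b*x0 = {a + b * x0})")
print(f"spread of a single b^ = {statistics.stdev(b_hats):.3f}")
```

Any single sample gives estimates that miss a and b by a bit (the last line shows how much the slope typically varies), but the averages over many samples should sit close to the true values.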

P.S. "Where has the error gone?" - the regression model is not perfect! We *assume* that the error is 0 on average, and our model is based on the assertion that it is. We might be dreadfully wrong about this though. In the sample regression the error shows up as the residuals, y_i - Y^_i, the gaps between each observed value and the fitted line. This is why regression analysis needs a lot of "quality controls" before it can be used for real-world inferences.
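One way to see where the error "goes" in a sample is to look at the residuals, the differences between each observed y and the fitted value. The sketch below again uses made-up parameters and a simulated sample (so, unlike with real data, the true errors are known and can be compared with the residuals).

```python
import random
import statistics

rng = random.Random(3)
a, b, sigma = 2.0, 0.5, 1.0        # hypothetical population parameters
xs = [0.5 * i for i in range(20)]

# True errors are only visible because we are simulating.
errors = [rng.gauss(0.0, sigma) for _ in xs]
ys = [a + b * x + e for x, e in zip(xs, errors)]

# Least-squares fit of a straight line.
x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b_hat = s_xy / s_xx
a_hat = y_bar - b_hat * x_bar

# Residuals: observed y minus fitted y. They estimate the unseen errors.
residuals = [y - (a_hat + b_hat * x) for x, y in zip(xs, ys)]
residual_sum = sum(residuals)
residual_dot_x = sum(x * r for x, r in zip(xs, residuals))
largest_gap = max(abs(r - e) for r, e in zip(residuals, errors))

print(f"sum of residuals        = {residual_sum:.2e}")
print(f"sum of x_i * residual_i = {residual_dot_x:.2e}")
print(f"largest |residual - true error| = {largest_gap:.3f}")
```

With an intercept in the model, least squares forces the residuals to sum to zero (up to rounding), which mirrors the assumption E(e) = 0. The residuals are close to the true errors but not equal to them, because a^ and b^ are only estimates of a and b.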

P.P.S. I'm studying mathematics at university, hence all the details. I'm not sure of your level, so if you need anything more explained just message me.
