Why is the regression equation called 'least squares line'?
Re: Why is the regression equation called 'least squares line'?
To see this, it's best to work from first principles just once. Take 5 pairs of values (x, y) that fit almost, but not exactly, on a straight line, say (1, 5), (2, 11), (3, 15), (4, 19), (5, 26). Work out the regression line in the normal manner, then use it to find the predicted value of y for each x from 1 to 5. Subtract each predicted value from the corresponding true value (5, 11, 15, 19, 26), square these differences and add them together (call the total S). No other line (for example y = 5x) will give a smaller value of S for these data.
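If you would rather check the arithmetic by machine, here is a minimal sketch in plain Python. The helper names `fit_line` and `sum_of_squares` are my own, and I am assuming the usual textbook formulas for the regression of y on x (slope = Sxy / Sxx, with the line passing through the point of means).

```python
# Data from the post: five (x, y) pairs lying close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [5, 11, 15, 19, 26]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    # The least squares line always passes through (x_bar, y_bar).
    return slope, y_bar - slope * x_bar


def sum_of_squares(slope, intercept, xs, ys):
    """S: sum of squared differences between true and predicted y."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


slope, intercept = fit_line(xs, ys)
s_fit = sum_of_squares(slope, intercept, xs, ys)
s_alt = sum_of_squares(5, 0, xs, ys)  # the rival line y = 5x

print(f"regression line: y = {slope:.2f}x + {intercept:.2f}")
print(f"S for regression line: {s_fit:.2f}")
print(f"S for y = 5x:          {s_alt:.2f}")
```

For these data the regression line works out as y = 5x + 0.2 with S = 2.8, against S = 3 for y = 5x. Try any other slope and intercept in `sum_of_squares` and S should come out larger than 2.8.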
Why this should be considered the "best" line is worth thinking about. It has three desirable properties, none of them mathematically sophisticated: we all get the same answer, it uses all the data, and it is easy to calculate.
Re: Why is the regression equation called 'least squares line'?
It means the method produces the line with the LOWEST total of the squared differences between the points and the line.
For simplicity, say we have two points 10 units apart. Although we could draw a line that goes through both points, suppose instead that we are only allowed a horizontal line lying somewhere between the two points:
We can make the line go through one point, so that it is 10 units away from the other. The sum of the squares is then 0^2+10^2=100.
But we could also draw a line that is 3 units from one point and 7 units from the other. This gives 3^2+7^2=58, which is a lower sum of squares.
However, we could produce a line that is 5 units from each point. This is the optimal solution as it gives us a sum of squares value of 5^2+5^2=50, the lowest achievable value.
So "least squares" just means that the line with the LOWEST sum of the squared differences is the one selected.
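The two-point example above can also be checked by brute force. This is only a sketch: the function name `two_point_sum_of_squares` and the step size of 0.5 are my own choices, and it assumes the line stays horizontal and between the two points.

```python
# Two points 10 units apart. A horizontal line between them sits
# d units from one point and (GAP - d) units from the other.
GAP = 10


def two_point_sum_of_squares(d, gap=GAP):
    """Sum of squared distances from the line to the two points."""
    return d ** 2 + (gap - d) ** 2


# Try every position between the points in steps of 0.5.
candidates = [i / 2 for i in range(2 * GAP + 1)]
best_d = min(candidates, key=two_point_sum_of_squares)

for d in (0, 3, 5):
    print(f"d = {d}: sum of squares = {two_point_sum_of_squares(d)}")
print(f"best d = {best_d}, sum of squares = {two_point_sum_of_squares(best_d)}")
```

The search settles on the halfway position, d = 5, with a sum of squares of 50, matching the values worked out by hand.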
Re: Why is the regression equation called 'least squares line'?
(Original post by BrightStarXXXpa)
Why this should be considered the "best" line is worth thinking about. It has three desirable properties, none of them mathematically sophisticated: we all get the same answer, it uses all the data, and it is easy to calculate.
I think you're jumping the gun a bit when referring to BLUE.