Why is the regression equation called 'least squares line'?

  1. sk007_644's Avatar
    • Overlord in Training
    • Location: South London
    Why is the regression equation called 'least squares line'?
    Please answer the question.
  2. TenOfThem's Avatar
    • TSR Royalty
    Re: Why is the regression equation called 'least squares line'?
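    Because you measure the vertical distances from the points to the line

    Then you square them (so that the negative distances, from points below the line, cannot cancel the positive ones)

    Then you add them

    Then you minimise this total to get the optimum line: the one with the least sum of squares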
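    The four steps above can be written as a single formula. This is only a sketch in the usual textbook notation, which the thread itself does not use: the line is assumed to be y = a + bx, the n data points are (x_i, y_i), and S, S_xy and S_xx are names introduced here for illustration.

```latex
% Sum of squared vertical distances from the points (x_i, y_i) to the line y = a + bx
S(a, b) = \sum_{i=1}^{n} \bigl( y_i - (a + b x_i) \bigr)^2

% Values of a and b that make S as small as possible
b = \frac{S_{xy}}{S_{xx}}, \qquad a = \bar{y} - b\,\bar{x},
\qquad \text{where } S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}), \quad
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2
```

    The regression line is the choice of a and b for which S is least, hence "least squares".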
  3. BrightStarXXXpa's Avatar
    • New Member
    • Posts: 4
    Re: Why is the regression equation called 'least squares line'?
    To see this, it's best to work from first principles just once. Take 5 pairs of values (x, y) that fit almost, but not exactly, on a straight line, say (1, 5), (2, 11), (3, 15), (4, 19), (5, 26). Work out the regression line in the normal manner, then find the values it predicts for x = 1 to x = 5. Subtract each predicted value from the corresponding true value (5, 11, 15, 19, 26), square these differences and add them together (call the total S). No other line (for example y = 5x) will give a smaller value of S for these data.
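    If you would rather check the five-point example by machine than by hand, here is a minimal Python sketch. The function names `fit_least_squares` and `sum_of_squares` are made up for illustration, and the line is assumed to be written as y = a + bx with b = Sxy / Sxx.

```python
# Data from the post: five (x, y) pairs that lie almost on a straight line.
xs = [1, 2, 3, 4, 5]
ys = [5, 11, 15, 19, 26]


def fit_least_squares(xs, ys):
    """Return (a, b) for the regression line y = a + b*x, using b = Sxy / Sxx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


def sum_of_squares(a, b, xs, ys):
    """S: the sum of squared differences between true and predicted y values."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


a, b = fit_least_squares(xs, ys)
s_regression = sum_of_squares(a, b, xs, ys)
s_alternative = sum_of_squares(0, 5, xs, ys)  # the line y = 5x from the post

print(f"regression line: y = {a:.1f} + {b:.1f}x, S = {s_regression:.2f}")
print(f"alternative line: y = 5x, S = {s_alternative:.2f}")
```

    For these data the regression line comes out as y = 0.2 + 5x with S of about 2.8, while y = 5x gives S = 3, so the regression line does have the smaller sum of squares.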
    Why this should be considered the "best" line is worth thinking about. It has three desirable properties, none of them mathematically sophisticated: we all get the same answer, it uses all the data, and it is easy to calculate.
  4. Jam''s Avatar
    • Exalted and Worshipped Member
    • Location: London
    • Posts: 1,105
    Re: Why is the regression equation called 'least squares line'?
    It means the method produces the line with the LOWEST total of the squared differences between the points and the line.

    For simplicity, say we have two points 10 units apart vertically. We could of course draw a line through both points, but suppose we are only allowed a horizontal line lying on or between them:

    We can draw the line through one point, so that it is 10 units away from the other. The sum of the squares is then 0^2+10^2=100;

    But we could also draw a line that is 3 units from one point and 7 units from the other. This gives a sum of squares of 3^2+7^2=58, which is lower.

    Finally, we could draw a line that is 5 units from each point. This is the optimal solution, as it gives a sum of squares of 5^2+5^2=50, the lowest value any such horizontal line can achieve.

    So it just means that the line with the LOWEST sum of the squares of the differences is the one selected.
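    The arithmetic in the two-point example can be reproduced with a short Python sketch. It assumes the two points sit at heights 0 and 10 (the post only says they are 10 units apart), and `sum_of_squares` is a name chosen here for illustration.

```python
# Assumed heights of the two points: 10 units apart, as in the post.
point_heights = [0, 10]


def sum_of_squares(line_height, heights):
    """Sum of squared vertical distances from each point to a horizontal line."""
    return sum((h - line_height) ** 2 for h in heights)


# The three horizontal lines discussed in the post.
for line_height in [0, 3, 5]:
    print(line_height, sum_of_squares(line_height, point_heights))

# Try every height from 0 to 10 in steps of 0.5 and keep the one with the least total.
candidates = [k / 2 for k in range(0, 21)]
best_height = min(candidates, key=lambda c: sum_of_squares(c, point_heights))
print("best height:", best_height)
```

    The loop prints totals of 100, 58 and 50 for the three lines, and the search over candidate heights settles on 5, halfway between the points.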
  5. .ACS.'s Avatar
    • Community Assistant
    • TSR Idol
    Re: Why is the regression equation called 'least squares line'?
    (Original post by BrightStarXXXpa)
    Why this should be considered the "best" line is worth thinking about. It has three desirable properties, none of them mathematically sophisticated: we all get the same answer, it uses all the data, and it is easy to calculate.
    I think you're jumping the gun a bit when referring to BLUE (best linear unbiased estimator).