
# Standard deviation

1. My phone is spazzing out at the moment... sorry. I'll get back to this thread after I try to sort this out.
2. 1.) [Attachment 459705]

2.) [Attachment 459707]
For standard deviation, there are two methods (using either of the two formulas above): the longer one (pic 1) and the shorter one (pic 2). How is it that they give me slightly different answers? Is that supposed to happen? Shouldn't the answers be the same?

3. (Original post by Questioness)
1.) [Attachment 459705]

2.) [Attachment 459707]
For standard deviation, there are two methods (using either of the two formulas above): the longer one (pic 1) and the shorter one (pic 2). How is it that they give me slightly different answers? Is that supposed to happen? Shouldn't the answers be the same?
Try replacing (n-1) by n in the first formula.
4. (Original post by metaltron)
Try replacing (n-1) by n in the first formula.
Wow, thank you. It's exactly the same now. How come the formulas say n-1?
5. (Original post by Questioness)
Wow, thank you. It's exactly the same now. How come the formulas say n-1?
In short, the two formulas have slightly different uses. I think you should use the first one (with the (n-1)), as this gives an estimate of the variance of X if you have taken a sample x_1, x_2, ..., x_n of values of X. I'll try to write out something explaining why the two formulas are different in a bit.
6. (Original post by Questioness)
Wow thank you. It's exactly the Same now. How come the formulas say n-1?
I'll talk about variance mainly, which is just the square of the standard deviation.

I'll first explain how you might derive a formula for the variance, which is designed to be a measure of how spread out your data is. Suppose we measure something n times, and we get values x_1, x_2, ..., x_n.
Let x(bar) = (x_1 + x_2 + ... + x_n)/n be the sample mean.

We want to work out how spread out the data is from the mean, so the quantities x_i - x(bar) will be important for 1 <= i <= n. So we could take the mean of these as a possible measure of spread (average difference between value and mean); however,

(x_1 - x(bar)) + (x_2 - x(bar)) + ... + (x_n - x(bar)) = 0

(persuade yourself that this is true!), so this is a rubbish measure of spread!

The easy fix is to take the average squared difference between each value and the sample mean, and then we do get a measure of spread, in that the higher the variance, the bigger the spread about the sample mean:

s_n^2 = [(x_1 - x(bar))^2 + (x_2 - x(bar))^2 + ... + (x_n - x(bar))^2] / n

Since we squared each value when we calculated the variance, the units are all wrong, which is why the standard deviation is also useful: taking the square root gives you back the correct units.

Now notice that:

(x_1 - x(bar))^2 + ... + (x_n - x(bar))^2 = (x_1^2 + x_2^2 + ... + x_n^2) - n * x(bar)^2

So we have the alternative formula:

s_n^2 = (x_1^2 + x_2^2 + ... + x_n^2)/n - x(bar)^2
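If you want to check that algebra numerically, here is a minimal Python sketch (the data values are made up purely for illustration):

```python
# Verify that the two forms of the sample variance agree:
# (1/n) * sum((x_i - xbar)^2)  ==  (1/n) * sum(x_i^2) - xbar^2
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up sample
n = len(xs)
xbar = sum(xs) / n  # sample mean = 5.0

long_form = sum((x - xbar) ** 2 for x in xs) / n
short_form = sum(x ** 2 for x in xs) / n - xbar ** 2

print(long_form, short_form)  # both give 4.0 for this data
```

The "short" form is handy on a calculator because you only need the running totals of x_i and x_i^2, not the deviations from the mean.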

So now suppose we have a random variable X. Then X will have a mean and a variance, call them μ and V. However, we don't know what the values of the mean and variance are. For example, suppose we are calculating average human height: we don't know exactly what the mean/variance is, we can only get a rough idea by taking a sample of humans.

So say we have a sample x_1, x_2, ..., x_n of values from X. Then if we want to estimate μ, it turns out that the sample mean x(bar) is a good estimate (in that its expected value is equal to μ). We say that x(bar) is an unbiased estimator of the mean.

Now let s_n^2 = [(x_1 - x(bar))^2 + ... + (x_n - x(bar))^2] / n be the sample variance. It turns out that this is not an unbiased estimator of the actual variance of X; it is expected to be slightly lower than the actual variance of X. This is because the sample mean is unlikely to be the actual mean, so the data is likely to be less spread about the sample mean (which is what the sample variance calculates) than about the actual mean.

However, it turns out that:

s^2 = [(x_1 - x(bar))^2 + ... + (x_n - x(bar))^2] / (n - 1)

is an unbiased estimator of the variance of X (i.e. E[s^2] = V).
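You can see this bias empirically with a small simulation. This is just a sketch using the standard library; the choice of a standard normal distribution (true variance V = 1) and the sample size n = 5 are arbitrary:

```python
import random

random.seed(0)  # reproducible runs

def average_variance_estimates(n, trials=20000):
    """Average the n-divisor and (n-1)-divisor variance estimates
    over many samples from a distribution with known variance V = 1."""
    total_n, total_n1 = 0.0, 0.0
    for _ in range(trials):
        xs = [random.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(xs) / n
        ss = sum((x - xbar) ** 2 for x in xs)  # sum of squared deviations
        total_n += ss / n          # sample variance s_n^2
        total_n1 += ss / (n - 1)   # unbiased estimator s^2
    return total_n / trials, total_n1 / trials

biased, unbiased = average_variance_estimates(n=5)
# biased tends to come out near (n-1)/n * V = 0.8,
# unbiased near the true variance V = 1
print(biased, unbiased)
```

The n-divisor average consistently undershoots the true variance by roughly the factor (n-1)/n, which is exactly what Bessel's correction fixes.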

Therefore if you have a sample x_1, x_2 , ... , x_n of values from X you should:

1) Use the formula with (n-1) if you want to estimate the variance of X
2) Use the formula with n if you want to calculate the sample variance.

(As it happens, s = sqrt(s^2) will not give an unbiased estimate for the standard deviation of X, since the square root function is non-linear. However, it is actually impossible to find an unbiased estimator for the standard deviation in the same way we did for the variance, so square rooting s^2 is about as good as we can do to estimate the standard deviation of X.)
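Putting it all together, this sketch (with made-up measurements) reproduces the original puzzle: the two divisors give slightly different standard deviations, and the two variances differ by exactly the factor (n-1)/n:

```python
import math

xs = [12.0, 15.0, 11.0, 14.0, 13.0]  # hypothetical measurements
n = len(xs)
xbar = sum(xs) / n
ss = sum((x - xbar) ** 2 for x in xs)  # sum of squared deviations

sd_n = math.sqrt(ss / n)         # "shorter" formula: divide by n
sd_n1 = math.sqrt(ss / (n - 1))  # "longer" formula: divide by n - 1

# The two answers differ slightly, exactly as in the question:
print(sd_n, sd_n1)
# and the two variances differ by the factor (n-1)/n:
print((ss / n) / (ss / (n - 1)))  # (n-1)/n = 0.8 here
```

For large n the factor (n-1)/n is close to 1, so the two answers are nearly identical; the difference only really matters for small samples.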
7. (Original post by metaltron)
...
Thank you so much for this explanation. Really well explained, this clears up pretty much all the confusion I had and more!

Updated: September 7, 2015