# Sample and population Variance and SD

1. I am reposting this part of an answer I gave in order to get an explanation from a stats expert (hopefully).

My confusion is with this Wikipedia entry where variance is said to be unbiased whereas SD is biased. The entry says

"...the sample variance is an unbiased estimator for the population variance, but its square root, the sample standard deviation, is a biased estimator for the population standard deviation."

My understanding is that sample bias comes about due to the increased probability that a sample is near the mean in a sample of a distribution than is the case with a full population. The n-1 compensates for that.

In the Wikipedia entry I can't see how the variance can be unbiased whilst its root is then biased. Is this correct? In stats it is so often a case of what words mean rather than what numbers mean.
2. (Original post by nerak99)
> I am reposting this part of an answer I gave in order to get an explanation from a stats expert (hopefully).
>
> My confusion is with this Wikipedia entry where variance is said to be unbiased whereas SD is biased. The entry says
>
> "...the sample variance is an unbiased estimator for the population variance, but its square root, the sample standard deviation, is a biased estimator for the population standard deviation."

Yes, this is one of those nasty little facts that creep up behind you and bop you on the head. One thing it tells you is that finding unbiased estimators can be hard. A second thing, less often appreciated, is that an unbiased estimator may not be the nirvana one is searching for: in predictive estimation, it is not unusual for a biased predictor to give a lower RMS prediction error than an unbiased one.
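
To put some numbers on that last point, here is a small sketch in a slightly different setting from prediction: estimating the variance of Normal data. It assumes n i.i.d. Normal observations, for which the sum of squared deviations from the sample mean, S, has E[S] = (n-1)σ² and Var[S] = 2(n-1)σ⁴. The function name `mse_over_sigma4` and the choice n = 10 are mine, purely for illustration.

```python
from fractions import Fraction


def mse_over_sigma4(n: int, d: int) -> Fraction:
    """Mean squared error of S/d as an estimator of sigma^2, in units of sigma^4.

    Assumes n i.i.d. Normal observations, with S the sum of squared
    deviations from the sample mean, so that E[S] = (n-1)*sigma^2 and
    Var[S] = 2*(n-1)*sigma^4.
    """
    variance = Fraction(2 * (n - 1), d * d)  # Var[S/d] / sigma^4
    bias = Fraction(n - 1, d) - 1            # (E[S/d] - sigma^2) / sigma^2
    return variance + bias ** 2              # MSE = variance + bias^2


n = 10  # illustrative sample size
for d in (n - 1, n, n + 1):
    mse = mse_over_sigma4(n, d)
    print(f"divisor {d}: MSE/sigma^4 = {mse} (about {float(mse):.4f})")
```

Only the divisor n-1 gives an unbiased estimator, yet under these assumptions the biased divisors n and n+1 both have a smaller mean squared error.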

It's a nice problem in mathematical statistics (which is sometimes used to torture students) to show what the expected value for the sample standard deviation is when dealing with Normal random variables.
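
For anyone who wants to check their answer to that problem: for n i.i.d. Normal observations, the standard result is $E[s] = c(n)\,\sigma$ with $c(n) = \sqrt{2/(n-1)}\;\Gamma(n/2)/\Gamma((n-1)/2)$, where s uses the n-1 divisor. The sketch below just evaluates that factor; the function name `c4` is a label I have picked, not something from the thread.

```python
import math


def c4(n: int) -> float:
    """E[s] / sigma for n i.i.d. Normal observations (s uses the n-1 divisor)."""
    if n < 2:
        raise ValueError("need at least two observations")
    # Work with log-gamma so that large n does not overflow.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


for n in (2, 5, 10, 100):
    print(f"n = {n:3d}: E[s] is about {c4(n):.4f} * sigma")
```

The factor is below 1 for every n, so s underestimates σ on average, and it approaches 1 as n grows.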

> My understanding is that sample bias comes about due to the increased probability that a sample is near the mean in a sample of a distribution than is the case with a full population. The n-1 compensates for that.

I'm not sure what you mean by saying that a sample is "near the mean": the expected value of the empirical distribution function of a random sample is the population distribution function. The usual explanation of the n-1 correction is that the sample standard deviation is calculated using the sample mean in place of the unknown population mean. The sample mean minimises the sum of squared deviations for that particular sample, so the sum comes out too small on average, and dividing by n-1 rather than n corrects for this.
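
The effect of the correction can be checked exactly on a toy example. The sketch below assumes a population that is a fair die and samples of size 3 drawn with replacement (both are my choices for illustration), enumerates every possible sample, and averages the two versions of the variance estimate.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]  # a fair die, chosen only as an illustration
n = 3                            # sample size, also just an illustration


def average(values):
    """Exact arithmetic mean of a sequence of ints or Fractions."""
    return sum(values, Fraction(0)) / len(values)


def sum_sq_dev(xs):
    """Sum of squared deviations of xs from their own mean."""
    m = average(xs)
    return sum(((x - m) ** 2 for x in xs), Fraction(0))


pop_var = sum_sq_dev(population) / len(population)

# Every ordered sample of size n drawn with replacement is equally likely.
samples = list(product(population, repeat=n))
avg_div_n = average([sum_sq_dev(s) / n for s in samples])
avg_div_n_minus_1 = average([sum_sq_dev(s) / (n - 1) for s in samples])

print("population variance:      ", pop_var)
print("average with divisor n:   ", avg_div_n)
print("average with divisor n-1: ", avg_div_n_minus_1)
```

Averaged over all samples, the n-1 version reproduces the population variance exactly, while the n version falls short by the factor (n-1)/n.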

> In the Wikipedia entry I can't see how the variance can be unbiased whilst its root is then biased. Is this correct? In stats it is so often a case of what words mean rather than what numbers mean.

The basic point here is that $E[f(X)]$ need not be equal to $f(E[X])$ for a general function $f$. Just write the equations down and you'll see that there's no reason to expect them to be equal.

If $f$ is linear, all is sweetness and light, and if the distribution of $X$ is specified, it's sometimes possible to calculate how these two will differ.
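
A tiny worked example may help here. It assumes a made-up random variable X that takes the values 1, 4 and 9 with equal probability (my choice, so the square roots are tidy) and compares the expectation of f(X) with f applied to the expectation of X, once for the square root and once for a linear function.

```python
import math
from fractions import Fraction

# A hypothetical X taking each of these values with probability 1/3.
values = [1, 4, 9]


def expectation(f):
    """E[f(X)] for X uniform on `values`."""
    return sum(f(x) for x in values) / len(values)


def linear(x):
    """A linear function, f(x) = 3x + 2, evaluated exactly."""
    return 3 * Fraction(x) + 2


mean_x = expectation(Fraction)      # E[X], kept as an exact fraction

# Non-linear f: the square root.
e_of_sqrt = expectation(math.sqrt)  # E[sqrt(X)]
sqrt_of_e = math.sqrt(mean_x)       # sqrt(E[X])

# Linear f: the two orders of operation agree.
e_of_linear = expectation(linear)   # E[f(X)]
linear_of_e = linear(mean_x)        # f(E[X])

print("E[sqrt(X)] =", e_of_sqrt, " sqrt(E[X]) =", round(sqrt_of_e, 4))
print("E[3X + 2]  =", e_of_linear, " 3E[X] + 2  =", linear_of_e)
```

The square root is concave, so by Jensen's inequality the expectation of the root can never exceed the root of the expectation. That is the mechanism by which the square root of an unbiased variance estimator ends up underestimating the standard deviation on average.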

Updated: November 12, 2016