Sample and population Variance and SD

nerak99
#1
Thread starter, 4 years ago
I am reposting this part of an answer I gave in order to get an explanation from a stats expert (hopefully).

My confusion is with this wikipedia entry where variance is said to be unbiased whereas SD is biased. The entry says

"...the sample variance is an unbiased estimator for the population variance, but its square root, the sample standard deviation, is a biased estimator for the population standard deviation."

My understanding is that sample bias comes about due to the increased probability that a sample is near the mean in a sample of a distribution than is the case with a full population. The n-1 compensates for that.

In the wikipedia entry I can't see how the variance can be unbiased whilst its root is then biased. Is this correct? In stats it is so often a case of what words mean rather than what numbers mean.
Gregorius
#2
4 years ago
(Original post by nerak99)
I am reposting this part of an answer I gave in order to get an explanation from a stats expert (hopefully).

My confusion is with this wikipedia entry where variance is said to be unbiased whereas SD is biased. The entry says

"...the sample variance is an unbiased estimator for the population variance, but its square root, the sample standard deviation, is a biased estimator for the population standard deviation."
Yes, this is one of those nasty little facts that creeps up behind you and bops you on the head. One of the things that it tells you is that finding unbiased estimators for things can be hard; a second thing that is less often appreciated is that unbiased estimators may not be the nirvana one is searching for. If one is doing predictive estimation, it is not unusual for a biased predictor to give a lower RMS prediction error than an unbiased one.

It's a nice problem in mathematical statistics (which is sometimes used to torture students) to show what the expected value for the sample standard deviation is when dealing with Normal random variables.
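For reference, the standard result for independent Normal observations is stated below without derivation. The notation (S for the sample standard deviation with the n-1 divisor, sigma for the population standard deviation, Gamma for the gamma function) is my own choice for this note, not something used earlier in the thread.

```latex
S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2,
\qquad
\mathbb{E}[S] = \sigma\,\sqrt{\frac{2}{n-1}}\;
\frac{\Gamma\!\left(\frac{n}{2}\right)}{\Gamma\!\left(\frac{n-1}{2}\right)} < \sigma .
```

The factor multiplying sigma tends to 1 as n grows, so the bias of S shrinks with sample size but never vanishes for finite n.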

My understanding is that sample bias comes about due to the increased probability that a sample is near the mean in a sample of a distribution than is the case with a full population. The n-1 compensates for that.
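I'm not sure what you mean by saying that a sample is "near the mean" - the expected value of the empirical distribution function of a random sample is the population distribution function. The usual explanation of the n-1 correction is that you're using the sample mean to estimate the population mean in the calculation of the sample standard deviation - and that needs a correction to take it into account. The sample mean is the value that minimises the sum of squared deviations for that particular sample, so the sum is on average smaller than it would be if the true population mean were used; dividing by n-1 rather than n makes up the difference for the variance.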
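The effect of the n-1 correction can be checked by simulation. The following is a minimal sketch of my own, not something from the thread: the function name, the sample size of 5, the Normal distribution with sigma = 2, and the seed are all arbitrary assumptions chosen for illustration.

```python
import random

def average_variance_estimates(n=5, trials=20000, sigma=2.0, seed=0):
    """Average the divide-by-n and divide-by-(n-1) variance estimates
    over many independent samples of size n."""
    rng = random.Random(seed)
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        # Squared deviations are measured from the SAMPLE mean,
        # because the population mean is treated as unknown.
        ss = sum((x - mean) ** 2 for x in sample)
        total_div_n += ss / n
        total_div_n_minus_1 += ss / (n - 1)
    return total_div_n / trials, total_div_n_minus_1 / trials

biased, unbiased = average_variance_estimates()
print(f"true variance 4.0 | divide by n: {biased:.3f} | divide by n-1: {unbiased:.3f}")
```

With these settings the divide-by-n average should come out close to (n-1)/n times the true variance, i.e. near 3.2, while the divide-by-(n-1) average should sit close to 4.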

In the wikipedia entry I can't see how the variance can be unbiased whilst its root is then biased. Is this correct? In stats it is so often a case of what words mean rather than what numbers mean.
The basic point here is that $\mathbb{E}[f(X)]$ need not be equal to $f(\mathbb{E}[X])$ for a general function $f$. Just write the equations down and you'll see that there's no reason to expect them to be equal.

If f is linear, all is sweetness and light, and if X is specified, it's sometimes possible to calculate how these two will differ.
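In the case asked about, f is the square root, which is concave, so by Jensen's inequality the expected sample standard deviation falls below the square root of the expected sample variance. A small simulation makes this visible. It is a sketch under my own assumptions (function name, Normal data with sigma = 2, samples of size 5, fixed seed), not anything taken from the Wikipedia entry.

```python
import math
import random

def average_s2_and_s(n=5, trials=20000, sigma=2.0, seed=1):
    """Average the sample variance (n-1 divisor) and its square root
    over many independent samples of size n."""
    rng = random.Random(seed)
    total_s2 = 0.0
    total_s = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
        total_s2 += s2             # estimates E[S^2]
        total_s += math.sqrt(s2)   # estimates E[sqrt(S^2)] = E[S]
    return total_s2 / trials, total_s / trials

mean_s2, mean_s = average_s2_and_s()
print(f"average S^2: {mean_s2:.3f} (true variance 4.0)")
print(f"average S:   {mean_s:.3f} (true SD 2.0)")
```

The average of S^2 should land close to 4, while the average of S should land noticeably below 2: taking the square root before averaging is not the same as averaging and then taking the square root.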