Sample and population Variance and SD

nerak99
Thread starter 2 years ago
#1
I am reposting this part of an answer I gave in order to get an explanation from a stats expert (hopefully).

My confusion is with this Wikipedia entry, where variance is said to be unbiased whereas SD is biased. The entry says:

"...the sample variance is an unbiased estimator for the population variance, but its square root, the sample standard deviation, is a biased estimator for the population standard deviation."

My understanding is that sample bias comes about due to the increased probability that a sample is near the mean in a sample of a distribution than is the case with a full population. The n-1 compensates for that.

In the Wikipedia entry I can't see how the variance can be unbiased whilst its root is then biased. Is this correct? In stats it is so often a case of what words mean rather than what numbers mean.

Gregorius
2 years ago
#2
(Original post by nerak99)
I am reposting this part of an answer I gave in order to get an explanation from a stats expert (hopefully).

My confusion is with this Wikipedia entry, where variance is said to be unbiased whereas SD is biased. The entry says:

"...the sample variance is an unbiased estimator for the population variance, but its square root, the sample standard deviation, is a biased estimator for the population standard deviation."

Yes, this is one of those nasty little facts that creeps up behind you and bops you on the head. One of the things that it tells you is that finding unbiased estimators for things can be hard; a second thing that is less often appreciated is that unbiased estimators may not be the nirvana one is searching for. If one is doing predictive estimation, it is not unusual for a biased predictor to give a lower RMS prediction error than an unbiased one.
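
A closely related illustration, from estimation rather than prediction: for Normal data the sum of squared deviations SS satisfies SS/sigma^2 ~ chi-squared(n-1), so the mean squared error of the estimator SS/c can be written down exactly. The sketch below assumes Normal data; the function name is made up for this example.

```python
from fractions import Fraction

def mse_over_sigma4(n, c):
    """MSE of SS/c as an estimator of sigma^2, in units of sigma^4.

    Assumes n iid Normal observations, so SS/sigma^2 ~ chi-squared(n-1),
    which has mean n-1 and variance 2(n-1).
    """
    k = n - 1
    variance = Fraction(2 * k, c * c)   # Var(SS/c) / sigma^4
    bias = Fraction(k, c) - 1           # (E[SS/c] - sigma^2) / sigma^2
    return variance + bias ** 2

n = 10
for c in (n - 1, n, n + 1):
    print(f"divide by {c}: MSE/sigma^4 = {mse_over_sigma4(n, c)}")
```

Dividing by n-1 is the only unbiased choice here, yet under these assumptions dividing by n+1 gives the smallest mean squared error of the three.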

It's a nice problem in mathematical statistics (which is sometimes used to torture students) to show what the expected value for the sample standard deviation is when dealing with Normal random variables.
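
For reference, the standard answer for n independent Normal observations is E[s] = c4(n) * sigma, where c4(n) = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2). A minimal sketch that evaluates this factor (the name c4 is the usual quality-control label, not something from the Wikipedia quote):

```python
import math

def c4(n):
    """E[s] / sigma for the (n-1)-divisor sample SD of n iid Normal values."""
    if n < 2:
        raise ValueError("need at least two observations")
    # Work with log-gamma so the ratio stays stable for large n.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

for n in (2, 5, 10, 100):
    print(n, c4(n))
```

The factor is always below 1, so s underestimates sigma on average, and it tends to 1 as n grows.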

(Original post by nerak99)
My understanding is that sample bias comes about due to the increased probability that a sample is near the mean in a sample of a distribution than is the case with a full population. The n-1 compensates for that.
I'm not sure what you mean by saying that a sample is "near the mean" - the expected value of the empirical distribution function of a random sample is the population distribution function. The usual explanation of the n-1 correction is that you're using the sample mean to estimate the population mean in the calculation of the sample variance - and that needs a correction to take it into account.
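
The effect of the correction can be checked exactly on a toy case. The sketch below assumes a made-up three-value population and samples of size 2 drawn with replacement, enumerates every possible sample, and averages the two candidate variance estimates.

```python
from fractions import Fraction
from itertools import product

population = [0, 1, 2]   # hypothetical toy population
n = 2                    # sample size, drawn with replacement

def mean(xs):
    return Fraction(sum(xs), len(xs))

mu = mean(population)
pop_var = mean([(x - mu) ** 2 for x in population])

def avg_estimate(divisor):
    """Average of SS/divisor over all equally likely samples of size n."""
    samples = list(product(population, repeat=n))
    total = Fraction(0)
    for s in samples:
        m = mean(s)  # sample mean stands in for the unknown population mean
        total += sum((x - m) ** 2 for x in s) / divisor
    return total / len(samples)

print("population variance:", pop_var)
print("average with n-1   :", avg_estimate(n - 1))
print("average with n     :", avg_estimate(n))
```

Deviations are measured from the sample mean, which sits closer to the sample values than the population mean does, so dividing by n comes out too small on average while dividing by n-1 matches the population variance.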

(Original post by nerak99)
In the Wikipedia entry I can't see how the variance can be unbiased whilst its root is then biased. Is this correct? In stats it is so often a case of what words mean rather than what numbers mean.
The basic point here is that \mathbb{E}[f(X)] need not be equal to f(\mathbb{E}[X]) for a general function f. Just write the equations down and you'll see that there's no reason to expect them to be equal.
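
For the square root in particular, the direction of the bias follows in a couple of lines. Writing S for the sample standard deviation and \sigma for the population one (notation assumed here), and using only that S^2 is unbiased and that a variance cannot be negative:

```latex
\begin{align*}
\operatorname{Var}(S) &= \mathbb{E}[S^2] - \bigl(\mathbb{E}[S]\bigr)^2 \ge 0, \\
\mathbb{E}[S^2] &= \sigma^2
  \quad\Longrightarrow\quad
  \bigl(\mathbb{E}[S]\bigr)^2 = \sigma^2 - \operatorname{Var}(S) \le \sigma^2, \\
\mathbb{E}[S] &\le \sigma,
  \quad\text{with strict inequality whenever } S \text{ is not constant.}
\end{align*}
```

So the sample standard deviation underestimates \sigma on average, even though its square is exactly right on average.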

If f is linear, all is sweetness and light, and if the distribution of X is specified, it's sometimes possible to calculate how these two will differ.
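
A small check of both cases, assuming a made-up X that takes the values 1, 4 and 9 with equal probability:

```python
import math
from fractions import Fraction

values = [1, 4, 9]  # hypothetical equally likely outcomes of X

def expect(f):
    """E[f(X)] for X uniform on `values`."""
    return sum(f(x) for x in values) / len(values)

e_x = expect(Fraction)  # E[X], kept exact

# Linear f: E[2X + 1] equals 2 E[X] + 1 exactly.
lin_lhs = expect(lambda x: 2 * Fraction(x) + 1)
lin_rhs = 2 * e_x + 1

# Non-linear f: E[sqrt(X)] differs from sqrt(E[X]).
sqrt_lhs = expect(math.sqrt)
sqrt_rhs = math.sqrt(e_x)

print("linear:", lin_lhs, "vs", lin_rhs)
print("sqrt  :", sqrt_lhs, "vs", sqrt_rhs)
```

The linear pair agrees, while the expected square root falls below the square root of the expectation, which is the same direction as the bias in the sample standard deviation.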