# Converting between Variances in S2


Hey all, I do OCR so I'm not sure if you're required to do it in other specs.

Basically I am very unsure when to use the different conversions of variances between the population and sample. I'm not even sure whether I fully grasp the concepts of which variance represents what.

The ones that I am aware of so far are

s^2 = Var(Xbar) * n/(n-1)

and

Var(Xbar) = sigma^2 /n

I hope these look familiar since I'm not even sure whether I'm using the right notation.

I thought I had a grasp of these concepts until recently, when a question gave the sample variance. The mark scheme then wanted you to find the population variance and then convert back to the sample variance, using the two different equations. But if you're converting there and back, then what's the point in converting?

My teacher didn't explain these concepts properly and I'm struggling to find any explanations online, so I would really appreciate it if someone could explain what each variance represents and why and when you use each one.

Thank you!


#2

(Original post by **Quarkboi**) Hey all, I do OCR so I'm not sure if you're required to do it in other specs. [...]

Could u post the question?


#3

(Original post by **Shaanv**) Could u post the question?

It's Question 5 here http://www.ocr.org.uk/Images/136159-...atistics-2.pdf

#4

(Original post by **Quarkboi**) It's Question 5 here http://www.ocr.org.uk/Images/136159-...atistics-2.pdf

I think you're misreading the mark scheme. It shows that you get B2 B1 M1 M1 A1, and then there are two different ways of getting the next M1 A1 A1.

In the first way, you get M1 for "z = (6.2-6.1)/sqrt(0.643/80)", and then A1 for correctly working this out as 1.115, or for correctly working out the probability as 0.1325; then the final A1 is for either comparing z: 1.115 < 1.645, or for comparing p: 0.1325 > 0.05. In this way, you can see that the first way divides into two sub-methods: one using z, and the other using p.

Then there's the second way, in which you get M1 for "6.1 + 1.645*sqrt(0.643/80)", A1 for working this out as 6.247, and the final A1 for comparing 6.2 < 6.247.

You do only one of these two ways, not both, so you don't "convert and then convert back". After scoring this M1 A1 A1 in one of the two ways, you then get the final M1 A1 for the conclusion.
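If it helps to see the arithmetic, here's a small Python sketch of both methods (the numbers 6.1, 6.2, 0.643 and 80 are taken from the working quoted above; the normal CDF is built from `math.erf`, which isn't how the exam expects you to do it, of course):

```python
from math import sqrt, erf

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu0, xbar, var, n = 6.1, 6.2, 0.643, 80   # H0 mean, sample mean, variance estimate, sample size
se = sqrt(var / n)                         # standard error of the sample mean

# Method 1: test statistic z (or equivalently the p-value) against the 5% level
z = (xbar - mu0) / se
p = 1 - phi(z)
print(round(z, 3))   # 1.115, which is < 1.645
print(round(p, 3))   # about 0.132, which is > 0.05

# Method 2: critical value on the original scale
crit = mu0 + 1.645 * se
print(round(crit, 3))   # 6.247, and 6.2 < 6.247
```

Either route leads to the same "do not reject" conclusion, which is why the mark scheme only requires one of them.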


#5

There are quite a few comparisons of sample and population variance online; just google and see which is the most readable for you.

I always think about what happens when you have one or two points in a sample.

When you have a single point, N = 1:

* Sample variance: not defined, because you would divide by N - 1 = 0. This is because you use the single point to estimate the mean, and there is nothing left over to estimate the variance with - the "error" from the mean, (x - xbar), is zero because the mean and the point are the same.

* Population variance: you know the mean mu, so the single point can be used to calculate (x - mu)^2, and as only a single point is used in the summation, you divide by 1.

When you have two points, N = 2:

* Sample variance: the mean is the average of the two points. The two "errors" from the mean are always equal in size (and opposite in sign), so really you have just one independent bit of information when you estimate the variance, hence you divide by N - 1 = 1.

* Population variance: you know the mean, the two points are independent, and so are the squared errors (x_i - mu)^2, so to estimate the variance you add them up and divide by 2.

Not necessarily rigorous, but it's easy to remember and sounds plausible.
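Python's statistics module happens to make exactly this distinction, if you want to poke at the N = 1 and N = 2 cases yourself (the data values here are made up for illustration):

```python
import statistics

data = [4.0, 6.0]   # two points, mean = 5

# Sample variance divides by N - 1; population variance divides by N.
print(statistics.variance(data))    # ((4-5)^2 + (6-5)^2) / (2-1) = 2.0
print(statistics.pvariance(data))   # ((4-5)^2 + (6-5)^2) / 2     = 1.0

# With a single point, nothing is left after estimating the mean:
try:
    statistics.variance([4.0])
except statistics.StatisticsError:
    print("sample variance undefined for N = 1")

# But if the population mean is known (say mu = 5), one point is enough:
print(statistics.pvariance([4.0], mu=5.0))   # (4 - 5)^2 / 1 = 1.0
```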


#6

(Original post by **Quarkboi**) Basically I am very unsure when to use the different conversions of variances between the population and sample. I'm not even sure whether I fully grasp the concepts of which variance represents what.

A good source for the details is this subsection of the Wikipedia article on variance: https://en.wikipedia.org/wiki/Variance#Population_variance_and_sample_variance. Let me just say a few words about the concepts.

The idea of the population variance is to give you a measure of the spread of the data around its mean value. Hence you get the standard formula of a sum (or integral) of squared deviations from the mean. If you are given the fully specified probability distribution of the population, then it's a matter of algebra to work out the population variance.

If you draw a sample from the population, then the principles are the same - you can calculate the sample mean and the sample variance as measures of the location and spread of the sample. But what if you wanted to use your sample to estimate things about the population? Here you're in the situation where you don't know the underlying population distribution at all - all you have is the sample. What you'd really like are unbiased estimators of the underlying population parameters. What does this mean? It means that if you were able to take repeated samples from the underlying population, and for each of these samples calculate estimates of the population mean and variance, then the mean value of those estimates would equal the underlying population mean and variance respectively.

For an unbiased estimator of the mean, life is simple: the mean of the sample is an unbiased estimator of the population mean. For the variance, the situation is a little more tricky, and the trickiness stems from the fact that when you work out the sample variance, you have to use the sample mean as an estimate of the population mean in the variance formula. When you work through the algebra, it turns out that you need the factor of n/(n - 1) to make the estimate of the population variance unbiased.
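You can actually see the bias in a quick simulation (nothing to do with the exam; the sample size and trial count here are arbitrary choices). Dividing by n systematically underestimates the population variance, while dividing by n - 1 gets it right on average:

```python
import random

random.seed(0)
n, trials = 5, 200_000
# Sample repeatedly from a standard normal population, true variance = 1.

biased_sum = unbiased_sum = 0.0
for _ in range(trials):
    sample = [random.gauss(0, 1) for _ in range(n)]
    m = sum(sample) / n                        # sample mean
    ss = sum((x - m) ** 2 for x in sample)     # sum of squared deviations
    biased_sum += ss / n                       # divide by n
    unbiased_sum += ss / (n - 1)               # divide by n - 1

print(biased_sum / trials)     # about 0.8 = (n-1)/n: biased low
print(unbiased_sum / trials)   # about 1.0: unbiased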


#7

(Original post by **Gregorius**) A good source for the details is this subsection of the Wikipedia article on variance. [...]

(Original post by **Prasiortle**) I think you're misreading the mark scheme. [...]

Thanks, this helps a lot. One thing I'd like to know is: what is the difference between the unbiased estimate of the variance and Var(Xsample) = sigma^2 / n?

#8

(Original post by **Quarkboi**) Thanks, this helps a lot. One thing I'd like to know is: what is the difference between the unbiased estimate of the variance and Var(Xsample) = sigma^2 / n?

Above I talked about estimating the variance of the underlying population by using the variance of a sample. Now we're talking about the variance of the sample mean itself, which is what gives you the formula Var(Xbar) = sigma^2 / n.

So, if you draw a random sample from an underlying population, you can calculate the mean of that sample. Now do it again with a different sample, and again and again and again... Each time you will get a value for the mean of that particular sample, and these means are likely to all be slightly different. In other words, the means of these samples have their own **sampling distribution**. The variance of that sampling distribution, which quantifies how spread out those means are, is sigma^2 / n.
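This "repeat the sampling over and over" picture is easy to simulate (the population values here are made up for illustration): draw lots of samples, record each sample's mean, and look at the variance of those means.

```python
import random
import statistics

random.seed(1)
n, trials = 16, 100_000

# Population: normal with sigma = 2, so sigma^2 = 4.
means = []
for _ in range(trials):
    sample = [random.gauss(0, 2) for _ in range(n)]
    means.append(sum(sample) / n)   # one sample mean per trial

# The means have their own spread: Var(Xbar) = sigma^2 / n = 4/16 = 0.25.
print(statistics.pvariance(means))   # close to 0.25
```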

#9

(Original post by **Gregorius**) Above I talked about estimating the variance of the underlying population by using the variance of a sample. Now we're talking about the variance of the sample mean itself. [...]

Oh I see! Thanks a lot. Would you mind explaining why we must use this step in the example I provided? I now understand what it is, but not exactly why we use it (especially why we use it in conjunction with the unbiased estimate of the variance).

#10

(Original post by **Quarkboi**) Oh I see! Thanks a lot. Would you mind explaining why we must use this step in the example I provided? I now understand what it is, but not exactly why we use it (especially why we use it in conjunction with the unbiased estimate of the variance).
