The Student Room Group

Grouped Variance

Starting my stats course and it gives the formula for variance - but I don't understand how they've formed grouped variance in the example attached?

I've literally put the numbers in my calculator and it's given a different value for variance.

Also not quite sure what the difference between grouped and ungrouped is? Thought it was just data in bins vs raw data...
Reply 1
There are two equivalent formulae for the usual variance:
https://en.wikipedia.org/wiki/Variance#Definition
This one is based on
E(X^2) - (E(X))^2
with the extra bit (weighting each value by its frequency) for the grouping.

Grouping is as you say: you treat subsets of the data as if they're in single bins.
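
As a rough sketch of what the frequency weighting does, here is the E(X^2) - (E(X))^2 form in Python. The function name `grouped_variance` and the data are made up for illustration; they are not the numbers from the attachment.

```python
from statistics import pvariance


def grouped_variance(values, freqs):
    """Population variance of grouped data via E(X^2) - (E(X))^2."""
    n = sum(freqs)                                               # total frequency
    mean = sum(f * x for x, f in zip(values, freqs)) / n         # E(X)
    mean_sq = sum(f * x * x for x, f in zip(values, freqs)) / n  # E(X^2)
    return mean_sq - mean ** 2


# Made-up data (an assumption, not the attachment's table)
values = [1, 2, 3]
freqs = [1, 2, 1]

grouped = grouped_variance(values, freqs)

# The same data written out one value at a time
raw = [x for x, f in zip(values, freqs) for _ in range(f)]
print(grouped, pvariance(raw))
```

Both numbers agree here because each value is exact. With genuinely binned data you would use the class midpoints as `values`, so the grouped answer is only an estimate of the raw-data variance.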
Reply 2
Just not quite sure how the second bit in brackets was squared? Like why is the denominator also squared?

Edit: I assume it somehow works out when you sub the expression for the mean into the original formula?
Reply 3
The formula
E(X^2) - (E(X))^2
is a standard one for variance, and you must have come across it or derived it (it's done in the wiki link)? E() is the expected value, or simply the average when you have a finite data set. The second term is
(E(X))^2
so it's simply the square of the mean. You could use the 156.4 you worked out and square it, or think of it as the numerator and denominator of E(X) both squared before dividing. The former is easier.
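
Written out, and assuming the usual notation where x is the value (or class midpoint) and f its frequency (the attachment's own symbols may differ), squaring the mean squares the whole fraction, which is why the denominator ends up squared too:

```latex
\[
\bigl(E(X)\bigr)^2
  = \left(\frac{\sum f x}{\sum f}\right)^{2}
  = \frac{\left(\sum f x\right)^{2}}{\left(\sum f\right)^{2}},
\qquad\text{so}\qquad
\operatorname{Var}(X)
  = \frac{\sum f x^{2}}{\sum f}
  - \frac{\left(\sum f x\right)^{2}}{\left(\sum f\right)^{2}} .
\]
```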
Reply 4
I've seen it as attached... but how come one version uses the mean and one doesn't?
Reply 5
Are you asking whether it's "n" or "n-1", or something else? The wiki link is based on the expected value, which "hides" whether you divide by "n" or "n-1" when calculating the mean. In your attachment
var = s_xx / n
(dividing by n just for ease), so the formula at the bottom is
E(X^2) - (E(X))^2
as per the wiki link.
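
To see why the version with the mean and the version without it are the same thing, here is a short derivation. It assumes the standard definition of S_xx as the sum of squared deviations from the mean, which may not match the attachment's notation exactly:

```latex
\[
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^{2}
       = \sum x_i^{2} - 2\bar{x}\sum x_i + n\bar{x}^{2}
       = \sum x_i^{2} - n\bar{x}^{2}
\quad\text{since}\quad \sum x_i = n\bar{x},
\]
\[
\frac{S_{xx}}{n} = \frac{\sum x_i^{2}}{n} - \bar{x}^{2}
                 = E(X^{2}) - \bigl(E(X)\bigr)^{2}.
\]
```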
Original post by vitc83
Starting my stats course and it gives the formula for variance - but I don't understand how they've formed grouped variance in the example attached?

I've literally put the numbers in my calculator and it's given a different value for variance.

I think you'll find it's a rounding error. If you repeat the given calculation using 156.379 (rather than 156.4) as the mean, you should end up with a variance closer to your calculator value.
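
A quick check of how much that rounding matters, using only the two mean values quoted above (the variable names are just for illustration). Since the variance is E(X^2) minus the square of the mean, any change in the squared mean feeds straight through to the variance:

```python
rounded_mean = 156.4     # mean as rounded in the worked example
precise_mean = 156.379   # mean kept to three decimal places

# Difference between the two squared means
shift = rounded_mean ** 2 - precise_mean ** 2
print(round(shift, 3))   # 6.568
```

So rounding the mean to 156.4 before squaring makes the variance come out about 6.57 smaller than it would with 156.379.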
