The Student Room Group

S3 - the chi-squared distribution

I finally got confused with something!

I was working through this exercise. The solution is here: http://www.meidistance.co.uk/w/s3c1a15w.pdf

When the table of source data has a class like 8+, why do they merge it with the 7 class to make 7+? What is the point of doing that?

In another worked example, where they have the classes 0, 1, 2, 3, 4, 5, 6+, they merge the last two to get 0, 1, 2, 3, 4, 5+, saying "The expected frequencies for the last two classes are both less than 5 but if they are put together to give an expected value of 5.2, the problem is overcome." What problem?

Thanks.
Reply 1
Don't bother squaring the chi, just give it all to me.
Reply 2
The test becomes unreliable if the expected frequencies are too low.
Reply 3
KwungSun
The test becomes unreliable if the expected frequencies are too low.

And what is there to determine whether an [expected] frequency is too low?
Reply 4
Vjyrik
And what is there to determine whether an [expected] frequency is too low?


It's a rule of thumb as far as I can tell. I'm sure there is some asymptotic statistical theory behind it, but I'm having a very hard time finding a source that explains why. Maybe someone with more of a theoretical stats background can elaborate...?
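
For what it's worth, here is a rough sketch of the usual textbook argument (not taken from the MEI notes). The symbols O_i and E_i for the observed and expected frequencies of class i, and k for the number of classes, are my own notation:

```latex
X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
    = \sum_{i=1}^{k} \left( \frac{O_i - E_i}{\sqrt{E_i}} \right)^2
```

The chi-squared distribution is only an approximation to the distribution of X^2. It relies on each count O_i being roughly normally distributed about E_i, which holds when E_i is reasonably large. When E_i is small, the count is noticeably discrete and skewed, and dividing by a small E_i lets a single class dominate the total. The cut-off of 5 is a convention rather than a sharp boundary.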
Reply 5
KwungSun
It's a rule of thumb as far as I can tell. I'm sure there is some asymptotic statistical theory behind it, but I'm having a very hard time finding a source that explains why. Maybe someone with more of a theoretical stats background can elaborate...?

Thanks anyway!

I checked the Notes and Examples on the MEI site, and they say ", so long as the expected frequencies of these classes is 5 or more.". At least now I know where my degrees of freedom have been running away!
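
In case it helps anyone else, here is a minimal Python sketch of the merging step and what it does to the degrees of freedom. The observed and expected values are made up for illustration (only the merged value of 5.2 echoes the worked example), the function names `pool_tail` and `chi_squared_statistic` are my own, and I am assuming one parameter was estimated from the data, as you would for a Poisson model with the mean estimated from the sample:

```python
def pool_tail(observed, expected, min_expected=5.0):
    """Merge classes from the upper end until the last expected frequency
    is at least min_expected (the rule-of-thumb threshold of 5)."""
    obs = list(observed)
    exp = list(expected)
    while len(exp) > 1 and exp[-1] < min_expected:
        last_obs = obs.pop()
        last_exp = exp.pop()
        obs[-1] += last_obs   # fold the last class into its neighbour
        exp[-1] += last_exp
    return obs, exp


def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over the classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data for the classes 0, 1, 2, 3, 4, 5, 6+
observed = [10, 27, 25, 20, 12, 4, 2]
expected = [12.0, 24.5, 26.3, 18.0, 14.0, 3.1, 2.1]

pooled_obs, pooled_exp = pool_tail(observed, expected)

# Assumption: one parameter estimated from the data
estimated_params = 1
df_before = len(expected) - 1 - estimated_params
df_after = len(pooled_exp) - 1 - estimated_params

statistic = chi_squared_statistic(pooled_obs, pooled_exp)
print(pooled_obs, pooled_exp)
print("degrees of freedom:", df_before, "->", df_after)
print("X^2 =", round(statistic, 3))
```

Each merge removes one class, so it costs one degree of freedom. This sketch only pools from the upper end; low expected frequencies at the lower end would need the same treatment in the other direction.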
