The Student Room Group

The central limit theorem and its importance to statistical estimation

Hi guys

I'm having trouble with this quote

"The central limit theorem is the cornerstone of statistical estimation."

I can see how this could be true when trying to estimate certain things about a big population, but what about samples that are taken from small populations?

I've been asked to comment on the veracity of the statement but am not really sure what I can comment on besides the distribution becoming normal as the sample size increases. Can anyone suggest any other things I should consider when trying to answer this question?

After posting this, I've realised that the reason I'm struggling to answer this is that I'm finding it hard to see the point or importance of having a normal distribution. I'm new to stats and my brain is working overtime on this subject, and I know the point I just mentioned is quite valuable, so I have to find a way to clarify it. The normal distribution is to show an average... So the central limit theorem helps to find that average from a population/sample that might have a non-normal distribution....

But take an example like the average height in a basketball league. Without doing a census, just using a sample, and with heights going from 1m to 3m, is the normal distribution always going to show 2m? But how can that be? What if 80% of the heights were just under 3m, how does the CLT help? The graph would not be normal no matter how big the sample was, or does the CLT estimate that the average height will always be 2m?

I know I might be missing a lot of concepts here, but there is so much to take in and I am really struggling.


Reply 1

This is precisely why essentially all statistics computed from small samples have large variances. The CLT is key for getting estimates whose variance isn't really large. It tells you that the mean of a sample is approximately normally distributed around the mean of the underlying distribution, with a variance that shrinks as the sample gets larger, *whatever* the distribution (as long as it has a finite mean and variance). As you say, it does require large samples if the underlying distribution isn't well-behaved; however, in the cases where the CLT is not useful, it's often true that statistics is simply not something that helps. For instance, if you have only two observations, there's very little you can say about the distribution.

Actually, I'm not sure you understand the statement of the CLT. It says that if you repeatedly take samples of a fixed size (the larger, the better), then the means of those samples will be approximately normally distributed, centred on the original distribution's mean. It doesn't say that if the sample means all lie between 1m and 3m then the normal distribution will have mean 2m: consider the case that all humans but one are 2.5m tall, so nearly all the samples are simply lists of 2.5, so the sample means are all very close to 2.5, so the normal distribution is centred on 2.5.
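
If it helps to see this rather than take it on trust, here is a minimal simulation sketch in Python. The population is made up for illustration (80% of heights at 2.9m, 20% at 1.1m, echoing the "80% just under 3m" question), and the names `population` and `sample_means` are hypothetical, not from any course material. It uses `random.choices`, which samples with replacement.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical skewed population: 80% of heights at 2.9 m, 20% at 1.1 m.
population = [2.9] * 8000 + [1.1] * 2000
pop_mean = statistics.mean(population)  # 2.54, not the midpoint 2.0


def sample_means(pop, sample_size, repeats):
    """Draw `repeats` samples (with replacement) and return each sample's mean."""
    return [
        statistics.mean(random.choices(pop, k=sample_size))
        for _ in range(repeats)
    ]


means = sample_means(population, sample_size=50, repeats=2000)
centre = statistics.mean(means)   # where the sample means are centred
spread = statistics.stdev(means)  # how widely the sample means scatter

print(f"population mean       : {pop_mean:.3f}")
print(f"mean of sample means  : {centre:.3f}")
print(f"spread of sample means: {spread:.3f}")
```

The individual heights are nowhere near bell-shaped, but a histogram of `means` would be: the bell describes the sample means, and it is centred near 2.54m rather than at the 2m midpoint.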
Reply 2
Thanks for your reply

I think I'm going to have to go over things a bit more as I'm still really lost. When I think of the normal distribution I see the bell-shaped curve. But does the bell-shaped curve always peak in the middle of the two measurements, in this case 1m and 3m?

I can see that your explanation says no but I can't see how the distribution can't be bell shaped unless it peaks in the middle of the two measurements... Can it start its rise a little bit in? (oh dear that last sentence makes me sound so simple)


Reply 3

Consider the standard normal distribution: mean 0, variance 1. Imagine I take two samples from it, and get as my measurements 0.5 and 1.6. Now, that's told me very little about the mean: in particular, it hasn't told me enough to say that it's halfway between 0.5 and 1.6. All I can say is that the mean is unlikely to be horribly far away from 1.05.

If I do this again, I might get the measurements 0.1 and -4. Again, the mean I'd predict from that is -1.95, which isn't very close to the true mean of 0.

Suppose I did this "pick pairs, take the mean of the pair" procedure several times and got 1.05, -1.95, 0.2, 0.5, -0.3. It becomes clear that the distribution mean is approximately 0 (in fact, the mean of these sample means is -0.1). It's the Central Limit Theorem that has allowed me to say that: it tells me the sample means are spread roughly normally around the distribution mean, with a spread that shrinks as the samples get larger. Without the CLT, I can't go from "I drew lots of samples, and their means clustered around 0" to "the distribution mean is around 0", and that's a crucial part of statistics: without the CLT, it's really hard to say anything about the distribution's mean even if we have taken lots of samples and know the means of the samples.
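
As a rough check of the arithmetic, and of how the scatter of sample means shrinks with sample size, here is a small Python sketch. The helper `spread_of_sample_means` and the choice of 4000 repeats are assumptions for illustration only; the comparison value 1/sqrt(n) is the standard deviation the CLT predicts for means of n draws from a standard normal.

```python
import math
import random
import statistics

# The five pair means quoted above.
pair_means = [1.05, -1.95, 0.2, 0.5, -0.3]
grand_mean = statistics.mean(pair_means)
print(f"mean of the pair means: {grand_mean:.2f}")

random.seed(1)  # fixed seed so the run is repeatable


def spread_of_sample_means(sample_size, repeats=4000):
    """Standard deviation of the means of `repeats` samples from N(0, 1)."""
    means = [
        statistics.fmean([random.gauss(0, 1) for _ in range(sample_size)])
        for _ in range(repeats)
    ]
    return statistics.stdev(means)


# Compare the observed scatter with the predicted sigma / sqrt(n), sigma = 1.
results = {n: spread_of_sample_means(n) for n in (2, 8, 32)}
for n, observed in results.items():
    print(f"n={n:2d}  observed spread={observed:.3f}  predicted={1 / math.sqrt(n):.3f}")
```

With pairs (n = 2) the means scatter a lot, which is why a single pair such as 0.1 and -4 can land far from 0; larger samples tighten the scatter in proportion to 1/sqrt(n).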
Reply 4
Thanks again,

I'll have to keep re-reading this as I kinda understand, but again I've been at it all day so am a bit fuzzy at the mo. Need to sleep on it.



Reply 5
So basically, if I'm correct, the CLT makes an estimate of the population parameters from the sampling distribution?

If the true population parameter is known then what is the point of sampling? This is what I'm confused about. If the CLT is so useful in predictions regarding the samples taken from a population, then why sample?

Using an example of a population of teachers within a city. I choose to sample the teachers who teach maths and wish to find out how many of these drive a red car. Can someone please explain how the CLT would be any good here?


Reply 6

Currently I know nothing about how many maths teachers in a city drive a red car, but I might guess the number is binomially distributed. Let's suppose I know the number of maths teachers in the city, and WLOG it's 1000. I just need to find the probability parameter p.

If I sample ten teachers and one drives a red car, then my guess for p is 1/10. If I re-sample and get 5/10, then the CLT tells me that the true value of p is not too far from 3/10. I still don't know the true value: that's what statistics does, estimating true values from measured values.
If I re-sample more and get 1/10, 5/10, 2/10, 3/10, 1/10, then the CLT tells me that p is roughly 0.24: that is, approximately 240 maths teachers drive red cars. Without the CLT, I can't make that statement.
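
To put rough numbers on "not too far", here is a short Python sketch using the five samples above. It assumes the fifty sampled teachers can be treated as independent draws, and it uses the usual normal approximation for a proportion (standard error sqrt(p(1-p)/n), multiplier 1.96 for roughly 95%). With only fifty observations this is a crude approximation, so treat the interval as indicative only. The variable names are hypothetical.

```python
import math

n_per_sample = 10
red_car_counts = [1, 5, 2, 3, 1]  # red-car drivers in each sample of ten

total_sampled = n_per_sample * len(red_car_counts)  # 50 teachers in all
p_hat = sum(red_car_counts) / total_sampled         # pooled estimate of p

# Normal-approximation standard error for a proportion (an assumption here).
std_error = math.sqrt(p_hat * (1 - p_hat) / total_sampled)
z = 1.96  # multiplier for an approximate 95% interval
low, high = p_hat - z * std_error, p_hat + z * std_error

city_teachers = 1000  # the WLOG population size from the post above
print(f"estimate of p : {p_hat:.2f}")
print(f"standard error: {std_error:.3f}")
print(f"approx. 95% interval for p: {low:.2f} to {high:.2f}")
print(f"estimated red-car drivers : {round(p_hat * city_teachers)}")
```

The point estimate is 0.24 (about 240 teachers), but the interval runs from about 0.12 to 0.36, so fifty teachers pin p down only loosely; more sampling narrows it.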
Reply 7
Thanks mate

So it's more of a statement than a calculation that is practically used in statistics?

Or

It is only used when the population distribution is not known...?




Reply 8

It's a calculation too, but the statement is the real intuitive reason why it's important. (Practically, the calculation is also very important for bounding the error of your estimate.) It is used when some parameter is unknown: indeed, almost the entire field of statistics is based around the case that we don't know some parameter and are trying to estimate it from data.
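
For reference, the "calculation" side can be written out as below. The notation is an assumption for illustration (X_1, ..., X_n independent draws from a distribution with mean \mu and finite variance \sigma^2); it is the standard textbook form of the theorem, not something quoted from the question. `\xrightarrow` needs the amsmath package.

```latex
% Sample mean of n independent draws, and the CLT statement for it.
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\frac{\sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr)}{\sigma}
\xrightarrow{\;d\;} \mathcal{N}(0, 1)
\quad \text{as } n \to \infty.

% Consequence for large n: an approximate 95% interval for the unknown mean.
\bar{X}_n \pm 1.96\,\frac{\sigma}{\sqrt{n}}
```

The second line is the error bound: the uncertainty in the estimate shrinks like 1/sqrt(n), so quadrupling the sample size halves the width of the interval. In practice \sigma is usually unknown too and is replaced by the sample standard deviation.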
Reply 9
Mmm I'm starting to get it, thanks

Put simply, when we take a sample, the results we have will only be reflective of that sample, e.g. brown-haired people in London who live in a flat with a cat. The more we sample, the more the theorem will be able to tell us that the mean of these samples will be close to the true value for the population.

Although it would be very hard to sample all the brown-haired people etc., the CLT enables us to make a good estimate without doing a census..?

Am I starting to get the gist?
Reply 10

Yep, exactly.
Reply 11
Once again Smaug, thanks for your help.

Hopefully stats will start to get a bit easier soon as at the moment I'm so stressed out lol


Reply 12

No problem - sadly, stats is counterintuitive and takes a lot of learning :)
