# S1 - Central Limit Theorem

I really am not sure what I want to ask because I'm so confused about CLT.

Here is what I think

If you take lots of samples of a particular size and find the means of each of those samples then all the means form a normal distribution?

The bigger the sample size, the nearer to a normal distribution?

Sample sizes of 30 are ok unless the population is very skewed?

This is now where I get worried. When I use X(bar) in the formula

Z = (X(bar) - population mean) / (sigma / root of sample size)

is X(bar) the mean of one of the samples? So that X(bar) value could be anywhere in relation to the mean of the population?

The above formula would always be used when a sample is mentioned regardless of whether the sample has been taken from a normal distribution or not?

The CL theorem is NOT the formula but just something that shows you that the formula works?

In questions that ask you where the CLT was used, you do not say part a) just because you used the formula in part a) but because.....because what?

When you do the confidence limit what you are trying to work out is how confident you are that the mean you are using is close to the population mean?

It's all a bit of a ramble, this. But I just don't get what I'm aiming to do. I sort of do the maths by rote, but when it comes to the question about when to use the CLT I have no idea, because I have so many loose ends in my thinking.

#2

To get your Z value you need to normalise/standardise the data. You do this by subtracting the mean and dividing by the standard deviation. This method ONLY works for the normal distribution. So, in order to allow you to use the Normal Distribution (and therefore your Z value), you need to be able to justify that the data is normally distributed.

That's where the central limit theorem comes into play. Using this theorem lets you justify that the data is normally distributed and therefore you can use this method.

(Original post by **claret_n_blue**)

> To get your Z value you need to normalise/standardise the data. You do this by subtracting the mean and dividing by the standard deviation. This method ONLY works for the normal distribution. So, in order to allow you to use the Normal Distribution (and therefore your Z value), you need to be able to justify that the data is normally distributed.
>
> That's where the central limit theorem comes into play. Using this theorem lets you justify that the data is normally distributed and therefore you can use this method.

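I misunderstand then. I thought the CLT tells us that the means of samples tend to lead to a normal distribution rather than the data itself being normal.

I'll have to try again to get to grips with this.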

#4

(Original post by **maggiehodgson**)

> I misunderstand then. I thought the CLT tells us that the means of samples tend to lead to a normal distribution rather than the data itself being normal.
>
> I'll have to try again to get to grips with this.

You're right. The Central Limit Theorem states that if we take a collection of samples, each of size n, from any distribution (with finite variance), then for big n the means of those samples will themselves look like a collection of samples from a normal distribution. This is true regardless of the original distribution (although it is exactly true, not just "true for big n", when the original distribution was normal).

(Original post by **Smaug123**)

> You're right. The Central Limit Theorem states that if we take a collection of samples, each of size n, from any distribution (with finite variance), then for big n the means of those samples will themselves look like a collection of samples from a normal distribution. This is true regardless of the original distribution (although it is exactly true, not just "true for big n", when the original distribution was normal).

Thanks for that.

If you find yourself with some spare time, I wonder if you would go through my original post and check out and correct my thinking. No worries if that's not possible, what you have already said is a big help.
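The claim that sample means look normal whatever the original distribution can be checked with a quick simulation. The sketch below is only an illustration and not part of the thread: it assumes an exponential population (chosen because it is strongly skewed), a sample size of 40 and 5000 repeated samples, and it uses only Python's standard library. The function names are made up for the example.

```python
import random
import statistics


def draw_exponential(rng):
    # A strongly right-skewed population with mean 1 and standard deviation 1.
    return rng.expovariate(1.0)


def sample_means(draw, n, num_samples, rng):
    """Return the means of num_samples independent samples, each of size n."""
    return [
        statistics.fmean(draw(rng) for _ in range(n))
        for _ in range(num_samples)
    ]


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 40                   # size of each sample (assumed for the illustration)
means = sample_means(draw_exponential, n, 5000, rng)

mu, sigma = 1.0, 1.0                 # population mean and standard deviation
standard_error = sigma / n ** 0.5    # sigma / root n

# If the sample means are roughly normal, about 95% of them should lie
# within 1.96 standard errors of the population mean.
within = sum(abs(m - mu) <= 1.96 * standard_error for m in means) / len(means)

print(f"mean of sample means: {statistics.fmean(means):.3f}")
print(f"sd of sample means:   {statistics.stdev(means):.3f} (theory: {standard_error:.3f})")
print(f"fraction within 1.96 standard errors: {within:.3f}")
```

If the normal approximation is working, the spread of the sample means should be close to sigma / root n, and roughly 95% of the means should fall within 1.96 standard errors of the population mean, even though the individual observations are nothing like normal.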

#6

(Original post by **maggiehodgson**)

> Thanks for that.
>
> If you find yourself with some spare time, I wonder if you would go through my original post and check out and correct my thinking. No worries if that's not possible, what you have already said is a big help.

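X(bar) could indeed be anywhere in relation to the true mean - but the CLT tells us that it is approximately normally distributed, so it's quite likely to be near to the true mean.

You "use the CLT" whenever you approximate the distribution of the sample means as a normal distribution. In practice, it's wherever you used the formula - I'd say something like "Part a, because in part a we approximated the sample means to a normal distribution (as in the line <line where the formula was used>)".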

Not sure what you mean by "confidence limit" - could you give me an example? (If it were "confidence interval", I could give an answer…)

(Original post by **Smaug123**)

> X(bar) could indeed be anywhere in relation to the true mean - but the CLT tells us that it is approximately normally distributed, so it's quite likely to be near to the true mean.
>
> You "use the CLT" whenever you approximate the distribution of the sample means as a normal distribution. In practice, it's wherever you used the formula - I'd say something like "Part a, because in part a we approximated the sample means to a normal distribution (as in the line <line where the formula was used>)".
>
> Not sure what you mean by "confidence limit" - could you give me an example? (If it were "confidence interval", I could give an answer…)

Yes, it is confidence interval.

So if you were told that the population from which the sample was taken was normally distributed, you would still use the formula for confidence intervals not a different one?

When do you say that you've not used CLT?

I thought I'd got it but still not there am I.

#8

(Original post by **maggiehodgson**)

> Yes, it is confidence interval.
>
> So if you were told that the population from which the sample was taken was normally distributed, you would still use the formula for confidence intervals not a different one?
>
> When do you say that you've not used CLT?
>
> I thought I'd got it but still not there am I.

What is your formula for confidence intervals?

Sorry, "*not* used CLT"? You could say you've not used the CLT whenever you *haven't* approximated a collection of sample means as following a normal distribution.

A 95% confidence interval for a mean (say) is an interval [a,b] such that Probability(a < mean < b) = 95%. That's true whether or not you use the CLT. The CLT is used in actually finding that Probability(a<mean<b); you use it to turn a complicated distribution (a collection of sample means) into a simple distribution (a normal one).
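One way to see what that 95% means in practice is to repeat the sampling many times and count how often the interval captures the true mean. This is a sketch under assumptions that are not from the thread: a normal population with mean 50 and known standard deviation 6, samples of size 10, and the known-sigma interval X(bar) plus or minus z times sigma / root n. The function name `confidence_interval` is hypothetical.

```python
import random
import statistics


def confidence_interval(sample, sigma, level=0.95):
    """Known-sigma interval: sample mean +/- z * sigma / sqrt(n)."""
    n = len(sample)
    # z is the standard normal value leaving (1 - level) / 2 in each tail.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    centre = statistics.fmean(sample)
    return centre - half_width, centre + half_width


rng = random.Random(1)   # fixed seed so the run is repeatable
true_mean, sigma, n, trials = 50.0, 6.0, 10, 4000   # assumed values

hits = 0
for _ in range(trials):
    sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
    low, high = confidence_interval(sample, sigma)
    hits += low < true_mean < high   # True counts as 1

coverage = hits / trials
print(f"intervals containing the true mean: {coverage:.3f}")
```

The printed proportion should come out close to 0.95. Note that in the simulation it is the interval that changes from sample to sample, while the true mean stays fixed.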

(Original post by **Smaug123**)

> What is your formula for confidence intervals?
>
> Sorry, "*not* used CLT"? You could say you've not used the CLT whenever you *haven't* approximated a collection of sample means as following a normal distribution.
>
> A 95% confidence interval for a mean (say) is an interval [a,b] such that Probability(a < mean < b) = 95%. That's true whether or not you use the CLT. The CLT is used in actually finding that Probability(a<mean<b); you use it to turn a complicated distribution (a collection of sample means) into a simple distribution (a normal one).

I've found a question that you asked for to illustrate my problem. It's AQA May 2006, Q4.

The weights of packets of sultanas may be assumed to be normally distributed with a standard deviation of 6 grams.

It then gives you the weights of a random sample of 10 packets.

Then it asks for a 99% confidence interval for the mean weight of packets. That CI formula is used.

Then it asks "State why, in calculating your confidence interval, use of the CLT was NOT necessary." The mark scheme says "weights of packets can be assumed to be normally distributed"
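As a sketch of the calculation the question is asking for: the ten weights below are invented for illustration, since the actual values from the paper are not given in this thread. Only the standard deviation of 6 grams, the sample size of 10 and the 99% level come from the question. The interval used is the usual known-sigma one, X(bar) plus or minus z times sigma / root n.

```python
import statistics

# Hypothetical weights in grams; the real ten values from the exam paper
# are not reproduced in this thread.
weights = [498, 505, 511, 493, 502, 507, 499, 504, 496, 510]

sigma = 6.0    # population standard deviation, given in the question
level = 0.99   # confidence level asked for

n = len(weights)
mean = statistics.fmean(weights)

# z value leaving 0.5% in each tail of the standard normal (about 2.5758).
z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
half_width = z * sigma / n ** 0.5

print(f"sample mean = {mean:.1f} g")
print(f"99% CI = ({mean - half_width:.2f}, {mean + half_width:.2f}) g")
```

Nothing in this calculation depends on the CLT: because the packet weights themselves are assumed normal, the mean of 10 of them is exactly normal, so the z value applies even with a sample this small.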

#10

(Original post by **maggiehodgson**)

> Then it asks "State why, in calculating your confidence interval, use of the CLT was NOT necessary." The mark scheme says "weights of packets can be assumed to be normally distributed"

Ah, that's because you didn't need to *approximate* the distribution of the sample means, because the underlying distribution was normal so the sample means *exactly* follow the normal distribution. CLT says that it comes to approximate a normal distribution, but you already know it *is* a normal distribution, so you don't need to bother with the CLT.

(Original post by **Smaug123**)

> Ah, that's because you didn't need to *approximate* the distribution of the sample means, because the underlying distribution was normal so the sample means *exactly* follow the normal distribution. CLT says that it comes to approximate a normal distribution, but you already know it *is* a normal distribution, so you don't need to bother with the CLT.

Super.

So, let me just check.

If the population is normally distributed and confidence intervals for a sample are asked for you use exactly the same formula as you would for calculating the CI for a non-normally distributed population's sample?

The CLT is quoted as being used when calculating CI for samples whose population has not been stated as being normally distributed but where the sample size is > 30?
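Written out, the two cases being checked here lead to the same interval; only the justification differs. The notation is the usual textbook one and is an assumption on my part, since the thread itself does not use these symbols: μ for the population mean, σ for the population standard deviation, n for the sample size, and z for the standard normal value matching the confidence level.

```latex
% Normal population: exact for every n, so no CLT is needed
X_i \sim N(\mu, \sigma^2)
  \;\Longrightarrow\;
  \bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)

% Any population with finite variance: approximate for large n, by the CLT
\bar{X} \approx N\!\left(\mu, \frac{\sigma^2}{n}\right)
  \qquad (n \text{ large})

% Either way, the confidence interval has the same form
\bar{x} \pm z \, \frac{\sigma}{\sqrt{n}}
```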

#12

(Original post by **maggiehodgson**)

> Super.
>
> So, let me just check.
>
> If the population is normally distributed and confidence intervals for a sample are asked for you use exactly the same formula as you would for calculating the CI for a non-normally distributed population's sample?
>
> The CLT is quoted as being used when calculating CI for samples whose population has not been stated as being normally distributed but where the sample size is > 30?

Yep, I think that's right. Be aware, though, that they might expect you to infer that something "can be assumed to be normally distributed" - heights of people, for instance, are (approximately) normally distributed but the question might not tell you so. If you make sure that you put down in your answer "assuming __ is normally distributed…" whenever you do assume something like that, you should be fine, I imagine.

(Original post by **Smaug123**)

> Yep, I think that's right. Be aware, though, that they might expect you to infer that something "can be assumed to be normally distributed" - heights of people, for instance, are (approximately) normally distributed but the question might not tell you so. If you make sure that you put down in your answer "assuming __ is normally distributed…" whenever you do assume something like that, you should be fine, I imagine.

Thank you so much. This CLT thing has bugged me for months and has stopped me liking statistics. Perhaps I can get a little fonder of the subject now.

I'm an adult learner with no teacher, so being able to ask TSR for help is a really big help.
