The Student Room Group

S3 Sampling Distribution

Please could someone explain how you can have an expected value for one item of data? For example, the textbook says that for the weights of 25 apples X1, X2, X3, etc., each observation has mean mu and standard deviation 4. But how can X1, for example, have a mean when it's only one item?

(my understanding is that it's one item, but I may be wrong...)

Thanks for any help!
Reply 1
Original post by PhyM23
Please could someone explain how you can have an expected value for one item of data? For example in the textbook it says that for the weights of 25 apples, X1 , X2 , X3 etc. each observation has a mean mu and standard deviation 4. But how can X1 for example have a mean when it's only one item?

(my understanding is that it's one item, but I may be wrong...)

Thanks for any help!


Generally speaking, a capital letter denotes a random variable and a lower-case letter its observed value; the X_i's will be i.i.d. random variables themselves.
Wikipedia has a brief page on it, but it isn't very clear.
Reply 2
Original post by PhyM23
Please could someone explain how you can have an expected value for one item of data? For example in the textbook it says that for the weights of 25 apples, X1 , X2 , X3 etc. each observation has a mean mu and standard deviation 4. But how can X1 for example have a mean when it's only one item?

(my understanding is that it's one item, but I may be wrong...)

Thanks for any help!


They are random variables, so I could say: let X_1 be the random variable "weight of apple 1"; this is a legitimate random variable with a mean and a standard deviation. We don't know the actual weight of apple 1, but we do have its mean weight and the standard deviation of its weight. I'm not sure if I'm being helpful at all, actually... it's hard to put into words.
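The point above can be sketched with a quick simulation. The figures below are illustrative assumptions (the thread only gives a standard deviation of 4; the mean of 100 g is made up): each simulated "apple 1" is one draw from the random variable X_1, and averaging many such draws recovers its mean.

```python
import random

random.seed(0)

# Illustrative assumptions: the thread gives sigma = 4 only; mu = 100 g is made up.
mu, sigma = 100.0, 4.0

# X_1 ("weight of apple 1") is a random variable: each simulated apple 1
# is one draw from it, and the mean of many draws estimates E[X_1] = mu.
draws = [random.gauss(mu, sigma) for _ in range(100_000)]
mean_estimate = sum(draws) / len(draws)

print(round(mean_estimate, 1))  # close to 100.0
```

Any one draw is "one item", but the distribution it came from still has a mean and a standard deviation.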
Reply 3
Original post by joostan
Generally speaking, a capital letter denotes a random variable and a lower-case letter its observed value; the X_i's will be i.i.d. random variables themselves.
Wikipedia has a brief page on it, but it isn't very clear.


This makes more sense now, although I'm still a little confused about how individual items can have a mean. Is it because X1, X2, etc. are unknown, and it's this that allows them to have a distribution?

Also, please could you explain when you would use, for example, Var(6X) over 6Var(X)?
Reply 4
Original post by PhyM23
Please could someone explain how you can have an expected value for one item of data? For example in the textbook it says that for the weights of 25 apples, X1 , X2 , X3 etc. each observation has a mean mu and standard deviation 4. But how can X1 for example have a mean when it's only one item?

(my understanding is that it's one item, but I may be wrong...)

Thanks for any help!


It's very easy to slip into bad habits in the use of language and I'm afraid it affects statistics as much as any human endeavor...

So, let's think about what you do when you draw a sample from an underlying population. You assume that the underlying population is fixed and that it has a number of characteristics that are of interest and which we want to know about; it has a mean, a standard deviation, a mode, a median, and so on.

We typically estimate these population parameters by drawing a sample from the population and measuring something about the sample: the sample mean, standard deviation, mode, median... We then use a bit of hocus-pocus to quantify how uncertain these sample-based estimates of the population parameters are.

A sample can be any size you wish. It can even be of size one. If it is of size one, then it's a fairly blunt instrument and will give only very imprecise estimates of the population parameters such as the mean, median and mode. It won't even give you an estimate of the population standard deviation as you can't calculate a sample standard deviation. But it's still a sample and still carries some information about the underlying population.

But now let's turn things on their head. Suppose you know all about the underlying population - suppose it's normal with mean 5 and standard deviation 1. Then you know what to expect if you draw a sample from that population; the sample mean and the sample standard deviation should be close to the population mean and the population standard deviation. You know what to expect even if the sample is of size one; your best guess at the value of the next element you draw from the population will be 5. That is its expected value.

So, each individual observation has an expected value. It even has an expected value of its absolute deviation from the population mean. What it won't give you, on its own, is an estimate of the standard deviation.

Any clearer?
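The "turn things on their head" step above can be checked numerically. A minimal sketch, using the post's known population (normal with mean 5, standard deviation 1):

```python
import random

random.seed(1)

# Known population, as in the post: normal with mean 5, standard deviation 1.
draws = [random.gauss(5.0, 1.0) for _ in range(200_000)]

# Any single draw is "a sample of size one" -- it rarely equals 5 exactly,
# but its expected value (the long-run average of such draws) is 5.
single_draw = draws[0]
expected_value = sum(draws) / len(draws)

print(round(expected_value, 2))  # close to 5.0
```

So the best guess at the next element drawn from the population is 5, exactly as described.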
Reply 5
Original post by Zacken
They are random variables, so I could say: let X_1 be the random variable "weight of apple 1"; this is a legitimate random variable with a mean and a standard deviation. We don't know the actual weight of apple 1, but we do have its mean weight and the standard deviation of its weight. I'm not sure if I'm being helpful at all, actually... it's hard to put into words.


PRSOM

That's actually made it a lot clearer!
Reply 6
Original post by Gregorius
It's very easy to slip into bad habits in the use of language and I'm afraid it affects statistics as much as any human endeavor...

So, let's think about what you do when you draw a sample from an underlying population. You assume that the underlying population is fixed and that it has a number of characteristics that are of interest and which we want to know about; it has a mean, a standard deviation, a mode, a median, and so on.

We typically estimate these population parameters by drawing a sample from the population and measuring something about the sample: the sample mean, standard deviation, mode, median... We then use a bit of hocus-pocus to quantify how uncertain these sample-based estimates of the population parameters are.

A sample can be any size you wish. It can even be of size one. If it is of size one, then it's a fairly blunt instrument and will give only very imprecise estimates of the population parameters such as the mean, median and mode. It won't even give you an estimate of the population standard deviation as you can't calculate a sample standard deviation. But it's still a sample and still carries some information about the underlying population.

But now let's turn things on their head. Suppose you know all about the underlying population - suppose it's normal with mean 5 and standard deviation 1. Then you know what to expect if you draw a sample from that population; the sample mean and the sample standard deviation should be close to the population mean and the population standard deviation. You know what to expect even if the sample is of size one; your best guess at the value of the next element you draw from the population will be 5. That is its expected value.

So, each individual observation has an expected value. It even has an expected value of its absolute deviation from the population mean. What it won't give you, on its own, is an estimate of the standard deviation.

Any clearer?


Wow! Thanks for such a lengthy response! It seems very clear in my mind now :smile:
Reply 7
Original post by PhyM23
Also, please could you explain when you would use, for example, Var(6X) over 6Var(X)?


I'm not sure I know what you mean. . .

I hope Gregorius's and Zacken's posts have cleared up what I couldn't explain very well and so didn't really try to.
Reply 8
Original post by joostan
I'm not sure I know what you mean. . .

I hope Gregorius's and Zacken's posts have cleared up what I couldn't explain very well and so didn't really try to.


Looking back at what you said, I just realised I didn't read the first bit about upper- and lower-case letters. Now that I have, it's honestly helped a lot.

Sorry I just realised I was being vague...

For example, if you have Var(X1+X2), the variances are added. But if you have Var(2X1), the variance becomes 4Var(X1). So if X1 and X2 have the same distribution, then why are the variances different when you have two lots of X1, compared to having X1+X2?
Reply 9
Original post by PhyM23

For example, if you have Var(X1+X2), the variances are added. But if you have Var(2X1), the variance becomes 4Var(X1). So if X1 and X2 have the same distribution, then why are the variances different when you have two lots of X1, compared to having X1+X2?


There is a very distinct difference between X_1 + X_2 and 2X_1 (or 2X_2).

I've written about this here and here.
Reply 10
Original post by PhyM23
Looking back at what you said, I just realised I didn't read the first bit about upper- and lower-case letters. Now that I have, it's honestly helped a lot.

Sorry I just realised I was being vague...

For example, if you have Var(X1+X2), the variances are added. But if you have Var(2X1), the variance becomes 4Var(X1). So if X1 and X2 have the same distribution, then why are the variances different when you have two lots of X1, compared to having X1+X2?


Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y).
For independent X, Y we have Cov(X, Y) = 0.
However, we have Cov(X, X) = Var(X).
Intuitively, X_1 and X_1 are not independent, for obvious reasons.
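These identities can also be verified with a quick simulation (a sketch using standard normal draws; any common distribution works the same way):

```python
import random

random.seed(2)
n = 200_000

def var(xs):
    """Population-style variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# X1 and X2: independent draws from the same distribution (standard normal).
x1 = [random.gauss(0.0, 1.0) for _ in range(n)]
x2 = [random.gauss(0.0, 1.0) for _ in range(n)]

# Var(X1 + X2) = Var(X1) + Var(X2) = 2, since Cov(X1, X2) = 0 ...
var_sum = var([a + b for a, b in zip(x1, x2)])

# ... but Var(2*X1) = 4*Var(X1) = 4, since Cov(X1, X1) = Var(X1).
var_double = var([2 * a for a in x1])

print(round(var_sum, 1), round(var_double, 1))  # about 2.0 and 4.0
```

Two lots of the same apple vary twice as much again as two different apples added together.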
