
Probability density function (PDF)



I don't understand why this formula gives the mean of a pdf. I don't really understand pdfs too well, but I do understand that the probability of exactly x occurring is zero (at least conceptually, if perhaps not mathematically). I partly understand (I guess?) that the probability of x occurring within a given interval can be defined. But then how is it that multiplying this function by x and integrating suddenly gives the mean? I've tried figuring it out in my head but can't manage it.
Reply 1
Original post by djpailo


I don't understand why this formula gives the mean of a pdf. I don't really understand pdfs too well, but I do understand that the probability of exactly x occurring is zero (at least conceptually, if perhaps not mathematically). I partly understand (I guess?) that the probability of x occurring within a given interval can be defined. But then how is it that multiplying this function by x and integrating suddenly gives the mean? I've tried figuring it out in my head but can't manage it.


the mean is the weighted average, i.e the balancing point, i.e the "centre of mass" of a lamina whose shape is that of the PDF.

that is why the formula is identical to that of finding the x coordinate of the centre of mass of a uniform lamina
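To see that centre-of-mass picture numerically, here is a minimal sketch; the density f(x) = 2x on [0, 1] is just an example I've picked, not something from the thread. Because a pdf integrates to 1, the centroid formula ∫x f(x) dx / ∫f(x) dx collapses to ∫x f(x) dx, which is exactly the mean formula.

```python
# A minimal numerical sketch of the centre-of-mass picture above. The density
# f(x) = 2x on [0, 1] is just an example choice, not one from this thread.
import numpy as np

x = np.linspace(0.0, 1.0, 100_001)
f = 2 * x                        # example pdf; its area over [0, 1] is 1
dx = x[1] - x[0]

mass = np.sum(f) * dx            # "mass" of the lamina under the curve (~1)
moment = np.sum(x * f) * dx      # integral of x * f(x) dx (~2/3)

print(moment)                    # the mean, E[X] ~ 0.6667
print(moment / mass)             # centre-of-mass x-coordinate ~ 0.6667 (same thing)
```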
Reply 2
Original post by TeeEm
the mean is the weighted average, i.e the balancing point, i.e the "centre of mass" of a lamina whose shape is that of the PDF.

that is why the formula is identical to that of finding the x coordinate of the centre of mass of a uniform lamina


Still a bit confused ;(

Is it something like this example:
http://www.mathwords.com/w/weighted_average.htm

But done on infinitesimally small bits or intervals? So is it like saying we have a homework score of around 92, a grade of around 68, etc., and for each of those "arounds", or better worded, intervals, we are assigned a weight which, when integrated across, gives the probability?
Reply 3
Original post by djpailo
Still a bit confused ;(

Is it something like this example:
http://www.mathwords.com/w/weighted_average.htm

But done on infinitesimally small bits or intervals? So is it like saying we have a homework score of around 92, a grade of around 68, etc., and for each of those "arounds", or better worded, intervals, we are assigned a weight which, when integrated across, gives the probability?


I think you need a statistician with a better command of the English language than mine to explain the concept online.
I wish I could be more helpful.
Reply 4
Original post by djpailo


I don't understand why this formula gives the mean of a pdf. I don't really understand pdfs too well, but I do understand that the probability of exactly x occurring is zero (at least conceptually, if perhaps not mathematically). I partly understand (I guess?) that the probability of x occurring within a given interval can be defined. But then how is it that multiplying this function by x and integrating suddenly gives the mean? I've tried figuring it out in my head but can't manage it.


http://www.mathshelper.co.uk/OCR%20S2%20Revision%20Sheet.pdf

If you read the first page, it will explain it a little bit.

I was asking myself this question 5 months ago, but I forgot :colondollar:
Original post by djpailo


I don't understand why this formula gives the mean of a pdf. I don't really understand pdfs too well, but I do understand that the probability of exactly x occurring is zero (at least conceptually, if perhaps not mathematically). I partly understand (I guess?) that the probability of x occurring within a given interval can be defined. But then how is it that multiplying this function by x and integrating suddenly gives the mean? I've tried figuring it out in my head but can't manage it.


It is easier to see why this formula works if we first take a discrete distribution.

Assume a discrete probability function p(x),\ x \in \mathbb{Z},\ a \leq x \leq b, where p(x) is the probability of the value x occurring. The part of the sum contributed by the value x must be x \cdot p(x): the value weighted by how likely it is. Therefore it is logical to define E[X] = \sum x\,p(x).

Definite integration gives the area under a curve, but that area is really just lots of little rectangles, which you should have explored through the trapezium rule. So integration can be thought of as "continuous addition", and the same logic carries over to a continuous density f(x).

Assuming X takes values in [a, b],

\displaystyle E[X] = \sum x\,p(x) \;\Rightarrow\; E[X] = \int_a^b x f(x)\,dx
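As a quick worked example (the density here is my own choice, not one from the thread): take f(x) = 3x^2 on [0, 1], which is a valid pdf because its integral is 1. Then

E[X] = \int_0^1 x \cdot 3x^2 \, dx = \left[ \tfrac{3x^4}{4} \right]_0^1 = \tfrac{3}{4},

so this density, which puts most of its weight near 1, has its mean at 0.75.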
Reply 6
Am a 3rd year LSE statistician so will try to explain.

Say you had a distribution which could only take two values, 0 or 1, each with equal probability 0.5. To calculate the mean you would do 0.5*0 + 0.5*1, which is the summation of x*f(x). Say now you can have the values 0, 0.5 and 1: you would do a third times 0, plus a third times 0.5, plus a third times 1. Now, if you have a continuous distribution you can have infinitely many values, so you have an infinitely small number times every value between 0 and 1, all added together, which is the integral of x times its probability density f(x) over all values, hence the limits of minus infinity and infinity. So yeah, a weighted average done on an infinitely small scale.
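Here is a minimal numerical sketch of that refinement; the uniform grid mirrors the 0/1 and 0/0.5/1 examples above, and the second set of weights, taken from f(x) = 2x, is just my own example of a non-uniform density.

```python
# A minimal sketch of the refinement described above: put n values on [0, 1]
# and take the weighted average sum of x * p(x). Uniform weights mirror the
# 0/1 and 0/0.5/1 examples; weights from f(x) = 2x are my own example.
import numpy as np

for n in (2, 3, 5, 101, 100_001):
    x = np.linspace(0.0, 1.0, n)      # n possible values on [0, 1]

    p_uniform = np.full(n, 1.0 / n)   # all values equally likely
    mean_uniform = np.sum(x * p_uniform)      # stays at 0.5 = integral of x*1 dx

    w = 2 * x                         # weights proportional to f(x) = 2x
    p_weighted = w / np.sum(w)        # normalise so the probabilities sum to 1
    mean_weighted = np.sum(x * p_weighted)    # -> 2/3 = integral of x*2x dx

    print(n, round(mean_uniform, 4), round(mean_weighted, 4))
```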


Original post by djpailo
Still a bit confused ;(

Is it something like this example:
http://www.mathwords.com/w/weighted_average.htm

But done on infinitesimally small bits or intervals? So is it like saying we have a homework score of around 92, a grade of around 68, etc., and for each of those "arounds", or better worded, intervals, we are assigned a weight which, when integrated across, gives the probability?
Reply 7
Original post by LMSZ
Am a 3rd year LSE statistician so will try to explain.

Say you had a distribution which could only take two values, 0 or 1, each with equal probability 0.5. To calculate the mean you would do 0.5*0 + 0.5*1, which is the summation of x*f(x). Say now you can have the values 0, 0.5 and 1: you would do a third times 0, plus a third times 0.5, plus a third times 1. Now, if you have a continuous distribution you can have infinitely many values, so you have an infinitely small number times every value between 0 and 1, all added together, which is the integral of x times its probability density f(x) over all values, hence the limits of minus infinity and infinity. So yeah, a weighted average done on an infinitely small scale.


Thanks, this helps. Do you have a similar explanation for variance which would be x^2p(x) instead?
Reply 8
Original post by djpailo
Thanks, this helps. Do you have a similar explanation for variance which would be x^2p(x) instead?


(x^2)p(x) isn't the variance; that is the expected value of x^2. E(x^2) - E(x)^2 is the variance.
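As a quick worked check, using the fair 0/1 coin from earlier in the thread:

E[X] = 0 \cdot 0.5 + 1 \cdot 0.5 = 0.5, \qquad E[X^2] = 0^2 \cdot 0.5 + 1^2 \cdot 0.5 = 0.5,

\operatorname{Var}(X) = E[X^2] - E[X]^2 = 0.5 - 0.25 = 0.25.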
Reply 9
Original post by LMSZ
(x^2)p(x) isn't the variance; that is the expected value of x^2. E(x^2) - E(x)^2 is the variance.


I see where you are coming from, but I'm confused as to why, on pdf page 25, the formula for the nth moment is this:

http://www.turbulence-online.com/Publications/Lecture_Notes/Turbulence_Lille/TB_16January2013.pdf

I thought it was a "central moment" if you find it about a mean, and just a "moment" if you use the formula on page 25.
Reply 10
Original post by djpailo
I see where you are coming from, but I'm confused as to why, on pdf page 25, the formula for the nth moment is this:

http://www.turbulence-online.com/Publications/Lecture_Notes/Turbulence_Lille/TB_16January2013.pdf

I thought it was a "central moment" if you find it about a mean, and just a "moment" if you use the formula on page 25.


I'm confused about what you mean. The formula in the article you sent me finds the moments, as in E(X), E(X^2). The central moment is E[(X - E(X))^n], so the second central moment is the variance.
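A minimal numerical sketch of that distinction, using the example density f(x) = 2x on [0, 1] (my own choice, not one from the linked notes):

```python
# A minimal sketch of "raw" moments vs "central" moments for a continuous pdf.
# The density f(x) = 2x on [0, 1] is my own example, not one from the notes.
import numpy as np

x = np.linspace(0.0, 1.0, 200_001)
f = 2 * x                                   # example pdf on [0, 1]
dx = x[1] - x[0]

def raw_moment(n):
    """n-th moment about the origin: E[X^n] = integral of x^n f(x) dx."""
    return np.sum(x**n * f) * dx

def central_moment(n):
    """n-th moment about the mean: E[(X - E[X])^n]."""
    mu = raw_moment(1)
    return np.sum((x - mu)**n * f) * dx

print(raw_moment(1))                        # mean = 2/3
print(raw_moment(2))                        # E[X^2] = 1/2 (a raw moment, not the variance)
print(central_moment(2))                    # variance = 1/18 ~ 0.0556
print(raw_moment(2) - raw_moment(1)**2)     # same value, via E[X^2] - E[X]^2
```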
Reply 11
Original post by LMSZ
I'm confused about what you mean. The formula in the article you sent me finds the moments, as in E(X), E(X^2). The central moment is E[(X - E(X))^n], so the second central moment is the variance.


Eq. 2.17 has c^n times the function and says that those are the higher-order moments, hence variance, skewness etc. :s
Original post by djpailo
Eq. 2.17 has c^n times the function and says that those are the higher-order moments, hence variance, skewness etc. :s


Looking at the document, it seems to clearly distinguish between moments and central moments. There's a brief moment where it talks about the variance without stating it's a central moment, but it doesn't state that it's a moment either, so I don't really see grounds for confusion.
Reply 13
Original post by DFranklin
Looking at the document, it seems to clearly distinguish between moments and central moments. There's a brief moment where it talks about the variance without stating it's a central moment, but it doesn't state that it's a moment either, so I don't really see grounds for confusion.


So I'm wondering how to picture c^n*f(x) instead of c*f(x), where the latter was explained like this:


Say you had a distribution which could only take two values, 0 or 1, each with equal probability 0.5. To calculate the mean you would do 0.5*0 + 0.5*1, which is the summation of x*f(x). Say now you can have the values 0, 0.5 and 1: you would do a third times 0, plus a third times 0.5, plus a third times 1. Now, if you have a continuous distribution you can have infinitely many values, so you have an infinitely small number times every value between 0 and 1, all added together, which is the integral of x times its probability density f(x) over all values, hence the limits of minus infinity and infinity. So yeah, a weighted average done on an infinitely small scale.


So I was wondering whether there was some simple intuition to explain what c^n does instead of c*f(x) in terms of interpreting the pdf. I know that variance is the spread of data, but I don't fully understand how variance is interpreted from the pdf.
Original post by djpailo
So I'm wondering how to picture c^n*f(x) instead of c*f(x), where the latter was explained like this:


First off, those equations aren't correct. You are visualising c^n f(c) (or x^n f(x)). You should not be mixing c and x here.

As far as relating c^n to the explanation: it's exactly the same, only instead of finding the average value of x f(x), you're finding the average value of x^n f(x).

I do think it's a lot easier to understand these things in the finite case (where you are summing discrete probabilities) before progressing to the infinite one (where you need to integrate).
Reply 15
Original post by DFranklin
First off, those equations aren't correct. You are visualising c^n f(c) (or x^n f(x)). You should not be mixing c and x here.

As far as relating c^n to the explanation: it's exactly the same, only instead of finding the average value of x f(x), you're finding the average value of x^n f(x).

I do think it's a lot easier to understand these things in the finite case (where you are summing discrete probabilities) before progressing to the infinite one (where you need to integrate).


Okay, so in the discrete case, going back to the coin example stated earlier:

0*0.5 + 1*0.5 = 0.5 mean.
0*0*0.5 + 1*1*0.5 = 0.5 variance

or a dice

1*(1/6) + 2*(1/6) + 3*(1/6) +4*(1/6) + 5*(1/6) + 6*(1/6) = 3.5 = mean
1*(1/6)*(1/6) + 2*(1/6)*(1/6) + 3*(1/6)*(1/6) +4*(1/6)*(1/6) + 5*(1/6)*(1/6) + 6*(1/6)*(1/6) = 0.58 variance

Are these correct? I just find it strange that the formulae work the way they do. I'm so used to just finding the arithmetic mean that finding the mean like this feels unnatural.

From the pdf, can I then interpret b_x(c) as the probability that the random variable x falls within a class of width c, if we had a discrete case?

Then, making the jump to the infinite case, is that the probability that the random variable x falls within an infinitesimally small class width?
Original post by djpailo
Okay, so in the discrete case, going back to the coin example stated earlier:

0*0.5 + 1*0.5 = 0.5 mean.
0*0*0.5 + 1*1*0.5 = 0.5 variance

No, what you have calculated here is E[X^2], which is not the same as the variance. The variance is E((X-E(X))^2), which can be shown (via some manipulation that I suspect you are not ready for) to also equal E[X^2] - E[X]^2.

or a dice

1*(1/6) + 2*(1/6) + 3*(1/6) +4*(1/6) + 5*(1/6) + 6*(1/6) = 3.5 = mean
1*(1/6)*(1/6) + 2*(1/6)*(1/6) + 3*(1/6)*(1/6) +4*(1/6)*(1/6) + 5*(1/6)*(1/6) + 6*(1/6)*(1/6) = 0.58 variance

In this case you haven't even managed to calculate E[X^2] correctly.

You have found \sum k \, P(X=k)^2, when the correct formula for E[X^2] is \sum k^2 \, P(X=k).

I don't want to seem patronizing, but you really need to find an elementary book covering this - it is S1/S2 material. Most undergrad level books will be assuming you already know this to some extent and so will be skimming over it at an unrealistic pace.
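As a worked check of that formula for the fair die:

E[X^2] = \sum k^2 P(X=k) = \tfrac{1}{6}(1 + 4 + 9 + 16 + 25 + 36) = \tfrac{91}{6} \approx 15.17,

\operatorname{Var}(X) = E[X^2] - E[X]^2 = \tfrac{91}{6} - 3.5^2 = \tfrac{35}{12} \approx 2.92.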
Reply 17
Original post by DFranklin
No, what you have calculated here is E[X^2], which is not the same as the variance. The variance is E((X-E(X))^2), which can be shown (via some manipulation that I suspect you are not ready for) to also equal E[X^2] - E[X]^2.

In this case you haven't even managed to calculate E[X^2] correctly.

You have found \sum k \, P(X=k)^2, when the correct formula for E[X^2] is \sum k^2 \, P(X=k).

I don't want to seem patronizing, but you really need to find an elementary book covering this - it is S1/S2 material. Most undergrad level books will be assuming you already know this to some extent and so will be skimming over it at an unrealistic pace.


Half the books just quote the formula, which is why I ask. I see where I went wrong, but I'm still left confused as to what x^2 p(x) means and why it works in the first place. For the mean, someone explained it well; I was looking for a similar explanation for the variance.

You've said E[X^2] is not the variance. Okay, if that is the case, what is the difference between the moment and the central moment for the variance, if E[X^2] is not the variance (or, better worded, why is that not the variance if the mean is zero)?

What good books would you suggest?

EDIT:
I recalculated the variance, is it 2.92 for the dice?
Reply 18
Original post by TeeEm
the mean is the weighted average, i.e the balancing point, i.e the "centre of mass" of a lamina whose shape is that of the PDF.

that is why the formula is identical to that of finding the x coordinate of the centre of mass of a uniform lamina


I never knew you could apply bits of M3 to S2! The more you know.
