The Student Room Group

Binomial help

Okay, so I've created a problem in stats to keep me entertained, but I'm not sure how to approach it.
$X \sim B(44, 0.5)$
$Y \sim B(36, 0.5)$
$Z \sim B(28, 0.5)$
Find a general equation for $P(X \geq 22 \cap Y \geq 18 \cap Z \geq 14 \mid X+Y+Z = p)$.

What I was thinking of doing was saying that $A = X+Y+Z \Rightarrow A \sim B(108, 0.5)$. Then I would work out the probability for 2 of the possible values, multiply them together and assume that the last distributional probability will be one and then divide by the A probability.

The only issue is that I feel as if that isn't correct, or, if it is, that it is a lot of work and there should be a simpler way.
(edited 8 years ago)
Reply 1
Original post by Aph

What I was thinking of doing was saying that $A = X+Y+Z \Rightarrow A \sim B(108, 0.5)$.


I take it that you're assuming all three are independent?
Original post by Aph
Okay, so I've created a problem in stats to keep me entertained, but I'm not sure how to approach it.
$X \sim B(44, 0.5)$
$Y \sim B(36, 0.5)$
$Z \sim B(28, 0.5)$
Find a general equation for $P(X \geq 22 \cap Y \geq 18 \cap Z \geq 14 \mid X+Y+Z = p)$.

What I was thinking of doing was saying that $A = X+Y+Z \Rightarrow A \sim B(108, 0.5)$. Then I would work out the probability for 2 of the possible values, multiply them together and assume that the last distributional probability will be one and then divide by the A probability.

The only issue is that I feel as if that isn't correct, or, if it is, that it is a lot of work and there should be a simpler way.


I haven't worked through this in detail, but I think you may be able to attack this by observing that the distribution of a binomial RV conditional on the sum of two binomial RVs with the same probability is hypergeometric. That is, P(X=a | X+Y=p) is hypergeometric. The details for this are worked out in this Stackexchange thread.

I think this will work by observing that

P(X+Y = a | X+Y+Z=p) is simply

P((X+Y) = a | (X+Y) + Z = p)

That is, condition the binomial RV (X+Y) on the sum (X+Y) + Z.
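
A short sketch of why the conditional distribution is hypergeometric, for independent $X \sim B(m, q)$ and $Y \sim B(n, q)$. The symbols $m$, $n$ and $q$ are introduced here for illustration and are not from the thread; the point is that the common success probability cancels:

```latex
\begin{align*}
P(X = a \mid X + Y = p)
  &= \frac{P(X = a)\,P(Y = p - a)}{P(X + Y = p)} \\
  &= \frac{\binom{m}{a} q^{a}(1-q)^{m-a}\,\binom{n}{p-a} q^{p-a}(1-q)^{n-p+a}}
          {\binom{m+n}{p} q^{p}(1-q)^{m+n-p}} \\
  &= \frac{\binom{m}{a}\binom{n}{p-a}}{\binom{m+n}{p}}
\end{align*}
```

The first line uses independence and the fact that $X + Y \sim B(m+n, q)$ when both variables share the same $q$.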
Reply 3
Original post by Zacken
I take it that you're assuming all three are independent?

Yes, although in reality they likely won't be.
Original post by Aph
Yes, although in reality they likely won't be.


:eek: Time to crank up the computer!
Reply 5
Original post by Gregorius
I haven't worked through this in detail, but I think you may be able to attack this by observing that the distribution of a binomial RV conditional on the sum of two binomial RVs with the same probability is hypergeometric. That is, P(X=a | X+Y=p) is hypergeometric. The details for this are worked out in this Stackexchange thread.

I think this will work by observing that

P(X+Y = a | X+Y+Z=p) is simply

P((X+Y) = a | (X+Y) + Z = p)

That is, condition the binomial RV (X+Y) on the sum (X+Y) + Z.

That sort of makes sense, although I would have thought that wouldn't apply here, as you want all the variables to be above a certain number and, for instance, $X+Y = 40$ does not necessitate that X be greater than or equal to 22 or Y be greater than or equal to 18.
Original post by Aph
That sort of makes sense, although I would have thought that wouldn't apply here, as you want all the variables to be above a certain number and, for instance, $X+Y = 40$ does not necessitate that X be greater than or equal to 22 or Y be greater than or equal to 18.


Yes, I see the problem. It works fine if you step down a dimension as you can get P(X > b, Y> c | X+Y=p) by simply summing the P(X=a | X+Y=p) for those a that allow both constraints to be satisfied. I will think some more...
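
As a rough illustration of the two-variable case, the sum could be sketched in Python as below. The function names and the parameters `m`, `n` (numbers of trials) are made up for this example, and the thresholds are treated as inclusive (>=) to match the original question; for strict inequalities, shift `b` and `c` up by one.

```python
from math import comb


def cond_pmf(a, m, n, p):
    """P(X = a | X + Y = p) for independent X ~ B(m, q), Y ~ B(n, q).

    The common probability q cancels, leaving a hypergeometric pmf.
    """
    # Outside the feasible range the probability is zero
    # (math.comb would raise on a negative second argument).
    if a < 0 or a > m or p - a < 0 or p - a > n:
        return 0.0
    return comb(m, a) * comb(n, p - a) / comb(m + n, p)


def cond_tail(b, c, m, n, p):
    """P(X >= b, Y >= c | X + Y = p).

    Given X + Y = p we have Y = p - X, so Y >= c means X <= p - c.
    """
    return sum(cond_pmf(a, m, n, p) for a in range(b, p - c + 1))


if __name__ == "__main__":
    # Two of the thread's variables: X ~ B(44, 0.5), Y ~ B(36, 0.5).
    print(cond_tail(22, 18, 44, 36, 42))
```

Because Y is determined by X once the sum is fixed, the two constraints collapse into a single range for `a`, which is why this version needs only one sum.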

If this is actually an applied problem, have you considered some sort of Monte Carlo simulation?
Reply 7
Original post by Gregorius
Yes, I see the problem. It works fine if you step down a dimension as you can get P(X > b, Y> c | X+Y=p) by simply summing the P(X=a | X+Y=p) for those a that allow both constraints to be satisfied. I will think some more...

If this is actually an applied problem, have you considered some sort of Monte Carlo simulation?

It is sort of applied, and I'm not sure what you mean by Monte Carlo.

Original post by Aph
That sort of makes sense, although I would have thought that wouldn't apply here, as you want all the variables to be above a certain number and, for instance, $X+Y = 40$ does not necessitate that X be greater than or equal to 22 or Y be greater than or equal to 18.


OK, if you go back to that answer on the Stackexchange forum and work through the algebra, I think that you will find that you can do the conditioning "all at once". So you can work out P(X= k & Y = l | X+Y+Z = p). I think it comes out as

$\displaystyle \frac{\binom{a}{k} \binom{b}{l} \binom{c}{p-l-k}}{\binom{a+b+c}{p}}$

You can then sum the probabilities over the indices that meet all three constraints.
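
A minimal Python sketch of that sum, reading a, b and c in the formula as the numbers of trials for X, Y and Z (44, 36 and 28 here; the post does not define them, so this reading is an assumption). The function name and its defaults are illustrative only.

```python
from math import comb


def exact_conditional(p, sizes=(44, 36, 28), mins=(22, 18, 14)):
    """P(X >= x_min, Y >= y_min, Z >= z_min | X + Y + Z = p).

    Assumes independent binomials with a common success probability,
    so the conditional law is the multivariate hypergeometric one.
    """
    a, b, c = sizes
    x_min, y_min, z_min = mins
    if not 0 <= p <= a + b + c:
        raise ValueError("p must lie between 0 and the total number of trials")

    total = 0  # exact integer count of favourable arrangements
    for k in range(x_min, a + 1):
        for l in range(y_min, b + 1):
            z = p - k - l  # Z is forced once X = k and Y = l are fixed
            if z_min <= z <= c:
                total += comb(a, k) * comb(b, l) * comb(c, z)
    return total / comb(a + b + c, p)


if __name__ == "__main__":
    for p in (54, 60, 70):
        print(p, exact_conditional(p))
```

The numerator is accumulated as an exact integer and divided once at the end, which avoids adding up many tiny floating-point terms. For any p below 54 the result is zero, since the three lower bounds already add up to 54.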
Original post by Aph
It is sort of applied, and I'm not sure what you mean by Monte Carlo.



A Monte Carlo technique is where you crank up a random number generator to repeatedly simulate from the distributions that you're interested in, and you simply count the proportion of occurrences that meet the constraints.
Reply 10
Original post by Gregorius
OK, if you go back to that answer on the Stackexchange forum and work through the algebra, I think that you will find that you can do the conditioning "all at once". So you can work out P(X= k & Y = l | X+Y+Z = p). I think it comes out as

$\displaystyle \frac{\binom{a}{k} \binom{b}{l} \binom{c}{p-l-k}}{\binom{a+b+c}{p}}$

You can then sum the probabilities over the indices that meet all three constraints.

Ohhhhh, I knew there must have been a way to do it with combinations. I just didn't think about it like that. Initially I tried what you suggested of adding two together, before realizing that it wasn't working. Thank you!!
Original post by Gregorius
A Monte Carlo technique is where you crank up a random number generator to repeatedly simulate from the distributions that you're interested in, and you simply count the proportion of occurrences that meet the constraints.

I feel like that would take forever. Not that I think any method wouldn't take a long time.
Original post by Aph

I feel like that would take forever. Not that I think any method wouldn't take a long time.


Just out of interest, I coded this up in the simplest way possible (simulate from the binomial distributions given, select only those that meet the sum constraint, and then count the X,Y,Z values that satisfy the >= constraints). With p = 54 and simulating 10,000,000 triples, I get an answer in about 5 seconds (in R, on an Intel 4770).
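
The R code itself was not posted, so the following is only a sketch of the same three steps in Python using the standard library. The function names, the default number of trials and the seed are arbitrary choices for this example, and the bit-counting trick for drawing from B(n, 0.5) works only because the success probability is exactly one half.

```python
import random


def binom_half(n, rng):
    """Draw from B(n, 0.5) by counting the set bits among n fair random bits."""
    return bin(rng.getrandbits(n)).count("1")


def simulate(p, trials=200_000, seed=0):
    """Estimate P(X >= 22, Y >= 18, Z >= 14 | X + Y + Z = p) by rejection.

    Returns (estimate, kept), where kept is the number of simulated
    triples whose sum equals p. The estimate is NaN if none were kept.
    """
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(trials):
        x = binom_half(44, rng)
        y = binom_half(36, rng)
        z = binom_half(28, rng)
        if x + y + z != p:
            continue  # discard triples that miss the sum constraint
        kept += 1
        if x >= 22 and y >= 18 and z >= 14:
            hits += 1
    estimate = hits / kept if kept else float("nan")
    return estimate, kept


if __name__ == "__main__":
    print(simulate(54))
```

Only a small fraction of triples survive the sum constraint, so the estimate rests on far fewer samples than `trials` suggests. Values of p far from 54 would need many more draws, or the exact sum instead.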
(edited 8 years ago)
Reply 12
Original post by Gregorius
Just out of interest, I coded this up in the simplest way possible (simulate from the binomial distributions given, select only those that meet the sum constraint, and then count the X,Y,Z values that satisfy the >= constraints). With p = 54 and simulating 10,000,000 triples, I get an answer in about 5 seconds (in R, on an Intel 4770).

I'm afraid that, as I'm only doing A-levels and am on a school computer, I'm limited to Excel. Plus I haven't done programming before, so I wouldn't know where to begin on that.
Original post by Aph
I'm afraid that, as I'm only doing A-levels and am on a school computer, I'm limited to Excel. Plus I haven't done programming before, so I wouldn't know where to begin on that.


Ah well, Excel will really limit what you can do like this. "R" has become the lingua franca of statisticians - it's wonderfully flexible and allows you to knock up this sort of code very rapidly. If you're interested in probability and/or statistics it would be a good idea to learn it sometime!
Reply 14
Original post by Gregorius
Ah well, Excel will really limit what you can do like this. "R" has become the lingua franca of statisticians - it's wonderfully flexible and allows you to knock up this sort of code very rapidly. If you're interested in probability and/or statistics it would be a good idea to learn it sometime!

I will, thanks! I've worked out what I needed anyway; it's just a lot of doing things manually in Excel.
