The Student Room Group

Binomial hypothesis test question

This is something that someone asked me about today and I realised that I don't fully understand it and it isn't explained in textbooks:

When conducting a binomial hypothesis test, why don't you just check the probability of the observed value alone? E.g. the expectation is 5 and the observed value is 2, so why not just check P(X = 2) to find out if it's significant? Instead you find the probability of the observed value "or more extreme", in this case P(X ≤ 2).

Of course you would need to make your significance level much lower, but in my mind it would still work as a hypothesis test, since P(X = a) always decreases as 'a' moves further from the expectation?

Some reasons I thought of:

- The probabilities will generally be very low, so the significance level can't be a nice number like 5%
- For continuous distributions you have to check a range, so it makes sense to do this for all hypothesis tests
- Historical reasons / convention

Or maybe there's a different reason?
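To make the question concrete, here's a quick Python sketch comparing the two probabilities. The OP only gives the expectation (5) and the observation (2), so the choice of X ~ B(50, 0.1), which has E(X) = np = 5, is my own illustration:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p): the 'observed or more extreme' lower tail."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 50, 0.1               # E(X) = np = 5, matching the example above
print(binom_pmf(2, n, p))    # single value  P(X = 2)  ~ 0.078
print(binom_cdf(2, n, p))    # lower tail    P(X <= 2) ~ 0.112
```

So the cumulative tail P(X ≤ 2) ≈ 0.112 is what the standard test compares against 5%, while the single value P(X = 2) ≈ 0.078 is what the question proposes using with a much lower threshold.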
Original post by 0-)
This is something that someone asked me about today and I realised that I don't fully understand it and it isn't explained in textbooks:

When conducting a binomial hypothesis test, why don't you just check the probability of the observed value alone? E.g. the expectation is 5 and the observed value is 2, so why not just check P(X = 2) to find out if it's significant? Instead you find the probability of the observed value "or more extreme", in this case P(X ≤ 2).

Of course you would need to make your significance level much lower, but in my mind it would still work as a hypothesis test, since P(X = a) always decreases as 'a' moves further from the expectation?

Some reasons I thought of:

- The probabilities will generally be very low, so the significance level can't be a nice number like 5%
- For continuous distributions you have to check a range, so it makes sense to do this for all hypothesis tests
- Historical reasons / convention

Or maybe there's a different reason?


It's simply a practical consideration you're missing, I guess. Just think about it in context. Let's use a reasonably well-understood probability: a coin throw has a 50% chance of heads. For simplicity, we'll keep it one-tailed and say it's biased towards heads. Let's say it's thrown 50 times. How many heads do you think would provide a result showing that it's biased towards heads?
If you're thinking 45, then doesn't that also mean that landing each of 46, 47, 48, 49 and 50 heads would also indicate bias? This is why it must be cumulative.
Also, consider the critical region and how it's defined. I find this is usually easier to explain to Y13s once they've covered the normal distribution and can see critical regions graphically.
Reply 2
Original post by 0-)
This is something that someone asked me about today and I realised that I don't fully understand it and it isn't explained in textbooks:

When conducting a binomial hypothesis test, why don't you just check the probability of the observed value alone? E.g. the expectation is 5 and the observed value is 2, so why not just check P(X = 2) to find out if it's significant? Instead you find the probability of the observed value "or more extreme", in this case P(X ≤ 2).

Of course you would need to make your significance level much lower, but in my mind it would still work as a hypothesis test, since P(X = a) always decreases as 'a' moves further from the expectation?

Some reasons I thought of:

- The probabilities will generally be very low, so the significance level can't be a nice number like 5%
- For continuous distributions you have to check a range, so it makes sense to do this for all hypothesis tests
- Historical reasons / convention

Or maybe there's a different reason?

If you were doing a 95% two-tailed test, the tails would have a 2.5% cumulative area each. Unless a single value has a probability > 2.5%, you can't infer whether it lies in a tail without considering the cumulative area. It could be just less than 2.5% and still lie in a tail (steep distribution function at that point), or much less than 2.5% and not lie in a tail (flat distribution function at that point).
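To see this concretely for a binomial: a quick sketch with X ~ B(50, 0.5) and a 2.5% upper tail (the distribution and helper names are my own choice for illustration):

```python
from math import comb

n = 50

def pmf(k):
    """P(X = k) for X ~ B(50, 0.5)."""
    return comb(n, k) * 0.5**n

def upper_tail(x):
    """P(X >= x): the cumulative area of the upper tail from x."""
    return sum(pmf(k) for k in range(x, n + 1))

# A single value can be below 2.5% without the value lying in the 2.5% tail:
print(pmf(32) < 0.025)         # True  (P(X = 32) ~ 0.016)
print(upper_tail(32) < 0.025)  # False (P(X >= 32) ~ 0.032, so 32 is not in the tail)
```

So P(X = 32) alone looks "small enough", but the cumulative area from 32 upwards exceeds 2.5%, which is exactly the distinction being made above.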
Reply 3
Original post by briteeshbro
It's simply a practical consideration you're missing, I guess. Just think about it in context. Let's use a reasonably well-understood probability: a coin throw has a 50% chance of heads. For simplicity, we'll keep it one-tailed and say it's biased towards heads. Let's say it's thrown 50 times. How many heads do you think would provide a result showing that it's biased towards heads?

For that one-tailed example, if I set the significance level at 5% then the critical region would be X>=32.

Now imagine that a hypothesis test just checked P(X=x): I could set the significance level at 2% and say that the critical value was the smallest x such that P(X=x) < 0.02, and I would still get X>=32.

This would still work as a hypothesis test, no?

If you're thinking 45, then doesn't that also mean that landing each of 46, 47, 48, 49 and 50 heads would also indicate bias? This is why it must be cumulative.

I get that, but I'm not sure it explains why it must be cumulative. Am I wrong that considering P(X=x) would work for the binomial, as I demonstrated above?
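For what it's worth, the two definitions can be compared directly. A quick Python sketch of both criteria for the B(50, 0.5) coin example (the helper names and search ranges are my own):

```python
from math import comb

def pmf(k, n=50, p=0.5):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(x, n=50, p=0.5):
    """P(X >= x)."""
    return sum(pmf(k, n, p) for k in range(x, n + 1))

# Standard test: smallest x above the mean with cumulative tail probability < 5%
crit_cumulative = next(x for x in range(26, 51) if upper_tail(x) < 0.05)

# Proposed redesign: smallest x above the mean with single-value probability < 2%
crit_pointwise = next(x for x in range(26, 51) if pmf(x) < 0.02)

print(crit_cumulative, crit_pointwise)  # 32 32
```

At these particular levels the two criteria happen to pick out the same critical value; that agreement depends on where the thresholds fall, not on the two tests being equivalent.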
Reply 4
Original post by mqb2766
If you were doing a 95% two-tailed test, the tails would have a 2.5% cumulative area each. Unless a single value has a probability > 2.5%, you can't infer whether it lies in a tail without considering the cumulative area. It could be just less than 2.5% and still lie in a tail (steep distribution function at that point), or much less than 2.5% and not lie in a tail (flat distribution function at that point).

I don't really understand what you mean. Does what you say apply to a binomial distribution? I posted an example in my last post of a redesigned hypothesis test where only P(X=x) is considered. Can you give an example where that wouldn't work for a binomial test specifically?
Reply 5
Original post by mqb2766
If you were doing a 95% two-tailed test, the tails would have a 2.5% cumulative area each. Unless a single value has a probability > 2.5%, you can't infer whether it lies in a tail without considering the cumulative area. It could be just less than 2.5% and still lie in a tail (steep distribution function at that point), or much less than 2.5% and not lie in a tail (flat distribution function at that point).

Actually are you saying that two tailed tests wouldn't work as designed if only P(X=x) was considered because the idea of halving the significance level wouldn't make sense?
Reply 6
Original post by 0-)
I don't really understand what you mean. Does what you say apply to a binomial distribution? I posted an example in my last post of a redesigned hypothesis test where only P(X=x) is considered. Can you give an example where that wouldn't work for a binomial test specifically?

Your question has nothing specific to a binomial distribution, so it's equally valid for any distribution: you have to consider the full cumulative probability associated with the tail(s) and/or body, not just a single value/small slice.

So if the test was 95% two-sided, then the tails correspond to a cumulative probability of 2.5% each. You can't infer the value of the total area (cumulative probability) by just looking at the value/area of a small slice. The only way you can is if the value is > 2.5%, in which case you know it's not in the tails.
Reply 7
Original post by mqb2766
Your question has nothing specific to a binomial distribution, so it's equally valid for any distribution: you have to consider the full cumulative probability associated with the tail(s) and/or body, not just a single value/small slice.

So if the test was 95% two-sided, then the tails correspond to a cumulative probability of 2.5% each. You can't infer the value of the total area (cumulative probability) by just looking at the value/area of a small slice. The only way you can is if the value is > 2.5%, in which case you know it's not in the tails.

My question was specific to the binomial distribution but I get why forgetting about all other distributions doesn't really make sense.

Focussing on one-tailed tests for the binomial distribution for now only, would it not be possible to redesign the concept of a hypothesis test by only considering P(X=x) as I showed above? You could still use it to solve real life problems like hypothesis tests do. I haven't thought about how to extend it to two tailed tests yet!

Maybe the answer to my question just comes down to, "that's what a hypothesis test is".
Reply 8
Original post by 0-)
My question was specific to the binomial distribution but I get why forgetting about all other distributions doesn't really make sense.

Focussing on one-tailed tests for the binomial distribution for now only, would it not be possible to redesign the concept of a hypothesis test by only considering P(X=x) as I showed above? You could still use it to solve real life problems like hypothesis tests do. I haven't thought about how to extend it to two tailed tests yet!

Maybe the answer to my question just comes down to, "that's what a hypothesis test is".


No, the only thing that a single value can tell you is that it's not in the tail, such as when p(x=5) > 0.05 for a 95% single-tailed test (it's irrelevant whether it's single-tailed or two-tailed). If p(x=5) < 0.05 then it may be in the tail or not, and you can only tell by considering the cumulative probability up to that point (or greater than that point) and checking whether that area is < 0.05. It depends on the shape of the distribution function up to (or past) that point, as you're testing the cumulative probability.
Reply 9
Original post by mqb2766
No, the only thing that a single value can tell you is that it's not in the tail, such as when p(x=5) > 0.05 for a 95% single-tailed test (it's irrelevant whether it's single-tailed or two-tailed). If p(x=5) < 0.05 then it may be in the tail or not, and you can only tell by considering the cumulative probability up to that point (or greater than that point) and checking whether that area is < 0.05. It depends on the shape of the distribution function up to (or past) that point, as you're testing the cumulative probability.

Say X~B(50,0.5), then E(X) = 25. In my redesigned hypothesis test I could define my "upper tail" by setting the significance level at 2% and the critical value as the smallest value of X above E(X) where P(X=x) < 0.02. Since for x > 25 I know that P(X=x+1) < P(X=x), if I get P(X=x) < 0.02 then I know that x has to be in my "tail".

No?

I agree that this would depend on the shape of the distribution but my question was about the binomial distribution specifically.
Reply 10
Original post by 0-)
Say X~B(50,0.5), then E(X) = 25. In my redesigned hypothesis test I could define my "upper tail" by setting the significance level at 2% and the critical value as the smallest value of X above E(X) where P(X=x) < 0.02. Since for x > 25 I know that P(X=x+1) < P(X=x), if I get P(X=x) < 0.02 then I know that x has to be in my "tail".

No?

I agree that this would depend on the shape of the distribution but my question was about the binomial distribution specifically.


I'm presuming you're talking about (for example) a 98% test with B(50,0.5), so in that case
p(x=32) = 0.016
however it's not in the tail, as
p(x=33) = 0.008, ...
so the cumulative probability is > 0.02. In this case the upper 2% tail would be x>=33, and
P(x>=32) ≈ 0.032
P(x>=33) ≈ 0.016
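Those figures are straightforward to verify numerically; a minimal sketch (helper names mine):

```python
from math import comb

n = 50

def pmf(k):
    """P(X = k) for X ~ B(50, 0.5)."""
    return comb(n, k) * 0.5**n

def tail(x):
    """P(X >= x)."""
    return sum(pmf(k) for k in range(x, n + 1))

print(round(pmf(32), 3), round(pmf(33), 3))    # 0.016 0.009
print(round(tail(32), 3), round(tail(33), 3))  # 0.032 0.016
```

So P(X=32) < 0.02 on its own, yet P(X>=32) > 0.02, which is why the pointwise rule puts 32 in the "tail" while the cumulative 2% tail starts at 33.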
Reply 11
Original post by mqb2766
I'm presuming you're talking about (for example) a 98% test with B(50,0.5), so in that case
p(x=32) = 0.016
however it's not in the tail, as
p(x=33) = 0.008, ...
so the cumulative probability is > 0.02. In this case the upper 2% tail would be x>=33, and
P(x>=32) ≈ 0.032
P(x>=33) ≈ 0.016

I understand this but I don't think you understand my question. But the longer this thread has gone on I've realised it might be a silly question anyway so I'll stop here :smile:
Original post by 0-)
For that one-tailed example, if I set the significance level at 5% then the critical region would be X>=32.

Now imagine that a hypothesis test just checked P(X=x): I could set the significance level at 2% and say that the critical value was the smallest x such that P(X=x) < 0.02, and I would still get X>=32.

This would still work as a hypothesis test, no?


I get that, but I'm not sure it explains why it must be cumulative. Am I wrong that considering P(X=x) would work for the binomial, as I demonstrated above?


Really good point you're making there, but it all comes down to the actual hypothesis test you're performing.
Think about the null and alternative hypotheses. The hypothesis we try to reject is that the coin is fair, so we set H0: p = 0.5; this is the natural working assumption with coin flips. We're testing whether the coin is biased towards heads, so the alternative is H1: p > 0.5. This is really important, because it states that you're looking for evidence that the probability is greater than a certain value. That shows the test about to be performed is inherently cumulative.
Smart question though.
