The Student Room Group

Hypothesis Testing struggles - please help

Hi there,

I'm having real trouble with my statistics module and just cannot get my head around hypothesis testing at the moment. Here is the problem I'm working on:


A timber merchant has two machines producing mouldings: 60% by Machine 1 and 40% by Machine 2. The proportions of substandard mouldings produced by the machines are 2% and 12% respectively. A moulding was produced, but the merchant has forgotten to label which machine it came from. When it was tested, it was found to be substandard. The merchant is interested in which machine the moulding came from.


I'm asked to use ideal hypothesis testing to analyse the data and draw conclusions.


My thinking:


Null hypothesis (Ho) is that the sub-standard moulding came from machine 2


Alternative hypothesis (H1) is that the moulding came from machine 1.


To do this problem I need P(Ho), P(H1), P(Data|Ho) and P(Data|H1).


At this point I'm so confused as to how to find these 4 probabilities and whether my null and alternative hypotheses are even correct. I'm severely second guessing myself and this is limiting my progress.


I would love some help and advice if there's anyone who can lend a helping hand.


Thinking out loud, I feel that P(Ho) is 0.6*0.02=0.012 (the proportion of total production that is sub-standard and from that machine) and P(H1) is 0.4*0.12=0.048. But then I have already used the data in the question, so are these probabilities actually something else, like P(Data|H1)?


I have no idea what "ideal hypothesis testing" is, but this looks like a straightforward application of Bayes' theorem. You know the conditional probabilities P(Faulty | Machine 1) and P(Faulty | Machine 2), and you also know the prior probabilities P(Machine 1) and P(Machine 2). You want to work out P(Machine 1 | Faulty) and P(Machine 2 | Faulty).
Original post by Gregorius
I have no idea what "ideal hypothesis testing" is.



I'd looked up "ideal hypothesis testing", couldn't find anything remotely close that seemed relevant, and concluded "best left for the experts".

So would my null hypothesis be P(Machine 2 is faulty) as that seems the more likely of the 2 initially and my alternative hypothesis be P(Machine 1 is faulty)? These would be 0.12 and 0.02 respectively.


Would I be right in saying P(Faulty|M1) being 0.6*0.02 and P(Faulty|M2) being 0.4*0.12?


How would I go on from there to find P(M1|faulty) and P(M2|faulty)? Would that involve a partition?


Really appreciate the help so far!!!
Original post by Bameron
So would my null hypothesis be P(Machine 2 is faulty) as that seems the more likely of the 2 initially and my alternative hypothesis be P(Machine 1 is faulty)? These would be 0.12 and 0.02 respectively.

Again, I'm going to duck the question of "hypothesis testing" here as this is an exercise in pure probability (there's no statistics or hypothesis testing involved, in the classical sense).

Let "M1" be the statement that the item came from machine M1 and "M2" be the statement that the item came from machine M2.
Let "F" be the statement that a particular item is faulty. We know the following probabilities:

P(M1) = 0.6
P(M2) = 0.4

and the following conditional probabilities:

P(F | M1) = 0.02
P(F | M2) = 0.12

and we want to work out the conditional probabilities:

P(M1 | F)
P(M2 | F) = 1 - P(M1 | F)

To do that, you need to apply Bayes' theorem.


Would I be right in saying P(Faulty|M1) being 0.6*0.02 and P(Faulty|M2) being 0.4*0.12?


No, see above.
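To spell out the difference: 0.6*0.02 and 0.4*0.12 are joint probabilities, P(F and M1) and P(F and M2), not the conditionals P(F | M1) and P(F | M2), which are just the 0.02 and 0.12 given in the question. A minimal Python sketch of that distinction follows; the variable names are my own and purely illustrative, and exact fractions are used only to avoid floating-point noise.

```python
from fractions import Fraction

# Given in the question (names are illustrative, not from the problem text)
p_m1 = Fraction(60, 100)          # P(M1): share of production from machine 1
p_m2 = Fraction(40, 100)          # P(M2): share of production from machine 2
p_f_given_m1 = Fraction(2, 100)   # P(F | M1)
p_f_given_m2 = Fraction(12, 100)  # P(F | M2)

# Joint probabilities: faulty AND from a given machine
p_f_and_m1 = p_f_given_m1 * p_m1  # 0.012
p_f_and_m2 = p_f_given_m2 * p_m2  # 0.048

# Total probability that a moulding is faulty, summing over both machines
p_f = p_f_and_m1 + p_f_and_m2     # 0.06

print(float(p_f_and_m1), float(p_f_and_m2), float(p_f))
```

So the two numbers in the question above are still useful: they are the pieces that go into Bayes' theorem, just under a different name.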


How would I go on from there to find P(M1|faulty) and P(M2|faulty)? Would that involve a partition?


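Yes. M1 and M2 partition the possibilities, so the denominator in Bayes' theorem is the total probability of F summed over that partition:

P(M1 | F) = P(F | M1) x P(M1) / (P(F | M1) x P(M1) + P(F | M2) x P(M2))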
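
If it helps to check the arithmetic, here is a rough Python sketch of that formula applied to the numbers in the question. The function name `posterior` and the dictionary layout are my own assumptions for illustration, not part of the problem or of any particular library.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Apply Bayes' theorem: return P(hypothesis | data) for each hypothesis.

    priors[h] is P(h); likelihoods[h] is P(data | h).
    The hypotheses are assumed to be mutually exclusive and exhaustive.
    """
    # Numerators: P(data | h) * P(h) for each hypothesis
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    # Denominator: total probability of the data over all hypotheses
    evidence = sum(joint.values())
    return {h: joint[h] / evidence for h in joint}

priors = {"M1": Fraction(60, 100), "M2": Fraction(40, 100)}
likelihoods = {"M1": Fraction(2, 100), "M2": Fraction(12, 100)}  # P(F | machine)

post = posterior(priors, likelihoods)
print({h: float(p) for h, p in post.items()})  # {'M1': 0.2, 'M2': 0.8}
```

With these figures, P(M1 | F) = 0.012 / 0.06 = 0.2 and P(M2 | F) = 0.8, so a substandard moulding is four times as likely to have come from Machine 2.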
