# June 2018 Stats

Hi, just doing the 2018 paper. The pure section was fine, but I’ve completely flopped this stats paper, which is a shame, as Stats is usually the thing I do best in. Literally had no clue for Question 1: the mark scheme has C from 0 to 8 (with a probability of 1/9 for each), but I’m really confused about where you even get those numbers from, as there’s literally nothing in the question?
In the large data set (Edexcel), cloud cover is measured in oktas, which take integer values from 0 to 8, hence the range 0 to 8. That gives nine possible values, and the mark scheme's model treats them as equally likely, so each has probability 1/9. The large data set is often neglected by students, so I'd imagine a lot of other students in 2018 were in the same situation as you. I hope this helps.
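A quick way to see where the 1/9 comes from is to list the values and share the probability out equally. This is only a sketch of the equally-likely model from the mark scheme; the variable names are made up for illustration.

```python
from fractions import Fraction

# Oktas are whole numbers from 0 to 8, so there are nine possible values.
oktas = range(0, 9)

# Equally-likely (discrete uniform) model: every value gets the same share.
uniform_model = {c: Fraction(1, len(oktas)) for c in oktas}

# Sanity check: the probabilities of a distribution must add up to 1.
total = sum(uniform_model.values())

for c, p in uniform_model.items():
    print(f"P(C = {c}) = {p}")
print("total =", total)
```

Note that it is 1/9 and not 1/8, because 0 counts as one of the values.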
Ah, thanks a lot man. I’ve tried a lot trying to memorise the large data set - just doesn’t come natural to me, so yeah I messed up that question. It was really weird because I think I got 43/44 on the 2019 paper and all the other papers I did, I was averaging around 40-45 and then this paper I got 20s. So idk what to make of it

I’m having lots of trouble with this question as well ngl. Parts A and D are fine, but I’m just not grasping B and C atm.
For b) how do you represent the 4 batteries working and the extra 4 hours compared to a)? Note the question asks about the remaining exam, so really you have to condition on the first 16 hours being ok.
For c) you'd have to model two of the original batteries lasting to 20 hours (given they've already lasted 16) and two new batteries lasting the final 4 hours.
I’ve managed to get the answer .... but a conditional probability in a normal distribution question - genuinely never seen that. Also, I was a bit confused on why we’d find out the probability of X>20 and X>16, instead of X=20 and X=16. I know In question 1 it asks for the probability that X>16, so I’m just wondering whether it’s to do with that ? Because when you’re actually reading the question, it says she has used her calc for 16 hours, not more, and then it claims she’s got another 4 hours, which would take us to bang on 20 hours, so not really sure why we assume > and use the cumulative distribution instead of assuming = and using the PD.

It is a CDF question, as many A level questions involving the normal distribution are.
You want to know whether the lifetime is > 16 (exam 1) or > 20 (exam 2). These are ranges, so they are CDF questions: every battery eventually fails, so the CDF goes from 0 to 1 as X increases. The value of the pdf at a point only tells you the relative likelihood (density) there. To get a probability you need the area under the density curve over a range, and that is what the CDF gives you.

You've obviously worked out how to split 0->20 into 0->16 and 16->20 and use the joint / conditional probability to get the exam 2 info the question asked for. Let me know if you're still unsure (maybe post what you've done)?
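For anyone following along, here is a rough sketch of the a) and b) working using the CDF. The thread never states the distribution, so the mean of 18 and standard deviation of 4 below are an assumption (they are consistent with the rounded values quoted later in the thread, but check them against the paper). The function and variable names are made up.

```python
from statistics import NormalDist

# ASSUMPTION: battery lifetime L ~ N(18, 4^2). Not stated in this thread,
# so check the mean and standard deviation against the actual question.
lifetime = NormalDist(mu=18, sigma=4)


def prob_longer_than(hours):
    """P(L > hours), i.e. one minus the CDF at that point."""
    return 1 - lifetime.cdf(hours)


# Part a): one battery lasts longer than 16 hours.
p_over_16 = prob_longer_than(16)
p_over_20 = prob_longer_than(20)

# Part b): condition on the battery having already lasted 16 hours.
# L > 20 implies L > 16, so P(L > 20 and L > 16) is just P(L > 20).
p_one_survives = p_over_20 / p_over_16

# All four batteries have to survive, and they are treated as independent.
p_all_four = p_one_survives ** 4

print(round(p_over_16, 4), round(p_one_survives, 4), round(p_all_four, 4))
```

The key step is the division: the area beyond 20 is taken as a fraction of the area beyond 16, which is the conditional probability.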
So should I always use the CDF function, unless the question specifically asks ‘X=5’ or something like that ? Because actually now that we’re talking about this, I don’t think I’ve ever used the Normal PD function on my calculator ? Is that not on the Maths spec or something

It's a lot more usual to use the cdf and talk about ranges than it is to evaluate the pdf for a specific value.

Note, in a normal distribution, the probability that X=5 (for instance) is zero. It's a density function on the real axis, so the probability that the variable takes exactly that value is zero. It does make sense to talk about ranges, though, and that's where the CDF (area under the density curve) comes in.
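One way to convince yourself of this is to shrink a range around 5 and watch the area go to zero, even though the density at 5 is not zero. The N(5, 2^2) below is a made-up distribution purely for illustration.

```python
from statistics import NormalDist

# ASSUMPTION: a hypothetical N(5, 2^2), chosen only to illustrate the idea.
X = NormalDist(mu=5, sigma=2)

# The pdf value is the height of the curve at 5, not a probability.
density_at_5 = X.pdf(5)
print("density at 5:", density_at_5)

# Probability of landing within +/- half_width of 5, from the CDF.
half_widths = [0.5, 0.05, 0.005, 0.0005]
areas = [X.cdf(5 + h) - X.cdf(5 - h) for h in half_widths]

for h, area in zip(half_widths, areas):
    print(f"P({5 - h} < X < {5 + h}) = {area:.6f}")
```

As the range narrows towards the single point X=5, the area keeps shrinking, which is why the probability of exactly 5 is zero.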
By the way, what do you reckon the likelihood is that the stats and mechanics paper in a couple of weeks is gonna be as tough as the 2018 one ? I found the 2019 one very straightforward for the most part, especially compared to 2018, so I’m just wondering if it’s likely to be extra tough this year, considering how easy it was last year.

It's not my area, so can't really say - sorry. I take it you're doing the "resits"?
If there are not too many examples of your exam board, maybe do the other boards as well to get some robustness about the type of questions that may be asked.
Yeah mate, I’m on for the October resits. Thanks anyway man, you’ve been a big help

I thought when they ask for x=5 for instance you use normal pd, as x =5.

For a discrete probability (mass) function, then yes.

For a continuous probability density function (like a normal distribution), the probability that the variable takes an exact value, say X=5.3453..., is zero. The probability density function tells you the probability of a range, say 5<X<5.5, by finding the area (probability) under the pdf curve for that interval.
https://en.m.wikipedia.org/wiki/Probability_mass_function
https://en.m.wikipedia.org/wiki/Probability_density_function
That's why you use the cdf a lot.
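To put the two cases side by side, here is a small sketch. Both distributions, B(10, 0.5) and N(5, 2^2), are made up for illustration and are not from any exam question.

```python
from math import comb
from statistics import NormalDist

# Discrete case: ASSUMPTION of a hypothetical X ~ B(10, 0.5).
# The mass function gives a genuine probability for an exact value.
n, p = 10, 0.5
p_exactly_5 = comb(n, 5) * p**5 * (1 - p) ** (n - 5)

# Continuous case: ASSUMPTION of a hypothetical Y ~ N(5, 2^2).
# Only ranges have probability, found as a difference of CDF values.
Y = NormalDist(mu=5, sigma=2)
p_range = Y.cdf(5.5) - Y.cdf(5)

print("binomial P(X = 5):", p_exactly_5)
print("normal P(5 < Y < 5.5):", round(p_range, 4))
```

So "X=5" is a mass function job for a discrete distribution, while for a normal distribution you turn the question into a range and use the cdf.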
Sorry to drag up an old thread

For part c)
Why do you not consider the combinations of 2 old and 2 new batteries? So find (p(L>4))^2 x 0.189^2 x4C2 ??

Similar to b), you'd want
P(L>20 | L>16)^2 * P(L>4)^2
So I'm not sure what the 0.189^2 term is, as P(L>4)≈1, P(L>16)≈0.7 and P(L>20)≈0.3.

You don't need an nCr term. Imagine it as a tree with splits on battery 1, then battery 2, then battery 3, then battery 4. Whichever two batteries are randomly selected for replacement, the "all working" branch contains two old-battery factors and two new-battery factors, so the "all working" leaf corresponds to the multiplication / joint probability above.
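A rough check of the part c) argument in code: it works out the product directly, then averages over all six ways of choosing which two batteries get replaced to show the nCr factor cancels. As before, L ~ N(18, 4^2) is an assumption not stated in this thread, and the names are made up.

```python
from itertools import combinations
from math import prod
from statistics import NormalDist

# ASSUMPTION: battery lifetime L ~ N(18, 4^2); check against the paper.
lifetime = NormalDist(mu=18, sigma=4)


def prob_longer_than(hours):
    """P(L > hours)."""
    return 1 - lifetime.cdf(hours)


# Old battery: has lasted 16 hours, needs to reach 20.
p_old = prob_longer_than(20) / prob_longer_than(16)
# New battery: only needs to last the final 4 hours.
p_new = prob_longer_than(4)

# Direct answer: two old batteries and two new ones all keep working.
direct = p_old**2 * p_new**2

# Tree view: each of the 4C2 = 6 choices of which pair is replaced has
# probability 1/6 and gives the same "all working" product.
slots = range(4)
choices = list(combinations(slots, 2))
averaged = sum(
    (1 / len(choices)) * prod(p_new if s in replaced else p_old for s in slots)
    for replaced in choices
)

print(round(direct, 3), round(averaged, 3))
```

The six branches each carry weight 1/6, so multiplying by 4C2 would count the same outcome six times over.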