# Q about continuity correction factor

#1
Hello everyone (:

Let X be a discrete RV with X ~ B(n, p), and let Y be a continuous RV with Y ~ N(np, np(1-p)). Then approximating P(X > 4) for the binomially distributed X by the normal distribution, using a continuity correction, would give P(Y > 4.5), right? But shouldn't it be P(Y ≥ 4.5) (since Y = 4.5 would give 5 when rounding to the nearest integer, and 5 > 4)? Is "≥" not used simply because in a normal distribution (or any continuous distribution, for that matter) P(Y = k) = 0, since Y is a continuous RV?

Secondly, will X ≈ Y ~ N(np, np(1-p)) be called 'approximating a normal to the binomial' or 'approximating a binomial to the normal'? I know what it means but I'm just not sure which one of these is the correct phrase to use.

Also, could you please explain why normal approximations are needed? Calculating the probability using a binomial distribution is not one bit difficult, even if n is large. And won't the binomial distribution give the most accurate answer? Doesn't using a normal distribution to approximate a binomial reduce the accuracy of the result (even after applying a continuity correction) compared to using the binomial distribution directly?

Thank you (:

7 years ago
#2
(Original post by Sidhant Shivram)
Is "≥" not used simply because in a normal distribution (or any continuous distribution for that matter) P(Y=k) = 0 since Y is a continuous RV.?
You're right, here. "Greater than or equal to" is essentially equivalent to "greater than" in a continuous distribution, since the probability of equality is zero.
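
In symbols, the reasoning can be sketched as follows. Here f_Y denotes the density of Y; that symbol is notation introduced for this sketch and does not appear in the posts above.

$$
P(Y \ge 4.5) = P(Y > 4.5) + P(Y = 4.5),
\qquad
P(Y = 4.5) = \int_{4.5}^{4.5} f_Y(y)\,dy = 0 .
$$

So the two tail probabilities agree for any continuous Y, whereas for the discrete X the term P(X = k) is generally not zero, which is why ">" and "≥" must be kept apart before the correction is applied.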

(Original post by Sidhant Shivram)
Secondly, will X ≈ Y ~ N(np, np(1-p)) be called 'approximating a normal to the binomial' or 'approximating a binomial to the normal'? I know what it means but I'm just not sure which one of these is the correct phrase to use.
That's an interesting question, but I think that the clearest possible phrase would be: "using a normal distribution as an approximation to a binomial distribution". Always best to be explicit, I think.

(Original post by Sidhant Shivram)
Also, could you please explain why normal approximations are needed? Calculating the probability using a binomial distribution is not one bit difficult, even if n is large. And won't the binomial distribution give the most accurate answer? Doesn't using a normal distribution to approximate a binomial reduce the accuracy of the result (even after applying a continuity correction) compared to using the binomial distribution directly?
Obviously, since it is an approximation, it is going to be less accurate than using the true distribution, but nevertheless, for large n, and p close to 0.5, the normal can be extremely accurate. The drop in accuracy, I think, is a fair price to pay for the work saved. The calculations needed for a binomial distribution become extremely tedious as n gets big.
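
A rough way to see the effect of p on accuracy is to compare the exact binomial CDF with the continuity-corrected normal CDF and record the worst gap. This is only a sketch using the standard library; the values n = 100, p = 0.5 and p = 0.05 are illustrative assumptions, not taken from the thread, and the helper names are made up for this example.

```python
import math

def normal_cdf(x, mean, var):
    """P(Y <= x) for Y ~ N(mean, var), via the complementary error function."""
    return 0.5 * math.erfc(-(x - mean) / math.sqrt(2 * var))

def max_cdf_error(n, p):
    """Largest |P(X <= k) - P(Y <= k + 0.5)| over k = 0..n,
    where X ~ B(n, p) and Y ~ N(np, np(1-p))."""
    mean, var = n * p, n * p * (1 - p)
    cdf = 0.0
    worst = 0.0
    for k in range(n + 1):
        cdf += math.comb(n, k) * p**k * (1 - p) ** (n - k)  # exact P(X <= k)
        worst = max(worst, abs(cdf - normal_cdf(k + 0.5, mean, var)))
    return worst

# Illustrative (assumed) parameters: same n, one p near 0.5 and one far from it.
err_centred = max_cdf_error(100, 0.5)
err_skewed = max_cdf_error(100, 0.05)
print(f"p = 0.50: worst CDF error {err_centred:.5f}")
print(f"p = 0.05: worst CDF error {err_skewed:.5f}")
```

With these assumed values the worst error for p = 0.5 should come out noticeably smaller than for p = 0.05, which matches the point about p being close to 0.5.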
7 years ago
#3
(Original post by Sidhant Shivram)
Let X be a discrete RV with X ~ B(n, p), and let Y be a continuous RV with Y ~ N(np, np(1-p)). Then approximating P(X > 4) for the binomially distributed X by the normal distribution, using a continuity correction, would give P(Y > 4.5), right? But shouldn't it be P(Y ≥ 4.5) (since Y = 4.5 would give 5 when rounding to the nearest integer, and 5 > 4)? Is "≥" not used simply because in a normal distribution (or any continuous distribution, for that matter) P(Y = k) = 0, since Y is a continuous RV?
Under the normal distribution, probability(Y>4.5) = probability(Y>=4.5). This is because probability(Y=4.5) is zero, because it's continuous, as you say. Therefore it doesn't matter which you pick.
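
To see the continuity correction at work numerically, here is a small sketch that compares the exact P(X > 4) with P(Y > 4.5) and with the uncorrected P(Y > 4). The parameters n = 20 and p = 0.5 are assumptions chosen only for illustration (the thread leaves n and p general), and the function names are hypothetical.

```python
import math

def binom_tail_gt(k, n, p):
    """Exact P(X > k) for X ~ B(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j)
               for j in range(k + 1, n + 1))

def normal_tail_gt(x, mean, var):
    """P(Y > x) for Y ~ N(mean, var); the same value as P(Y >= x)."""
    z = (x - mean) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2))

n, p = 20, 0.5                      # assumed values, not from the thread
mean, var = n * p, n * p * (1 - p)

exact = binom_tail_gt(4, n, p)                 # P(X > 4) = P(X >= 5)
corrected = normal_tail_gt(4.5, mean, var)     # with continuity correction
uncorrected = normal_tail_gt(4.0, mean, var)   # without it

print(f"exact P(X > 4)       = {exact:.5f}")
print(f"normal P(Y > 4.5)    = {corrected:.5f}")
print(f"normal P(Y > 4.0)    = {uncorrected:.5f}")
```

For these assumed values the corrected figure lands closer to the exact one than the uncorrected figure does.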

Secondly, will X ≈ Y ~ N(np, np(1-p)) be called 'approximating a normal to the binomial' or 'approximating a binomial to the normal'? I know what it means but I'm just not sure which one of these is the correct phrase to use.
I would call it "approximating the binomial X by the normal Y". That's unambiguous - I'm not quite sure which of your two options is better, suggesting that they're both sufficiently ambiguous to be confusing.

Also, could you please explain why normal approximations are needed? Calculating the probability using a binomial distribution is not one bit difficult, even if n is large. And won't the binomial distribution give the most accurate answer? Doesn't using a normal distribution to approximate a binomial reduce the accuracy of the result (even after applying a continuity correction) compared to using the binomial distribution directly?
Yes, it does reduce the accuracy, but the key point is that if n is really big (say, 2000) you start getting massive numbers cancelled out by really small numbers. For example, to find the probability that X = 1000 under Bin(2000, 1/4), you calculate 2000 choose 1000, which is 601 digits long, and (1/4)^1000 and (3/4)^1000, which have 603 and 125 zeros at the front of their decimal expansions respectively. Those numbers are *far* too large or too small for your calculator to handle.

However, using a normal approximation N(500, 375), the numbers become actually doable (and, even, quite easy) on a calculator - try it!

EDIT: Bad example - the probabilities are still tiny here (the normal approximation puts P(X = 1000) in the region of 10^-147). If I were better at coming up with examples, this one would have worked perfectly. Either way, you wouldn't want to sum a thousand of these terms to find the probability that X > 1000.
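
For anyone who wants to check the numbers in this example without a calculator overflowing, one option is to work with logarithms throughout. The sketch below uses `math.lgamma` for the log of the binomial coefficient; the function names are made up for this illustration, and the normal density at 1000 is used as a stand-in for the continuity-corrected P(999.5 < Y < 1000.5), which is an assumption that is reasonable because the interval has width 1.

```python
import math

def log10_binom_pmf(k, n, p):
    """log10 of P(X = k) for X ~ B(n, p), without forming huge or tiny numbers."""
    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))          # ln C(n, k)
    log_pmf = log_choose + k * math.log(p) + (n - k) * math.log(1 - p)
    return log_pmf / math.log(10)

def log10_normal_pdf(x, mean, var):
    """log10 of the N(mean, var) density at x."""
    log_pdf = -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
    return log_pdf / math.log(10)

n, p, k = 2000, 0.25, 1000
exact_log10 = log10_binom_pmf(k, n, p)
normal_log10 = log10_normal_pdf(k, n * p, n * p * (1 - p))   # N(500, 375)

print(f"log10 of exact P(X = 1000)     = {exact_log10:.2f}")
print(f"log10 of normal density at 1000 = {normal_log10:.2f}")
```

Run this way, the exact binomial value comes out at roughly 10^-127, while the N(500, 375) density gives roughly 10^-147. So the 10^-147 figure belongs to the normal approximation, and this far into the tail (about 26 standard deviations from the mean) the approximation is off by many orders of magnitude, even though both numbers are negligibly small.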
#4
(Original post by StrangeBanana)
Yes, for the 2nd question, your answer is the more 'formal' of the 2 options (this is what I use as well, though the other option makes more sense to me :P).

Thank you for your post (:

(Original post by Smaug123)
Your answer (to that 2nd question) actually makes more sense than the other option, but for some reason most textbooks use the other option...

No worries there, I got exactly what you tried to say using that example.

Thanks a lot (: