# Q about continuity correction factor

Hello everyone (:

Let X be a discrete RV with X ~ B(n, p), and let Y be a continuous RV with Y ~ N(np, np(1-p)). Then approximating P(X>4) for the binomially distributed X by the normal distribution, using a continuity correction factor, would give P(Y>4.5), right? But shouldn't it be P(Y≥4.5) (cos Y = 4.5 would give 5 when rounding to the nearest integer, and 5 > 4)? Is "≥" not used simply because in a normal distribution (or any continuous distribution, for that matter) P(Y=k) = 0, since Y is a continuous RV?

Secondly, will X ≈ Y ~ N(np, np(1-p)) be called 'approximating a normal to the binomial' or 'approximating a binomial to the normal'? I know what it means, but I'm just not sure which one of these is the correct phrase to use.

Also, could you please explain to me why normal approximations are needed? I mean, in any case, calculating the probability by using a binomial distribution is not one bit difficult, even if n is large. Also, will this not be the most accurate answer (when a binomial distribution is used)? Doesn't approximating a binomial by a normal distribution reduce the accuracy of our outcome (even after using continuity correction factors) compared to using a binomial distribution directly?

Thank you (:

#2

(Original post by **Sidhant Shivram**) Is "≥" not used simply because in a normal distribution (or any continuous distribution, for that matter) P(Y=k) = 0, since Y is a continuous RV?

You're right, here. "Greater than or equal to" is essentially equivalent to "greater than" in a continuous distribution, since the probability of equality is zero.
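Written out as one worked line, the continuity correction from the question would look roughly like this. It is only a sketch: Φ is assumed to denote the standard normal CDF, a symbol that does not appear elsewhere in the thread.

```latex
\[
P(X > 4) = P(X \ge 5) \approx P(Y > 4.5) = P(Y \ge 4.5)
         = 1 - \Phi\!\left(\frac{4.5 - np}{\sqrt{np(1-p)}}\right)
\]
```

The middle equality is the point being asked about: since P(Y = 4.5) = 0, the strict and non-strict inequalities give the same value.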

(Original post by **Sidhant Shivram**) Secondly, will X ≈ Y ~ N(np, np(1-p)) be called 'approximating a normal to the binomial' or 'approximating a binomial to the normal'? I know what it means, but I'm just not sure which one of these is the correct phrase to use.

That's an interesting question, but I think that the clearest possible phrase would be: "using a normal distribution as an approximation to a binomial distribution". Always best to be explicit, I think.

(Original post by **Sidhant Shivram**) Also, could you please explain to me why normal approximations are needed? I mean, in any case, calculating the probability by using a binomial distribution is not one bit difficult, even if n is large. Also, will this not be the most accurate answer (when a binomial distribution is used)? Doesn't approximating a binomial by a normal distribution reduce the accuracy of our outcome (even after using continuity correction factors) compared to using a binomial distribution directly?

Obviously, since it is an approximation, it is going to be less accurate than using the true distribution, but nevertheless, for large n, and p close to 0.5, the normal can be extremely accurate. The drop in accuracy, I think, is a fair price to pay for the work saved. The calculations needed for a binomial distribution become extremely tedious as n gets big.
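To get a feel for how much accuracy is actually lost, here is a minimal Python sketch that compares the exact binomial tail P(X>4) with the continuity-corrected normal value P(Y>4.5). The values n = 20 and p = 0.5 and the helper names `binom_tail_gt` and `normal_tail_gt` are my own illustrative choices, not anything from the thread.

```python
import math

def binom_tail_gt(n, p, k):
    """Exact P(X > k) for X ~ B(n, p), summing the pmf term by term."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k + 1, n + 1))

def normal_tail_gt(n, p, k):
    """Normal approximation to P(X > k) with continuity correction: P(Y > k + 0.5)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    z = (k + 0.5 - mean) / sd
    # Standard normal upper tail 1 - Phi(z), via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))

n, p, k = 20, 0.5, 4   # illustrative values, not from the thread
exact = binom_tail_gt(n, p, k)
approx = normal_tail_gt(n, p, k)
print(f"exact  P(X>4)   = {exact:.5f}")
print(f"normal P(Y>4.5) = {approx:.5f}")
print(f"difference      = {abs(exact - approx):.5f}")
```

With these particular values the two results should agree to about two decimal places; other choices of n and p will behave differently.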

#3

(Original post by **Sidhant Shivram**) Let X be a discrete RV with X ~ B(n, p), and let Y be a continuous RV with Y ~ N(np, np(1-p)). Then approximating P(X>4) for the binomially distributed X by the normal distribution, using a continuity correction factor, would give P(Y>4.5), right? But shouldn't it be P(Y≥4.5) (cos Y = 4.5 would give 5 when rounding to the nearest integer, and 5 > 4)? Is "≥" not used simply because in a normal distribution (or any continuous distribution, for that matter) P(Y=k) = 0, since Y is a continuous RV?

Secondly, will X ≈ Y ~ N(np, np(1-p)) be called 'approximating a normal to the binomial' or 'approximating a binomial to the normal'? I know what it means, but I'm just not sure which one of these is the correct phrase to use.

Also, could you please explain to me why normal approximations are needed? I mean, in any case, calculating the probability by using a binomial distribution is not one bit difficult, even if n is large. Also, will this not be the most accurate answer (when a binomial distribution is used)? Doesn't approximating a binomial by a normal distribution reduce the accuracy of our outcome (even after using continuity correction factors) compared to using a binomial distribution directly?

Under the normal distribution, probability(Y>4.5) = probability(Y>=4.5). This is because probability(Y=4.5) is zero, because it's continuous, as you say. Therefore it doesn't matter which you pick.

I would call it "approximating the binomial X by the normal Y". That's unambiguous - I'm not quite sure which of your two options is better, suggesting that they're both sufficiently ambiguous to be confusing.

Yes, it does reduce the accuracy, but the key point is that if n is really big (say, 200) you start getting massive numbers cancelled out by really small numbers. Like for the probability that X = 1000 under Bin(2000, 1/4), you calculate 2000 choose 1000, which is 601 digits long, and (1/4)^1000 and (3/4)^1000 which have 603 and 125 zeros at the front of their decimal expansions respectively. Those numbers are *far* too small for your calculator to do.

However, using a normal approximation N(500, 375), the numbers become actually doable (and, even, quite easy) on a calculator - try it!

EDIT: Bad example - the probabilities are still in the region of 10^-147 here. If I were better at coming up with examples, the example would have worked perfectly and you wouldn't want to sum a thousand of these to find the probability that x>1000.
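For anyone who wants to see the size of these numbers on a computer rather than a calculator, here is a rough Python sketch of the Bin(2000, 1/4) example. It works in logarithms so that nothing overflows or underflows. The helper names `log10_binom_pmf` and `log10_normal_pdf` are my own, and comparing the pmf at 1000 with the N(500, 375) density at 1000 is my assumption about what "try it" refers to.

```python
import math

n, p, k = 2000, 0.25, 1000   # the Bin(2000, 1/4) example from the post

# Naive floating point underflows: (1/4)^1000 is far below the smallest double.
print(0.25 ** 1000)

# The binomial coefficient is fine as an exact integer, just very long.
print(len(str(math.comb(n, k))), "digits in 2000 choose 1000")

def log10_binom_pmf(n, p, k):
    """log10 of P(X = k) for X ~ B(n, p), using log-gamma to avoid overflow."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return (log_coeff + k * math.log(p) + (n - k) * math.log(1 - p)) / math.log(10)

def log10_normal_pdf(x, mean, var):
    """log10 of the N(mean, var) density evaluated at x."""
    return (-0.5 * math.log(2 * math.pi * var)
            - (x - mean) ** 2 / (2 * var)) / math.log(10)

exact_log10 = log10_binom_pmf(n, p, k)
normal_log10 = log10_normal_pdf(k, n * p, n * p * (1 - p))
print(f"log10 of exact binomial P(X=1000)      ~ {exact_log10:.1f}")
print(f"log10 of N(500, 375) density at x=1000 ~ {normal_log10:.1f}")
```

Working with logs sidesteps the huge and tiny intermediate values described above. Note that this far out in the tail the exact value and the normal density can differ by many orders of magnitude, so the approximation is best trusted near the mean.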

(Original post by **StrangeBanana**) You're right, here. "Greater than or equal to" is essentially equivalent to "greater than" in a continuous distribution, since the probability of equality is zero.

That's an interesting question, but I think that the clearest possible phrase would be: "using a normal distribution as an approximation to a binomial distribution". Always best to be explicit, I think.

Obviously, since it is an approximation, it is going to be less accurate than using the true distribution, but nevertheless, for large n, and p close to 0.5, the normal can be extremely accurate. The drop in accuracy, I think, is a fair price to pay for the work saved. The calculations needed for a binomial distribution become extremely tedious as n gets big.

Thank you for your post (:

(Original post by **Smaug123**) Under the normal distribution, probability(Y>4.5) = probability(Y>=4.5). This is because probability(Y=4.5) is zero, because it's continuous, as you say. Therefore it doesn't matter which you pick.

I would call it "approximating the binomial X by the normal Y". That's unambiguous - I'm not quite sure which of your two options is better, suggesting that they're both sufficiently ambiguous to be confusing.

Yes, it does reduce the accuracy, but the key point is that if n is really big (say, 200) you start getting massive numbers cancelled out by really small numbers. Like for the probability that X = 1000 under Bin(2000, 1/4), you calculate 2000 choose 1000, which is 601 digits long, and (1/4)^1000 and (3/4)^1000 which have 603 and 125 zeros at the front of their decimal expansions respectively. Those numbers are *far* too small for your calculator to do.

However, using a normal approximation N(500, 375), the numbers become actually doable (and, even, quite easy) on a calculator - try it!

EDIT: Bad example - the probabilities are still in the region of 10^-147 here. If I were better at coming up with examples, the example would have worked perfectly and you wouldn't want to sum a thousand of these to find the probability that x>1000.

No worries there, I got exactly what you tried to say using that example.

Thanks a lot (:
