Confidence interval help needed

    pineapplechemist (Thread Starter):
    Let θ denote the true ability of a student in a certain subject, representing the probability of a student answering an arbitrary question correctly, and let X1, ..., Xn be the marks for the n questions the student answered, each taking the value 0 or 1. We model them as n i.i.d. Bernoulli(θ) random variables. Then we approximate the student's ability by its ML estimate.

    a) We can approximate the distribution of

    (θ(X) − θ) / (θ(X)(1 − θ(X))/n)^0.5


    (Where θ(X) is the Maximum likelihood estimate)

    by a standard Gaussian N(0,1) distribution, under distribution P0. Using this result, construct the 95% confidence interval for θ.

    b) How big should n be so that we can estimate the true ability within 0.01 marks, with probability at least 95%? Justify your answer.

    I have no idea where to start here. Why are we approximating that scary-looking distribution in the first place, and what does it mean by 'under distribution P0'? How do we start to construct an interval?
    Gregorius:
    What it is saying here is that, if \hat{\theta} is the maximum likelihood estimator of \theta, then

    \displaystyle \frac{\hat{\theta} - \theta}{\sqrt{\frac{\hat{\theta}\left(1 - \hat{\theta}\right)}{n}}}

    is approximately distributed as a standard normal variable. I've changed your notation of \theta(X) to the more common \hat{\theta}. The point here is that \hat{\theta} is your estimate of \theta and

    \displaystyle \sqrt{\frac{\hat{\theta}\left(1 - \hat{\theta}\right)}{n}}

    is an estimate of its standard error. You can now use these facts to construct the 95% confidence interval in the usual way, that is, \hat{\theta} plus or minus 1.96 times that standard error.

    The terminology "under distribution P0" is odd, and appears to be superfluous.
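
The recipe above (estimate plus or minus 1.96 standard errors) can be sketched in a few lines of Python. This is only an illustration: the function name `wald_interval` and the example data (70 correct answers out of 100) are made up here and do not come from the question.

```python
import math

def wald_interval(marks, z=1.96):
    """Approximate 95% confidence interval for a Bernoulli parameter.

    marks: list of 0/1 results; z: standard normal quantile (1.96 for 95%).
    """
    n = len(marks)
    theta_hat = sum(marks) / n  # ML estimate: proportion of correct answers
    se = math.sqrt(theta_hat * (1 - theta_hat) / n)  # estimated standard error
    return theta_hat - z * se, theta_hat + z * se

# Hypothetical data: 70 correct answers out of 100 questions
marks = [1] * 70 + [0] * 30
lower, upper = wald_interval(marks)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

Because the interval relies on the normal approximation, it is usually trusted only when n is reasonably large and the estimate is not too close to 0 or 1.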
    pineapplechemist (Thread Starter):
    Ah thank you, it's much more familiar now! I didn't realise that it was the standard error :P
    pineapplechemist (Thread Starter):
    So, is my interval the estimator +/- 1.96 times the standard error? Do I need to do anything else?
    Gregorius:
    That's it.
    pineapplechemist (Thread Starter):
    Thanks. They didn't actually teach us how to do it. For the second part of the question, is it similar? Do I need to rearrange the interval for n?
    Gregorius:
    Yes, you have to choose n so that the confidence interval has width 0.01.
    pineapplechemist (Thread Starter):
    My method is therefore: upper limit - lower limit <= 0.01. I rearranged for n and got n >= 153664 θ(1 − θ), where θ here is my estimate. It doesn't seem right to have it in terms of my estimator; is that OK?
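
For reference, the rearrangement described above can be written out step by step. This is a sketch that assumes the interval is \hat{\theta} plus or minus 1.96 standard errors and that its total width is required to be at most 0.01, as in this thread.

```latex
\begin{align*}
2 \times 1.96 \sqrt{\frac{\hat{\theta}(1 - \hat{\theta})}{n}} &\le 0.01 \\
\sqrt{n} &\ge \frac{3.92}{0.01} \sqrt{\hat{\theta}(1 - \hat{\theta})} = 392 \sqrt{\hat{\theta}(1 - \hat{\theta})} \\
n &\ge 392^2 \, \hat{\theta}(1 - \hat{\theta}) = 153664 \, \hat{\theta}(1 - \hat{\theta})
\end{align*}
```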
    Gregorius:
    If you are not actually given any information that allows you to calculate an actual value of \hat{\theta} then you need to think about what the "worst" value of \hat{\theta} could be, from the point of view of the width of the confidence interval. As a hint, consider the graph of the function f(x)=x(1-x).
    pineapplechemist (Thread Starter):
    Ah ok, so we take the estimator to equal 0.5, right, which maximises that function? This would make the confidence interval the widest it could be. Therefore, if we get that interval within 0.01 of the true value, then all other intervals are at least within 0.01 of the true value. Am I thinking along the correct lines here?

    Thanks for your help by the way, really appreciate it. Finding this module very hard!

    edit: So I get n needs to be at least 38416. Seems huge!
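
The worst-case calculation can be checked with a short Python sketch. The helper name `required_n` is hypothetical, and exact fractions are used so that rounding up is not thrown off by floating-point error. The first call follows the thread's reading (total interval width at most 0.01). The second call is an assumption about an alternative reading of "within 0.01", namely a margin of error of 0.01 on each side, which corresponds to a total width of 0.02.

```python
from fractions import Fraction
import math

def required_n(total_width, theta="0.5", z="1.96"):
    """Smallest n with 2 * z * sqrt(theta * (1 - theta) / n) <= total_width."""
    w, t, zq = Fraction(total_width), Fraction(theta), Fraction(z)
    return math.ceil((2 * zq / w) ** 2 * t * (1 - t))

# x(1 - x) is largest at x = 0.5, so theta = 0.5 gives the widest interval
worst = max((Fraction(k, 100) for k in range(101)), key=lambda x: x * (1 - x))
print(worst)               # 1/2

print(required_n("0.01"))  # total width 0.01 (as in this thread): 38416
print(required_n("0.02"))  # margin of error 0.01 each side: 9604
```

Which figure is the intended answer depends on how the question's "within 0.01" is meant to be read; both use the same worst-case value of 0.5.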
 
 
 