# Normal Distribution MLE

#1
Hello,

Suppose I take {X1, . . . , Xn}, a random sample from N(µ, σ²). This means I take a random sample of n objects from a population which is normally distributed with population mean µ and population variance σ². I understand how to use the Maximum Likelihood Estimator (MLE) for distributions with one unknown parameter.

But how do I do it with two? E.g., suppose I am maximising the likelihood function for the population mean. To find the MLE, I usually take the derivative of the likelihood function, or of the log-likelihood function l(µ), with respect to µ and equate it to 0 to solve for the MLE of µ. That is the method when the only population parameter is µ.

But here we have two parameters. So, e.g., if we solve for the MLE of µ when σ² is known, what do we do? And if σ² is unknown, how does the method differ?

What I can tell is: we find the likelihood function L(µ) in terms of µ and σ², and then the log-likelihood function l(µ) = ln L(µ), i.e. taking logarithms with base e (the natural logarithm).

So now I have the log-likelihood function of µ, which is the same whether σ² is known or unknown. What do I do next? When I differentiate with respect to µ, do I take partial derivatives or something else? And how does the derivative step differ depending on whether σ² is known or unknown?

Thank you!
#2
(Original post by Chittesh14)
...
Can't recall MLE myself; however, it is fully worked for the normal distribution in the wiki - here.
#3
(Original post by ghostwalker)
Can't recall MLE myself; however, it is fully worked for the normal distribution in the wiki - here.
Thank you, this is almost exactly what is in my lecture notes haha! The only difference is that my notes cover two cases, one where σ² is known and one where σ² is unknown, in both cases finding the MLE for µ. I can't distinguish the methods because the differentiation step is skipped, and the working is identical up to that point!
#4
(Original post by Chittesh14)
Thank you, this is almost exactly what is in my lecture notes haha! The only difference is that my notes cover two cases, one where σ² is known and one where σ² is unknown, in both cases finding the MLE for µ. I can't distinguish the methods because the differentiation step is skipped, and the working is identical up to that point!
As far as finding the MLE of µ is concerned, I don't think there is any difference, since σ² is held constant for the partial derivative.

If σ² is unknown, then the additional work of the partial derivative with respect to σ² is required, but not otherwise.

Or is it the actual differentiating itself that's causing the problem?
#5
(Original post by ghostwalker)
As far as finding the MLE of µ is concerned, I don't think there is any difference, since σ² is held constant for the partial derivative.

If σ² is unknown, then the additional work of the partial derivative with respect to σ² is required, but not otherwise.

Or is it the actual differentiating itself that's causing the problem?
I'm not sure; I'll read it again and let you know if needed. I might've got it. Thanks!
#6
Gregorius, I've tagged you in case you remember this topic. If you read just this post rather than the first one, it's easier, because my question is summarised here.

So, my thoughts after reading the question I was stuck on again were:
Suppose I take {X1, . . . , Xn}, a random sample from a population with distribution N(µ, σ²), i.e. a random sample of n objects from a population which is normally distributed with population mean µ and population variance σ². Then, if I am estimating the population mean µ using the MLE (maximum likelihood estimator), I do the usual steps and find the (log-)likelihood function. Now there are two cases: the population variance σ² is either known or unknown.

Is the following correct?
- If σ² is known, I simply use the normal method: take the partial derivative of the (log-)likelihood function with respect to µ, set it equal to 0, and solve for µ, labelling the solution µ̂, the MLE of µ, which may be in terms of σ².
- If σ² is unknown, then I use the normal method and take the partial derivative of the (log-)likelihood function with respect to σ², set it equal to 0, and solve for σ², labelling the solution σ̂², the MLE of σ². Then I find the MLE of µ as in the step above, and substitute this estimate for σ² wherever σ² appears in the expression.

Thank you!
#7
(Original post by Chittesh14)
Gregorius, I've tagged you in case you remember this topic. If you read just this post rather than the first one, it's easier, because my question is summarised here.

So, my thoughts after reading the question I was stuck on again were:
Suppose I take {X1, . . . , Xn}, a random sample from a population with distribution N(µ, σ²), i.e. a random sample of n objects from a population which is normally distributed with population mean µ and population variance σ². Then, if I am estimating the population mean µ using the MLE (maximum likelihood estimator), I do the usual steps and find the (log-)likelihood function. Now there are two cases: the population variance σ² is either known or unknown.
So far, so good. Note that if the population variance is unknown, then the log-likelihood is a function of two variables, µ and σ². If the population variance is known, then it is a function of just one variable, µ. So in the former case, to find the MLE you're going to be finding the maximum of a function of two variables, which involves taking the partial derivatives, setting them equal to zero, and solving the consequent system of two equations in two unknowns. In the latter case, you are maximizing a function of one variable.

Is the following correct?
- If σ² is known, I simply use the normal method: take the partial derivative of the (log-)likelihood function with respect to µ, set it equal to 0, and solve for µ, labelling the solution µ̂, the MLE of µ, which may be in terms of σ².
It's not a partial derivative in this case, as the likelihood is a function of just one variable, but otherwise correct.

- If σ² is unknown, then I use the normal method and take the partial derivative of the (log-)likelihood function with respect to σ², set it equal to 0, and solve for σ², labelling the solution σ̂², the MLE of σ². Then I find the MLE of µ as in the step above, and substitute this estimate for σ² wherever σ² appears in the expression.
This works in this case (i.e. for the normal distribution), as it is particularly simple. But in general, you should think in terms of setting both partial derivatives equal to zero simultaneously and solving the resulting system of two equations in two unknowns.
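For reference, the calculation described above works out as follows for the normal distribution (a standard textbook sketch, not taken from the thread itself):

```latex
\ell(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi\sigma^2)
  - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2

\frac{\partial \ell}{\partial \mu}
  = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0
  \quad\Longrightarrow\quad \hat{\mu} = \bar{x}

\frac{\partial \ell}{\partial \sigma^2}
  = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \mu)^2 = 0
  \quad\Longrightarrow\quad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{\mu})^2
```

The first equation gives µ̂ = x̄ regardless of the value of σ², which is why the two cases (σ² known or unknown) agree on µ̂; the second equation is only needed when σ² is unknown.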
#8
(Original post by Gregorius)
So far, so good. Note that if the population variance is unknown, then the log-likelihood is a function of two variables, µ and σ². If the population variance is known, then it is a function of just one variable, µ. So in the former case, to find the MLE you're going to be finding the maximum of a function of two variables, which involves taking the partial derivatives, setting them equal to zero, and solving the consequent system of two equations in two unknowns. In the latter case, you are maximizing a function of one variable.

It's not a partial derivative in this case, as the likelihood is a function of just one variable, but otherwise correct.

This works in this case (i.e. for the normal distribution), as it is particularly simple. But in general, you should think in terms of setting both partial derivatives equal to zero simultaneously and solving the resulting system of two equations in two unknowns.
Thank you so much, I get it completely now. You're too good! So, in the case where σ² is known, I just differentiate the log-likelihood function with respect to µ as normal, treating σ² as a constant, right?
#9
(Original post by Chittesh14)
Thank you so much, I get it completely now. You're too good! So, in the case where σ² is known, I just differentiate the log-likelihood function with respect to µ as normal, treating σ² as a constant, right?
Yes.
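As a quick numerical sanity check of the closed-form estimates discussed above, here is a minimal sketch (the sample data is made up for illustration): it computes µ̂ and σ̂² from their closed forms and verifies that they do beat nearby parameter values under the log-likelihood.

```python
import math

# Hypothetical sample data, for illustration only
data = [4.2, 5.1, 3.8, 6.0, 5.5, 4.9, 5.2, 4.4]
n = len(data)

def log_likelihood(mu, sigma2):
    """Log-likelihood of N(mu, sigma2) evaluated on the sample."""
    return (-n / 2 * math.log(2 * math.pi * sigma2)
            - sum((x - mu) ** 2 for x in data) / (2 * sigma2))

# Closed-form MLEs obtained by setting both partial derivatives to zero
mu_hat = sum(data) / n                                 # sample mean
sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n  # biased sample variance

# Sanity check: the closed-form MLEs beat nearby perturbed parameter values
best = log_likelihood(mu_hat, sigma2_hat)
for dmu in (-0.1, 0.1):
    for ds in (-0.1, 0.1):
        assert best >= log_likelihood(mu_hat + dmu, sigma2_hat + ds)
```

Note the divisor in σ̂² is n, not n - 1: the MLE of the variance is the biased estimator, which is a standard feature of this derivation.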
#10
(Original post by Gregorius)
Yes.
Thanks!!!