Quick question on the chain rule
Hello. I've basically finished C2 and have started C3. I am starting with algebra/functions and differentiation. Would anybody know an explanation of why you can just multiply f'(g(x)) and g'(x) to get d[f(g(x))]/dx?
I have read an explanation as follows: To understand the chain rule, think about the definition of the derivative as a rate of change. d[f(g(x))]/dx basically means the rate of change of f(g(x)) with respect to x, and to calculate this we need to know two values:
1- How much f(g(x)) changes while g(x) changes = d[f(g(x))]/d[g(x)]
2- How much g(x) changes while x changes = d[g(x)]/dx
To calculate the rate of change of f(g(x)) with respect to x, you just need to multiply these two values together, because x changes g(x) and g(x) changes f(g(x)) (it should be obvious thinking about the definition of a function in mathematics).
Please could anybody elaborate on this and explain why?
Thanks
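The two rates in the explanation quoted above can be written out as difference quotients. This is only a sketch: the symbols h and \Delta u are introduced here for illustration, and it assumes g(x+h) ≠ g(x) for small nonzero h (the general case needs a more careful argument).

```latex
\[
\frac{f(g(x+h)) - f(g(x))}{h}
  = \frac{f(g(x) + \Delta u) - f(g(x))}{\Delta u} \cdot \frac{\Delta u}{h},
\qquad \Delta u = g(x+h) - g(x).
\]
% As h -> 0 we also have \Delta u -> 0, because g is continuous at x.
% The first factor tends to f'(g(x)) and the second to g'(x), so
\[
\frac{d}{dx}\, f(g(x)) = f'(g(x)) \, g'(x).
\]
```

The first factor is "how much f(g(x)) changes while g(x) changes" and the second is "how much g(x) changes while x changes". The product is exact for every nonzero h under the stated assumption, and the derivatives only appear once the limit is taken.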
#2
(Original post by Wahrheit)
https://proofwiki.org/wiki/Chain_Rul...lued_Functions Hopefully that answers your question!
#4
Essentially, all the chain rule is saying is dy/dx = dy/du . du/dx, so just think of it in those terms.
#5
(Original post by MathMeister)
Hello. I've basically finished C2 and have started C3. I am starting with algebra/functions and differentiation. Would anybody know an explanation of why you can just multiply f'(g(x)) and g'(x) to get d[f(g(x))]/dx?
I have read an explanation as follows: To understand the chain rule, think about the definition of the derivative as a rate of change. d[f(g(x))]/dx basically means the rate of change of f(g(x)) with respect to x, and to calculate this we need to know two values:
1- How much f(g(x)) changes while g(x) changes = d[f(g(x))]/d[g(x)]
2- How much g(x) changes while x changes = d[g(x)]/dx
To calculate the rate of change of f(g(x)) with respect to x, you just need to multiply these two values together, because x changes g(x) and g(x) changes f(g(x)) (it should be obvious thinking about the definition of a function in mathematics).
Please could anybody elaborate on this and explain why?
Thanks
(Original post by MathMeister)
Unfortunately not, sorry. I was really looking for an understanding of why the definition of a function would help explain why you can just multiply (or some reason other than the fraction-cancelling one). I was looking for something a bit intuitive, or something that makes perfect sense. For example, if I were asking about why nx^(n-1) worked, I would need somebody to share about concavity/first principles. Thank you anyway.
When I was at the same stage in my mathematics, all I wanted to know was how to do the chain rule correctly, not why and how.
The formal proofs for these require mathematical analysis, which is a branch of mathematics normally taught to first-year maths undergraduates ...
#6
(Original post by TeeEm)
!!!!!!!!! I admire your question !!!!!!!!!!!!
When I was at the same stage in my mathematics, all I wanted to know was how to do the chain rule correctly, not why and how.
The formal proofs for these require mathematical analysis, which is a branch of mathematics normally taught to first-year maths undergraduates ...
#7
(Original post by Wahrheit)
Yeah, sorry, I couldn't resist the opportunity to be mischievous by answering his question in the best yet least helpful way possible.

#8
Ok, so think of two functions like 5x and cos(y): you have to differentiate z = cos(y) with respect to x, where y = 5x, so you want d(cos(5x))/dx.
dz/dx = dz/dy . dy/dx just by fraction cancellation, and you know how to work out dz/dy and dy/dx, so just multiply them and you're sorted!
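The cos(5x) example above can be checked numerically. Below is a minimal Python sketch, not part of the original post: the function names and the test point x0 = 0.3 are made up for illustration. It multiplies dz/dy = -sin(y) by dy/dx = 5 and compares the product with a central-difference estimate of the slope of cos(5x).

```python
import math


def z(x):
    """Composite function z = cos(y) with y = 5x."""
    return math.cos(5 * x)


def chain_rule_derivative(x):
    """dz/dx computed as (dz/dy) * (dy/dx)."""
    y = 5 * x
    dz_dy = -math.sin(y)  # derivative of cos(y) with respect to y
    dy_dx = 5             # derivative of 5x with respect to x
    return dz_dy * dy_dx


def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of the slope of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 0.3  # arbitrary test point (an assumption, not from the thread)
print(chain_rule_derivative(x0), numerical_derivative(z, x0))
```

The two printed values should agree to several decimal places, which is what "just multiply them" predicts.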
(Original post by TeeEm)
The formal proofs for these require mathematical analysis which is a branch of mathematics normally taught to first year maths undergraduates ...
#10
(Original post by MathMeister)
Is there nothing intuitive or simple? I've thought about it a lot and understand that g(x) maps x onto y (the range), and I suppose this forms the graph of g(x). Then these values get mapped onto the same x and the y is given by f(g(x)). I'm not sure if my thoughts are even useful for understanding and I've not really gotten very far. :/
#12
(Original post by MathMeister)
Is there nothing intuitive or simple? I've thought about it a lot and understand that g(x) maps x onto y (the range), and I suppose this forms the graph of g(x). Then these values get mapped onto the same x and the y is given by f(g(x)). I'm not sure if my thoughts are even useful for understanding and I've not really gotten very far. :/
In my opinion, anybody who claims "it is obvious..." is either the next recipient of the Fields Medal or just shows total contempt for the majority of people trying to make sense of this. I learned the formal proof of the chain rule over 25 years ago and I am not embarrassed to say I cannot replicate the proof without looking it up.
Learn and practise the technique at this stage, and if you still wonder about the why a year from now, consider a Maths degree ...
(Original post by Wahrheit)
My post above yours explains it pretty simply. It's all about cancellation. Very intuitive.
#14
(Original post by MathMeister)
...
The first link in Google if you search for chain rule proof.
#15
(Original post by MathMeister)
Isn't there anything else? Explaining with cancellation is like explaining why the derivative of x^n is nx^(n-1) without teaching first principles and concavity. Cancellation isn't actually what is happening here though :/
#16
(Original post by MathMeister)
Isn't there anything else? Explaining with cancellation is like explaining why the derivative of x^n is nx^(n-1) without teaching first principles and concavity. Cancellation isn't actually what is happening here though :/
No offence, but you seem to be constantly looking for "intuitive" reasons why everything is true. There is nothing wrong with finding intuitive arguments when they work, but intuition can often lead you astray (especially where limits are concerned), and an awful lot of higher mathematics is about taking a purely abstract definition and seeing how far you can work with that definition.
MrM has given you a great link to a pdf which gives a reasonably concise demonstration of why the chain rule is true. Read it, file it and forget about it - just make sure you can apply the chain rule in practice.

#17
(Original post by MathMeister)
Hello. I've basically finished C2 and have started C3. I am starting with algebra/functions and differentiation. Would anybody know an explanation of why you can just multiply f'(g(x)) and g'(x) to get d[f(g(x))]/dx?

The derivative of



This means that if we move a tiny distance of 1 (small) unit along the x-axis from

If we move a tiny distance of 2 (small) units along the x-axis from

If we move a tiny distance of 3 (small) units along the x-axis from

Now consider






Now when we move a tiny distance of 1 (small) unit along the x-axis from




But of course we have a composite function: when we change the value of




Note that since




This merely says that if


Why did I insist on only moving by small units along the x-axis? That's because if you move too much, then you begin to move away from the point at which you drew the tangent, and you'll be working at a point with a different tangent. In fact, to make this argument work properly, you have to imagine that the 1 small unit is in fact infinitesimally small (or alternatively, and from a more modern point of view, that the result

The whole explanation stems from the fact that for a straight line (like a tangent), rise = run * gradient, but in this case:
1. "rise" is "rise of

2. "gradient" is the slope of the tangent to


3. "run" is the output from





Thus, in fact, we have, by substituting a bit:
"rise of



which is the result that we wanted.
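The rise = run * gradient argument in this post, and the point about keeping the step small, can be tried out numerically. The Python sketch below is an illustration only: the functions f(u) = sin(u) and g(x) = x^2, the point a = 1.0, and all names are assumptions chosen here, not taken from the original post. It compares the actual rise of f(g(x)) per small step with the tangent-line prediction f'(g(a)) * g'(a).

```python
import math


# Example functions (assumed for illustration, not from the thread).
def f(u):
    return math.sin(u)


def f_prime(u):
    return math.cos(u)


def g(x):
    return x * x


def g_prime(x):
    return 2 * x


def rise_over_run(a, step):
    """Actual rise of f(g(x)) per unit run when x moves from a to a + step."""
    return (f(g(a + step)) - f(g(a))) / step


def tangent_prediction(a):
    """Gradient predicted by chaining the two tangent slopes at a."""
    return f_prime(g(a)) * g_prime(a)


a = 1.0
steps = (1e-1, 1e-2, 1e-3)
errors = [abs(rise_over_run(a, s) - tangent_prediction(a)) for s in steps]

for s, e in zip(steps, errors):
    print(f"step={s:g}  mismatch={e:.6f}")
```

The mismatch shrinks as the step shrinks, which is the numerical face of "if you move too much, you begin to move away from the point at which you drew the tangent".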
(Original post by atsruser)
...
#20