# Quick question on the chain rule


Hello. I've basically finished C2 and have started C3. I am starting with algebra/functions and differentiation. Would anybody know an explanation of why you can just multiply f'(g(x)) and g'(x) to get d[f(g(x))]/d[x]?

I have read an explanation as follows. To understand the chain rule, think about the definition of the derivative as a rate of change. d[f(g(x))]/d[x] basically means the rate of change of f(g(x)) with respect to x, and to calculate this we need to know two values:

1- How much f(g(x)) changes while g(x) changes = d[f(g(x))]/d[g(x)]

2- How much g(x) changes while x changes = d[g(x)]/d[x]

To calculate the rate of change of f(g(x)) with respect to x, you just need to multiply these two values together, because x changes g(x) and g(x) changes f(g(x)).


__Please could anybody elaborate on this and explain why?__ **(It should be obvious thinking about the definition of a function in mathematics.)**

Thanks
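One way to see why the two rates multiply is to look at finite steps first. For a small but non-zero step in x, the change in f(g(x)) per unit change in g(x), times the change in g(x) per unit change in x, is exactly the change in f(g(x)) per unit change in x, because the change in g(x) genuinely cancels. The sketch below is only an illustration: the choices f(u) = sin(u) and g(x) = x^2 + 1 and the helper name `rate_split` are assumptions, not something from the explanation quoted above.

```python
import math

def f(u):
    return math.sin(u)      # outer function (illustrative choice)

def g(x):
    return x ** 2 + 1       # inner function (illustrative choice)

def rate_split(x, h):
    """Average rates over a step h: (df per dg, dg per dx, df per dx)."""
    dg = g(x + h) - g(x)            # change in g(x); non-zero for this g at x = 1
    df = f(g(x + h)) - f(g(x))      # change in f(g(x))
    return df / dg, dg / h, df / h

x = 1.0
exact = math.cos(g(x)) * 2 * x      # f'(g(x)) * g'(x) worked out by hand

for h in (0.1, 0.001, 0.00001):
    f_per_g, g_per_x, f_per_x = rate_split(x, h)
    # The product of the two average rates matches the overall average rate,
    # and both approach the hand-computed value as h shrinks.
    print(h, f_per_g * g_per_x, f_per_x, exact)
```

The derivative is what these average rates tend to as the step shrinks, which is where a careful proof has to deal with steps on which g(x) does not change at all.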


#2


(Original post by **Wahrheit**) https://proofwiki.org/wiki/Chain_Rul...lued_Functions Hopefully that answers your question!


#4

Essentially, all the chain rule is saying is dy/dx = dy/du . du/dx, so just think of it in terms of that.


#5


(Original post by **MathMeister**) Unfortunately not, sorry. I was really looking for an understanding of why the definition of a function would help explain why you can just multiply (or some reason other than the fraction-cancelling one). I was looking for something a bit intuitive or something that makes perfect sense. For example, if I were asking about why nx^(n-1) worked, I would need somebody to explain concavity/first principles. Thank you anyway.

When I was at the same stage with my mathematics, all I wanted to know was how to do the chain rule correctly, not why and how.

The formal proofs for these require **mathematical analysis**, which is a branch of mathematics normally taught to first-year maths undergraduates ...


#6

(Original post by **TeeEm**) !!!!!!!!! I admire your question !!!!!!!!!!!!

When I was at the same stage with my mathematics, all I wanted to know was how to do the chain rule correctly, not why and how.

The formal proofs for these require **mathematical analysis**, which is a branch of mathematics normally taught to first-year maths undergraduates ...

#7

(Original post by **Wahrheit**) Yeah, sorry, I couldn't resist the opportunity to be mischievous by answering his question in the best yet least helpful way possible.


#8

OK, so think of two functions like 5x and cos(y). You have to differentiate z = cos(y) with respect to x, where y = 5x, so you want d(cos(5x))/dx.

dz/dx = dz/dy . dy/dx just by fraction cancellation, and you know how to work out dz/dy and dy/dx, so just multiply them and you're sorted!
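As a rough check of this example, the product dz/dy . dy/dx can be compared with a direct numerical estimate of the slope of cos(5x). This is a minimal sketch: the function names `dz_dx_chain` and `dz_dx_numeric`, the sample points and the step size are assumptions made for illustration.

```python
import math

def dz_dx_chain(x):
    """Chain rule for z = cos(y), y = 5x: multiply dz/dy by dy/dx."""
    y = 5 * x
    dz_dy = -math.sin(y)   # derivative of cos(y) with respect to y
    dy_dx = 5              # derivative of 5x with respect to x
    return dz_dy * dy_dx

def dz_dx_numeric(x, h=1e-6):
    """Central-difference estimate of the slope of cos(5x) at x."""
    return (math.cos(5 * (x + h)) - math.cos(5 * (x - h))) / (2 * h)

for x in (0.0, 0.3, 1.2):
    # The two columns should agree to several decimal places.
    print(x, dz_dx_chain(x), dz_dx_numeric(x))
```

Agreement at a handful of points is evidence rather than proof, but it shows that multiplying the two derivatives gives the slope of the composite function.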



#12

(Original post by **MathMeister**) Is there nothing intuitive or simple? I've thought about it a lot and understand that g(x) maps x onto y (the range), and I suppose this forms the graph of g(x). Then these values get mapped onto the same x and the y is given by f(g(x)). I'm not sure if my thoughts are even useful for understanding and I've not really gotten very far. :/

In my opinion anybody who claims "it is obvious..." is either the next recipient of the Fields Medal or just shows total contempt for the majority of people trying to make sense of this. I learned the formal proof of the chain rule over 25 years ago and I am not embarrassed to say I cannot replicate the proof without looking it up.

Learn and practise the technique at this stage, and if you still wonder about the why a year on from now, consider a Maths degree ...


(Original post by **Wahrheit**) My post above yours explains it pretty simply. It's all about cancellation. Very intuitive.


#14

(Original post by **MathMeister**) ...

The first link in Google if you search for chain rule proof.


#16

(Original post by **MathMeister**) Isn't there anything else? Explaining with cancellation is like explaining why the derivative of x^n is nx^(n-1) without teaching first principles and concavity. Cancellation isn't actually what is happening here though :/

No offence, but you seem to be constantly looking for "intuitive" reasons why everything is true. There is nothing wrong with finding intuitive arguments when they work, but intuition can often lead you astray (especially where limits are concerned), and an awful lot of higher mathematics is about taking a purely abstract definition and seeing how far you can work with that definition.

MrM has given you a great link to a pdf which gives a reasonably concise demonstration of why the chain rule is true. Read it, file it and forget about it - just make sure you can apply the chain rule in practice.


#17

(Original post by **MathMeister**) Hello. I've basically finished C2 and have started C3. I am starting with algebra/functions and differentiation. Would anybody know an explanation of why you can just multiply f'(g(x)) and g'(x) to get d[f(g(x))]/d[x]?

The derivative of f(x) is represented by the gradient of the tangent to its graph at some point on the x-axis, say x = a. Let's suppose that at x = a, the tangent has gradient 6.

This means that if we move a tiny distance of 1 (small) unit along the x-axis from a, we will move 6 (small) units up the y-axis along this tangent (think of rise = run * gradient - here the run is 1 unit, the rise is 6 units, on the tangent).

If we move a tiny distance of 2 (small) units along the x-axis from a, we will move 6*2 = 12 (small) units up the y-axis along this tangent.

If we move a tiny distance of 3 (small) units along the x-axis from a, we will move 6*3 = 18 (small) units up the y-axis along this tangent.

Now consider f(g(x)) again. Here the increase in the value of f's argument is what comes out of g. Suppose that now we think of the graph of g and its tangent at the point x = b, and let's say this tangent has gradient 3. Let's also say that g(b) = a.

Now when we move a tiny distance of 1 (small) unit along the x-axis from b on the graph of g, we will move 3 (small) units up the y-axis along the tangent to g at the point b.

But of course we have a composite function: when we change the value of g by 3 (small) units, we feed that into f, and consequently by the argument I gave above, we will now move up the y-axis on the graph of f by 6 * 3 = 18 units, by moving 1 unit along the x-axis on the graph of g.

Note that since g(b) = a, the increase in value of the argument of f is happening very close to a, and we know that at that point f has gradient 6.

This merely says that if y = f(g(x)), then dy/dx = f'(g(x)) * g'(x), as you already know.

Why did I insist on only moving by small units along the x-axis? That's because if you move too much, then you begin to move away from the point at which you drew the tangent, and you'll be working at a point with a different tangent. In fact, to make this argument work properly, you have to imagine that the 1 small unit is in fact infinitesimally small (or alternatively, and from a more modern point of view, that the result becomes more and more accurate the smaller you make your 1 small unit).

The whole explanation stems from the fact that for a straight line (like a tangent), rise = run * gradient, but in this case:

1. "rise" is "rise of f(g(x))"

2. "gradient" is the slope of the tangent to f at a

3. "run" is the output from g - which in fact is the rise of the tangent to g at b when we move 1 small unit along the x-axis of its graph, and hence "run = 1 * gradient of g = gradient of g".

Thus, in fact, we have, by substituting a bit:

"rise of f(g(x)) = gradient of f * gradient of g"

which is the result that we wanted.
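The rise = run * gradient argument can be tried out numerically. The sketch below is only an illustration under stated assumptions: the two functions, the point names `a` and `b`, and the helper `rise_per_unit_run` are all made up here, chosen so that the tangent to g has gradient 3 and the tangent to f has gradient 6 at the matching point, as in the post above.

```python
def g(x):
    return x * x + x        # assumed inner function: g(1) = 2, g'(1) = 3

def f(u):
    return u * u + 2 * u    # assumed outer function: f'(2) = 6

b = 1.0                     # point on the x-axis of g's graph
a = g(b)                    # value that g feeds into f, here 2.0

def rise_per_unit_run(step):
    """Rise of f(g(x)) divided by the run, for a run of `step` starting at b."""
    run_on_f = g(b + step) - g(b)        # output change of g is the run on f's graph
    rise_on_f = f(a + run_on_f) - f(a)   # resulting rise on f's graph
    return rise_on_f / step

for step in (1.0, 0.1, 0.001, 0.00001):
    # Large steps wander away from the tangents; small steps approach 6 * 3 = 18.
    print(step, rise_per_unit_run(step))
```

With a step of 1 the curves have already bent away from their tangents, so the ratio is well off 18; shrinking the step is what makes "1 small unit" behave like the tangent-line picture.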


(Original post by **atsruser**) ...


#20

(Original post by **MathMeister**) Thank you! It makes perfect sense now!
