MathMeister
#1
Thread starter 6 years ago
Hello. I've basically finished C2 and have started C3. I am starting with algebra/functions and differentiation. Would anybody know an explanation of why you can just multiply f'(g(x)) and g'(x) to get d[f(g(x))]/dx?

I have read an explanation as follows... To understand the chain rule, think about the definition of the derivative as a rate of change. d[f(g(x))]/dx basically means the rate of change of f(g(x)) with respect to x, and to calculate this we need to know two values:

1- How much f(g(x)) changes while g(x) changes = d[f(g(x))]/d[g(x)]
2- How much g(x) changes while x changes = d[g(x)]/dx
To calculate the rate of change of f(g(x)) with respect to x, you just need to multiply these two values together, because x changes g(x) and g(x) changes f(g(x)) (it should be obvious thinking about the definition of a function in mathematics).
Please could anybody elaborate on this and explain why?
Thanks

Wahrheit
#2
6 years ago
https://proofwiki.org/wiki/Chain_Rul...lued_Functions

Hopefully that answers your question!
MathMeister
#3
Thread starter 6 years ago
(Original post by Wahrheit)
https://proofwiki.org/wiki/Chain_Rul...lued_Functions Hopefully that answers your question!
Unfortunately not, sorry. I was really looking for an understanding of why the definition of a function would help explain why you can just multiply (or some reason other than the fraction-cancelling one). I was looking for something a bit intuitive, or something that makes perfect sense. For example, if I were asking why nx^(n-1) works, I would need somebody to explain concavity/first principles. Thank you anyway.
Wahrheit
#4
6 years ago
Essentially, all the chain rule is saying is dy/dx = dy/du . du/dx, so just think of it in terms of that.
TeeEm
#5
6 years ago
(Original post by MathMeister)
...


(Original post by MathMeister)
Unfortunately not, sorry. I was really looking for an understanding of why the definition of a function would help explain why you can just multiply (or some reason other than the fraction-cancelling one). I was looking for something a bit intuitive, or something that makes perfect sense. For example, if I were asking why nx^(n-1) works, I would need somebody to explain concavity/first principles. Thank you anyway.
!!!!!!!!! I admire your question !!!!!!!!!!!!
When I was at the same stage in my mathematics, all I wanted to know was how to do the chain rule correctly, not the why and how.

The formal proofs for these require mathematical analysis, which is a branch of mathematics normally taught to first-year maths undergraduates ...
Wahrheit
#6
6 years ago
(Original post by TeeEm)
!!!!!!!!! I admire your question !!!!!!!!!!!!
When I was at the same stage in my mathematics, all I wanted to know was how to do the chain rule correctly, not the why and how.

The formal proofs for these require mathematical analysis, which is a branch of mathematics normally taught to first-year maths undergraduates ...
Yeah, sorry, I couldn't resist the opportunity to be mischievous by answering his question in the best yet least helpful way possible.
TeeEm
#7
6 years ago
(Original post by Wahrheit)
Yeah, sorry, I couldn't resist the opportunity to be mischievous by answering his question in the best yet least helpful way possible.
It is all good ...
Wahrheit
#8
6 years ago
OK, so think of two functions like y = 5x and z = cos(y). You have to differentiate z = cos(y) with respect to x, where y = 5x, so you want d(cos(5x))/dx.

dz/dx = dz/dy . dy/dx just by fraction cancellation, and you know how to work out dz/dy and dy/dx, so just multiply them and you're sorted!
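As a quick sanity check of this example (a minimal sketch, not part of the original post), the chain-rule answer dz/dx = dz/dy . dy/dx = -sin(5x) . 5 can be compared with a numerical estimate of the slope of cos(5x). The function names and the test point x = 0.3 are illustrative assumptions.

```python
import math


def z(x):
    # Composite function z = cos(y) with y = 5x
    return math.cos(5 * x)


def chain_rule_derivative(x):
    # dz/dy = -sin(y) evaluated at y = 5x, multiplied by dy/dx = 5
    return -math.sin(5 * x) * 5


def central_difference(func, x, h=1e-6):
    # Numerical slope estimate using a small step h either side of x
    return (func(x + h) - func(x - h)) / (2 * h)


x = 0.3  # arbitrary test point (an assumption for illustration)
print("chain rule:", chain_rule_derivative(x))
print("numerical: ", central_difference(z, x))
```

The two printed values should agree to several decimal places, which is what the product dz/dy . dy/dx predicts.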
MathMeister
#9
Thread starter 6 years ago
(Original post by TeeEm)
The formal proofs for these require mathematical analysis which is a branch of mathematics normally taught to first year maths undergraduates ...
Is there nothing intuitive or simple? I've thought about it a lot and understand that g(x) maps x onto y (the range), and I suppose this forms the graph of g(x). Then these values get mapped onto the same x, and the y is given by f(g(x)). I'm not sure if my thoughts are even useful for understanding, and I've not really gotten very far. :/
Wahrheit
#10
6 years ago
(Original post by MathMeister)
Is there nothing intuitive or simple? I've thought about it a lot and understand that g(x) maps x onto y (the range), and I suppose this forms the graph of g(x). Then these values get mapped onto the same x, and the y is given by f(g(x)). I'm not sure if my thoughts are even useful for understanding, and I've not really gotten very far. :/
My post above yours explains it pretty simply. It's all about cancellation. Very intuitive.
Wahrheit
#11
6 years ago
(Original post by MathMeister)
Is there nothing intuitive or simple? I've thought about it a lot and understand that g(x) maps x onto y (the range), and I suppose this forms the graph of g(x). Then these values get mapped onto the same x, and the y is given by f(g(x)). I'm not sure if my thoughts are even useful for understanding, and I've not really gotten very far. :/
Also, f(g(x)) has a different range from g(x). If you are thinking of y as g(x), think of f(g(x)) as z.
TeeEm
#12
6 years ago
(Original post by MathMeister)
Is there nothing intuitive or simple? I've thought about it a lot and understand that g(x) maps x onto y (the range), and I suppose this forms the graph of g(x). Then these values get mapped onto the same x, and the y is given by f(g(x)). I'm not sure if my thoughts are even useful for understanding, and I've not really gotten very far. :/
This is stuff that you will need a lot more than A-level maths to understand properly and formally.
In my opinion, anybody who claims "it is obvious..." is either the next recipient of the Fields Medal or just shows total contempt for the majority of people trying to make sense of this. I learned the formal proof of the chain rule over 25 years ago, and I am not embarrassed to say I cannot replicate the proof without looking it up.
Learn and practise the technique at this stage, and if you still wonder about the why a year from now, consider a Maths degree ...
MathMeister
#13
Thread starter 6 years ago
(Original post by Wahrheit)
My post above yours explains it pretty simply. It's all about cancellation. Very intuitive.
Isn't there anything else? Explaining with cancellation is like explaining why the differential of x^n is nx^(n-1) without teaching first principles and concavity. Cancellation isn't actually what is happening here, though. :/
Mr M
#14
6 years ago
(Original post by MathMeister)
...
http://kruel.co/math/chainrule.pdf

The first link in Google if you search for chain rule proof.
Wahrheit
#15
6 years ago
(Original post by MathMeister)
Isn't there anything else? Explaining with cancellation is like explaining why the differential of x^n is nx^(n-1) without teaching first principles and concavity. Cancellation isn't actually what is happening here, though. :/
Well, it kind of is. Sometimes thinking about everything graphically doesn't actually help, especially in higher dimensions.
davros
  • Study Helper
#16
6 years ago
(Original post by MathMeister)
Isn't there anything else? Explaining with cancellation is like explaining why the differential of x^n is nx^(n-1) without teaching first principles and concavity. Cancellation isn't actually what is happening here, though. :/
"Concavity" has nothing to do with explaining the differential of x^n - that just follows from the limit definition of the derivative and a straightforward application of the binomial theorem (assuming you're using the usual convention of n being a positive integer; things are more complicated otherwise!).
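For reference, the argument described here can be written out as follows. This is a sketch for a positive integer n; the symbol h for the small increment in x is a notational assumption, not taken from the post.

```latex
\begin{align*}
\frac{(x+h)^n - x^n}{h}
  &= \frac{1}{h}\left[\binom{n}{1}x^{n-1}h + \binom{n}{2}x^{n-2}h^2 + \dots + h^n\right] \\
  &= n x^{n-1} + \binom{n}{2}x^{n-2}h + \dots + h^{n-1}
  \;\longrightarrow\; n x^{n-1} \quad \text{as } h \to 0 .
\end{align*}
```

Every term after the first still carries a factor of h, so only n x^{n-1} survives in the limit.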

No offence, but you seem to be constantly looking for "intuitive" reasons why everything is true. There is nothing wrong with finding intuitive arguments when they work, but intuition can often lead you astray (especially where limits are concerned), and an awful lot of higher mathematics is about taking a purely abstract definition and seeing how far you can work with that definition.

Mr M has given you a great link to a PDF which gives a reasonably concise demonstration of why the chain rule is true. Read it, file it and forget about it - just make sure you can apply the chain rule in practice.
atsruser
#17
6 years ago
(Original post by MathMeister)
Hello. I've basically finished C2 and have started C3. I am starting with algebra/functions and differentiation. Would anybody know an explanation of why you can just multiply f'(g(x)) and g'(x) to get d[f(g(x))]/dx?
Suppose y = f(g(x)).

The derivative of f(x) at a point is represented by the gradient of the tangent to its graph at that point, say at x = x_0 on the x-axis. Let's suppose that at x_0, the tangent has gradient 6.

This means that if we move a tiny distance of 1 (small) unit along the x-axis from x_0, we will move 6 (small) units up the y-axis along this tangent (think of rise = run * gradient - here the run is 1 unit, the rise is 6 units, on the tangent).

If we move a tiny distance of 2 (small) units along the x-axis from x_0, we will move 6*2 = 12 (small) units up the y-axis along this tangent.

If we move a tiny distance of 3 (small) units along the x-axis from x_0, we will move 6*3 = 18 (small) units up the y-axis along this tangent.

Now consider y = f(g(x)) again. Here the value of f's argument is what comes out of g(x), so any increase in g(x) is an increase in f's argument. Suppose that now we think of the graph of g(x) and its tangent at the point x_1, and let's say this tangent has gradient 3. Let's also say that x_0 = g(x_1).

Now when we move a tiny distance of 1 (small) unit along the x-axis from x_1 on the graph of g(x), we will move 3 (small) units up the y-axis along the tangent to g(x) at the point x_1.

But of course we have a composite function: when we change the value of g(x) by 3 (small) units, we feed that into f(x), and consequently by the argument I gave above, we will now move up the y-axis on the graph of f(x) by 6 * 3 = 18 units, by moving 1 unit along the x-axis on the graph of g(x).

Note that since x_0 = g(x_1), then the increase in value of the argument of f(x) is happening very close to x_0, and we know that at that point f(x) has gradient 6.

This merely says that if y = f(g(x)), then y' = f'(g(x))g'(x), as you already know.

Why did I insist on only moving by small units along the x-axis? That's because if you move too much, then you begin to move away from the point at which you drew the tangent, and you'll be working at a point with a different tangent. In fact, to make this argument work properly, you have to imagine that the 1 small unit is in fact infinitesimally small (or alternatively, and from a more modern point of view, that the result y' = f'(g(x))g'(x) becomes more and more accurate the smaller you make your 1 small unit).
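The "more and more accurate" remark can be checked numerically. The sketch below assumes two hypothetical functions chosen only to match the numbers in this post: g(x) = 3x, which has gradient 3 at x_1 = 1, and f(u) = u^2, which has gradient 6 at x_0 = g(1) = 3. Neither function appears in the original explanation.

```python
def f(u):
    # Outer function: gradient 2u, which equals 6 at u = 3
    return u ** 2


def g(x):
    # Inner function: gradient 3 everywhere
    return 3 * x


x1 = 1.0       # point on the x-axis of g's graph; g(x1) = 3 plays the role of x_0
exact = 6 * 3  # gradient of f at x_0 times gradient of g at x_1


def difference_quotient(h):
    # Average rate of change of f(g(x)) over a step of size h from x1
    return (f(g(x1 + h)) - f(g(x1))) / h


steps = [1.0, 0.1, 0.01, 0.001]
for h in steps:
    print(h, difference_quotient(h), "error:", abs(difference_quotient(h) - exact))
```

For these particular functions the quotient works out to 18 + 9h, so a step of 1 gives 27 while smaller steps close in on 18.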

The whole explanation stems from the fact that for a straight line (like a tangent), rise = run * gradient, but in this case:

1. "rise" is "rise of f(x)"

2. "gradient" is the slope of the tangent to f(x) at x_0

3. "run" is the change in the output from g(x) - which in fact is the rise of the tangent to g(x) at x_1 when we move 1 small unit along the x-axis of its graph, and hence "run = 1 * gradient of g(x) = gradient of g(x)".

Thus, in fact, we have, by substituting a bit:

"rise of f(x) = gradient of g(x) * gradient of f(x)"

which is the result that we wanted.
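In symbols, the same substitution reads as follows. This is a sketch: Δx, Δu and Δy are notation introduced here for the small changes in x, in g(x) and in f(g(x)), and the approximations become exact only in the limit Δx → 0.

```latex
\begin{align*}
\Delta u &\approx g'(x_1)\,\Delta x = 3\,\Delta x, \\
\Delta y &\approx f'(x_0)\,\Delta u = 6\,\Delta u, \qquad x_0 = g(x_1), \\
\frac{\Delta y}{\Delta x} &\approx f'\bigl(g(x_1)\bigr)\,g'(x_1) = 6 \times 3 = 18 .
\end{align*}
```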
physicsmaths
#18
6 years ago
Use small increments in each function.


MathMeister
#19
Thread starter 6 years ago
(Original post by atsruser)
...
Thank you! It makes perfect sense now!
atsruser
#20
6 years ago
(Original post by MathMeister)
Thank you! It makes perfect sense now!
My pleasure. I added a bit more explanation at the end to try to make it transparently clear.