Thursday 21 February 2013

Proving the Chain Rule

The chain rule is a method of differentiating a function within a function, for example $\sin(x^2)$. If you have studied maths to a sufficient (but not necessary) level it is likely that you will know that $\begin{align*}\frac{d}{dx}\left(fg(x)\right) = g'(x)f'g(x) \end{align*}$, where $fg(x)$ is shorthand for $f(g(x))$ and $f'g(x)$ for $f'(g(x))$, but you may not know why this is the case.
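Before proving anything, it can help to see the rule hold numerically. The sketch below is only an illustration, assuming the example above means $\sin(x^2)$, so that $f = \sin$ and $g(x) = x^2$. The helper names and the central-difference step size are my own choices, not part of the proof.

```python
import math


def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x) (an illustrative choice)."""
    return (func(x + h) - func(x - h)) / (2 * h)


def composite(x):
    """The composite function f(g(x)) with f = sin and g(x) = x^2."""
    return math.sin(x ** 2)


def chain_rule_derivative(x):
    """g'(x) * f'(g(x)) with g'(x) = 2x and f'(y) = cos(y)."""
    return 2 * x * math.cos(x ** 2)


x0 = 1.3  # arbitrary sample point
approx = numerical_derivative(composite, x0)
exact = chain_rule_derivative(x0)
print(f"numerical estimate: {approx:.8f}")
print(f"chain rule value:   {exact:.8f}")
```

The two printed values should agree to several decimal places, which is evidence for the rule but of course not a proof.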

We know from basic differentiation that $\begin{align*}\frac{d}{dx}(g(x)) = \lim_{h\to 0}\left(\frac{g(x+h) - g(x)}{h}\right) \end{align*}$

Let $\begin{align*}v = \frac{g(x+h) - g(x)}{h} - g'(x)\ (1)\end{align*}$. Clearly $v\to 0$ as $h\to 0$.

This idea can be extended to the outer function. As long as $f$ is differentiable at $y$, where $y$ is some function of $x$, then as $\begin{align*}k\to 0,\ \frac{f(y+k)-f(y)}{k} \to f'(y)\end{align*}$.

Let $\begin{align*}w = \frac{f(y+k)-f(y)}{k} - f'(y)\ (2)\end{align*}$. Clearly $w\to 0$ as $k\to 0$.
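To make these error terms concrete, here is a small sketch that computes $v$ for one sample function. The choice $g(x) = x^2$ and the sample point are assumptions made purely for illustration. For this particular $g$, expanding $(x+h)^2$ shows that $v$ is exactly $h$, so it plainly shrinks to zero with $h$.

```python
def g(x):
    """Sample inner function (an assumption for this illustration)."""
    return x ** 2


def g_prime(x):
    """Known derivative of the sample function."""
    return 2 * x


def v(x, h):
    """Error term from (1): difference quotient minus the true derivative."""
    return (g(x + h) - g(x)) / h - g_prime(x)


x0 = 1.5  # arbitrary sample point
steps = (1e-1, 1e-2, 1e-3, 1e-4)
errors = [abs(v(x0, h)) for h in steps]
for h, err in zip(steps, errors):
    print(f"h = {h:g}: |v| = {err:.6g}")
```

The same experiment works for $w$ with any differentiable $f$ in place of $g$ and $k$ in place of $h$.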

Rearranging $(1)$ and $(2)$ we get:

$\begin{align*}g(x+h) = g(x) + \{g'(x)+v\}h\ (3) \end{align*}$

$\begin{align*}f(y+k) = f(y) + \{f'(y)+w\}k\ (4) \end{align*}$

$\begin{align*}(3) \Rightarrow fg(x+h) = f(g(x) + \{g'(x)+v\}h) \end{align*}$

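If we let $k = \{g'(x)+v\}h$ and $y=g(x)$, then clearly $k\to0$ as $h\to0$. (If $k=0$ for some $h\neq0$, the quotient in $(2)$ is undefined; taking $w=0$ there keeps $(4)$ true, since both sides equal $f(y)$.)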

This reduces $(4)$ to $\begin{align*}f(g(x) + \{g'(x)+v\}h) = fg(x) + \{f'g(x)+w\}\{g'(x)+v\}h \end{align*}$

The left hand side of this statement is equal to $fg(x+h)$ by $(3)$, so we are now in a position to simplify $\begin{align*}\frac{fg(x+h)-fg(x)}{h}\end{align*}$

$\begin{align*}\frac{fg(x+h)-fg(x)}{h} \equiv \frac{fg(x)+\{f'g(x)+w\}\{g'(x)+v\}h - fg(x)}{h} \equiv \{f'g(x)+w\}\{g'(x)+v\}\end{align*}$
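The identity above can be checked numerically. The following is a minimal sketch, assuming $f = \sin$ and $g(x) = x^2$ as sample functions (my choice, not part of the proof). It builds $v$, $k$ and $w$ exactly as defined in the text and compares the two sides for a given $h$. The guard for $k = 0$ is an extra assumption of the sketch to avoid dividing by zero.

```python
import math


def f(y):
    return math.sin(y)   # sample outer function (assumption)


def f_prime(y):
    return math.cos(y)


def g(x):
    return x ** 2        # sample inner function (assumption)


def g_prime(x):
    return 2 * x


def both_sides(x, h):
    """Return (LHS, RHS) of the identity for a non-zero step h."""
    v = (g(x + h) - g(x)) / h - g_prime(x)   # definition (1)
    k = (g_prime(x) + v) * h                 # the substitution for k
    y = g(x)
    # definition (2); w is taken as 0 when k == 0 to avoid dividing by zero
    w = (f(y + k) - f(y)) / k - f_prime(y) if k != 0 else 0.0
    lhs = (f(g(x + h)) - f(g(x))) / h
    rhs = (f_prime(y) + w) * (g_prime(x) + v)
    return lhs, rhs


x0 = 0.7  # arbitrary sample point
for h in (1e-1, 1e-3, 1e-6):
    lhs, rhs = both_sides(x0, h)
    print(f"h = {h:g}: LHS = {lhs:.10f}, RHS = {rhs:.10f}")
```

For each $h$ the two sides should match up to floating-point rounding, and as $h$ shrinks both should approach $g'(x)f'g(x)$ at the sample point.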

We now have a reasonably familiar expression that needs a bit of tweaking to get to the final result.

$\begin{align*}\text{LHS}\to\frac{d}{dx}\left(fg(x)\right)\ \text{as}\ h\to0\ \therefore\ \text{RHS}\to\frac{d}{dx}\left(fg(x)\right)\ \text{as}\ h\to0\end{align*}$

As $h\to0$, both $k\to0$ and $v\to0$. Since $w\to0$ as $k\to0$, it follows that $w\to0$ as $h\to0$ too.

$\begin{align*}\Rightarrow \lim_{h\to0}\{f'g(x)+w\}\{g'(x)+v\} = g'(x)f'g(x) \end{align*}$

$\begin{align*}\frac{d}{dx}(fg(x)) = g'(x)f'g(x) \end{align*}$

And that is the chain rule! If at any point you do not understand what I have done please leave a comment.
