Thursday 21 February 2013

Proving the Chain Rule

The chain rule is a method of differentiating a function within a function, for example $\sin x^2$. If you have studied maths to a sufficient (but not necessary) level it is likely that you will know that $\begin{align*}\frac{d}{dx}\left(fg(x)\right) = g'(x)f'g(x) \end{align*}$, where $fg(x)$ means $f(g(x))$, but you may not know why this is the case.

We know from basic differentiation that $\begin{align*}\frac{d}{dx}(g(x)) = \lim_{h\to 0}\left(\frac{g(x+h) - g(x)}{h}\right) \end{align*}$

Let $\begin{align*}v = \frac{g(x+h) - g(x)}{h} - g'(x)\ (1)\end{align*}$; clearly $v\to 0$ as $h\to 0$.

This idea can be extended to a function of a function. If $f$ is differentiable at $y$, where $y$ is some function of $x$, then as $\begin{align*}\ k\to 0\ ,\  \frac{f(y+k)-f(y)}{k} \to f'(y)\end{align*}$

Let $\begin{align*}w = \frac{f(y+k)-f(y)}{k} - f'(y)\ (2)\end{align*}$; clearly $w\to 0$ as $k\to 0$. (If $k = 0$ the fraction is undefined, so take $w = 0$ in that case.)

Rearranging $(1)$ and $(2)$ we get:

$\begin{align*}g(x+h) = g(x) + \{g'(x)+v\}h\ (3) \end{align*}$

$\begin{align*}f(y+k) = f(y) + \{f'(y)+w\}k\ (4) \end{align*}$

$\begin{align*}(3) \Rightarrow fg(x+h) = f(g(x) + \{g'(x)+v\}h) \end{align*}$

If we let $k = \{g'(x)+v\}h$ and $y=g(x)$, clearly $k\to0$ as$\ h\to0$

This reduces $(4)$ to$\begin{align*}\ \ f(g(x) + \{g'(x)+v\}h) = fg(x) + \{f'g(x)+w\}\{g'(x)+v\}h \end{align*}$

The left hand side of this statement is equal to $fg(x+h)$, so we are now in a position to simplify $\begin{align*}\frac{fg(x+h)-fg(x)}{h}\end{align*}$

$\begin{align*}\frac{fg(x+h)-fg(x)}{h} \equiv \frac{fg(x)+\{f'g(x)+w\}\{g'(x)+v\}h - fg(x)}{h} \equiv \{f'g(x)+w\}\{g'(x)+v\}\end{align*}$

We now have a reasonably familiar expression that needs a bit of tweaking to get to the final result.

$\begin{align*}LHS\to\frac{d}{dx}\left(fg(x)\right)\ as\ h\to0\ \therefore\ RHS\to\frac{d}{dx}\left(fg(x)\right)\ as\ h\to0\end{align*}$

As $h\to0$, both $k\to0$ and $v\to0$. Since $w\to0$ as $k\to0$, we also have $w\to0$ as $h\to0$.

$\begin{align*}\Rightarrow \lim_{h\to0}\{f'g(x)+w\}\{g'(x)+v\} = g'(x)f'g(x) \end{align*}$

$\begin{align*}\frac{d}{dx}(fg(x)) = g'(x)f'g(x) \end{align*}$
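If you would like to see the quantities $v$, $w$ and $k$ from the proof actually shrinking, here is a minimal numerical sketch. The choices $f(y) = \sin y$, $g(x) = x^2$ and the point $x = 1.3$ are my own assumptions for illustration, as are the function names; nothing here is part of the proof itself.

```python
import math

# Illustrative choices (assumptions): f(y) = sin(y), g(x) = x^2.
def f(y):
    return math.sin(y)

def f_prime(y):
    return math.cos(y)

def g(x):
    return x * x

def g_prime(x):
    return 2 * x

def chain_rule_terms(x, h):
    """Return (difference quotient of f(g(x)), {f'(g(x)) + w}{g'(x) + v}, v, w)."""
    y = g(x)
    v = (g(x + h) - g(x)) / h - g_prime(x)   # equation (1)
    k = (g_prime(x) + v) * h                 # k = {g'(x) + v}h
    # equation (2); w is taken to be 0 if k happens to be 0
    w = (f(y + k) - f(y)) / k - f_prime(y) if k != 0 else 0.0
    quotient = (f(g(x + h)) - f(g(x))) / h
    product = (f_prime(y) + w) * (g_prime(x) + v)
    return quotient, product, v, w

x = 1.3
exact = g_prime(x) * f_prime(g(x))           # g'(x) f'(g(x))
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    quotient, product, v, w = chain_rule_terms(x, h)
    print(f"h={h:g}  quotient={quotient:.8f}  product={product:.8f}  v={v:.2e}  w={w:.2e}")
print(f"chain rule value: {exact:.8f}")
```

The quotient and the product should agree up to rounding for every $h$, and both should drift towards the chain rule value as $v$ and $w$ shrink.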

And that is the chain rule! If at any point you do not understand what I have done please leave a comment.

Monday 18 February 2013

Triangle Properties: STEP I 2004 Question 6 Solution

This is an interesting STEP question (as many of them are!) that gets you to prove, unwittingly, some interesting properties of triangles. I also hope that this post can illustrate the train of thought needed throughout a STEP question.


At first glance there are probably a few viable ways of attacking the problem, some neater than others. The most obvious one is to go at it algebraically and try to achieve the desired result.

$\begin{align*}Midpoint\ of\ BC,\ M_{BC}\left(\frac{p_2 + p_3}{2}, \frac{q_2 + q_3}{2}\right)\end{align*}$

$\begin{align*}Midpoint\ of\ AC,\ M_{AC}\left(\frac{p_1 + p_3}{2}, \frac{q_1 + q_3}{2}\right)\end{align*}$

$\begin{align*}Gradient\ of\ line\ connecting\ M_{BC}\ and\ A,\ m_1 =  \frac{q_2 + q_3 - 2q_1}{p_2 + p_3 - 2p_1} \end{align*}$

$\begin{align*}Gradient\ of\ line\ connecting\ M_{AC}\ and\ B,\ m_2 =  \frac{q_1 + q_3 - 2q_2}{p_1 + p_3 - 2p_2} \end{align*}$

As you can see, this turns very messy, very quickly. I could persevere, find the equations of both lines, equate them and solve for $x$, but that is going to be a humongous algebraic slog. There are then two options: persevere, or go at it via a different method. Given that this is only the first part of the question, looking for a more concise method is undoubtedly the way to go.

Given that we are thinking about geometrical properties of lines and their midpoints it makes sense to use vectors to solve this problem. Most importantly it will likely cut down on the algebra a lot due to their concise notation.

$\begin{align*}\ \ Let\  \overrightarrow{OA} = \mathbf{a} = \begin{bmatrix} p_1 \\ q_1 \end{bmatrix},\ \overrightarrow{OB} = \mathbf{b} = \begin{bmatrix} p_2 \\ q_2 \end{bmatrix},\ \overrightarrow{OC} = \mathbf{c} = \begin{bmatrix} p_3 \\ q_3 \end{bmatrix} \end{align*}$

$\begin{align*}Let\ the\ midpoints\ of\ AB,\ AC,\ BC\ be\ denoted\ D,\ E,\ F \end{align*}$

$\begin{align*}\ \ \therefore \overrightarrow{OD} = \mathbf{\frac{1}{2}(a+b)},\ \overrightarrow{OE} = \mathbf{\frac{1}{2}(a+c)},\ \overrightarrow{OF} = \mathbf{\frac{1}{2}(b+c)}\end{align*}$

That has set up all of the required details to begin actually answering the question. The line joining A to the midpoint of BC is $AF$, and the line joining B to the midpoint of AC is $BE$. Their vector equations are:

$\begin{align*}\mathbf{r_{AF}} = \mathbf{a} + \lambda (\mathbf{b + c - 2a}) \end{align*}$

$\begin{align*}\mathbf{r_{BE}} = \mathbf{b} + \mu (\mathbf{a+c-2b}) \end{align*}$

They intersect when $\mathbf{r_{AF}} = \mathbf{r_{BE}}$.

$\Rightarrow \mathbf{a} - \mathbf{b} = (\mu - \lambda)\mathbf{c} + (\mu + 2\lambda)\mathbf{a} - (\lambda + 2\mu)\mathbf{b}$

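Choosing $\mu = \lambda$ removes the $\mathbf{c}$ term. Two non-parallel lines meet in exactly one point, so any $\lambda,\ \mu$ satisfying the equation must give that point, and we can try $\mu - \lambda = 0$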

$\Rightarrow \mathbf{a} - \mathbf{b} = 3\lambda\mathbf{a} - 3\lambda\mathbf{b}$

This must then mean that $\begin{align*}\lambda = \frac{1}{3}\end{align*}$. Denote the point of intersection $Q$; it lies on $\mathbf{r_{AF}}$ with $\begin{align*}\lambda = \frac{1}{3}\end{align*}$.

$\begin{align*}\ \ \ \overrightarrow{OQ} = \mathbf{a} + \frac{1}{3}(\mathbf{b} + \mathbf{c} - 2\mathbf{a}) = \frac{1}{3}(\mathbf{a} + \mathbf{b} + \mathbf{c}) = \frac{1}{3}\begin{bmatrix} p_1+p_2+p_3 \\ q_1+q_2+q_3 \end{bmatrix}\end{align*}$

$\begin{align*}\therefore Q\left(\frac{p_1+p_2+p_3}{3},\frac{q_1+q_2+q_3}{3}\right)\end{align*}$

We have found the point of intersection. We now need to show that it lies on the line connecting C to the midpoint of AB, which is the line $CD$ with vector equation:

$\begin{align*}\mathbf{r_{CD}} = \mathbf{c} + \omega (\mathbf{a + b - 2c})\end{align*}$

$\begin{align*}\ \ \overrightarrow{OQ} = \mathbf{r_{CD}} \Rightarrow \frac{1}{3}\mathbf{a} + \frac{1}{3}\mathbf{b} - \frac{2}{3}\mathbf{c} = \omega(\mathbf{a} + \mathbf{b} - 2\mathbf{c})\end{align*}$

This is satisfied by $\omega = \frac{1}{3}$ $\therefore\ Q\ lies\ on\ CD$.

Let us think about what we have actually shown here. We have shown that for any triangle $ABC$, the lines connecting each vertex to the midpoint of the opposite side all meet at a common point (they are concurrent). This point is called the centroid, and it is useful when finding the centre of mass of an object.
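As a quick sanity check, the three medians can be evaluated with exact fractions. This is only a sketch: the triangle's coordinates and the helper name `median_point` are made up for illustration, and any non-degenerate triangle should behave the same way.

```python
from fractions import Fraction

def median_point(vertex, other1, other2, t):
    """Point vertex + t * (other1 + other2 - 2 * vertex), the form used for r_AF, r_BE, r_CD."""
    return tuple(v + t * (p + q - 2 * v) for v, p, q in zip(vertex, other1, other2))

# Hypothetical example triangle (an assumption, not from the question).
A = (Fraction(1), Fraction(2))
B = (Fraction(7), Fraction(-1))
C = (Fraction(4), Fraction(8))

third = Fraction(1, 3)
Q_from_A = median_point(A, B, C, third)   # point on AF with lambda = 1/3
Q_from_B = median_point(B, A, C, third)   # point on BE with mu = 1/3
Q_from_C = median_point(C, A, B, third)   # point on CD with omega = 1/3
centroid = tuple((a + b + c) / 3 for a, b, c in zip(A, B, C))

print(Q_from_A, Q_from_B, Q_from_C, centroid)
```

All four printed points should coincide, matching $Q\left(\frac{p_1+p_2+p_3}{3},\frac{q_1+q_2+q_3}{3}\right)$.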

To answer the next part of the question we will also use vectors. This part seems almost tailor made for them, given that we are looking for when two lines are perpendicular.

$\begin{align*}\ \ \overrightarrow{OH} = \mathbf{h} = \begin{bmatrix} p_1 + p_2 + p_3 \\ q_1 + q_2 + q_3 \end{bmatrix} = \mathbf{a}+\mathbf{b}+\mathbf{c}\end{align*}$

$\begin{align*}\mathbf{r_{AH}} = \mathbf{a} + \lambda (\mathbf{h} - \mathbf{a}) = \mathbf{a} + \lambda (\mathbf{b} + \mathbf{c}) \end{align*}$

$\begin{align*}\mathbf{r_{BC}} = \mathbf{b} + \mu (\mathbf{c} - \mathbf{b}) \end{align*}$

If $AH \perp BC,\ (\mathbf{b} + \mathbf{c}) \cdot (\mathbf{c} - \mathbf{b}) = 0 \Rightarrow \mathbf{b} \cdot \mathbf{b} = \mathbf{c} \cdot \mathbf{c} \Rightarrow |\mathbf{b}| = |\mathbf{c}| \therefore p_2^2+q_2^2 = p_3^2+q_3^2 (1)$ as required.

A virtually identical method is taken for $BH \perp AC$.

$\begin{align*}\mathbf{r_{BH}} = \mathbf{b} + \lambda (\mathbf{a} + \mathbf{c}) \end{align*}$

$\begin{align*}\mathbf{r_{AC}} = \mathbf{a} + \mu (\mathbf{c} - \mathbf{a}) \end{align*}$

If $BH \perp AC,\ (\mathbf{a} + \mathbf{c}) \cdot (\mathbf{c} - \mathbf{a}) = 0 \Rightarrow \mathbf{a} \cdot \mathbf{a} = \mathbf{c} \cdot \mathbf{c} \Rightarrow |\mathbf{a}| = |\mathbf{c}| \therefore p_1^2+q_1^2 = p_3^2+q_3^2 (2)$.

The final part can be easily deduced if we subtract the two statements we have already shown to be true, eliminating $p_3,\ q_3$.

$\begin{align*}\ \ (1) - (2) \Rightarrow p_1^2 + q_1^2 = p_2^2 + q_2^2 \Rightarrow(\mathbf{a} + \mathbf{b}) \cdot (\mathbf{b} - \mathbf{a}) = 0 \therefore AC \perp BH,\ BC \perp AH \Rightarrow AB \perp CH \end{align*}$

And that is the entirety of the question completed!

The line through a vertex which is perpendicular to the opposite side is called an altitude. We have shown that if the altitudes from $A$ and $B$ pass through $H$, then so does the altitude from $C$, so the three altitudes meet at a common point (they are concurrent). This point is called the orthocentre.
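The perpendicularity conditions can also be checked with a few dot products. In this rough sketch the example points and the helper names (`dot`, `sub`, `altitude_checks`) are my own assumptions. The first triangle has $|\mathbf{a}| = |\mathbf{b}| = |\mathbf{c}| = 5$; the second deliberately breaks $|\mathbf{c}| = 5$.

```python
def dot(u, v):
    """Dot product of two 2D vectors."""
    return u[0] * v[0] + u[1] * v[1]

def sub(u, v):
    """Vector u - v."""
    return (u[0] - v[0], u[1] - v[1])

def altitude_checks(A, B, C):
    """Return the dot products (AH.BC, BH.AC, CH.AB) with H = A + B + C."""
    H = (A[0] + B[0] + C[0], A[1] + B[1] + C[1])
    return (dot(sub(H, A), sub(C, B)),
            dot(sub(H, B), sub(C, A)),
            dot(sub(H, C), sub(B, A)))

# Hypothetical points, all at distance 5 from the origin.
print(altitude_checks((3, 4), (-5, 0), (4, -3)))   # → (0, 0, 0)

# Moving C so that |c| != |a| = |b| breaks the first two conditions only.
print(altitude_checks((3, 4), (-5, 0), (1, 1)))    # → (-23, -23, 0)
```

In the second case the nonzero values are $|\mathbf{c}|^2 - |\mathbf{b}|^2$ and $|\mathbf{c}|^2 - |\mathbf{a}|^2$, exactly the quantities that appear in statements $(1)$ and $(2)$.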

This question is very nice for getting you to prove interesting properties of triangles unwittingly, and I highly recommend that you do this question yourself. If you do not understand what I have done at any point, comment and I will explain.

Sunday 17 February 2013

Logarithmic Differentiation: Proving the Product Rule

If you have studied maths to a pre-undergraduate level then it is pretty likely that you have met the product rule for differentiating the product of two functions of $x$:$\ u,\ v$

$\begin{align*}\frac{d}{dx}(uv) = u'v+v'u\end{align*}$

But, I hear you scream, why?! Well, there is a very neat proof of the product rule, and it can be extended to the quotient rule, or the product of three functions, or eighteen.

Consider $y = uv$ for some functions of $x$: $u,\ v$ (taken to be positive so that the logarithms below are defined). This then means that:

$\ln y = \ln uv$
$\ln y = \ln u + \ln v$

We can then differentiate both sides implicitly to get:

$\begin{align*}\frac{1}{y} \frac{dy}{dx} = \frac{u'}{u} + \frac{v'}{v}\end{align*}$

$\begin{align*}\frac{1}{y} \frac{dy}{dx} = \frac{u'v + v'u}{uv}\end{align*}$

$y = uv$ so we can then multiply through by $uv$ to get:

$\begin{align*}\frac{dy}{dx} = u'v + v'u\end{align*}$

Which is the product rule! This is a very concise little proof of the product rule, but it does assume that you can differentiate implicitly and also that $\begin{align*}\frac{d}{dx}(\ln f(x)) = \frac{f'(x)}{f(x)}\end{align*}$.
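As a sketch of how the same argument stretches to three functions, take $y = uvw$, where $w$ is a third function of $x$ that I am introducing purely for illustration (and all three are assumed positive so the logs exist):

$\begin{align*}\ln y = \ln u + \ln v + \ln w\end{align*}$

$\begin{align*}\frac{1}{y} \frac{dy}{dx} = \frac{u'}{u} + \frac{v'}{v} + \frac{w'}{w}\end{align*}$

$\begin{align*}\frac{dy}{dx} = u'vw + uv'w + uvw'\end{align*}$

The last line comes from multiplying through by $y = uvw$, just as before.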

Logarithmic differentiation also has useful applications to more complicated derivatives, by reducing more complex functions to ones that are much simpler to differentiate.

$\begin{align*}\ e.g\ Given\ y = xe^x\cos (1+x^2)\ find \frac{dy}{dx}\end{align*}$

Taking logs of both sides and rearranging gives:

$\begin{align*}\ \ln y = \ln x + x\ln e + \ln \cos (1+x^2)\end{align*}$

We can differentiate both sides implicitly as we did earlier and we get:

$\begin{align*}\ \frac{1}{y}\frac{dy}{dx} = \frac{1}{x} + 1 - \frac{2x\sin(1+x^2)}{\cos(1+x^2)}\end{align*}$

$\begin{align*}\ \frac{1}{y}\frac{dy}{dx} = \frac{\cos(1+x^2) + x\cos(1+x^2) - 2x^2\sin(1+x^2)}{x\cos(1+x^2)} \end{align*}$

Multiplying through by $y$ we get:

$\begin{align*}\ \frac{dy}{dx} = e^x\left((x+1)\cos(1+x^2) - 2x^2\sin(1+x^2)\right) \end{align*}$

It only took a few lines to evaluate a pretty complex derivative, rather than having to break it down and use the product rule on a product of three functions.
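If you want some reassurance that the algebra is right, the result can be compared against a central difference estimate. This is a minimal sketch: the sample points, the step size `h` and the function names are arbitrary choices of mine.

```python
import math

def y(x):
    """The example function y = x e^x cos(1 + x^2)."""
    return x * math.exp(x) * math.cos(1 + x * x)

def dy_dx(x):
    """Derivative obtained above by logarithmic differentiation."""
    return math.exp(x) * ((x + 1) * math.cos(1 + x * x)
                          - 2 * x * x * math.sin(1 + x * x))

def central_difference(func, x, h=1e-6):
    """Numerical estimate of func'(x); h is an arbitrary small step."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (0.5, 1.0, 2.0):
    print(x, dy_dx(x), central_difference(y, x))
```

The two columns of values should agree to several decimal places, including at points where $y$ is negative and taking logs is not strictly valid.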

See if you can prove that $\begin{align*}\frac{d}{dx}\left(\frac{u}{v}\right) = \frac{u'v-v'u}{v^2}\end{align*}$, and also evaluate $\begin{align*}\frac{d}{dx}\left(\frac{3x^2\sin x}{\sqrt{x^2+1}}\right)\end{align*}$.