Differentiation by First Principles

00:00:00.000 Okay, we're going to work through differentiation from first principles.

00:00:06.000 I'm going to look at the theory, and then we're going to try an example as well.

00:00:10.000 I'm going to start off with my set of axes here and draw a really general curve.

00:00:15.000 This curve could be anything at all. You're familiar with y equals x squared, y equals x cubed,

00:00:20.000 y equals 3x to the power of 5 plus 4x to the power of 4 plus 3x to the power of 8.

00:00:25.000 You get the picture. It could be anything.

00:00:28.000 So there's my x and y axes, and there's my really general curve.

00:00:33.000 It's y equals f of x, some function of x.

00:00:37.000 That's a point p on the curve, and there's a tangent through p on the curve.

00:00:44.000 This is going to become important later. There's another point q on the curve.

00:00:49.000 Now, if I draw a line through p and through q, I don't have a tangent, but I've got a secant.

00:00:57.000 We could calculate the gradient of that secant, and that's what we'll do.

00:01:02.000 One of the purposes of differentiation from first principles is to look at the gradient of the tangent through p.

00:01:12.000 Now, so that this process can work properly, what I want is to make sure that the interval between p and q is as small as possible.

00:01:24.000 And the reason for that is that the closer the q gets to p, then the more the secant through p and q looks like the tangent just through p.

00:01:38.000 Now, my q has moved towards p there. My animation is slightly off here, though.

00:01:43.000 The secant, the red secant there, probably should have moved as well.

00:01:48.000 Okay, that needed to move so that it would look like it had the gradient of the orange line, which is the tangent.

00:01:57.000 So as my q approaches p, the gradient of my secant, which is there, gets closer and closer to the gradient of the orange line, which is my tangent.

00:02:12.000 And in mathematics, we use the arrow to mean approaches.

00:02:18.000 The notation m sub PQ, followed by the arrow, means the gradient of pq approaches the gradient at the point p on the curve y equals f of x.

00:02:27.000 Now, that statement, that the gradient of pq approaches the gradient at p, is true as long as the distance between p and q gets increasingly smaller.
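
In symbols, the statement on the slide can be sketched roughly as follows. The label m_P for the gradient of the tangent at P is an assumption made here for illustration; only m_PQ is named in the lecture.

```latex
\[
  m_{PQ} \to m_{P} \quad \text{as} \quad Q \to P
\]
```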

00:02:41.000 We're going to actually give our points coordinates now, although they're going to be very general.

00:02:50.000 p is now the point with x coordinate x and y coordinate f of x.

00:02:58.000 And q is the point x plus h, f of x plus h.

00:03:04.000 Now, where did that come from?

00:03:06.000 Well, all that's saying is that the distance along the x-axis between p and q is a distance of h units.

00:03:17.000 In other words, if p is at the point x, then q is at the point x plus h.

00:03:27.000 Now, similarly, if p sits at a height of f of x on the y-axis, then q sits at a height of f of x plus h, that is, the function evaluated at x plus h.
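
Written out, the two general points and the horizontal and vertical differences between them look like this. The words "run" and "rise" are borrowed from the gradient formula used later in the lecture; the layout is only a sketch.

```latex
\[
  P = \bigl(x,\ f(x)\bigr), \qquad Q = \bigl(x + h,\ f(x + h)\bigr)
\]
\[
  \text{run} = (x + h) - x = h, \qquad \text{rise} = f(x + h) - f(x)
\]
```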

00:03:42.000 Now, this is generally the way that the coordinates are written when one's trying to differentiate by first principles.

00:03:48.000 Very similar to the way in which textbook exercises do it.

00:03:51.000 There are other methods, but this is the method we'll stick with today.

00:03:56.000 Using that letter h to represent the distance between p and q on the x-axis.

00:04:01.000 So, you see here with my cursor, I'm pointing to the x-axis.

00:04:04.000 So, here is approximately where on the x-axis p is.

00:04:09.000 And if we go a little bit further along, then you find q.

00:04:15.000 And the distance between p and q on that x-axis is h units.

00:04:22.000 h approaches zero means that the distance between p and q along the x-axis approaches zero.

00:04:27.000 So, from here to here, that approaches zero, the closer and closer p and q get together.

00:04:36.000 All right, so let's put all of that notation into some sort of formula.

00:04:44.000 We already know that the gradient equals the rise over the run.

00:04:47.000 And that's the difference between the y values divided by the difference between the x values.

00:04:53.000 y2 minus y1 over x2 minus x1.

00:04:58.000 And this is the reason why it's difficult without calculus or impossible without calculus to talk about the gradient at a point.

00:05:08.000 And that's what we're interested in doing with calculus with respect to curves.

00:05:13.000 We want to find what is the gradient at a point on a curve?

00:05:18.000 The reason why we've got straight lines here is because we can use the gradient formula to define what the gradient is for a straight line.

00:05:26.000 The only way it's going to make sense for a single point on a curve is when the two points that we're interested in are extremely close together.

00:05:36.000 As close as you like, infinitesimally close together.

00:05:45.000 I'm going to define something therefore called the gradient function.

00:05:49.000 So, it's not the gradient as a single value now, but the gradient will have a different value at different points along the curve.

00:05:58.000 We can see on my blue curve here at the point p, the gradient looks relatively steep, but up here near q, the gradient seems fairly flat.

00:06:07.000 In other words, steepness of the curve isn't very great.

00:06:10.000 As we move away from q and we head back down the other side of the curve, the blue curve, what we find is the gradient turns negative.

00:06:17.000 In other words, the gradient is changing at each point along the curve.

00:06:20.000 So, rather than being a single number, the gradient is going to be some function of x. As you move along the x axis, the gradient will be changing.

00:06:30.000 We have the gradient as a function of x, and that's why we're using this new notation.

00:06:35.000 This new notation is due to Gottfried Wilhelm Leibniz, and Leibniz used the letter d because d stands for difference.

00:06:44.000 The difference in the y values divided by the difference in the x values.

00:06:50.000 And as I've said there, it's simply new notation.

00:06:52.000 So, regard it as being analogous to the letter m when we're talking about straight lines.

00:07:00.000 It's the change in y with respect to x or simply the gradient function.

00:07:06.000 This is a formula that I was somewhat reluctant to put in, but I thought it'll make things a little bit clearer.

00:07:11.000 You don't see this generally in textbooks, but it'll emphasize what's to come.

00:07:16.000 And the gradient function is equal to the limit as h approaches zero.

00:07:21.000 In other words, as that distance along the x axis between p and q gets ever smaller as it approaches zero, then we're interested in finding out what the gradient of that straight line from p to q is because that will approximately be the same as the gradient exactly at p.

00:07:40.000 Indeed, as h approaches zero and effectively equals zero because it's so infinitesimally small, the gradient of the straight line pq will be the gradient at p.
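
The idea that the secant gradient settles down as h shrinks can be seen numerically. The following is a minimal sketch, not part of the lecture: the function names and the choice of y = x cubed at x = 1 are assumptions made purely for illustration.

```python
def secant_gradient(f, x, h):
    """Gradient of the secant through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


def cubic(x):
    # Illustrative curve only; any function of x could be used here.
    return x ** 3


def shrinking_h_table(f, x, steps=6):
    """Return (h, secant gradient) pairs for h = 0.1, 0.01, ... at a fixed x."""
    return [(10.0 ** -k, secant_gradient(f, x, 10.0 ** -k))
            for k in range(1, steps + 1)]


if __name__ == "__main__":
    # As h gets smaller, the printed gradients settle towards a single value.
    for h, gradient in shrinking_h_table(cubic, 1.0):
        print(f"h = {h:<8g} secant gradient = {gradient:.6f}")
```

For this particular curve the values settle towards 3 as h shrinks, which is the gradient of the tangent at x = 1.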

00:07:53.000 And if we want to calculate the gradient from p to q, then we can use this gradient formula, y2 minus y1 over x2 minus x1.

00:08:02.000 But the coordinates of y2, y1, x2 and x1, which we defined on the previous slide, are slightly different to what we're used to.

00:08:16.000 They're x, f of x, and x plus h, f of x plus h.

00:08:21.000 So I just wanted to emphasize there that the gradient function comes from a limit formula. It's a limit, the gradient function.

00:08:36.000 And as I say here, we want a formula for the gradient of the tangent to the curve at the point p. And this is the gradient of the secant between p and q.

00:08:50.000 The coordinates of p are x, f of x, and the coordinates of q are x plus h, f of x plus h. And when the difference between them along the x-axis is infinitesimal, then this formula will work. So now I'm going to sub in those coordinates here into this formula.

00:09:12.000 This formula here that's appeared now is one step prior to the formula you'll get for differentiation by first principles in the textbook.

00:09:22.000 And the way that you get this formula is simply to take this formula here and to sub in the coordinates for p and the coordinates for q.

00:09:34.000 If you let y2 and x2 be the coordinates of q and x1 and y1 be the coordinates of p.

00:09:44.000 So what is y2? Well, y2 is f of x plus h. And what's y1? That's f of x. And similarly on the denominator here, you can see x2 is x plus h.

00:09:56.000 x1 is just x. We can simplify the denominator: remove the brackets, and the x's cancel to leave you with nothing but h down the bottom there.
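
The substitution and simplification just described can be laid out line by line. This is a sketch of the steps as spoken, with y1, y2, x1 and x2 taken from the coordinates of p and q.

```latex
\[
\begin{aligned}
  \frac{dy}{dx}
    &= \lim_{h \to 0} \frac{y_2 - y_1}{x_2 - x_1} \\
    &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{(x + h) - x} \\
    &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}
\end{aligned}
\]
```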

00:10:10.000 Now, for those of you who are a step ahead, you might realize that, well, you can't simply substitute h is equal to zero into this formula because you're not allowed to divide by zero.

00:10:23.000 It's against the law, or it's against the law of maths. You can't divide by zero.

00:10:28.000 And so what that means is that when we do the algebra for a specific curve and differentiate by first principles, you'll find that it will always be the case that the h on the bottom will cancel with a factor on the numerator.
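
A short sketch shows why h cannot simply be set to zero. The function names here are hypothetical, chosen only for illustration; the point is that the quotient is undefined at h = 0, so the h has to be cancelled algebraically first.

```python
def difference_quotient(f, x, h):
    """Rise over run between (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


def try_h_equal_zero(f, x):
    """Attempt the quotient with h = 0; return None if it cannot be evaluated."""
    try:
        return difference_quotient(f, x, 0)
    except ZeroDivisionError:
        # Substituting h = 0 directly means dividing by zero.
        return None


def square(x):
    # Illustrative curve only.
    return x * x
```

Calling `try_h_equal_zero(square, 3)` gives back `None`, because the direct substitution fails, whereas a small non-zero h gives a usable estimate.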

00:10:46.000 We'll see that in an example that we do shortly. But before that, we might just recap the derivation of that formula we've just found, without the graph.

00:11:01.000 So once more, remember, my x1 value is simply x, and my x2 value is just a little bit bigger than this, namely, x plus h. What's h? Well, h is a number as close as you like to zero. This means that the two y values will be f of x and f of x plus h.

00:11:20.000 Our gradient formula, y2 minus y1 over x2 minus x1, takes on a slightly different form when, instead of talking about the gradient, we talk about the gradient function, dy dx. So instead of the letter m, we're now going to use dy dx.

00:11:36.000 And once more, the reason for that is because m, when we're talking about straight lines, equals a single number, m is equal to two or minus four, it's equal to a constant number. But with curves, it won't be, there'll be an x involved, so it might be two x or three x or four x squared, it'll be a function of x.

00:11:58.000 So that gradient function is going to be defined as a limit as h approaches zero of the gradient formula.

00:12:10.000 Now, I've substituted in x1 and y1 and x2 and y2. They're the coordinates of p and q. A little bit of simplification on the denominator leads to the formula that you really need to commit to memory.

00:12:32.000 The formula for differentiation by first principles. And what you can notice is the formula still looks like a rise over a run. The top of that formula is still the difference between the y coordinates, and the bottom is still the difference between the x coordinates.

00:13:01.000 Time for an example.

00:13:04.000 We're going to start with the parabola. It's the place that I think most teachers start, just looking at the most simple case, the curve y equals x squared.

00:13:15.000 And I want to find the gradient of a tangent at any point on the parabola.

00:13:20.000 There's my parabola. And I want to know what is the gradient of a tangent there? Or really, perhaps I want to know what the gradient of a tangent, say there is.

00:13:34.000 Or maybe even the gradient of a tangent there. Looks like that green tangent probably has a gradient of zero, by the look of things.

00:13:44.000 What we need to know is: is there a single formula for this particular curve, y equals x squared, that will be able to tell us what the gradient is at any point on that parabola?

00:13:56.000 And that's our purpose. What is the gradient of a tangent drawn at any point on that parabola?

00:14:02.000 Is there a relationship between the gradient of the tangent drawn and the x value where the tangent intersects the curve?
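
One way to get a feel for this question before doing any algebra is to estimate the gradient at a handful of x values using a very small h. This is only a numerical sketch: the function names, the sample points and the step size 1e-6 are assumptions, not part of the lecture.

```python
def secant_gradient(f, x, h=1e-6):
    """Estimate the tangent gradient at x using a very short secant."""
    return (f(x + h) - f(x)) / h


def parabola(x):
    return x ** 2


def gradient_estimates(xs, h=1e-6):
    """Pair each x value with the estimated gradient of y = x^2 there."""
    return [(x, secant_gradient(parabola, x, h)) for x in xs]


if __name__ == "__main__":
    for x, gradient in gradient_estimates([-2.0, -1.0, 0.0, 1.0, 2.0]):
        print(f"x = {x:5.1f}   estimated gradient = {gradient:8.4f}")
```

The estimates come out very close to twice the x value at each point, which hints at the relationship the worked example goes on to derive.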

00:14:24.000 A word that you're going to become increasingly familiar with is differentiation. To differentiate a curve means to find the gradient function given that curve.

00:14:35.000 So differentiation will give us a formula for this parabola that will be able to generate the value of the gradient for the tangent to any curve.

00:14:44.000 So we're not just going to stick with y equals x squared, we're eventually going to look at more general cases.

00:14:53.000 Okay, so we've already generated a formula to enable us to differentiate by first principles.

00:15:02.000 And now we're going to try a specific example and calculate the gradient function, which we're going to call dy dx or dy by dx, for a particular function.

00:15:16.000 And we're going to use y equals x squared. And the process we're going through is often referred to as differentiation, a word you're going to become increasingly familiar with over time.

00:15:27.000 And for the case of y equals x squared, the parabola, we're going to not use y anymore, we're going to use f of x instead.

00:15:39.000 And our f of x is going to represent x squared. So in the gradient function formula over here, wherever I see an f of x, I'm going to sub in x squared.

00:15:53.000 And, well, our function notation shows us that if the function of x is x squared, then the function of x plus h must be x plus h, all squared. Okay, so I'm going to be able to sub in two things on the numerator in order to eventually derive what my gradient function will be.

00:16:17.000 So let's see our first step here: my f of x is x squared, and my f of x plus h is x plus h, all squared. Okay, if f of x is x squared, and I want to know what f of x plus h is, then what I need to do is to find wherever there's an x and substitute x plus h.

00:16:44.000 Now, I'm going to expand this bracket here, and so I've expanded x plus h, all squared to become x squared plus two x h plus h squared.

00:16:59.000 I've still got my x squared left at the end there, and so in the next line that will become important.

00:17:08.000 The x squared here, as a result of my expansion of x plus h, all squared, and my minus x squared here, effectively cancel each other out, and I'm left with two x h plus h squared, and that's all divided by h.

00:17:28.000 The next line is where I've taken out a common factor. I really should have included an extra line there, but for those of you who are able to follow me, you'll see that h appears in two x h and h appears in h squared, and so my h on the bottom can divide through the numerator, leaving me with nothing but two x plus h on the top there. The h on the bottom has cancelled with that h, and it's cancelled with one of these h's.

00:17:57.000 Because h approaches zero, I'm left with nothing else but two x.
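
For reference, the whole working for y equals x squared can be set out as follows, including the factorising line that was skipped on the slide. This is a reconstruction from the spoken steps rather than a copy of the slide.

```latex
\[
\begin{aligned}
  \frac{dy}{dx}
    &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
    &= \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} \\
    &= \lim_{h \to 0} \frac{x^2 + 2xh + h^2 - x^2}{h} \\
    &= \lim_{h \to 0} \frac{2xh + h^2}{h} \\
    &= \lim_{h \to 0} \frac{h(2x + h)}{h} \\
    &= \lim_{h \to 0} (2x + h) \\
    &= 2x
\end{aligned}
\]
```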

00:18:04.000 Okay, we're going to try another example here.

00:18:08.000 There'll be text up on your screen at the moment, which basically explains what we're about to do, which is differentiate y equals 4x cubed minus 3x from first principles.

00:18:21.000 We've already differentiated y equals x squared, and we just want to recall that to differentiate is to find the function which, when we substitute the x value into it, it will give us the value of the gradient of a tangent at that point on the curve.

00:18:36.000 If I was to use the letter m, then m equals 2x is the formula for the gradient of the tangent to y equals x squared at the point x.

00:18:45.000 I don't normally use the letter m, though, when we're talking about the gradient of curves, because m normally stands for a constant.

00:18:53.000 And so for that reason, we don't use m. Instead, we use new notation, which is dy by dx or dy dx.

00:19:03.000 Okay, there's my formula.

00:19:06.000 Once more, it needs to be committed to memory so that we can begin doing this question.

00:19:16.000 Okay, so I probably should have written up there which one we're doing, but we can tell from here that what I'm trying to do is to differentiate y equals 4x cubed minus 3x.

00:19:32.000 Okay, so that's my purpose here. The reason I haven't got it written up there, though, is because it simply won't fit on the slide.

00:19:38.000 You'll notice that a few of these questions that you do require a lot of space.

00:19:44.000 You know, you're going to be looking at half a page to a page of working.

00:19:47.000 Sometimes, depending upon how many lines of algebra you can do in your head.

00:19:52.000 Okay, so let's begin this one.

00:19:55.000 We're subbing in for f of x, this here.

00:20:00.000 We also need to put in f of x plus h.

00:20:04.000 So f of x plus h is the formula that you get when you put x plus h into the spots in f of x wherever you see an x.

00:20:18.000 So here I see an x and here I see an x.

00:20:21.000 So I'm going to put x plus h next to the 4, all cubed, and I'm going to put an x plus h next to the 3.

00:20:31.000 So I've got 4 times x plus h, all cubed, minus 3 times x plus h, and that's my function of x plus h.

00:20:39.000 From that, I'm going to subtract my function of x and that's how I get my numerator.

00:20:46.000 It's all divided by h. Let's expand the brackets.

00:20:51.000 So here we've used the fact that the function f of x is equal to 4x cubed minus 3x.

00:20:57.000 And so f of x plus h equals 4 times x plus h, all cubed, minus 3 times x plus h.

00:21:05.000 Okay, so what's happened here?

00:21:08.000 Be careful here. I've noticed that students sometimes seem to think that,

00:21:15.000 although we know how to expand something like x plus h, all squared, to get x squared plus 2xh plus h squared,

00:21:24.000 when it comes to cubing x plus h, they can immediately just put a 3 above the x and a 3 above the h to get x cubed plus h cubed and think they're done.

00:21:35.000 Or they think that there's some formula for expanding x plus h, all cubed.

00:21:42.000 There is, but we don't need to use it.

00:21:45.000 Instead, what we do is we just split x plus h, all cubed, into x plus h times x plus h, all squared.

00:21:52.000 Okay, so that times that gives me that.

00:21:55.000 That just helps us do the working.

00:21:57.000 That's the only thing that I've done to get to that next line.

00:22:01.000 And to get to this line, well, we know how to expand x plus h, all squared, and that becomes this.

00:22:16.000 I've expanded this bracket here to get minus 3x minus 3h.

00:22:25.000 And we can further expand the brackets and we end up with something that looks rather ugly.

00:22:46.000 X times x squared gives me x cubed.

00:22:49.000 X times 2xh gives me 2x squared h. X times h squared gives me xh squared.

00:22:57.000 H times x squared is x squared h.

00:23:02.000 H times 2xh is 2xh squared and finally h times h squared is h cubed.

00:23:11.000 Check to see if you can get to that line from this one.
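
The expansion of the cube described term by term above can be written out as follows. The last line, where like terms are collected, is added here for completeness and is not one of the spoken steps.

```latex
\[
\begin{aligned}
  (x + h)^3
    &= (x + h)(x + h)^2 \\
    &= (x + h)(x^2 + 2xh + h^2) \\
    &= x^3 + 2x^2h + xh^2 + x^2h + 2xh^2 + h^3 \\
    &= x^3 + 3x^2h + 3xh^2 + h^3
\end{aligned}
\]
```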

00:23:17.000 Finally, the 4 has multiplied through every single term there.

00:23:27.000 I've also removed the last bracket here, so be aware of that.

00:23:32.000 Minus times a minus gives me a plus here.

00:23:40.000 Collect like terms. It's an easy step, but you need to be careful to ensure that you do pick up all the like terms.

00:23:48.000 Rather than watching this straight through, I hope you're pausing at particular points.

00:23:54.000 Now might be a good time to pause. Can you simplify from here on?

00:23:58.000 From this point onwards, what do you do?

00:24:07.000 I've factorized. I've taken out a common factor of h.

00:24:11.000 You should be able to do that in many of these differentiation by first principles problems.

00:24:16.000 There should be a common factor of h that comes out.

00:24:20.000 And the reason for that is that we need the h on the top here to cancel with the h on the bottom.

00:24:28.000 If it doesn't, then we can't evaluate our limit, not in any trivial way anyway.

00:24:34.000 If h approaches zero and this h is still there, we're dividing by zero.

00:24:40.000 Once again, you're not allowed to do that. It's just not logically possible, actually.

00:24:48.000 I've cancelled my h on the denominator there.

00:24:57.000 And because it's a limit now, I'm going to regard as being zero

00:25:11.000 any term left that contains an h.

00:25:15.000 So 12x times h, which is zero, is zero.

00:25:19.000 And 4 times zero squared is also zero. So my 12xh, that's going to vanish.

00:25:24.000 And my 4h squared, that's going to vanish, leaving me with what?

00:25:32.000 12 x squared minus three. And so that's my answer.

00:25:35.000 I can just write, I can remove the limit notation now.

00:25:38.000 There's no reason for me to continue writing the limit as h approaches zero,

00:25:42.000 there is no longer any h there, the h has dropped out.

00:25:45.000 So my final answer is generally written as dy dx equals 12x squared minus three.
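
Putting the spoken steps together, the full working for this example can be sketched as below. It is reconstructed from the narration, so the exact line breaks on the slides may differ.

```latex
\[
\begin{aligned}
  \frac{dy}{dx}
    &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
    &= \lim_{h \to 0} \frac{4(x + h)^3 - 3(x + h) - (4x^3 - 3x)}{h} \\
    &= \lim_{h \to 0} \frac{4(x^3 + 3x^2h + 3xh^2 + h^3) - 3x - 3h - 4x^3 + 3x}{h} \\
    &= \lim_{h \to 0} \frac{4x^3 + 12x^2h + 12xh^2 + 4h^3 - 3x - 3h - 4x^3 + 3x}{h} \\
    &= \lim_{h \to 0} \frac{12x^2h + 12xh^2 + 4h^3 - 3h}{h} \\
    &= \lim_{h \to 0} \frac{h(12x^2 + 12xh + 4h^2 - 3)}{h} \\
    &= \lim_{h \to 0} (12x^2 + 12xh + 4h^2 - 3) \\
    &= 12x^2 - 3
\end{aligned}
\]
```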

00:25:54.000 And so we're done.
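
If you'd like to check the algebra yourself, the following sketch compares the raw quotient with the simplified expression using exact fractions, so there is no rounding error. The function names and the sample values of x and h are assumptions chosen for illustration only.

```python
from fractions import Fraction


def f(x):
    return 4 * x ** 3 - 3 * x


def difference_quotient(x, h):
    """(f(x + h) - f(x)) / h, before any simplification."""
    return (f(x + h) - f(x)) / h


def simplified_quotient(x, h):
    """The same quotient after cancelling the common factor of h."""
    return 12 * x ** 2 + 12 * x * h + 4 * h ** 2 - 3


def gradient_function(x):
    """What is left once every term containing h has vanished."""
    return 12 * x ** 2 - 3


def algebra_checks_out():
    """Compare both forms of the quotient exactly at a few sample values."""
    xs = [Fraction(-2), Fraction(0), Fraction(1, 2), Fraction(3)]
    hs = [Fraction(1), Fraction(1, 10), Fraction(1, 1000)]
    return all(difference_quotient(x, h) == simplified_quotient(x, h)
               for x in xs for h in hs)


if __name__ == "__main__":
    print("Algebra consistent:", algebra_checks_out())
    print("Gradient at x = 1:", gradient_function(1))
```

Exact fractions are used so that the two forms can be compared for equality rather than approximate closeness.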