We have seen how we can approximate the area under a non-negative valued function over an interval $[a,b]$ with a sum of the form $\sum_{i=1}^n f(x^*_i) \Delta x_i$, and how this approximation gets better and better as our $\Delta x_i$ values become very small.
Of course, when these widths $\Delta x_i$ of the sub-intervals of the related partition become small, we necessarily have many more terms to our sum (i.e., thinner rectangular strips require more rectangles to cover the original area).
As such, good approximations are generally sums of many, many terms -- each of which contributes a very small value.
This idea of approximating something of interest with a large (in the limit, infinite) sum of many, many small values (in the limit, each being infinitesimal) has broad application and forms the next "big idea" in the study of calculus.
We will drive much of our exploration into this idea with the question: "What is the area under this curve?" -- much like we drove the development of the derivative with the question: "What is the slope of the tangent line to a curve at a given point?".
However, there is more than one way to interpret this next big idea. This should not be surprising. Think how the derivative can also be construed as an instantaneous rate of change between two variables, and how thinking about it in that way greatly widened the contexts in which the derivative could appear. Here too, the ideas revealed by asking about the area under a curve can also be seen in other contexts.
Before we move forward, let us consider one of these.
Suppose the upward velocity of a piston $t$ seconds after it starts to move is given by $v(t) = \sin(t)$ m/s. If the piston is 3 meters high when it starts to move, how high will it be 4 seconds later?
Suppose we attempted to approximate the answer to the question above. We might recall that
$$\textrm{distance} = \textrm{rate} \times \textrm{time}$$but that is only true when the rate is a constant velocity, and clearly the velocity in the problem above is changing over time. (Note: sometimes, the piston even moves backwards!)
However, if the time interval was just shorter... the velocity wouldn't be changing by as much, right? Consider the difference in velocity at the following two times:
$$v(1) \doteq 0.84147 \quad v(1.0001) \doteq 0.84152$$So maybe, for just that brief interval of time, distance = rate $\times$ time could give a decent approximation of the small change in position during that tiny time interval...
$$\textrm{distance} \doteq 0.8415 \times (1.0001 - 1)$$Note, in the interest of getting an even better approximation, we tried to split the difference here with regard to the rate used - something between 0.84147 and 0.84152. There is more to say about this - but for now, recall that we are at least assured (by the Intermediate Value Theorem, in this case) that at some $x^*$ we have $v(x^*) = 0.8415$...
Consequently, if we denote the small change in position that happens over this tiny time interval by $\Delta s$, and if we denote the change in time (i.e., time elapsed) from the start to the end of that tiny time interval by $\Delta t$, then we can say
$$\Delta s \doteq v(x^*) \times \Delta t$$Now, to approximate the total net change in position from $t=0$ to $t=4$, we could find all of these "little-changes-in-position" and just add them up! (Keep in mind, some of these could be negative, indicating a distance traveled backwards.)
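The single-interval estimate $\Delta s \doteq v(x^*) \times \Delta t$ can be checked numerically. The following is a minimal sketch in Python; the function name `v` and the variable names are chosen here for illustration. Because $v(t)$ is increasing on this tiny interval, using the velocity at either endpoint brackets the small change in position.

```python
import math

def v(t):
    """Upward velocity of the piston in m/s, t seconds after it starts moving."""
    return math.sin(t)

t_start, t_end = 1.0, 1.0001
dt = t_end - t_start  # time elapsed over the tiny interval

# The velocity barely changes over so short an interval.
print(f"v({t_start}) = {v(t_start):.5f}   v({t_end}) = {v(t_end):.5f}")

# Estimate the small change in position using the velocity at each endpoint.
ds_low = v(t_start) * dt
ds_high = v(t_end) * dt
print(f"delta s is between {ds_low:.10f} m and {ds_high:.10f} m")
```

The two estimates agree to many decimal places, which is why any choice of $x^*$ in so short an interval gives a reasonable value for $\Delta s$.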
Consider the following possible way to cut up the time from $t=0$ to $t=4$ into nine tiny intervals, each no longer than 0.6 seconds -- and this is just one of many, many ways to do this...
$$\begin{array}{cccc} i & i^{th} \textrm{ time interval } & \textrm{ range of $v(t)$ values } & \Delta t\\ 1 & (0,0.4) & 0 \textrm{ to } 0.39 & 0.4\\ 2 & (0.4,1) & 0.39 \textrm{ to } 0.84 & 0.6\\ 3 & (1,1.3) & 0.84 \textrm{ to } 0.96 & 0.3\\ 4 & (1.3,1.9) & 0.96 \textrm{ to } 1 & 0.6\\ 5 & (1.9, 2.4) & 0.67 \textrm{ to } 0.95 & 0.5\\ 6 & (2.4,2.6) & 0.51 \textrm{ to } 0.67 & 0.2\\ 7 & (2.6,3.15) & -0.01 \textrm{ to } 0.51 & 0.55\\ 8 & (3.15, 3.55) & -0.4 \textrm{ to } -0.01 & 0.4\\ 9 & (3.55,4) & -0.75 \textrm{ to } -0.4 & 0.45 \end{array}$$Now, which $v(x^*)$ value we should use for each little tiny time interval could be debated. We might want to use the minimum value of the function in each range, thereby ensuring we do not over-estimate how high the piston will be. We might instead want to use the maximum value of the function in each range, so that we don't under-estimate the piston's height. Alternatively, we might decide to split the difference and use some $v(x^*)$ value between the two of these. While the choice at this point will affect our approximation, this becomes less and less of a concern as our time intervals become more and more narrow (i.e., there will be a smaller and smaller range of $v(x^*)$ values from which to choose). For the purposes of simply making a decision about such things for this example, let us suppose we use the value $v(x^*)$ where $x^*$ is chosen to be the right endpoint of each time interval.
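To get a feel for how much the choice of $v(x^*)$ matters on a partition this coarse, the three options just described can be compared numerically. This is only a sketch: the helper names `v_min` and `v_max` are invented for this example, and they rely on the fact that $\sin(t)$ has a single peak (at $t = \pi/2$) on $[0,4]$.

```python
import math

def v(t):
    """Upward velocity of the piston in m/s, t seconds after it starts moving."""
    return math.sin(t)

# Partition points read off the table of time intervals.
points = [0, 0.4, 1, 1.3, 1.9, 2.4, 2.6, 3.15, 3.55, 4]
intervals = list(zip(points, points[1:]))

# On [0, 4], sin(t) rises until t = pi/2 and falls afterwards.
PEAK = math.pi / 2

def v_min(left, right):
    # With a single peak and no interior valley on [0, 4], the smallest
    # velocity on a subinterval occurs at one of its endpoints.
    return min(v(left), v(right))

def v_max(left, right):
    # The largest velocity is 1 if the subinterval contains the peak,
    # and otherwise occurs at one of the endpoints.
    if left <= PEAK <= right:
        return 1.0
    return max(v(left), v(right))

lower = sum(v_min(a, b) * (b - a) for a, b in intervals)
upper = sum(v_max(a, b) * (b - a) for a, b in intervals)
right = sum(v(b) * (b - a) for a, b in intervals)

print(f"minimum velocities:  change in position = {lower:.3f} m")
print(f"right endpoints:     change in position = {right:.3f} m")
print(f"maximum velocities:  change in position = {upper:.3f} m")
```

With only nine intervals the three answers differ noticeably, which illustrates why narrower intervals are needed before the choice of $x^*$ stops mattering.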
So, based on the above intervals and the choice of $v(x^*)$ values just described, an approximation for the total change in position the piston experiences from $t=0$ to $t=4$ is given by
$$\begin{array}{lll} \textrm{Total change } & = & (0.39)(0.4) + (0.84)(0.6) + (0.96)(0.3) +\\ \textrm{in position} & & (0.94)(0.6) + (0.67)(0.5) + (0.51)(0.2) +\\ & & (-0.01)(0.55) + (-0.4)(0.4) + (-0.75)(0.45) \end{array}$$A couple of comments about the construction of the above sum are in order:
The $4^{th}$ term above, (0.94)(0.6), does indeed correspond to the product of the height of the function at the right endpoint of the time interval in question $v(1.9) \doteq 0.94$ and the width of that time interval, 0.6. This may, however, not be apparent in the table, as $v(t)$ attains a maximum value of 1 within this time interval;
Also note that $v(t)$ starts decreasing after $\pi \ /\ 2 \doteq 1.57$, so the height of the function at the right endpoint of each time interval to the right of 1.57 will represent its minimum value in that interval.
Importantly, notice that this sum takes the form
$$\sum_{i=1}^{n} v(x_i^*) \Delta t_i$$where $n=9$ in this case, and even the biggest $\Delta t_i$ is fairly small (i.e., no more than 0.6) -- and this sum generally represents a better and better approximation to the true change in position for our piston as we shrink the size of these individual time intervals (i.e., as the maximum $\Delta t_i$ gets small).
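The nine-term sum, and the effect of shrinking the time intervals, can be sketched in Python as follows. The function name `right_endpoint_sum` is an assumption made for this illustration. The code uses unrounded values of $\sin(t)$, so its nine-interval result differs slightly from a hand computation with two-decimal velocities.

```python
import math

def v(t):
    """Upward velocity of the piston in m/s, t seconds after it starts moving."""
    return math.sin(t)

def right_endpoint_sum(points):
    """Add up v(right endpoint) * (interval width) over consecutive points."""
    return sum(v(b) * (b - a) for a, b in zip(points, points[1:]))

# The nine intervals from the table, with x* taken as each right endpoint.
table_points = [0, 0.4, 1, 1.3, 1.9, 2.4, 2.6, 3.15, 3.55, 4]
nine_term = right_endpoint_sum(table_points)
print(f"n = 9 (table): change = {nine_term:.4f} m, height = {3 + nine_term:.4f} m")

# Shrink the intervals: n equal-width intervals covering t = 0 to t = 4.
for n in (10, 100, 1000, 10000):
    points = [4 * i / n for i in range(n + 1)]
    change = right_endpoint_sum(points)
    print(f"n = {n:5d}: change = {change:.4f} m, height = {3 + change:.4f} m")
```

As $n$ grows, the printed values settle down, which is the numerical counterpart of the sum becoming a better approximation as the maximum $\Delta t_i$ gets small.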
Recall the summation we used to approximate the area under some non-negative valued function on the interval $[a,b]$:
$$\sum_{i=1}^n f(x_i^*) \Delta x_i$$Notice how the summation to approximate the net change in height of the piston is structurally identical to the summation immediately above!
So finding the net change over some interval of time in the height of a piston whose velocity is governed by some function can be answered with the same process as finding the area under a curve over some interval. As earlier suggested, there is a more general process at work here.
With this in mind, for a given partition $a=x_0 \lt x_1 \lt x_2 \lt \cdots \lt x_n=b$ of $[a,b]$, let us define a Riemann sum (named after the nineteenth-century German mathematician Bernhard Riemann) to be a sum of the form
$$\sum_{i=1}^{n} \ f\,(x_i^*)\Delta x_i$$where for each $i$, $x_i^*$ is some chosen value in the $i^{th}$ subinterval, $[x_{i-1},x_i]$, and $\Delta x_i = x_i - x_{i-1}$.
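The definition translates almost directly into code. The sketch below is one possible rendering, not a standard library routine: the names `riemann_sum`, `norm`, and the three rules for choosing $x_i^*$ are all invented for this illustration, as are the sample function $f(x) = x^2$ and the sample partition.

```python
def riemann_sum(f, partition, choose):
    """Riemann sum of f for the given partition points.

    `choose(left, right)` returns the chosen value x* in [left, right].
    """
    total = 0.0
    for left, right in zip(partition, partition[1:]):
        x_star = choose(left, right)          # some value in the i-th subinterval
        total += f(x_star) * (right - left)   # f(x_i*) * delta x_i
    return total

def norm(partition):
    """Width of the widest subinterval of the partition."""
    return max(right - left for left, right in zip(partition, partition[1:]))

# Three common rules for picking x* in each subinterval.
def left_endpoint(left, right):
    return left

def midpoint(left, right):
    return (left + right) / 2

def right_endpoint(left, right):
    return right

def square(x):
    return x * x

# A non-regular partition of [0, 1], chosen arbitrarily for the example.
partition = [0, 0.1, 0.3, 0.6, 1.0]
print("norm of partition:", norm(partition))
for rule in (left_endpoint, midpoint, right_endpoint):
    print(rule.__name__, riemann_sum(square, partition, rule))
```

Passing the rule for $x_i^*$ as a function keeps the sum itself independent of that choice, mirroring the definition, which allows any value in each subinterval.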
In both of the cases we have considered, recall how the approximations these sums represented generally got better as the norm of the partition used became small.
Hoping to find the "best" approximation (e.g., the actual net change in height of the piston, or the actual area under the curve), let us then try to find the limit of the Riemann sum as the norm of the partition goes to zero (i.e., as $||\Delta|| \rightarrow 0$).
When this limit exists, we say that $f(x)$ is Riemann integrable on the interval in question and define the definite integral of $f\,(x)$ from $a$ to $b$, to be the value of this limit, denoted by
$$\int_a^b \ f\,(x) \ dx = \lim_{||\Delta|| \rightarrow 0} \ \sum_{i=1}^{n} \ f\,(x_i^*) \Delta x_i$$You are probably wondering why the same symbol $\int$ is used here in connection to Riemann sums, when it is also used to describe a set of antiderivatives for a function. What do these two ideas have in common?
First, one should know that the symbol $\int$ initially appeared in the work of German mathematician Gottfried Wilhelm Leibniz in 1675 in his private writings. It is based on an archaic variant of the letter "s" that can still be seen in signs and logos in Nordic and German-speaking countries.
Leibniz used this elongated Latin "S" to highlight the nature of the sum in question. While sigma $(\Sigma)$, the Greek version of the letter "S", is used to denote a finite sum of finite values, Leibniz wanted to stress that the integral should be thought of as an infinite sum of infinitesimals.
In a similar manner, we trade the upper case delta $(\Delta)$, the Greek version of an uppercase "D", for a lowercase "d" -- the former representing a finite difference, which is large in comparison to the "small" infinitesimal difference the latter represents.
With regard to the similarity to the notation for antidifferentiation (i.e., finding the indefinite integral), there is an intimate connection between these two ideas -- which on the surface appear very different. This connection is described by the Fundamental Theorem of Calculus and will be discussed soon.
As one final comment about the notation: the definite integral tells us something about the behavior of the function over an interval $[a,b]$, but in the limit of the Riemann sum that interval appears only implicitly, through the partition used and the norm of that partition. We make the connection more prominent by putting the $a$ and $b$ as a subscript and superscript, respectively, next to the $\int$ symbol.
As a matter of vocabulary, the $x$-value written as a subscript (here, $a$) is called the lower limit of integration, while the one written as a superscript (here, $b$) is called the upper limit of integration.
Often, to evaluate a definite integral directly from its limit of a Riemann sum definition, we choose a convenient partition, one in which all of the $\Delta x_i$'s are the same size (which we denote by $\Delta x$). As has been said before, we call such a partition a regular partition. In this case, we can be assured that the norm of the partition $||\Delta||$ goes to zero if we require that the number of subintervals goes to infinity. In such circumstances, we can rewrite the definite integral in an algebraically simpler form:
$$\int_a^b \ f\,(x) \ dx = \lim_{n \rightarrow \infty} \ \sum_{i=1}^{n} \ f\,(x_i^*) \Delta x$$Note that in a regular partition, the sequence of $x_i$'s forms an arithmetic progression.
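As a brief illustration of this simpler form, consider $f(x) = x^2$ on $[0,1]$ (a function and interval chosen here only as an example, and assumed to be Riemann integrable, as continuous functions are). With a regular partition and each $x_i^*$ taken to be the right endpoint, we have $\Delta x = 1/n$ and $x_i^* = i/n$, so

$$\int_0^1 x^2 \, dx = \lim_{n \rightarrow \infty} \sum_{i=1}^{n} \left(\frac{i}{n}\right)^2 \frac{1}{n} = \lim_{n \rightarrow \infty} \frac{1}{n^3} \sum_{i=1}^{n} i^2 = \lim_{n \rightarrow \infty} \frac{n(n+1)(2n+1)}{6n^3} = \frac{1}{3}$$

where the third equality uses the formula $\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$.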
In the case where the subintervals are not all the same length, we say we have a non-regular partition. There are many such partitions, but a frequently algebraically useful one is where the $x_i$'s form a geometric progression.
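A partition whose points form a geometric progression can be sketched as follows. The function names and the sample integrand $f(x) = x^2$ on $[1,2]$ are assumptions made for this illustration. Note that such a partition requires $0 \lt a \lt b$, so that the common ratio $r = (b/a)^{1/n}$ makes sense.

```python
def geometric_partition(a, b, n):
    """Return points x_0, ..., x_n with x_i = a * r**i, where r = (b / a)**(1 / n).

    Requires 0 < a < b, so that the common ratio r is defined and exceeds 1.
    """
    if not (0 < a < b):
        raise ValueError("geometric_partition requires 0 < a < b")
    r = (b / a) ** (1.0 / n)
    return [a * r**i for i in range(n + 1)]

def left_riemann_sum(f, points):
    """Riemann sum using the left endpoint of each subinterval as x*."""
    return sum(f(left) * (right - left) for left, right in zip(points, points[1:]))

def norm(points):
    """Width of the widest subinterval (here, always the last one)."""
    return max(right - left for left, right in zip(points, points[1:]))

def square(x):
    return x * x

for n in (5, 50, 500, 5000):
    pts = geometric_partition(1.0, 2.0, n)
    print(f"n = {n:4d}: norm = {norm(pts):.5f}, sum = {left_riemann_sum(square, pts):.5f}")
```

The subintervals are unequal, yet the norm still shrinks as $n$ grows. For this particular example the sum is a finite geometric series that simplifies to $7/(r^2 + r + 1)$, which tends to $7/3$ as $r \rightarrow 1$; this kind of collapse is what makes geometric partitions algebraically convenient.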