.. _integration: Numerical Integration ======================== .. include:: ../replace.txt .. index:: integral, trapezoid rule, simpson's rule, quad Numerical integration is an important numerical method because analytic solutions are often not available. Surprisingly, the seemingly simple equation :math:`y = e^{-x^2}` can not be integrated analytically. :numref:`integral1` shows that the value of a definite integral is the area bounded by the function, the :math:`x`--axis, and the lower and upper bounds of the integral. .. _integral1: .. figure:: integral1.png :width: 80% :align: center A definite integral is an area calculation. .. 9.1 cm wide Numerical techniques estimate the area by summing the areas of a sequence of thin slices (subintervals). You may remember three such techniques from calculus class. The *Reimann Integral* uses rectangular regions to estimate the area of each small subinterval. The *Trapezoid Rule* improves the results by modeling the area of each subinterval as a trapezoid. *Simpson's Rule* provides a more accurate estimate by modeling the curve over a pair of subintervals as a quadratic polynomial. .. _matlabIntegral: MATLAB's Integral Function ---------------------------- |M| has a function for computing definite integrals called `integral `_, which is a replacement for an older function called ``quad``. The ``integral`` function uses a `global adaptive quadrature algorithm `_ where the integration region is divided into piecewise smooth regions before integrating each subinterval. The widths of the subintervals are adapted so that each subinterval is accurately modeled by an equation. The ``integral`` function is called with arguments of a function handle and the interval of the integral (``I = integral(f, a, b)``). Be sure to use element-wise arithmetic operators when defining the function so that it can accept and return vectors. The integration interval is normally specified with a pair of real scalars, but negative and positive infinity (``-Inf`` and ``Inf``) may be used. See |M|'s documentation for information about complex functions, and the use of tolerances and waypoints. Here are a few usage examples. :: >> integral(@(x) sin(x), 0, pi) ans = 2.0000 >> integral(@(x) exp(-2*x), 0, Inf) ans = 0.5000 >> f = @(x) x.^2 - 5.*x + 8; >> integral(f, 1, 4) ans = 7.5000 Fixed Width Subinterval Algorithms ------------------------------------ Although, one only needs |M|'s integral function to compute integrals numerically, it is desirable to learn the basics of the numeric integration algorithms. Knowledge of the algorithms not only helps one to better understand and appreciate |M|'s functions, but may be needed to write an integration function in a programming language, such as C, where |M|'s functions are not available. We will consider how to implement numeric integration algorithms using the trapezoid rule and Simpson's rule. We do not discuss results from Reimann integrals because integrals from trapezoids are only slightly more complex than Reimann integrals, but are more accurate. .. _trapezoidrule: Trapezoid Rule Integrals ^^^^^^^^^^^^^^^^^^^^^^^^^ The geometry of a trapezoid is illustrated in :numref:`fig-trapArea`. When an integration region is divided into equal width trapezoids separated by grid points :math:`\{x_1, x_2, \ldots, x_n \}`, then the width of each trapezoid is :math:`h = x_{i+1} - x_i`. The area the trapezoid is the product of the width with the average height of the two sides. .. math:: \text{Area} = h\,\frac{f(x_i) + f(x_{i+1})}{2} .. _fig-trapArea: .. figure:: trapArea.png :width: 30% :align: center The shape of a trapezoid and its size measurements. .. 5.0 cm wide The definite integral of a function over a region from :math:`a` to :math:`b` is approximated by the sum of trapezoids found by splitting the region into :math:`n - 1` equal width subintervals, where :math:`n` is the number of grid points that divide the interval into subintervals. .. \footnote{Many literature references count $n$ subintervals with $n + 1$ grid points at $\{x_0, x_1, \ldots, x_n\}$. We label the grid points from $x_1$ to $x_n$ to be consistent with the \MATLAB code where array indices start at 1, not 0. .. math:: h = \frac{b - a}{n - 1} .. math:: \int_{a}^{b} f(x)\,dx \approx \sum_{i=1}^{n - 1} h\,\frac{f(x_i) + f(x_{i+1})}{2} In the summation, each term except the first and last appear twice because they are part of the calculation of two trapezoids. .. math:: \int_{a}^{b} f(x)\,dx \approx \frac{h}{2} \left(f(x_1) + 2\sum_{i=2}^{n - 1} f(x_i) + f(x_n) \right) As shown in :numref:`fig-trapIntegral`, the area of each trapezoid is sometimes more, less, or nearly the same as the true area of each subinterval. Using more subintervals, which reduces the size of :math:`h`, will improve the accuracy. With narrower subintervals, the spanned region of :math:`f(x)` from :math:`x_i` to :math:`x_{i+1}` will more closely resemble a straight line. Of course, we need to be sensitive to the size of :math:`h` to avoid excessive round-off errors that can occur when :math:`h` is very small. .. _fig-trapIntegral: .. figure:: trapIntegral.png :align: center :width: 80% The trapezoid rule models each subinterval as a trapezoid. The definite integral is the sum of the trapezoid areas. .. 9.1 cm wide The following function implements the trapezoid rule to calculate an estimate of a definite integral. :: function I = trapIntegral(f, a, b, n) % TRAPINTEGRAL - Definite integral by trapezoid area summation. % % f - vector aware function % a - lower integration boundary % b - upper integration boundary, a < b % n - number of subinterval trapezoids if a >= b error('Variable b should be greater than a.') end h = (b - a)/n; x = linspace(a, b, n+1); y = f(x); I = (h/2)*(y(1) + 2*sum(y(2:end-1)) + y(end)); Testing of the trapezoid rule integration function shows that it finds an answer that is somewhat close with a modest number of grid points, but struggles to find the true definite integral value even as the number of grid points is increased. :: >> fun = @(x) x.^2; >> trapIntegral(fun, 0, 4, 10) % correct = 21.3333 ans = 21.4650 >> trapIntegral(fun, 0, 4, 50) ans = 21.3378 >> trapIntegral(fun, 0, 4, 100) ans = 21.3344 >> f = @(x) x.^2 - 3.*x + 2*sin(3*x).*exp(-0.01*x) + 10; >> trapIntegral(f, 1, 4, 10) % correct = 27.30753 ans = 27.4644 >> trapIntegral(f, 1, 4, 100) ans = 27.3088 >> trapIntegral(f, 1, 4, 500) ans = 27.3076 .. _simpsonrule: Simpson's Rule Integrals ^^^^^^^^^^^^^^^^^^^^^^^^^ Rather than using a straight line to approximate the function over an integration subinterval, Simpson's rule estimates the function as a parabola represented with a quadratic equation. Each integration region consists of two adjacent subintervals and three grid point. The three grid points are used to find the quadratic equation from which the area is computed. The development of the quadratic equation is covered in Calculus textbooks and can also be found on several web pages. It is typically found from a Taylor series expansion of a quadratic equation or using *Lagrange polynomial interpolation* [SIAUW15]_. Lagrange polynomial interpolation finds a polynomial that goes through a set of data points. The Lagrange polynomial, :math:`P(x)`, has the property that :math:`P(x_i) = y_i` for all given :math:`(x_i, y_i)` data points. We first find a Lagrange basis polynomial, :math:`P_i(x)`, for each of given :math:`x_i` value. The basis polynomials are products of factors such that :math:`P_i(x_i) = 1` and :math:`P_i(x_j) = 0` when :math:`i \neq j`. Then the Lagrange polynomial is a sum of products between the basis polynomials and the given :math:`y_i` values. The properties of the basis polynomials ensure that the Lagrange polynomial passes through each point. .. math:: P_i(x) = \prod_{j=1, j\neq i}^n \frac{x - x_j}{x_i - x_j} .. math:: P(x) = \sum_{i=1}^n y_i\,P_i(x) .. topic:: Lagrange Polynomial Example Let us use Lagrange polynomial interpolation to find the quadratic equation that passes through the points :math:`\{(0, 2), (1, 5), \text{and} (2, 6)\}`. .. math:: \begin{array}{rl} P_1(x) &= \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)} = \frac{(x - 1)(x - 2)}{(0 - 1)(0 - 2)} = \frac{1}{2}(x^2 - 3\,x + 2) \\ \hfill \\ P_2(x) &= \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)} = \frac{(x - 0)(x - 2)}{(1 - 0)(1 - 2)} = -x^2 + 2\,x \\ \hfill \\ P_3(x) &= \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)} = \frac{(x - 0)(x - 1)}{(2 - 0)(2 - 1)} = \frac{1}{2}(x^2 - x) \end{array} .. math:: \begin{array}{rl} P(x) &= 2\,P_1(x) + 5\,P_2(x) + 6\,P_3(x) \\ &= (1 - 5 + 3)\,x^2 + (- 3 + 10 - 3)\,x + 2 \\ &= -x^2 + 4\,x + 2 \end{array} .. codeplot :: >> P = @(x) -x.^2 + 4*x + 2; >> x = linspace(-1, 3); >> y = P(x); >> plot(x,y) >> hold on >> plot([0 1 2], [2 5 6], 'd', 'MarkerSize', 10) >> hold off .. _fig-lagrange: .. figure:: lagrange.png :align: center :width: 80% The Lagrange polynomial goes through the data points :math:`\{(0, 2), (1, 5), \text{and} (2, 6)\}`. :numref:`fig-simpsonRegion` illustrates one region of integration. The quadratic equation :math:`P(x)` passes through the three points :math:`\{(x_{i-1}, f(x_{i-1})), (x_{i}, f(x_{i})), \text{and} (x_{i+1}, f(x_{i+1}))\}`, which are the intersections between the three grid points and :math:`f(x)`. Using integration by substitution and some algebra, the definite integral of each integration region is given by .. math:: I_i = \int_{x_{i-1}}^{x_{i+1}} P(x)\,dx = \frac{h}{3}\left(f(x_{i-1}) + 4\,f(x_i) + f(x_{i+1})\right). .. _fig-simpsonRegion: .. figure:: simpsonRegion.png :align: center :width: 60% Simpson's Rule subinterval region of integration. .. 8.1 cm wide Then the definite integral from :math:`a` to :math:`b` is approximated by the sum of the subinterval integrals. Since each subinterval integral spans two subintervals, the count of function values at even and odd grid points are not the same. .. math:: \int_a^b f(x)\,dx \approx \frac{h}{3}\left(f(x_1) + 4\left(\sum_{i = 2,\, i\, \text{even}}^{n - 1} f(x_i)\right) + 2\left(\sum_{i = 3,\, i\, \text{odd}}^{n - 2} f(x_i)\right) + f(x_n)\right). :numref:`fig-simpsonIntegral` shows that Simpson's rule can find accurate definite integral results if the subintervals are small enough to accurately model the function with quadratic equations. .. _fig-simpsonIntegral: .. figure:: simpsonIntegral.png :align: center :width: 90% *Left:* Simpson rule integration with 5 grid points, 4 subintervals, and 2 integration regions. *Right:* Simpson rule integration with 7 grid points, 6 subintervals, and 3 integration regions. .. 10.6 cm wide .. note:: Because Simpson's rule computes integrals over two subintervals, the number of subintervals must be an even number. So the number of grid points must be an odd number. To satisfy the indexing requirements of the ``sum`` functions, the minimum number of grid points is 5. The ``simpsonIntegration`` function implements Simpson's rule integration. :: function I = simpsonIntegral(f, a, b, n) % SIMPSONINTEGRAL - Definite integral by Simpson's rule. % % f - vector aware function % a - lower integration boundary % b - upper integration boundary, a < b % n - number of subintervals, must be an even number and >= 4 if a >= b error('Variable b should be greater than a.') end if mod(n, 2) ~= 0 || n < 4 error('Variable n, should be an even number, n >= 4.') end h = (b - a)/n; x = linspace(a, b, n+1); y = f(x); I = (h/3)*(y(1) + 4*sum(y(2:2:end-1)) + 2*sum(y(3:2:end-2)) + y(end)); % MATLAB indexing starts at 1, so even and odd are reversed. Testing results show that Simpson's rule integration is more accurate than trapezoid integration even with fewer grid points. :: >> fun = @(x) x.^2 >> simpsonIntegral(fun, 0, 4, 8) ans = % correct = 21.3333 21.3333 >> f = @(x) x.^2-3.*x+2*sin(3*x).*exp(-0.01*x)+10 >> simpsonIntegral(f, 1, 4, 8) ans = % correct = 27.30753 27.2950 >> simpsonIntegral(f, 1, 4, 18) ans = 27.3071 >> I = simpsonIntegral(f, 1, 4, 28) I = 27.3075 >> error = abs(I - integral(f, 1, 4)) error = % compare to MATLAB's integral 7.1945e-05 .. _adaptIntegral: Recursive Adaptive Integral ---------------------------- The following recursive function investigates the concept of an adaptive algorithm, like |M|'s ``integral`` function, that adjusts the width of each subinterval based the accuracy of intermediate results. The basic concept of how the algorithm works is simple. If the function to be integrated fits the integration model (trapezoid or quadratic equation) over the integral region, then we have an accurate definite integral. However, if the integration model is a poor data fit, then the interval is divided into smaller subintervals. For each interval, the algorithm splits the interval from :math:`a` to :math:`b` in half and calculates the area of three intervals. The interval from :math:`a` to :math:`b` is divided equally into subintervals from :math:`a` to :math:`c` and from :math:`c` to :math:`b`. If the area from :math:`a` to :math:`b` is within a small tolerance of the sum of the areas of the subintervals, then the algorithm is satisfied with the area of the interval. If not, then each subinterval is treated as its own interval and the algorithm repeats. The algorithm continues to split each subinterval until the integration model accurately fits the function over the subinterval. The returned subinterval areas are added together to find the definite integral of the function from :math:`a` to :math:`b`. For convenience of comparisons, the adaptive function can use either the trapezoid rule or Simpson's rule algorithms. The second argument should be either ``'t'`` or ``'s'`` to select the integration algorithm. :: function I = adaptIntegral(f, intFun, a, b) % ADAPTINTEGRAL - Adaptive interval definite integral. % % f - vector aware function % intFun - 't' for trapIntegral or 's' for simpsonIntegral % a - lower integration boundary % b - upper integration boundary, a < b % % This code just demonstrates adaptive intervals. % Use MATLAB's integral function, not this function. if a >= b error('Variable b should be greater than a.') end tol = 1e-12; c = (a + b)/2; % Three integrals if intFun == 't' Iab = trapIntegral(f, a, b, 2); Iac = trapIntegral(f, a, c, 2); Icb = trapIntegral(f, c, b, 2); elseif intFun == 's' Iab = simpsonIntegral(f, a, b, 4); Iac = simpsonIntegral(f, a, c, 4); Icb = simpsonIntegral(f, c, b, 4); else error("Specify 's' or 't' for intFun") end if abs(Iab - (Iac + Icb)) < tol I = Iac + Icb; else I = adaptIntegral(f, intFun, a, c) ... + adaptIntegral(f, intFun, c, b); end Testing shows that the adaptive interval algorithm is an accurate estimator of definite integrals. Results from using Simpson's rule are more accurate than using the trapezoid rule algorithm. :: >> f = @(x) x.^2-3.*x+2*sin(3*x).*exp(-0.01*x)+10 >> I = integral(f, 1, 4); % compare to built-in >> I - adaptIntegral(f, 's', 1, 4) ans = 2.4833e-12 >> I - adaptIntegral(f, 't', 1, 4) ans = -2.9574e-09 >> s = @(x) sin(x); >> adaptIntegral(s, 's', 0, pi) ans = 2.0000 >> 2 - ans ans = -1.2683e-12 >> 2 - adaptIntegral(s, 't', 0, pi) ans = 1.7330e-09