Go to the first, previous, next, last section, table of contents.
This chapter describes routines for finding roots of arbitrary one-dimensional functions. The library provides low level components for a variety of iterative solvers and convergence tests. These can be combined by the user to achieve the desired solution, with full access to the intermediate steps of the iteration. Each class of methods uses the same framework, so that you can switch between solvers at runtime without needing to recompile your program. Each instance of a solver keeps track of its own state, allowing the solvers to be used in multi-threaded programs.
The header file `gsl_roots.h' contains prototypes for the root finding functions and related declarations.
One-dimensional root finding algorithms can be divided into two classes, root bracketing and root polishing. Algorithms which proceed by bracketing a root are guaranteed to converge. Bracketing algorithms begin with a bounded region known to contain a root. The size of this bounded region is reduced, iteratively, until it encloses the root to a desired tolerance. This provides a rigorous error estimate for the location of the root.
The technique of root polishing attempts to improve an initial guess to the root. These algorithms converge only if started "close enough" to a root, and sacrifice a rigorous error bound for speed. By approximating the behavior of a function in the vicinity of a root they attempt to find a higher order improvement of an initial guess. When the behavior of the function is compatible with the algorithm and a good initial guess is available a polishing algorithm can provide rapid convergence.
In GSL both types of algorithm are available in similar frameworks. The user provides a high-level driver for the algorithms, and the library provides the individual functions necessary for each of the steps. There are three main phases of the iteration. The steps are,
The state for bracketing solvers is held in a gsl_root_fsolver
struct. The updating procedure uses only function evaluations (not
derivatives). The state for root polishing solvers is held in a
gsl_root_fdfsolver
struct. The updates require both the function
and its derivative (hence the name fdf
) to be supplied by the
user.
Note that root finding functions can only search for one root at a time. When there are several roots in the search area, the first root to be found will be returned; however it is difficult to predict which of the roots this will be. In most cases, no error will be reported if you try to find a root in an area where there is more than one.
Care must be taken when a function may have a multiple root (such as f(x) = (x-x_0)^2 or f(x) = (x-x_0)^3). It is not possible to use root-bracketing algorithms on even-multiplicity roots. For these algorithms the initial interval must contain a zero-crossing, where the function is negative at one end of the interval and positive at the other end. Roots with even-multiplicity do not cross zero, but only touch it instantaneously. Algorithms based on root bracketing will still work for odd-multiplicity roots (e.g. cubic, quintic, ...). Root polishing algorithms generally work with higher multiplicity roots, but at a reduced rate of convergence. In these cases the Steffenson algorithm can be used to accelerate the convergence of multiple roots.
While it is not absolutely required that f have a root within the search region, numerical root finding functions should not be used haphazardly to check for the existence of roots. There are better ways to do this. Because it is easy to create situations where numerical root finders can fail, it is a bad idea to throw a root finder at a function you do not know much about. In general it is best to examine the function visually by plotting before searching for a root.
const gsl_root_fsolver_type * T = gsl_root_fsolver_bisection; gsl_root_fsolver * s = gsl_root_fsolver_alloc (T);
If there is insufficient memory to create the solver then the function
returns a null pointer and the error handler is invoked with an error
code of GSL_ENOMEM
.
const gsl_root_fdfsolver_type * T = gsl_root_fdfsolver_newton; gsl_root_fdfsolver * s = gsl_root_fdfsolver_alloc (T);
If there is insufficient memory to create the solver then the function
returns a null pointer and the error handler is invoked with an error
code of GSL_ENOMEM
.
printf ("s is a '%s' solver\n", gsl_root_fsolver_name (s));
would print something like s is a 'bisection' solver
.
You must provide a continuous function of one variable for the root finders to operate on, and, sometimes, its first derivative. In order to allow for general parameters the functions are defined by the following data types:
double (* function) (double x, void * params)
void * params
Here is an example for the general quadratic function,
f(x) = a x^2 + b x + c
with a = 3, b = 2, c = 1. The following code
defines a gsl_function
F
which you could pass to a root
finder as a function pointer:
struct my_f_params { double a; double b; double c; }; double my_f (double x, void * p) { struct my_f_params * params = (struct my_f_params *)p; double a = (params->a); double b = (params->b); double c = (params->c); return (a * x + b) * x + c; } gsl_function F; struct my_f_params params = { 3.0, 2.0, 1.0 }; F.function = &my_f; F.params = ¶ms;
The function f(x) can be evaluated using the macro
GSL_FN_EVAL(&F,x)
defined in `gsl_math.h'.
double (* f) (double x, void * params)
double (* df) (double x, void * params)
void (* fdf) (double x, void * params, double * f, double * df)
void * params
Here is an example where f(x) = 2\exp(2x):
double my_f (double x, void * params) { return exp (2 * x); } double my_df (double x, void * params) { return 2 * exp (2 * x); } void my_fdf (double x, void * params, double * f, double * df) { double t = exp (2 * x); *f = t; *df = 2 * t; /* uses existing value */ } gsl_function_fdf FDF; FDF.f = &my_f; FDF.df = &my_df; FDF.fdf = &my_fdf; FDF.params = 0;
The function f(x) can be evaluated using the macro
GSL_FN_FDF_EVAL_F(&FDF,x)
and the derivative f'(x) can
be evaluated using the macro GSL_FN_FDF_EVAL_DF(&FDF,x)
. Both
the function y = f(x) and its derivative dy = f'(x) can
be evaluated at the same time using the macro
GSL_FN_FDF_EVAL_F_DF(&FDF,x,y,dy)
. The macro stores
f(x) in its y argument and f'(x) in its dy
argument--both of these should be pointers to double
.
You provide either search bounds or an initial guess; this section explains how search bounds and guesses work and how function arguments control them.
A guess is simply an x value which is iterated until it is within
the desired precision of a root. It takes the form of a double
.
Search bounds are the endpoints of an interval which is iterated until the length of the interval is smaller than the requested precision. The interval is defined by two values, the lower limit and the upper limit. Whether the endpoints are intended to be included in the interval or not depends on the context in which the interval is used.
The following functions drive the iteration of each algorithm. Each function performs one iteration to update the state of any solver of the corresponding type. The same functions work for all solvers so that different methods can be substituted at runtime without modifications to the code.
GSL_EBADFUNC
Inf
or NaN
.
GSL_EZERODIV
The solver maintains a current best estimate of the root at all times. The bracketing solvers also keep track of the current best interval bounding the root. This information can be accessed with the following auxiliary functions,
A root finding procedure should stop when one of the following conditions is true:
The handling of these conditions is under user control. The functions below allow the user to test the precision of the current result in several standard ways.
GSL_SUCCESS
if the following
condition is achieved,
|a - b| < epsabs + epsrel min(|a|,|b|)
when the interval x = [a,b] does not include the origin. If the interval includes the origin then \min(|a|,|b|) is replaced by zero (which is the minimum value of |x| over the interval). This ensures that the relative error is accurately estimated for roots close to the origin.
This condition on the interval also implies that any estimate of the root r in the interval satisfies the same condition with respect to the true root r^*,
|r - r^*| < epsabs + epsrel r^*
assuming that the true root r^* is contained within the interval.
This function tests for the convergence of the sequence ..., x0,
x1 with absolute error epsabs and relative error
epsrel. The test returns GSL_SUCCESS
if the following
condition is achieved,
|x_1 - x_0| < epsabs + epsrel |x_1|
and returns GSL_CONTINUE
otherwise.
GSL_SUCCESS
if the
following condition is achieved,
|f| < epsabs
and returns GSL_CONTINUE
otherwise. This criterion is suitable
for situations where the precise location of the root, x, is
unimportant provided a value can be found where the residual,
|f(x)|, is small enough.
The root bracketing algorithms described in this section require an initial interval which is guaranteed to contain a root--if a and b are the endpoints of the interval then f(a) must differ in sign from f(b). This ensures that the function crosses zero at least once in the interval. If a valid initial interval is used then these algorithm cannot fail, provided the function is well-behaved.
Note that a bracketing algorithm cannot find roots of even degree, since these do not cross the x-axis.
The bisection algorithm is the simplest method of bracketing the roots of a function. It is the slowest algorithm provided by the library, with linear convergence.
On each iteration, the interval is bisected and the value of the function at the midpoint is calculated. The sign of this value is used to determine which half of the interval does not contain a root. That half is discarded to give a new, smaller interval containing the root. This procedure can be continued indefinitely until the interval is sufficiently small.
At any time the current estimate of the root is taken as the midpoint of the interval.
The false position algorithm is a method of finding roots based on linear interpolation. Its convergence is linear, but it is usually faster than bisection.
On each iteration a line is drawn between the endpoints (a,f(a)) and (b,f(b)) and the point where this line crosses the x-axis taken as a "midpoint". The value of the function at this point is calculated and its sign is used to determine which side of the interval does not contain a root. That side is discarded to give a new, smaller interval containing the root. This procedure can be continued indefinitely until the interval is sufficiently small.
The best estimate of the root is taken from the linear interpolation of the interval on the current iteration.
The Brent-Dekker method (referred to here as Brent's method) combines an interpolation strategy with the bisection algorithm. This produces a fast algorithm which is still robust.
On each iteration Brent's method approximates the function using an interpolating curve. On the first iteration this is a linear interpolation of the two endpoints. For subsequent iterations the algorithm uses an inverse quadratic fit to the last three points, for higher accuracy. The intercept of the interpolating curve with the x-axis is taken as a guess for the root. If it lies within the bounds of the current interval then the interpolating point is accepted, and used to generate a smaller interval. If the interpolating point is not accepted then the algorithm falls back to an ordinary bisection step.
The best estimate of the root is taken from the most recent interpolation or bisection.
The root polishing algorithms described in this section require an initial guess for the location of the root. There is no absolute guarantee of convergence--the function must be suitable for this technique and the initial guess must be sufficiently close to the root for it to work. When these conditions are satisfied then convergence is quadratic.
These algorithms make use of both the function and its derivative.
Newton's Method is the standard root-polishing algorithm. The algorithm begins with an initial guess for the location of the root. On each iteration, a line tangent to the function f is drawn at that position. The point where this line crosses the x-axis becomes the new guess. The iteration is defined by the following sequence,
x_{i+1} = x_i - f(x_i)/f'(x_i)
Newton's method converges quadratically for single roots, and linearly for multiple roots.
The secant method is a simplified version of Newton's method which does not require the computation of the derivative on every step.
On its first iteration the algorithm begins with Newton's method, using the derivative to compute a first step,
x_1 = x_0 - f(x_0)/f'(x_0)
Subsequent iterations avoid the evaluation of the derivative by replacing it with a numerical estimate, the slope of the line through the previous two points,
x_{i+1} = x_i f(x_i) / f'_{est} where f'_{est} = (f(x_i) - f(x_{i-1})/(x_i - x_{i-1})
When the derivative does not change significantly in the vicinity of the root the secant method gives a useful saving. Asymptotically the secant method is faster than Newton's method whenever the cost of evaluating the derivative is more than 0.44 times the cost of evaluating the function itself. As with all methods of computing a numerical derivative the estimate can suffer from cancellation errors if the separation of the points becomes too small.
On single roots, the method has a convergence of order (1 + \sqrt 5)/2 (approximately 1.62). It converges linearly for multiple roots.
The Steffenson Method(14) provides the fastest convergence of all the routines. It combines the basic Newton algorithm with an Aitken "delta-squared" acceleration. If the Newton iterates are x_i then the acceleration procedure generates a new sequence R_i,
R_i = x_i - (x_{i+1} - x_i)^2 / (x_{i+2} - 2 x_{i+1} + x_{i})
which converges faster than the original sequence under reasonable conditions. The new sequence requires three terms before it can produce its first value so the method returns accelerated values on the second and subsequent iterations. On the first iteration it returns the ordinary Newton estimate. The Newton iterate is also returned if the denominator of the acceleration term ever becomes zero.
As with all acceleration procedures this method can become unstable if the function is not well-behaved.
For any root finding algorithm we need to prepare the function to be solved. For this example we will use the general quadratic equation described earlier. We first need a header file (`demo_fn.h') to define the function parameters,
struct quadratic_params { double a, b, c; }; double quadratic (double x, void *params); double quadratic_deriv (double x, void *params); void quadratic_fdf (double x, void *params, double *y, double *dy);
We place the function definitions in a separate file (`demo_fn.c'),
double quadratic (double x, void *params) { struct quadratic_params *p = (struct quadratic_params *) params; double a = p->a; double b = p->b; double c = p->c; return (a * x + b) * x + c; } double quadratic_deriv (double x, void *params) { struct quadratic_params *p = (struct quadratic_params *) params; double a = p->a; double b = p->b; double c = p->c; return 2.0 * a * x + b; } void quadratic_fdf (double x, void *params, double *y, double *dy) { struct quadratic_params *p = (struct quadratic_params *) params; double a = p->a; double b = p->b; double c = p->c; *y = (a * x + b) * x + c; *dy = 2.0 * a * x + b; }
The first program uses the function solver gsl_root_fsolver_brent
for Brent's method and the general quadratic defined above to solve the
following equation,
x^2 - 5 = 0
with solution x = \sqrt 5 = 2.236068...
#include <stdio.h> #include <gsl/gsl_errno.h> #include <gsl/gsl_math.h> #include <gsl/gsl_roots.h> #include "demo_fn.h" #include "demo_fn.c" int main (void) { int status; int iter = 0, max_iter = 100; const gsl_root_fsolver_type *T; gsl_root_fsolver *s; double r = 0, r_expected = sqrt (5.0); double x_lo = 0.0, x_hi = 5.0; gsl_function F; struct quadratic_params params = {1.0, 0.0, -5.0}; F.function = &quadratic; F.params = ¶ms; T = gsl_root_fsolver_brent; s = gsl_root_fsolver_alloc (T); gsl_root_fsolver_set (s, &F, x_lo, x_hi); printf ("using %s method\n", gsl_root_fsolver_name (s)); printf ("%5s [%9s, %9s] %9s %10s %9s\n", "iter", "lower", "upper", "root", "err", "err(est)"); do { iter++; status = gsl_root_fsolver_iterate (s); r = gsl_root_fsolver_root (s); x_lo = gsl_root_fsolver_x_lower (s); x_hi = gsl_root_fsolver_x_upper (s); status = gsl_root_test_interval (x_lo, x_hi, 0, 0.001); if (status == GSL_SUCCESS) printf ("Converged:\n"); printf ("%5d [%.7f, %.7f] %.7f %+.7f %.7f\n", iter, x_lo, x_hi, r, r - r_expected, x_hi - x_lo); } while (status == GSL_CONTINUE && iter < max_iter); gsl_root_fsolver_free (s); return status; }
Here are the results of the iterations,
$ ./a.out using brent method iter [ lower, upper] root err err(est) 1 [1.0000000, 5.0000000] 1.0000000 -1.2360680 4.0000000 2 [1.0000000, 3.0000000] 3.0000000 +0.7639320 2.0000000 3 [2.0000000, 3.0000000] 2.0000000 -0.2360680 1.0000000 4 [2.2000000, 3.0000000] 2.2000000 -0.0360680 0.8000000 5 [2.2000000, 2.2366300] 2.2366300 +0.0005621 0.0366300 Converged: 6 [2.2360634, 2.2366300] 2.2360634 -0.0000046 0.0005666
If the program is modified to use the bisection solver instead of
Brent's method, by changing gsl_root_fsolver_brent
to
gsl_root_fsolver_bisection
the slower convergence of the
Bisection method can be observed,
$ ./a.out using bisection method iter [ lower, upper] root err err(est) 1 [0.0000000, 2.5000000] 1.2500000 -0.9860680 2.5000000 2 [1.2500000, 2.5000000] 1.8750000 -0.3610680 1.2500000 3 [1.8750000, 2.5000000] 2.1875000 -0.0485680 0.6250000 4 [2.1875000, 2.5000000] 2.3437500 +0.1076820 0.3125000 5 [2.1875000, 2.3437500] 2.2656250 +0.0295570 0.1562500 6 [2.1875000, 2.2656250] 2.2265625 -0.0095055 0.0781250 7 [2.2265625, 2.2656250] 2.2460938 +0.0100258 0.0390625 8 [2.2265625, 2.2460938] 2.2363281 +0.0002601 0.0195312 9 [2.2265625, 2.2363281] 2.2314453 -0.0046227 0.0097656 10 [2.2314453, 2.2363281] 2.2338867 -0.0021813 0.0048828 11 [2.2338867, 2.2363281] 2.2351074 -0.0009606 0.0024414 Converged: 12 [2.2351074, 2.2363281] 2.2357178 -0.0003502 0.0012207
The next program solves the same function using a derivative solver instead.
#include <stdio.h> #include <gsl/gsl_errno.h> #include <gsl/gsl_math.h> #include <gsl/gsl_roots.h> #include "demo_fn.h" #include "demo_fn.c" int main (void) { int status; int iter = 0, max_iter = 100; const gsl_root_fdfsolver_type *T; gsl_root_fdfsolver *s; double x0, x = 5.0, r_expected = sqrt (5.0); gsl_function_fdf FDF; struct quadratic_params params = {1.0, 0.0, -5.0}; FDF.f = &quadratic; FDF.df = &quadratic_deriv; FDF.fdf = &quadratic_fdf; FDF.params = ¶ms; T = gsl_root_fdfsolver_newton; s = gsl_root_fdfsolver_alloc (T); gsl_root_fdfsolver_set (s, &FDF, x); printf ("using %s method\n", gsl_root_fdfsolver_name (s)); printf ("%-5s %10s %10s %10s\n", "iter", "root", "err", "err(est)"); do { iter++; status = gsl_root_fdfsolver_iterate (s); x0 = x; x = gsl_root_fdfsolver_root (s); status = gsl_root_test_delta (x, x0, 0, 1e-3); if (status == GSL_SUCCESS) printf ("Converged:\n"); printf ("%5d %10.7f %+10.7f %10.7f\n", iter, x, x - r_expected, x - x0); } while (status == GSL_CONTINUE && iter < max_iter); gsl_root_fdfsolver_free (s); return status; }
Here are the results for Newton's method,
$ ./a.out using newton method iter root err err(est) 1 3.0000000 +0.7639320 -2.0000000 2 2.3333333 +0.0972654 -0.6666667 3 2.2380952 +0.0020273 -0.0952381 Converged: 4 2.2360689 +0.0000009 -0.0020263
Note that the error can be estimated more accurately by taking the
difference between the current iterate and next iterate rather than the
previous iterate. The other derivative solvers can be investigated by
changing gsl_root_fdfsolver_newton
to
gsl_root_fdfsolver_secant
or
gsl_root_fdfsolver_steffenson
.
For information on the Brent-Dekker algorithm see the following two papers,
Go to the first, previous, next, last section, table of contents.