Go to the first, previous, next, last section, table of contents.


Basis Splines

This chapter describes functions for the computation of smoothing basis splines (B-splines). A smoothing spline differs from an interpolating spline in that the resulting curve is not required to pass through each datapoint. See section Interpolation, for information about interpolating splines.

The header file `gsl_bspline.h' contains the prototypes for the bspline functions and related declarations.

Overview

B-splines are commonly used as basis functions to fit smoothing curves to large data sets. To do this, the abscissa axis is broken up into some number of intervals, where the endpoints of each interval are called breakpoints. These breakpoints are then converted to knots by imposing various continuity and smoothness conditions at each interface. Given a nondecreasing knot vector t = {t_0, t_1, ..., t_{n+k-1}}, the n basis splines of order k are defined by

B_(i,1)(x) = (1, t_i <= x < t_(i+1)
             (0, else
B_(i,k)(x) = [(x - t_i)/(t_(i+k-1) - t_i)] B_(i,k-1)(x)
              + [(t_(i+k) - x)/(t_(i+k) - t_(i+1))] B_(i+1,k-1)(x)

for i = 0, ..., n-1. The common case of cubic B-splines is given by k = 4. The above recurrence relation can be evaluated in a numerically stable way by the de Boor algorithm.

If we define appropriate knots on an interval [a,b] then the B-spline basis functions form a complete set on that interval. Therefore we can expand a smoothing function as

f(x) = \sum_i c_i B_(i,k)(x)

given enough (x_j, f(x_j)) data pairs. The coefficients c_i can be readily obtained from a least-squares fit.

Initializing the B-splines solver

The computation of B-spline functions requires a preallocated workspace of type gsl_bspline_workspace. If B-spline derivatives are also required, an additional gsl_bspline_deriv_workspace is needed.

Function: gsl_bspline_workspace * gsl_bspline_alloc (const size_t k, const size_t nbreak)
This function allocates a workspace for computing B-splines of order k. The number of breakpoints is given by nbreak. This leads to n = nbreak + k - 2 basis functions. Cubic B-splines are specified by k = 4. The size of the workspace is O(5k + nbreak).

Function: void gsl_bspline_free (gsl_bspline_workspace * w)
This function frees the memory associated with the workspace w.

Function: gsl_bspline_deriv_workspace * gsl_bspline_deriv_alloc (const size_t k)
This function allocates a workspace for computing the derivatives of a B-spline basis function of order k. The size of the workspace is O(2k^2).

Function: void gsl_bspline_deriv_free (gsl_bspline_deriv_workspace * w)
This function frees the memory associated with the derivative workspace w.

Constructing the knots vector

Function: int gsl_bspline_knots (const gsl_vector * breakpts, gsl_bspline_workspace * w)
This function computes the knots associated with the given breakpoints and stores them internally in w->knots.

Function: int gsl_bspline_knots_uniform (const double a, const double b, gsl_bspline_workspace * w)
This function assumes uniformly spaced breakpoints on [a,b] and constructs the corresponding knot vector using the previously specified nbreak parameter. The knots are stored in w->knots.

Evaluation of B-splines

Function: int gsl_bspline_eval (const double x, gsl_vector * B, gsl_bspline_workspace * w)
This function evaluates all B-spline basis functions at the position x and stores them in the vector B, so that the i-th element is B_i(x). The vector B must be of length n = nbreak + k - 2. This value may also be obtained by calling gsl_bspline_ncoeffs. Computing all the basis functions at once is more efficient than computing them individually, due to the nature of the defining recurrence relation.

Function: int gsl_bspline_eval_nonzero (const double x, gsl_vector * Bk, size_t * istart, size_t * iend, gsl_bspline_workspace * w)
This function evaluates all potentially nonzero B-spline basis functions at the position x and stores them in the vector Bk, so that the i-th element is B_(istart+i)(x). The last element of Bk is B_(iend)(x). The vector Bk must be of length k. By returning only the nonzero basis functions, this function allows quantities involving linear combinations of the B_i(x) to be computed without unnecessary terms (such linear combinations occur, for example, when evaluating an interpolated function).

Function: size_t gsl_bspline_ncoeffs (gsl_bspline_workspace * w)
This function returns the number of B-spline coefficients given by n = nbreak + k - 2.

Evaluation of B-spline derivatives

Function: int gsl_bspline_deriv_eval (const double x, const size_t nderiv, gsl_matrix * dB, gsl_bspline_workspace * w, gsl_bspline_deriv_workspace * dw)
This function evaluates all B-spline basis function derivatives of orders 0 through nderiv (inclusive) at the position x and stores them in the matrix dB. The (i,j)-th element of dB is d^jB_i(x)/dx^j. The matrix dB must be of size n = nbreak + k - 2 by nderiv + 1. The value n may also be obtained by calling gsl_bspline_ncoeffs. Note that function evaluations are included as the zeroth order derivatives in dB. Computing all the basis function derivatives at once is more efficient than computing them individually, due to the nature of the defining recurrence relation.

Function: int gsl_bspline_deriv_eval_nonzero (const double x, const size_t nderiv, gsl_matrix * dB, size_t * istart, size_t * iend, gsl_bspline_workspace * w, gsl_bspline_deriv_workspace * dw)
This function evaluates all potentially nonzero B-spline basis function derivatives of orders 0 through nderiv (inclusive) at the position x and stores them in the matrix dB. The (i,j)-th element of dB is d^j/dx^j B_(istart+i)(x). The last row of dB contains d^j/dx^j B_(iend)(x). The matrix dB must be of size k by at least nderiv + 1. Note that function evaluations are included as the zeroth order derivatives in dB. By returning only the nonzero basis functions, this function allows quantities involving linear combinations of the B_i(x) and their derivatives to be computed without unnecessary terms.

Working with the Greville abscissae

The Greville abscissae are defined to be the mean location of k-1 consecutive knots in the knot vector for each basis spline function of order k. With the first and last knots in the gsl_bspline_workspace knot vector excluded, there are gsl_bspline_ncoeffs Greville abscissae for any given B-spline basis. These values are often used in B-spline collocation applications and may also be called Marsden-Schoenberg points.

Function: double gsl_bspline_greville_abscissa (size_t i, gsl_bspline_workspace *w);
Returns the location of the i-th Greville abscissa for the given B-spline basis. For the ill-defined case when k=1, the implementation chooses to return breakpoint interval midpoints.

Examples

The following program computes a linear least squares fit to data using cubic B-spline basis functions with uniform breakpoints. The data is generated from the curve y(x) = \cos{(x) \exp{(-x/10)}} on the interval [0, 15] with Gaussian noise added.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <gsl/gsl_bspline.h>
#include <gsl/gsl_multifit.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_statistics.h>

/* number of data points to fit */
#define N        200

/* number of fit coefficients */
#define NCOEFFS  12

/* nbreak = ncoeffs + 2 - k = ncoeffs - 2 since k = 4 */
#define NBREAK   (NCOEFFS - 2)

int
main (void)
{
  const size_t n = N;
  const size_t ncoeffs = NCOEFFS;
  const size_t nbreak = NBREAK;
  size_t i, j;
  gsl_bspline_workspace *bw;
  gsl_vector *B;
  double dy;
  gsl_rng *r;
  gsl_vector *c, *w;
  gsl_vector *x, *y;
  gsl_matrix *X, *cov;
  gsl_multifit_linear_workspace *mw;
  double chisq, Rsq, dof, tss;

  gsl_rng_env_setup();
  r = gsl_rng_alloc(gsl_rng_default);

  /* allocate a cubic bspline workspace (k = 4) */
  bw = gsl_bspline_alloc(4, nbreak);
  B = gsl_vector_alloc(ncoeffs);

  x = gsl_vector_alloc(n);
  y = gsl_vector_alloc(n);
  X = gsl_matrix_alloc(n, ncoeffs);
  c = gsl_vector_alloc(ncoeffs);
  w = gsl_vector_alloc(n);
  cov = gsl_matrix_alloc(ncoeffs, ncoeffs);
  mw = gsl_multifit_linear_alloc(n, ncoeffs);

  printf("#m=0,S=0\n");
  /* this is the data to be fitted */
  for (i = 0; i < n; ++i)
    {
      double sigma;
      double xi = (15.0 / (N - 1)) * i;
      double yi = cos(xi) * exp(-0.1 * xi);

      sigma = 0.1 * yi;
      dy = gsl_ran_gaussian(r, sigma);
      yi += dy;

      gsl_vector_set(x, i, xi);
      gsl_vector_set(y, i, yi);
      gsl_vector_set(w, i, 1.0 / (sigma * sigma));

      printf("%f %f\n", xi, yi);
    }

  /* use uniform breakpoints on [0, 15] */
  gsl_bspline_knots_uniform(0.0, 15.0, bw);

  /* construct the fit matrix X */
  for (i = 0; i < n; ++i)
    {
      double xi = gsl_vector_get(x, i);

      /* compute B_j(xi) for all j */
      gsl_bspline_eval(xi, B, bw);

      /* fill in row i of X */
      for (j = 0; j < ncoeffs; ++j)
        {
          double Bj = gsl_vector_get(B, j);
          gsl_matrix_set(X, i, j, Bj);
        }
    }

  /* do the fit */
  gsl_multifit_wlinear(X, w, y, c, cov, &chisq, mw);

  dof = n - ncoeffs;
  tss = gsl_stats_wtss(w->data, 1, y->data, 1, y->size);
  Rsq = 1.0 - chisq / tss;

  fprintf(stderr, "chisq/dof = %e, Rsq = %f\n", 
                   chisq / dof, Rsq);

  /* output the smoothed curve */
  {
    double xi, yi, yerr;

    printf("#m=1,S=0\n");
    for (xi = 0.0; xi < 15.0; xi += 0.1)
      {
        gsl_bspline_eval(xi, B, bw);
        gsl_multifit_linear_est(B, c, cov, &yi, &yerr);
        printf("%f %f\n", xi, yi);
      }
  }

  gsl_rng_free(r);
  gsl_bspline_free(bw);
  gsl_vector_free(B);
  gsl_vector_free(x);
  gsl_vector_free(y);
  gsl_matrix_free(X);
  gsl_vector_free(c);
  gsl_vector_free(w);
  gsl_matrix_free(cov);
  gsl_multifit_linear_free(mw);

  return 0;
} /* main() */

The output can be plotted with GNU graph.

$ ./a.out > bspline.dat
chisq/dof = 1.118217e+00, Rsq = 0.989771
$ graph -T ps -X x -Y y -x 0 15 -y -1 1.3 < bspline.dat > bspline.ps

References and Further Reading

Further information on the algorithms described in this section can be found in the following book,

Further information of Greville abscissae and B-spline collocation can be found in the following paper,

A large collection of B-spline routines is available in the PPPACK library available at http://www.netlib.org/pppack, which is also part of SLATEC.


Go to the first, previous, next, last section, table of contents.