Next: Regularized regression, Previous: Linear regression, Up: Least-Squares Fitting [Index]

This section describes routines which perform least squares fits to a linear model by minimizing the cost function

\chi^2 = \sum_i w_i (y_i - \sum_j X_ij c_j)^2 = || y - Xc ||_W^2

where *y* is a vector of *n* observations, *X* is an
*n*-by-*p* matrix of predictor variables, *c*
is a vector of the *p* unknown best-fit parameters to be estimated,
and *||r||_W^2 = r^T W r*.
The matrix *W = * diag*(w_1,w_2,...,w_n)*
defines the weights or uncertainties of the observation vector.

This formulation can be used for fits to any number of functions and/or
variables by preparing the *n*-by-*p* matrix *X*
appropriately. For example, to fit to a *p*-th order polynomial in
`x`, use the following matrix,

X_{ij} = x_i^j

where the index *i* runs over the observations and the index
*j* runs from 0 to *p-1*.

To fit to a set of *p* sinusoidal functions with fixed frequencies
*\omega_1*, *\omega_2*, …, *\omega_p*, use,

X_{ij} = sin(\omega_j x_i)

To fit to *p* independent variables *x_1*, *x_2*, …,
*x_p*, use,

X_{ij} = x_j(i)

where *x_j(i)* is the *i*-th value of the predictor variable
*x_j*.

The solution of the general linear least-squares system requires an
additional working space for intermediate results, such as the singular
value decomposition of the matrix *X*.

These functions are declared in the header file `gsl_multifit.h`.

- Function:
*gsl_multifit_linear_workspace ****gsl_multifit_linear_alloc***(const size_t*`n`, const size_t`p`) -
This function allocates a workspace for fitting a model to a maximum of

`n`observations using a maximum of`p`parameters. The user may later supply a smaller least squares system if desired. The size of the workspace is*O(np + p^2)*.

- Function:
*void***gsl_multifit_linear_free***(gsl_multifit_linear_workspace **`work`) This function frees the memory associated with the workspace

`w`.

- Function:
*int***gsl_multifit_linear_svd***(const gsl_matrix **`X`, gsl_multifit_linear_workspace *`work`) This function performs a singular value decomposition of the matrix

`X`and stores the SVD factors internally in`work`.

- Function:
*int***gsl_multifit_linear_bsvd***(const gsl_matrix **`X`, gsl_multifit_linear_workspace *`work`) This function performs a singular value decomposition of the matrix

`X`and stores the SVD factors internally in`work`. The matrix`X`is first balanced by applying column scaling factors to improve the accuracy of the singular values.

- Function:
*int***gsl_multifit_linear***(const gsl_matrix **`X`, const gsl_vector *`y`, gsl_vector *`c`, gsl_matrix *`cov`, double *`chisq`, gsl_multifit_linear_workspace *`work`) This function computes the best-fit parameters

`c`of the model*y = X c*for the observations`y`and the matrix of predictor variables`X`, using the preallocated workspace provided in`work`. The*p*-by-*p*variance-covariance matrix of the model parameters`cov`is set to*\sigma^2 (X^T X)^{-1}*, where*\sigma*is the standard deviation of the fit residuals. The sum of squares of the residuals from the best-fit,*\chi^2*, is returned in`chisq`. If the coefficient of determination is desired, it can be computed from the expression*R^2 = 1 - \chi^2 / TSS*, where the total sum of squares (TSS) of the observations`y`may be computed from`gsl_stats_tss`

.The best-fit is found by singular value decomposition of the matrix

`X`using the modified Golub-Reinsch SVD algorithm, with column scaling to improve the accuracy of the singular values. Any components which have zero singular value (to machine precision) are discarded from the fit.

- Function:
*int***gsl_multifit_linear_tsvd***(const gsl_matrix **`X`, const gsl_vector *`y`, const double`tol`, gsl_vector *`c`, gsl_matrix *`cov`, double *`chisq`, size_t *`rank`, gsl_multifit_linear_workspace *`work`) This function computes the best-fit parameters

`c`of the model*y = X c*for the observations`y`and the matrix of predictor variables`X`, using a truncated SVD expansion. Singular values which satisfy*s_i \le tol \times s_0*are discarded from the fit, where*s_0*is the largest singular value. The*p*-by-*p*variance-covariance matrix of the model parameters`cov`is set to*\sigma^2 (X^T X)^{-1}*, where*\sigma*is the standard deviation of the fit residuals. The sum of squares of the residuals from the best-fit,*\chi^2*, is returned in`chisq`. The effective rank (number of singular values used in solution) is returned in`rank`. If the coefficient of determination is desired, it can be computed from the expression*R^2 = 1 - \chi^2 / TSS*, where the total sum of squares (TSS) of the observations`y`may be computed from`gsl_stats_tss`

.

- Function:
*int***gsl_multifit_wlinear***(const gsl_matrix **`X`, const gsl_vector *`w`, const gsl_vector *`y`, gsl_vector *`c`, gsl_matrix *`cov`, double *`chisq`, gsl_multifit_linear_workspace *`work`) This function computes the best-fit parameters

`c`of the weighted model*y = X c*for the observations`y`with weights`w`and the matrix of predictor variables`X`, using the preallocated workspace provided in`work`. The*p*-by-*p*covariance matrix of the model parameters`cov`is computed as*(X^T W X)^{-1}*. The weighted sum of squares of the residuals from the best-fit,*\chi^2*, is returned in`chisq`. If the coefficient of determination is desired, it can be computed from the expression*R^2 = 1 - \chi^2 / WTSS*, where the weighted total sum of squares (WTSS) of the observations`y`may be computed from`gsl_stats_wtss`

.

- Function:
*int***gsl_multifit_wlinear_tsvd***(const gsl_matrix **`X`, const gsl_vector *`w`, const gsl_vector *`y`, const double`tol`, gsl_vector *`c`, gsl_matrix *`cov`, double *`chisq`, size_t *`rank`, gsl_multifit_linear_workspace *`work`) This function computes the best-fit parameters

`c`of the weighted model*y = X c*for the observations`y`with weights`w`and the matrix of predictor variables`X`, using a truncated SVD expansion. Singular values which satisfy*s_i \le tol \times s_0*are discarded from the fit, where*s_0*is the largest singular value. The*p*-by-*p*covariance matrix of the model parameters`cov`is computed as*(X^T W X)^{-1}*. The weighted sum of squares of the residuals from the best-fit,*\chi^2*, is returned in`chisq`. The effective rank of the system (number of singular values used in the solution) is returned in`rank`. If the coefficient of determination is desired, it can be computed from the expression*R^2 = 1 - \chi^2 / WTSS*, where the weighted total sum of squares (WTSS) of the observations`y`may be computed from`gsl_stats_wtss`

.

- Function:
*int***gsl_multifit_linear_est***(const gsl_vector **`x`, const gsl_vector *`c`, const gsl_matrix *`cov`, double *`y`, double *`y_err`) This function uses the best-fit multilinear regression coefficients

`c`and their covariance matrix`cov`to compute the fitted function value`y`and its standard deviation`y_err`for the model*y = x.c*at the point`x`.

- Function:
*int***gsl_multifit_linear_residuals***(const gsl_matrix **`X`, const gsl_vector *`y`, const gsl_vector *`c`, gsl_vector *`r`) This function computes the vector of residuals

*r = y - X c*for the observations`y`, coefficients`c`and matrix of predictor variables`X`.

- Function:
*size_t***gsl_multifit_linear_rank***(const double*`tol`, const gsl_multifit_linear_workspace *`work`) This function returns the rank of the matrix

*X*which must first have its singular value decomposition computed. The rank is computed by counting the number of singular values*\sigma_j*which satisfy*\sigma_j > tol \times \sigma_0*, where*\sigma_0*is the largest singular value.

Next: Regularized regression, Previous: Linear regression, Up: Least-Squares Fitting [Index]