
With the Hyperbolic flag, `H a F` [`efit`] performs the same
fitting operation as `a F`, but reports the coefficients as error
forms instead of plain numbers. Fitting our two data matrices (first
with 13, then with 14) to a line with `H a F` gives the results,

3. + 2. x

2.6 +/- 0.382970843103 + 2.2 +/- 0.115470053838 x

In the first case the estimated errors are zero because the linear fit is perfect. In the second case, the errors are nonzero but moderately small, because the data are still very close to linear.
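To see where numbers like these can come from, here is a minimal Python sketch of an unweighted straight-line fit that also estimates the coefficient errors from the scatter of the residuals. It uses the standard textbook formulas, which may differ in detail from what Calc does internally. The data values are an assumption (the data matrices are not shown in this section); they were chosen because they reproduce the coefficients quoted above. The function name `fit_line_with_errors` is hypothetical.

```python
import math

def fit_line_with_errors(xs, ys):
    """Fit y = a + b x by least squares; return ((a, err_a), (b, err_b))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # With no input errors, the residual scatter is used to estimate the
    # variance of the data, with N - 2 degrees of freedom for two parameters.
    residual_sum = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s2 = residual_sum / (n - 2)
    err_b = math.sqrt(s2 / sxx)
    err_a = math.sqrt(s2 * sum(x * x for x in xs) / (n * sxx))
    return (a, err_a), (b, err_b)

xs = [1, 2, 3, 4, 5]                                    # assumed x values
perfect = fit_line_with_errors(xs, [5, 7, 9, 11, 13])   # assumed y values
noisy = fit_line_with_errors(xs, [5, 7, 9, 11, 14])     # last value changed to 14

print(perfect)  # intercept 3 and slope 2, both with zero error
print(noisy)    # intercept about 2.6 +/- 0.38297, slope about 2.2 +/- 0.11547
```

The perfect fit leaves no residuals, so both error estimates vanish; changing a single data value is enough to make them nonzero.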

It is also possible for the *input* to a fitting operation to
contain error forms. The data values must either all include errors
or all be plain numbers. Error forms can go anywhere but generally
go on the numbers in the last row of the data matrix. If the last
row contains error forms ‘`y_i +/- sigma_i`’, then the ‘`chi^2`’
statistic is now,

chi^2 = sum(((y_i - (a + b x_i)) / sigma_i)^2, i, 1, N)

so that data points with larger error estimates contribute less to the fitting operation.
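As a rough illustration, the statistic can be written out directly in Python. This is a sketch of the formula only, not of Calc's implementation; the function name `chi_squared` and the sample values are hypothetical.

```python
def chi_squared(xs, ys, sigmas, a, b):
    """Sum of squared residuals of y = a + b x, each scaled by its sigma."""
    return sum(((y - (a + b * x)) / s) ** 2
               for x, y, s in zip(xs, ys, sigmas))

# Two points, tested against the line y = 1 + x. The second point misses
# the line by twice as much, but its sigma is also twice as large, so both
# points contribute equally.
value = chi_squared([1, 2], [3, 5], [1, 2], a=1, b=1)
print(value)  # 2.0
```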

If there are error forms on other rows of the data matrix, all the
errors for a given data point are combined; the square root of the
sum of the squares of the errors forms the ‘`sigma_i`’ used for the
data point.
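The combination rule just described amounts to adding the errors in quadrature. A small Python sketch, with a hypothetical function name and made-up error values:

```python
import math

def combined_sigma(errors):
    """Square root of the sum of the squares of one data point's errors."""
    return math.sqrt(sum(e * e for e in errors))

# A data point whose x value carries an error of 3 and whose y value
# carries an error of 4 is treated as having a single sigma of 5.
sigma = combined_sigma([3, 4])
print(sigma)  # 5.0
```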

Both `a F` and `H a F` can accept error forms in the input
matrix, although if you are concerned about error analysis you will
probably use `H a F` so that the output also contains error
estimates.

If the input contains error forms but all the ‘`sigma_i`’ values are
the same, it is easy to see that the resulting fitted model will be
the same as if the input did not have error forms at all (‘`chi^2`’
is simply scaled uniformly by ‘`1 / sigma^2`’, which doesn’t affect
where it has a minimum). But there *will* be a difference in the
estimated errors of the coefficients reported by `H a F`.
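The following Python sketch illustrates this for a straight-line fit. It uses the usual textbook formulas for a weighted least-squares line, in which the coefficient errors are derived from the input sigmas; it is assumed, not confirmed by this section, that Calc's results behave the same way. The function name `weighted_line_fit` and the data values are hypothetical.

```python
import math

def weighted_line_fit(xs, ys, sigmas):
    """Weighted fit of y = a + b x; return (a, b, err_a, err_b)."""
    w = [1.0 / (s * s) for s in sigmas]
    S = sum(w)
    Sx = sum(wi * x for wi, x in zip(w, xs))
    Sy = sum(wi * y for wi, y in zip(w, ys))
    Sxx = sum(wi * x * x for wi, x in zip(w, xs))
    Sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    delta = S * Sxx - Sx * Sx
    a = (Sxx * Sy - Sx * Sxy) / delta
    b = (S * Sxy - Sx * Sy) / delta
    # Coefficient errors follow from the input sigmas alone.
    return a, b, math.sqrt(Sxx / delta), math.sqrt(S / delta)

xs = [1, 2, 3, 4, 5]        # assumed sample data
ys = [5, 7, 9, 11, 14]
wide = weighted_line_fit(xs, ys, [1.0] * 5)    # every sigma_i = 1
narrow = weighted_line_fit(xs, ys, [0.1] * 5)  # every sigma_i = 0.1

print(wide)    # same a and b as below, larger errors
print(narrow)  # errors ten times smaller
```

Both calls give the same line, but the reported errors shrink in proportion to the common sigma.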

Consult any text on statistical modeling of data for a discussion of where these error estimates come from and how they should be interpreted.

With the Inverse flag, `I a F` [`xfit`] produces even more
information. The result is a vector of six items:

- The model formula with error forms for its coefficients or parameters. This is the result that `H a F` would have produced.
- A vector of “raw” parameter values for the model. These are the polynomial coefficients or other parameters as plain numbers, in the same order as the parameters appeared in the final prompt of the `I a F` command. For polynomials of degree ‘`d`’, this vector will have length ‘`M = d+1`’ with the constant term first.
- The covariance matrix ‘`C`’ computed from the fit. This is an `M`x`M` symmetric matrix; the diagonal elements ‘`C_j_j`’ are the variances ‘`sigma_j^2`’ of the parameters. The other elements are covariances ‘`sigma_i_j^2`’ that describe the correlation between pairs of parameters. (A related set of numbers, the *linear correlation coefficients* ‘`r_i_j`’, are defined as ‘`sigma_i_j^2 / sigma_i sigma_j`’.)
- A vector of ‘`M`’ “parameter filter” functions whose meanings are described below. If no filters are necessary this will instead be an empty vector; this is always the case for the polynomial and multilinear fits described so far.
- The value of ‘`chi^2`’ for the fit, calculated by the formulas shown above. This gives a measure of the quality of the fit; statisticians consider ‘`chi^2 = N - M`’ to indicate a moderately good fit (where again ‘`N`’ is the number of data points and ‘`M`’ is the number of parameters).
- A measure of goodness of fit expressed as a probability ‘`Q`’. This is computed from the `utpc` probability distribution function using ‘`chi^2`’ with ‘`N - M`’ degrees of freedom. A value of 0.5 implies a good fit; some texts suggest that ‘`Q = 0.1`’ or even 0.001 can often signify an acceptable fit. In particular, ‘`chi^2`’ statistics assume the errors in your inputs follow a normal (Gaussian) distribution; if they don’t, you may have to accept smaller values of ‘`Q`’.

  The ‘`Q`’ value is computed only if the input included error estimates. Otherwise, Calc will report the symbol `nan` for ‘`Q`’. The reason is that in this case the ‘`chi^2`’ value has effectively been used to estimate the original errors in the input, and thus there is no redundant information left over to use for a confidence test.
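As a rough guide to how the covariance matrix, the correlation coefficient, ‘`chi^2`’ and ‘`Q`’ relate to one another, here is a Python sketch for a straight-line fit with input errors. It is not Calc's implementation: the formulas are the standard textbook ones, ‘`Q`’ is taken to be the upper-tail probability of the chi-square distribution with ‘`N - M`’ degrees of freedom, and the function names and sample data (including the sigma of 0.5) are hypothetical.

```python
import math

def chi2_upper_tail(chi2, dof):
    """Upper-tail chi-square probability for dof >= 1 degrees of freedom."""
    if dof < 1:
        raise ValueError("need at least one degree of freedom")
    z = chi2 / 2.0
    # Closed-form starting values of the regularized upper incomplete gamma
    # function: Q(1, z) = exp(-z) and Q(1/2, z) = erfc(sqrt(z)).
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < dof / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def line_fit_summary(xs, ys, sigmas):
    """Parameters, covariance matrix, correlation, chi^2 and Q for a line."""
    w = [1.0 / (s * s) for s in sigmas]
    S = sum(w)
    Sx = sum(wi * x for wi, x in zip(w, xs))
    Sy = sum(wi * y for wi, y in zip(w, ys))
    Sxx = sum(wi * x * x for wi, x in zip(w, xs))
    Sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    delta = S * Sxx - Sx * Sx
    a = (Sxx * Sy - Sx * Sxy) / delta
    b = (S * Sxy - Sx * Sy) / delta
    # 2x2 covariance matrix; the diagonal holds the parameter variances.
    cov = [[Sxx / delta, -Sx / delta], [-Sx / delta, S / delta]]
    # Linear correlation coefficient between intercept and slope.
    r_ab = cov[0][1] / math.sqrt(cov[0][0] * cov[1][1])
    chi2 = sum(((y - (a + b * x)) / s) ** 2
               for x, y, s in zip(xs, ys, sigmas))
    q = chi2_upper_tail(chi2, len(xs) - 2)  # M = 2 parameters
    return {"params": [a, b], "cov": cov, "r_ab": r_ab, "chi2": chi2, "Q": q}

# Assumed sample data: five points, each with a hypothetical sigma of 0.5.
summary = line_fit_summary([1, 2, 3, 4, 5], [5, 7, 9, 11, 14], [0.5] * 5)
print(summary)
```

Here ‘`chi^2`’ comes out below ‘`N - M = 3`’, and ‘`Q`’ is correspondingly above 0.5.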