Next: Contour options, Previous: Generating histograms and cumulative freq., Up: Invoking Statistics [Contents][Index]
With the options below, you can customize the least squares fitting features of Statistics. For a tutorial of the usage of least squares fitting in Statistics, please see Least squares fitting. Here, we will just review the details of each option.
To activate least squares fitting in Statistics, it is necessary to use the --fit option to specify the type of fit you want to do. See the description of --fit for the various available fitting models. The fitting models that account for weights require three input columns, while the non-weighted ones only take two input columns. Here is a summary of the input columns:
If three columns are given to a model without weight, or two columns are given to a model that requires weights, Statistics will abort and inform you. Below you can see an example of fitting with the same linear model, once weighted and once without weights.
$ aststatistics table.fits --column=X,Y --fit=linear $ aststatistics table.fits --column=X,Y,Yerr --fit=linear-weighted
The output of the fitting can be in three modes listed below. For a complete example, see the tutorial in Least squares fitting).
By default (for example, the commands above) the output is an elaborate description of the model parameters. for example, \(c_0\) and \(c_1\) in the linear model (\(Y=c_0+c_1X\)). Their covariance matrix and the reduced \(\chi^2\) of the fit are also printed on the output.
If you don’t need the human friendly components of the output (which are annoying when you want to parse the outputs in some scenarios), you can use --quiet option. Only the raw output numbers will be printed.
Through the --fitestimate option, you can specify an independent table column to estimate the fit (it can also take a single value). See the description of this option for more.
The name of the fitting method to use. They are based on the linear and nonlinear least-squares fitting functions of the GNU Scientific Library (GSL).
linear
\(Y=c_0+c_1X\)
linear-weighted
\(Y=c_0+c_1X\); accounting for “weights” in \(Y\).
linear-no-constant
\(Y=c_1X\).
linear-no-constant-weighted
\(Y=c_1X\); accounting for “weights” in \(Y\).
polynomial
\(Y=c_0+c_1X+c_2X^2+\cdots+c_nX^n\); the maximum required power (\(n\)) is specified by --fitmaxpower.
polynomial-weighted
\(Y=c_0+c_1X+c_2X^2+\cdots+c_nX^n\); accounting for “weights” in \(Y\). The maximum required power (\(n\)) is specified by --fitmaxpower.
polynomial-robust
¶\(Y=c_0+c_1X+c_2X^2+\cdots+c_nX^n\); rejects outliers. The function to use for outlier removal can be specified with the --fitrobust option described below. This model doesn’t take weights since they are calculated internally based on the outlier removal function (requires two input columns). The maximum required power (\(n\)) is specified by --fitmaxpower.
For a comprehensive review of “robust” fitting and the available functions, please see the Robust linear regression section of the GNU Scientific Library.
The nature of the “weight” column (when a weight is necessary for the model). It can take one of the following values:
std
Standard deviation of each \(Y\) axis measurement: this is the usual “error” associated with a measurement (for example, in MakeCatalog) and is the default value to this option.
var
Variance of each \(Y\) axis measurement. Assuming a Gaussian distribution with standard deviation \(\sigma\), the variance is \(\sigma^2\).
inv-var
Inverse variance of each \(Y\) axis measurement. Assuming a Gaussian distribution with standard deviation \(\sigma\), the variance is \(1/\sigma^2\).
The maximum power (an integer) in a polynomial (\(n\) in \(Y=c_0+c_1X+c_2X^2+\cdots+c_nX^n\)). This is only relevant when one of the polynomial models is given to --fit. The fit will return \(n+1\) coefficients.
The function for rejecting outliers in the polynomial-robust
fitting model.
For a comprehensive review of “robust” fitting and the available functions, please see the Robust linear regression section of the GNU Scientific Library.
This function can take the following values:
bisquare
¶Tukey’s biweight (bisquare) function, this is the default function. According to the GSL manual, this is a good general purpose weight function.
cauchy
¶Cauchy’s function (also known as the Lorentzian function). It doesn’t guarantee a unique solution, so it should be used with care.
fair
¶The fair function. It guarantees a unique solution and has continuous derivatives to three orders.
huber
¶Huber’s \(\rho\) function. This is also a good general purpose weight function for rejecting outliers, but can cause difficulty in some special scenarios.
ols
Ordinary Least Squares (OLS) solution with a constant weight of unity.
welsch
¶Welsch function which is useful when the residuals follow an exponential distribution.
Estimate the fitted function at a single point or a complete column of points. The input \(X\) axis positions to estimate the function can be specified in the following ways:
self
: in this mode, the same X axis column that was used in the fit will be used for estimating the fitted function.
This can be useful to visually/easily check the fit, see Least squares fitting.
The output in this mode can be customized in the following ways:
HDU name or counter (counting from zero) that contains the table to be used for the estimating the fitted function over many points through --fitestimate. For more on selecting a HDU, see the description of --hdu in Input/Output options.
Column name or counter (counting from one) that contains the table to be used for the estimating the fitted function over many points through --fitestimate. See Selecting table columns.
Next: Contour options, Previous: Generating histograms and cumulative freq., Up: Invoking Statistics [Contents][Index]
JavaScript license information
GNU Astronomy Utilities 0.19 manual, October 2022.