Many statistical tests rely upon certain properties of the data.
One common property, upon which many linear tests depend, is that of
normality — the data must have been drawn from a normal distribution.
It is therefore necessary to check for normality before deciding upon the
test procedure to use. One way to do this uses the `EXAMINE` command.

In Example 5.5, a researcher was examining the failure rates
of equipment produced by an engineering company.
The file `repairs.sav` contains the mean time between
failures (`mtbf`) of some items of equipment subject to the study.
Before performing linear analysis on the data,
the researcher wanted to ascertain that the data is normally distributed.

A normal distribution has a skewness of zero and, when kurtosis is
measured as excess kurtosis, a kurtosis of zero.
In Example 5.5 the skewness of `mtbf` is 1.85, more than three times
its standard error of .58, so the mtbf figures are strongly positively
skewed and are unlikely to have been drawn from a normally distributed variable.
Positive skew can often be compensated for by applying a logarithmic
transformation.
This is done with the `COMPUTE` command in the line

    compute mtbf_ln = ln (mtbf).

Rather than redefining the existing variable, this use of `COMPUTE`
defines a new variable, `mtbf_ln`, which is
the natural logarithm of `mtbf`.
The final command in this example calls `EXAMINE` on this new variable.
The results show that both the skewness (-.16) and the
kurtosis (-.09) of `mtbf_ln` are very close to zero, each well within
one standard error.
This provides some confidence that the `mtbf_ln` variable is
approximately normally distributed and thus suitable for linear analysis.
If no suitable transformation can be found,
it is worth considering
an appropriate non-parametric test instead of a linear one.
See NPAR TESTS for information about non-parametric tests.
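
To see why a logarithm can remove positive skew, the following minimal sketch (in Python rather than PSPP syntax) compares the sample skewness of a small set of values before and after taking natural logarithms. The values are hypothetical and are not taken from `repairs.sav`; `sample_skewness` is an illustrative helper that uses the common bias-corrected formula, which is assumed here rather than confirmed to be the one PSPP applies.

```python
import math


def sample_skewness(values):
    """Bias-corrected (adjusted Fisher-Pearson) sample skewness.

    Illustrative helper; assumed, not confirmed, to match PSPP's formula.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)


# Hypothetical right-skewed values: each is double the previous one.
raw = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]

# The natural logarithm turns the doubling steps into equal steps,
# so the transformed values are evenly spaced and symmetric.
logged = [math.log(v) for v in raw]

skew_raw = sample_skewness(raw)
skew_logged = sample_skewness(logged)

print(f"skewness before transformation: {skew_raw:.2f}")
print(f"skewness after transformation:  {skew_logged:.2f}")
```

Because the logarithm compresses large values more than small ones, the long right tail is pulled in. Real data will rarely become perfectly symmetric as this constructed example does.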

    PSPP> get file='/usr/local/share/pspp/examples/repairs.sav'.
    PSPP> examine mtbf /statistics=descriptives.
    PSPP> compute mtbf_ln = ln (mtbf).
    PSPP> examine mtbf_ln /statistics=descriptives.

    Output:

                           Case Processing Summary
    +-----------------------------------+-------------------------------+
    |                                   |             Cases             |
    |                                   +----------+---------+----------+
    |                                   |   Valid  | Missing |   Total  |
    |                                   | N|Percent|N|Percent| N|Percent|
    +-----------------------------------+--+-------+-+-------+--+-------+
    |Mean time between failures (months)|15| 100.0%|0|    .0%|15| 100.0%|
    +-----------------------------------+--+-------+-+-------+--+-------+

                                      Descriptives
    +----------------------------------------------------------+---------+--------+
    |                                                          |         |  Std.  |
    |                                                          |Statistic|  Error |
    +----------------------------------------------------------+---------+--------+
    |Mean time between        Mean                             |     8.32|    1.62|
    |failures (months)        95% Confidence Interval Lower    |     4.85|        |
    |                         for Mean                Bound    |         |        |
    |                                                 Upper    |    11.79|        |
    |                                                 Bound    |         |        |
    |                         5% Trimmed Mean                  |     7.69|        |
    |                         Median                           |     8.12|        |
    |                         Variance                         |    39.21|        |
    |                         Std. Deviation                   |     6.26|        |
    |                         Minimum                          |     1.63|        |
    |                         Maximum                          |    26.47|        |
    |                         Range                            |    24.84|        |
    |                         Interquartile Range              |     5.83|        |
    |                         Skewness                         |     1.85|     .58|
    |                         Kurtosis                         |     4.49|    1.12|
    +----------------------------------------------------------+---------+--------+

             Case Processing Summary
    +-------+-------------------------------+
    |       |             Cases             |
    |       +----------+---------+----------+
    |       |   Valid  | Missing |   Total  |
    |       | N|Percent|N|Percent| N|Percent|
    +-------+--+-------+-+-------+--+-------+
    |mtbf_ln|15| 100.0%|0|    .0%|15| 100.0%|
    +-------+--+-------+-+-------+--+-------+

                                    Descriptives
    +----------------------------------------------------+---------+----------+
    |                                                    |Statistic|Std. Error|
    +----------------------------------------------------+---------+----------+
    |mtbf_ln Mean                                        |     1.88|       .19|
    |        95% Confidence Interval for Mean Lower Bound|     1.47|          |
    |                                         Upper Bound|     2.29|          |
    |        5% Trimmed Mean                             |     1.88|          |
    |        Median                                      |     2.09|          |
    |        Variance                                    |      .54|          |
    |        Std. Deviation                              |      .74|          |
    |        Minimum                                     |      .49|          |
    |        Maximum                                     |     3.28|          |
    |        Range                                       |     2.79|          |
    |        Interquartile Range                         |      .92|          |
    |        Skewness                                    |     -.16|       .58|
    |        Kurtosis                                    |     -.09|      1.12|
    +----------------------------------------------------+---------+----------+
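
The standard errors printed beside the skewness and kurtosis give a rough yardstick for how far from zero each statistic really is. The sketch below (in Python rather than PSPP syntax) assumes the conventional formulas for these standard errors. The text does not state which formulas PSPP uses, but for 15 cases the assumed ones reproduce the .58 and 1.12 shown in the output. The two-standard-error threshold is a common rule of thumb, not a formal test, and the function names are illustrative.

```python
import math


def skewness_std_error(n):
    """Standard error of sample skewness for n cases (assumed conventional formula)."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def kurtosis_std_error(n):
    """Standard error of sample excess kurtosis for n cases (assumed conventional formula)."""
    ses = skewness_std_error(n)
    return math.sqrt(4.0 * (n * n - 1) * ses ** 2 / ((n - 3) * (n + 5)))


n = 15  # valid cases, from the Case Processing Summary

# (skewness, kurtosis) as reported in the two Descriptives tables.
reported = {"mtbf": (1.85, 4.49), "mtbf_ln": (-0.16, -0.09)}

ses = skewness_std_error(n)
sek = kurtosis_std_error(n)
print(f"std. error of skewness: {ses:.2f}; of kurtosis: {sek:.2f}")

# Express each statistic as a multiple of its standard error.
ratios = {}
for name, (skew, kurt) in reported.items():
    ratios[name] = (skew / ses, kurt / sek)
    flag = "suspect" if max(abs(r) for r in ratios[name]) > 2 else "plausible"
    print(f"{name}: skewness/SE = {ratios[name][0]:.1f}, "
          f"kurtosis/SE = {ratios[name][1]:.1f} -> normality {flag}")
```

On these figures `mtbf` lies more than three standard errors from zero on skewness, while `mtbf_ln` lies within a fraction of one on both measures, which agrees with the conclusion drawn from the tables.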

