
Many statistical tests rely upon certain properties of the data.
One common property, upon which many linear tests depend, is that of
normality: the data must have been drawn from a normal distribution.
It is therefore necessary to check for normality before deciding upon the
test procedure to use. One way to do this uses the `EXAMINE` command.

In Example 5.5, a researcher was examining the failure rates
of equipment produced by an engineering company.
The file `repairs.sav` contains the mean time between
failures (`mtbf`) of some items of equipment subject to the study.
Before performing linear analysis on the data,
the researcher wanted to ascertain that the data is normally distributed.

A normal distribution has a skewness of zero and, in the form that
`EXAMINE` reports it (excess kurtosis), a kurtosis of zero.
Looking at the skewness of `mtbf` in Example 5.5, the statistic (1.85)
is large relative to its standard error (.58), so the `mtbf` figures
have considerable positive skew and are unlikely to have been
drawn from a normally distributed variable.
Positive skew can often be compensated for by applying a logarithmic
transformation.
This is done with the `COMPUTE` command in the line

compute mtbf_ln = ln (mtbf).

Rather than redefining the existing variable, this use of `COMPUTE`
defines a new variable `mtbf_ln` which is
the natural logarithm of `mtbf`.
The final command in this example calls `EXAMINE` on this new variable,
and the results show that both the skewness and
kurtosis for `mtbf_ln` are very close to zero.
This provides some confidence that the `mtbf_ln` variable is
normally distributed and thus safe for linear analysis.
If no suitable transformation can be found,
it would be worth considering
an appropriate non-parametric test instead of a linear one.
See NPAR TESTS for information about non-parametric tests.
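The effect of a logarithmic transformation on skewness can also be reproduced outside PSPP. The following is a minimal Python sketch, not PSPP's implementation: it assumes the usual bias-corrected sample formulas for skewness and excess kurtosis, and the `hours` list is hypothetical data invented for illustration, not the contents of `repairs.sav`.

```python
import math


def skewness(xs):
    """Bias-corrected sample skewness (assumed formula, not taken from PSPP)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs)
    m3 = sum((x - mean) ** 3 for x in xs)
    s = math.sqrt(m2 / (n - 1))  # sample standard deviation
    return n * m3 / ((n - 1) * (n - 2) * s ** 3)


def excess_kurtosis(xs):
    """Bias-corrected sample excess kurtosis (zero for a normal distribution)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs)
    m4 = sum((x - mean) ** 4 for x in xs)
    s2 = m2 / (n - 1)  # sample variance
    first = n * (n + 1) * m4 / ((n - 1) * (n - 2) * (n - 3) * s2 ** 2)
    second = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return first - second


# Hypothetical, positively skewed failure times (NOT the repairs.sav data).
hours = [1.5, 2.0, 2.4, 3.1, 3.9, 4.8, 6.0, 7.7, 10.2, 14.5, 27.0]

# Same idea as "compute mtbf_ln = ln (mtbf)": build a new, transformed series.
hours_ln = [math.log(h) for h in hours]

print("raw: skewness=%.2f kurtosis=%.2f" % (skewness(hours), excess_kurtosis(hours)))
print("ln : skewness=%.2f kurtosis=%.2f" % (skewness(hours_ln), excess_kurtosis(hours_ln)))
```

For this made-up series the skewness of the logged values is much closer to zero than that of the raw values, which is the same pattern the `EXAMINE` output shows for `mtbf` and `mtbf_ln`.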

    PSPP> get file='/usr/local/share/pspp/examples/repairs.sav'.
    PSPP> examine mtbf /statistics=descriptives.
    PSPP> compute mtbf_ln = ln (mtbf).
    PSPP> examine mtbf_ln /statistics=descriptives.

    Output:

    1.2 EXAMINE.  Descriptives
    #====================================================#=========#==========#
    #                                                    #Statistic|Std. Error#
    #====================================================#=========#==========#
    #mtbf    Mean                                        #   8.32  |   1.62   #
    #        95% Confidence Interval for Mean Lower Bound#   4.85  |          #
    #                                         Upper Bound#  11.79  |          #
    #        5% Trimmed Mean                             #   7.69  |          #
    #        Median                                      #   8.12  |          #
    #        Variance                                    #  39.21  |          #
    #        Std. Deviation                              #   6.26  |          #
    #        Minimum                                     #   1.63  |          #
    #        Maximum                                     #  26.47  |          #
    #        Range                                       #  24.84  |          #
    #        Interquartile Range                         #   5.83  |          #
    #        Skewness                                    #   1.85  |    .58   #
    #        Kurtosis                                    #   4.49  |   1.12   #
    #====================================================#=========#==========#

    2.2 EXAMINE.  Descriptives
    #====================================================#=========#==========#
    #                                                    #Statistic|Std. Error#
    #====================================================#=========#==========#
    #mtbf_ln Mean                                        #   1.88  |    .19   #
    #        95% Confidence Interval for Mean Lower Bound#   1.47  |          #
    #                                         Upper Bound#   2.29  |          #
    #        5% Trimmed Mean                             #   1.88  |          #
    #        Median                                      #   2.09  |          #
    #        Variance                                    #    .54  |          #
    #        Std. Deviation                              #    .74  |          #
    #        Minimum                                     #    .49  |          #
    #        Maximum                                     #   3.28  |          #
    #        Range                                       #   2.79  |          #
    #        Interquartile Range                         #    .92  |          #
    #        Skewness                                    #   -.16  |    .58   #
    #        Kurtosis                                    #   -.09  |   1.12   #
    #====================================================#=========#==========#
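One informal way to read the Skewness and Kurtosis rows is to divide each statistic by its standard error. The Python sketch below does this with the figures printed in the output above. It is an illustration rather than a PSPP feature, and it rests on assumptions: the threshold of 2 is a common rule of thumb, not a formal test; the standard-error formulas are the conventional large-sample ones; and the case count of 15 is inferred from the reported standard errors, since the text does not state it.

```python
import math


def se_skewness(n):
    """Conventional standard error of sample skewness for n cases (assumed formula)."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n):
    """Conventional standard error of sample excess kurtosis for n cases (assumed formula)."""
    return math.sqrt(4.0 * (n * n - 1) * se_skewness(n) ** 2 / ((n - 3) * (n + 5)))


# (statistic, standard error) pairs copied from the EXAMINE output.
reported = {
    "mtbf skewness": (1.85, 0.58),
    "mtbf kurtosis": (4.49, 1.12),
    "mtbf_ln skewness": (-0.16, 0.58),
    "mtbf_ln kurtosis": (-0.09, 1.12),
}

THRESHOLD = 2.0  # rule-of-thumb cut-off; an assumption, not a PSPP setting

for name, (stat, se) in reported.items():
    ratio = stat / se
    verdict = "suspect" if abs(ratio) > THRESHOLD else "unremarkable"
    print("%-17s ratio=%6.2f  %s" % (name, ratio, verdict))

# Assumed case count: n = 15 gives standard errors matching the table.
n = 15
print("SE skewness for n=%d: %.2f" % (n, se_skewness(n)))
print("SE kurtosis for n=%d: %.2f" % (n, se_kurtosis(n)))
```

Under these assumptions both ratios for `mtbf` exceed the threshold while both ratios for `mtbf_ln` fall well inside it, which matches the conclusion drawn in the text.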
