15.5 CROSSTABS

     CROSSTABS
             /TABLES=var_list BY var_list [BY var_list]...
             /MISSING={TABLE,INCLUDE,REPORT}
             /WRITE={NONE,CELLS,ALL}
             /FORMAT={TABLES,NOTABLES}
                     {PIVOT,NOPIVOT}
                     {AVALUE,DVALUE}
                     {NOINDEX,INDEX}
                     {BOX,NOBOX}
             /CELLS={COUNT,ROW,COLUMN,TOTAL,EXPECTED,RESIDUAL,SRESIDUAL,
                     ASRESIDUAL,ALL,NONE}
             /STATISTICS={CHISQ,PHI,CC,LAMBDA,UC,BTAU,CTAU,RISK,GAMMA,D,
                          KAPPA,ETA,CORR,ALL,NONE}
     
     (Integer mode.)
             /VARIABLES=var_list (low,high)...

The CROSSTABS procedure displays crosstabulation tables requested by the user. It can calculate several statistics for each cell in the crosstabulation tables. In addition, a number of statistics can be calculated for each table itself.

The TABLES subcommand is used to specify the tables to be reported. Any number of dimensions is permitted, and any number of variables per dimension is allowed. The TABLES subcommand may be repeated as many times as needed. This is the only required subcommand in general mode.
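
As a minimal sketch, assuming the active dataset contains variables named sex, occupation, and region (hypothetical names chosen only for illustration), the following requests a two-dimensional table and then, on a repeated TABLES subcommand, a three-dimensional one:

     * Variable names are hypothetical.
     CROSSTABS
             /TABLES=sex BY occupation
             /TABLES=sex BY occupation BY region.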

Occasionally, one may want to invoke a special mode called integer mode. Normally, in general mode, PSPP automatically determines what values occur in the data. In integer mode, the user specifies the range of values that the data assumes. To invoke this mode, specify the VARIABLES subcommand, giving a range of data values in parentheses for each variable to be used on the TABLES subcommand. Data values inside the range are truncated to an integer, then assigned to that value. Values outside the range are discarded. When it is present, the VARIABLES subcommand must precede the TABLES subcommand.
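
A sketch of integer mode, assuming hypothetical numeric variables sex (coded 1 to 2) and agegroup (coded 1 to 5); the variable names and ranges are illustrative only:

     * Variable names and ranges are hypothetical.
     CROSSTABS
             /VARIABLES=sex (1,2) agegroup (1,5)
             /TABLES=sex BY agegroup.

Going by the description above, an agegroup value such as 3.7 would be truncated and counted as 3, while a value of 6 would fall outside the declared range and be discarded.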

In general mode, numeric and string variables may be specified on TABLES. In integer mode, only numeric variables are allowed.

The MISSING subcommand determines the handling of user-missing values. When set to TABLE, the default, missing values are dropped on a table-by-table basis. When set to INCLUDE, user-missing values are included in tables and statistics. When set to REPORT, which is allowed only in integer mode, user-missing values are included in tables but marked with an ‘M’ (for “missing”) and excluded from statistical calculations.
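
For illustration, again with hypothetical variable names and ranges, the first command below keeps user-missing values in both tables and statistics, and the second uses REPORT, which requires integer mode and therefore a VARIABLES subcommand:

     * Variable names and ranges are hypothetical.
     CROSSTABS
             /TABLES=sex BY occupation
             /MISSING=INCLUDE.

     CROSSTABS
             /VARIABLES=sex (1,2) agegroup (1,5)
             /TABLES=sex BY agegroup
             /MISSING=REPORT.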

Currently the WRITE subcommand is ignored.

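The FORMAT subcommand controls the characteristics of the crosstabulation tables to be displayed. Its settings come in pairs, as listed in the syntax summary above: TABLES or NOTABLES, PIVOT or NOPIVOT, AVALUE or DVALUE, NOINDEX or INDEX, and BOX or NOBOX.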

The CELLS subcommand controls the contents of each cell in the displayed crosstabulation table. The possible settings are:

COUNT
Frequency count.
ROW
Row percent.
COLUMN
Column percent.
TOTAL
Table percent.
EXPECTED
Expected value.
RESIDUAL
Residual.
SRESIDUAL
Standardized residual.
ASRESIDUAL
Adjusted standardized residual.
ALL
All of the above.
NONE
Suppress cells entirely.

‘/CELLS’ without any settings specified requests COUNT, ROW, COLUMN, and TOTAL. If CELLS is not specified at all, then only COUNT is selected.
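
As a sketch, assuming hypothetical variables sex and occupation, the following asks for the observed count in each cell alongside the expected value and the residual:

     * Variable names are hypothetical.
     CROSSTABS
             /TABLES=sex BY occupation
             /CELLS=COUNT EXPECTED RESIDUAL.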

The STATISTICS subcommand selects statistics for computation:

CHISQ
Pearson chi-square, likelihood ratio, Fisher's exact test, continuity correction, linear-by-linear association.
PHI
Phi.
CC
Contingency coefficient.
LAMBDA
Lambda.
UC
Uncertainty coefficient.
BTAU
Tau-b.
CTAU
Tau-c.
RISK
Risk estimate.
GAMMA
Gamma.
D
Somers' D.
KAPPA
Cohen's Kappa.
ETA
Eta.
CORR
Spearman correlation, Pearson's r.
ALL
All of the above.
NONE
No statistics.

A selected statistic is calculated only when it is appropriate for the table in question. Certain statistics require tables of a particular size, and some statistics are calculated only in integer mode.

‘/STATISTICS’ without any settings selects CHISQ. If the STATISTICS subcommand is not given, no statistics are calculated.
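
For example, with hypothetical variables sex and occupation, the first command below names its statistics explicitly, while the second relies on the rule above and should therefore be equivalent to requesting CHISQ alone:

     * Variable names are hypothetical.
     CROSSTABS
             /TABLES=sex BY occupation
             /STATISTICS=CHISQ PHI CC.

     CROSSTABS
             /TABLES=sex BY occupation
             /STATISTICS.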

Please note: Currently the implementation of CROSSTABS has the following bugs:

Fixes for any of these deficiencies would be welcomed.