Next: , Previous: , Up: Usage Examples   [Contents][Index]


5.8 Crosstab - Cross-Tabulation (pivot-tables)

Cross-tabulation compares the relationship between two fields. Given the following input file:

$ cat input.txt
a    x    3
a    y    7
b    x    21
a    x    40

Show cross-tabulation between the first field (a/b) and the second field (x/y) - counting how many times each pair appears (note: sorting is required):

$ datamash -s crosstab 1,2 < input.txt
     x    y
a    2    1
b    1    N/A

The default operation is count - in the above example, a and x appear twice in the input file, while b and y never appear together.

An optional grouping operation can be used instead of counting.

For each pair, sum the values in the third column:

$ datamash -s crosstab 1,2 sum 3 < input.txt
     x    y
a    43   7
b    21   N/A

For each pair, list all unique values in the third column:

$ datamash -s crosstab 1,2 unique 3 < input.txt
     x    y
a    3,40 7
b    21   N/A

Note that using --header-out with crosstab prints a line showing how to interpret the rows and columns, and what operation was used.

$ datamash -s --header-in --header-out crosstab 1,2 < input.txt
GroupBy(a) GroupBy(x) count(a)
     x    y
a    1    1
b    1    N/A