2.5 Sources of random data

The shuf, shred, and sort commands sometimes need random data to do their work. For example, ‘sort -R’ must choose a hash function at random, and it needs random data to make this selection.

By default these commands use an internal pseudo-random generator initialized by a small amount of entropy, but can be directed to use an external source with the --random-source=file option. An error is reported if file does not contain enough bytes.

For example, the device file /dev/urandom could be used as the source of random data. Typically, this device gathers environmental noise from device drivers and other sources into an entropy pool, and uses the pool to generate random bits. If the pool is short of data, the device reuses the internal pool to produce more bits, using a cryptographically secure pseudo-random number generator. But be aware that this device is not designed for bulk random data generation and is relatively slow.

/dev/urandom suffices for most practical uses, but applications requiring high-value or long-term protection of private data may require an alternate data source like /dev/random or /dev/arandom. The set of available sources depends on your operating system.

To reproduce the results of an earlier invocation of a command, you can save some random data into a file and then use that file as the random source in earlier and later invocations of the command. Rather than depending on a file, one can generate a reproducible arbitrary amount of pseudo-random data given a seed value, using for example:

get_seeded_random()
{
  seed="$1"
  openssl enc -aes-256-ctr -pass pass:"$seed" -nosalt \
    </dev/zero 2>/dev/null
}

shuf -i1-100 --random-source=<(get_seeded_random 42)