parsort - Sort (big files) in parallel


parsort options for sort


parsort uses GNU sort to sort in parallel. It works just like sort, but it is faster on inputs of more than 1 M lines if you have a multicore machine.

Hopefully these ideas will make it into GNU sort in the future.
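
The general idea of sorting in parallel can be sketched with GNU coreutils alone: split the input into chunks, sort each chunk as a background job, and merge the sorted chunks. This is only an illustration of the technique, not how parsort is implemented; the function name parallel_sort_sketch and its arguments are hypothetical, and the sketch assumes GNU split (for -n l/N) and GNU sort.

```shell
# Hypothetical helper, not part of parsort: sort FILE using CHUNKS background sorts.
parallel_sort_sketch() {
    local input=$1 chunks=${2:-4}
    local tmpdir f
    tmpdir=$(mktemp -d)
    # Split into $chunks pieces without breaking lines (GNU split).
    split -n "l/$chunks" "$input" "$tmpdir/chunk."
    # Sort every chunk in its own background job, then wait for all of them.
    for f in "$tmpdir"/chunk.*; do
        sort -o "$f.sorted" "$f" &
    done
    wait
    # -m merges inputs that are already sorted, without sorting them again.
    sort -m "$tmpdir"/chunk.*.sorted
    rm -rf "$tmpdir"
}

# Usage: write a small unsorted file and print it sorted.
demo=$(mktemp)
printf '%s\n' pear apple fig banana > "$demo"
parallel_sort_sketch "$demo" 2
rm -f "$demo"
```

The merge step is cheap compared to the sorts, which is why spreading the sorts over several cores can pay off on large inputs.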


Same as sort. Except:

  • --parallel=N

Change the number of sorts run concurrently to N. N will be increased to the number of files if parsort is given more than N files.
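
A small usage sketch for --parallel=N. The file names and the fallback are assumptions made so the example runs anywhere: when parsort is not on PATH, the example falls back to GNU sort, which also accepts --parallel=N.

```shell
# Use parsort when it is installed; otherwise fall back to GNU sort (assumption
# for portability of this example only).
if command -v parsort >/dev/null 2>&1; then
    sorter=parsort
else
    sorter=sort
fi

# Two small example files (hypothetical names).
workdir=$(mktemp -d)
printf '%s\n' delta alpha > "$workdir/a.txt"
printf '%s\n' charlie bravo > "$workdir/b.txt"

# Run at most 2 sorts concurrently over both files.
result=$("$sorter" --parallel=2 "$workdir/a.txt" "$workdir/b.txt")
printf '%s\n' "$result"
rm -rf "$workdir"
```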


Sort files:

parsort *.txt > sorted.txt

Sort stdin (standard input) numerically:

cat numbers | parsort -n > sorted.txt


parsort is faster on files than on stdin (standard input), because different parts of a file can be read in parallel.

On a 48-core machine you should see a speedup of about 3x over sort.


Copyright (C) 2020-2024 Ole Tange, and Free Software Foundation, Inc.


Copyright (C) 2012 Free Software Foundation, Inc.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or at your option any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <>.


parsort uses sort, bash, and parallel.



    sub shell_quote_scalar($) {
        # Speed optimization: Choose the correct shell_quote_scalar_*
        # and call that directly from now on
        no warnings 'redefine';
        if($Global::cshell) {
            # (t)csh
            *shell_quote_scalar = \&shell_quote_scalar_csh;
        } elsif($Global::shell =~ m:(^|/)rc$:) {
            # rc-shell
            *shell_quote_scalar = \&shell_quote_scalar_rc;
        } else {
            # other shells
            *shell_quote_scalar = \&shell_quote_scalar_default;
        }
        # The sub is now redefined. Call it
        return shell_quote_scalar($_[0]);
    }

    sub Q($) {
        # Q alias for ::shell_quote_scalar
        my $ret = shell_quote_scalar($_[0]);
        no warnings 'redefine';
        *Q = \&::shell_quote_scalar;
        return $ret;
    }

    sub status(@) {
        # Print each argument on its own line to the status filehandle
        my @w = @_;
        my $fh = $Global::status_fd || *STDERR;
        print $fh map { ($_, "\n") } @w;
        flush $fh;
    }

    sub status_no_nl(@) {
        # Print the arguments as-is, without adding newlines
        my @w = @_;
        my $fh = $Global::status_fd || *STDERR;
        print $fh @w;
        flush $fh;
    }

    sub warning(@) {
        my @w = @_;
        my $prog = $Global::progname || "parsort";
        status_no_nl(map { ($prog, ": Warning: ", $_, "\n"); } @w);
    }


    {
        my %warnings;
        sub warning_once(@) {
            # Print each distinct warning only the first time it is seen
            my @w = @_;
            my $prog = $Global::progname || "parsort";
            $warnings{@w}++ or
                status_no_nl(map { ($prog, ": Warning: ", $_, "\n"); } @w);
        }
    }

    sub error(@) {
        my @w = @_;
        my $prog = $Global::progname || "parsort";
        status(map { ($prog.": Error: ". $_); } @w);
    }

    sub die_bug($) {
        my $bugid = shift;
        print STDERR
            ("$Global::progname: This should not happen. You have found a bug. ",
             "Please follow\n",
             "\n",
             "\n",
             "Include this in the report:\n",
             "* The version number: $Global::version\n",
             "* The bugid: $bugid\n",
             "* The command line being run\n",
             "* The files being read (put the files on a webserver if they are big)\n",
             "\n",
             "If you get the error on smaller/fewer files, please include those instead.\n");
        exit(255);
    }

    if(@ARGV) {
        sort_files(@ARGV);
    } elsif(length $opt::files0_from) {
        open(my $fh,"<",$opt::files0_from) || die;
        my @files = <$fh>;
        sort_files(@files);
    } else {
        sort_stdin();
    }