History of GNU Parallel


GNU parallel was originally two tools: xxargs and parallel.

xxargs was a tool that had the most basic features of xargs, namely -0, -n, -x, and {}. It also had -1, which is called --halt-on-error 1 in GNU parallel. xxargs was developed during 2002-2005 and was based on work by Cameron Simpson. Cameron sums up the reason for developing xxargs very well in a posting from 2001-12-20:

Having pretty much had it with xargs, which is a busted piece of crap due to its quoting/whitespace problems, here is a less featured but more robust one. - Cameron Simpson <cs@zip.com.au>

parallel dates back to around the same time. It was originally a wrapper that generated a makefile and used make -j to do the parallelization. The only usage was: cat commandlines | parallel number, where number was the number of jobs passed to make -j.

The full(!) source code of the very first version of Parallel dated 2002-01-06:

  #!/usr/bin/perl

  $processes=shift;

  chomp(@jobs=<>);
  for (@jobs) {
      $jobnr++;
      push @makefile,
      (".PHONY : job$jobnr\n",
       "job$jobnr :\n",
       "\t$_\n");
  }
  unshift @makefile, "all : ",(map { "job$_ " } 1 .. $jobnr),"\n";
  
  open (MAKE, "| make -k -f - -j $processes") || die;
  print MAKE @makefile;
  close MAKE;

It did not support grouping of output or running jobs on multiple computers, and the most common reason for bugs was quoting the string incorrectly in the Makefile. It also stressed make's dependency calculation: at this time make's initialization would take O(n^2) time on n independent targets, which was a real problem if you had 1000000 commandlines to run. A later version chopped the commandlines up into blocks of 5000 lines to minimize the problem.
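The blocking idea can be sketched as follows. This is not the source of that later version, which is not reproduced here; it is a minimal reconstruction assuming the same makefile layout as the first version. The helper names blocks and makefile_for_block are hypothetical. If initialization really costs O(n^2) in the number of targets, then running n commandlines in blocks of b targets costs roughly (n/b) * b^2 = n*b, which is linear in n for a fixed block size.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a list of command lines into blocks of at
# most $size entries each (the history mentions blocks of 5000 lines).
sub blocks {
    my ($size, @jobs) = @_;
    my @blocks;
    push @blocks, [ splice(@jobs, 0, $size) ] while @jobs;
    return @blocks;
}

# Hypothetical helper: build the makefile text for one block, using the
# same layout as the first version: one phony target per command line.
# As in the original, no quoting of the command line is attempted.
sub makefile_for_block {
    my ($first_jobnr, @jobs) = @_;
    my $jobnr = $first_jobnr;
    my (@targets, @rules);
    for my $cmd (@jobs) {
        push @targets, "job$jobnr";
        push @rules, ".PHONY : job$jobnr\n", "job$jobnr :\n", "\t$cmd\n";
        $jobnr++;
    }
    return join("", "all : @targets\n", @rules);
}

# Demo with a block size of 2 instead of 5000: print one makefile per
# block rather than piping it to make.
my @demo_jobs = ("echo a", "echo b", "echo c");
my $next_jobnr = 1;
for my $block (blocks(2, @demo_jobs)) {
    print makefile_for_block($next_jobnr, @$block), "\n";
    $next_jobnr += @$block;
}
```

To actually run the jobs, each block's text could be piped to make -k -f - -j $processes, as the first version did with its single makefile. One consequence of this sketch is that blocks run one after another, so jobs are only parallelized within a block.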

In 2005 it occurred to me that I often used xxargs and parallel together and I saw no reason why they could not merge into a single tool. The single tool was called parallel and the development of xxargs stopped. At this point both xxargs and parallel were used in production.

GNU parallel thus did not have one but two main objectives: replace xargs and run commands in parallel.

In the years after 2005 I noticed that other people still had problems with xargs' treatment of quotes and whitespace, so I tried getting parallel accepted into GNU findutils. It was not accepted as it was written in Perl and the team did not want GNU findutils to depend on Perl.

In 2007 unit tests were added to parallel to make sure old problems would not creep back in while developing new features. The source code was moved to a git repository hosted on savannah.nongnu.org.

In February 2009 I tried getting parallel added to the package moreutils. The author never replied to the email or the two reminders, but in June 2009 moreutils chose to add another program called parallel. This choice leads to some confusion even today.

In 2010 parallel was adopted as an official GNU tool and the name was changed to GNU parallel. As GNU already had a tool for running jobs on remote computers (called pexec) it was a hard decision to include GNU parallel as well. I believe the decision was mostly based on GNU parallel having a more familiar user interface - behaving very much like xargs. Shortly after the release as a GNU tool, remote execution was added, along with all the xargs options that were still missing, making it possible to use GNU parallel as a drop-in replacement for xargs.

Copenhagen, 2010-11-20, Ole Tange, Author of GNU Parallel (Source code added 2021-03-06).