pexec


1 General information

1.1 Name

pexec - execute commands or shell scripts in parallel on a single host or on remote hosts using a remote shell

1.2 Description

This manual page briefly documents the pexec program. pexec executes the given command or shell script (e.g. parsed by /bin/sh) in parallel on the local host or on remote hosts, while some of the execution parameters, namely the redirected standard input, output or error and environmental variables, can be varied.

The given program or script is executed once for each parameter specified on the command line or read from a given parameter file. Each parameter is a simple string which can either be passed to the program/script as the value of an environmental variable or be used in the format of the file names from or to which the standard input, output or error are optionally redirected.

Moreover, more than one shell command script can be passed for parallel execution. In this case, either no parameters are needed or the number of parameters taken from the command line (or read from a parameter file) must be the same as the number of distinct shell command scripts.

The program can automatically swallow the standard output and error (to /dev/null), or collect them via pipes and dump them to the invoker's standard output or error (with optional line headers or trailers which can be used to distinguish between the output of the distinct processes).

The execution on remote hosts is done using a remote shell which builds a tunnel between the invoking and the remote host(s), performs the authentication and ensures the security (if a secure remote shell is used). Hence, there is no need to run standalone daemons on the remote side: the remote shell itself executes the pexec program in daemon mode, with the standard input and output of the latter bound to the remote shell to form a (secure and authenticated) tunnel. See the appropriate section below for a more detailed explanation.

In order to avoid unexpected I/O load or to synchronize individual tasks, pexec supports mutual exclusions (mutexes) and atomic command executions. The maximum number of simultaneous tasks can be controlled by a hypervisor daemon: with such a daemon, concurrent pexec instances can start without an unexpectedly high load.

1.3 Synopsis

General invocation:

pexec [options] [--] command [arguments]

pexec [options] -c [--] script

pexec [options] -m [--] 'script1' ['script2'...]

Remote control, mutual exclusions and atomic command execution:

pexec [-j|--remote] [options]

pexec [-j|--remote] [options] [-l|-u <mutex>]

pexec [-j|--remote] [options] -a -m <mutex> [-c] [--] command [arguments]

Hypervisor daemon:

pexec [-H|--hypervisor] [options] start|stop

2 Command line options

2.1 General options

-h, --help: Gives a general summary of the command line options.
--version: Gives some version information about the program.
-s, --shell <full shell path>: Full path (e.g. /bin/sh) of the shell to be used for script execution.
-c, --shell-command: Use a shell (see -s|--shell also) to interpret the command(s) instead of direct execution.
-m, --multiple-command: Allow multiple individual shell command scripts to be executed in parallel with the variation of the parameters (see notes at -p|--list or -f|--listfile).
-e, --environment <environmental variable name>: Name of an environmental variable which is set to the respective parameter (taken from the command line or the parameter file) before each execution and can be read from the program or shell script. In shells like /bin/bash, such environmental variables can easily be read as if they were normal shell variables (see the examples also).
-n, --number <number of parallel processes> | auto | managed | ncpu: The maximal number of processes running simultaneously. By default (omitting -n|--number or specifying -n|--number auto) the program tries to connect to a local hypervisor which keeps track of the resources of the system (see sec. Hypervisor Mode for more details). If the connection to the local hypervisor fails, the program derives the number of available processing units on the local host using the content of /proc/cpuinfo or another system-specific method (on operating systems with a kernel other than Linux). If the argument of -n|--number is managed, pexec searches only for a hypervisor and terminates with a non-zero exit status if the connection fails. If the argument of -n|--number is ncpu, the program does not try to connect to the hypervisor (even if it is running) but uses the available information (/proc/cpuinfo or another system-specific method) to figure out the number of processing units.
-C, --control [<host>:]<port>|<path>: This option specifies the control port of a hypervisor which is used to control the number of simultaneously running tasks. By default, this is the UNIX domain socket /tmp/pexec.sock. If the specified port is a single number, pexec connects to the given port on the localhost; if a host is specified, the program connects to the given host and port; otherwise, if it is a valid path, pexec connects to that UNIX domain socket. Note that the hypervisor socket does not have a default port number, i.e. the port argument is mandatory after the host name. See sec. Hypervisor Mode for more details.
-p, --list <space separated list of parameters> [-p ...]: The single-argument form of the main parameter list: each switch can accept one command line argument, thus if more than one parameter is defined after a single -p or --list option, the delimiting whitespace should be escaped somehow. Note that if there is more than one shell command to be executed, the total number of parameters should be the same as the number of the individual shell commands or no parameters should be declared.
-r, --parameters <list of parameters up to the next switch> [-r ...]: The multiple-argument form of the main parameter list: each switch can be followed by multiple arguments up to the next switch, i.e. the nearest command line argument which begins with at least a single dash. Note that if any of the arguments contains one or more spaces, that argument will be split at the delimiting whitespace. Note also that the parameter specifications by -p|--list and -r|--parameters can be mixed, depending on the actual problem or convenience. See also some notes below at -f|--listfile.
-f, --listfile <file containing the parameters>: The main parameter list file. The parameters are read line by line, while the parts of the file after a '#' (hashmark) are treated as comments. By default, the parameter is the first whitespace-delimited column of the line if the line is not empty (or fully commented out). The parameters can be gathered from another column (see -w|--column) or the complete line can be treated as a single parameter (see -t|--complete). If the parameters are read from a single column and a parameter has to contain space(s), it can be put between double quotation marks ("..."). Note that if there is more than one shell command to be executed, the total number of parameters should be the same as the number of the individual shell commands or no parameters should be declared. Note also that the parameters can only be defined either from the command line or from this list file, i.e. -f|--listfile and -p|--list|-r|--parameters cannot be mixed.
-w, --column <column index>: The column from which the parameters should be taken if they are read from a parameter file (see -f|--listfile above). If the given column does not exist in the current line, that line will silently be omitted.
-t, --complete: Treat the whole line as a single parameter if the parameters are read from a file. Empty lines and all parts of the line after a '#' (hashmark) are omitted. Note that, contrary to the argument splitting of -r|--parameters (or -p|--list), the content of a line won't be split into distinct parameters even if it contains whitespace.
-z, --nice <nice>: Sets the scheduling priority of pexec and all children (executed processes) to the priority defined by the nice value.
--: A marker after which the command begins. This is optional but useful to limit the parameter list of -r|--parameters or when the command itself begins with a literal '-' (dash) character (the latter is a rare case, as one can expect). This marker can also be used to emphasize the (beginning of the) command itself.
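
The default reading rule of -f|--listfile (drop everything after '#', skip empty lines, take one whitespace-delimited column) can be approximated with a few lines of awk. This is only a sketch of the rule as described above, not pexec's own parser: the helper name list_params and the sample file content are made up for illustration, the optional second argument mimics -w|--column, and parameters in double quotation marks are not handled.

```shell
#!/bin/sh
# list_params FILE [COLUMN]: hypothetical helper approximating how a
# parameter list file is read (comments stripped, empty lines skipped,
# one whitespace-delimited column taken). Quoted parameters are NOT handled.
list_params() {
    awk -v col="${2:-1}" '
        { sub(/#.*/, "") }          # drop the comment part of the line
        NF >= col { print $col }    # silently skip lines lacking that column
    ' "$1"
}

# Demonstration with a temporary list file (made-up content).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
# image list
img001  2008-01-02
img002  2008-01-03   # second night

img003
EOF
list_params "$tmp"      # first column: img001, img002, img003
list_params "$tmp" 2    # second column: 2008-01-02, 2008-01-03
rm -f "$tmp"
```

The line containing only img003 is dropped in the second call, matching the documented behaviour of -w|--column for lines in which the requested column does not exist.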

2.2 Redirecting standard input, output and error

-i, --input <input file format>: The name of the input file which is used for redirecting the standard input. If this argument is omitted, the standard input will be empty (i.e. /dev/null) unless the other two standard file redirection specifications are also omitted and there is not more than one parameter; in this case the standard streams are inherited by this single executed process. If the input file argument is a single existing file, all command execution processes will use the same file for input. If the argument contains the format elements %s and %d, these are replaced by the respective parameter name or the sequence number of the parameter (which is between 1 and the total number of parameters).
-o, --output <output file format>: The name of the output file which is used for redirecting the standard output. If this argument is omitted, the standard output will be swallowed (unless the other two redirections are also omitted, see -i|--input for this case). If the argument is a single file, all command execution processes write their output to this single file. If the argument is a single dash or -1, all of the standard outputs are gathered to the invoker's standard output. If the argument is -2, the standard outputs are gathered to the standard error of the invoker. If the argument contains the format elements %s or %d, these are replaced by the respective parameter name or sequence number and the standard output will be a different file for each process. Note that in the second case, when the output file is a single, non-formatted file name, the outputs are collected via pipes and there is no guarantee for subsequent data order; even the outputs of different processes can be mixed (moreover, if the output is ASCII text, parts of lines can also be mixed). This means that the processes will see their standard outputs as pipes, not as regular files. Note also that if only %s is used in the formatted file name, the parameter list should contain unique parameters; otherwise some of the output files will be lost and/or written in parallel, yielding unexpected results.
-u, --error, --output-error <output error file format>: The name of the output file which is used for redirecting the standard error. All of the properties of the standard error redirection mechanism are the same as for the standard output, see -o|--output above for a more detailed explanation. The only exception is the single dash: specifying a single dash to -u|--error causes the standard errors to be collected to the standard error of the invoker. To redirect the errors to the standard output, use -1 as an argument.
-R, --normal-redirection: This is equivalent to specifying --output - and --error - and --input /dev/null. Since redirecting the same standard input to all of the executed commands is nearly meaningless in a parallel environment, this argument implies the expected behaviour, i.e. the standard output and error streams of the commands are gathered to the invoker's standard output.
-a, --output-format <output line format>: The format of the final standard output redirection if the output of all of the processes is gathered into the same file. The format can contain any character, while the %s and %d format elements are replaced as described at -i|--input. The line itself without the trailing newline character is represented by %l. Extra characters (e.g. tabulators, newlines) can also be inserted using the well-known escape sequences. Note that the trailing newline is always set implicitly unless it is disabled by -x|--omit-newlines. The line buffering yielded by the simple format of %l can also be useful if all of the standard outputs (or errors) are collected in a single file and the invoker wants to avoid the inter-line confusion of output (i.e. if this redirection formatting is omitted, no line buffering is done at all).
The printf-like alignment syntax can also be used with %s, %l and %d (both in the post-formatting and in the redirection file name formats): the number before the period indicates the minimum size and its sign refers to the alignment (positive: right alignment, negative: left alignment), while the number after the period indicates the minimum number of digits (padded with zeroes) for numerical values. E.g., %5.3d would yield " -042" for -42.
-b, --error-format <error line format>: The same final redirection format for the standard error. See -a|--output-format for more details.
-x, --omit-newlines: If the final redirection of the standard output or error is re-formatted using -a|--output-format and/or -b|--error-format, the trailing newlines are disabled and only written if specified directly using '\n'.
Note that when no redirection is specified and there is at most one parameter, the executed process will inherit the standard files directly from the invoker. Otherwise, if there is more than one parameter in the list, the redirection will be defined by the -i|--input, -o|--output and -u|--error options. This means that if one of these options is omitted, the respective standard stream will be redirected from/to /dev/null. In other words, if any of these redirection options is specified, the latter rules will define the redirection, independently of the number of parameters.
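
The printf-like alignment syntax mentioned at -a|--output-format can be tried out with the shell's own printf utility. This is only an illustration, assuming that pexec follows the usual printf width and precision conventions as the text suggests; the square brackets and the sample parameter img001 are made up to make the field edges visible.

```shell
#!/bin/sh
# Width/precision syntax as understood by printf(1); brackets mark the field.
printf '[%5.3d]\n' -42      # prints "[ -042]": 3 digits minimum, width 5
printf '[%8s]\n' img001     # prints "[  img001]": right aligned, width 8
printf '[%-8s]\n' img001    # prints "[img001  ]": left aligned, width 8
```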

2.3 Execution using remote hosts

Execution on remote hosts requires the following on each remote host:

an appropriate daemon for the remote shell to connect to;
the appropriate version of pexec which is started in daemon mode;
and the same file structure if the executed command relies on some files (see below the option -k|--local-files for more details about this issue).

The same file structure, which might be required by the script, can be ensured by NFS or other types of network filesystem mounts. Note that this is crucial since it is not determined which parameter is executed on which host, i.e. the executed script and the underlying filesystem should ensure the same result on every single host; otherwise one can get unexpected results.

-g, --remote-shell "<remote_shell> [<arguments>]": The name or full path of the remote shell to be used for building the tunnel between the local and the peer host(s). The default remote shell is /usr/bin/ssh with no extra arguments. Note that if additional arguments are defined for the remote shell, the whole argument of the switch should be escaped somehow (e.g. put between quotation marks). The connection and authentication are performed sequentially before executing anything and only once for each host: if the authentication requires interactivity (e.g. typing a password), it is also done before the whole procedure starts.
-n, --number <hostspec>:[<processes>],...,[<processes>]: This more sophisticated form of the -n|--number option is used to specify a comma-separated list of names and expected capacities of the remote hosts used for parallel execution. The hostspec argument is the host specification argument passed directly to the remote shell, which should be able to understand it. In most cases, it is simply the name of the peer machine; in the case of ssh, the username can also be passed using the well-known username@hostname form. The host specification must always be followed by a literal colon (':'). Optionally, the maximum number of processes to run on that host can follow the colon. If it is omitted, the maximum number of processes is determined on the peer side automatically (yielding the same number of processes as determined by --number auto, see above); moreover, the literal auto, managed and ncpu arguments can also be used, as in the case of local host parallelization. The number of processes executed on the invoker's host can simply be specified by a single positive number, or by one of the keywords auto, managed and ncpu (just as in the case of simple local host execution).
Note that the host specifications are additive, i.e. if the same machine (including the local one, when the hostspec and its colon are omitted) is defined more than once, the maximum numbers of processes are added. This can yield unexpected results if the number is omitted after the colon, i.e. it is determined automatically: in this case the automatically determined maximum numbers of processes are also added, yielding a large load.
-k, --local-files [TBD]: If this option is enabled, the remote daemons will read or store the redirected standard input, output and error files on the local side. Otherwise (by default), these files are read from/stored to the invoker's host and tunneled to/from the peer.
Note that if the redirected files are parameter-specific and tunneled to/from the remote hosts, then these files 1/ on the invoker's host are seen as regular files; 2/ on the remote hosts are seen as pipes. Otherwise, if the redirection is done from/to a single file, both the local and remote hosts will see their standard outputs and errors as pipes but the standard input is still a regular file on the local side and a pipe on the remote side.
-P, --pexec <pexec-path>: The full path of the pexec program on the remote hosts. If this option is omitted, the invoker tries to figure it out from the invoking syntax (see argv[0]) and the current path. This issue can be a bottleneck if the program is installed differently on the hosts, since the remote shells execute their commands in non-interactive and/or non-login modes, which might result in different paths.
-T, --tunnel: If this option is the first in the positional argument list of pexec, the program will start in tunnel daemon mode. This parameter is not used during the regular usage but used by pexec itself to start daemons via the remote shell tunnels.
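
The additive rule described at the host form of -n|--number can be illustrated by totalling a specification by hand. The following is a sketch only, not part of pexec: the helper name sum_hostspec and the host names node1 and node2 are hypothetical, and entries whose count is omitted or given as a keyword are merely reported, since their value is only known at run time.

```shell
#!/bin/sh
# sum_hostspec SPEC: hypothetical helper that totals the numeric process
# counts per host in a comma-separated -n|--number style specification.
# Entries without a colon count for the local host; entries with an omitted
# count or a keyword (auto, managed, ncpu) are reported but not summed.
sum_hostspec() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -F: '
        NF == 1 { host = "(local)"; n = $1 }
        NF >= 2 { host = $1;        n = $2 }
        n !~ /^[0-9]+$/ { print host, "unresolved:", (n == "" ? "auto" : n); next }
        { if (!(host in total)) order[++k] = host; total[host] += n }
        END { for (i = 1; i <= k; i++) print order[i], total[order[i]] }
    '
}

sum_hostspec "node1:4,node2:2,node1:3,2"
# prints:
#   node1 7
#   node2 2
#   (local) 2
```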

2.4 Remote control, mutual exclusions and atomic command execution

Running instances of pexec can be controlled remotely to gather some status information about the parallel execution and to implement mutual exclusions.

-y, --bind inet|unix|<port>|/<path>: This option lets pexec be remotely controlled via internet (AF_INET family sockets) or via UNIX domain sockets. If the literal inet or unix is specified as an argument for this switch, the port or the path of the named socket will be assigned randomly; but both of them can be specified directly by a single integer number (referring to an INET port) or by an absolute path (beginning with a literal slash, referring to a UNIX domain named socket). In all cases, the currently assigned port or path will be reported in the logs and will be exported as an environmental variable with the name of PEXEC_REMOTE_PORT (by default, see also -E|--pexec-connection-variable). This environmental variable is inherited by all processes executed on the local host; it is also tunneled to the remote hosts and inherited by all of the processes executed by pexec daemons.
-E, --pexec-connection-variable <environment_variable_name>: This option overrides the default environment variable name PEXEC_REMOTE_PORT with the specified value, which is used by the -p|--connect auto combination to determine the control socket with which the running pexec instance can be controlled or polled. Note that in practice there is no need to change this variable since separate pexec jobs use different environment spaces (i.e. a process which changes an environment variable affects the variables of its children only).
-j, --remote: If this option is the first in the argument list of pexec, then the program can be used to control and poll the status of other running instances of pexec, provided these were started with remote control enabled by -y|--bind (see above).
-p, --connect auto|[<host>:]<port>|/<path>: With this option one can specify the pexec instance which is to be remotely controlled. Since connecting to something is mandatory for doing remote control, omitting this option is equivalent to --connect auto and in this case the program gets the remote port information using the environmental variable PEXEC_REMOTE_PORT (by default, see also -E|--pexec-connection-variable). If this environmental variable does not exist, the connection will fail and pexec exits with an error.
The pexec instance to be remotely controlled can also be specified directly by specifying either the INET host and port (in this case if host is omitted, localhost is used as default but the port number is mandatory since there is no default port) or the absolute path of the UNIX domain socket. Note that in most shells --connect auto is equivalent to --connect $PEXEC_REMOTE_PORT (by default, see also -E|--pexec-connection-variable) since the environmental variables can be referred to as normal shell variables.
-t, --status: This option prints the actual status of the running jobs in a human-readable form to the invoker's standard output. It can be used for polling the progress of the whole parallel execution.
-l, --lock, --mutex-lock <mutex-name>
-u, --unlock, --mutex-unlock <mutex-name>: With these options mutual exclusions (mutexes) can easily be implemented: if no one else holds the same lock, the program exits immediately, otherwise it blocks until someone else releases the mutex (i.e. by calling -u|--unlock|--mutex-unlock with the mutex of the same name).
-m, --mutex <mutex-name>
-d, --dump <filename> | -s, --save <filename>: If one of the -d|--dump or -s|--save options is specified, the program prints the content of the file to standard output or stores the data read from the standard input to the specified file, respectively (like cat or tee, with the difference that -s|--save does not copy the content to standard output as tee does). If a mutex is specified by -m|--mutex, pexec locks the mutex before the dump/save operation and unlocks it after the operation is done, i.e. pexec -j -d something.txt -m mymutex is equivalent to ( pexec -j -l mymutex && cat something.txt && pexec -j -u mymutex ).
-m, --mutex <mutex-name>
-a, --atomic [-c|--shell-command] [--] <command>: With this option, an atomic execution of the command with respect to a given mutex (specified by -m|--mutex) can be performed. For example, pexec -j -m mymutex -a cat something.txt is equivalent to ( pexec -j -l mymutex && cat something.txt && pexec -j -u mymutex ).

Note that if the lock and unlock operations are in the same pipeline and these operations use the same mutex, the invoker should ensure that the locking call exits before the unlock request can start (otherwise the whole parallel execution blocks forever). Likewise, if a dump and a save operation with the same mutex are present in the same pipeline, the intermediate programs should delay the data propagation: the save part must not get any data until the dump part has flushed everything to its standard output (otherwise even this single pipeline blocks forever).

Note also that the whole remote controlling procedure is transparent to the remote host execution, i.e. every necessary parameter, environmental variable and mutex lock/unlock request will propagate via the remote shell tunnel. Therefore the end-user won't see any difference between purely local and remote execution (and does not have to bother with these details in the final invocation).

2.5 Hypervisor mode

The program pexec can also run in hypervisor mode. The hypervisor daemon acts as a resource controller, i.e. other running instances of pexec ask the hypervisor whether resources are available. The main purpose of the hypervisor daemon is to balance the usage between concurrently running pexec instances in order to avoid unexpectedly high load.

-H, --hypervisor [start|stop]: This option starts pexec in hypervisor mode. By default, the hypervisor is not detached from the terminal. If start is given, the daemon is detached and put into the background. Such running daemons can be stopped using the stop argument.
-C, --control <port>|/<path>: This option specifies the control port used by the hypervisor to listen for connections. By default, this is the UNIX domain socket /tmp/pexec.sock. If the specified port is a single number, pexec creates an INET server socket; otherwise, if it is a valid path, a UNIX domain socket is created.
-n, --number <number of parallel processes> | auto | ncpu: The maximum number of parallel processes which are allowed to be executed simultaneously by all of the connected pexec programs together. By default, pexec uses /proc/cpuinfo (or another system-dependent way) to figure out the number of available processing units and uses this number as the maximum number of parallel processes.
-l, --load, --use-load <load>: This option limits the number of processes to yield a normalized load less than unity, while keeping the number of simultaneous processes no greater than the value specified by -n|--number. The normalized load is the actual load (averaged over 1, 5 or 15 minutes) divided by the maximum number of parallel processes. The argument of this switch can be 0, 1 or 2, or 1min, 5min or 15min, respectively. The pexec hypervisor uses the specified time-averaged load.
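
The normalized load defined at -l|--load is a simple ratio, which the following sketch computes for made-up numbers. The helper name normalized_load is hypothetical, and the comments about starting or delaying tasks are an interpretation of the "less than unity" rule above rather than a description of the hypervisor's actual scheduling code.

```shell
#!/bin/sh
# normalized_load LOAD MAXPROC: load average divided by the process limit.
# Hypothetical helper; the hypervisor's internal computation may differ.
normalized_load() {
    awk -v load="$1" -v max="$2" 'BEGIN { printf "%.2f\n", load / max }'
}

normalized_load 3.0 4    # prints 0.75 (below unity: room for another task)
normalized_load 6.0 4    # prints 1.50 (above unity: new tasks should wait)
```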

2.6 Logging

-L, --log <log file>: The name of the log file where the details of the parallel executions are written. By default, no such log file is created.
-W, --log-level <log level>: The log level, a non-negative integer. The larger this number is, the more information is written to the log file. If the log level is zero, no log file is created. The default log level is 1. If the log level is specified directly, but -L|--log is omitted, the log will be written to the invoker's standard error (maybe in parallel with other messages gathered by -u|--error -). If neither -L|--log nor -W|--log-level is specified, logging does not occur.
-V, --verbose: Increase the log level (see -W|--log-level) by one. For example, --log-level 2 is equivalent to -V -V.

3 Examples

3.1 Identical execution

If all options are omitted and only the command and its arguments are specified after pexec, the program simply runs the given process, as it would happen without pexec:

	pexec ssh -X -l user host

3.2 Calculate the square root of some numbers

Using directly the output of the command seq, let us calculate the square root of the first ten integer numbers and store the results in separate files. For the calculation itself, we use the program bc, a command-line driven arbitrary precision calculator:

	pexec -o sqrt-%s.dat -p "$(seq 10)" -e NUM -n 4 -c -- \
	  	'echo "scale=10000;sqrt($NUM)" | bc'

Here we explicitly used 4 processors. The number itself is passed to the shell-script via the environmental variable NUM.

3.3 Sort some files

This example sorts some files which match a pre-defined search pattern. The sorted versions of the files are stored in files with the same names but with the suffix .sort appended.

	pexec -p "$(ls myfiles*.ext)" -i %s -o %s.sort -- sort

Here we used explicit redirection from the input files to the output files. Since the command itself is very simple, it is wise to put the dash-dash before the command to make the reading of the whole command easier.
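
To make the role of the %s format element concrete, the same redirections can be written as a plain sequential loop. This is a stand-in without any parallelism, not something pexec generates: the helper name sort_each and the sample file names and contents are made up for the demonstration.

```shell
#!/bin/sh
# sort_each FILE...: sequential stand-in for
#   pexec -p "<files>" -i %s -o %s.sort -- sort
# Each parameter replaces %s in both the input and the output file name.
sort_each() {
    for f in "$@"; do
        sort < "$f" > "$f.sort"    # stdin from %s, stdout to %s.sort
    done
}

# Demonstration in a scratch directory with made-up sample files.
dir=$(mktemp -d)
printf 'pear\napple\n' > "$dir/myfiles1.ext"
printf 'plum\nfig\n'   > "$dir/myfiles2.ext"
sort_each "$dir"/myfiles*.ext
cat "$dir/myfiles1.ext.sort"       # prints apple, then pear
rm -rf "$dir"
```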

3.4 Detect stars on astronomical images

In this example we assume that we have a list file of base names of astronomical images, i.e. the images themselves have the base name followed by the extension *.fits. Since we do not expect many errors, the standard errors are collected in a single file (namely star.log):

	pexec -f image.list -n auto -e B -u star.log -c -- \
	  	'fistar $B.fits -f 100 -F id,x,y,flux -o $B.star'

The program fistar can be tuned further, depending on the actual problem. The base name of the image implies the base name of the star detection output (*.star). Here again an environmental variable, B, was used to pass the varied information, i.e. the base name of the images.

3.5 Convert all PNG images in the current directory to JPEG format

If all of our images (*.png) have a name which does not start with a dash, we can use the -r|--parameters switch too:

	pexec -r *.png -e IMG -c -o - -- \
	  	'convert $IMG ${IMG%.png}.jpeg ; echo "$IMG: done"'

For the conversion the ImageMagick tool convert is used, which simply figures out the format from the extensions. The trailing echo just reports the images which are ready after conversion; these "reports" are collected via the standard output and printed for the invoker due to -o -. Another realization, using the NetPBM package:

	pexec -r *.jpg -i %s -o %s.png -c 'jpegtopnm | pnmtopng'
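
The output name in the convert example is built with the POSIX shell expansion ${IMG%.png}, which removes the shortest matching suffix from the value of IMG. A short demonstration, with made-up file names, shows what each task would compute:

```shell
#!/bin/sh
IMG=holiday.png                 # made-up file name
echo "${IMG%.png}.jpeg"         # prints holiday.jpeg

IMG=archive.png.png
echo "${IMG%.png}.jpeg"         # prints archive.png.jpeg (one suffix removed)

IMG=notes.txt
echo "${IMG%.png}.jpeg"         # prints notes.txt.jpeg (no match, unchanged)
```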

3.6 Rescale all JPEG images in the current directory, using mutexes

In this example a simple usage of mutexes is demonstrated. The usage of mutexes prevents a high (peak-like) disk access if many processes would try to read/write the same disk simultaneously:

	pexec -n 8 -r *.jpg -y unix -e IMG -c \
	 	'pexec -j -m blockread -d $IMG | \
 		 jpegtopnm | pnmscale 0.5 | pnmtojpeg | \
	 	 pexec -j -m blockwrite -s th_$IMG'

In the above example a UNIX domain socket is used for the communication between the main pexec program and the remote control calls.

4 Additional information

4.1 Bugs

The -k|--local-files option is still not implemented.
Currently, when the execution of the process fails (i.e. the program or shell does not exist, access is denied or the like), the child of pexec prints an error to standard error and exits. Therefore, unless the standard error is gathered somehow (see -u|--error), the invoker won't be informed in such cases. And even if it is, the invoker is unable to distinguish this from the case when the successfully executed process directly prints the same message to its standard error. This is not a real bug, but this behavior is planned to be changed in the future.
Other strange cases are when the child process is explicitly terminated, killed or stopped/continued. These cases are handled somehow but should be fine-tunable by the invoker in future releases.
The logging is still rather poor; it should also be improved (esp. to log the above unexpected shutdowns).

4.2 Version

This manual page describes pexec version 1.0rc5.

4.3 Copyright

This software was written by Andras Pal. The core part was written while working for the Hungarian-made Automated Telescope (HAT) project to make the data processing easier and therefore find many extrasolar planets. See more information about this project: http://hatnet.hu. Other internal libraries (e.g. numhash.[ch]) were primarily written for other projects. Send bug reports, comments and remarks to apal@szofi.elte.hu or apal@cfa.harvard.edu.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

This documentation: copyright (C) 2007, 2008; Andras Pal. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".