2.6.1 Line Group Formats

Line group formats let you specify formats suitable for many applications that allow if-then-else input, including programming languages and text formatting languages. A line group format specifies the output format for a contiguous group of similar lines.

For example, the following command compares the TeX files old and new, and outputs a merged file in which old regions are surrounded by ‘\begin{em}’-‘\end{em}’ lines, and new regions are surrounded by ‘\begin{bf}’-‘\end{bf}’ lines.

diff \
   --old-group-format='\begin{em}
%<\end{em}
' \
   --new-group-format='\begin{bf}
%>\end{bf}
' \
   old new

The following command is equivalent to the above example, but it is a little more verbose, because it spells out the default line group formats.

diff \
   --old-group-format='\begin{em}
%<\end{em}
' \
   --new-group-format='\begin{bf}
%>\end{bf}
' \
   --unchanged-group-format='%=' \
   --changed-group-format='\begin{em}
%<\end{em}
\begin{bf}
%>\end{bf}
' \
   old new
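
As a rough check of the equivalence described above, the following sketch runs both forms on two small sample files and compares the results. The file contents, the temporary directory, and the shell variable names (tmp, old_fmt, new_fmt, short, verbose) are assumptions made for illustration; they are not part of the original examples.

```shell
# Hypothetical inputs: 'beta' is changed to 'BETA', 'delta' is deleted,
# and 'zeta' is added.
tmp=$(mktemp -d)
printf 'alpha\nbeta\ngamma\ndelta\nepsilon\n' > "$tmp/old"
printf 'alpha\nBETA\ngamma\nepsilon\nzeta\n' > "$tmp/new"

# The two group formats from the examples above, kept in variables.
old_fmt='\begin{em}
%<\end{em}
'
new_fmt='\begin{bf}
%>\end{bf}
'

# diff exits with status 1 when the files differ, hence '|| true'.
short=$(diff \
   --old-group-format="$old_fmt" \
   --new-group-format="$new_fmt" \
   "$tmp/old" "$tmp/new" || true)

# Same command with the default formats spelled out.
verbose=$(diff \
   --old-group-format="$old_fmt" \
   --new-group-format="$new_fmt" \
   --unchanged-group-format='%=' \
   --changed-group-format="$old_fmt$new_fmt" \
   "$tmp/old" "$tmp/new" || true)

printf '%s\n' "$short"
if [ "$short" = "$verbose" ]; then echo "outputs match"; fi
```

With these inputs, the changed region should appear as an ‘em’ block containing beta followed by a ‘bf’ block containing BETA, while delta and zeta each get a single block of their own.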

Here is a more advanced example, which outputs a diff listing with headers containing line numbers in a “plain English” style.

diff \
   --unchanged-group-format='' \
   --old-group-format='-------- %dn line%(n=1?:s) deleted at %df:
%<' \
   --new-group-format='-------- %dN line%(N=1?:s) added after %de:
%>' \
   --changed-group-format='-------- %dn line%(n=1?:s) changed at %df:
%<-------- to:
%>' \
   old new
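
To make the headers concrete, here is a small sketch that runs the command above on two sample files. The inputs and the variable names tmp and listing are assumptions chosen so that one line is deleted, one line is changed, and two lines are added.

```shell
# Hypothetical inputs: 'b' is deleted, 'd' becomes 'D', 'f' and 'g' are added.
tmp=$(mktemp -d)
printf 'a\nb\nc\nd\ne\n' > "$tmp/old"
printf 'a\nc\nD\ne\nf\ng\n' > "$tmp/new"

# diff exits with status 1 when the files differ, hence '|| true'.
listing=$(diff \
   --unchanged-group-format='' \
   --old-group-format='-------- %dn line%(n=1?:s) deleted at %df:
%<' \
   --new-group-format='-------- %dN line%(N=1?:s) added after %de:
%>' \
   --changed-group-format='-------- %dn line%(n=1?:s) changed at %df:
%<-------- to:
%>' \
   "$tmp/old" "$tmp/new" || true)

printf '%s\n' "$listing"
```

For these inputs the headers should read ‘1 line deleted at 2:’, ‘1 line changed at 4:’, and ‘2 lines added after 5:’, each followed by the affected lines. Note the plural ‘lines’ in the last header, which comes from the ‘%(N=1?:s)’ conditional.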

To specify a line group format, use diff with one of the options listed below. You can specify up to four line group formats, one for each kind of line group. You should quote format, because it typically contains shell metacharacters.

--old-group-format=format

These line groups are hunks containing only lines from the first file. The default old group format is the same as the changed group format if it is specified; otherwise it is a format that outputs the line group as-is.

--new-group-format=format

These line groups are hunks containing only lines from the second file. The default new group format is the same as the changed group format if it is specified; otherwise it is a format that outputs the line group as-is.

--changed-group-format=format

These line groups are hunks containing lines from both files. The default changed group format is the concatenation of the old and new group formats.

--unchanged-group-format=format

These line groups contain lines common to both files. The default unchanged group format is a format that outputs the line group as-is.
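
The defaults described above interact: when only the changed group format is given, old-only and new-only groups use it as well. The following sketch illustrates this with two sample files; the ‘@@’ markers, the file contents, and the variable names tmp and out are arbitrary choices made for this illustration.

```shell
# Hypothetical inputs: 'two' exists only in old, 'four' only in new,
# so there is one old group and one new group but no changed group.
tmp=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmp/old"
printf 'one\nthree\nfour\n' > "$tmp/new"

# Only the changed group format is specified; the old and new group
# formats default to it. diff exits with 1 on differences, hence '|| true'.
out=$(diff \
   --changed-group-format='@@ old:
%<@@ new:
%>@@ end
' \
   "$tmp/old" "$tmp/new" || true)

printf '%s\n' "$out"
```

Both groups should be printed with all three markers. In the old group nothing appears between ‘@@ new:’ and ‘@@ end’ because ‘%>’ expands to no lines there, and likewise for ‘%<’ in the new group.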

In a line group format, ordinary characters represent themselves; conversion specifications start with ‘%’ and have one of the following forms.

%<

stands for the lines from the first file, including the trailing newline. Each line is formatted according to the old line format (see Line Formats).

%>

stands for the lines from the second file, including the trailing newline. Each line is formatted according to the new line format.

%=

stands for the lines common to both files, including the trailing newline. Each line is formatted according to the unchanged line format.

%%

stands for ‘%’.

%c'C'

where C is a single character, stands for C. C may not be a backslash or an apostrophe. For example, ‘%c':'’ stands for a colon, even inside the then-part of an if-then-else format, which a colon would normally terminate.

%c'\O'

where O is a string of 1, 2, or 3 octal digits, stands for the character with octal code O. For example, ‘%c'\0'’ stands for a null character.

Fn

where F is a printf conversion specification and n is one of the following letters, stands for n’s value formatted with F.

e

The line number of the line just before the group in the old file.

f

The line number of the first line in the group in the old file; equals e + 1.

l

The line number of the last line in the group in the old file.

m

The line number of the line just after the group in the old file; equals l + 1.

n

The number of lines in the group in the old file; equals l - f + 1.

E, F, L, M, N

Likewise, for lines in the new file.

The printf conversion specification can be ‘%d’, ‘%o’, ‘%x’, or ‘%X’, specifying decimal, octal, lower case hexadecimal, or upper case hexadecimal output respectively. After the ‘%’, the following options can appear in sequence: a series of zero or more flags; an integer specifying the minimum field width; and a period followed by an optional integer specifying the minimum number of digits. The flags are ‘-’ for left-justification, ‘'’ for separating the digits into groups as specified by the LC_NUMERIC locale category, and ‘0’ for padding with zeros instead of spaces. For example, ‘%5dN’ prints the number of new lines in the group in a field of width 5 characters, using the printf format "%5d".

%(A=B?T:E)

If A equals B then T else E. A and B are each either a decimal constant or a single letter interpreted as above. This format spec is equivalent to T if A’s value equals B’s; otherwise it is equivalent to E.

For example, ‘%(N=0?no:%dN) line%(N=1?:s)’ is equivalent to ‘no lines’ if N (the number of lines in the group in the new file) is 0, to ‘1 line’ if N is 1, and to ‘%dN lines’ otherwise.
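
The conversion specifications above can be combined in a single format. The following sketch prints a one-line summary per hunk, using a zero-padded ‘%03de’ for the old line number just before the group and the if-then-else form for the line counts. The sample files, the summary wording, and the variable names tmp and report are assumptions made for illustration.

```shell
# Hypothetical inputs: 'two' is deleted, 'four' and 'five' are added.
tmp=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmp/old"
printf 'one\nthree\nfour\nfive\n' > "$tmp/new"

# Unchanged lines are suppressed. Only the changed group format is given,
# so old-only and new-only groups default to it as well.
# diff exits with status 1 when the files differ, hence '|| true'.
report=$(diff \
   --unchanged-group-format='' \
   --changed-group-format='[%03de] %(n=0?no:%dn) line%(n=1?:s) removed, %(N=0?no:%dN) line%(N=1?:s) added
' \
   "$tmp/old" "$tmp/new" || true)

printf '%s\n' "$report"
```

With these inputs the report should contain one line for the deletion after old line 1 and one for the addition after old line 3, with ‘no lines’ printed wherever a count is zero.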