8.3.7 String manipulation in M4

The following macros may be used to manipulate strings in M4. Many of the macros in this section intentionally result in quoted strings as output, rather than subjecting the arguments to further expansions. As a result, if you are manipulating text that contains active M4 characters, the arguments are passed with single quoting rather than double.

— Macro: m4_append (macro-name, string, [separator])
— Macro: m4_append_uniq (macro-name, string, [separator], [if-uniq], [if-duplicate])

Redefine macro-name to its former contents with separator and string added at the end. If macro-name was undefined before (but not if it was defined but empty), then no separator is added. As of Autoconf 2.62, neither string nor separator is expanded while this macro runs; instead, they are expanded when macro-name is invoked.

m4_append can be used to grow strings, and m4_append_uniq to grow strings without duplicating substrings. Additionally, m4_append_uniq takes two optional parameters as of Autoconf 2.62; if-uniq is expanded if string was appended, and if-duplicate is expanded if string was already present. Also, m4_append_uniq warns if separator is not empty, but occurs within string, since that can lead to duplicates.

Note that m4_append can scale linearly in the length of the final string, depending on the quality of the underlying M4 implementation, while m4_append_uniq has an inherent quadratic scaling factor. If an algorithm can tolerate duplicates in the final string, use the former for speed. If duplicates must be avoided, consider using m4_set_add instead (see Set manipulation Macros).

          m4_define([active], [ACTIVE])dnl
          m4_append([sentence], [This is an])dnl
          m4_append([sentence], [ active ])dnl
          m4_append([sentence], [symbol.])dnl
          sentence
          =>This is an ACTIVE symbol.
          m4_undefine([active])dnl
          sentence
          =>This is an active symbol.
          m4_append_uniq([list], [one], [, ], [new], [existing])
          =>new
          m4_append_uniq([list], [one], [, ], [new], [existing])
          =>existing
          m4_append_uniq([list], [two], [, ], [new], [existing])
          =>new
          m4_append_uniq([list], [three], [, ], [new], [existing])
          =>new
          m4_append_uniq([list], [two], [, ], [new], [existing])
          =>existing
          list
          =>one, two, three
          m4_dquote(list)
          =>[one],[two],[three]
          m4_append([list2], [one], [[, ]])dnl
          m4_append_uniq([list2], [two], [[, ]])dnl
          m4_append([list2], [three], [[, ]])dnl
          list2
          =>one, two, three
          m4_dquote(list2)
          =>[one, two, three]
     
— Macro: m4_append_uniq_w (macro-name, strings)

This macro was introduced in Autoconf 2.62. It is similar to m4_append_uniq, but treats strings as a whitespace-separated list of words to append, and only appends unique words. macro-name is updated with a single space between new words.

          m4_append_uniq_w([numbers], [1 1 2])dnl
          m4_append_uniq_w([numbers], [ 2 3 ])dnl
          numbers
          =>1 2 3
     
— Macro: m4_combine ([separator], prefix-list, [infix], suffix-1, [suffix-2], ...)

This macro produces a quoted string containing the pairwise combination of every element of the quoted, comma-separated prefix-list, and every element from the suffix arguments. Each pairwise combination is joined with infix in the middle, and successive pairs are joined by separator. No expansion occurs on any of the arguments. No output occurs if either the prefix or suffix list is empty, but the lists can contain empty elements.

          m4_define([a], [oops])dnl
          m4_combine([, ], [[a], [b], [c]], [-], [1], [2], [3])
          =>a-1, a-2, a-3, b-1, b-2, b-3, c-1, c-2, c-3
          m4_combine([, ], [[a], [b]], [-])
          =>
          m4_combine([, ], [[a], [b]], [-], [])
          =>a-, b-
          m4_combine([, ], [], [-], [1], [2])
          =>
          m4_combine([, ], [[]], [-], [1], [2])
          =>-1, -2
     
— Macro: m4_flatten (string)

Flatten string into a single line. Delete all backslash-newline pairs, and replace all remaining newlines with a space. The result is still a quoted string.
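
As a minimal sketch of both rules, the input below is built with m4_newline, left unquoted so that it expands to a newline while the argument is collected, rather than typed across several lines. The words are illustrative only. In the second call, [one\] is simply a quoted string that ends in a backslash, so the argument contains a backslash-newline pair:

          m4_flatten([one]m4_newline[two])
          =>one two
          m4_flatten([one\]m4_newline[two])
          =>onetwo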

— Macro: m4_join ([separator], args...)
— Macro: m4_joinall ([separator], args...)

Concatenate each arg, separated by separator. m4_joinall uses every argument, while m4_join omits empty arguments so that there are no back-to-back separators in the output. The result is a quoted string.

          m4_define([active], [ACTIVE])dnl
          m4_join([|], [one], [], [active], [two])
          =>one|active|two
          m4_joinall([|], [one], [], [active], [two])
          =>one||active|two
     

Note that if all you intend to do is join args with commas between them, to form a quoted list suitable for m4_foreach, it is more efficient to use m4_dquote.

— Macro: m4_newline

This macro was introduced in Autoconf 2.62, and expands to a newline. It is primarily useful for maintaining macro formatting, and ensuring that M4 does not discard leading whitespace during argument collection.
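
For instance, a definition can stay on one source line while still expanding to two lines of output. This is only a sketch, and greeting is a hypothetical macro name chosen for illustration:

          m4_define([greeting], [hello[]m4_newline[]world])dnl
          greeting
          =>hello
          =>world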

— Macro: m4_normalize (string)

Remove leading and trailing spaces and tabs, delete sequences of backslash-then-newline, and replace each run of spaces, tabs, and newlines with a single space. This is a combination of m4_flatten and m4_strip.
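
A small sketch of the combined effect, using illustrative input. The angle brackets are literal text added only to make the edges of the result visible, and the unquoted m4_newline supplies the embedded newline:

          <m4_normalize([  one  ]m4_newline[  two   three  ])>
          =><one two three>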

— Macro: m4_re_escape (string)

Backslash-escape all characters in string that are active in regexps.
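
As a brief illustration, assuming GNU M4 regular expression syntax (in which `.' and `+' are operators) and an arbitrary version-like string:

          m4_re_escape([1.0+beta])
          =>1\.0\+beta

The escaped result can then be used as a regular expression that matches only the literal text.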

— Macro: m4_split (string, [regexp = `[\t ]+'])

Split string into an M4 list of elements quoted by `[' and `]', while keeping white space at the beginning and at the end. If regexp is given, use it instead of `[\t ]+' for splitting. If string is empty, the result is an empty list.
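
The following sketch shows the default splitting, the effect of trailing white space (which yields a final empty element), and an explicit regexp. The input strings are illustrative only:

          m4_split([one two  three])
          =>[one], [two], [three]
          m4_split([one two ])
          =>[one], [two], []
          m4_split([/usr/bin:/bin], [:])
          =>[/usr/bin], [/bin]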

— Macro: m4_strip (string)

Strip whitespace from string. Sequences of spaces and tabs are reduced to a single space, then leading and trailing spaces are removed. The result is still a quoted string. Note that this does not interfere with newlines; if you want newlines stripped as well, consider m4_flatten, or do it all at once with m4_normalize.
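
For example, with illustrative single-line input; the angle brackets are literal text added only to show where the result begins and ends:

          <m4_strip([  one   two  ])>
          =><one two>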

— Macro: m4_text_box (message, [frame = `-'])

Add a text box around message, using frame as the border character above and below the message. The frame correctly accounts for the subsequent expansion of message. For example:

          m4_define([macro], [abc])dnl
          m4_text_box([macro])
          =>## --- ##
          =>## abc ##
          =>## --- ##
     

The message must contain balanced quotes and parentheses, although quadrigraphs can be used to work around this.
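
To illustrate the optional frame argument, here is a sketch with an arbitrary message and `=' as the border character:

          m4_text_box([Results], [=])
          =>## ======= ##
          =>## Results ##
          =>## ======= ##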

— Macro: m4_text_wrap (string, [prefix], [prefix1 = `prefix'], [width = `79'])

Break string into a series of whitespace-separated words, then output those words separated by spaces, and wrapping lines any time the output would exceed width columns. If given, prefix1 begins the first line, and prefix begins all wrapped lines. If prefix1 is longer than prefix, then the first line consists of just prefix1. If prefix is longer than prefix1, padding is inserted so that the first word of string begins at the same indentation as all wrapped lines. Note that using literal tab characters in any of the arguments will interfere with the calculation of width. No expansions occur on prefix, prefix1, or the words of string, although quadrigraphs are recognized.

Some examples:

          m4_text_wrap([Short string */], [   ], [/* ], [20])
          =>/* Short string */
          m4_text_wrap([Much longer string */], [   ], [/* ], [20])
          =>/* Much longer
          =>   string */
          m4_text_wrap([Short doc.], [          ], [  --short ], [30])
          =>  --short Short doc.
          m4_text_wrap([Short doc.], [          ], [  --too-wide ], [30])
          =>  --too-wide
          =>          Short doc.
          m4_text_wrap([Super long documentation.], [     ],
                       [  --too-wide ], [30])
          =>  --too-wide
          =>     Super long
          =>     documentation.
     
— Macro: m4_tolower (string)
— Macro: m4_toupper (string)

Return string with letters converted to lower or upper case, respectively.
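
For example, with an illustrative string (the comma is safe because the argument is quoted):

          m4_tolower([Hello, World])
          =>hello, world
          m4_toupper([Hello, World])
          =>HELLO, WORLD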