Next: , Previous: , Up: Answers   [Contents][Index]

### 17.3 Solution for `foreach`

The `foreach` and `foreachq` macros (see Foreach) as presented earlier each have flaws. First, we will examine and fix the quadratic behavior of `foreachq`:

```\$ m4 -I examples
include(`foreachq.m4')
⇒
traceon(`shift')debugmode(`aq')
⇒
foreachq(`x', ``1', `2', `3', `4'', `x
')dnl
⇒1
error→m4trace: -3- shift(`1', `2', `3', `4')
error→m4trace: -2- shift(`1', `2', `3', `4')
⇒2
error→m4trace: -4- shift(`1', `2', `3', `4')
error→m4trace: -3- shift(`2', `3', `4')
error→m4trace: -3- shift(`1', `2', `3', `4')
error→m4trace: -2- shift(`2', `3', `4')
⇒3
error→m4trace: -5- shift(`1', `2', `3', `4')
error→m4trace: -4- shift(`2', `3', `4')
error→m4trace: -3- shift(`3', `4')
error→m4trace: -4- shift(`1', `2', `3', `4')
error→m4trace: -3- shift(`2', `3', `4')
error→m4trace: -2- shift(`3', `4')
⇒4
error→m4trace: -6- shift(`1', `2', `3', `4')
error→m4trace: -5- shift(`2', `3', `4')
error→m4trace: -4- shift(`3', `4')
error→m4trace: -3- shift(`4')
```

Each successive iteration was adding more quoted `shift` invocations, and the entire list contents were passing through every iteration. In general, when recursing, it is a good idea to make the recursion use fewer arguments, rather than adding additional quoted uses of `shift`. By doing so, `m4` uses less memory, invokes fewer macros, is less likely to run into machine limits, and most importantly, performs faster. The fixed version of `foreachq` can be found in m4-1.4.19/examples/foreachq2.m4:

```\$ m4 -I examples
include(`foreachq2.m4')
⇒
undivert(`foreachq2.m4')dnl
⇒include(`quote.m4')dnl
⇒divert(`-1')
⇒# foreachq(x, `item_1, item_2, ..., item_n', stmt)
⇒#   quoted list, improved version
⇒define(`foreachq', `pushdef(`\$1')_\$0(\$@)popdef(`\$1')')
⇒define(`_arg1q', ``\$1'')
⇒define(`_rest', `ifelse(`\$#', `1', `', `dquote(shift(\$@))')')
⇒define(`_foreachq', `ifelse(`\$2', `', `',
⇒  `define(`\$1', _arg1q(\$2))\$3`'\$0(`\$1', _rest(\$2), `\$3')')')
⇒divert`'dnl
traceon(`shift')debugmode(`aq')
⇒
foreachq(`x', ``1', `2', `3', `4'', `x
')dnl
⇒1
error→m4trace: -3- shift(`1', `2', `3', `4')
⇒2
error→m4trace: -3- shift(`2', `3', `4')
⇒3
error→m4trace: -3- shift(`3', `4')
⇒4
```

Note that the fixed version calls unquoted helper macros in `_foreachq` to trim elements immediately; those helper macros in turn must re-supply the layer of quotes lost in the macro invocation. Contrast the use of `_arg1q`, which quotes the first list element, with `_arg1` of the earlier implementation that returned the first list element directly. Additionally, by calling the helper method immediately, the ‘defn(`iterator')’ no longer contains unexpanded macros.

The astute m4 programmer might notice that the solution above still uses more memory and macro invocations, and thus more time, than strictly necessary. Note that ‘\$2’, which contains an arbitrarily long quoted list, is expanded and rescanned three times per iteration of `_foreachq`. Furthermore, every iteration of the algorithm effectively unboxes then reboxes the list, which costs a couple of macro invocations. It is possible to rewrite the algorithm for a bit more speed by swapping the order of the arguments to `_foreachq` in order to operate on an unboxed list in the first place, and by using the fixed-length ‘\$#’ instead of an arbitrary length list as the key to end recursion. The result is an overhead of six macro invocations per loop (excluding any macros in text), instead of eight. This alternative approach is available as m4-1.4.19/examples/foreach3.m4:

```\$ m4 -I examples
include(`foreachq3.m4')
⇒
undivert(`foreachq3.m4')dnl
⇒divert(`-1')
⇒# foreachq(x, `item_1, item_2, ..., item_n', stmt)
⇒#   quoted list, alternate improved version
⇒define(`foreachq', `ifelse(`\$2', `', `',
⇒  `pushdef(`\$1')_\$0(`\$1', `\$3', `', \$2)popdef(`\$1')')')
⇒define(`_foreachq', `ifelse(`\$#', `3', `',
⇒  `define(`\$1', `\$4')\$2`'\$0(`\$1', `\$2',
⇒    shift(shift(shift(\$@))))')')
⇒divert`'dnl
traceon(`shift')debugmode(`aq')
⇒
foreachq(`x', ``1', `2', `3', `4'', `x
')dnl
⇒1
error→m4trace: -4- shift(`x', `x
error→', `', `1', `2', `3', `4')
error→m4trace: -3- shift(`x
error→', `', `1', `2', `3', `4')
error→m4trace: -2- shift(`', `1', `2', `3', `4')
⇒2
error→m4trace: -4- shift(`x', `x
error→', `1', `2', `3', `4')
error→m4trace: -3- shift(`x
error→', `1', `2', `3', `4')
error→m4trace: -2- shift(`1', `2', `3', `4')
⇒3
error→m4trace: -4- shift(`x', `x
error→', `2', `3', `4')
error→m4trace: -3- shift(`x
error→', `2', `3', `4')
error→m4trace: -2- shift(`2', `3', `4')
⇒4
error→m4trace: -4- shift(`x', `x
error→', `3', `4')
error→m4trace: -3- shift(`x
error→', `3', `4')
error→m4trace: -2- shift(`3', `4')
```

In the current version of M4, every instance of ‘\$@’ is rescanned as it is encountered. Thus, the foreachq3.m4 alternative uses much less memory than foreachq2.m4, and executes as much as 10% faster, since each iteration encounters fewer ‘\$@’. However, the implementation of rescanning every byte in ‘\$@’ is quadratic in the number of bytes scanned (for example, making the broken version in foreachq.m4 cubic, rather than quadratic, in behavior). A future release of M4 will improve the underlying implementation by reusing results of previous scans, so that both styles of `foreachq` can become linear in the number of bytes scanned. Notice how the implementation injects an empty argument prior to expanding ‘\$2’ within `foreachq`; the helper macro `_foreachq` then ignores the third argument altogether, and ends recursion when there are three arguments left because there was nothing left to pass through `shift`. Thus, each iteration only needs one `ifelse`, rather than the two conditionals used in the version from foreachq2.m4.

So far, all of the implementations of `foreachq` presented have been quadratic with M4 1.4.x. But `forloop` is linear, because each iteration parses a constant amount of arguments. So, it is possible to design a variant that uses `forloop` to do the iteration, then uses ‘\$@’ only once at the end, giving a linear result even with older M4 implementations. This implementation relies on the GNU extension that ‘\$10’ expands to the tenth argument rather than the first argument concatenated with ‘0’. The trick is to define an intermediate macro that repeats the text `m4_define(`\$1', `\$n')\$2`'`, with ‘n’ set to successive integers corresponding to each argument. The helper macro `_foreachq_` is needed in order to generate the literal sequences such as ‘\$1’ into the intermediate macro, rather than expanding them as the arguments of `_foreachq`. With this approach, no `shift` calls are even needed! Even though there are seven macros of overhead per iteration instead of six in foreachq3.m4, the linear scaling is apparent at relatively small list sizes. However, this approach will need adjustment when a future version of M4 follows POSIX by no longer treating ‘\$10’ as the tenth argument; the anticipation is that ‘\${10}’ can be used instead, although that alternative syntax is not yet supported.

```\$ m4 -I examples
include(`foreachq4.m4')
⇒
undivert(`foreachq4.m4')dnl
⇒include(`forloop2.m4')dnl
⇒divert(`-1')
⇒# foreachq(x, `item_1, item_2, ..., item_n', stmt)
⇒#   quoted list, version based on forloop
⇒define(`foreachq',
⇒`ifelse(`\$2', `', `', `_\$0(`\$1', `\$3', \$2)')')
⇒define(`_foreachq',
⇒`pushdef(`\$1', forloop(`\$1', `3', `\$#',
⇒  `\$0_(`1', `2', indir(`\$1'))')`popdef(
⇒    `\$1')')indir(`\$1', \$@)')
⇒define(`_foreachq_',
⇒``define(`\$\$1', `\$\$3')\$\$2`''')
⇒divert`'dnl
traceon(`shift')debugmode(`aq')
⇒
foreachq(`x', ``1', `2', `3', `4'', `x
')dnl
⇒1
⇒2
⇒3
⇒4
```

For yet another approach, the improved version of `foreach`, available in m4-1.4.19/examples/foreach2.m4, simply overquotes the arguments to `_foreach` to begin with, using `dquote_elt`. Then `_foreach` can just use `_arg1` to remove the extra layer of quoting that was added up front:

```\$ m4 -I examples
include(`foreach2.m4')
⇒
undivert(`foreach2.m4')dnl
⇒include(`quote.m4')dnl
⇒divert(`-1')
⇒# foreach(x, (item_1, item_2, ..., item_n), stmt)
⇒#   parenthesized list, improved version
⇒define(`foreach', `pushdef(`\$1')_\$0(`\$1',
⇒  (dquote(dquote_elt\$2)), `\$3')popdef(`\$1')')
⇒define(`_arg1', `\$1')
⇒define(`_foreach', `ifelse(`\$2', `(`')', `',
⇒  `define(`\$1', _arg1\$2)\$3`'\$0(`\$1', (dquote(shift\$2)), `\$3')')')
⇒divert`'dnl
traceon(`shift')debugmode(`aq')
⇒
foreach(`x', `(`1', `2', `3', `4')', `x
')dnl
error→m4trace: -4- shift(`1', `2', `3', `4')
error→m4trace: -4- shift(`2', `3', `4')
error→m4trace: -4- shift(`3', `4')
⇒1
error→m4trace: -3- shift(``1'', ``2'', ``3'', ``4'')
⇒2
error→m4trace: -3- shift(``2'', ``3'', ``4'')
⇒3
error→m4trace: -3- shift(``3'', ``4'')
⇒4
error→m4trace: -3- shift(``4'')
```

It is likewise possible to write a variant of `foreach` that performs in linear time on M4 1.4.x; the easiest method is probably writing a version of `foreach` that unboxes its list, then invokes `_foreachq` as previously defined in foreachq4.m4.

In summary, recursion over list elements is trickier than it appeared at first glance, but provides a powerful idiom within `m4` processing. As a final demonstration, both list styles are now able to handle several scenarios that would wreak havoc on one or both of the original implementations. This points out one other difference between the list styles. `foreach` evaluates unquoted list elements only once, in preparation for calling `_foreach`, similary for `foreachq` as provided by foreachq3.m4 or foreachq4.m4. But `foreachq`, as provided by foreachq2.m4, evaluates unquoted list elements twice while visiting the first list element, once in `_arg1q` and once in `_rest`. When deciding which list style to use, one must take into account whether repeating the side effects of unquoted list elements will have any detrimental effects.

```\$ m4 -I examples
include(`foreach2.m4')
⇒
include(`foreachq2.m4')
⇒
dnl 0-element list:
foreach(`x', `', `<x>') / foreachq(`x', `', `<x>')
⇒ /
dnl 1-element list of empty element
foreach(`x', `()', `<x>') / foreachq(`x', ``'', `<x>')
⇒<> / <>
dnl 2-element list of empty elements
foreach(`x', `(`',`')', `<x>') / foreachq(`x', ``',`'', `<x>')
⇒<><> / <><>
dnl 1-element list of a comma
foreach(`x', `(`,')', `<x>') / foreachq(`x', ``,'', `<x>')
⇒<,> / <,>
dnl 2-element list of unbalanced parentheses
foreach(`x', `(`(', `)')', `<x>') / foreachq(`x', ``(', `)'', `<x>')
⇒<(><)> / <(><)>
define(`ab', `oops')dnl using defn(`iterator')
foreach(`x', `(`a', `b')', `defn(`x')') /dnl
foreachq(`x', ``a', `b'', `defn(`x')')
⇒ab / ab
define(`active', `ACT, IVE')
⇒
traceon(`active')
⇒
dnl list of unquoted macros; expansion occurs before recursion
foreach(`x', `(active, active)', `<x>
')dnl
error→m4trace: -4- active -> `ACT, IVE'
error→m4trace: -4- active -> `ACT, IVE'
⇒<ACT>
⇒<IVE>
⇒<ACT>
⇒<IVE>
foreachq(`x', `active, active', `<x>
')dnl
error→m4trace: -3- active -> `ACT, IVE'
error→m4trace: -3- active -> `ACT, IVE'
⇒<ACT>
error→m4trace: -3- active -> `ACT, IVE'
error→m4trace: -3- active -> `ACT, IVE'
⇒<IVE>
⇒<ACT>
⇒<IVE>
dnl list of quoted macros; expansion occurs during recursion
foreach(`x', `(`active', `active')', `<x>
')dnl
error→m4trace: -1- active -> `ACT, IVE'
⇒<ACT, IVE>
error→m4trace: -1- active -> `ACT, IVE'
⇒<ACT, IVE>
foreachq(`x', ``active', `active'', `<x>
')dnl
error→m4trace: -1- active -> `ACT, IVE'
⇒<ACT, IVE>
error→m4trace: -1- active -> `ACT, IVE'
⇒<ACT, IVE>
dnl list of double-quoted macro names; no expansion
foreach(`x', `(``active'', ``active'')', `<x>
')dnl
⇒<active>
⇒<active>
foreachq(`x', ```active'', ``active''', `<x>
')dnl
⇒<active>
⇒<active>
```

Next: , Previous: , Up: Answers   [Contents][Index]