17.7 Solution for capitalize

The capitalize macro (see Patsubst) as presented earlier does not allow clients to follow the quoting rule of thumb. Consider the three macros active, Active, and ACTIVE, and the difference between calling capitalize with the expansion of a macro, expanding the result of a case change, and changing the case of a double-quoted string:

$ m4 -I examples
include(`capitalize.m4')dnl
define(`active', `act1, ive')dnl
define(`Active', `Act2, Ive')dnl
define(`ACTIVE', `ACT3, IVE')dnl
downcase(`ACTIVE')
⇒act1, ive
define(`A', `OOPS')
⇒
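
The three calling styles described above can be tried without the examples directory. The sketch below assumes GNU m4 is on the PATH and inlines the simple definitions as they are understood to appear in the Patsubst section; treat them as an assumption if your copy differs. The shell function name capitalize_v1_demo is hypothetical.

```shell
# Sketch only: requires GNU m4 (regexp, patsubst and \w are GNU extensions).
capitalize_v1_demo() {
m4 <<'EOF'
define(`upcase', `translit(`$*', `a-z', `A-Z')')dnl
define(`downcase', `translit(`$*', `A-Z', `a-z')')dnl
define(`_capitalize',
  `regexp(`$1', `^\(\w\)\(\w*\)',
    `upcase(`\1')`'downcase(`\2')')')dnl
define(`capitalize', `patsubst(`$1', `\w+', `_$0(`\&')')')dnl
define(`active', `act1, ive')dnl
define(`Active', `Act2, Ive')dnl
capitalize(active)
capitalize(`active')
capitalize(``active'')
define(`A', `OOPS')dnl
capitalize(active)
capitalize(`active')
EOF
}
capitalize_v1_demo
```

The first three lines of output should be Act1, Active, and _capitalize(`active'): the second argument is dropped, Active is emitted as text rather than invoked, and the helper is left unexpanded inside quotes. Once A is defined, the last two lines should become OOPSct1 and OOPSctive.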

First, when capitalize was called with more than one argument, it threw away the later arguments, whereas upcase and downcase used ‘$*’ to collect them all. The fix is simple: use ‘$*’ consistently.
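
The difference between ‘$1’ and ‘$*’ can be seen in isolation. In this minimal sketch, first_only and all_args are hypothetical macros invented for illustration, and GNU m4 is assumed to be on the PATH.

```shell
# Sketch only: first_only and all_args are illustrative names, not part of m4.
dollar_star_demo() {
m4 <<'EOF'
define(`active', `act1, ive')dnl
define(`first_only', `translit(`$1', `a-z', `A-Z')')dnl
define(`all_args', `translit(`$*', `a-z', `A-Z')')dnl
first_only(active)
all_args(active)
EOF
}
dollar_star_demo
```

Because active expands to text containing a comma before the arguments are collected, each macro receives two arguments. first_only should print ACT1, while all_args should print ACT1,IVE.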

Next, with single-quoting, capitalize outputs a single character, a set of quotes, then the rest of the characters, making it impossible to invoke Active after the fact, and allowing the alternate macro A to interfere. Here, the solution is to use additional quoting in the helper macros, then pass the final over-quoted output string through _arg1 to remove the extra quoting and finally invoke the concatenated portions as a single string.
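
The effect of passing quoted pieces through _arg1 can be sketched on its own. The example assumes GNU m4 on the PATH; the wrapper function arg1_demo is a hypothetical name, while _arg1 is defined exactly as in the fixed version.

```shell
# Sketch only: shows why concatenating pieces inside one argument matters.
arg1_demo() {
m4 <<'EOF'
define(`Active', `Act2, Ive')dnl
define(`A', `OOPS')dnl
define(`_arg1', `$1')dnl
A`'ctive
`A'`ctive'
_arg1(`A'`ctive')
EOF
}
arg1_demo
```

The first line should print OOPSctive, since the bare A is seen as a macro. The second should print Active as plain text, because each quoted piece is output separately. Only the third should print Act2, Ive: argument collection joins the pieces into the single word Active, which is then rescanned and invoked.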

Finally, when passed a double-quoted string, the nested macro _capitalize is never invoked because it ended up nested inside quotes. This one is the toughest to fix. In short, we have no idea how many levels of quotes are in effect on the substring being altered by patsubst. If the replacement string cannot be expressed entirely in terms of literal text and backslash substitutions, then we need a mechanism to guarantee that the helper macros are invoked outside of quotes. In other words, this sounds like a job for changequote (see Changequote). By changing the active quoting characters, we can guarantee that replacement text injected by patsubst always occurs in the middle of a string that has exactly one level of over-quoting using alternate quotes; so the replacement text closes the quoted string, invokes the helper macros, then reopens the quoted string. In turn, that means the replacement text has unbalanced quotes, necessitating another round of changequote.
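
Both halves of this argument can be illustrated with a small sketch, assuming GNU m4 on the PATH. The macros shout and shout_alt and the function changequote_demo are hypothetical names. The first call shows a helper trapped inside quotes; the last line shows that, under alternate quotes, closing the string, invoking a helper, and reopening the string leaves ` and ' as ordinary text.

```shell
# Sketch only: shout and shout_alt are illustrative helpers.
changequote_demo() {
m4 <<'EOF'
define(`shout', `translit(`$1', `a-z', `A-Z')')dnl
patsubst(``hello world'', `\w+', `shout(`\&')')
changequote(`<<[', `]>>')dnl
define(<<[shout_alt]>>, <<[translit(<<[$1]>>, <<[a-z]>>, <<[A-Z]>>)]>>)dnl
<<[`hello' ]>>shout_alt(<<[world]>>)<<[ `again']>>
EOF
}
changequote_demo
```

The first output line should be the unexpanded text shout(`hello') shout(`world'). The second should be `hello' WORLD `again', with only the text between the two alternate-quoted strings expanded.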

In the fixed version below (also shipped as m4-1.4.19/examples/capitalize2.m4), capitalize uses the alternate quotes of ‘<<[’ and ‘]>>’ (the longer strings are chosen so as to be less likely to appear in the text being converted). The helpers _to_alt and _from_alt merely reduce the number of characters required to perform a changequote, since the definition changes twice. The outermost pair means that patsubst and _capitalize_alt are invoked with alternate quoting; the innermost pair is used so that the third argument to patsubst can contain an unbalanced ‘]>>’/‘<<[’ pair. Note that upcase and downcase must be redefined as _upcase_alt and _downcase_alt, since they contain nested quotes but are invoked with the alternate quoting scheme in effect.

$ m4 -I examples
include(`capitalize2.m4')dnl
define(`active', `act1, ive')dnl
define(`Active', `Act2, Ive')dnl
define(`ACTIVE', `ACT3, IVE')dnl
define(`A', `OOPS')dnl
capitalize(active; `active'; ``active''; ```actIVE''')
⇒Act1,Ive; Act2, Ive; Active; `Active'
undivert(`capitalize2.m4')dnl
⇒divert(`-1')
⇒# upcase(text)
⇒# downcase(text)
⇒# capitalize(text)
⇒#   change case of text, improved version
⇒define(`upcase', `translit(`$*', `a-z', `A-Z')')
⇒define(`downcase', `translit(`$*', `A-Z', `a-z')')
⇒define(`_arg1', `$1')
⇒define(`_to_alt', `changequote(`<<[', `]>>')')
⇒define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')
⇒define(`_upcase_alt', `translit(<<[$*]>>, <<[a-z]>>, <<[A-Z]>>)')
⇒define(`_downcase_alt', `translit(<<[$*]>>, <<[A-Z]>>, <<[a-z]>>)')
⇒define(`_capitalize_alt',
⇒  `regexp(<<[$1]>>, <<[^\(\w\)\(\w*\)]>>,
⇒    <<[_upcase_alt(<<[<<[\1]>>]>>)_downcase_alt(<<[<<[\2]>>]>>)]>>)')
⇒define(`capitalize',
⇒  `_arg1(_to_alt()patsubst(<<[<<[$*]>>]>>, <<[\w+]>>,
⇒    _from_alt()`]>>_$0_alt(<<[\&]>>)<<['_to_alt())_from_alt())')
⇒divert`'dnl
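
The two quote-switching helpers can be exercised by themselves. This sketch copies the _to_alt and _from_alt definitions from the listing and assumes GNU m4 on the PATH; the shell function quote_switch_demo is a hypothetical name.

```shell
# Sketch only: toggles between the default and alternate quote strings.
quote_switch_demo() {
m4 <<'EOF'
define(`_to_alt', `changequote(`<<[', `]>>')')dnl
define(`_from_alt', `changequote(<<[`]>>, <<[']>>)')dnl
_to_alt()<<[`text']>> _from_alt()`<<[text]>>'
EOF
}
quote_switch_demo
```

The output should be `text' <<[text]>>: while the alternate quotes are active, ` and ' pass through as literal characters, and after _from_alt the strings ‘<<[’ and ‘]>>’ are literal again. The body of _from_alt is written with alternate quotes because it is expanded while they are in effect.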
