10.11.8 Multi-Phase Rewrite Rules

It is possible to separate a rewrite rule set into several phases. During each phase, certain rules will be enabled while certain others will be disabled. A phase schedule controls the order in which phases occur during the rewriting process.

If a call to the marker function phase appears in the rules vector in place of a rule, all rules following that point will be members of the phase(s) identified in the arguments to phase. Phases are given integer numbers. The markers ‘phase()’ and ‘phase(all)’ both mean the following rules belong to all phases; this is the default at the start of the rule set.

If you do not explicitly schedule the phases, Calc sorts all phase numbers that appear in the rule set and executes the phases in ascending order. For example, the rule set

[ f0(x) := g0(x),
  phase(1),
  f1(x) := g1(x),
  phase(2),
  f2(x) := g2(x),
  phase(3),
  f3(x) := g3(x),
  phase(1,2),
  f4(x) := g4(x) ]

has three phases, 1 through 3. Phase 1 consists of the f0, f1, and f4 rules (in that order). Phase 2 consists of f0, f2, and f4. Phase 3 consists of f0 and f3.

When Calc rewrites a formula using this rule set, it first rewrites the formula using only the phase 1 rules until no further changes are possible. Then it switches to the phase 2 rule set and continues until no further changes occur, then finally rewrites with phase 3. When no more phase 3 rules apply, rewriting finishes. (This is assuming a r with a large enough prefix argument to allow the rewriting to run to completion; the sequence just described stops early if the number of iterations specified in the prefix argument, 100 by default, is reached.)

During each phase, Calc descends through the nested levels of the formula as described previously. (See Nested Formulas with Rewrite Rules.) Rewriting starts at the top of the formula, then works its way down to the parts, then goes back to the top and works down again. The phase 2 rules do not begin until no phase 1 rules apply anywhere in the formula.

A schedule marker appearing in the rule set (anywhere, but conventionally at the top) changes the default schedule of phases. In the simplest case, schedule has a sequence of phase numbers for arguments; each phase number is invoked in turn until the arguments to schedule are exhausted. Thus adding ‘schedule(3,2,1)’ at the top of the above rule set would reverse the order of the phases; ‘schedule(1,2,3)’ would have no effect since this is the default schedule; and ‘schedule(1,2,1,3)’ would give phase 1 a second chance after phase 2 has completed, before moving on to phase 3.

Any argument to schedule can instead be a vector of phase numbers (or even of sub-vectors). Then the sub-sequence of phases described by the vector are tried repeatedly until no change occurs in any phase in the sequence. For example, ‘schedule([1, 2], 3)’ tries phase 1, then phase 2, then, if either phase made any changes to the formula, repeats these two phases until they can make no further progress. Finally, it goes on to phase 3 for finishing touches.

Also, items in schedule can be variable names as well as numbers. A variable name is interpreted as the name of a function to call on the whole formula. For example, ‘schedule(1, simplify)’ says to apply the phase-1 rules (presumably, all of them), then to call simplify which is the function name equivalent of a s. Likewise, ‘schedule([1, simplify])’ says to alternate between phase 1 and a s until no further changes occur.

Phases can be used purely to improve efficiency; if it is known that a certain group of rules will apply only at the beginning of rewriting, and a certain other group will apply only at the end, then rewriting will be faster if these groups are identified as separate phases. Once the phase 1 rules are done, Calc can put them aside and no longer spend any time on them while it works on phase 2.

There are also some problems that can only be solved with several rewrite phases. For a real-world example of a multi-phase rule set, examine the set FitRules, which is used by the curve-fitting command to convert a model expression to linear form. See Curve Fitting Details. This set is divided into four phases. The first phase rewrites certain kinds of expressions to be more easily linearizable, but less computationally efficient. After the linear components have been picked out, the final phase includes the opposite rewrites to put each component back into an efficient form. If both sets of rules were included in one big phase, Calc could get into an infinite loop going back and forth between the two forms.

Elsewhere in FitRules, the components are first isolated, then recombined where possible to reduce the complexity of the linear fit, then finally packaged one component at a time into vectors. If the packaging rules were allowed to begin before the recombining rules were finished, some components might be put away into vectors before they had a chance to recombine. By putting these rules in two separate phases, this problem is neatly avoided.