Abstract Syntax Tree Optimizations

This page describes ongoing work to improve GCC's tree based optimizations. There is a branch in CVS called, ast-optimizer-branch, which is available to experiment with these optimizers. As these stabilize, they can be submitted for review onto the mainline development tree. Please contact Nathan Sidwell, <nathan@codesourcery.com>, if you want to contribute.

Background & Rationale

GCC, in common with many other compilers, has more than one internal representation of a program. The main ones are trees and RTL. The trees, (or formally abstract syntax trees - ASTs) are generated during parsing, and are close to the source language semantics. The RTL is generated in the back end, and is close to the generated assembly code. Ideally, the AST would contain all the semantic information of the source program.

Historically, GCC generated RTL one statement at a time, so the AST did not stay around very long. This has changed with 'function at a time' compilation (Inliner), which both C and C++ front ends now implement. With the AST for complete functions, and the additional semantic information they contain, the opportunity for new optimizations presents itself.

Goals

In addition to providing completely new optimization passes, there are a number of sub goals.

Provide a testing framework

Although GCC's testsuite can verify an optimization's correctness, there is no framework to verify the efficiency. Efficiency in both compiler resources (memory and compile time), and produced executable (object size and execution speed) both need to be measured. Without a test framework, we will have no information about the effectiveness and stability of optimizers.

Optimization tuning parameters

A number of optimizations can be tuned by various parameters. Giving the user a knob to twiddle is good, provided (a) it does something (b) there is a measurable way of determining its effectiveness.

Move RTL optimizations to the AST level

GCC has many optimizations that work at the RTL level. At the AST level some of these can do a better job.

Move AST optimizations into one place

GCC has a number of AST optimizations that attempt to optimize trees during parsing. These have limited effectiveness, and complicate the parser. It would be better to put these all in one place, where they can be more effective (by being repeated, or using common data).

Move AST optimizers into the common middle end

Although C and C++ both have function at a time mode, the AST new inliner is only in the C++ front end.

Provable optimizer performance

It is often easy to implement a mis-optimization. Without evidence that an optimizer actually does optimize, it will be hard to have it accepted into the mainline. Obviously in the ast-optimizer-branch branch, things are different.

Status

The branch was created from the development mainline on 21 July 2001. Its version string is 'ast-optimizer-branch'.

New inlining algorithm

The first AST inliner was top down, and exhibits problematic compiler time & memory usage at -O3. This is particularly nasty for heavily templated C++ code. A new bottom up inliner is available for testing, which it is hoped will provide better -O3 performance. See this thread.

SSA for trees

The tree SSA infrastructure is maintained by Diego Novillo <dnovillo@redhat.com>. Documentation and plans about this pass can be found in the tree SSA page.

Contributing

Checkout the ast-optimizer-branch branch

cvs co -r ast-optimizer-branch gcc

then configure and build in the normal way.

Please post patches in the usual way to the gcc-patches list, marked [ast-optimizer-branch]. As this is a branch, and not the mainline, the usual maintainer rules do not apply. This branch is maintained by Nathan Sidwell, <nathan@codesourcery.com>. Approval from the usual maintainer will be needed when submitting patches from the branch onto the mainline.