How to Locate GCC Regressions

A regression is a bug that did not exist in a previous release. Problem reports for GCC regressions have a very high priority, and we make every effort to fix them before the next release. Knowing which change caused a regression is valuable information to the developer who is fixing the problem, even if that patch merely exposed an existing bug.

People who are familiar with building GCC but who don't have the knowledge of GCC internals to fix bugs can help a lot by identifying patches that caused regressions to occur. The same techniques can be used to identify the patch that unknowingly fixed a particular bug on the mainline when that bug also exists as a regression on a release branch, allowing someone to port the fix to the branch.

These instructions assume that you are already familiar with building GCC on your platform.

Search strategies

If you've got sufficient disk space available, keep old install trees around for use in finding small windows in which regressions occur. Some people who do this regularly add information to Bugzilla about particular problem reports for regressions.

Before you start your search, verify that you can reproduce the problem with GCC built from the current sources. If not, the bug might have been fixed, or it might not be relevant for your platform, or the failure might only happen with particular options. Next, verify that you get the expected behavior for the start and end dates of the range.

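The basic search strategy is to iterate through the following steps while the range is too large to investigate by hand: get GCC sources for a date in the middle of the range, build GCC, run the test, and continue the search with either the earlier or the later half of the range, depending on whether the bug is present.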

The first three steps are described below. They can be automated, as can the framework for the binary search. The directory contrib/reghunt in the GCC repository includes scripts to do this work.
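The binary search itself can be sketched as follows. This is a minimal illustration, not the contrib/reghunt scripts. It assumes that the range is given as two integer revision numbers, that the lower one passes the test and the higher one fails, and that a hypothetical command (called check here) fetches, builds, and tests a given revision and exits with status 0 when the test passes.

```shell
# find_first_bad LOW HIGH CHECK
#   LOW    revision known to pass the test
#   HIGH   revision known to fail the test
#   CHECK  hypothetical command: called with one revision number,
#          exits 0 if the test passes for that revision
# Prints the first revision for which CHECK fails.
# Note: plain POSIX sh has no local variables, so CHECK must not
# modify the variables low, high, or mid.
find_first_bad() {
    low=$1
    high=$2
    check=$3
    while [ $((high - low)) -gt 1 ]; do
        mid=$(( (low + high) / 2 ))
        if "$check" "$mid"; then
            low=$mid     # bug not present yet: continue with the later half
        else
            high=$mid    # bug present: continue with the earlier half
        fi
    done
    echo "$high"
}
```

Each iteration halves the range, so a window of a thousand revisions needs about ten builds.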

There are several short cuts that can be used to shorten the elapsed time of the search.

Eventually you'll need to identify the patch and verify that it causes the behavior of the test to change.

There are a variety of problems you might encounter, but many of them are simple to work around.

Get GCC sources

Get a Local Copy of the GCC repository

Using rsync to get a local copy of the GCC repository is highly recommended for regression hunts. You'll be checking out the tree used for the regression search over and over again, so a local copy avoids slowing down access for other GCC developers who are using the real repository, and it is also faster for you.

The full tree takes a lot of disk space, but it's possible to exclude directories you won't need for your hunt. If you're already using a local SVN repository via rsync, you can make a cut-down version of it that leaves out directories you don't need for the regression hunt. This makes SVN operations much quicker, making it worthwhile even if the copy is on the same system. It's particularly useful if you'll want to copy it to a system that is low on available disk space. The following, for example, makes a smaller copy of the repository that can be used for finding C and C++ compile-time problems and takes only half the disk space of the full repository.

    # List the directories that are not needed for C and C++ compile-time problems.
    cat <<EOF > rsync_exclude
    --exclude=gcc-svn/benchmarks
    --exclude=gcc-svn/boehm-gcc
    --exclude=gcc-svn/old-gcc
    --exclude=gcc-svn/wwwdocs
    --exclude=gcc-svn/gcc/libjava
    --exclude=gcc-svn/gcc/libstdc++-v3
    --exclude=gcc-svn/gcc/gcc/ada
    --exclude=gcc-svn/gcc/gcc/testsuite
    EOF

    # Archive the repository, leaving out the excluded directories.
    tar $(cat rsync_exclude) -cf - gcc-svn | gzip > gcc-svn.tar.gz

Check Out a Working Copy

Check out a local working copy of GCC from your local repository. If you are not using a local repository, then check out a working copy using anonymous read-only SVN access. In any case, use a new working copy that is separate from what you use for development or other testing, since it is easy to end up with files in strange states.

Information about checking out specific dates, working with branches and tags, and inspecting the commit logs is available at the SVN Help pages in the GCC Wiki.
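As a small sketch of moving a working copy to a given date, the helper below wraps svn update with a date revision. The function name and its arguments are made up for this example; the only SVN feature it relies on is the -r {DATE} revision syntax.

```shell
# update_to_date DATE [PATH]
#   DATE  date in YYYY-MM-DD form
#   PATH  working copy to update (defaults to the current directory)
# Brings the working copy to the state the repository had on DATE.
update_to_date() {
    svn update -r "{$1}" "${2:-.}"
}
```

For example, update_to_date 2004-06-01 gcc-work would update a working copy in the (hypothetical) directory gcc-work to its state at the start of 1 June 2004.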

Branch and release dates

If no one has provided a range of dates for when a particular mainline regression appeared, you can narrow the search by finding the release in which it first appeared. Then test the mainline between the branchpoint for that release and the branchpoint for the previous release that does not have the bug. To find the revision and date at which a branch or tag was created, use the command svn log --stop-on-copy.
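One way to pull the branchpoint out of that log is sketched below. The function name is invented for this example, and the sketch assumes the usual svn log --quiet output, in which each entry is a single header line beginning with the revision number and entries are listed newest first.

```shell
# branch_point URL
#   URL  repository URL (or working-copy path) of a branch or tag
# Prints the header line of the oldest log entry on the branch, which is
# the commit that created it by copying.
branch_point() {
    svn log --quiet --stop-on-copy "$1" | grep '^r[0-9]' | tail -n 1
}
```

The revision number and date on the printed line give the point on the mainline from which to start testing. The URL to pass depends on how you access the repository, so substitute the branch or tag location from your own setup.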

Build GCC

The kind of bug you are investigating will determine what kind of build is required for testing GCC on a particular date. In almost all cases you can do a simple make rather than make bootstrap, provided that you start with a recent version of gcc as the build compiler. When building a full compiler, enable only the language you'll need to test. If you're testing a bug in a library, you'll only need to build that library, provided you've already got a compatible version of the compiler to test it with. If there are dependencies between components, or if you don't know which component(s) affect the bug, you'll need to update and rebuild everything for the language.

If you're chasing bugs that are known to be in cc1plus you can do the following after a normal configure:

    cd objdir
    make all-build-libiberty || true
    make all-libiberty
    make all-libcpp || true
    make all-intl || true
    make all-libbanshee || true
    make configure-gcc || true
    cd gcc
    make cc1plus

This will build libiberty, libcpp, libbanshee, intl and cc1plus (make configure-gcc is required since December 2002, make all-intl since July 2003, make all-libbanshee from May until September 2004, make all-libcpp since May 2004, and make all-build-libiberty since September 2004). Alternatively, you can do

    cd objdir
    make all-gcc TARGET-gcc=cc1plus

This works since October 2004. When you have built cc1plus, you can feed your source code snippet to it:

    cc1plus -quiet testcase.ii

Run the test

Assuming that there is a self-contained test for the bug, as there usually is for bugs reported via Bugzilla, write a small script to run it and to report whether it passed or failed. If you're automating your search then the script should tell you whether the next compiler build should use earlier or later GCC sources.
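A test script of this kind might look like the sketch below. It makes two assumptions that will not hold for every bug: the failure shows up as a nonzero exit status from the compiler (as with an internal compiler error), and the search is for the change that introduced the failure, so a pass means the bug appeared later. The function name and messages are made up for this example.

```shell
# run_test COMPILER TESTCASE
#   COMPILER  path to the compiler under test, for example a freshly built cc1plus
#   TESTCASE  preprocessed, self-contained test file
# Reports the result and returns 0 on pass, 1 on fail.
run_test() {
    compiler=$1
    testcase=$2
    # Discard the generated assembly and all diagnostics;
    # only the exit status matters here.
    if "$compiler" -quiet "$testcase" -o /dev/null >/dev/null 2>&1; then
        echo "pass: continue with later sources"
    else
        echo "fail: continue with earlier sources"
        return 1
    fi
}
```

For a bug that produces wrong code rather than a compiler failure, the condition would instead have to compile, run, and check the output of the test program.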

Hints for coming up with a self-contained test are beyond the scope of this document.

Identify the patch

SVN commands such as svn log, svn diff, and svn annotate are particularly useful for identifying changes from one version of a file to another.

When you've identified the likely patch out of a set of patches between the current low and high dates of the range, test a source tree from just before or just after that patch was added and then add or remove the patch by updating only the affected files. You can do this by identifying the revision of each file when the patch was added and then using svn update -rrev file to get the desired version of each of those files. Build and test to verify that this patch changes the behavior of the test.
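Updating only the affected files can be wrapped in a small helper, sketched below. The function name is invented for this example, and the sketch assumes that all files touched by the patch should be moved to the same revision.

```shell
# update_files_to REV FILE...
#   REV   revision to which the listed files should be updated
#   FILE  one or more files touched by the suspect patch
# Updates each listed file to REV and leaves the rest of the tree alone.
# Stops at the first file that cannot be updated.
update_files_to() {
    rev=$1
    shift
    for file in "$@"; do
        svn update -r "$rev" "$file" || return 1
    done
}
```

Running it once with the revision just before the patch and once with the revision of the patch itself, rebuilding and rerunning the test each time, shows whether the patch changes the behavior of the test.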

Short cuts

Work around problems