GCC Coding Conventions

There are some additional coding conventions for code in GCC, beyond those in the GNU Coding Standards. Some existing code may not follow these conventions, but they must be used for new code. If changing existing code to follow these conventions, it is best to send changes to follow the conventions separately from any other changes to the code.

Documentation

Documentation, both of user interfaces and of internals, must be maintained and kept up to date. In particular:

ChangeLogs

ChangeLog entries are part of git commit messages and are automatically put into a corresponding ChangeLog file. A ChangeLog template can be easily generated with the ./contrib/mklog.py script. GCC offers a checking script that verifies proper ChangeLog formatting for a particular git commit (see the git gcc-verify git alias). The checking script covers the most commonly used ChangeLog formats, and the following paragraphs explain what it supports.

See also what the GNU Coding Standards have to say about what goes in ChangeLogs; in particular, descriptions of the purpose of code and changes should go in comments rather than the ChangeLog, though a single line overall description of the changes may be useful above the ChangeLog entry for a large batch of changes.

Components

Format rules

Documented behaviour

Example patch

This patch adds a second movk pattern that models the instruction
as a "normal" and/ior operation rather than an insertion.  It fixes
the third insv_1.c failure in PR87763, which was a regression from
GCC 8.

2020-02-06  John Foo  <john@example.com>

gcc/
	PR target/87763
	* config/aarch64/aarch64-protos.h (aarch64_movk_shift): Declare.
	* config/aarch64/aarch64.c (aarch64_movk_shift): New function.
	* config/aarch64/aarch64.md (aarch64_movk<mode>): New pattern.

gcc/testsuite/
	PR target/87763
	* gcc.target/aarch64/movk_2.c: New test.

Co-Authored-By: Jack Bar  <jack@example.com>

Tokenized patch

$git_description

$committer_timestamp

$changelog_location
$pr_entry
$changelog_file
$changelog_file
$changelog_file

$changelog_location
$pr_entry
$changelog_file

$co_authored_by

Portability

There are strict requirements for portability of code in GCC to older systems whose compilers do not implement all of the latest ISO C and C++ standards.

The directories gcc, libcpp and fixincludes may use C++03. They may also use the long long type if the host C++ compiler supports it. These directories should use reasonably portable parts of C++03, so that it is possible to build GCC with C++ compilers other than GCC itself. If testing reveals that reasonably recent versions of non-GCC C++ compilers cannot compile GCC, then GCC code should be adjusted accordingly. (Avoiding unusual language constructs helps immensely.) Furthermore, these directories should also be compatible with C++11.

The directories libiberty and libdecnumber must use C and require at least an ANSI C89 or ISO C90 host compiler. C code should avoid pre-standard style function definitions, unnecessary function prototypes and use of the now deprecated PARAMS macro. See README.Portability for details of some of the portability problems that may arise. Some of these problems are warned about by gcc -Wtraditional, which is included in the default warning options in a bootstrap.

The programs included in GCC are linked with the libiberty library, which will replace some standard library functions if not present on the system used, so those functions may be freely used in GCC. In particular, the ISO C string functions memcmp, memcpy, memmove, memset, strchr and strrchr are preferred to the old functions bcmp, bcopy, bzero, index and rindex; see messages 1 and 2. The older functions must no longer be used in GCC; apart from index, these identifiers are poisoned to prevent their use.
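As a minimal sketch of the preferred ISO C string functions, the helper below uses strrchr, memset and memcpy where older code might have used rindex, bzero and bcopy. The name copy_basename and its interface are hypothetical and chosen only for illustration; it is not a GCC or libiberty function.

```cpp
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only: copy the last path
   component of PATH into BUF (of size BUFSIZE).  Return false if BUF
   is too small.  */
bool
copy_basename (const char *path, char *buf, size_t bufsize)
{
  /* strrchr replaces rindex.  */
  const char *slash = strrchr (path, '/');
  const char *base = slash ? slash + 1 : path;
  size_t len = strlen (base);

  if (len + 1 > bufsize)
    return false;

  /* memset replaces bzero; memcpy replaces bcopy.  Note that memcpy
     takes the destination first, unlike bcopy.  */
  memset (buf, 0, bufsize);
  memcpy (buf, base, len);
  return true;
}
```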

Machine-independent files may contain conditionals on features of a particular system, but should never contain conditionals such as #ifdef __hpux__ on the name or version of a particular system. Exceptions may be made to this on a release branch late in the release cycle, to reduce the risk involved in fixing a problem that only shows up on one particular system.

Function prototypes for extern functions should only occur in header files. Functions should be ordered within source files to minimize the number of function prototypes, by defining them before their first use. Function prototypes should only be used when necessary, to break mutually recursive cycles.
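The ordering rule can be sketched as follows. All function names here are hypothetical. The only prototype is the one needed to break the mutual recursion between is_even and is_odd; count_even needs none because it is defined after the functions it calls.

```cpp
/* A prototype is needed here only because is_even and is_odd are
   mutually recursive.  */
static bool is_odd (unsigned int n);

static bool
is_even (unsigned int n)
{
  return n == 0 ? true : is_odd (n - 1);
}

static bool
is_odd (unsigned int n)
{
  return n == 0 ? false : is_even (n - 1);
}

/* count_even is defined after the helpers it calls, so no further
   prototypes are required.  */
static unsigned int
count_even (const unsigned int *values, unsigned int count)
{
  unsigned int result = 0;
  for (unsigned int i = 0; i < count; i++)
    if (is_even (values[i]))
      result++;
  return result;
}
```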

Makefiles

touch should never be used in GCC Makefiles. Instead of touch foo always use $(STAMP) foo.

Testsuite Conventions

Every language or library feature, whether standard or a GNU extension, and every warning GCC can give, should have testcases thoroughly covering both its specification and its implementation. Every bug fixed should have a testcase to detect if the bug recurs.

The testsuite READMEs discuss the requirement to use abort () for runtime failures and exit (0) for success. For compile-time tests, a trick taken from autoconf may be used to evaluate expressions: a declaration extern char x[(EXPR) ? 1 : -1]; will compile successfully if and only if EXPR is nonzero.
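The two techniques might be combined in a testcase roughly as sketched below. The function under test (twice) and the other identifiers are hypothetical; the sketch omits main and any DejaGnu directives a real testcase would carry.

```cpp
#include <limits.h>
#include <stdlib.h>

/* Compile-time check: the array bound is -1, and compilation fails,
   unless the expression is nonzero.  */
extern char long_long_is_wide_enough
  [(sizeof (long long) * CHAR_BIT >= 64) ? 1 : -1];

/* Hypothetical function under test.  */
static int
twice (int x)
{
  return x * 2;
}

/* Runtime check: call abort () on failure.  A real testcase would do
   this from main and finish with exit (0).  */
static void
check_twice (void)
{
  if (twice (21) != 42)
    abort ();
}
```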

Where appropriate, testsuite entries should include comments giving their origin: the people who added them or submitted the bug report they relate to, possibly with a reference to a PR in our bug tracking system. There are some copyright guidelines on what can be included in the testsuite.

If a testcase itself is incorrect, but there's a possibility that an improved testcase might fail on some platform where the incorrect testcase passed, the old testcase should be removed and a new testcase (with a different name) should be added. This helps automated regression-checkers distinguish a true regression from an improvement to the testsuite.

Diagnostics Conventions

Spelling, terminology and markup

The following conventions of spelling and terminology apply throughout GCC, including the manuals, web pages, diagnostics, comments, and (except where they require spaces or hyphens to be used) function and variable names, although consistency in user-visible documentation and diagnostics is more important than that in comments and code. The following table lists some simple cases:

Use... ...instead of Rationale
American spelling (in particular -ize, -or) British spelling (in particular -ise, -our)
"32-bit" (adjective) "32 bit"
"alphanumeric" "alpha numeric"
"back end" (noun) "back-end" or "backend"
"back-end" (adjective) "back end" or "backend"
"bit-field" "bit field" or "bitfield" Spelling used in C and C++ standards
"built-in" as an adjective ("built-in function") or "built in" "builtin" "builtin" isn't a word
"bug fix" (noun) or "bug-fix" (adjective) "bugfix" or "bug-fix" "bugfix" isn't a word
"ColdFire" "coldfire" or "Coldfire"
"command-line option" "command line option"
"compilation time" (noun); how long it takes to compile the program "compile time"
"compile time" (noun), "compile-time" (adjective); the time at which the program is compiled
"dependent" (adjective), "dependence", "dependency" "dependant", "dependance", "dependancy"
"enumerated" "enumeral" Terminology used in C and C++ standards
"epilogue" "epilog" Established convention
"execution time" (noun); how long it takes the program to run "run time" or "runtime"
"file name" "filename"
"floating-point" (adjective) "floating point"
"free software" or just "free" "Open Source" or "OpenSource"
"front end" (noun) "front-end" or "frontend"
"front-end" (adjective) "front end" or "frontend"
"GNU/Linux" (except in reference to the kernel) "Linux" or "linux" or "Linux/GNU"
"link time" (noun), "link-time" (adjective); the time at which the program is linked
"lowercase" "lower case" or "lower-case"
"H8S" "H8/S"
"Microsoft Windows" "Windows"
"MIPS" "Mips" or "mips"
"nonzero" "non-zero" or "non zero"
"null character" "zero character"
"Objective-C" "Objective C"
"prologue" "prolog" Established convention
"PowerPC" "powerpc", "powerPC" or "PowerPc"
"Red Hat" "RedHat" or "Redhat"
"return type" (noun), "return value" (noun) "return-type", "return-value"
"run time" (noun), "run-time" (adjective); the time at which the program is run "runtime"
"runtime" (both noun and adjective); libraries and system support present at run time "run time", "run-time"
"SPARC" "Sparc" or "sparc"
"testcase", "testsuite" "test-case" or "test case", "test-suite" or "test suite"
"uppercase" "upper case" or "upper-case"
"VAX", "VAXen", "MicroVAX" "vax" or "Vax", "vaxen" or "vaxes", "microvax" or "microVAX"

"GCC" should be used for the GNU Compiler Collection, both generally and as the GNU C Compiler in the context of compiling C; "G++" for the C++ compiler; "gcc" and "g++" (lowercase), marked up with @command when in Texinfo, for the commands for compilation when the emphasis is on those; "GNU C" and "GNU C++" for language dialects; and try to avoid the older term "GNU CC".

Use a comma after "e.g." or "i.e." if and only if it is appropriate in the context and the slight pause a comma means helps the reader; do not add them automatically in all cases just because some style guides say so. (In Texinfo manuals, @: should be used after "e.g." and "i.e." when a comma isn't used.)

In Texinfo manuals, Texinfo 4.0 features may be used, and should be used where appropriate. URLs should be marked up with @uref; email addresses with @email; command-line options with @option; names of commands with @command; environment variables with @env. NULL should be written as @code{NULL}. Tables of contents should come just after the title page; printed manuals will be formatted (for example, by make dvi) using texi2dvi which reruns TeX until cross-references stabilize, so there is no need for a table of contents to go at the end for it to have correct page numbers. The @refill feature is obsolete and should not be used. All manuals should use @dircategory and @direntry to provide Info directory information for install-info.

It is useful to read the Texinfo manual. Some general Texinfo style issues discussed in that manual should be noted:

Upstream packages

Some files and packages in the GCC source tree are imported from elsewhere, and we want to minimize divergence from their upstream sources. The following files should be updated only according to the rules set below:

C and C++ Language Conventions

The following conventions apply to both C and C++.

Compiler Options

The compiler must build cleanly with -Wall -Wextra.

Language Use

Assertions

Code should use gcc_assert (EXPR) to check invariants. Use gcc_unreachable () to mark places that should never be reachable (such as an unreachable default case of a switch). Do not use gcc_assert (0) for such purposes, as gcc_unreachable gives the compiler more information. The assertions are enabled unless explicitly configured off with --enable-checking=none. Do not use abort. User input should never be validated by either gcc_assert or gcc_unreachable. If the checks are expensive or the compiler can reasonably carry on after the error, they may be conditioned on --enable-checking by using gcc_checking_assert.
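A sketch of how the two macros are typically used follows. So that it compiles outside the GCC tree, it defines simple stand-ins for gcc_assert and gcc_unreachable; these stand-ins are assumptions made for illustration and are not GCC's actual definitions. The enumeration and function are likewise hypothetical.

```cpp
#include <stdlib.h>

/* Stand-ins so that the sketch compiles on its own.  These are NOT
   GCC's definitions; inside GCC, use the macros GCC provides.  */
#define gcc_assert(EXPR) ((void) ((EXPR) ? 0 : (abort (), 0)))
#define gcc_unreachable() (abort ())

/* Hypothetical enumeration, for illustration only.  */
enum shape_kind { SHAPE_SQUARE, SHAPE_TRIANGLE };

static int
perimeter (enum shape_kind kind, int side_length)
{
  /* An internal invariant, not a check of user input.  */
  gcc_assert (side_length > 0);

  switch (kind)
    {
    case SHAPE_SQUARE:
      return 4 * side_length;
    case SHAPE_TRIANGLE:
      return 3 * side_length;
    default:
      /* Use gcc_unreachable (), not gcc_assert (0).  */
      gcc_unreachable ();
    }
}
```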

Character Testing

Code testing properties of characters from user source code should use macros such as ISALPHA from safe-ctype.h instead of the standard functions such as isalpha from <ctype.h> to avoid any locale-dependency of the language accepted.

Error Node Testing

Testing for ERROR_MARKs should be done by comparing against error_mark_node rather than by comparing the TREE_CODE against ERROR_MARK; see message.

Parameters Affecting Generated Code

Internal numeric parameters that may affect generated code should be controlled by --param rather than being hardcoded.

Inlining Functions

Inline functions only when you have reason to believe that the expansion of the function is smaller than a call to the function or that inlining is significant to the run-time of the compiler.

Formatting Conventions

Line Length

Lines shall be at most 80 columns.

Names

Macro names should be in ALL_CAPS when it's important to be aware that it's a macro (e.g. accessors and simple predicates), but in lowercase (e.g., size_int) where the macro is a wrapper for efficiency that should be considered as a function; see messages 1 and 2.

Other names should be lowercase and separated by low_lines.

Expressions

Code in GCC should use the following formatting conventions:

For Use... ...instead of
logical not !x ! x
bitwise complement ~x ~ x
unary minus -x - x
cast (type) x (type)x
pointer cast (type *) x (type*)x
pointer return type type *f (void) type* f (void)
pointer dereference *x * x

C++ Language Conventions

The following conventions apply only to C++.

These conventions will change over time, but changing them requires a convincing rationale.

Language Use

C++ is a complex language, and we strive to use it in a manner that is not surprising. So, the primary rule is to be reasonable. Use a language feature in known good ways. If you need to use a feature in an unusual way, or a way that violates the "should" rules below, seek guidance, review and feedback from the wider community.

All use of C++ features is subject to the decisions of the maintainers of the relevant components. (This restates something that is always true for gcc, which is that component maintainers make the final decisions about those components.)

Variable Definitions

Variables should be defined at the point of first use, rather than at the top of the function. The existing code obviously does not follow that rule, so variables may be defined at the top of the function, as in C90.

Variables may be simultaneously defined and tested in control expressions.
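Both points can be sketched with a small helper. The function key_length is hypothetical and exists only to illustrate the style.

```cpp
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the length of the key in a "key=value"
   string, or the length of the whole string if there is no '='.  */
static size_t
key_length (const char *option)
{
  /* EQUALS is defined and tested in the condition; its scope is
     limited to the if statement.  */
  if (const char *equals = strchr (option, '='))
    return static_cast<size_t> (equals - option);

  /* LEN is defined at its first use rather than at the top of the
     function.  */
  size_t len = strlen (option);
  return len;
}
```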

Rationale and Discussion

Struct Definitions

Some coding conventions, including GCC's own in the past, recommend using the struct keyword (also known as the class-key) for plain old data (POD) types. However, since the POD concept has been replaced in C++ by a set of much more nuanced distinctions, the current guidance (though not a requirement) is to use the struct class-key when defining structures that could be used without change in C, and use class for all other classes. It is recommended to use the same class-key consistently in all declarations and, if necessary, in uses of the class. The -Wmismatched-tags warning option helps detect mismatches. The -Wredundant-tags GCC option further helps identify places where the class-key can safely be omitted.
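A minimal sketch of the distinction, using hypothetical type names: source_point could be compiled unchanged as C, so it uses struct, whereas line_counter has a constructor, member functions and private data, so it uses class.

```cpp
/* Could be used without change in C: struct class-key.  */
struct source_point
{
  int line;
  int column;
};

/* Has a constructor, member functions and private data: class
   class-key.  The default copy constructor, copy assignment operator
   and destructor are intended.  */
class line_counter
{
public:
  line_counter ();
  void note_line ();
  int count () const;

private:
  int m_count;
};

line_counter::line_counter () : m_count (0)
{
}

void
line_counter::note_line ()
{
  m_count++;
}

int
line_counter::count () const
{
  return m_count;
}
```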

Rationale and Discussion

Class Definitions

See the guidance in Struct Definitions for the suggested choice of a class-key.

A class defined with the class-key class type will often (but not always) have a declaration of a special member function. If any one of these is declared, then all should be either declared or have an explicit comment saying that the default is intended.
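As an illustration, the hypothetical class below needs a destructor because it owns memory, so it declares the copy constructor and copy assignment operator as well rather than leaving them implicit. The class name and its use of new and delete are assumptions made for the sketch.

```cpp
#include <string.h>

/* Hypothetical class owning a heap-allocated copy of a string.  */
class owned_name
{
public:
  explicit owned_name (const char *name);
  owned_name (const owned_name &other);
  owned_name &operator= (const owned_name &other);
  ~owned_name ();

  const char *get () const;

private:
  static char *duplicate (const char *name);

  char *m_name;
};

char *
owned_name::duplicate (const char *name)
{
  char *copy = new char[strlen (name) + 1];
  strcpy (copy, name);
  return copy;
}

owned_name::owned_name (const char *name) : m_name (duplicate (name))
{
}

owned_name::owned_name (const owned_name &other)
: m_name (duplicate (other.m_name))
{
}

owned_name &
owned_name::operator= (const owned_name &other)
{
  if (this != &other)
    {
      /* Copy first, so that M_NAME stays valid until the copy exists.  */
      char *copy = duplicate (other.m_name);
      delete[] m_name;
      m_name = copy;
    }
  return *this;
}

owned_name::~owned_name ()
{
  delete[] m_name;
}

const char *
owned_name::get () const
{
  return m_name;
}
```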

Single inheritance is permitted. Use public inheritance to describe interface inheritance, i.e. 'is-a' relationships. Use private and protected inheritance to describe implementation inheritance. Implementation inheritance can be expedient, but think twice before using it in code intended to last a long time.

Complex hierarchies are to be avoided. Take special care with multiple inheritance. On the rare occasion that using multiple inheritance is indeed useful, prepare design rationales in advance, and take special care to make documentation of the entire hierarchy clear. (In particular, multiple inheritance can be an acceptable way of combining "traits"-style classes that only contain static member functions. Its use with data-carrying classes is more problematic.)

Think carefully about the size and performance impact of virtual functions and virtual bases before using them.

Prefer to make data members private.

Rationale and Discussion

Constructors and Destructors

All constructors should initialize data members in the member initializer list rather than in the body of the constructor.

A class with virtual functions or virtual bases should have a virtual destructor.
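Both rules are sketched below with hypothetical classes. The base class has a virtual function and therefore a virtual destructor, so deleting a derived object through a base pointer is well defined; the derived class sets its data member in the member initializer list.

```cpp
/* Hypothetical interface with a virtual function, so it has a virtual
   destructor.  */
class reporter
{
public:
  reporter ();
  virtual ~reporter ();
  virtual void report (const char *text) = 0;
};

reporter::reporter ()
{
}

reporter::~reporter ()
{
}

class counting_reporter : public reporter
{
public:
  counting_reporter ();
  virtual void report (const char *text);
  int count () const;

private:
  int m_count;
};

/* m_count is set in the member initializer list, not in the body.  */
counting_reporter::counting_reporter () : reporter (), m_count (0)
{
}

void
counting_reporter::report (const char *)
{
  m_count++;
}

int
counting_reporter::count () const
{
  return m_count;
}
```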

Rationale and Discussion

Conversions

Single argument constructors should nearly always be declared explicit.

Conversion operators should be avoided.
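A short sketch with a hypothetical wrapper class: the constructor is explicit, so an int cannot silently become a reg_number, and a named accessor takes the place of a conversion operator.

```cpp
/* Hypothetical wrapper around a register number.  */
class reg_number
{
public:
  /* explicit: an int must not silently become a reg_number.  */
  explicit reg_number (int n);

  /* A named accessor instead of "operator int () const".  */
  int value () const;

private:
  int m_value;
};

reg_number::reg_number (int n) : m_value (n)
{
}

int
reg_number::value () const
{
  return m_value;
}

/* Callers must write is_valid (reg_number (3)); is_valid (3) does not
   compile.  */
static bool
is_valid (reg_number reg)
{
  return reg.value () >= 0;
}
```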

Rationale and Discussion

Overloading Functions

Overloading functions is permitted, but take care to ensure that overloads are not surprising, i.e. semantically identical or at least very similar. Virtual functions should not be overloaded.

Rationale and Discussion

Overloading Operators

Overloading operators is permitted, but take care to ensure that overloads are not surprising. Some unsurprising uses are in the implementation of numeric types and in following the C++ Standard Library's conventions. In addition, overloaded operators, excepting the call operator, should not be used for expensive implementations.

Rationale and Discussion

Note: in declarations of operator functions or in invocations of such functions that involve the keyword operator, the full name of the operator should be considered as including the keyword with no spaces in between the keyword and the operator token. Thus, the expected format of a declaration of an operator is

    T &operator== (const T &, const T &);

and not

    T &operator == (const T &, const T &);

(with the space between operator and ==).

Default Arguments

Default arguments are another type of function overloading, and the same rules apply. Default arguments must always be POD values, i.e. may not run constructors. Virtual functions should not have default arguments.
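For example, the hypothetical function below has a default argument that is a plain integer value, so no constructor runs at the call site. It behaves like a pair of overloads, round_up (int) and round_up (int, int), with very similar semantics.

```cpp
/* Hypothetical helper: round VALUE up to a multiple of ALIGN.  The
   default argument is a plain int value.  */
static int
round_up (int value, int align = 8)
{
  return (value + align - 1) / align * align;
}
```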

Rationale and Discussion

Inlining Functions

Constructors and destructors, even those with empty bodies, are often much larger than programmers expect. Prefer non-inline versions unless you have evidence that the inline version is smaller or has a significant performance impact.

Templates

To avoid excessive compiler size, consider implementing non-trivial templates on a non-template base class with void* parameters.
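One way to read this advice is sketched below, with hypothetical class names and a fixed capacity chosen only to keep the example short. The non-trivial code lives once in a non-template base that works on void *; the template adds only trivial, type-safe wrappers, so each instantiation generates very little code.

```cpp
#include <stddef.h>

/* Non-template base: compiled once, works on untyped pointers.  */
class ptr_stack_base
{
public:
  size_t length () const;

protected:
  ptr_stack_base ();
  bool push_ptr (void *ptr);
  void *pop_ptr ();

private:
  enum { capacity = 16 };
  void *m_slots[capacity];
  size_t m_length;
};

ptr_stack_base::ptr_stack_base () : m_slots (), m_length (0)
{
}

size_t
ptr_stack_base::length () const
{
  return m_length;
}

bool
ptr_stack_base::push_ptr (void *ptr)
{
  if (m_length == capacity)
    return false;
  m_slots[m_length++] = ptr;
  return true;
}

void *
ptr_stack_base::pop_ptr ()
{
  return m_length ? m_slots[--m_length] : 0;
}

/* Thin template wrapper: each instantiation adds only casts.  Private
   inheritance, because this is implementation inheritance.  */
template <typename T>
class ptr_stack : private ptr_stack_base
{
public:
  using ptr_stack_base::length;
  bool push (T *ptr);
  T *pop ();
};

template <typename T>
bool
ptr_stack<T>::push (T *ptr)
{
  return push_ptr (ptr);
}

template <typename T>
T *
ptr_stack<T>::pop ()
{
  return static_cast<T *> (pop_ptr ());
}
```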

Namespaces

Namespaces are encouraged. All separable libraries should have a unique global namespace. All individual tools should have a unique global namespace. Nested include directory names should map to nested namespaces when possible.

Header files should have neither using directives nor namespace-scope using declarations.

Rationale and Discussion

RTTI and dynamic_cast

Run-time type information (RTTI) is permitted when certain non-default --enable-checking options are enabled, so as to allow checkers to report dynamic types. However, by default, RTTI is not permitted and the compiler must build cleanly with -fno-rtti.

Rationale and Discussion

Other Casts

C-style casts should not be used. Instead, use C++-style casts.
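A brief sketch with hypothetical helpers: each C++-style cast names the kind of conversion being performed, which a C-style cast does not.

```cpp
static int
low_byte (long value)
{
  /* static_cast for a value conversion, rather than
     "(int) (value & 0xff)".  */
  return static_cast<int> (value & 0xff);
}

static char *
writable (const char *text)
{
  /* const_cast makes the removal of const explicit and easy to
     search for.  The caller must ensure TEXT is really writable.  */
  return const_cast<char *> (text);
}
```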

Rationale and Discussion

Exceptions

Exceptions and throw specifications are not permitted and the compiler must build cleanly with -fno-exceptions.

Rationale and Discussion

The Standard Library

Use of the standard library is permitted. Note, however, that it is currently not usable with garbage collected data.

Compiler messages, and indeed any text that needs i18n, should continue to use the existing facilities.

For long-term code, at least for now, we will continue to use printf style I/O rather than <iostream> style I/O.

Rationale and Discussion

Formatting Conventions

Names

When structs and/or classes have member functions, prefer to name data members with a leading m_ and static data members with a leading s_.

Template parameter names should use CamelCase, following the C++ Standard.
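The naming rules can be sketched with a hypothetical class template: the template parameter is in CamelCase, the data member has a leading m_, and the static data member has a leading s_.

```cpp
template <typename ValueType>
class tracked_value
{
public:
  explicit tracked_value (ValueType value);
  ValueType get () const;
  static int num_created ();

private:
  ValueType m_value;     /* Data member: leading m_.  */
  static int s_created;  /* Static data member: leading s_.  */
};

template <typename ValueType>
int tracked_value<ValueType>::s_created = 0;

template <typename ValueType>
tracked_value<ValueType>::tracked_value (ValueType value) : m_value (value)
{
  s_created++;
}

template <typename ValueType>
ValueType
tracked_value<ValueType>::get () const
{
  return m_value;
}

template <typename ValueType>
int
tracked_value<ValueType>::num_created ()
{
  return s_created;
}
```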

Rationale and Discussion

Struct Definitions

Note that the rules for classes do not apply to structs. Structs continue to behave as before.

Class Definitions

If the entire class definition fits on a single line, put it on a single line. Otherwise, use the following rules.

Do not indent protection labels.

Indent class members by two spaces.

Prefer to put the entire class head on a single line.


class gnuclass : base {

Otherwise, start the colon of the base list at the beginning of a line.


class a_rather_long_class_name 
: with_a_very_long_base_name, and_another_just_to_make_life_hard
{ 
  int member; 
};

If the base clause exceeds one line, move the overflowing base specifiers to the next line and indent by two spaces.


class gnuclass 
: base1 <template_argument1>, base2 <template_argument1>,
  base3 <template_argument1>, base4 <template_argument1>
{ 
  int member; 
};

When defining a class,

Semantic constraints may require a different declaration order, but seek to minimize the potential confusion.

Close a class definition with a right brace, semicolon, optional closing comment, and a new line.


}; // class gnuclass

Class Member Definitions

Define all members outside the class definition. That is, there are no function bodies or member initializers inside the class definition.

Prefer to put the entire member head on a single line.


gnuclass::gnuclass () : base_class ()
{
  ...
}

When that is not possible, place the colon of the initializer clause at the beginning of a line.


gnuclass::gnuclass ()
: base1 (), base2 (), member1 (), member2 (), member3 (), member4 ()
{
  ...
}

If the initializer clause exceeds one line, move overflowing initializers to the next line and indent by two spaces.


gnuclass::gnuclass ()
: base1 (some_expression), base2 (another_expression),
  member1 (my_expressions_everywhere)
{
  ...
}

If a C++ function name is long enough to cause the first function parameter with its type to exceed 80 characters, it should appear on the next line indented four spaces.


void
very_long_class_name::very_long_function_name (
    very_long_type_name arg)
{

Sometimes the class qualifier and function name together exceed 80 characters. In this case, break the line before the :: operator. We may wish to do so pre-emptively for all class member functions.


void
very_long_template_class_name <with, a, great, many, arguments>
::very_long_function_name (
    very_long_type_name arg)
{

Templates

A declaration following a template parameter list should not have additional indentation.

Prefer typename over class in template parameter lists.
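A minimal sketch of both rules, using a hypothetical function template: the declaration that follows the template parameter list starts in the same column, and the parameter is introduced with typename.

```cpp
/* Hypothetical function template, for illustration only.  */
template <typename T>
T
max_of (T a, T b)
{
  return a < b ? b : a;
}
```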

Extern "C"

Prefer an extern "C" block to a declaration qualifier.

Open an extern "C" block with the left brace on the same line.


extern "C" {

Close an extern "C" block with a right brace, optional closing comment, and a new line.


} // extern "C"

Definitions within the body of an extern "C" block are not indented.
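Put together, a complete block might look like the sketch below. The function add_one is hypothetical; note that its definition is not indented inside the block.

```cpp
extern "C" {

/* Hypothetical function with C linkage.  */
int
add_one (int x)
{
  return x + 1;
}

} // extern "C"
```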

Namespaces

Open a namespace with the namespace name followed by a left brace and a new line.


namespace gnutool {

Close a namespace with a right brace, optional closing comment, and a new line.


} // namespace gnutool

Definitions within the body of a namespace are not indented.

Lambdas

There should be a space between the lambda-introducer and the parameter list, if any.

Lambdas that do not outlive their enclosing function should typically use [&] implicit capture.

auto l = [&] (tree arg) { ... };

If a lambda does not fit on one line, the left brace should be indented like the body of a for-statement.

auto l = [&] (tree arg)
  {
    ...
  };

This also applies if the lambda is the last argument, and only lambda argument, to a function.

std::for_each (start, end, [&] (tree arg)
  {
    ...
  });

To get the above behavior from GNU Emacs CC Mode, you can add this to your .emacs:

(defun lambda-offset (elem)
  "If the opening brace of a lambda is on a new line, indent it one step."
  (if (assq 'inline-open c-syntactic-context) '+ 0))
(add-hook 'c++-mode-hook
	  '(lambda () (c-set-offset 'inlambda 'lambda-offset)))

If the multi-line lambda is not the last argument, or there are multiple lambda arguments, you are encouraged to make them local variables, as the l examples above. If you do pass them directly, they should be indented like other parameters.

my_algo (start, end,
	 [&] (tree arg)
           {
             thing one...
           },
	 [&] (tree arg)
           {
             thing two...
           });

See also the GDB coding standards.

Python Language Conventions

Python scripts should follow PEP 8 – Style Guide for Python Code which can be verified by the flake8 tool. We recommend using the following flake8 plug-ins: