SSA for Trees

Table of Contents


Latest News

2004-05-13

The branch is now closed. Tree SSA has been merged into mainline.

2004-05-03

Merge status: http://gcc.gnu.org/ml/gcc/2004-05/msg00075.html.

2004-04-05

Merge status: http://gcc.gnu.org/ml/gcc/2004-04/msg00217.html.

2004-03-12

Merge status: http://gcc.gnu.org/ml/gcc/2004-03/msg00596.html.

2004-02-25

The branch will be merged into mainline to become part of the next release of GCC. As such, the branch is now closed to new development.

The stabilization process will be driven by the merge criteria stated in http://gcc.gnu.org/ml/gcc/2004-02/msg01294.html. We expect to merge the branch into mainline by mid to late April.


Introduction

The goal of this project is to build an optimization framework for trees based on the Static Single Assignment (SSA) form [1]. The implementation currently lives in the tree-ssa-20020619-branch branch.


Documentation

API documentation for the Tree SSA infrastructure is generated from the source code using Doxygen. The documentation is updated daily.

Internal design documentation is also available in the texinfo documentation (doc/gccint.info in GCC's build directory).

A high-level overview of GENERIC/GIMPLE and the SSA implementation may be found in the Proceedings of the 2003 GCC Developers Summit (pages 171-193).

Presentation slides and other documents for Tree SSA are also available.


Contributing

Checkout the tree-ssa-20020619-branch branch following the instructions found in the CVS documentation.

All contributions to the branch must also comply with the requirements detailed in the contributing documentation.

When posting to the development lists, please mark messages and patches with [tree-ssa] in the subject. As this is a branch, and not the mainline, the usual maintainer rules do not apply. This branch is maintained by Diego Novillo <dnovillo@redhat.com>. Approval from the usual maintainers will be needed when submitting patches from the branch onto the mainline.

Current contributors to this project include Daniel Berlin, Steven Bosscher, Paul Brook, Zdenek Dvorak, Frank Eigler, Ben Elliston, Andrew Haley, Richard Henderson, Graydon Hoare, Jan Hubicka, Aldy Hernandez, Andreas Jaeger, Jeff Law, Andrew MacLeod, Toon Moene, Jason Merrill, Diego Novillo, Sebastian Pop, Graham Stott and Jeff Sturm.


Branch stability

Every patch submitted for review must either fix a PR or adress one of the issues mentioned in the merge criteria document.

Patches that break default bootstraps or introduce new regressions will be removed (if a fix is not immediately obvious).

There are regular mainline merges about 2 or 3 times a month. The latest merge tag is always added to GCC's version string. A merge may be postponed if there is major breakage in mainline.


GENERIC and GIMPLE

While GCC trees contain sufficient information for implementing SSA, there are two major problems that make this difficult:

To address the first problem, we have created a common tree representation called GENERIC that is meant to be able to represent all the constructs needed by the different front ends while removing all the language dependencies. The design of GENERIC was discussed on the GCC lists. One of the main threads of discussion started with http://gcc.gnu.org/ml/gcc/2002-07/msg00890.html. A description of the current design can be found at http://gcc.gnu.org/ml/gcc/2002-08/msg01397.html.

To address the complexity problem we have implemented a new simplified intermediate representation based on GENERIC. The IR, called GIMPLE, is a very simple C-like three-address language that looks pretty straightforward to analyze and keeps all the high-level attributes of data types. GIMPLE is derived from the SIMPLE representation proposed by the McCAT project out of McGill University [2].

The data structures are the same trees used by GCC, but we impose rules on how the trees can be combined. For instance, the initial GENERIC representation for the expression:

a = b + c - d;

generates a single tree for the whole assignment statement. The GIMPLE version of that expression generates 2 different trees:

t1 = b + c;
a = t1 - d;

So, when examining expressions we can now assume that a PLUS operation will always have exactly two operands that are variables. This also exposes other opportunities like finding common expressions to eliminate (although it might also lead to code bloating, so we need to be careful). This new pass was discussed at length on the GCC lists, starting with http://gcc.gnu.org/ml/gcc/2002-01/msg00082.html.

The conversion from GENERIC into GIMPLE trees is implemented in gimplify.c. Additionally, each front end may have a set of language-specific helpers. For instance, the C/Objective-C front ends contain the helper functions in c-simplify.c, the C++ front end has its own in cp/cp-simplify.c. Predicates to determine whether a tree is in GIMPLE form are defined in tree-simple.[ch].


SSA implementation

Having trees in GIMPLE form enables language-independent analysis and transformation passes. Currently, we are implementing an SSA pass based on the algorithms described by Cytron et. al. [1].

The graph below describes the process:

The front ends described in the graph are just an example. In general, any front end that can emit functions-as-trees can be converted to emit GENERIC trees.

Conversion to SSA form is a three step process driven from tree-optimize.c:

  1. Convert the function into GIMPLE form. Implemented in gimplify.c and c-simplify.c.
  2. Find variable references in the code. Implemented in tree-dfa.c.
  3. Build a control-flow graph (CFG). Implemented in tree-cfg.c. This implementation uses the same basic_block structure used by the RTL optimizers. This allows us to share most of the existing CFG code.
  4. Rewrite the tree in SSA form. Implemented in tree-ssa.c.

Unparsing C trees

The file tree-pretty-print.c implements several debugging functions that given a GENERIC tree node, they print a C representation of the tree. The output is not meant to be compilable, but it is of great help when debugging transformations done by the transformation passes.


Tree Browser

For debugging, browsing, discovering, and playing with trees you can use the Tree Browser directly from gdb.

Implementation Status

This is a short list of the work that has already been finished or is ongoing.

Lowering of trees

Lowering from the language specific tree representations for C, C++, Java, Fortran 95, and Java bytecodes to GENERIC has been implemented. A more or less language-independent pass to lower from GENERIC to GIMPLE has also been implemented.

Into/out of SSA tree rewriting

The CFG builder has been in place for a while now, and most of the work that is done on it is tuning and speeding it up where possible. Most of the DFA infrastructure is in place. Type-based alias analysis has been implemented, but it needs some work to be fast enough. An initial implementation of Andersen Points-to analysis is also available. Rewriting the tree into SSA form is finished. Writing out of SSA form can now handle variables with overlapping live ranges. This means that most of the basic infrastructure is in place.

SSA Optimizations

The following optimization passes have been implemented to date:

All passes are enabled by default at -O1 and better.

Testing framework

Work on a framework for validation of the optimization passes has only just started. All analysis and optimization passes in the tree-ssa framework have the ability to dump an annotated intermediate representation. Support for scanning these tree dumps has been implemented for the existing DejaGNU testing framework.


TODO list

This is a loosely organized list of unimplemented features, possible improvement, and planned analyses and optimizations. Suggestions for other passes and volunteers to help finish the different passes are welcome.

Medium and long term goals

Infrastructure

Find and fix integration deficiencies
We have to integrate all the different passes to make one run after the other efficiently. The integration work done so far has uncovered many bugs and inefficiencies in the way we build and maintain DFA/SSA information, and as more passes are implemented, we will probably find some more deficiencies that need to be addressed.
Tune RTL expanders to the subset of trees used by GIMPLE
This is best illustrated with an example. In GIMPLE all loops are LOOP_EXPRs, a kind of tree node that was not produced by the C front end. A simple patch reduced the number of INSNs for a loop header from 16 to 12. It is very likely that other improvements like this are possible.
Analyse interactions between the tree and RTL optimizers
In the current implementation, no low-level (RTL) optimizations have been disabled yet, while the new infrastructure is already enabled. So, there now is a new stage in the compile pipeline but nothing has been taken out yet. In theory, the tree optimizers should make some of the optimizations we try to do on RTL redundant for the front ends that already use the tree-ssa infrastructure. But how exactly these two optimization infrastructures interact is not well understood yet.
Extend the testing framework
More analysis is required for a complete testing framework. The framework will have to be updated to allow it to verify things like correct/optimal code motions.
Add test cases and consistency checks
Test cases that prove the correctness and efficiency of the optimization passes are especially welcome.

Analyses

SSA information for arrays
The existing implementation treats arrays as an opaque object. A definition to an array location is treated as a definition for the whole array.
Pointer analysis (aliasing)
Dan Berlin is working on Andersen Points-to analysis. Unfortunately we cannot implement Steensgaard analysis due to patent issues.

Basic scalar transformations

GVN (Global Value Numbering)
Value numbering is used to detect equivalent expressions to eliminate redundancies. It assigns symbolic values to expressions such that if two expression have the same symbolic value, they compute the same value.
VRP (Value Range Propagation)[6]
This optimization tracks the weighted value ranges of variables through a program, much like constant propagation. These value ranges may be either numeric or symbolic in nature. A paper describing the algorithm, as well as some applications and experimental results, is available at http://www.pattosoft.com.au/jason/Papers/ValueRangeProp/.

Control transformations

Loop nest optimizations
The lno-branch contains preliminary patches that aim at implementing tree level loop optimizations.

References

[1]
R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and K. Zadeck. Efficiently Computing Static Single Assignment Form and the Control Dependence Graph. ACM Transactions on Programming Languages and Systems, 13(4): 451-490, October 1991.
[2]
L. Hendren, C. Donawa, M. Emami, G. Gao, Justiani, and B. Sridharan. Designing the McCAT compiler based on a family of structured intermediate representations. In Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing, pages 406-420. Lecture Notes in Computer Science, no. 457, Springer-Verlag, August 1992.
[3]
Robert Morgan. Building an Optimizing Compiler, Butterworth-Heinemann, 1998.
[4]
Mark N. Wegman and F. Kenneth Zadeck. Constant Propagation with Conditional Branches. ACM Transactions on Programming Languages and Systems, 13(2): 181-210, April 1991.
[5]
Robbert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow. Partial Redundancy Elimination in SSA Form. ACM Transactions on Programming Languages and Systems, 21(3): 627-676, 1999.
[6]
Jason R. C. Patterson. Accurate Static Branch Prediction by Value Range Propagation. Proceedings of the ACM SIGPLAN '95 Conference on Programming Language Design and Implementation, pages 67-78, June 1995.