Static single assignment form

In compiler design, static single assignment form is a property of an intermediate representation, which requires that each variable be assigned exactly once, and every variable be defined before it is used. Existing variables in the original IR are split into versions, new variables typically indicated by the original name with a subscript in textbooks, so that every definition gets its own version. In SSA form, use-def chains are explicit and each contains a single element.
SSA was proposed by Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck in 1988. Ron Cytron, Jeanne Ferrante and the previous three researchers at IBM developed an algorithm that can compute the SSA form efficiently.
One can expect to find SSA in a compiler for Fortran, C or C++, whereas in functional language compilers, such as those for Scheme and ML, continuation-passing style is generally used. SSA is formally equivalent to a well-behaved subset of CPS excluding non-local control flow, which does not occur when CPS is used as intermediate representation. So optimizations and transformations formulated in terms of one immediately apply to the other.

Benefits

The primary usefulness of SSA comes from how it simultaneously simplifies and improves the results of a variety of compiler optimizations, by simplifying the properties of variables. For example, consider this piece of code:
y := 1
y := 2
x := y
Humans can see that the first assignment is not necessary, and that the value of y being used in the third line comes from the second assignment of y. A program would have to perform reaching definition analysis to determine this. But if the program is in SSA form, both of these are immediate:
y₁ := 1
y₂ := 2
x₁ := y₂
Compiler optimization algorithms which are either enabled or strongly enhanced by the use of SSA include:

Constant propagation
Value range propagation
Sparse conditional constant propagation
Dead code elimination
Global value numbering
Partial redundancy elimination
Strength reduction
Register allocation
Converting to SSA

Converting ordinary code into SSA form is primarily a simple matter of replacing the target of each assignment with a new variable, and replacing each use of a variable with the "version" of the variable reaching that point. For example, consider the following control flow graph:

Changing the name on the left hand side of "x x - 3" and changing the following uses of x to that new name would leave the program unaltered. This can be exploited in SSA by creating two new variables: x₁ and x₂, each of which is assigned only once. Likewise, giving distinguishing subscripts to all the other variables yields:

It is clear which definition each use is referring to, except for one case: both uses of y in the bottom block could be referring to either y₁ or y₂, depending on which path the control flow took.
To resolve this, a special statement is inserted in the last block, called a Φ function. This statement will generate a new definition of y called y₃ by "choosing" either y₁ or y₂, depending on the control flow in the past.

Now, the last block can simply use y₃, and the correct value will be obtained either way. A Φ function for x is not needed: only one version of x, namely x₂ is reaching this place, so there is no problem.
Given an arbitrary control flow graph, it can be difficult to tell where to insert Φ functions, and for which variables. This general question has an efficient solution that can be computed using a concept called dominance frontiers.
Φ functions are not implemented as machine operations on most machines. A compiler can implement a Φ function by inserting "move" operations at the end of every predecessor block. In the example above, the compiler might insert a move from y₁ to y₃ at the end of the middle-left block and a move from y₂ to y₃ at the end of the middle-right block. These move operations might not end up in the final code based on the compiler's register allocation procedure. However, this approach may not work when simultaneous operations are speculatively producing inputs to a Φ function, as can happen on wide-issue machines. Typically, a wide-issue machine has a selection instruction used in such situations by the compiler to implement the Φ function.
According to Kenny Zadeck, Φ functions were originally known as phony functions while SSA was being developed at IBM Research in the 1980s. The formal name of a Φ function was only adopted when the work was first published in an academic paper.

Computing minimal SSA using dominance frontiers

First, we need the concept of a dominator: we say that a node A strictly dominates a different node B in the control flow graph if it is impossible to reach B without passing through A first. This is useful, because if we ever reach B we know that any code in A has run. We say that A dominates B if either A strictly dominates B or A = B.
Now we can define the dominance frontier: a node B is in the dominance frontier of a node A if A does not strictly dominate B, but does dominate some immediate predecessor of B. From A's point of view, these are the nodes at which other control paths, which don't go through A, make their earliest appearance.
Dominance frontiers capture the precise places at which we need Φ functions: if the node A defines a certain variable, then that definition and that definition alone will reach every node A dominates. Only when we leave these nodes and enter the dominance frontier must we account for other flows bringing in other definitions of the same variable. Moreover, no other Φ functions are needed in the control flow graph to deal with A's definitions, and we can do with no less.
There is an efficient algorithm for finding dominance frontiers of each node. This algorithm was originally described in Cytron et al. 1991. Also useful is chapter 19 of the book "Modern compiler implementation in Java" by Andrew Appel. See the paper for more details.
Keith D. Cooper, Timothy J. Harvey, and Ken Kennedy of Rice University describe an algorithm in their paper titled A Simple, Fast Dominance Algorithm. The algorithm uses well-engineered data structures to improve performance. It is expressed simply as follows:
for each node b
dominance_frontier :=
for each node b
if the number of immediate predecessors of b ≥ 2
for each p in immediate predecessors of b
runner := p
while runner ≠ idom
dominance_frontier := dominance_frontier ∪
runner := idom
Note: in the code above, an immediate predecessor of node n is any node from which control is transferred to node n, and idom is the node that immediately dominates node b.

Variations that reduce the number of Φ functions

"Minimal" SSA inserts the minimal number of Φ functions required to ensure that each name is assigned a value exactly once and that each reference of a name in the original program can still refer to a unique name.
However, some of these Φ functions could be dead. For this reason, minimal SSA does not necessarily produce the fewest Φ functions that are needed by a specific procedure. For some types of analysis, these Φ functions are superfluous and can cause the analysis to run less efficiently.

Pruned SSA

Pruned SSA form is based on a simple observation: Φ functions are only needed for variables that are "live" after the Φ function. If a variable is not live, the result of the Φ function cannot be used and the assignment by the Φ function is dead.
Construction of pruned SSA form uses live variable information in the Φ function insertion phase to decide whether a given Φ function is needed. If the original variable name isn't live at the Φ function insertion point, the Φ function isn't inserted.
Another possibility is to treat pruning as a dead code elimination problem. Then, a Φ function is live only if any use in the input program will be rewritten to it, or if it will be used as an argument in another Φ function. When entering SSA form, each use is rewritten to the nearest definition that dominates it. A Φ function will then be considered live as long as it is the nearest definition that dominates at least one use, or at least one argument of a live Φ.

Semi-pruned SSA

Semi-pruned SSA form is an attempt to reduce the number of Φ functions without incurring the relatively high cost of computing live variable information. It is based on the following observation: if a variable is never live upon entry into a basic block, it never needs a Φ function. During SSA construction, Φ functions for any "block-local" variables are omitted.
Computing the set of block-local variables is a simpler and faster procedure than full live variable analysis, making semi-pruned SSA form more efficient to compute than pruned SSA form. On the other hand, semi-pruned SSA form will contain more Φ functions.

Converting out of SSA form

SSA form is not normally used for direct execution, and it is frequently used "on top of" another IR with which it remains in direct correspondence. This can be accomplished by "constructing" SSA as a set of functions which map between parts of the existing IR and its SSA counterpart. When the SSA form is no longer needed, these mapping functions may be discarded, leaving only the now-optimized IR.
Performing optimizations on SSA form usually leads to entangled SSA-Webs, meaning there are Φ instructions whose operands do not all have the same root operand. In such cases color-out algorithms are used to come out of SSA. Naive algorithms introduce a copy along each predecessor path which caused a source of different root symbol to be put in Φ than the destination of Φ. There are multiple algorithms for coming out of SSA with fewer copies, most use interference graphs or some approximation of it to do copy coalescing.

Extensions

Extensions to SSA form can be divided into two categories.
Renaming scheme extensions alter the renaming criterion. Recall that SSA form renames each variable when it is assigned a value. Alternative schemes include static single use form and static single information form.
Feature-specific extensions retain the single assignment property for variables, but incorporate new semantics to model additional features. Some feature-specific extensions model high-level programming language features like arrays, objects and aliased pointers. Other feature-specific extensions model low-level architectural features like speculation and predication.

Compilers using SSA form

SSA form is a relatively recent development in the compiler community. As such, many older compilers only use SSA form for some part of the compilation or optimization process, but most do not rely on it. Examples of compilers that rely heavily on SSA form include:

The ETH Oberon-2 compiler was one of the first public projects to incorporate "GSA", a variant of SSA.
The LLVM Compiler Infrastructure uses SSA form for all scalar register values in its primary code representation. SSA form is only eliminated once register allocation occurs, late in the compile process.
The Open64 compiler uses SSA form in its global scalar optimizer, though the code is brought into SSA form before and taken out of SSA form afterwards. Open64 uses extensions to SSA form to represent memory in SSA form as well as scalar values.
As of version 4 GCC, the GNU Compiler Collection, makes extensive use of SSA. The frontends generate "GENERIC" code which is then converted into "GIMPLE" code by the "gimplifier". High-level optimizations are then applied on the SSA form of "GIMPLE". The resulting optimized intermediate code is then translated into RTL, on which low-level optimizations are applied. The architecture-specific backends finally turn RTL into assembly language.
IBM's open source adaptive Java virtual machine, Jikes RVM, uses extended Array SSA, an extension of SSA that allows analysis of scalars, arrays, and object fields in a unified framework. Extended Array SSA analysis is only enabled at the maximum optimization level, which is applied to the most frequently executed portions of code.
In 2002, IBM's JikesRVM to run both standard Java bytecode and a typesafe SSA bytecode class files, and demonstrated significant performance benefits to using the SSA bytecode.
Oracle's HotSpot Java Virtual Machine uses an SSA-based intermediate language in its JIT compiler.
Microsoft Visual C++ compiler backend available in Microsoft Visual Studio 2015 Update 3 uses SSA
Mono uses SSA in its JIT compiler called Mini.
is an open-source compiler for the academic instruction set Jackal 3.0. It uses a simple 3-operand code with SSA for its intermediate representation. As an interesting variant, it replaces Φ functions with a so-called SAME instruction, which instructs the register allocator to place the two live ranges into the same physical register.
Although not a compiler, the decompiler uses SSA form in its internal representation. SSA is used to simplify expression propagation, identifying parameters and returns, preservation analysis, and more.
Portable.NET uses SSA in its JIT compiler.
a completely graph based SSA intermediate representation for compilers. libFirm uses SSA form for all scalar register values until code generation by use of an SSA-aware register allocator.
The Illinois Concert Compiler circa 1994 used a variant of SSA called SSU which renames each variable when it is assigned a value, and in each conditional context in which that variable is used; essentially the static single information form mentioned above. The SSU form is documented in .
The COINS compiler uses SSA form optimizations as explained here: http://www.is.titech.ac.jp/~sassa/coins-www-ssa/english/
The Mozilla Firefox SpiderMonkey JavaScript engine uses SSA-based IR.
The Chromium V8 JavaScript engine implements SSA in its Crankshaft compiler infrastructure as
PyPy uses a linear SSA representation for traces in its JIT compiler.
Android's Dalvik virtual machine uses SSA in its JIT compiler.
Android's new optimizing compiler for the Android Runtime uses SSA for its IR.
The Standard ML compiler MLton uses SSA in one of its intermediate languages.
LuaJIT makes heavy use of SSA-based optimizations.
The PHP and Hack compiler HHVM uses SSA in its IR.
Reservoir Labs' R-Stream compiler supports non-SSA, SSA and SSI forms.
Go.
SPIR-V, the shading language standard for the Vulkan graphics API and kernel language for OpenCL compute API, is an SSA representation.
Various Mesa drivers via NIR, an SSA representation for shading languages.
WebKit uses SSA in its JIT compilers.
Swift defines its own SSA form above LLVM IR, called SIL.
Erlang rewrote their compiler in OTP 22.0 to "internally use an intermediate representation based on Static Single Assignment." With plans for further optimizations built on top of SSA in future releases.
General references
Also available in Java and C versions.

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...