C-- is a "portable assembly language", designed to ease the task of implementing a compiler which produces high-quality machine code. This is done by having the compiler generate C-- code, delegating the harder work of low-level code generation and optimisation to a C-- compiler.
Work on C-- began in the late 1990s. Since writing a custom code generator is a challenge in itself, and the compiler back ends available to researchers at that time were complex and poorly documented, several projects had written compilers which generated C code. However, C is a poor choice for functional languages: it does not guarantee tail call optimization, nor does it support accurate garbage collection or efficient exception handling. C-- is a simpler, tightly defined alternative to C which does support all of these things. Its most innovative feature is a run-time interface which allows portable garbage collectors, exception handling systems and other run-time features to be written so that they work with any C-- compiler.
The language's syntax borrows heavily from C. It omits or changes standard C features such as variadic functions, pointer syntax, and aspects of C's type system, because they hamper certain essential features of C-- and the ease with which code-generation tools can produce it.
The name of the language is an in-joke, indicating that C-- is a reduced form of C, in the same way that C++ is basically an expanded form of C.
The first version of C-- was released in April 1998 as an MSRA paper, accompanied by a January 1999 paper on garbage collection. A revised manual was posted in HTML form in May 1999. Two sets of major changes proposed in 2000 by Norman Ramsey and Christian Lindig led to C-- version 2, which was finalized around 2004 and officially released in 2005.
Type system
The C-- type system is deliberately designed to reflect constraints imposed by hardware rather than conventions imposed by higher-level languages. In C--, a value stored in a register or memory may have only one type: bit vector. However, bit vector is a polymorphic type and may come in several widths, e.g., bits8, bits32, or bits64. A separate family of 32- or 64-bit floating-point types is supported. In addition to the bit-vector type, C-- also provides a Boolean type, bool, which can be computed by expressions and used for control flow but cannot be stored in a register or in memory. As in an assembly language, any higher type discipline, such as distinctions between signed, unsigned, float, and pointer, is imposed by the C-- operators or other syntactic constructs in the language.
C-- version 2 removes the distinction between bit-vector and floating-point types. Programmers are allowed to annotate these types with a string "kind" tag to distinguish, among other things, a variable's integer vs float typing and its storage behavior. The integer-versus-float distinction is useful on targets that have separate registers for integer and floating-point values. In addition, special types for pointers and the native word are introduced, although all they do is map to a bit vector with a target-dependent length. C-- is not type-checked, nor does it enforce or check the calling convention.
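The idea that signedness lives in the operators rather than in the values can be illustrated with a small sketch. It is written in Python, not C--, and every helper name in it (bits, as_signed, lt_signed, lt_unsigned, div_signed, div_unsigned) is hypothetical and chosen for illustration only; none of them is claimed to be a C-- operator name. A bit vector is modelled as a plain non-negative integer, and truncating signed division is an assumption made for the example.

```python
# Illustrative model only: a bitsN value is an untyped bit pattern,
# stored here as a non-negative Python int. All names are hypothetical.

def bits(width: int, value: int) -> int:
    """Truncate value to a bit pattern of the given width."""
    return value & ((1 << width) - 1)

def as_signed(width: int, pattern: int) -> int:
    """Read a bit pattern as a two's-complement signed integer."""
    sign_bit = 1 << (width - 1)
    return pattern - (1 << width) if pattern & sign_bit else pattern

def lt_unsigned(width: int, a: int, b: int) -> bool:
    """Unsigned comparison: the result is a boolean, not a bit vector."""
    return bits(width, a) < bits(width, b)

def lt_signed(width: int, a: int, b: int) -> bool:
    """Signed comparison of the same two bit patterns."""
    return as_signed(width, bits(width, a)) < as_signed(width, bits(width, b))

def div_unsigned(width: int, a: int, b: int) -> int:
    """Unsigned division of two bit patterns."""
    return bits(width, bits(width, a) // bits(width, b))

def div_signed(width: int, a: int, b: int) -> int:
    """Signed division, truncating toward zero (an assumption of this sketch)."""
    x = as_signed(width, bits(width, a))
    y = as_signed(width, bits(width, b))
    quotient = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        quotient = -quotient
    return bits(width, quotient)

# One 32-bit pattern, two interpretations chosen by the operator.
x = bits(32, -8)
print(hex(x))
print(hex(div_unsigned(32, x, 2)), hex(div_signed(32, x, 2)))
print(lt_unsigned(32, x, 1), lt_signed(32, x, 1))
```

The value x carries no signedness of its own; only the choice between the signed and unsigned helper decides how the same 32 bits are read, which mirrors how an assembly language treats register contents.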
Implementations
The C-- specification page lists a few implementations of the language. The "most actively developed" compiler, Quick C--, was abandoned in 2013.
Haskell
A C-- dialect called Cmm is the intermediate representation for the Glasgow Haskell Compiler (GHC). GHC backends are responsible for further transforming C-- into executable code, via LLVM IR, slow C, or directly through the built-in native backend. Some of the developers of C--, including Simon Peyton Jones, João Dias, and Norman Ramsey, work or have worked on GHC. Work on GHC has also led to extensions in the C-- language, forming the Cmm dialect. Cmm uses the C preprocessor for ergonomics. Despite the original intention, GHC does perform many of its generic optimizations on C--; target-specific optimizations are performed later by the backend. As with other compiler IRs, GHC allows for dumping the C-- representation for debugging.