Hygienic macro


Hygienic macros are macros whose expansion is guaranteed not to cause the accidental capture of identifiers. They are a feature of programming languages such as Scheme, Dylan, Rust, and Julia. The general problem of accidental capture was well known within the Lisp community prior to the introduction of hygienic macros. Macro writers would use language features that would generate unique identifiers or use obfuscated identifiers in order to avoid the problem. Hygienic macros are a programmatic solution to the capture problem that is integrated into the macro expander itself. The term "hygiene" was coined in Kohlbecker et al.'s 1986 paper that introduced hygienic macro expansion, inspired by the terminology used in mathematics.

The hygiene problem

In programming languages that have non-hygienic macro systems, it is possible for existing variable bindings to be hidden from a macro by variable bindings that are created during its expansion. In C, this problem can be illustrated by the following fragment:

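    #define INCI(i) do { int a = 0; ++i; } while (0)
    int main(void)
    {
        int a = 4, b = 8;
        INCI(a);
        INCI(b);
        printf("a is now %d, b is now %d\n", a, b);
        return 0;
    }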

Running the above through the C preprocessor produces:

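    int main(void)
    {
        int a = 4, b = 8;
        do { int a = 0; ++a; } while (0);
        do { int a = 0; ++b; } while (0);
        printf("a is now %d, b is now %d\n", a, b);
        return 0;
    }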

The variable a declared in the top scope is shadowed by the a variable in the macro, which introduces a new scope. As a result, it is never altered by the execution of the program, as the output of the compiled program shows:
a is now 4, b is now 9
The simplest solution is to give the macro's variables names that do not conflict with any variable in the current program:

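    #define INCI(i) do { int INCIa = 0; ++i; } while (0)
    int main(void)
    {
        int a = 4, b = 8;
        INCI(a);
        INCI(b);
        printf("a is now %d, b is now %d\n", a, b);
        return 0;
    }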

Until a variable named INCIa is created, this solution produces the correct output:
a is now 5, b is now 9
The problem is solved for the current program, but this solution is not robust. The variables used inside the macro and those in the rest of the program have to be kept in sync by the programmer. Specifically, using the macro INCI on a variable INCIa is going to fail in the same way that the original macro failed on a variable a.
The "hygiene problem" can extend beyond variable bindings. Consider this Common Lisp macro:

    (defmacro my-unless (condition &body body)
      `(if (not ,condition)
           (progn
             ,@body)))

While there are no references to variables in this macro, it assumes the symbols "if", "not", and "progn" are all bound to their usual definitions. That assumption fails if the macro is used in code such as the following:

    (flet ((not (x) x))
      (my-unless t
        (format t "This should not be printed!")))

The definition of "not" has been locally altered and so the expansion of my-unless changes.
On the other hand, hygienic macro systems preserve the lexical scoping of all identifiers automatically. This property is called referential transparency.
Of course, the problem can occur for program-defined functions which are not protected in the same way:

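    (defmacro my-unless (condition &body body)
      `(if (user-defined-operator ,condition)
           (progn
             ,@body)))

    (flet ((user-defined-operator (x) x))
      (my-unless t
        (format t "This should not be printed!")))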

The Common Lisp solution to this problem is to use packages. The my-unless macro can reside in its own package, where user-defined-operator is a private symbol in that package. The symbol user-defined-operator occurring in the user code will then be a different symbol, unrelated to the one used in the definition of the my-unless macro.
Meanwhile, languages such as Scheme that use hygienic macros prevent accidental capture and ensure referential transparency automatically as part of the macro expansion process. In cases where capture is desired, some systems allow the programmer to explicitly violate the hygiene mechanisms of the macro system.
For example, the following Scheme implementation of my-unless will have the desired behavior:



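    (define-syntax my-unless
      (syntax-rules ()
        ((_ condition body ...)
         (if (not condition)
             (begin body ...)))))

    (let ((not (lambda (x) x)))
      (my-unless #t
        (display "This should not be printed!")
        (newline)))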

Strategies used in languages that lack hygienic macros

In some languages such as Common Lisp, Scheme and others of the Lisp language family, macros provide a powerful means of extending the language. Here the lack of hygiene in conventional macros is resolved by several strategies.
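Obfuscation: the macro uses unusual names for its temporary variables, in the hope that no program using the macro will choose the same names.
Temporary symbol creation: the macro obtains a fresh symbol at expansion time, for example with gensym in Common Lisp, and uses it as the name of the temporary.
Read-time uninterned symbol: in Common Lisp, the #: reader syntax produces a symbol that is not interned in any package, so it cannot coincide with a symbol written elsewhere in the program.
Packages: the macro's private symbols live in the macro's own package, as described above for user-defined-operator.
Hygienic transformation: the processor that turns the input form into the output form detects symbol clashes and resolves them by renaming.
Literal objects: the expansion contains the object itself, such as a function object, rather than a symbol that names it, so no lookup by name takes place at the point of use.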

Implementations

Macro systems that automatically enforce hygiene originated with Scheme. The original algorithm for a hygienic macro system, known as the KFFD algorithm after its authors Kohlbecker, Friedman, Felleisen and Duba, was presented in 1986. At the time, no standard macro system was adopted by Scheme implementations. Shortly thereafter, in 1987, Kohlbecker and Wand proposed a declarative pattern-based language for writing macros, which was the predecessor to the syntax-rules macro facility adopted by the R5RS standard. Syntactic closures, an alternative hygiene mechanism, were proposed by Bawden and Rees in 1988. Unlike the KFFD algorithm, syntactic closures require the programmer to explicitly specify the resolution of the scope of an identifier. In 1993, Dybvig et al. introduced the syntax-case macro system, which uses an alternative representation of syntax and maintains hygiene automatically. The syntax-case system can express the syntax-rules pattern language as a derived macro.
The term macro system can be ambiguous because, in the context of Scheme, it can refer to both a pattern-matching construct and a framework for representing and manipulating syntax. Syntax-rules is a high-level pattern matching facility that attempts to make macros easier to write. However, syntax-rules is not able to succinctly describe certain classes of macros and is insufficient to express other macro systems. Syntax-rules was described in an appendix of the R4RS document but was not mandated. Later, R5RS adopted it as a standard macro facility. Here is an example syntax-rules macro that swaps the values of two variables:


    (define-syntax swap!
      (syntax-rules ()
        ((_ a b)
         (let ((temp a))
           (set! a b)
           (set! b temp)))))

Due to the deficiencies of a purely syntax-rules based macro system, low-level macro systems have also been proposed and implemented for Scheme. Syntax-case is one such system. Unlike syntax-rules, syntax-case contains both a pattern matching language and a low-level facility for writing macros. The former allows macros to be written declaratively, while the latter allows the implementation of alternative frontends for writing macros. The swap example from before is nearly identical in syntax-case because the pattern matching language is similar:



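    (define-syntax swap!
      (lambda (stx)
        (syntax-case stx ()
          ((_ a b)
           (syntax
            (let ((temp a))
              (set! a b)
              (set! b temp)))))))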

However, syntax-case is more powerful than syntax-rules. For example, syntax-case macros can specify side conditions on their pattern matching rules via arbitrary Scheme functions. Alternatively, a macro writer can choose not to use the pattern matching frontend and manipulate the syntax directly. Using the datum->syntax function, syntax-case macros can also intentionally capture identifiers, thus breaking hygiene. The R6RS Scheme standard adopted the syntax-case macro system.
Syntactic closures and explicit renaming are two other alternative macro systems. Both systems are lower-level than syntax-rules and leave the enforcement of hygiene to the macro writer. This differs from both syntax-rules and syntax-case, which automatically enforce hygiene by default. The swap examples from above are shown here using a syntactic closure and explicit renaming implementation respectively:

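    ;; syntactic closures
    (define-syntax swap!
      (sc-macro-transformer
       (lambda (form environment)
         (let ((a (close-syntax (cadr form) environment))
               (b (close-syntax (caddr form) environment)))
           `(let ((temp ,a))
              (set! ,a ,b)
              (set! ,b temp))))))

    ;; explicit renaming
    (define-syntax swap!
      (er-macro-transformer
       (lambda (form rename compare)
         (let ((a (cadr form))
               (b (caddr form))
               (temp (rename 'temp)))
           `(,(rename 'let) ((,temp ,a))
              (,(rename 'set!) ,a ,b)
              (,(rename 'set!) ,b ,temp))))))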

Languages with hygienic macro systems

Hygienic macros offer some safety for the programmer at the expense of limiting the power of macros. On this view, Common Lisp macros are more powerful than Scheme macros in terms of what can be achieved with them. Doug Hoyte, author of Let Over Lambda, stated: