Homoiconicity


In computer programming, homoiconicity is a property of some programming languages. A language is homoiconic if a program written in it can be manipulated as data using the language, and thus the program's internal representation can be inferred just by reading the program itself. For example, a Lisp program is written as a regular Lisp list, and can be manipulated by other Lisp code. This property is often summarized by saying that the language treats "code as data".
In a homoiconic language, the primary representation of programs is also a data structure in a primitive type of the language itself. This makes metaprogramming easier than in a language without this property: reflection in the language depends on a single, homogeneous structure, and it does not have to handle several different structures that would appear in a complex syntax.
As noted above, a commonly cited example is Lisp, which was created to allow for easy list manipulations and where the structure is given by S-expressions that take the form of nested lists. Lisp programs are written in the form of lists; the result is that the program can access its own functions and procedures while running, and programmatically alter itself on the fly. Homoiconic languages typically include full support of syntactic macros, allowing the programmer to express transformations of programs in a concise way. Examples are the programming languages Clojure, Rebol, Refal, Prolog, and more recently Julia.

History

The original source is the paper Macro Instruction Extensions of Compiler Languages, according to the early and influential paper TRAC, A Text-Handling Language:
Alan Kay used and possibly popularized the term "homoiconic" through his use of the term in his 1969 PhD thesis:

Uses and advantages

One advantage of homoiconicity is that extending the language with new concepts typically becomes simpler, as data representing code can be passed between the meta and base layer of the program. The abstract syntax tree of a function may be composed and manipulated as a data structure in the meta layer, and then evaluated. It can be much easier to understand how to manipulate the code since it can be more easily understood as simple data.
A typical demonstration of homoiconicity is the meta-circular evaluator.

Implementation methods

All Von Neumann architecture systems, which includes the vast majority of general purpose computers today, can implicitly be described as homoiconic due to the way that raw machine code executes in memory, the data type being bytes in memory. However, this feature can also be abstracted to the programming language level.
Languages such as Lisp and its dialects, such as Scheme, Clojure, Racket employ S-expressions to achieve homoiconicity.
Other languages which are considered to be homoiconic include:
uses S-expressions as an external representation for data and code. S-expressions can be read with the primitive Lisp function READ. READ returns Lisp data: lists, symbols, numbers, strings. The primitive Lisp function EVAL uses Lisp code represented as Lisp data, computes side-effects and returns a result. The result will be printed by the primitive function PRINT, which creates an external S-expression from Lisp data.
Lisp data, a list using different data types: lists, symbols, strings and integer numbers.

)

Lisp code. The example uses lists, symbols and numbers.

) ; in infix: sin*cos

Create above expression with the primitive Lisp function LIST and set the variable EXPRESSION to the result

) )
-> ) ; Lisp returns and prints the result
; the third element of the expression
->

Change the COS term to SIN

; The expression is now ).

Evaluate the expression

-> 0.7988834

Print the expression to a string

-> " )"

Read the expression from a string

)")
-> ) ; returns a list of lists, numbers and symbols

In Prolog


1 ?- X is 2*5.
X = 10.
2 ?- L =, write_canonical.
is
L =.
3 ?- L = :-), write_canonical.
L =.
4 ?- L = :-), assert.
L =.
5 ?- ten.
X = 10.
6 ?-

On line 4 we create a new clause. The operator :- separates the head and the body of a clause. With assert/1* we add it to the existing clauses, so we can call it later. In other languages we would call it "creating a function during runtime". We can also remove clauses from the database with abolish/1, or retract/1.
* The number after the clause's name is the number of arguments it can take. It is also called arity.
We can also query the database to get the body of a clause:

7 ?- clause.
Y =.
8 ?- clause, Y =.
Y =,
Z = 2*5.
9 ?- clause, call.
X = 10,
Y =.

call is analogous to Lisp's eval function.

In Rebol

The concept of treating code as data and the manipulation and evaluation thereof can be demonstrated very neatly in Rebol..
The following is an example of code in Rebol :
>> repeat i 3 ]
1 hello
2 hello
3 hello
.
By enclosing the code in square brackets, the interpreter does not evaluate it, but merely treats it as a block containing words:

] ]

This block has the type block! and can furthermore be assigned as the value of a word by using what appears to be a syntax for assignment, but is actually understood by the interpreter as a special type and takes the form of a word followed by a colon:
>> block1: ] ] ;; Assign the value of the block to the word `block1`

>> type? block1 ;; Evaluate the type of the word `block1`
block!
The block can still be interpreted by using the do function provided in Rebol.
It is possible to interrogate the elements of the block and change their values, thus altering the behavior of the code if it were to be evaluated:
>> block1/3 ;; The third element of the block
3
>> block1/3: 5 ;; Set the value of the 3rd element to 5
5
>> probe block1 ;; Show the changed block

>> do block1 ;; Evaluate the block
1 hello
2 hello
3 hello
4 hello
5 hello