In computer science, program synthesis is the task of constructing a program that provably satisfies a given high-level formal specification. In contrast to program verification, the program is to be constructed rather than given; however, both fields make use of formal proof techniques, and both comprise approaches with different degrees of automation. In contrast to automatic programming techniques, specifications in program synthesis are usually non-algorithmic statements in an appropriate logical calculus.
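To make the distinction between a non-algorithmic specification and a constructed program concrete, the following is a minimal sketch, not a method taken from the literature discussed here. It assumes a hypothetical toy grammar (`TERMS`, `CONDS`) and a specification of "maximum of two numbers" written as a predicate; candidates are enumerated and checked only on a small bounded domain, which is testing rather than the formal proof that program synthesis proper requires.

```python
from itertools import product

def spec(x, y, m):
    """Non-algorithmic specification: m bounds both inputs and equals one of them."""
    return x <= m and y <= m and (m == x or m == y)

# Hypothetical toy grammar: E ::= x | y | (T if C else T)
TERMS = {"x": lambda x, y: x, "y": lambda x, y: y}
CONDS = {"x <= y": lambda x, y: x <= y, "y <= x": lambda x, y: y <= x}

def candidates():
    """Enumerate the grammar, plain terms first, then conditionals."""
    for name, f in TERMS.items():
        yield name, f
    for (cn, c), (tn, t), (en, e) in product(
        CONDS.items(), TERMS.items(), TERMS.items()
    ):
        # Bind c, t, e as defaults so each lambda keeps its own components.
        yield (
            f"({tn} if {cn} else {en})",
            lambda x, y, c=c, t=t, e=e: t(x, y) if c(x, y) else e(x, y),
        )

def synthesize(domain=range(-3, 4)):
    """Return the first candidate meeting the spec on every pair in the domain."""
    for name, f in candidates():
        if all(spec(x, y, f(x, y)) for x, y in product(domain, repeat=2)):
            return name, f
    return None

if __name__ == "__main__":
    name, _ = synthesize()
    print(name)
```

The specification says what the result must satisfy without saying how to compute it; the search supplies the "how". A bounded check like this can only refute candidates, so a real synthesizer would replace it with a proof or a decision procedure.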
Origin
During the Summer Institute of Symbolic Logic at Cornell University in 1957, Alonzo Church posed the problem of synthesizing a circuit from mathematical requirements. Even though the work refers only to circuits and not to programs, it is considered one of the earliest descriptions of program synthesis, and some researchers refer to program synthesis as "Church's Problem". In the 1960s, a similar idea for an "automatic programmer" was explored by researchers in artificial intelligence. Since then, various research communities have considered the problem of program synthesis. Notable works include the 1969 automata-theoretic approach by Büchi and Landweber, and the works by Manna and Waldinger. The development of modern high-level programming languages can also be understood as a form of program synthesis.
The early 21st century has seen a surge of practical interest in program synthesis in the formal verification community and related fields. Armando Solar-Lezama showed that it is possible to encode program synthesis problems in Boolean logic and to use algorithms for the Boolean satisfiability problem to find programs automatically. In 2013, a unified framework for program synthesis problems was proposed by researchers at UPenn, UC Berkeley, and MIT. Since 2014, the different synthesis algorithms have been compared in a yearly competition, the Syntax-Guided Synthesis Competition (SyGuS-Comp). Still, the available algorithms are only able to synthesize small programs. A 2015 paper demonstrated synthesis of PHP programs axiomatically proven to meet a given specification, for purposes such as checking whether a number is prime or listing the factors of a number.
The framework of Manna and Waldinger
The framework of Manna and Waldinger, published in 1980, starts from a user-given first-order specification formula. For that formula, a proof is constructed, thereby also synthesizing a functional program from unifying substitutions. The framework is presented in a table layout, the columns containing:
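A line number for reference purposes,
Formulas that already have been established, including axioms and preconditions (the Assertions column),
Formulas still to be proven, including postconditions (the Goals column),
Terms denoting a valid output value (the program column).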
Initially, background knowledge, preconditions, and postconditions are entered into the table. After that, appropriate proof rules are applied manually. The framework has been designed to enhance human readability of intermediate formulas: contrary to classical resolution, it does not require clausal normal form, but allows one to reason with formulas of arbitrary structure, containing any junctors. The proof is complete when true has been derived in the Goals column or, equivalently, false in the Assertions column. Programs obtained by this approach are guaranteed to satisfy the specification formula they were derived from; in this sense they are correct by construction. Only a minimalist, yet Turing-complete, functional programming language is supported, consisting of conditionals, recursion, and arithmetic and other operators. Case studies performed within this framework synthesized algorithms to compute, for example, division, remainder, square root, term unification, answers to relational database queries, and several sorting algorithms.
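To give a feel for how small such a target language is, the sketch below writes division and remainder using nothing but a conditional expression, recursion, and arithmetic. These are hand-written illustrations under that restriction, not the programs derived in the case studies; the function names `rem` and `div` and the restriction to n ≥ 0, d > 0 are assumptions made here.

```python
def rem(n: int, d: int) -> int:
    """Remainder of n by d, for n >= 0 and d > 0 (conditional + recursion only)."""
    return n if n < d else rem(n - d, d)

def div(n: int, d: int) -> int:
    """Integer quotient of n by d, for n >= 0 and d > 0."""
    return 0 if n < d else 1 + div(n - d, d)

def meets_division_spec(n: int, d: int) -> bool:
    """Postcondition for division with remainder: n = d*q + r and 0 <= r < d."""
    q, r = div(n, d), rem(n, d)
    return n == d * q + r and 0 <= r < d

if __name__ == "__main__":
    # Bounded check of the postcondition; this is testing, not a proof.
    print(all(meets_division_spec(n, d) for n in range(50) for d in range(1, 8)))
```

In the framework itself the postcondition would be a goal formula and the recursive call would arise from an induction step, rather than being written by hand and tested afterwards.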
Proof rules include:
Non-clausal resolution.
Logical transformations.
Splitting of conjunctive assertions and of disjunctive goals.
Structural induction.
Murray has shown these rules to be complete for first-order logic. In 1986, Manna and Waldinger added generalized E-resolution and paramodulation rules to also handle equality; later, these rules turned out to be incomplete.
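The splitting rule listed above can be sketched on a tiny formula representation. Everything here is an assumption made for illustration: formulas are nested tuples such as `("or", a, b)`, and `Row` is a hypothetical stand-in for one table line holding an assertion or a goal plus a program term. The idea is that a conjunctive assertion may be replaced by its two conjuncts (both are known), and a disjunctive goal by its two disjuncts (proving either suffices), with the program term carried along unchanged.

```python
from dataclasses import dataclass
from typing import Optional, Union

# Hypothetical formula encoding: an atom is a string, a compound is (op, left, right).
Formula = Union[str, tuple]

@dataclass(frozen=True)
class Row:
    """One table line: either an assertion or a goal, plus an optional program term."""
    assertion: Optional[Formula] = None
    goal: Optional[Formula] = None
    program: Optional[str] = None

def is_compound(formula: Optional[Formula], op: str) -> bool:
    """True if the formula is a binary compound with the given operator."""
    return isinstance(formula, tuple) and len(formula) == 3 and formula[0] == op

def split(row: Row) -> list:
    """Split a conjunctive assertion or a disjunctive goal into two rows."""
    if is_compound(row.assertion, "and"):
        _, left, right = row.assertion
        return [Row(assertion=left, program=row.program),
                Row(assertion=right, program=row.program)]
    if is_compound(row.goal, "or"):
        _, left, right = row.goal
        return [Row(goal=left, program=row.program),
                Row(goal=right, program=row.program)]
    return [row]  # the rule does not apply; leave the row unchanged

if __name__ == "__main__":
    for r in split(Row(goal=("or", "case1", "case2"), program="M")):
        print(r)
```

The asymmetry is deliberate: a disjunctive assertion or a conjunctive goal is not split, since that would not be sound in this simple form.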
Example
As a toy example, a functional program to compute the maximum M of two numbers x and y can be derived as follows. Starting from the requirement description "The maximum is larger than or equal to any given number, and is one of the given numbers", the first-order formula ∀X ∀Y ∃M: X ≤ M ∧ Y ≤ M ∧ (X = M ∨ Y = M) is obtained as its formal translation. This formula is to be proved. By reverse Skolemization, the specification in line 10 is obtained, an upper- and lower-case letter denoting a variable and a Skolem constant, respectively. After applying a transformation rule for the distributive law in line 11, the proof goal is a disjunction, and hence can be split into two cases, viz. lines 12 and 13. Turning to the first case, resolving line 12 with the axiom in line 1 leads to instantiation of the program variable M in line 14. Intuitively, the last conjunct of line 12 prescribes the value that M must take in this case. Formally, the non-clausal resolution rule shown in line 57 above is applied to lines 12 and 1, with p being the common instance x = x of A = A and x = M, obtained by syntactically unifying the latter formulas, yielding ¬(true ∧ false) ∧ (x ≤ x ∧ y ≤ x ∧ true), which simplifies to x ≤ x ∧ y ≤ x.
In a similar way, line 14 yields line 15 and then line 16 by resolution. The second case, x ≤ M ∧ y ≤ M ∧ y = M in line 13, is handled similarly, eventually yielding line 18. In a last step, both cases are joined, using the resolution rule from line 58; to make that rule applicable, the preparatory step 15→16 was needed. Intuitively, line 18 could be read as "in case x ≤ y, the output y is valid", while line 15 says "in case y ≤ x, the output x is valid"; the step 15→16 established that both cases 16 and 18 are complementary. Since both lines 16 and 18 come with a program term, a conditional expression results in the program column. Since the goal formula true has been derived, the proof is done, and the program column of the "true" line contains the program.
This toy derivation produces only a single conditional expression of the form p ? s : t, taken from line 58. It involves no recursion or iteration, so the example on its own does not exhibit the full, Turing-complete language described above.
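The case analysis behind this derivation can be replayed as a small check. The sketch below assumes the reading given by the requirement description (the output bounds both inputs and equals one of them) and uses hypothetical names `spec` and `synthesized_max`; it confirms on a bounded domain that output x is valid whenever y ≤ x, that output y is valid whenever x ≤ y, and that the two case conditions together cover every input, which is what allows them to be joined into one conditional. This is a test on finitely many inputs, not a replacement for the proof.

```python
from itertools import product

def spec(x, y, m):
    """Specification of the example: m bounds both inputs and equals one of them."""
    return x <= m and y <= m and (x == m or y == m)

def synthesized_max(x, y):
    """Conditional obtained by joining the two cases: y when x <= y, otherwise x."""
    return y if x <= y else x

def check_derivation(domain=range(-4, 5)):
    """Replay the two cases and their join on every pair drawn from the domain."""
    for x, y in product(domain, repeat=2):
        if y <= x:
            # First case: choosing the output x leaves only the condition y <= x.
            assert spec(x, y, x)
        if x <= y:
            # Second case: choosing the output y leaves only the condition x <= y.
            assert spec(x, y, y)
        # The case conditions are exhaustive, so a conditional covers all inputs.
        assert (y <= x) or (x <= y)
        assert spec(x, y, synthesized_max(x, y))
    return True

if __name__ == "__main__":
    print(check_derivation())
```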