Term indexing


In computer science, a term index is a data structure to facilitate fast lookup of terms and clauses in a logic program, deductive database, or automated theorem prover.

Overview

Many operations in automatic theorem provers require search in huge collections of terms and clauses. Such operations typically fall into the following scheme. Given a collection of terms and a query term , find in some/all terms related to according to a certain retrieval condition. Most interesting retrieval conditions are formulated as existence of a substitution that relates in a special way the query and the retrieved objects. Here is a list of retrieval conditions frequently used in provers:
More often than not, we are actually interested in finding the appropriate
substitutions explicitly, together with the retrieved terms,
rather than just in establishing existence of such substitutions.
Very often the sizes of term sets to be searched are large,
the retrieval calls are frequent and the retrieval condition test
is rather complex. In such situations linear search in, when the retrieval
condition is tested on every term from, becomes prohibitively costly.
To overcome this problem, special data structures, called indexes, are
designed in order to support fast retrieval. Such data structures,
together with the accompanying algorithms for index maintenance
and retrieval, are called term indexing techniques.

Classic indexing techniques

Substitution trees outperform path indexing, discrimination tree indexing, and abstraction trees.
A discrimination tree term index stores its information in a trie data structure.

Modern indexing techniques