Information distance


Information distance is the distance between two finite objects expressed as the number of bits in the shortest program that transforms one object into the other, or vice versa, on a universal computer. It is an extension of Kolmogorov complexity: the Kolmogorov complexity of a single finite object is the information in that object, while the information distance between a pair of finite objects is the minimum information required to go from one object to the other, or vice versa.
Information distance was first defined and investigated on the basis of thermodynamic principles by Bennett, Gács, Li, Vitányi and Zurek, and subsequently achieved its final form in their 1998 paper Information Distance. It is applied in the normalized compression distance and the normalized Google distance.

Properties

Formally the information distance $ID(x,y)$ between $x$ and $y$ is defined by

$$ID(x,y) = \min \{ |p| : p(x) = y \ \text{and}\ p(y) = x \},$$

with $p$ a finite binary program for the fixed universal computer with finite binary strings $x, y$ as inputs. In the 1998 paper it is proven that

$$ID(x,y) = E(x,y) + O(\log \max\{K(x \mid y), K(y \mid x)\})$$

with

$$E(x,y) = \max\{K(x \mid y), K(y \mid x)\},$$

where $K(\cdot \mid \cdot)$ is the Kolmogorov complexity of the prefix type. The quantity $E(x,y)$ is the important one.
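
For intuition, consider two boundary cases (a worked sketch; $\varepsilon$ denotes the empty string):

$$E(x,x) = O(1), \qquad E(x,\varepsilon) = \max\{K(x \mid \varepsilon), K(\varepsilon \mid x)\} = K(x) + O(1),$$

so the information distance from the empty string recovers the Kolmogorov complexity of $x$ itself, which is the sense in which information distance extends Kolmogorov complexity.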

Universality

Let $\Delta$ be the class of upper semicomputable distances $D(x,y)$ that satisfy the density condition

$$\sum_{x : x \ne y} 2^{-D(x,y)} \le 1, \qquad \sum_{y : y \ne x} 2^{-D(x,y)} \le 1.$$

This excludes irrelevant distances such as $D(x,y) = \tfrac{1}{2}$ for $x \ne y$; it ensures that if the distance grows, then the number of objects within that distance of a given object grows. If $D \in \Delta$, then $E(x,y) \le D(x,y)$ up to a constant additive term.
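
That $E(x,y)$ itself satisfies the density condition is a consequence of Kraft's inequality for prefix complexity (a brief verification sketch):

$$\sum_{y : y \ne x} 2^{-E(x,y)} \le \sum_{y} 2^{-K(y \mid x)} \le 1,$$

since $E(x,y) \ge K(y \mid x)$ and the shortest prefix-type programs form a prefix-free set; the symmetric sum over $x$ is bounded in the same way. Thus $E$ is, up to a constant additive term, the smallest distance in the class $\Delta$: it is universal.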

Metricity

The distance $E(x,y)$ is a metric up to an additive $O(\log \max\{K(x \mid y), K(y \mid x)\})$ term in the metric (in)equalities.
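
The additive term arises, for instance, in the triangle inequality, which follows from composing descriptions (a sketch; the $O(\log)$ terms are logarithmic in the complexities involved):

$$K(x \mid z) \le K(x \mid y) + K(y \mid z) + O(\log), \qquad \text{hence} \qquad E(x,z) \le E(x,y) + E(y,z) + O(\log),$$

while symmetry, $E(x,y) = E(y,x)$, holds exactly by the definition through the maximum.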

Maximum overlap

If $E(x,y) = K(x \mid y)$, then there is a program $p$ of length $K(y \mid x)$ that converts $x$ to $y$, and a program $q$ of length $K(x \mid y) - K(y \mid x)$ such that the program $qp$ converts $y$ to $x$. That is, the shortest programs to convert between two objects can be made maximally overlapping: for $K(x \mid y) \ge K(y \mid x)$, a shortest program of length $K(x \mid y)$ can be divided into a program that converts object $x$ to object $y$, and another program which, concatenated with the first, converts $y$ to $x$, while the concatenation of these two programs is a shortest program to convert between these objects.
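
For example (with assumed, purely illustrative values): if $K(y \mid x) = 300$ bits and $K(x \mid y) = 1000$ bits, then a single 1000-bit string suffices for both conversions: a program $p$ of 300 bits converts $x$ to $y$, and a further 700-bit string $q$ makes the 1000-bit concatenation $qp$ a program converting $y$ to $x$.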

Minimum overlap

The programs to convert between objects $x$ and $y$ can also be made minimally overlapping.
There exists a program $p$, of length $K(x \mid y)$ up to an additive term of $O(\log K(x \mid y))$, that maps $y$ to $x$ and has small complexity when $x$ is known ($K(p \mid x) \approx 0$). Interchanging the two objects, we have the other program. Having in mind the parallelism between Shannon information theory and Kolmogorov complexity theory, one can say that this result is parallel to the Slepian–Wolf and Körner–Csiszár–Marton theorems.
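
Stated compactly (following the description above, with $U$ the reference universal computer and the $O(\log)$ terms logarithmic in $K(x \mid y)$):

$$\exists p : \; U(p, y) = x, \qquad |p| = K(x \mid y) + O(\log), \qquad K(p \mid x) = O(\log).$$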

Applications

Theoretical

The result of An. A. Muchnik on minimum overlap above is an important theoretical application, showing that certain codes exist: to go to a finite target object from any object, there is a program that depends almost only on the target object. This result is fairly precise, and the error term cannot be significantly improved. Information distance is treated in the textbook of Li and Vitányi, An Introduction to Kolmogorov Complexity and Its Applications, and it occurs in the Encyclopedia of Distances by Deza and Deza.

Practical

To determine the similarity of objects such as genomes, languages, music, internet attacks and worms, software programs, and so on, the information distance is normalized and the Kolmogorov complexity terms are approximated by real-world compressors. The result is the normalized compression distance (NCD) between the objects. This pertains to objects given as computer files, like the genome of a mouse or the text of a book. If the objects are given only by name, such as "Einstein", "table", the name of a book, or the name "mouse", compression makes no sense: outside information about what the name means is required. A database, together with a means to search it, provides this information. Every search engine on a database that provides aggregate page counts can be used in the normalized Google distance (NGD).
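
A minimal sketch of both distances in Python, assuming zlib as the real-world compressor; the function names and the example data are illustrative, not a reference implementation:

    import math
    import os
    import zlib

    def c(data: bytes) -> int:
        # Approximate the Kolmogorov complexity of data by its compressed length.
        return len(zlib.compress(data, 9))

    def ncd(x: bytes, y: bytes) -> float:
        # Normalized compression distance:
        # NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
        cx, cy, cxy = c(x), c(y), c(x + y)
        return (cxy - min(cx, cy)) / max(cx, cy)

    def ngd(fx: float, fy: float, fxy: float, n: float) -> float:
        # Normalized Google distance from aggregate page counts:
        # fx, fy are the counts for each search term, fxy the count for the
        # two terms together, and n the total number of pages indexed.
        lfx, lfy, lfxy, ln = math.log(fx), math.log(fy), math.log(fxy), math.log(n)
        return (max(lfx, lfy) - lfxy) / (ln - min(lfx, lfy))

    if __name__ == "__main__":
        a = b"the quick brown fox jumps over the lazy dog " * 20
        b = b"the quick brown fox leaps over the lazy cat " * 20
        r = os.urandom(len(a))   # incompressible, unrelated data
        print(ncd(a, b))         # small: the strings are similar
        print(ncd(a, r))         # near 1: the strings are unrelated

Because real compressors are not ideal, NCD values can fall slightly outside the interval [0, 1]; in practice they are used to compare the relative similarity of objects rather than as absolute quantities.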

Related literature