Johnson–Lindenstrauss lemma


In mathematics, the Johnson–Lindenstrauss lemma is a result named after William B. Johnson and Joram Lindenstrauss concerning low-distortion embeddings of points from high-dimensional into low-dimensional Euclidean space. The lemma states that a set of points in a high-dimensional space can be embedded into a space of much lower dimension in such a way that distances between the points are nearly preserved. The map used for the embedding is at least Lipschitz, and can even be taken to be an orthogonal projection.
The lemma has uses in compressed sensing, manifold learning, dimensionality reduction, and graph embedding. Much of the data stored and manipulated on computers, including text and images, can be represented as points in a high-dimensional space. However, the essential algorithms for working with such data tend to become bogged down very quickly as dimension increases. It is therefore desirable to reduce the dimensionality of the data in a way that preserves its relevant structure. The Johnson–Lindenstrauss lemma is a classic result in this vein.
Also, the lemma is tight up to a constant factor, i.e. there exists a set of points of size $m$ that needs dimension
$$\Omega\left(\frac{\log m}{\varepsilon^2}\right)$$
in order to preserve the distances between all pairs of points within a factor of $(1 \pm \varepsilon)$.

Lemma

Given $0 < \varepsilon < 1$, a set $X$ of $N$ points in $\mathbb{R}^n$, and a number $k > 8 (\ln N)/\varepsilon^2$, there is a linear map $f : \mathbb{R}^n \to \mathbb{R}^k$ such that
$$(1 - \varepsilon)\|u - v\|^2 \leq \|f(u) - f(v)\|^2 \leq (1 + \varepsilon)\|u - v\|^2$$
for all $u, v \in X$.
The formula can be rearranged:
$$(1 + \varepsilon)^{-1}\|f(u) - f(v)\|^2 \leq \|u - v\|^2 \leq (1 - \varepsilon)^{-1}\|f(u) - f(v)\|^2.$$
One proof of the lemma takes $f$ to be a suitable multiple of the orthogonal projection onto a random subspace of dimension $k$ in $\mathbb{R}^n$, and exploits the phenomenon of concentration of measure.
Obviously an orthogonal projection will, in general, reduce the average distance between points, but the lemma can be viewed as dealing with relative distances, which do not change under scaling. In a nutshell, you roll the dice and obtain a random projection, which will reduce the average distance, and then you scale up the distances so that the average distance returns to its previous value. If you keep rolling the dice, you will, in randomized polynomial time, find a projection for which the distances satisfy the lemma.
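For illustration, the following sketch draws a random Gaussian map (a common alternative to the orthogonal projection used in the proof) and checks empirically that all pairwise squared distances are preserved up to a factor of $1 \pm \varepsilon$. The constant $8$ in the choice of $k$ follows the statement above; the point set, seed, and remaining parameters are arbitrary illustrative choices.

```python
# Minimal sketch: project N points from R^n to R^k with a scaled Gaussian matrix
# and measure the pairwise distance distortion. Not the proof's construction.
import numpy as np

rng = np.random.default_rng(0)

n, N, eps = 1000, 50, 0.25                    # ambient dimension, #points, distortion
k = int(np.ceil(8 * np.log(N) / eps**2))      # target dimension from the lemma's bound

X = rng.standard_normal((N, n))               # the point set (rows are points)
A = rng.standard_normal((k, n)) / np.sqrt(k)  # random linear map f(x) = A x
Y = X @ A.T                                   # embedded points in R^k

def sq_dists(P):
    # Squared pairwise distances via the Gram matrix.
    G = P @ P.T
    d = np.diag(G)
    return d[:, None] + d[None, :] - 2 * G

orig, proj = sq_dists(X), sq_dists(Y)
mask = ~np.eye(N, dtype=bool)                 # ignore zero diagonal
ratios = proj[mask] / orig[mask]
print(f"k = {k}, distortion range: [{ratios.min():.3f}, {ratios.max():.3f}]")
# With high probability every ratio lies in [1 - eps, 1 + eps].
```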

Alternate statement

A related lemma is the distributional JL lemma. This lemma states that for any $0 < \varepsilon, \delta < 1/2$ and positive integer $d$, there exists a distribution over $\mathbb{R}^{k \times d}$ from which the matrix $A$ is drawn such that for $k = O(\varepsilon^{-2}\log(1/\delta))$ and for any unit-length vector $x \in \mathbb{R}^d$, the claim below holds:
$$P\left(\left|\|Ax\|_2^2 - 1\right| > \varepsilon\right) < \delta.$$
One can obtain the JL lemma from the distributional version by setting $x = (u - v)/\|u - v\|_2$ and $\delta < 1/N^2$ for some pair $u, v$ both in $X$. Then the JL lemma follows by a union bound over all such pairs.
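The distributional statement can also be checked empirically. The sketch below assumes a Gaussian distribution over $\mathbb{R}^{k \times d}$ with entries of variance $1/k$ and uses an illustrative constant in the choice of $k$; it estimates the failure probability $P(|\|Ax\|_2^2 - 1| > \varepsilon)$ for one fixed unit vector.

```python
# Monte Carlo estimate of the failure probability in the distributional JL lemma,
# assuming a Gaussian matrix distribution (one common choice, not the only one).
import numpy as np

rng = np.random.default_rng(1)
d, eps, delta = 500, 0.4, 0.05
k = int(np.ceil(8 * np.log(1 / delta) / eps**2))   # illustrative constant

x = rng.standard_normal(d)
x /= np.linalg.norm(x)                             # unit-length vector

trials, failures = 1000, 0
for _ in range(trials):
    A = rng.standard_normal((k, d)) / np.sqrt(k)   # fresh draw of A each trial
    failures += abs(np.linalg.norm(A @ x)**2 - 1) > eps

print(f"k = {k}, empirical failure rate = {failures / trials:.4f} (target < {delta})")
```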

Speeding up the JL transform

Given $A$, computing the matrix–vector product takes $O(kd)$ time. There has been some work in deriving distributions for which the matrix–vector product can be computed in less than $O(kd)$ time.
There are two major lines of work. The first, the Fast Johnson–Lindenstrauss Transform (FJLT), was introduced by Ailon and Chazelle in 2006.
This method allows the computation of the matrix–vector product in just $O(d \log d + k^{2+\gamma})$ time for any constant $\gamma > 0$.
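As an illustrative sketch in this spirit (not the exact Ailon–Chazelle construction), the subsampled randomized Hadamard transform combines random sign flips, a Hadamard transform, and uniform row sampling. The toy version below requires $d$ to be a power of two and applies the Hadamard matrix explicitly; the $d \log d$ running time would come from using the fast Walsh–Hadamard transform instead.

```python
# Subsampled randomized Hadamard transform: x -> sqrt(d/k) * (H D x) restricted
# to k uniformly sampled coordinates. Norm is preserved in expectation.
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(4)
d, k = 1024, 256                              # d must be a power of two here

D = rng.choice([-1.0, 1.0], size=d)           # random sign flips (diagonal matrix)
H = hadamard(d) / np.sqrt(d)                  # normalized Hadamard matrix
S = rng.choice(d, size=k, replace=False)      # uniformly sampled rows

def srht(x):
    return np.sqrt(d / k) * (H @ (D * x))[S]

x = rng.standard_normal(d)
x /= np.linalg.norm(x)
print(np.linalg.norm(srht(x)))                # roughly 1; concentrates as k grows
```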
Another approach is to build a distribution supported over matrices that are sparse.
This method allows keeping only an $\varepsilon$ fraction of the entries in the matrix, which means the computation can be done in just $O(kd\varepsilon)$ time.
Furthermore, if the vector has only $b$ non-zero entries, the sparse JL transform takes $O(kb\varepsilon)$ time, which may be much less than the $d \log d$ time used by the fast JL transform.
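The sketch below illustrates one sparse construction of this kind, assuming each column has $s = \lceil \varepsilon k \rceil$ nonzero entries equal to $\pm 1/\sqrt{s}$ placed at uniformly random rows; the precise construction and analysis from the sparse JL literature are not reproduced here.

```python
# Sparse JL sketch: each column gets s = ceil(eps * k) nonzeros of value ±1/sqrt(s),
# so only an eps fraction of entries is stored and multiplied.
import numpy as np
from scipy.sparse import csc_matrix

def sparse_jl_matrix(k, d, eps, rng):
    s = max(1, int(np.ceil(eps * k)))                 # nonzeros per column
    rows = np.concatenate([rng.choice(k, size=s, replace=False) for _ in range(d)])
    cols = np.repeat(np.arange(d), s)
    vals = rng.choice([-1.0, 1.0], size=s * d) / np.sqrt(s)
    return csc_matrix((vals, (rows, cols)), shape=(k, d))

rng = np.random.default_rng(2)
k, d, eps = 200, 5000, 0.2
A = sparse_jl_matrix(k, d, eps, rng)

x = rng.standard_normal(d)
x /= np.linalg.norm(x)
print("nnz fraction:", A.nnz / (k * d))               # roughly eps
print("norm after projection:", np.linalg.norm(A @ x))  # roughly 1
```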

Tensorized Random Projections

It is possible to combine two JL matrices by taking the so-called Face-splitting product, which is defined as the tensor product of the rows.
More directly, let $C \in \mathbb{R}^{m \times n_1}$ and $D \in \mathbb{R}^{m \times n_2}$ be two matrices.
Then the Face-splitting product is
$$C \bullet D = \begin{bmatrix} C_1 \otimes D_1 \\ C_2 \otimes D_2 \\ \vdots \\ C_m \otimes D_m \end{bmatrix},$$
where $C_i$ and $D_i$ denote the $i$-th rows of $C$ and $D$, and $\otimes$ is the Kronecker (tensor) product.
This idea of tensorization was used by Kasiviswanathan et al. 2010 for differential privacy.
JL matrices defined like this use fewer random bits, and can be applied quickly to vectors that have tensor structure, due to the following identity:
$$(C \bullet D)(x \otimes y) = (Cx) \circ (Dy),$$
where $\circ$ is the element-wise (Hadamard) product.
Such computations have been used to efficiently compute polynomial kernels and many other linear algebra algorithms.
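The identity can be verified numerically. The sketch below builds the Face-splitting product row by row and compares $(C \bullet D)(x \otimes y)$ with $(Cx) \circ (Dy)$; the matrix sizes are arbitrary.

```python
# Numerical check of (C • D)(x ⊗ y) = (C x) ∘ (D y), where • is the face-splitting
# (row-wise Kronecker) product and ∘ the element-wise product.
import numpy as np

rng = np.random.default_rng(3)
m, n1, n2 = 4, 5, 6
C = rng.standard_normal((m, n1))
D = rng.standard_normal((m, n2))
x = rng.standard_normal(n1)
y = rng.standard_normal(n2)

# Row i of the face-splitting product is the Kronecker product of row i of C and D.
face_split = np.vstack([np.kron(C[i], D[i]) for i in range(m)])

lhs = face_split @ np.kron(x, y)      # apply the combined matrix to x ⊗ y
rhs = (C @ x) * (D @ y)               # element-wise product of the two small products
print(np.allclose(lhs, rhs))          # True
```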
In 2020 it was shown that if the matrices $T_1, T_2, \dots, T_c$ are independent $\pm 1$ or Gaussian matrices, the combined matrix $T_1 \bullet T_2 \bullet \dots \bullet T_c$ satisfies the distributional JL lemma if the number of rows is at least
$$O\left(\varepsilon^{-2}\log(1/\delta) + \varepsilon^{-1}\left(\tfrac{1}{c}\log(1/\delta)\right)^{c}\right).$$
For large $\varepsilon$ this is as good as the completely random Johnson–Lindenstrauss transform, but
a matching lower bound in the same paper shows that this exponential dependency on $(\log(1/\delta))^{c}$ is necessary.
Alternative JL constructions are suggested to circumvent this.