Manifold alignment

Manifold alignment is a class of machine learning algorithms that produce projections between sets of data, given that the original data sets lie on a common manifold. The concept was first introduced as such by Ham, Lee, and Saul in 2003, adding a manifold constraint to the general problem of correlating sets of high-dimensional vectors.

Overview

Manifold alignment assumes that disparate data sets produced by similar generating processes will share a similar underlying manifold representation. By learning projections from each original space to the shared manifold, correspondences are recovered and knowledge from one domain can be transferred to another. Most manifold alignment techniques consider only two data sets, but the concept extends to arbitrarily many initial data sets.
Consider the case of aligning two data sets, and, with and.
Manifold alignment algorithms attempt to project both and into a new d-dimensional space such that the projections both minimize distance between corresponding points and preserve the local manifold structure of the original data. The projection functions are denoted:
Let represent the binary correspondence matrix between points in and :
Let and represent pointwise similarities within data sets. This is usually encoded as the heat kernel of the adjacency matrix of a k-nearest neighbor graph.
Finally, introduce a coefficient, which can be tuned to adjust the weight of the 'preserve manifold structure' goal, versus the 'minimize corresponding point distances' goal.
With these definitions in place, the loss function for manifold alignment can be written:
Solving this optimization problem is equivalent to solving a generalized eigenvalue problem using the graph laplacian of the joint matrix, G:

Inter-data correspondences

The algorithm described above requires full pairwise correspondence information between input data sets; a supervised learning paradigm. However, this information is usually difficult or impossible to obtain in real world applications. Recent work has extended the core manifold alignment algorithm to semi-supervised
, unsupervised
, and multiple-instance
settings.

One-step vs. two-step alignment

The algorithm described above performs a "one-step" alignment, finding embeddings for both data sets at the same time. A similar effect can also be achieved with "two-step" alignments
, following a slightly modified procedure:

Project each input data set to a lower-dimensional space independently, using any of a variety of dimension reduction algorithms.
Perform linear manifold alignment on the embedded data, holding the first data set fixed, mapping each additional data set onto the first's manifold. This approach has the benefit of decomposing the required computation, which lowers memory overhead and allows parallel implementations.
Instance-level vs. feature-level projections

Manifold alignment can be used to find linear projections, or nonlinear embeddings. While the instance-level version generally produces more accurate alignments, it sacrifices a great degree of flexibility as the learned embedding is often difficult to parameterize. Feature-level projections allow any new instances to be easily embedded in the manifold space, and projections may be combined to form direct mappings between the original data representations. These properties are especially important for knowledge-transfer applications.

Applications

Manifold alignment is suited to problems with several corpora that lie on a shared manifold, even when each corpus is of a different dimensionality. Many real-world problems fit this description, but traditional techniques are not able to take advantage of all corpora at the same time. Manifold alignment also facilitates transfer learning, in which knowledge of one domain is used to jump-start learning in correlated domains.
Applications of manifold alignment include:

Cross-language information retrieval / automatic translation
* By representing documents as vector of word counts, manifold alignment can recover the mapping between documents of different languages.
* Cross-language document correspondence is relatively easy to obtain, especially from multi-lingual organizations like the European Union.
Transfer learning of policy and state representations for reinforcement learning
Alignment of protein NMR structures
Accelerating model learning in robotics by sharing data generated by other robots

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...

Manifold alignment

Overview

Inter-data correspondences

One-step vs. two-step alignment

Instance-level vs. feature-level projections

Applications