Multidimensional scaling


Multidimensional scaling is a means of visualizing the level of similarity of individual cases of a dataset. MDS is used to translate "information about the pairwise 'distances' among a set of n objects or individuals" into a configuration of n points mapped into an abstract Cartesian space.
More technically, MDS refers to a set of related ordination techniques used in information visualization, in particular to display the information contained in a distance matrix. It is a form of non-linear dimensionality reduction.
Given a distance matrix with the distances between each pair of objects in a set, and a chosen number of dimensions, N, an MDS algorithm places each object into N-dimensional space such that the between-object distances are preserved as well as possible. For N=1, 2, and 3, the resulting points can be visualized on a scatter plot.

Types

MDS algorithms fall into a taxonomy, depending on the meaning of the input matrix:

Classical multidimensional scaling

It is also known as Principal Coordinates Analysis, Torgerson Scaling or Torgerson–Gower scaling. It takes an input matrix giving dissimilarities between pairs of items and outputs a coordinate matrix whose configuration minimizes a loss function called strain. For example, given the aerial distances between many cities in a matrix, where is the distance between the coordinates of and city, given by, you want to find the coordinates of the cities. This problem is addressed in classical MDS.
General forms of loss functions called Stress in distance MDS and Strain in classical MDS. The strain is given by:
, where are the terms of the matrix defined on step 2 of the following algorithm.

Metric multidimensional scaling (mMDS)

It is a superset of classical MDS that generalizes the optimization procedure to a variety of loss functions and input matrices of known distances with weights and so on. A useful loss function in this context is called stress, which is often minimized using a procedure called stress majorization. Metric MDS minimizes the cost function called “Stress” which is a residual sum of squares:
: or,

Non-metric multidimensional scaling (nMDS)

In contrast to metric MDS, non-metric MDS finds both a non-parametric monotonic relationship between the dissimilarities in the item-item matrix and the Euclidean distances between items, and the location of each item in the low-dimensional space. The relationship is typically found using isotonic regression: let denote the vector of proximities, a monotonic transformation of, and the point distances; then coordinates have to be found, that minimize the so-called stress,
Louis Guttman's smallest space analysis is an example of a non-metric MDS procedure.

Generalized multidimensional scaling (GMD)

An extension of metric multidimensional scaling, in which the target space is an arbitrary smooth non-Euclidean space. In cases where the dissimilarities are distances on a surface and the target space is another surface, GMDS allows finding the minimum-distortion embedding of one surface into another.

Details

The data to be analyzed is a collection of objects on which a distance function is defined,
These distances are the entries of the dissimilarity matrix
The goal of MDS is, given, to find vectors
such that
where is a vector norm. In classical MDS, this norm is the Euclidean distance, but, in a broader sense, it may be a metric or arbitrary distance function.
In other words, MDS attempts to find a mapping from the objects into such that distances are preserved. If the dimension is chosen to be 2 or 3, we may plot the vectors to obtain a visualization of the similarities between the objects. Note that the vectors are not unique: With the Euclidean distance, they may be arbitrarily translated, rotated, and reflected, since these transformations do not change the pairwise distances.
There are various approaches to determining the vectors. Usually, MDS is formulated as an optimization problem, where is found as a minimizer of some cost function, for example,
A solution may then be found by numerical optimization techniques. For some particularly chosen cost functions, minimizers can be stated analytically in terms of matrix eigendecompositions.

Procedure

There are several steps in conducting MDS research:
  1. Formulating the problem – What variables do you want to compare? How many variables do you want to compare? What purpose is the study to be used for?
  2. Obtaining input data – For example, :- Respondents are asked a series of questions. For each product pair, they are asked to rate similarity. The first question could be for Coke/Pepsi for example, the next for Coke/Hires rootbeer, the next for Pepsi/Dr Pepper, the next for Dr Pepper/Hires rootbeer, etc. The number of questions is a function of the number of brands and can be calculated as where Q is the number of questions and N is the number of brands. This approach is referred to as the “Perception data : direct approach”. There are two other approaches. There is the “Perception data : derived approach” in which products are decomposed into attributes that are rated on a semantic differential scale. The other is the “Preference data approach” in which respondents are asked their preference rather than similarity.
  3. Running the MDS statistical program – Software for running the procedure is available in many statistical software packages. Often there is a choice between Metric MDS, and Nonmetric MDS.
  4. Decide number of dimensions – The researcher must decide on the number of dimensions they want the computer to create. Interpretability of the MDS solution is often important, and lower dimensional solutions will typically be easier to interpret and visualize. However, dimension selection is also an issue of balancing underfitting and overfitting. Lower dimensional solutions may underfit by leaving out important dimensions of the dissimilarity data. Higher dimensional solutions may overfit to noise in the dissimilarity measurements. Model selection tools like AIC/BIC, Bayes factors, or cross-validation can thus be useful to select the dimensionality that balances underfitting and overfitting.
  5. Mapping the results and defining the dimensions – The statistical program will map the results. The map will plot each product. The proximity of products to each other indicate either how similar they are or how preferred they are, depending on which approach was used. How the dimensions of the embedding actually correspond to dimensions of system behavior, however, are not necessarily obvious. Here, a subjective judgment about the correspondence can be made.
  6. Test the results for reliability and validity – Compute R-squared to determine what proportion of variance of the scaled data can be accounted for by the MDS procedure. An R-square of 0.6 is considered the minimum acceptable level. An R-square of 0.8 is considered good for metric scaling and.9 is considered good for non-metric scaling. Other possible tests are Kruskal’s Stress, split data tests, data stability tests, and test-retest reliability.
  7. Report the results comprehensively – Along with the mapping, at least distance measure and reliability should be given. It is also very advisable to give the algorithm, which is often defined by the program used, if you have given a start configuration or had a random choice, the number of runs, the assessment of dimensionality, the Monte Carlo method results, the number of iterations, the assessment of stability, and the proportional variance of each axis.

    Implementations