Euclidean distance


In mathematics, the Euclidean distance or Euclidean metric is the "ordinary" straight-line distance between two points in Euclidean space. With this distance, Euclidean space becomes a metric space. The associated norm is called the Euclidean norm. Older literature refers to the metric as the Pythagorean metric. A generalized term for the Euclidean norm is the L2 norm or L2 distance.

Definition

The Euclidean distance between points p and q is the length of the line segment connecting them.
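In Cartesian coordinates, if $p = (p_1, p_2, \dots, p_n)$ and $q = (q_1, q_2, \dots, q_n)$ are two points in Euclidean n-space, then the Euclidean distance $d$ from p to q, or from q to p, is given by the Pythagorean formula:
$$d(p, q) = d(q, p) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2 + \cdots + (q_n - p_n)^2} = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}. \qquad (1)$$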
The position of a point in a Euclidean n-space is a Euclidean vector. So, p and q may be represented as Euclidean vectors, starting from the origin of the space with their tips ending at the two points. The Euclidean norm, or Euclidean length, or magnitude of a vector measures the length of the vector:
$$\|p\| = \sqrt{p_1^2 + p_2^2 + \cdots + p_n^2} = \sqrt{p \cdot p},$$
where the last expression involves the dot product.
If a vector is described as a directed line segment from the origin of the Euclidean space to a point in that space, its length is the distance from its tail to its tip. The Euclidean norm of a vector is thus the Euclidean distance between its tail and its tip.
When the relationship between points p and q involves a direction (for example, from p to q), it can itself be represented by a vector, given by
$$q - p = (q_1 - p_1, q_2 - p_2, \dots, q_n - p_n).$$
In a two- or three-dimensional space, this can be visually represented as an arrow from p to q. In any space it can be regarded as the position of q relative to p. It may also be called a displacement vector if p and q represent two positions of some moving point.
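The Euclidean distance between p and q is just the Euclidean length of this displacement vector:
$$\|q - p\| = \sqrt{(q - p) \cdot (q - p)}, \qquad (2)$$
which is equivalent to equation 1, and also to:
$$\|q - p\| = \sqrt{\|p\|^2 + \|q\|^2 - 2\, p \cdot q}.$$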
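The definition can be checked numerically in three ways: summing squared coordinate differences, taking the norm of the displacement vector q − p, and expanding with dot products. The following is a minimal Python sketch; the function names and the sample points are illustrative assumptions, not part of any standard library.

```python
import math

def dot(u, v):
    """Dot product of two equal-length coordinate sequences."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Euclidean norm: square root of the dot product of v with itself."""
    return math.sqrt(dot(v, v))

def distance_by_coordinates(p, q):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

def distance_by_displacement(p, q):
    """Euclidean length of the displacement vector q - p."""
    return norm([qi - pi for pi, qi in zip(p, q)])

def distance_by_dot_products(p, q):
    """Expanded form: sqrt(|p|^2 + |q|^2 - 2 p.q)."""
    return math.sqrt(dot(p, p) + dot(q, q) - 2 * dot(p, q))

# Hypothetical sample points; the coordinate differences are (3, 4, 0).
p = (1.0, 2.0, 3.0)
q = (4.0, 6.0, 3.0)

print(distance_by_coordinates(p, q))   # 5.0
print(distance_by_displacement(p, q))  # 5.0
print(distance_by_dot_products(p, q))  # 5.0
```

In floating-point arithmetic the dot-product expansion can lose accuracy when p and q are nearly equal, because it subtracts two large, nearly equal quantities; the coordinate-difference form is usually the safer choice.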

One dimension

In the context of Euclidean geometry, a metric is established in one dimension by fixing two points on a line, and choosing one to be the origin. The length of the line segment between these points defines the unit of distance and the direction from the origin to the second point is defined as the positive direction. This line segment may be translated along the line to build longer segments whose lengths correspond to multiples of the unit distance. In this manner real numbers can be associated to points on the line and these are the Cartesian coordinates of the points on what may now be called the real line. As an alternate way to establish the metric, instead of choosing two points on the line, choose one point to be the origin, a unit of length and a direction along the line to call positive. The second point is then uniquely determined as the point on the line that is at a distance of one positive unit from the origin.
The distance between any two points on the real line is the absolute value of the numerical difference of their coordinates. It is common to identify the name of a point with its Cartesian coordinate. Thus if p and q are two points on the real line, then the distance between them is given by:
$$\sqrt{(q - p)^2} = |q - p|.$$
In one dimension, there is a single homogeneous, translation-invariant metric (up to a scale factor of length): the Euclidean distance, induced by the absolute-value norm. In higher dimensions there are other possible norms, such as the $\ell^p$ norms. In one dimension there are also other metrics, but they are not induced by norms.

Two dimensions

In the Euclidean plane, if $p = (p_1, p_2)$ and $q = (q_1, q_2)$, then the distance is given by
$$d(p, q) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2}.$$
This is equivalent to the Pythagorean theorem.
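Alternatively, it follows from the dot-product form of the distance, $\|q - p\|^2 = \|p\|^2 + \|q\|^2 - 2\, p \cdot q$, that if the polar coordinates of the point p are $(r_1, \theta_1)$ and those of q are $(r_2, \theta_2)$, then the distance between the points is
$$\sqrt{r_1^2 + r_2^2 - 2 r_1 r_2 \cos(\theta_1 - \theta_2)}.$$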
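As a quick check of the polar form, the sketch below compares it with the Cartesian formula after converting the same two points. The helper names `polar_distance` and `to_cartesian`, and the sample coordinates, are assumptions made for illustration; angles are in radians.

```python
import math

def polar_distance(r1, theta1, r2, theta2):
    """Distance between two points given in polar coordinates (radians)."""
    return math.sqrt(r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * math.cos(theta1 - theta2))

def to_cartesian(r, theta):
    """Convert polar coordinates (r, theta) to Cartesian (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))

# Hypothetical sample points: (r, theta) = (2, pi/6) and (3, pi/2).
p = to_cartesian(2.0, math.pi / 6)
q = to_cartesian(3.0, math.pi / 2)

cartesian = math.hypot(q[0] - p[0], q[1] - p[1])
polar = polar_distance(2.0, math.pi / 6, 3.0, math.pi / 2)

# Both values approximate sqrt(7), since 4 + 9 - 2*2*3*cos(pi/3) = 7.
print(math.isclose(cartesian, polar))  # True
```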

Three dimensions

In three-dimensional Euclidean space, the distance is
$$d(p, q) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2 + (q_3 - p_3)^2}.$$

n dimensions

In general, for an n-dimensional space, the distance is
$$d(p, q) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2 + \cdots + (q_n - p_n)^2}.$$
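The same formula covers every dimension, so a single routine is enough. The sketch below uses a hand-written function (the name `euclidean_distance` and the sample points are assumptions) and compares it with `math.dist`, which is available in the Python standard library from version 3.8.

```python
import math

def euclidean_distance(p, q):
    """Euclidean distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

# Hypothetical sample pairs in one, two, three and four dimensions.
examples = [
    ((7.0,), (3.0,)),                     # one dimension: reduces to |q - p|
    ((0.0, 0.0), (3.0, 4.0)),             # two dimensions
    ((1.0, 2.0, 3.0), (3.0, 5.0, 9.0)),   # three dimensions
    ((0.0,) * 4, (1.0,) * 4),             # four dimensions
]

for p, q in examples:
    # Print the dimension, the hand-written result and the library result.
    print(len(p), euclidean_distance(p, q), math.dist(p, q))
```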

Squared Euclidean distance

The square of the standard Euclidean distance, which is known as the squared Euclidean distance (SED), is also of interest; as an equation:
$$d^2(p, q) = (q_1 - p_1)^2 + (q_2 - p_2)^2 + \cdots + (q_n - p_n)^2.$$
Squared Euclidean distance is of central importance in estimating parameters of statistical models, where it is used in the method of least squares, a standard approach to regression analysis. The corresponding loss function is the squared error loss, and places progressively greater weight on larger errors. The corresponding risk function is mean squared error.
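To make the link to least squares concrete, the sketch below fits a straight line to a few made-up data points and treats the sum of squared residuals as the squared Euclidean distance between the observed and predicted value vectors. The data, the function names, and the use of the closed-form simple linear regression solution are illustrative assumptions.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up observations for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 8.0]

slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]

# Sum of squared residuals = squared distance between ys and the predictions.
sum_of_squares = squared_distance(ys, predictions)
mean_squared_error = sum_of_squares / len(ys)
print(slope, intercept, mean_squared_error)

# A different line, here y = 2x + 1, lies farther from the data in squared distance.
other = [2.0 * x + 1.0 for x in xs]
print(squared_distance(ys, other) > sum_of_squares)  # True
```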
Squared Euclidean distance is not a metric, as it does not satisfy the triangle inequality. However, it is a more general notion of distance, namely a divergence, and can be used as a statistical distance. The Pythagorean theorem is simpler in terms of squared distance, since there is no square root; if the segment from p to q is perpendicular to the segment from q to r, then:
$$d^2(p, r) = d^2(p, q) + d^2(q, r).$$
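Both claims can be seen on small examples. The sketch below uses three collinear points to show the triangle inequality failing for squared distance, and a right angle to show the Pythagorean identity; the points and the function name `squared_distance` are chosen purely for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((qi - pi) ** 2 for pi, qi in zip(p, q))

# Three collinear points on the real line: 0, 1 and 2.
a, b, c = (0.0,), (1.0,), (2.0,)
triangle_holds = squared_distance(a, c) <= squared_distance(a, b) + squared_distance(b, c)
print(triangle_holds)  # False, because 4 > 1 + 1

# A right angle at q: the segment pq is perpendicular to the segment qr.
p, q, r = (3.0, 0.0), (0.0, 0.0), (0.0, 4.0)
pythagoras_holds = squared_distance(p, r) == squared_distance(p, q) + squared_distance(q, r)
print(pythagoras_holds)  # True, because 25 = 9 + 16
```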
In information geometry, the Pythagorean identity can be generalized from SED to other Bregman divergences, including relative entropy, allowing generalized forms of least squares to be used to solve non-linear problems.
The SED is a smooth, strictly convex function of the two points, unlike the distance, which is not smooth when two points are equal and is not strictly convex. The SED is thus preferred in optimization theory, since it allows convex analysis to be used. Since squaring is a monotonic function of non-negative values, minimizing the SED is equivalent to minimizing the Euclidean distance, so the optimization problem is equivalent in terms of either, but easier to solve using the SED.
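The equivalence of the two minimization problems can be illustrated with a nearest-point search: ranking candidates by squared distance gives the same winner as ranking by distance, without computing any square roots. The candidate points and names below are assumptions for illustration.

```python
import math

def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((qi - pi) ** 2 for pi, qi in zip(p, q))

# Hypothetical target and candidate points.
target = (1.0, 1.0)
candidates = [(4.0, 5.0), (2.0, 1.5), (-3.0, 0.0), (1.0, 3.0)]

# Minimize the distance itself (requires a square root per candidate).
nearest_by_distance = min(candidates, key=lambda c: math.sqrt(squared_distance(c, target)))

# Minimize the squared distance (no square root needed).
nearest_by_sed = min(candidates, key=lambda c: squared_distance(c, target))

print(nearest_by_distance == nearest_by_sed)  # True
print(nearest_by_sed)                         # (2.0, 1.5)
```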
If one of the points is fixed, the SED can be interpreted as a potential function, in which case a normalization factor of one half is used, and the sign may be switched, depending on convention. In detail, given two points $p$ and $q$, the vector $p - q$ points from q to p and has magnitude equal to their Euclidean distance. If one fixes q, one can thus define a smooth vector field "pointing at q" by $p \mapsto q - p = -(p - q)$. This is the negative gradient of the scalar-valued function "half SED from q", where the half cancels the two in the power rule. Writing half the squared distance from q as $\varphi_q(p) = \tfrac{1}{2} d^2(p, q)$, one has $-\nabla \varphi_q(p) = q - p$. Alternatively, one can consider the vector field pointing from q, and omit the minus sign.
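The gradient relationship can be checked numerically. The sketch below approximates the gradient of "half SED from q" by central differences and compares it with p − q; negating it gives the field pointing at q. The function names, step size, and sample points are assumptions for illustration.

```python
def half_sed(p, q):
    """Half the squared Euclidean distance between p and q."""
    return 0.5 * sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def numerical_gradient(f, p, h=1e-6):
    """Central-difference approximation of the gradient of f at p."""
    grad = []
    for i in range(len(p)):
        forward = list(p)
        backward = list(p)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad

# Hypothetical fixed point q and evaluation point p.
q = (1.0, -2.0, 0.5)
p = (3.0, 1.0, -1.5)

grad = numerical_gradient(lambda x: half_sed(x, q), p)
pointing_from_q = [pi - qi for pi, qi in zip(p, q)]   # p - q
pointing_at_q = [-g for g in grad]                    # approximates q - p

print(grad)             # close to p - q = [2.0, 3.0, -2.0]
print(pointing_at_q)    # close to q - p = [-2.0, -3.0, 2.0]
```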
In information geometry, the notion of a vector field "pointing from one point to another" can be generalized to statistical manifolds: one can use an affine connection to connect tangent vectors at different points and the exponential map to flow from one point to another, and on a statistical manifold this is invertible, defining a unique "difference vector" from any given point to another. In this context, the SED generalizes to a divergence that generates the information geometry of the manifold; a uniform construction of such a divergence is called a canonical divergence.
In the field of rational trigonometry, the SED is referred to as quadrance.