Correlation coefficient


A correlation coefficient is a numerical measure of some type of correlation, meaning a statistical relationship between two variables. The variables may be two columns of a given data set of observations, often called a sample, or two components of a multivariate random variable with a known distribution.
Several types of correlation coefficient exist, each with their own definition and own range of usability and characteristics. They all assume values in the range from −1 to +1, where ±1 indicates the strongest possible agreement and 0 the strongest possible disagreement. As tools of analysis, correlation coefficients present certain problems, including the propensity of some types to be distorted by outliers and the possibility of incorrectly being used to infer a causal relationship between the variables.

Types

There are several different measures for the degree of correlation in data, depending on the kind of data: principally whether the data is a measurement, ordinal, or categorical.

Pearson

The Pearson product-moment correlation coefficient, also known as,, or Pearson's , is a measure of the strength and direction of the linear relationship between two variables that is defined as the covariance of the variables divided by the product of their standard deviations. This is the best-known and most commonly used type of correlation coefficient. When the term "correlation coefficient" is used without further qualification, it usually refers to the Pearson product-moment correlation coefficient.

Intra-class

is a descriptive statistic that can be used when quantitative measurements are made on units that are organized into groups; it describes how strongly units in the same group resemble each other.

Rank

is a measure of the relationship between the rankings of two variables or two rankings of the same variable:
The polychoric correlation coefficient measures association between two ordered-categorical variables. It's technically defined as the estimate of the Pearson correlation coefficient one would obtain if the two variables were measured on a continuous scale, instead of as ordered-category variables, and the two continuous variables followed a bivariate normal distribution. When both variables are dichotomous instead of ordered-categorical, the polychoric correlation coefficient is called the tetrachoric correlation coefficient.

Footnotes