Tidy data


Tidy data is an alternative name for the common statistical form called a model matrix or data matrix. A data matrix is defined as follows:
A standard method of displaying a multivariate set of data is in the form of a data matrix in which rows correspond to sample individuals and columns to variables, so that the entry in the ith row and jth column gives the value of the jth variate as measured or observed on the ith individual.
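The row and column convention in this definition can be illustrated with a small sketch. The variable names and measurements below are invented for illustration only, and the indices are zero-based as is usual in Python, whereas the definition counts rows and columns from one.

```python
# Made-up measurements: 3 individuals (rows) by 2 variables (columns).
variables = ["height_cm", "weight_kg"]
data_matrix = [
    [170.0, 65.0],  # individual 0
    [182.5, 80.2],  # individual 1
    [158.3, 52.7],  # individual 2
]

# The entry in row i, column j is the value of variable j for individual i.
i, j = 1, 0  # zero-based: second individual, first variable
value = data_matrix[i][j]
print(f"{variables[j]} of individual {i}: {value}")

# A column collects every observed value of a single variable.
heights = [row[0] for row in data_matrix]
print(heights)
```

Reading across a row gives everything recorded for one individual, while reading down a column gives one variable across all individuals.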

Hadley Wickham later defined "Tidy Data" as data sets that are arranged such that each variable is a column and each observation is a row.
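As a rough illustration of this arrangement, the following sketch reshapes a "wide" table, in which the column headers are themselves values of a variable, into one row per observation. The table contents, the column names, and the helper `to_tidy` are hypothetical and use only the Python standard library; data-frame libraries provide their own reshaping functions for this purpose.

```python
# Hypothetical wide table: one row per subject, one column per year.
# The headers "2019" and "2020" are values of a "year" variable,
# so this layout does not have one column per variable.
wide_rows = [
    {"subject": "A", "2019": 5.1, "2020": 5.4},
    {"subject": "B", "2019": 4.8, "2020": 4.9},
]


def to_tidy(rows, id_column, variable_name, value_name):
    """Reshape wide rows so that each output row is a single observation."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every output row
            tidy.append(
                {id_column: row[id_column], variable_name: column, value_name: value}
            )
    return tidy


tidy_rows = to_tidy(wide_rows, "subject", "year", "score")
for row in tidy_rows:
    print(row)
```

After reshaping, each of the three variables (subject, year, score) has its own column, and each row records one score for one subject in one year.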
Data arrangement is an important consideration in data processing, but it should not be confused with data cleansing, which is a separate and equally important task.
Other relevant formulations include denormalization prior to machine learning modeling and the use of semantic triples as an intermediate representation.
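The relationship between a row-per-observation table and semantic triples can be sketched as follows. This is only an illustrative assumption about how such a conversion might look: the sample rows, the `id` key, and both helper functions are hypothetical, and each triple is represented as a plain (subject, predicate, object) tuple rather than in any particular triple-store format.

```python
# Hypothetical rows, each identified by an "id" column.
tidy_rows = [
    {"id": "obs1", "species": "setosa", "petal_length": 1.4},
    {"id": "obs2", "species": "virginica", "petal_length": 5.5},
]


def rows_to_triples(rows, key):
    """Emit one (subject, predicate, object) triple per non-key cell."""
    triples = []
    for row in rows:
        subject = row[key]
        for predicate, obj in row.items():
            if predicate != key:
                triples.append((subject, predicate, obj))
    return triples


def triples_to_rows(triples, key):
    """Group triples by subject to rebuild one row per observation."""
    by_subject = {}
    for subject, predicate, obj in triples:
        by_subject.setdefault(subject, {key: subject})[predicate] = obj
    return list(by_subject.values())


triples = rows_to_triples(tidy_rows, "id")
for triple in triples:
    print(triple)
```

Because every cell becomes a separate triple, the triples can be regrouped by subject to recover the original rows, which is what makes them usable as an intermediate representation in this sketch.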