Matching (statistics)


Matching is a statistical technique used to evaluate the effect of a treatment by comparing treated and non-treated units in an observational study or quasi-experiment. The goal of matching is to find, for every treated unit, at least one non-treated unit with similar observable characteristics against which the effect of the treatment can be assessed. Comparing the outcomes of treated units with those of their matched non-treated units gives an estimate of the treatment effect with reduced bias due to confounding. Propensity score matching, an early matching technique, was developed as part of the Rubin causal model, but has been shown to increase model dependence, bias, and inefficiency, and is no longer recommended compared to other matching methods.
Matching has been promoted by Donald Rubin. It was prominently criticized in economics by LaLonde, who compared estimates of treatment effects from an experiment to comparable estimates produced with matching methods and showed that the matching estimates were biased. Dehejia and Wahba re-evaluated LaLonde's critique and showed that matching could produce estimates close to the experimental results. Similar critiques have been raised in political science and sociology journals.
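The basic idea of pairing each treated unit with a similar non-treated unit can be illustrated with a minimal sketch of one-to-one nearest-neighbour matching with replacement. The `Unit` class, the function names, the unscaled Euclidean distance, and the toy data are all assumptions made for illustration; practical applications usually standardise the covariates or use another distance, and check covariate balance after matching.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Unit:
    """Hypothetical record for one study unit (illustrative only)."""
    covariates: Tuple[float, ...]  # observed characteristics used for matching
    treated: bool                  # True if the unit received the treatment
    outcome: float                 # observed outcome


def match_nearest(units: List[Unit]) -> List[Tuple[Unit, Unit]]:
    """Pair every treated unit with its closest non-treated unit.

    Matching is done with replacement, so one control may serve several
    treated units. Distance is plain Euclidean distance (an assumption).
    """
    treated = [u for u in units if u.treated]
    controls = [u for u in units if not u.treated]
    if not treated or not controls:
        raise ValueError("need at least one treated and one non-treated unit")
    pairs = []
    for t in treated:
        nearest = min(controls, key=lambda c: math.dist(t.covariates, c.covariates))
        pairs.append((t, nearest))
    return pairs


def effect_on_treated(pairs: List[Tuple[Unit, Unit]]) -> float:
    """Mean outcome difference between treated units and their matches."""
    return sum(t.outcome - c.outcome for t, c in pairs) / len(pairs)


# Toy data: covariates are (age, score); all values are made up.
units = [
    Unit((25.0, 1.0), True, 10.0),
    Unit((40.0, 3.0), True, 14.0),
    Unit((26.0, 1.1), False, 8.0),
    Unit((41.0, 2.9), False, 11.0),
    Unit((60.0, 5.0), False, 20.0),
]
pairs = match_nearest(units)
estimate = effect_on_treated(pairs)
print(estimate)  # 2.5
```

In this toy sample the third non-treated unit is never used, because it resembles none of the treated units.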

Analysis

When the outcome of interest is binary, the most general tool for the analysis of matched data is conditional logistic regression, as it handles strata of arbitrary size, accepts continuous or binary treatments, and can control for covariates. In particular cases, simpler tests such as the paired difference test, McNemar's test, and the Cochran-Mantel-Haenszel test are available.
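For one-to-one matched pairs with a binary outcome, McNemar's test uses only the discordant pairs, in which exactly one member of the pair has the outcome. The following sketch computes the exact two-sided version, which is a binomial test on the discordant pairs with success probability 0.5. The function name and the toy counts are assumptions for illustration.

```python
from math import comb
from typing import Iterable, Tuple


def mcnemar_exact(pairs: Iterable[Tuple[int, int]]) -> Tuple[int, int, float]:
    """Exact two-sided McNemar test for matched pairs.

    Each pair is (treated_outcome, control_outcome), both coded 0 or 1.
    Returns the two discordant counts and the p-value.
    """
    pairs = list(pairs)
    treated_only = sum(1 for t, c in pairs if t == 1 and c == 0)
    control_only = sum(1 for t, c in pairs if t == 0 and c == 1)
    n = treated_only + control_only
    if n == 0:
        return treated_only, control_only, 1.0  # no discordant pairs, no evidence
    k = min(treated_only, control_only)
    # Probability of k or fewer successes under Binomial(n, 0.5), doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return treated_only, control_only, min(1.0, 2 * tail)


# Toy data: 8 pairs where only the treated unit has the outcome,
# 2 where only the control has it, and 10 concordant pairs.
pairs = [(1, 0)] * 8 + [(0, 1)] * 2 + [(1, 1)] * 5 + [(0, 0)] * 5
b, c, p_value = mcnemar_exact(pairs)
print(b, c, p_value)  # 8 2 0.109375
```

The concordant pairs do not affect the result, which is why the test is suited to matched designs.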
When the outcome of interest is continuous, the average treatment effect is estimated.
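For a continuous outcome and one-to-one matched pairs, a simple estimate is the mean of the within-pair outcome differences, as in a paired difference test. The sketch below is a minimal illustration under that assumption; the function name and data are made up, and the standard error shown treats the pairs as independent and ignores any uncertainty from the matching step itself.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List, Tuple


def paired_effect(pairs: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Mean within-pair difference and its naive standard error.

    Each pair is (treated_outcome, matched_control_outcome).
    """
    diffs = [treated - control for treated, control in pairs]
    if len(diffs) < 2:
        raise ValueError("need at least two matched pairs")
    estimate = mean(diffs)
    std_error = stdev(diffs) / sqrt(len(diffs))  # sample SD of the differences
    return estimate, std_error


# Toy matched outcomes (all values are made up).
matched = [(10.0, 8.0), (14.0, 11.0), (9.0, 9.5), (12.0, 10.0)]
estimate, std_error = paired_effect(matched)
print(estimate)  # 1.625
```

Dividing the estimate by the standard error gives the paired t statistic with one fewer degree of freedom than the number of pairs.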
Matching can also be used to "pre-process" a sample before analysis via another technique, such as regression analysis.

Overmatching

Overmatching is matching on a variable that is actually a result of the exposure, that is, a mediator. If the analysis is stratified on the mediator, the relation between the exposure and the disease is very likely to be obscured. Overmatching thus causes statistical bias.
For example, matching the control group by gestation length and/or the number of multiple births when estimating perinatal mortality and birthweight after in vitro fertilization is overmatching, since IVF itself increases the risk of premature birth and multiple birth.
Overmatching may be regarded as a form of sampling bias that decreases the external validity of a study, because the controls become more similar to the cases with regard to exposure than the general population is.
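The way conditioning on a mediator obscures an effect can be illustrated with a small simulation. This is a hypothetical sketch, not drawn from any study: the exposure affects the outcome only through a binary mediator, and the probabilities and effect size are assumed values. Stratifying on the mediator stands in here for matching on it.

```python
import random
from typing import List, Tuple

Row = Tuple[bool, bool, float]  # (exposed, mediator, outcome)


def simulate(n: int = 20000, seed: int = 0) -> List[Row]:
    """Exposure raises the chance of the mediator; the mediator raises the outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        exposed = rng.random() < 0.5
        mediator = rng.random() < (0.8 if exposed else 0.2)  # assumed probabilities
        outcome = 2.0 * mediator + rng.gauss(0.0, 1.0)       # assumed effect size
        rows.append((exposed, mediator, outcome))
    return rows


def mean_difference(rows: List[Row]) -> float:
    """Mean outcome of exposed units minus mean outcome of unexposed units."""
    exposed = [y for x, _, y in rows if x]
    unexposed = [y for x, _, y in rows if not x]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


def stratified_difference(rows: List[Row]) -> float:
    """Exposure effect within mediator strata, weighted by stratum size."""
    total = 0.0
    for level in (False, True):
        stratum = [r for r in rows if r[1] == level]
        total += len(stratum) / len(rows) * mean_difference(stratum)
    return total


rows = simulate()
crude = mean_difference(rows)             # close to the true total effect, 1.2
overmatched = stratified_difference(rows)  # close to 0: the effect is hidden
print(round(crude, 2), round(overmatched, 2))
```

Under these assumptions the true total effect of the exposure is 2 × (0.8 − 0.2) = 1.2, which the unadjusted comparison recovers approximately, while the comparison within mediator strata is close to zero.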