Duncan's new multiple range test

In statistics, Duncan's new multiple range test is a multiple comparison procedure developed by David B. Duncan in 1955. Duncan's MRT belongs to the general class of multiple comparison procedures that use the studentized range statistic q_r to compare sets of means.
David B. Duncan developed this test as a modification of the Student–Newman–Keuls method that would have greater power. Duncan's MRT is especially protective against false negative error at the expense of having a greater risk of making false positive errors. Duncan's test is commonly used in agronomy and other agricultural research.
The result of the test is a set of subsets of means, where in each subset means have been found not to be significantly different from one another.

Definition

Assumptions:

1.A sample of observed means , which have been drawn independently from n normal populations with "true" means, respectively.

2.A common standard error. This standard error is unknown, but there is available the usual estimate , which is independent of the observed means and is based on a number of degrees of freedom, denoted by ..
The exact definition of the test is:
The difference between any two means in a set of n means is significant provided the range of each and every subset which contains the given means is significant according to an level range test where , and is the number of means in the subset concerned.
Exception: The sole exception to this rule is that no difference
between two means can be declared significant if the two means concerned
are both contained in a subset of the means which has a non-significant
range.

Procedure

The procedure consists of a series of pairwise comparisons between means. Each comparison is performed at a significance level , defined by the number of means separating the two means compared. The test are performed sequentially, where the result of a test determines which test is performed next.
The tests are performed in the following order: the largest minus the smallest, the largest minus the second smallest, up to the largest minus the second largest; then the second largest minus the smallest, the second largest minus the second smallest, and so on, finishing with the second smallest minus the smallest.
With only one exception, given below, each difference is significant if it exceeds the corresponding shortest significant range; otherwise it is not significant. Where the shortest significant range is the significant studentized range, multiplied by the standard error.
The shortest significant range will be designated as, where is the number means in the subset.
The sole exception to this rule is that no difference between two means can be declared significant if the two means concerned are both contained in a subset of the means which has a non-significant range.
An algorithm for performing the test is as follows:
1.Rank the sample means, largest to smallest.
2. For each sample mean, largest to smallest, do the following:
2.1 for each sample mean,, for smallest up to.
2.1.1 compare to critical value,
2.1.2 if does not exceed the critical value, the subset is declared not siginificantlly different:
2.1.2.1 Go to next iteration of loop 2.
2.1.3 Otherwise, keep going with loop 2.1

Critical values

Duncan's multiple range test makes use of the studentized range distribution in order to determine critical values for comparisons between means. Note that different comparisons between means may differ by their significance levels- since the significance level is subject to the size of the subset of means in question.
Let us denote as the quantile of the studentized range distribution, with p observations, and degrees of freedom for the second sample.
Let us denote as the standardized critical value, given by the rule:
If p=2

Else

The shortest critical range, is computed as :
For ->∞, a tabulation exists for an exact value of Q.
A word of caution is needed here: notations for Q and R are not the same throughout literature, where Q is sometimes denoted as the shortest significant interval, and R as the significant quantile for studentized range distribution.

Numeric example

Let us look at the example of 5 treatment means:

Treatments	T1	T2	T3	T4	T5
Treatment Means	9.8	15.4	17.6	21.6	10.8
Rank	5	3	2	1	4

With a standard error of, and .
Using a known tabulation for Q, one reaches the values of :

Now we may obtain the values of the shortest significant range, by the formula:

Reaching:

Then, the observed differences between means are tested, beginning with the largest versus smallest, which would be compared with the least significant range Next, the difference of the largest and the second smallest is computed and compared with the least significant difference.

If an observed difference is greater than the corresponding shortest significant range, then we conclude that the pair of means in question is significantly different.
If an observed difference is smaller than the corresponding shortest significant range, all differences sharing the same upper mean are considered insignificant, in order to prevent contradictions.

For our case, the comparison will yield:

We see that there are significant differences between all pairs of treatments except and. A graph underlining those means that are not significantly different is shown below:

T1 T5 T2 T3 T4

Protection and significance levels based on degrees of freedom

The new multiple range test proposed by Duncan makes use of special protection levels based upon degrees of freedom. Let be the protection level for testing the significance of a difference between two means; that is, the probability that a significant difference between two means will not be found if the population means are equal. Duncan reasons that one has p-1 degrees of freedom for testing p ranked mean, and hence one may conduct p-1 independent tests, each with protection level . Hence, the joint protection level is:
where
that is, the probability that one finds no significant differences in making p-1 independent tests, each at protection level, is , under the hypothesis that all p population means are equal.
In general: the difference between any two means in a set of n means is significant provided the range of each and every subset, which contains the given means, is significant according to an –level range test, where p is the number of means in the subset concerned.
For , the protection level can be tabulated for various value of r as follows:

	Protection level	probability of falsely rejecting
p=2	0.95	0.05
p=3	0.903	0.097
p=4	0.857	0.143
p=5	0.815	0.185
p=6	0.774	0.226
p=7	0.735	0.265

Note that although this procedure makes use of the Studentized range, his error rate is neither on an experiment-wise basis nor on a per- comparisons basis. Duncan's multiple range test does not control the familywise error rate. See Criticism Section for further details.

Duncan Bayesian multiple comparison procedure

Duncan also gave the first Bayesian multiple comparison procedure, for the pairwise comparisons among the means in a one-way layout.
This multiple comparison procedure is different for the one discussed above.
Duncan's Bayesian MCP discusses the differences between ordered group means, where the statistics in question are pairwise comparison.
Duncan modeled the consequences of two or more means being equal using additive loss functions within and across the pairwise comparisons. If one assumes the same loss function across the pairwise comparisons, one needs to specify only one constant K, and this indicates the relative seriousness of type I to type II errors in each pairwise comparison.
A study, which performed by Juliet Popper Shaffer, has shown that the method proposed by Duncan, modified to provide weak control of FWE and using an empirical estimate of the variance of the population means, has good properties both from the Bayesian point of view, as a minimum- risk method, and from the frequentist point of view, with good average power.
In addition, results indicate considerable similarity in both risk and average power between Duncan's modified procedure and the Benjamini and Hochberg False discovery rate -controlling procedure, with the same weak familywise error control.

Criticism

Duncan's test has been criticised as being too liberal by many statisticians including Henry Scheffé, and John W. Tukey.
Duncan argued that a more liberal procedure was appropriate because in real world practice the global null hypothesis H₀ = "All means are equal" is often false and thus traditional statisticians overprotect a probably false null hypothesis against type I errors. According to Duncan, one should adjust the protection levels for different p-mean comparisons according to the problem discussed. The example discussed by Duncan in his 1955 paper is of a comparison of many means,when one is interested only in two-mean and three-mean comparisons, and general p-mean comparisons are of no special interest.
Duncan's multiple range test is very “liberal” in terms of Type I errors. The following example will illustrate why:
Let us assume one is truly interested, as Duncan suggested, only with the correct ranking of subsets of size 4 or below. Let us also assume that one performs the simple pairwise comparison with a protection level. Given an overall set of 100 means, let us look at the null hypotheses of the test:
There are null hypotheses for the correct ranking of each 2 means. The significance level of each hypothesis is
There are null hypotheses for the correct ranking of each 3 means. The significance level of each hypothesis is
There are null hypotheses for the correct ranking of each 4 means. The significance level of each hypothesis is
As we can see, the test has two main problems, regarding the type I errors:

Duncan’s tests is based on the Newman–Keuls procedure, which does not protect the familywise error rate
Duncan’s test intentionally raises the alpha levels in each step of the Newman–Keuls procedure.

Therefore, it is advised not to use the procedure discussed.
Duncan later developed the Duncan–Waller test which is based on Bayesian principles. It uses the obtained value of F to estimate the prior probability of the null hypothesis being true.

Different approaches to the problem

If one still wishes to address the problem of finding similar subsets of group means, other solutions are found in literature.
Tukey's range test is commonly used to compare pairs of means, this procedure controls the familywise error rate in the strong sense.
Another solution is to perform Student's t-test of all pairs of means, and then to use FDR Controlling procedure.
Other possible solutions, which do not include hypothesis testing, but result in a partition of subsets include Clustering & Hierarchical Clustering. These solutions differ from the approach presented in this method:

By being distance/density based, and not distribution based.
Needing a larger group of means, in order to produce significant results or working with the entire data set.

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...