Coleman–Liau index


The Coleman–Liau index is a readability test designed by Meri Coleman and T. L. Liau to gauge the understandability of a text. Like the Flesch–Kincaid Grade Level, Gunning fog index, SMOG index, and Automated Readability Index, its output approximates the U.S. grade level thought necessary to comprehend the text.
Like the ARI but unlike most of the other indices, Coleman–Liau relies on characters instead of syllables per word. Although opinion varies on its accuracy as compared to the syllable/word and complex word indices, characters are more readily and accurately counted by computer programs than are syllables.
The Coleman–Liau index was designed to be easily calculated mechanically from samples of hard-copy text. Unlike syllable-based readability indices, it does not require that the character content of words be analyzed, only their length in characters. Therefore, it could be used in conjunction with theoretically simple mechanical scanners that would only need to recognize character, word, and sentence boundaries, removing the need for full optical character recognition or manual keypunching.
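As a rough illustration of how little analysis this requires, the following Python sketch counts letters, words, and sentences using only simple boundary rules. The function name `count_units` and the specific rules (alphanumeric characters as letters, whitespace-delimited words, terminal punctuation as sentence ends) are assumptions made for this example, not the procedure Coleman and Liau specified.

```python
import re


def count_units(text: str) -> tuple[int, int, int]:
    """Return (letters, words, sentences) using simple boundary rules.

    Hypothetical helper: the counting rules below are illustrative
    assumptions, not the original authors' scanner procedure.
    """
    # Letters: letters and digits only; spaces and punctuation are ignored.
    letters = sum(1 for ch in text if ch.isalnum())
    # Words: maximal runs of non-whitespace characters.
    words = len(text.split())
    # Sentences: a run of '.', '!' or '?' followed by whitespace or end of text.
    # This naive rule can miscount abbreviations such as "U.S." before a space.
    sentences = len(re.findall(r"[.!?]+(?=\s|$)", text))
    return letters, words, sentences


print(count_units("The cat sat. It was happy!"))  # (19, 6, 2)
```

No step here looks inside a word beyond testing whether each character is a letter or digit, which is the property the index was designed around.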

Formula

The Coleman–Liau index is calculated with the following formula:
CLI = 0.0588 × L − 0.296 × S − 15.8
where L is the average number of letters per 100 words and S is the average number of sentences per 100 words.
As an example, we shall use the abstract from Coleman and Liau's original 1975 paper introducing the index:
Existing computer programs that measure readability are based largely upon subroutines which estimate number of syllables, usually by counting vowels. The shortcoming in estimating syllables is that it necessitates keypunching the prose into the computer. There is no need to estimate syllables since word length in letters is a better predictor of readability than word length in syllables. Therefore, a new readability formula was computed that has for its predictors letters per 100 words and sentences per 100 words. Both predictors can be counted by an optical scanning device, and thus the formula makes it economically feasible for an organization such as the U.S. Office of Education to calibrate the readability of all textbooks for the public school system.

The abstract contains 5 sentences, 119 words, and 639 letters or digits. L and S are therefore obtained as follows:
L = Letters ÷ Words × 100 = 639 ÷ 119 × 100 ≈ 537
S = Sentences ÷ Words × 100 = 5 ÷ 119 × 100 ≈ 4.20
Substituting these values gives:
CLI = 0.0588 × 537 − 0.296 × 4.20 − 15.8 ≈ 14.5
Therefore, the abstract is at a grade level of 14.5, or roughly appropriate for a second-year undergraduate.
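The arithmetic of the worked example can be checked with a short Python sketch. It assumes the commonly published form of the index, CLI = 0.0588 × L − 0.296 × S − 15.8, and takes the abstract's counts (639 letters, 119 words, 5 sentences) as given rather than recounting them; the function name `coleman_liau_index` is chosen for this example.

```python
def coleman_liau_index(letters: int, words: int, sentences: int) -> float:
    """Approximate U.S. grade level from raw counts.

    Assumes the commonly published coefficients 0.0588, 0.296 and 15.8.
    """
    if words <= 0:
        raise ValueError("words must be positive")
    avg_letters = letters / words * 100      # L: letters per 100 words
    avg_sentences = sentences / words * 100  # S: sentences per 100 words
    return 0.0588 * avg_letters - 0.296 * avg_sentences - 15.8


# Counts for the abstract, as stated in the text above.
grade = coleman_liau_index(letters=639, words=119, sentences=5)
print(round(grade, 1))  # 14.5
```

Because the function uses unrounded values of L and S, its result differs slightly from a hand calculation with 537 and 4.20, but both round to 14.5.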