Lexical density

Lexical density is a concept in computational linguistics that measures the structure and complexity of human communication in a language. It estimates the linguistic complexity of a written or spoken composition from its function words and content words. One method of calculating lexical density is to compute the ratio of lexical items to the total number of words. Another method is to compute the ratio of lexical items to the number of higher structural items in a composition, such as the total number of clauses in the sentences.
The lexical density for an individual evolves with age, education, communication style, circumstances, unusual injuries or medical conditions, and creativity. The inherent structure of a human language and one's first language may affect the lexical density of the individual's writing and speaking style. Further, after the early childhood stage, human communication in written form is generally more lexically dense than in spoken form. Lexical density affects the readability of a composition and the ease with which the listener or reader can comprehend a communication. It may also affect the memorability and retention of a sentence and the message.

Discussion

The lexical density is the proportion of content words in a given discourse. It can be measured either as the ratio of lexical items to total number of words, or as the ratio of lexical items to the number of higher structural items in the sentences. A lexical item is typically the real content and it includes nouns, verbs, adjectives and adverbs. A grammatical item typically is the functional glue and thread that weaves the content and includes pronouns, conjunctions, prepositions, determiners, and certain classes of finite verbs and adverbs.
Lexical density is one of the methods used in discourse analysis as a descriptive parameter which varies with register and genre. There are many proposed methods for computing the lexical density of any composition or corpus. Lexical density may be determined as:

Ure Lexical density

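Ure proposed the following formula in 1971 to compute the lexical density of a sentence:
$$L_d = \frac{N_{\mathrm{lex}}}{N} \times 100$$
where $L_d$ is the lexical density of the analysed text, $N_{\mathrm{lex}}$ is the number of lexical items (content words) in it, and $N$ is its total number of words.
Biber refers to this ratio as the "type-token ratio".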
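
As a rough illustration, the ratio of lexical items to the total number of words, scaled to a percentage, can be sketched in Python. The `FUNCTION_WORDS` set and the function name `ure_lexical_density` are assumptions made for this example only: the set is a small, hand-picked list of English grammatical items, not a standard resource, and a real study would more likely classify words with a part-of-speech tagger. As noted elsewhere in this article, the result depends on how a "lexical item" is defined.

```python
import re

# Hypothetical, deliberately small list of English grammatical (function) words.
# Every token NOT in this set is counted as a lexical item in this sketch.
FUNCTION_WORDS = {
    "a", "an", "the", "this", "that", "these", "those",
    "i", "you", "he", "she", "it", "we", "they", "me", "him", "her", "us", "them",
    "and", "or", "but", "because", "if", "while", "although",
    "in", "on", "at", "of", "to", "for", "with", "by", "from",
    "is", "are", "was", "were", "be", "been", "am",
    "has", "have", "had", "do", "does", "did",
    "will", "would", "can", "could", "may", "might", "shall", "should", "must",
    "not",
}


def ure_lexical_density(text, function_words=FUNCTION_WORDS):
    """Return lexical items as a percentage of all word tokens in `text`."""
    # Naive tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    lexical = [t for t in tokens if t not in function_words]
    return 100.0 * len(lexical) / len(tokens)


# "cat", "sat" and "mat" are lexical; "the", "on", "the" are grammatical.
print(ure_lexical_density("The cat sat on the mat."))  # 50.0
```

A stop-list approach like this is easy to inspect but crude; changing which words sit in the list changes the measured density.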

Halliday Lexical density

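In 1985, Halliday revised the denominator of the Ure formula and proposed the following to compute the lexical density of a sentence:
$$L_d = \frac{N_{\mathrm{lex}}}{N_c} \times 100$$
where $L_d$ is the lexical density of the analysed text, $N_{\mathrm{lex}}$ is the number of lexical items in it, and $N_c$ is its number of clauses.
In some formulations, the lexical density proposed by Halliday is computed as a simple ratio, without the "100" multiplier.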
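
The clause-based variant can be sketched in the same way: lexical items are counted across the text and divided by the number of clauses. In this minimal example the clauses are supplied by the caller as a list of strings, because splitting text into clauses reliably requires a syntactic parser. The names `halliday_lexical_density`, `FUNCTION_WORDS` and the `scale` parameter are assumptions introduced for illustration; `scale=1.0` gives the simple ratio without the "100" multiplier.

```python
import re

# Hypothetical, deliberately small list of English grammatical (function) words.
FUNCTION_WORDS = {
    "a", "an", "the", "this", "that",
    "i", "you", "he", "she", "it", "we", "they",
    "and", "or", "but", "because", "if", "while",
    "in", "on", "at", "of", "to", "for", "with", "by", "from",
    "is", "are", "was", "were", "be", "been",
    "has", "have", "had", "do", "does", "did", "not",
}


def halliday_lexical_density(clauses, function_words=FUNCTION_WORDS, scale=100.0):
    """Return lexical items per clause, multiplied by `scale`.

    `clauses` is a list of strings, one per clause, segmented beforehand.
    """
    if not clauses:
        return 0.0
    lexical_count = 0
    for clause in clauses:
        # Naive tokenizer: lowercase runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", clause.lower())
        lexical_count += sum(1 for t in tokens if t not in function_words)
    return scale * lexical_count / len(clauses)


clauses = ["the cat sat on the mat", "because it was tired"]
# Lexical items: cat, sat, mat, tired -> 4 items over 2 clauses.
print(halliday_lexical_density(clauses))             # 200.0
print(halliday_lexical_density(clauses, scale=1.0))  # 2.0
```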

Characteristics

Lexical density measurements may vary for the same composition depending on how a "lexical item" is defined and which items are classified as lexical or as grammatical. Any methodology, when applied consistently across compositions, yields comparable lexical density values for those compositions. Typically, the lexical density of a written composition is higher than that of a spoken composition. According to Ure, written forms of human communication in the English language typically have lexical densities above 40%, while spoken forms tend to have lexical densities below 40%. In a survey of historical texts by Michael Stubbs, the typical lexical density of fictional literature ranged between 40% and 54%, while that of non-fiction ranged between 40% and 65%.
The relationship and intimacy between the participants in a particular communication affect the lexical density, states Ure, as do the circumstances prior to the start of communication for the same speaker or writer. The higher lexical density of written forms of communication, she proposed, arises primarily because written forms involve greater preparation, reflection and revision. Human discussions and conversations involving or anticipating feedback tend to be sparser and have lower lexical density. In contrast, state Stubbs and Biber, instructions, law enforcement orders, news read from screen prompts within the allotted time, and literature that authors expect will be available to the reader for re-reading tend to maximize lexical density. In surveys of the lexical density of spoken and written materials across different European countries and age groups, Johansson and Strömqvist report that the lexical density of population groups was similar and depended on the morphological structure of the native language and, within a country, on the age groups sampled. Lexical density was highest for adults, while variation, estimated as lexical diversity, was according to Johansson higher among teenagers within the same age group.
