Compound (linguistics)
In linguistics, a compound is a lexeme that consists of more than one stem. Compounding, composition or nominal composition is the process of word formation that creates compound lexemes. That is, in familiar terms, compounding occurs when two or more words or signs are joined to make one longer word or sign. The meaning of the compound may be similar to or different from the meaning of its components in isolation. The component stems of a compound may be of the same part of speech—as in the case of the English word footpath, composed of the two nouns foot and path—or they may belong to different parts of speech, as in the case of the English word blackbird, composed of the adjective black and the noun bird. With very few exceptions, English compound words are stressed on their first component stem.
The process occurs readily in other Germanic languages for different reasons. Words can be concatenated either so that the compound means the same as the sum of the two words, or so that an adjective and a noun are compounded.
The addition of affix morphemes to words should not be confused with nominal composition, as this is actually morphological derivation.
Some languages easily form compounds from what in other languages would be a multi-word expression. This can result in unusually long words, a phenomenon known in German as Bandwurmwörter or tapeworm words.
Sign languages also have compounds. They are created by combining two or more sign stems.
Formation of compounds
Compound formation rules vary widely across language types. In a synthetic language, the relationship between the elements of a compound may be marked with a case or other morpheme. For example, the German compound Kapitänspatent consists of the lexemes Kapitän and Patent joined by an -s-; and similarly, the Latin lexeme paterfamilias contains the archaic genitive form familias of the lexeme familia. Conversely, in the Hebrew compound בֵּית סֵפֶר bet sefer 'school', it is the head that is modified: the compound literally means "house-of book", with בַּיִת bayit having entered the construct state to become בֵּית bet. This latter pattern is common throughout the Semitic languages, though in some it is combined with an explicit genitive case, so that both parts of the compound are marked.
Agglutinative languages tend to create very long words with derivational morphemes. Compounds may or may not require the use of derivational morphemes also.
The longest compounds in the world may be found in the Finnic and Germanic languages. In German, extremely extendable compound words can be found in the language of chemical compounds, where, in the cases of biochemistry and polymers, they can be practically unlimited in length, mostly because the German rule suggests combining all noun adjuncts with the noun as the very last stem. German examples include Farbfernsehgerät, Funkfernbedienung, and the often quoted jocular word Donaudampfschifffahrtsgesellschaftskapitänsmütze, which can of course be made even longer and even more absurd, e.g. Donaudampfschifffahrtsgesellschaftskapitänsmützenreinigungsausschreibungsverordnungsdiskussionsanfang etc. According to several editions of the Guinness Book of World Records, the longest published German word has 79 letters and is Donaudampfschiffahrtselektrizitätenhauptbetriebswerkbauunterbeamtengesellschaft, but there is no evidence that this association ever actually existed.
In Finnish, although there is theoretically no limit to the length of compound words, words consisting of more than three components are rare. Even those with fewer than three components can look mysterious to non-Finnish speakers, such as hätäuloskäynti. Internet folklore sometimes suggests that lentokonesuihkuturbiinimoottoriapumekaanikkoaliupseerioppilas is the longest word in Finnish, but evidence of it actually being used is scant and anecdotal at best.
Compounds can be rather long when translating technical documents from English to some other language, since the lengths of the words are theoretically unlimited, especially in chemical terminology. For example, when translating an English technical document to Swedish, the term "Motion estimation search range settings" can be directly translated to rörelseuppskattningssökintervallsinställningar, though in reality, the word would most likely be divided in two: sökintervallsinställningar för rörelseuppskattning – "search range settings for motion estimation".
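The reverse operation, breaking a long compound back into its stems, can be sketched in a few lines. The following is only an illustration under stated assumptions: the lexicon is a hypothetical toy set containing just the stems of the Swedish example, the only linking element considered is -s-, and the function name split_compound is invented for this sketch. Real compound splitters need a full lexicon and a way to rank competing segmentations.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical toy lexicon: only the stems of the Swedish example above.
LEXICON: Set[str] = {"rörelse", "uppskattning", "sök", "intervall", "inställningar"}


def split_compound(
    word: str,
    lexicon: Set[str],
    linkers: Tuple[str, ...] = ("", "s"),
) -> Optional[List[str]]:
    """Return one segmentation of `word` into lexicon stems, or None.

    Stems are tried longest-first. Between two stems an optional linking
    element from `linkers` may appear; it is dropped from the result.
    """
    if word == "":
        return []
    for end in range(len(word), 0, -1):  # longest candidate stem first
        stem = word[:end]
        if stem not in lexicon:
            continue
        rest = word[end:]
        if rest == "":
            return [stem]
        for linker in linkers:
            if rest.startswith(linker):
                tail = split_compound(rest[len(linker):], lexicon, linkers)
                if tail:  # a linker must be followed by at least one stem
                    return [stem] + tail
    return None


parts = split_compound("rörelseuppskattningssökintervallsinställningar", LEXICON)
print(parts)
# ['rörelse', 'uppskattning', 'sök', 'intervall', 'inställningar']
```

The same function handles a Dutch-style compound such as verjaardagskalender if the lexicon contains verjaardag and kalender, since the linking -s- is treated the same way.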
Subclasses
Semantic classification
A common semantic classification of compounds yields four types:
- endocentric
- exocentric
- copulative
- appositional
An exocentric compound is a hyponym of some unexpressed semantic category: none of its components can be perceived as a formal head, and its meaning often cannot be transparently guessed from its constituent parts. For example, the English compound white-collar is neither a kind of collar nor a white thing. In an exocentric compound, the word class is determined lexically, disregarding the class of the constituents. For example, a must-have is not a verb but a noun. The meaning of this type of compound can be glossed as "(one) whose B is A", where B is the second element of the compound and A the first. A bahuvrihi compound, as this type is called in the Sanskrit tradition, is one whose nature is expressed by neither of the words: thus a white-collar person is neither white nor a collar. Other English examples include barefoot.
Copulative compounds are compounds with two semantic heads.
Appositional compounds are lexemes that have two attributes that classify the compound.
Type | Description | Examples |
endocentric | A+B denotes a special kind of B | darkroom, smalltalk |
exocentric | A+B denotes a special kind of an unexpressed semantic head | skinhead, paleface |
copulative | A+B denotes 'the sum' of what A and B denote | bittersweet, sleepwalk |
appositional | A and B provide different descriptions for the same referent | actor-director, maidservant |
Syntactic classification
Noun–noun compounds
All natural languages have compound nouns. The positioning of the words varies according to the language. While Germanic languages, for example, are left-branching when it comes to noun phrases, the Romance languages are usually right-branching. In French, compound nouns are often formed by left-hand heads with prepositional components inserted before the modifier, as in chemin-de-fer 'railway', lit. 'road of iron', and moulin à vent 'windmill', lit. 'mill by-means-of wind'.
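In Turkish, one way of forming compound nouns is to mark the head with a possessive suffix: yeldeğirmeni 'windmill' (yel 'wind' + değirmen-i 'mill-POSSESSIVE'); demiryolu 'railway' (demir 'iron' + yol-u 'road-POSSESSIVE').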
Verb–noun compounds
A type of compound that is fairly common in the Indo-European languages is formed of a verb and its object, and in effect transforms a simple verbal clause into a noun. In Spanish, for example, such compounds consist of a verb conjugated for the second person singular imperative followed by a noun: e.g., rascacielos 'skyscraper', sacacorchos 'corkscrew', guardarropa 'wardrobe'. These compounds are formally invariable in the plural. French and Italian have these same compounds with the noun in the singular form: Italian grattacielo 'skyscraper', French grille-pain 'toaster'.
This construction exists in English, generally with the verb and noun both in uninflected form: examples are spoilsport, killjoy, breakfast, cutthroat, pickpocket, dreadnought, and know-nothing.
Also common in English is another type of verb–noun compound, in which an argument of the verb is incorporated into the verb, which is then usually turned into a gerund, such as breastfeeding, finger-pointing, etc. The noun is often an instrumental complement. From these gerunds new verbs can be made, such as breastfeeds, and from them new compounds, such as mother-child breastfeeding.
In the Australian Aboriginal language Jingulu, a Mirndi language, it is claimed that all verbs are V+N compounds, such as "do a sleep", or "run a dive", and the language has only three basic verbs: do, make, and run.
A special kind of compounding is incorporation, of which noun incorporation into a verbal root is most prevalent.
Verb–verb compounds
Verb–verb compounds are sequences of more than one verb acting together to determine clause structure. They have two types:
- In a serial verb, two actions, often sequential, are expressed in a single clause. For example, Ewe trɔ dzo, lit. "turn leave", means "turn and leave", and Hindi जाकर देखो jā-kar dekh-o, lit. "go-CONJUNCTIVE PARTICIPLE see-IMPERATIVE", means "go and see". An example from Tamil, a Dravidian language, is van̪t̪u paːr, lit. "come see". In each case, the two verbs together determine the semantics and argument structure.
- In a compound verb, one of the verbs is the primary, and determines the primary semantics and also the argument structure. The secondary verb, often called a vector verb or explicator, provides fine distinctions, usually in temporality or aspect, and also carries the inflection. The main verb usually appears in conjunctive participial form. For example, Hindi निकल गया nikal gayā, lit. "exit went", means 'went out', while निकल पड़ा nikal paRā, lit. "exit fell", means 'departed' or 'was blurted out'. In these examples निकल nikal is the primary verb, and गया gayā and पड़ा paRā are the vector verbs. Similarly, in both English start reading and Japanese 読み始める yomihajimeru, lit. "read-CONJUNCTIVE-start", 'start reading', the vector verbs start and 始める hajimeru 'start' change according to tense, negation, and the like, while the main verbs reading and 読み yomi 'reading' usually remain the same. An exception to this is the passive voice, in which both English and Japanese modify the main verb, i.e. start to be read and 読まれ始める yomarehajimeru, lit. "read-PASSIVE-CONJUNCTIVE-start", 'start to be read'. With a few exceptions all compound verbs alternate with their simple counterparts. That is, removing the vector affects neither grammaticality nor, to any great extent, the meaning: निकला nikalā 'went out'. In a few languages both components of the compound verb can be finite forms: Kurukh kecc-ar ker-ar, lit. "died-3pl went-3pl", 'died'.
- Compound verbs are very common in some languages, such as the northern Indo-Aryan languages Hindustani and Punjabi, and Dravidian languages like Tamil, where as many as 20% of verb forms in running text are compound. They exist but are less common in other Indo-Aryan languages like Marathi and Nepali, in Tibeto-Burman languages like Limbu and Newari, in Turkic languages like Turkish and Kyrgyz, in Korean and Japanese, and in northeast Caucasian languages like Tsez and Avar.
- Under the influence of a Quichua substrate, speakers living in the Ecuadorian altiplano have innovated compound verbs in Spanish.
- Compound verb equivalents in English:
- Caution: In descriptions of Persian and other Iranian languages the term 'compound verb' refers to noun-plus-verb compounds, not to the verb–verb compounds discussed here.
Parasynthetic compounds
Compound adpositions
Compound prepositions formed by prepositions and nouns are common in English and the Romance languages. Hindi has a small number of simple postpositions and a large number of compound postpositions, mostly consisting of simple postposition ke followed by a specific postposition.
Examples from different languages
Chinese:
- 學生/学生 'student': 學 xué/hok6 learn + 生 shēng/sang1 living being
- 太空/太空 'space': 太 tài/taai3 great + 空 kōng/hung1 emptiness
- 摩天樓/摩天楼 'skyscraper': 摩 mó/mo1 touch + 天 tiān/tin1 sky + 樓 lóu/lau2 building
- 打印機/打印机 'printer': 打 dǎ/daa2 strike + 印 yìn/yan3 stamp/print + 機 jī/gei1 machine
- 百科全書/百科全书 'encyclopaedia': 百 bǎi/baak3 hundred + 科 kē/fo1 study + 全 quán/cyun4 entire/complete + 書 shū/syu1 book
- 謝謝/谢谢 'thanks': reduplication of 謝 xiè thank
Dutch:
- arbeidsongeschiktheidsverzekering 'disability insurance': arbeid 'labour' + ongeschiktheid 'inaptitude' + verzekering 'insurance'.
- rioolwaterzuiveringsinstallatie 'sewage treatment plant': riool 'sewer' + water 'water' + zuivering 'cleaning' + installatie 'installation'.
- verjaardagskalender 'birthday calendar': verjaardag 'birthday' + kalender 'calendar'.
- klantenservicemedewerker 'customer service representative': klanten 'customers' + service 'service' + medewerker 'worker'.
- universiteitsbibliotheek 'university library': universiteit 'university' + bibliotheek 'library'.
- doorgroeimogelijkheden 'possibilities for advancement': door 'through' + groei 'grow' + mogelijkheden 'possibilities'.
Finnish:
- sanakirja 'dictionary': sana 'word' + kirja 'book'
- tietokone 'computer': tieto 'knowledge, data' + kone 'machine'
- keskiviikko 'Wednesday': keski 'middle' + viikko 'week'
- maailma 'world': maa 'land' + ilma 'air'
- rautatieasema 'railway station': rauta 'iron' + tie 'road' + asema 'station'
- kolmivaihekilowattituntimittari 'electricity meter': 'three-phase kilowatt hour meter'
German:
- Wolkenkratzer 'skyscraper': Wolken 'clouds' + Kratzer 'scraper'
- Eisenbahn 'railway': Eisen 'iron' + Bahn 'track'
- Kraftfahrzeug 'automobile': Kraft 'power' + fahren/fahr 'drive' + Zeug 'machinery'
- Stacheldraht 'barbed wire': Stachel 'barb/barbed' + Draht 'wire'
Japanese:
- お好み焼き okonomiyaki: お好み okonomi 'preference' + 焼き yaki 'cooking'
- 日帰り higaeri 'day trip': 日 hi 'day' + 帰り kaeri 'returning'
- 国会議事堂 kokkaigijidō 'national diet building': 国会 kokkai 'national diet' + 議事 giji 'proceedings' + 堂 dō 'hall'
Korean:
- 안팎 anpak 'inside and outside': 안 an 'inside' + 밖 bak 'outside'
Ojibwe/Anishinaabemowin:
- mashkikiwaaboo 'tonic': mashkiki 'medicine' + waaboo 'liquid'
- miskomin 'raspberry': misko 'red' + miin 'berry'
- dibik-giizis 'moon': dibik 'night' + giizis 'sun'
- gichi-mookomaan 'white person/American': gichi 'big' + mookomaan 'knife'
Spanish:
- ciencia-ficción 'science fiction': ciencia 'science' + ficción 'fiction'
- ciempiés 'centipede': cien 'hundred' + pies 'feet'
- ferrocarril 'railway': ferro 'iron' + carril 'lane'
- paraguas 'umbrella': para 'stops' + aguas 'water'
- cabizbajo 'keeping the head low in a bad mood': cabeza 'head' + bajo 'down'
- subibaja 'seesaw'
- limpiaparabrisas 'windshield wiper' is a nested compound: limpia 'clean' + parabrisas 'windshield', which is itself a compound of para 'stop' + brisas 'breezes'.
Tamil:
In Cemmozhi, rules for compounding are laid down in grammars such as Tolkappiyam and Nannūl, in various forms, under the name punarcci. Examples of compounds include kopuram from 'kō' + 'puram'. Sometimes phonemes may be inserted during the blending process, as in kovil from 'kō' + 'il'. Other compounds alter one of their parts, as in vennai from 'veḷḷai' + 'nei'; note how 'veḷḷai' becomes 'ven'.
In koṭuntamizh, parts of words from other languages may be morphed into Tamil. Common examples include 'ratta-azhuttam' from the Sanskrit rakta and Cemmozhi 'azhuttam'; note how rakta becomes ratta in Tamil in order to remove the consonant cluster. This also happens with English: for example, kāpi-kaṭai is from English coffee, which becomes kāpi in Tamil, and the Tamil kaṭai meaning shop.
Tłįchǫ Yatiì/Dogrib:
Germanic languages
Because a compound is understood as a word in its own right, it may in turn be used in new compounds, so forming an arbitrarily long word is trivial. This contrasts with the Romance languages, where prepositions are more often used to specify word relationships instead of concatenating the words. As a member of the Germanic family of languages, English is unusual in that compounds are normally written in separate parts. This would be an error in other Germanic languages such as Norwegian, Swedish, Danish, German and Dutch. However, this is merely an orthographic convention: as in other Germanic languages, arbitrary noun phrases, for example "girl scout troop", "city council member", and "cellar door", can be made up on the spot and used as compound nouns in spoken English.
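The recursive step described above, in which a finished compound serves as the modifier of the next one, can be illustrated with a short sketch. The helper name compound and the particular bracketing chosen for the jocular German word are assumptions made for illustration only. The sketch follows German orthography, where the whole compound is written as one word and only the initial letter keeps its capital.

```python
def compound(modifier: str, head: str, linker: str = "") -> str:
    """Join modifier and head into one written word (head-final).

    `linker` is an optional linking element such as German -s-.
    The head is lowercased because only the first letter of the
    resulting noun is capitalized.
    """
    return modifier + linker + head.lower()


# Each result is a word in its own right and is reused as input.
word = compound("Dampf", "Schiff")        # Dampfschiff
word = compound(word, "Fahrt")            # Dampfschifffahrt
word = compound("Donau", word)            # Donaudampfschifffahrt
word = compound(word, "Gesellschaft", "s")
word = compound(word, "Kapitän", "s")
word = compound(word, "Mütze", "s")
print(word)  # Donaudampfschifffahrtsgesellschaftskapitänsmütze
```

Because the function accepts its own output, nothing in the mechanism limits the length of the result; only usability does.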
Russian language
In the Russian language compounding is a common type of word formation, and several types of compounds exist, both in terms of the parts of speech compounded and of the way the compound is formed. Compound nouns may be agglutinative compounds, hyphenated compounds, or abbreviated compounds. Some compounds look like acronyms, while in fact they are agglutinations of the type stem + word: Академгородок 'Akademgorodok'. In agglutinative compound nouns, an agglutinating infix is typically used: пароход 'steamship': пар + о + ход. Compound nouns may be created as noun + noun, adjective + noun, noun + adjective, noun + verb.
Compound adjectives may be formed either per se or as a result of compounding during the derivation of an adjective from a multi-word term: Каменноостровский проспект 'Stone Island Avenue', a street in St. Petersburg.
Reduplication in Russian is also a source of compounds.
Quite a few Russian words are borrowed from other languages in an already-compounded form, including numerous "classical compounds" or internationalisms: автомобиль 'automobile'.
Sanskrit language
Sanskrit is very rich in compound formation with seven major compound types and as many as 55 sub-types. The compound formation process is productive, so it is not possible to list all Sanskrit compounds in a dictionary. Compounds of two or three words are more frequent, but longer compounds with some running through pages are not rare in Sanskrit literature. Some examples are below.
- हिमालय : Name of the Himalaya mountain range. Literally the abode of snow. A compound of two words and four syllables.
- प्रवर-मुकुट-मणि-मरीचि-मञ्जरी-चय-चर्चित-चरण-युगल : Literally, O the one whose dual feet are covered by the cluster of brilliant rays from the gems of the best crowns, from the Sanskrit work Panchatantra. A compound of nine words and 25 syllables.
- कमला-कुच-कुङ्कुम-पिञ्जरीकृत-वक्षः-स्थल-विराजित-महा-कौस्तुभ-मणि-मरीचि-माला-निराकृत-त्रि-भुवन-तिमिर : Literally O the one who dispels the darkness of three worlds by the shine of Kaustubha jewel hanging on the chest, which has been made reddish-yellow by the saffron from the bosom of Kamalā, an adjective of Rama in the Kakabhushundi Rāmāyaṇa. A compound of 16 words and 44 syllables.
- साङ्ख्य-योग-न्याय-वैशेषिक-पूर्व-मीमांसा-वेदान्त-नारद-शाण्डिल्य-भक्ति-सूत्र-गीता-वाल्मीकीय-रामायण-भागवतादि-सिद्धान्त-बोध-पुरः-सर-समधिकृताशेष-तुलसी-दास-साहित्य-सौहित्य-स्वाध्याय-प्रवचन-व्याख्यान-परम-प्रवीणाः : Literally the acclaimed forerunner in understanding of the canons of Sāṅkhya, Yoga, Nyāya, Vaiśeṣika, Pūrva Mīmāṃsā, Vedānta, Nārada Bhakti Sūtra, Śāṇḍilya Bhakti Sūtra, Bhagavad Gītā, the Ramayana of Vālmīki, Śrīmadbhāgavata; and the most skilled in comprehensive self-study, discoursing and expounding of the complete works of Gosvāmī Tulasīdāsa. An adjective used in a panegyric of Jagadguru Rambhadracharya. The hyphens show only those word boundaries where there is no sandhi. On including word boundaries with sandhi, this is a compound of 35 words and 86 syllables.
Sign languages
Recent trends
Although there is no universally agreed-upon guideline regarding the use of compound words in the English language, in recent decades written English has displayed a noticeable trend towards increased use of compounds. Recently, many words have been made by taking syllables of words and compounding them, such as pixel (picture element) and bit (binary digit). This is called a syllabic abbreviation.
In Dutch and the Scandinavian languages there is an unofficial trend toward splitting compound words, known in Norwegian as særskriving, in Swedish as särskrivning, and in Dutch as Engelse ziekte ('English disease'). Because the Scandinavian languages rely heavily on the distinction between the compound word and the sequence of the separate words it consists of, this has serious implications. For example, the Norwegian adjective røykfritt ('smoke-free'), if separated into its composite parts, would read røyk fritt ('smoke freely'). In Dutch, compounds written with spaces may also be confused, but can also be interpreted as a sequence of a noun and a genitive in formal abbreviated writing. This may lead to, for example, commissie vergadering being read as "commission of the meeting" rather than "meeting of the commission".
The German spelling reform of 1996 introduced the option of hyphenating compound nouns when it enhances comprehensibility and readability. This is done mostly with very long compound words by separating them into two or more smaller compounds, like Eisenbahn-Unterführung or Kraftfahrzeugs-Betriebsanleitung. Such practice is also permitted in other Germanic languages, e.g. Danish and Norwegian, and is even encouraged between parts of the word that have very different pronunciation, such as when one part is a loan word or an acronym.
Compounding by language
- Classical compounds
- English compounds
- German compounds
- Sanskrit compounds