Chinese dictionary


Chinese dictionaries date back over two millennia to the Han Dynasty, a significantly longer lexicographical history than that of any other language. There are hundreds of dictionaries for the Chinese language, and this article discusses some of the most important.

Terminology

The general term císhū semantically encompasses "dictionary; lexicon; encyclopedia; glossary". The Chinese language has two words for dictionary: zidian for written forms, that is, Chinese characters, and cidian, for spoken forms.
For character dictionaries, zidian (字典) combines zi (字, "character") and dian (典, "dictionary; canon").
For word dictionaries, cidian is interchangeably written 辭典 or 詞典, using cí (辭) and its graphic variant cí (詞). Zidian is a much older and more common word than cidian, and Yang notes zidian is often "used for both 'character dictionary' and 'word dictionary'."

Traditional Chinese lexicography

The precursors of Chinese dictionaries are primers designed for students of Chinese characters. The earliest of them survive only in fragments or in quotations within Chinese classic texts. For example, the Shizhoupian was compiled by one or more historians in the court of King Xuan of Zhou. The Cangjiepian, named after Cang Jie, the legendary inventor of writing, was edited by Li Si and helped to standardize the small seal script during the Qin Dynasty.
The collation or lexicographical ordering of a dictionary generally depends upon its writing system. For a language written in an alphabet or syllabary, dictionaries are usually ordered alphabetically. Samuel Johnson defined dictionary as "a book containing the words of any language in alphabetical order, with explanations of their meaning" in his dictionary. But Johnson's definition cannot be applied to Chinese dictionaries, as Chinese is written in characters, or logographs, not in an alphabet. Johnson did not count the lack of an alphabet to the credit of the Chinese: in 1778, when James Boswell asked him about Chinese characters, he replied, "Sir, they have not an alphabet. They have not been able to form what all other nations have formed." Nevertheless, the Chinese made their dictionaries, and developed three original systems for lexicographical ordering: semantic categories, graphic components, and pronunciations.

Semantically organized dictionaries

The first system of dictionary organization is by semantic categories. The circa 3rd-century BCE Erya is the oldest extant Chinese dictionary, and scholarship indicates that it is a pre-Qin compilation of glosses to classical texts. It contains lists of synonyms arranged into 19 semantic categories. The Han Dynasty dictionary Xiao Erya reduces these 19 to 13 chapters. The early 3rd century CE Guangya, from the Cao Wei state of the Three Kingdoms period, followed the Erya's original 19 chapters. The circa 1080 CE Piya, from the Song Dynasty, has 8 semantically based chapters of names for plants and animals. For a dictionary user wanting to look up a character, this arbitrary semantic system is inefficient unless one already knows, or can guess, the meaning.
Two other Han Dynasty lexicons are loosely organized by semantics. The 1st century CE Fangyan is the world's oldest known dialectal dictionary. The circa 200 CE Shiming employs paronomastic glosses to define words.

Graphically organized dictionaries

The second system of dictionary organization is by recurring graphic components or radicals. The famous 100–121 CE Shuowen Jiezi arranged characters through a system of 540 bushou radicals. The 543 CE Yupian, from the Liang Dynasty, rearranged them into 542. The 1615 CE Zihui, edited by Mei Yingzuo during the Ming Dynasty, simplified the 540 Shuowen Jiezi radicals to 214. It also originated the "radical-stroke" scheme, which orders the characters under each radical by the number of residual graphic strokes besides the radical. The 1627 Zhengzitong also used 214. The 1716 CE Kangxi Zidian, compiled under the Kangxi Emperor of the Qing Dynasty, became the standard dictionary for Chinese characters, and popularized the system of 214 radicals. As most Chinese characters are semantic-phonetic ones, the radical method is usually effective, and it continues to be widely used in the present day. However, sometimes the radical of a character is not obvious. To compensate for this, a "Chart of Characters that Are Difficult to Look up", arranged by the number of strokes of the characters, is usually provided.
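The radical-stroke scheme can be read as a two-part sort key: first the radical's position in the 214-radical sequence, then the number of residual strokes. The following Python sketch illustrates this under stated assumptions: the small table of characters is a hand-picked sample rather than data from any dictionary, the names Entry, radical_stroke_key, and collate are hypothetical, and ties between characters with the same radical and residual stroke count (which printed dictionaries resolve by further conventions) are simply left in input order.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    char: str
    radical_no: int  # position of the radical in the 214-radical sequence
    residual: int    # strokes in the character besides the radical


# Illustrative sample only (hypothetical table, not taken from a dictionary).
ENTRIES = [
    Entry("河", 85, 5),  # radical 水, plus 可
    Entry("休", 9, 4),   # radical 人, plus 木
    Entry("林", 75, 4),  # radical 木, plus 木
    Entry("仁", 9, 2),   # radical 人, plus 二
    Entry("明", 72, 4),  # radical 日, plus 月
    Entry("木", 75, 0),  # the radical 木 itself
    Entry("他", 9, 3),   # radical 人, plus 也
    Entry("好", 38, 3),  # radical 女, plus 子
]


def radical_stroke_key(entry: Entry) -> tuple[int, int]:
    """Sort key: radical number first, then residual stroke count."""
    return (entry.radical_no, entry.residual)


def collate(entries: list[Entry]) -> list[Entry]:
    """Return entries in radical-stroke order (stable for equal keys)."""
    return sorted(entries, key=radical_stroke_key)


ordered = "".join(e.char for e in collate(ENTRIES))
print(ordered)
```

A reader looking up 休 under this scheme would first locate the radical 人, then scan the group of characters with four residual strokes.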

Phonetically organized dictionaries

The third system of lexicographical ordering is by character pronunciation. This type of dictionary collates its entries by syllable rime and tones, and comprises the so-called "rime dictionary". The first surviving rime dictionary is the 601 CE Qieyun from the Sui Dynasty; it became the standard of pronunciation for Middle Chinese. During the Song Dynasty, it was expanded into the 1011 CE Guangyun and the 1037 CE Jiyun.
The clear problem with these old phonetically arranged dictionaries is that the would-be user needs prior knowledge of the rime system. Thus, dictionaries collated this way can only serve the literati.
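Collation by tone and rime can likewise be sketched as a sort on two keys. The Python example below is only an illustration under assumptions: it uses the four traditional tone categories and the first few rime headings of each tone, the character sample is hand-picked, and the names TONE_ORDER, RIME_ORDER, rime_key, and collate_by_rime are hypothetical. It also shows the difficulty noted above, since the tone and rime of a character must already be known before the character can be found.

```python
# Traditional tone categories in their conventional order.
TONE_ORDER = ["平", "上", "去", "入"]

# First few rime headings under each tone (a partial, illustrative list).
RIME_ORDER = {
    "平": ["東", "冬", "鍾", "江"],
    "上": ["董", "腫", "講"],
    "去": ["送", "宋", "用", "絳"],
    "入": ["屋", "沃", "燭", "覺"],
}

# (character, tone category, rime heading); hand-picked sample entries.
ENTRIES = [
    ("屋", "入", "屋"),
    ("江", "平", "江"),
    ("東", "平", "東"),
    ("董", "上", "董"),
    ("冬", "平", "冬"),
    ("送", "去", "送"),
    ("同", "平", "東"),
]


def rime_key(entry: tuple[str, str, str]) -> tuple[int, int]:
    """Sort key: tone category first, then rime position within that tone."""
    _char, tone, rime = entry
    return (TONE_ORDER.index(tone), RIME_ORDER[tone].index(rime))


def collate_by_rime(entries):
    """Return entries in tone-then-rime order (stable within a rime)."""
    return sorted(entries, key=rime_key)


ordered = "".join(char for char, _tone, _rime in collate_by_rime(ENTRIES))
print(ordered)
```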
A great number of modern dictionaries published today arrange their entries by pinyin or other methods of romanisation, together with a radicals index. Some of these pinyin dictionaries also contain indices of the characters arranged by number and order of strokes, by the four corner encoding or by the cangjie encoding.
Some dictionaries employ more than one of these three methods of collation. For example, the Longkan Shoujian of the Liao Dynasty uses radicals, which are grouped by tone. The characters under each radical are also grouped by tone.

Functional classifications

Ancient Chinese dictionaries can be categorized not only by their methods of collation but also by their functions. In the traditional bibliographic divisions of the imperial collection Siku Quanshu, dictionaries were classified as belonging to xiǎoxué, which was contrasted with dàxué. Xiaoxue was divided into texts dealing with xùngǔ, wénzì, and yīnyùn.
The Xungu type, sometimes called yǎshū, comprises Erya and its descendants. These exegetical dictionaries focus on explaining meanings of words as found in the Chinese classics.
The Wenzi dictionaries, called zìshū, comprise Shuowen Jiezi, Yupian, Zihui, Zhengzitong, and Kangxi Zidian. This type of dictionary, which focuses on the shape and structure of the characters, subsumes both "orthography dictionaries", such as the Ganlu Zishu of the Tang Dynasty, and "script dictionaries", such as the Liyun of the Song Dynasty. Although these dictionaries center upon the graphic properties of Chinese characters, they do not necessarily collate characters by radical. For instance, Liyun is a clerical script dictionary collated by tone and rime.
The Yinyun type, called yùnshū, focuses on the pronunciations of characters. These dictionaries are always collated by rimes.
While the above traditional pre-20th-century Chinese dictionaries focused upon the meanings and pronunciations of words in classical texts, they practically ignored the spoken language and vernacular literature.

Modern Chinese lexicography

The Kangxi Zidian served as the standard Chinese dictionary for generations, is still published and is now online. Contemporary lexicography is divisible between bilingual and monolingual Chinese dictionaries.

Chinese–English dictionaries

The foreigners who entered China in late Ming and Qing Dynasties needed dictionaries for different purposes than native speakers. Wanting to learn Chinese, they compiled the first grammar books and bilingual dictionaries. Westerners adapted the Latin alphabet to represent Chinese pronunciation, and arranged their dictionaries accordingly.
Two Bible translators edited early Chinese dictionaries. The Scottish missionary Robert Morrison wrote A Dictionary of the Chinese Language. The British missionary Walter Henry Medhurst wrote a Hokkien dialect dictionary and the Chinese and English Dictionary. Both were flawed in their representation of pronunciations, such as aspirated stops. The American philologist and diplomat Samuel Wells Williams applied the method of dialect comparison in his dictionary, A Syllabic Dictionary of the Chinese Language, which refined distinctions in articulation and gave variant regional pronunciations in addition to standard Peking pronunciation.
The British consular officer and linguist Herbert Giles criticized Williams as "the lexicographer not for the future but of the past", and took nearly twenty years to compile his A Chinese-English Dictionary, one that Norman calls "the first truly adequate Chinese–English dictionary". It contained 13,848 characters and numerous compound expressions, with pronunciation based upon Beijing Mandarin, which it compared with nine southern dialects such as Cantonese, Hakka, and Fuzhou dialect. It has been called "still interesting as a repository of late Qing documentary Chinese, although there is little or no indication of the citations, mainly from the Kangxi zidian." Giles modified the Chinese romanization system of Thomas Francis Wade to create the Wade-Giles system, which was standard in English-speaking countries until 1979, when pinyin was adopted. The Giles dictionary was replaced by the 1931 dictionary of the Australian missionary Robert Henry Mathews. Mathews' Chinese-English Dictionary, which was popular for decades, was based on Giles; it was partially updated by Y.R. Chao in 1943 and reprinted in 1960.
Trained in American structural linguistics, Yuen Ren Chao and Lien-sheng Yang wrote the Concise Dictionary of Spoken Chinese, which emphasized the spoken rather than the written language. Main entries were listed in Gwoyeu Romatzyh, and they distinguished free morphemes from bound morphemes. A hint of non-standard pronunciation was also given, by marking final stops and initial voicing and non-palatalization in non-Mandarin dialects.
The Swedish sinologist Bernhard Karlgren wrote the seminal Grammata Serica Recensa with his reconstructed pronunciations for Middle Chinese and Old Chinese.
Chinese lexicography advanced during the 1970s. The translator Lin Yutang wrote the semantically sophisticated Lin Yutang's Chinese-English Dictionary of Modern Usage that is now available online. The author Liang Shih-Chiu edited two full-scale dictionaries: Chinese-English with over 8,000 characters and 100,000 entries, and English-Chinese with over 160,000 entries.
The linguist and professor of Chinese John DeFrancis edited a groundbreaking Chinese–English dictionary giving more than 196,000 words or terms alphabetically arranged in a single-tier pinyin order. The user can therefore look up a term directly by its known pronunciation, rather than first finding its head character by radical or character structure and then searching among that character's compounds, which is a two-tiered approach. This project had long been advocated by another pinyin proponent, Victor H. Mair.
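The difference between single-tier and two-tier ordering can be made concrete with a short Python sketch. It is an illustration under assumptions, not a description of any published dictionary's exact rules: syllables are written with tone numbers (5 standing for the neutral tone), ties are broken by tone digits, and the names WORDS, single_tier_key, and two_tier_key are hypothetical.

```python
import re

# (word, pinyin syllables with tone numbers); a small hand-picked sample.
WORDS = [
    ("点心", ["dian3", "xin5"]),
    ("电话", ["dian4", "hua4"]),
    ("电影", ["dian4", "ying3"]),
    ("点火", ["dian3", "huo3"]),
]


def letters(syllables: list[str]) -> str:
    """Concatenate the syllables with tone digits removed."""
    return "".join(re.sub(r"\d", "", s) for s in syllables)


def tones(syllables: list[str]) -> str:
    """Concatenate only the tone digits of the syllables."""
    return "".join(re.sub(r"\D", "", s) for s in syllables)


def single_tier_key(entry):
    """Order by the spelling of the whole word; tones only break ties."""
    _word, syl = entry
    return (letters(syl), tones(syl))


def two_tier_key(entry):
    """Order by head character first, then by the rest of the word."""
    word, syl = entry
    head = (letters(syl[:1]), tones(syl[:1]), word[0])
    return (head, letters(syl[1:]), tones(syl[1:]))


single_tier = [w for w, _ in sorted(WORDS, key=single_tier_key)]
two_tier = [w for w, _ in sorted(WORDS, key=two_tier_key)]
print(single_tier)
print(two_tier)
```

In the single-tier result, words beginning with 点 and 电 interleave because only the spelling of the whole word matters; in the two-tier result, all compounds of 点 are listed before any compound of 电.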

Chinese–Chinese dictionaries

When the Republic of China began in 1912, educators and scholars recognized the need to update the 1716 Kangxi Zidian. It was thoroughly revised in the Zhonghua Da Zidian, which corrected over 4,000 Kangxi Zidian mistakes and added more than 1,000 new characters. Lu Erkui's Ciyuan was a groundbreaking effort in Chinese lexicography and can be considered the first cidian "word dictionary".
Shu Xincheng's Cihai was a comprehensive dictionary of characters and expressions, and provided near-encyclopedic coverage in fields like science, philosophy, and history. The Cihai remains a popular dictionary and has been frequently revised.
The Guoyu cidian was a four-volume dictionary of words, designed to standardize modern pronunciation. The main entries were characters listed phonologically by Zhuyin Fuhao and Gwoyeu Romatzyh. For example, the title in these systems is ㄍㄨㄛㄩ ㄘㄉ一ㄢ and Gwoyeu tsyrdean.
Wei Jiangong's Xinhua Zidian is a pocket-sized reference, alphabetically arranged by pinyin. It is the world's most popular reference work. The 11th edition was published in 2011.
Lu Shuxiang's Xiandai Hanyu Cidian is a middle-sized dictionary of words. It is arranged by characters, alphabetized by pinyin, under which compounds and phrases are listed, with a total of 56,000 entries. Both the Xinhua zidian and the Xiandai Hanyu cidian followed a simplified scheme of 189 radicals.
Two outstanding achievements in contemporary Chinese lexicography are the Hanyu Da Cidian with over 370,000 word and phrase entries listed under 23,000 different characters; and the Hanyu Da Zidian with 54,678 head entries for characters. They both use a system of 200 radicals.
In recent years, the computerization of Chinese has allowed lexicographers to create dianzi cidian ("electronic dictionaries") usable on computers, PDAs, etc. There are proprietary systems, such as Wenlin Software for learning Chinese, and there are also free dictionaries available online. Since Paul Denisowski started the volunteer CEDICT project in 1997, it has grown into a standard reference database. CEDICT is the basis for many Internet dictionaries of Chinese, and is included in the Unihan Database.

Specialized dictionaries

Chinese publishing houses print diverse types of zhuanke cidian ("specialized dictionaries"). One Chinese dictionary bibliography lists over 130 subject categories, from "Abbreviations, Accounting" to "Veterinary, Zoology." The following examples are limited to specialized dictionaries from a few representative fields.

Ancient Chinese

Dictionaries of Ancient Chinese give definitions, in Modern Chinese, of characters and words found in the pre-Modern Chinese literature. They are typically organized by pinyin or by Zihui radicals, and give definitions in order of antiquity when several definitions exist. Quotes from the literature exemplifying each listed meaning are given. Quotes are usually chosen from the pre-Han Classical literature when possible, unless the definition emerged during the post-Classical period. Dictionaries intended for historians, linguists, and other classical scholars will sometimes also provide Middle Chinese fanqie readings and/or Old Chinese rime groups, as well as bronze script or oracle bone script forms.
While dictionaries published in Mainland China intended for study or reference by high school/college students are generally printed in Simplified Chinese, dictionaries intended for scholarly research are set in Traditional Chinese.
Twenty centuries ago, the Fangyan was the first Chinese specialized dictionary. The usual English translation for fangyan is "dialect", but the language situation in China is said to be uniquely complex. In the sense of English dialects, Chinese has Mandarin dialects, yet fangyan also means "non-Mandarin languages, mutually unintelligible regional varieties of Chinese", such as Cantonese and Hakka. Some linguists, such as John DeFrancis, prefer the translation "topolect" for these varieties, which are very similar to independent languages. The Dictionary of Frequently-Used Taiwan Minnan is an online dictionary of Taiwanese Hokkien.
Chinese has five words translatable as "idiom": chengyu, yanyu, xiehouyu, xiyu, and guanyongyu.
The Chinese language adopted a few foreign wailaici ("loanwords") during the Han Dynasty, especially after Zhang Qian's exploration of the Western Regions. The lexicon absorbed many Buddhist terms and concepts when Chinese Buddhism began to flourish in the Southern and Northern Dynasties. During the late 19th century, when Western powers forced open China's doors, numerous loanwords entered Chinese, many through the Japanese language. While some foreign borrowings became obsolete, others became indispensable terms in modern vocabulary.
The 20th century saw rapid progress in the study of the lexicon of Chinese vernacular literature, which includes novels, dramas, and poetry.
Employing corpus linguistics and lists of Chinese characters arranged by frequency of usage, lexicographers have compiled dictionaries for learners of Chinese as a foreign language. These specialized Chinese dictionaries are available either as add-ons to existing publications, such as Yuan and Wenlin, or as dedicated publications.
Victor H. Mair lists eight adverse features of traditional Chinese lexicography, some of which have continued up to the present day: persistent confusion of spoken word with written graph; lack of etymological science as opposed to the analysis of script; absence of the concept of word; ignoring the script's historical developments in the oracle bones and bronze inscriptions; no precise, unambiguous, and convenient means for specifying pronunciations; no standardized, user-friendly means for looking up words and graphs; failure to distinguish linguistically between vernacular and literary registers, or between usages peculiar to different regions and times; and open-endedness of the writing system, with current unabridged character dictionaries containing 60,000 to 85,000 graphs.
