General Chinese


General Chinese is a diaphonemic orthography invented by Yuen Ren Chao to represent the pronunciations of all major varieties of Chinese simultaneously. It is "the most complete genuine Chinese diasystem yet published". It can also be used for the Korean, Japanese, and Vietnamese pronunciations of Chinese characters, and challenges the claim that Chinese characters are required for interdialectal communication in written Chinese.
General Chinese is not specifically a romanization system, but two alternative systems: one uses Chinese characters phonetically, as a syllabary of 2082 glyphs, and the other is an alphabetic romanization system with similar sound values and tone spellings to Gwoyeu Romatzyh.

Character-based General Chinese

The character version of General Chinese uses distinct characters for any traditional characters that are distinguished phonemically in any of the control varieties of Chinese, which consist of several dialects of Mandarin, Wu, Min, Hakka, and Yue. That is, a single syllabic character will correspond to more than one logographic character only when these are homonyms in all control dialects. In effect, General Chinese is a syllabic reconstruction of the pronunciation of Middle Chinese, less distinctions which have been dropped nearly everywhere.
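The merging criterion above can be sketched in a few lines. This is only an illustration under loud assumptions: the `READINGS` table is toy data covering just two varieties (Mandarin in pinyin with tone numbers, Cantonese in Jyutping), not Chao's control set, and `merge_classes` is a hypothetical helper name. Because adding further control varieties can only split groups, a merge shown here says nothing about what General Chinese itself does with these characters.

```python
from collections import defaultdict

# Toy data, NOT Chao's control dialects: (Mandarin, Cantonese) readings only.
READINGS = {
    "三": ("san1", "saam1"),
    "山": ("shan1", "saan1"),
    "事": ("shi4", "si6"),
    "是": ("shi4", "si6"),
}

def merge_classes(readings):
    """Group characters that are homophones in every listed variety."""
    groups = defaultdict(list)
    for char, prons in readings.items():
        # The full tuple of readings is the key, so one differing
        # variety is enough to keep two characters apart.
        groups[prons].append(char)
    return list(groups.values())

classes = merge_classes(READINGS)
```

With this toy table, 三 and 山 stay apart because they differ in both varieties, while 事 and 是 fall into one class because neither variety separates them.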
The result is a syllabary of 2082 syllables, about 80% of which are single morphemes. That is, in 80% of cases there is no difference between GC and standard written Chinese, and in running text that figure rises to 90–95%, as the most common morphemes tend to be uniquely identified. For example, kai can mean only 開 kāi 'open', and sam can mean only 三 sān 'three'. Chao notes, "These syllables then are morphemes, or words with definite meanings, or clusters of meanings related by extensions. About 20 percent of the syllables are homophones under each of which there will be more than one morpheme, usually written with different characters [...] The degree of homophony is so low that it will be possible to write text either in literary or colloquial Chinese with the same character for each syllable as has been tested in texts of various styles." Chao compares General Chinese to how Chinese was written when the writing system was still productive: "This amounts to a 100 percent use of writing Chinese by 'phonetic loan' [...] The situation is that when the ancients wrote a character by sound regardless of meaning, it was a 'loan character', whereas if a modern schoolboy writes one, he is punished for writing the wrong character!"
Taking a telegraphic code-book of about 10,000 characters as a representative list of characters in modern use, Chao notes that General Chinese results in a reduction of 80% in the number of characters needing to be learned.
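The 80% figure can be checked with a line of arithmetic. The sketch below assumes the code-book holds exactly 10,000 characters, although the text says only "about 10,000", so the result is approximate.

```python
SYLLABARY_SIZE = 2082    # glyphs in character-based GC (from the text)
CODEBOOK_SIZE = 10_000   # assumption: "about 10,000" taken as exact

# Fraction of characters that no longer need to be learned.
reduction = 1 - SYLLABARY_SIZE / CODEBOOK_SIZE
print(f"{reduction:.1%}")
```

This gives a reduction of roughly 79%, consistent with the rounded figure of 80%.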
In the 20% of cases where a syllable corresponded to more than one word, Chao generally selected the graphically most basic traditional character for General Chinese, provided it was not unduly rare. However, when that character had strong semantic connotations that would have interfered with a phonetic reading, he selected a more neutral character. This phenomenon is familiar from Chinese transcriptions of foreign names.

Romanized General Chinese

Romanized General Chinese has distinct symbols for the onsets and the rimes distinguished by any of the control dialects. For example, it retains the final consonants p, t, k, and the distinction between final m and n, as these are found in several modern dialects, such as Cantonese. General Chinese also maintains the "round-sharp" distinction, such as sia vs. hia, though those are both xia in Beijing Mandarin. It also indicates the "muddy" stops of Shanghainese. Indeed, Chao characterized GC as having "the initial consonants of the Wu dialects, the vowels of Mandarin, and the endings of Cantonese. It can, however, be pronounced in any dialect, and it is meant to be, by a relatively short list of rules of pronunciation."
Like Chao's other invention, Gwoyeu Romatzyh, romanized General Chinese uses tone spelling. However, the system is somewhat different. The difference between the yin and yang tones is indicated by the voicing of the initial consonant, which is possible because the original voicing distinctions are retained. Given that some tones are indicated by changing rather than adding letters, writing tone requires on average only one additional letter for every three syllables of text.
The digraphs are not reliably featural; for example, the digraphs for the voiced stops do not all follow the same pattern. This is because Chao ran frequency tests, and used single letters for the most common consonants and vowels, while restricting digraphs and trigraphs to the more infrequent ones. Overall, syllables in the texts he transliterated averaged under 3 letters apiece.
Romanized General Chinese can be illustrated with Chao's name:
Traditional characters | 趙 | 元 | 任 | Notes
General Chinese | dhyao | qiuan | remm
Mandarin | jaw | yuan | renn
Mandarin | zhào | yuán | rèn
Yuè | jiuh | yùhn | yahm
Yuè | ziu6 | jyun4 | jam6
Mǐnnán | tiō | gôan | jīm
Wu | zau* | gnioe | gnin
Wu | zau | yoe | zen
Wu | zau | nyoe | nyin
Japanese, go'on reading | deu | gwan | nin | Historical kana orthography
Japanese, go'on reading |  | gan | nin | Modern kana usage
Japanese, kan'on reading | teu | gen | jin | Historical kana orthography
Japanese, kan'on reading | chō | gen | jin | Modern kana usage
Korean | cho | wŏn | im
Korean | jo | won | yim
Vietnamese | triệu | nguyên | nhiệm

All the General Chinese initials here are voiced: The h in dh shows that this is a "muddy" consonant, and the q in qiuan represents an initial ng-. This voicing shows up in the Cantonese yang tones, which are represented by h in Yale romanization. "Heavy" codas, such as remm, indicate the "departing" tone, as in Gwoyeu Romatzyh. Similarly, the spelling ao in dhyao indicates the "rising" tone, but because of the voiced initial, it merges with "departing" in Mandarin and literary Cantonese. The y in dhyao indicates that the initial is a stop in Min, Japanese, and Vietnamese, but otherwise an affricate. Cantonese and Korean retain the final m of remm. These pronunciations are all predictable given the General Chinese transcription, though it was not designed with the Sinospheric languages specifically in mind. Both the pre-war and post-war Japanese orthographies are recoverable.
In every control dialect, some syllables with different spellings will be pronounced the same. Which syllables merge, however, differs from dialect to dialect. There are also some irregular correlations: often a particular variety will have a pronunciation for a syllable that is not what one would expect from other syllables with similar spellings, due to irregular developments in that variety. This is especially true of the voicing of Japanese consonants, which has evolved idiosyncratically in different compound words. However, except for Japanese voicing, the system is phonetic about 90% of the time.

Onsets and rhymes

Character GC has a separate character for each syllable. However, romanized GC has distinct onsets and rhymes. The onsets are as follows:

Onsets

泥 and 娘 are both transcribed, as these are not distinct in modern dialects. 喻, a conflation of two older initials, 云 hy~hw and 以 y~w, is transcribed or ∅ according to modern rather than ancient forms; when palatalization is lost, it is transcribed. The palatal and retroflex fricatives 照穿牀審禪 fell together early on in the rime tables of Classical Chinese, but are still distinguished in some modern dialects, and so are distinguished here. The convention for nasal 疑, which drops in many dialects, is repeated in the finals, where it represents with a departing tone.
Although to some extent systematic—the retroflex series are digraphs ending in, for example—this is overridden in many cases by the principle of using short transcriptions for common sounds. Thus is used for 精 rather than for the less common 邪, where it might also be expected; is used for frequent 微 rather than for 奉; and and, for the high-frequency 見 and 羣, have the additional benefit of being familiar in their palatalized forms from English words like cello and gem.

Dialectal correspondences

The voiced obstruents are only distinct in Wu dialects. In Min, they are collapsed with the consonants of the tenuis column. Elsewhere they are generally collapsed with the aspirated column in the even tone, and with the tenuis column in other tones. An exception is Cantonese, where in the rising tone they are aspirated in colloquial speech, but tenuis in reading pronunciations. The sonorants do not vary much apart from,, which in Wu are nasals colloquially but fricatives when read.
Velars,, are palatalized to affricates before, apart from Min and Yue, where they remain stops before all vowels;, also palatalize, but remain fricatives. For instance in Mandarin, they are g, k, h before non-palatalizing vowels and j, q, x before palatalizing vowels, whereas in Cantonese they remain g, k, h everywhere. The alveolar sibilants,,,, are also generally palatalized before, , collapsing with the palatalized velars,,,, in dialects which have lost the "round-sharp" distinction so important to Peking opera.
The palatal stops,, remain stops only in Min among the Chinese topolects; elsewhere they are conflated with the affricates. The palatal and retroflex sibilants are generally conflated; in Yue and Min, as well as in much of Wu and Mandarin, they are further conflated with the alveolar sibilants. This contrast remains in Beijing, where 'three' is distinct from 'mountain'; both are in Sichuanese and Taiwanese Mandarin.
There are numerous more sporadic correlations. For instance, the alveolar affricates,, become stops in Taishan Yue, whereas the alveolar stops are debuccalized to, as in Hoisaan for Cantonese Toisaan. In Yüchi, Yunnan, it is the velars,, which are debuccalized, to. In the Min dialects,, become or. In Xi'an Mandarin, the fricatives,,, are rounded to before rounded vowels, as in 'water'.

Medials

The categories of the Late Middle Chinese rime tables are reduced to the four medials of modern Chinese, plus an intermediate type:
⟨i⟩ and ⟨iu⟩ are omitted after labiodental initials.

Dialectal correspondences

The medial is used for syllables which have a palatalizing medial in Mandarin, but no medial in Yue. That is, in Mandarin should be read as, with the same effect on consonants as has, whereas in Cantonese it is silent. In Shanghainese both situations occur: is equivalent to in reading pronunciations, as or, but is not found in colloquial speech.
In Cantonese, medial can be ignored after sibilants, as palatalization has been lost in these series. That is, siao, shao are read the same.

Rimes

Chao uses the following rimes. They do not always correspond to the Middle Chinese rimes.
Rimes consist of a nucleus and optionally a coda. They need to be considered as a unit because of a strong historical interaction between vowel and coda in Chinese dialects. The following combinations occur:

Dialectal correspondences

The most salient dialectal difference in rimes is perhaps the lack of the obstruent codas,, in most dialects of Mandarin and independently in the Wencheng dialect of Oujiang, though this has traditionally been seen as a loss of tone. In Wu, Min, New Xiang, Jin, and in the Lower Yangtze and Minjiang dialects of Mandarin, these codas conflate to glottal stop. In others, such as Gan, they are reduced to, while Yue dialects, Hakka, and Old Xiang maintain the original system.
Nasal codas are also reduced in many dialects. Mandarin and Wu do not distinguish between and, with them being reduced to or nasal vowels, or in some cases dropped altogether. In Shanghainese many instances of have conflated as well, or been dropped, but a phonemic distinction is maintained.
In Mandarin, an additional coda is found, -er, from GC.
In Cantonese, the simple vowels i u iu o a e are, apart from and after velars, which open to diphthongs, as in ci and ciu. Diphthongs may vary markedly depending on initial and medial, as in cau, ceau, ciau, though both ceu ~ cieu are, following the general pattern of before a coda. Cantonese does not have medials, apart from gw, kw, though sometimes it is the nuclear vowel which drops: giung, xiong, but giuan.

Combinations of medials and rimes

The following combinations of orthographic medials and rimes occur, taking -iu to be medial i + rime u:
Double cells show discrepancies between analysis and orthography. For instance, Chao analyzes ieng, iueng as part of the aeng series rather than the eng series, and ien as part of the an series. Though not apparent from the chart, eng-ing-ueng-iuing, ung-ong-iung-iong, and en-in-un-iun are similar series. The discrepancies are due to an effort to keep frequent syllables short: en-in-un-iun rather than *en-ien-uen-iuen, for example; as well as a reflection of some of the more widespread phonological changes in the rimes.
The Classical correspondences, with many archaic distinctions lost, are as follows:
These all occur in the velar-initial series, but not all in the others.

Dialectal correspondences

In Cantonese, after coronal stops and sibilants, rounded finals such as -on and -uan produce front rounded vowels, as in don, and after velars, iung and iong lose their.
Min dialects are similar, but in certain tones and become diphthongs rather than their usual. For example, in Fuzhou, even-tone 星 sieng is but departing-tone 性 sieq is.
In Yunnan Mandarin, is pronounced as, so that the name of the province, yunnom, is rather than as in Beijing.
In Nanking, metathesizes to after alveolars, as in 天 for Beijing tian.

Tones

The basic spelling is used for the even 平 tone. For the rising 上 tone, the nucleus is doubled, or the coda is changed to a 'lighter' letter. For the departing 去 tone, the coda is made 'heavier'; if there is no coda, h is added. For the entering 入 tone, a stop coda is used.
'Lighter' means that a vowel coda is made more open (-i to -e, -u to -o); 'heavier' means that a vowel coda is made more close (-i to -y, -u to -w) and a nasal coda is doubled. The nasal -ng is 'lightened' to -g and made heavier as -q:
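coda | even 平 | rising 上 | departing 去 | entering 入
(none) | ba | baa | bah |
(none) | ciu | ciuu | ciuh |
-i | fui | fue | fuy |
-u | cau | cao | caw |
-m | lam | laam | lamm | lap
-n | ren | reen | renn | ret
-ng | jang | jag | jaq | joc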
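The tone-spelling rules in the table above can be sketched as a small function. This is a minimal illustration, not Chao's own formulation: `tone_spell` and its arguments are hypothetical names, the syllable is passed in already split into a stem and a coda (the spelling alone does not say whether the u of ciu is a nucleus or a coda), and the entering-tone branch only swaps the coda, so it does not model vowel differences such as the one between jang and joc.

```python
# Coda substitutions read off the tone table; all names are illustrative.
RISING_CODA = {"i": "e", "u": "o", "ng": "g"}            # 'lighter' codas
DEPARTING_CODA = {"": "h", "i": "y", "u": "w",
                  "m": "mm", "n": "nn", "ng": "q"}        # 'heavier' codas
ENTERING_CODA = {"m": "p", "n": "t", "ng": "c"}           # stop codas

def tone_spell(stem, coda, tone):
    """Spell one syllable (stem = onset + medial + nucleus) in a given tone."""
    if tone == "even":
        return stem + coda                     # basic spelling
    if tone == "rising":
        if coda in RISING_CODA:
            return stem + RISING_CODA[coda]    # lighten the coda
        return stem + stem[-1] + coda          # otherwise double the nucleus
    if tone == "departing":
        return stem + DEPARTING_CODA[coda]     # heavier coda, or add h
    if tone == "entering":
        return stem + ENTERING_CODA[coda]      # nasal coda becomes a stop
    raise ValueError(f"unknown tone: {tone}")
```

For example, `tone_spell("la", "m", "rising")` yields laam and `tone_spell("ca", "u", "departing")` yields caw, matching the table rows for -m and -u.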

One consequence of this is that the rimes -e and -ei in the even tone conflate to in the rising tone. However, since there are no such syllables which begin with the same consonant and medial, no syllables are actually conflated.
The difference between yin and yang tones is indicated by the voicing of the consonant. A zero consonant is treated as voiceless, so i, iem, uon, iuan are ping yin, whereas yi, yem, won, yuan are ping yang. In a few cases, the effect that voiced,,, have on tone needs to be negated to achieve a ping yin tone. This is accomplished by spelling them,,,.
To mark the toneless Mandarin syllable ma, a centered dot is used:. The dot is omitted for toneless, as tonic me, de, te, ne, le do not exist.

Dialectal correspondences

The realization of the tones in the various varieties of Chinese is generally predictable; see Four tones for details. In Beijing Mandarin, for example, even tone is split according to voicing, with muddy consonants becoming aspirates: ba, pa, ma, bha → bā, pā, má, pá. Departing tone is not split, and muddy consonants become tenuis: bah, pah, mah, bhah → bà, pà, mà, bà. Rising tone splits, not along voicing, but with muddy-consonant syllables conflating with departing tone: baa, paa, maa, bhaa → bǎ, pǎ, mǎ, bà. That is, bhaa and bhah are homonyms in Beijing, as indeed they are in all of Mandarin, in Wu apart from Wenzhounese, in Hakka, and in reading pronunciations of Cantonese. Entering tone is likewise split in Beijing: mat, bhat → mà, bá.
However, the realization of entering tones in Beijing dialect, and thus in Standard Chinese, is not predictable when a syllable has a voiceless initial, as in bat or pat. In such cases even syllables with the same GC spelling may have different tones in Beijing, though they remain homonyms in other Mandarin dialects, such as Xi'an and Sichuanese. This is due to historical dialect-mixing in the Chinese capital that resulted in unavoidably idiosyncratic correspondences.
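The Beijing correspondences described above amount to a small lookup. The sketch below is an illustration only: `beijing_tone` and the three voicing labels are hypothetical names, the returned numbers are the four Mandarin tones (1 for ā through 4 for à), and `None` stands for the unpredictable entering-tone case with a voiceless initial.

```python
def beijing_tone(voicing, tone):
    """Return the Mandarin tone number (1-4) for a GC tone category.

    voicing: "voiceless" (b, p), "sonorant" (m) or "muddy" (bh).
    Returns None where the text says the outcome is unpredictable.
    """
    if tone == "even":
        # Split by voicing: ba, pa get tone 1; ma, bha get tone 2.
        return 1 if voicing == "voiceless" else 2
    if tone == "rising":
        # Muddy initials merge with departing: bhaa gives tone 4.
        return 4 if voicing == "muddy" else 3
    if tone == "departing":
        return 4                                # no split
    if tone == "entering":
        # mat gives tone 4, bhat gives tone 2; bat, pat are unpredictable.
        return {"sonorant": 4, "muddy": 2}.get(voicing)
    raise ValueError(f"unknown tone: {tone}")
```

The function covers tone only; the change of muddy initials to aspirates in the even tone and to tenuis stops elsewhere would need a separate step.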
In Yue, there is a straightforward split according to consonant voicing, with a postvocalic h in Yale romanization for the latter. Muddy onsets become aspirates in even and rising tones, but tenuis in departing and entering tones: ba, pa, ma, bha → bā, pā, māh, pāh; baa, paa, maa, bhaa → bá, pá, máh, páh; bah, pah, mah, bhah → ba, pa, mah, bah; bat, pat, mat, bhat → baat, paat, maht, baht. In addition, there is a split in entering tone according to vowel length, with Cantonese mid-entering tone for short vowels like bāt, pāt. In reading pronunciations, however, rising tone syllables with muddy onsets are treated as departing tone: bhaa → bah rather than páh. There is also an unpredictable split in the even ping yin tone which indicates diminutives or a change in part of speech, but this is not written in all Cantonese romanizations.
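The Yue pattern can be sketched in the same style. Again this is a hedged illustration: `yue_reflex` and its labels are hypothetical, the function returns categories rather than full Yale spellings, and it leaves out both the vowel-length split in the entering tone and the unpredictable split in the even tone.

```python
def yue_reflex(voicing, tone, reading=False):
    """Return (register, tone, muddy-onset reflex) for a GC syllable in Yue.

    voicing: "voiceless", "sonorant" or "muddy".
    reading=True selects the reading (literary) pronunciation.
    The third item is None unless the onset is muddy.
    """
    # Voiced initials give the yang register, marked by h in Yale.
    register = "yin" if voicing == "voiceless" else "yang"
    onset = None
    if voicing == "muddy":
        if reading and tone == "rising":
            tone = "departing"     # bhaa is read like bhah
        # Aspirated in even and rising tones, tenuis otherwise.
        onset = "aspirated" if tone in ("even", "rising") else "tenuis"
    return register, tone, onset
```

For instance, `yue_reflex("muddy", "rising")` corresponds to colloquial páh, while `yue_reflex("muddy", "rising", reading=True)` corresponds to bah.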

Sample text

Chao provided this poem as an example. The character text is no different in GC and standard Chinese, apart from 裏, which in any case has now been substituted with Chao's choice of 里 on the Mainland. Note that simplified characters like this would affect all of Chao's proposal, so that 對 below would become 对, etc. The only other difference is 他 for 'her', which may differ from contemporary written Chinese 她, but which follows Classical usage.