Vietnamese phonology

This article is a technical description of the sound system of the Vietnamese language, including phonetics and phonology. Two main varieties of Vietnamese, Hanoi and Ho Chi Minh City, are described below.

Initial consonants

Initial consonants which exist only in the Hanoi dialect are in red, while those that exist only in the Saigon dialect are in blue.
The table below summarizes these sound correspondences:


Vowel nuclei


The IPA chart of vowel nuclei above is based on the sounds in Hanoi Vietnamese; other regions may have different inventories. Vowel nuclei consist of monophthongs and three centering diphthongs.
In Vietnamese, vowel nuclei are able to combine with offglides or to form closing diphthongs and triphthongs. Below is a chart listing the closing sequences of general northern speech.
says that in Hanoi, words spelled with ưu and ươu are pronounced, respectively, whereas other dialects in the Tonkin delta pronounce them as and. This observation is also made by and.


When stops occur at the end of words, they have no audible release.
When the velar consonants are after, they are articulated with a simultaneous bilabial closure or are strongly labialized.

Hanoi finals

Analysis of final ch, nh

The pronunciation of syllable-final ch and nh in Hanoi Vietnamese has had different analyses. One analysis, that of has them as being phonemes, where contrasts with both syllable-final t and c and contrasts with syllable-final n and ng. Final is, then, identified with syllable-initial.
Another analysis has final and as representing different spellings of the velar phonemes and that occur after upper front vowels and . This analysis interprets orthographic ⟨ach⟩ and ⟨anh⟩ as an underlying, which becomes phonetically open and diphthongized: →, →. This diphthongization also affects ⟨êch⟩ and ⟨ênh⟩: →, →.
Arguments for the second analysis include the limited distribution of final and, the gap in the distribution of and which do not occur after and, the pronunciation of ⟨ach⟩ and ⟨anh⟩ as and in certain conservative central dialects, and the patterning of ~ and ~ in certain reduplicated words. Additionally, final is not articulated as far forward as the initial : and are pre-velar with no alveolar contact.
The first analysis closely follows the surface pronunciation of a slightly different Hanoi dialect than the second. In this dialect, the in and is not diphthongized but is actually articulated more forward, approaching a front vowel. This results in a three-way contrast between the rimes ăn vs. anh vs. ăng. For this reason, a separate phonemic is posited.

Table of Hanoi finals

The following rimes ending with velar consonants have been diphthongized in the Hanoi dialect, but, and are more open:
With the above phonemic analyses, the following is a table of rimes ending in in the Hanoi dialect:

Saigon finals

Merger of finals

While the variety of Vietnamese spoken in Hanoi has retained finals faithfully from Middle Vietnamese, the variety spoken in Ho Chi Minh City has drastically changed its finals. Rimes ending in merged with those ending in, respectively, so they are always pronounced, respectively, after the short front vowels . However, they are always pronounced after the other vowels. After rounded vowels, many speakers close their lips, i.e. they pronounce as. Subsequently, vowels of rimes ending in labiovelars have been diphthongized, while vowels of rimes ending in alveolar have been centralized. Otherwise, some Southern speakers distinguish and after in formal speech, but there are no Southern speakers who pronounce "ch" and "nh" at the end of syllables as.

Table of Saigon finals

The short back vowels in the rimes have been diphthongized and centralized, while the consonants have been labialized. Similarly, the short front vowels have been centralized and are realized as central vowels, and the "unspecified" consonants have undergone coronal spreading from the preceding front vowels and surface as coronals.
Other dialects have also merged these codas, although some vowels are pronounced differently from one dialect to another:
Hue, Quang Nam, Binh Dinh, Sai Gon
ung, uc, , , ,
un, ut, , , ,
ênh, êch, , , ,
ên, êt, , , ,
inh, ich, , , ,
in, it, , , ,

For many Southern speakers, the rimes ông, ôc are merged into ong, oc, but they are not merged with ôn, ôt. The rimes oong, ooc and eng, ec are rare and occur mostly in loanwords or onomatopoeia. The rimes ôông, ôôc are the "archaic" forms that became ông, ôc by diphthongization; they still exist in the North Central dialect in many place names. In the North Central dialect, these rimes are articulated without a simultaneous bilabial closure or labialization.
With the above phonemic analyses, the following is a table of rimes ending in in the Saigon dialect:


Tone

Vietnamese vowels are all pronounced with an inherent tone. Tones differ in
Unlike the tones of many Native American, African, and Chinese languages, Vietnamese tones do not rely solely on pitch contour. Vietnamese often uses a register complex instead. A better description may therefore be that Vietnamese is a register language rather than a "pure" tonal language.
In Vietnamese orthography, tone is indicated by diacritics written above or below the vowel.
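As a rough illustration of how the orthography encodes tone, the sketch below reads the tone of a written syllable from its diacritic. It assumes standard orthography, in which the huyền, hỏi, ngã, sắc, and nặng tones are written with a grave accent, hook above, tilde, acute accent, and dot below respectively, and the ngang tone is unmarked. The function name `tone_of` is hypothetical, and the code says nothing about how the tones are realized phonetically.

```python
import unicodedata

# Combining marks used for the five marked tones in the standard orthography.
TONE_MARKS = {
    "\u0300": "huyền",  # combining grave accent
    "\u0309": "hỏi",    # combining hook above
    "\u0303": "ngã",    # combining tilde
    "\u0301": "sắc",    # combining acute accent
    "\u0323": "nặng",   # combining dot below
}


def tone_of(syllable: str) -> str:
    """Return the tone name written on one orthographic syllable (hypothetical helper)."""
    # NFD splits precomposed letters into a base letter plus combining marks,
    # so the tone mark can be found even when it is stacked on a vowel that
    # also carries a circumflex or breve.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return "ngang"  # no tone mark: the unmarked tone


for word in ["ma", "mà", "mả", "mã", "má", "mạ"]:
    print(word, tone_of(word))
```

Marks that belong to the vowel letter itself (circumflex, breve, horn) are ignored, because they are not keys in `TONE_MARKS`.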

Six-tone analysis

There is much variation among speakers concerning how tone is realized phonetically. There are differences between varieties of Vietnamese spoken in the major geographic areas and smaller differences within the major areas. In addition, there seems to be variation among individuals. More research is needed to determine the remaining details of tone realization and the variation among speakers.

Northern varieties

The six tones in the Hanoi and other northern varieties are:
Ngang tone:
Huyền tone:
Hỏi tone:
Ngã tone:
Sắc tone:
Nặng tone:
In Southern varieties, the contours of the ngang, sắc, and huyền tones are similar to those of the Northern tones; however, these tones are produced with normal voice instead of breathy voice.
The nặng tone is pronounced as a low rising tone in fast speech or as a low falling-rising tone in more careful speech.
The ngã and hỏi tones are merged into a mid falling-rising tone, which is somewhat similar to the hỏi tone of the non-Hanoi Northern accent mentioned above.

North-central and Central varieties

North-central and Central Vietnamese varieties are fairly similar with respect to tone although within the North-central dialect region there is considerable internal variation.
It is sometimes said that people from Nghệ An pronounce every tone as a nặng tone.

Eight-tone analysis

An older analysis assumes eight tones rather than six, following the lead of traditional Chinese phonology. In Middle Chinese, syllables ending in a vowel or nasal allowed for three tonal distinctions, but syllables ending in /p/, /t/, or /k/ had no tonal distinctions. Rather, they were consistently pronounced with a short high tone, which was called the entering tone and considered a fourth tone. Similar considerations lead to the identification of two additional tones in Vietnamese for syllables ending in a stop consonant. These are not phonemically distinct from the sắc and nặng tones, however, so modern linguists do not consider them separate tones, and they are not distinguished in the orthography.
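The relationship between the two analyses can be sketched in code. The example below is only an illustration: it assumes standard orthography, in which final stops are spelled p, t, c, or ch, and the name `eight_tone_label` and its return format are hypothetical. It returns the six-tone name together with a flag for stop-final syllables; the eight-tone analysis treats the two flagged combinations as separate tones, while the six-tone analysis folds them into sắc and nặng.

```python
import unicodedata

SAC, NANG = "sắc", "nặng"
TONE_MARKS = {
    "\u0300": "huyền",  # grave accent
    "\u0309": "hỏi",    # hook above
    "\u0303": "ngã",    # tilde
    "\u0301": SAC,      # acute accent
    "\u0323": NANG,     # dot below
}
# Orthographic spellings of syllable-final stops (assumption: standard spelling).
STOP_FINALS = ("p", "t", "c", "ch")


def eight_tone_label(syllable: str) -> tuple:
    """Return (six-tone name, ends_in_stop) for one written syllable."""
    nfd = unicodedata.normalize("NFD", syllable.lower())
    tone = next((TONE_MARKS[ch] for ch in nfd if ch in TONE_MARKS), "ngang")
    ends_in_stop = nfd.endswith(STOP_FINALS)
    # Stop-final syllables are only written with the sắc or nặng mark.
    if ends_in_stop and tone not in (SAC, NANG):
        raise ValueError(f"{syllable!r}: stop-final syllable without sắc or nặng")
    return tone, ends_in_stop


for word in ["má", "mát", "mạ", "mạt"]:
    print(word, eight_tone_label(word))
```

Counting the distinct pairs this function can return gives eight categories: the six tones on syllables without a final stop, plus sắc and nặng on stop-final syllables.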

Syllables and phonotactics

According to, there are 4,500 to 4,800 possible spoken syllables, and the standard national orthography can represent 6,200 syllables. A description of syllable structure and exploration of its patterning according to the Prosodic Analysis approach of J.R. Firth is given in Henderson.
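The Vietnamese syllable structure follows the scheme (C1)(w)V(G|C2)+T, where parentheses mark optional elements and G|C2 means either an offglide or a coda: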
In other words, a syllable has an obligatory nucleus and tone, and can have an optional consonant onset, an optional on-glide, and an optional coda or off-glide.
More explicitly, the syllable types are as follows:
C1: Any consonant may occur as an onset, with the following exceptions:
w: the onglide :
V: The vowel nucleus V may be any of the following 14 monophthongs or diphthongs:.
G: The offglide may be or. Together, V and G must form one of the diphthongs or triphthongs listed in the section on Vowels.
C2: The optional coda C2 is restricted to labial, coronal, and velar stops and nasals, which cannot cooccur with the offglides.
T: Syllables are spoken with an inherent tone contour:
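The structural constraints described above can be written down as a small data model. This is a minimal sketch: the class and field names are hypothetical, segments are passed as plain strings, and the code does not check them against any phoneme inventory. It enforces only what the text states, namely that the nucleus and tone are obligatory, that the onset and onglide are optional, and that an offglide and a coda cannot co-occur.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Syllable:
    """One syllable, using the labels C1, w, V, G, C2, T from the list above."""
    nucleus: str                    # V: obligatory
    tone: str                       # T: obligatory
    onset: Optional[str] = None     # C1: optional
    onglide: bool = False           # w: optional
    offglide: Optional[str] = None  # G: optional, excludes C2
    coda: Optional[str] = None      # C2: optional, excludes G

    def __post_init__(self):
        if not self.nucleus:
            raise ValueError("the nucleus is obligatory")
        if not self.tone:
            raise ValueError("the tone is obligatory")
        if self.offglide and self.coda:
            raise ValueError("an offglide and a coda cannot co-occur")

    def shape(self) -> str:
        """Return the syllable type as a string of slot labels."""
        parts = []
        if self.onset:
            parts.append("C1")
        if self.onglide:
            parts.append("w")
        parts.append("V")
        if self.offglide:
            parts.append("G")
        if self.coda:
            parts.append("C2")
        return "".join(parts) + "+T"


# Illustrative values only; the segment strings are placeholders.
example = Syllable(onset="t", onglide=True, nucleus="a", coda="n", tone="sắc")
print(example.shape())
```

Keeping the offglide and coda as separate fields, with the exclusion checked at construction time, mirrors the wording of the description; a single "final" field would also work.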