Bengali alphabet
The Bengali or Bangla alphabet is the alphabet used to write the Bengali language and has historically been used to write Sanskrit within Bengal. It is quite similar to the Assamese alphabet and other alphabets based on the Bengali–Assamese script.
From a classificatory point of view, the Bengali script is an abugida, i.e. its vowel graphemes are mainly realised not as independent letters, but as diacritics modifying the vowel inherent in the base letter they are added to. Bengali script is written from left to right and lacks distinct letter cases. It is recognisable, as are other Brahmic scripts, by a distinctive horizontal line known as মাত্রা matra running along the tops of the letters that links them together. The Bengali script is however less blocky and presents a more sinuous shape.
Characters
The Bengali script can be divided into vowels and vowel diacritics/marks, consonants and consonant conjuncts, diacritical and other symbols, digits, and punctuation marks. Vowels and consonants are used both as letters of the alphabet and as diacritical marks.
Vowels
The Bengali script has a total of 11 vowel graphemes, each of which is called a স্বরবর্ণ swôrôbôrnô "vowel letter". The swôrôbôrnôs represent six of the seven main vowel sounds of Bengali, along with two vowel diphthongs. All of them are used in both the Bengali and Assamese languages.
- "অ" ô serves as the default inherent vowel for the entire Bengali script. Bengali, Assamese and Odia, which are Eastern languages, have this value for the inherent vowel, while other languages using Brahmic scripts have a for their inherent vowel.
- Even though the open-mid front unrounded vowel is one of the seven main vowel sounds in the standard Bengali language, no distinct vowel symbol has been allotted for it in the script, since there is no such sound in Sanskrit, the primary written language when the script was conceived. As a result, the sound is orthographically realised by multiple means in modern Bengali orthography, usually using some combination of "এ" e, "অ", "আ" a and the যফলা jôfôla.
- There are two graphemes for the vowel sound i and two graphemes for the vowel sound u. The redundancy stems from the time when this script was used to write Sanskrit, a language that had short and long vowels: "ই" i and "ঈ" ī, and "উ" u and "ঊ" ū. The letters are preserved in the Bengali script with their traditional names despite the fact that they are no longer pronounced differently in ordinary speech. These graphemes serve an etymological function, however, in preserving the original Sanskrit spelling in tôtsômô Bengali words.
- The grapheme called "ঋ" ṛ does not really represent a vowel phoneme in Bengali but the consonant-vowel combination রি. Nevertheless, it is included in the vowel section of the inventory of the Bengali script. This inconsistency is also a remnant from Sanskrit, where the grapheme represents the vocalic equivalent of a retroflex approximant. Another grapheme, "ঌ" ḷ, represents the vocalic equivalent of a dental approximant in Sanskrit but in Bengali stands for the consonant-vowel combination লি rather than a vowel phoneme. It was also included in the vowel section but, unlike "ঋ", it was recently discarded from the inventory since its usage was extremely limited even in Sanskrit.
- When a vowel sound occurs syllable-initially or when it follows another vowel, it is written using a distinct letter. When a vowel sound follows a consonant, it is written with a diacritic which, depending on the vowel, can appear above, below, before or after the consonant. These vowel marks cannot appear without a consonant and are called কার kar.
- An exception to the above system is the vowel অ ô, which has no vowel mark but is considered inherent in every consonant letter. To denote the absence of the inherent vowel following a consonant, a diacritic called the হসন্ত hôsôntô may be written underneath the consonant.
- Although there are only two diphthongs in the inventory of the script, "ঐ" oi and "ঔ" ou, the Bengali phonetic system has, in fact, many diphthongs. Most diphthongs are represented by juxtaposing the graphemes of their forming vowels, as in কেউ keu.
- There also used to be two long vowels, "ৠ" ṝ and "ৡ" ḹ, which were removed from the inventory during the Vidyasagarian reform of the script because they were peculiar to Sanskrit.
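The pairing of independent vowel letters with their কার kar forms, and the role of the hôsôntô, can be illustrated with the Unicode encoding of the script. The following Python sketch is illustrative only: the table `VOWEL_SIGNS` and the helper `syllable` are hypothetical names introduced here, and the code-point pairings are taken from the Unicode Bengali block rather than from the description above.

```python
# Independent vowel letters mapped to their dependent (kar) forms.
# The inherent vowel has no sign, so it maps to an empty string.
VOWEL_SIGNS = {
    "\u0985": "",        # অ (inherent vowel, no mark)
    "\u0986": "\u09be",  # আ -> া
    "\u0987": "\u09bf",  # ই -> ি
    "\u0988": "\u09c0",  # ঈ -> ী
    "\u0989": "\u09c1",  # উ -> ু
    "\u098a": "\u09c2",  # ঊ -> ূ
    "\u098b": "\u09c3",  # ঋ -> ৃ
    "\u098f": "\u09c7",  # এ -> ে
    "\u0990": "\u09c8",  # ঐ -> ৈ
    "\u0993": "\u09cb",  # ও -> ো
    "\u0994": "\u09cc",  # ঔ -> ৌ
}
HASANTA = "\u09cd"  # ্ suppresses the inherent vowel


def syllable(consonant, vowel=None):
    """Attach a vowel to a consonant; vowel=None suppresses the inherent vowel."""
    if vowel is None:
        return consonant + HASANTA
    return consonant + VOWEL_SIGNS[vowel]


print(syllable("\u0995", "\u0985"))  # bare ক, read with the inherent vowel
print(syllable("\u0995", "\u0987"))  # ক followed by the sign for ই
print(syllable("\u0995"))            # ক with hôsôntô, no vowel
```

In encoded text the vowel sign is always stored after its consonant, even for signs that are drawn before or around the consonant; the visual placement is left to the font.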
Consonants
Consonant letters are called ব্যঞ্জনবর্ণ bænjônbôrnô "consonant letter" in Bengali. The names of the letters are typically just the consonant sound plus the inherent vowel অ ô. Since the inherent vowel is assumed and not written, most letters' names look identical to the letter itself.
- Some letters that have lost their distinctive pronunciation in modern Bengali are called by more elaborate names. For example, since the consonant phoneme n is written as both ন and ণ, the letters are not called simply nô; instead, they are called দন্ত্য ন dôntyô nô and মূর্ধন্য ণ murdhônyô nô. What was once pronounced and written as a retroflex nasal ণ is now pronounced as an alveolar nasal, although the spelling does not reflect the change.
- Although still named Murdhônyô when they are being taught, retroflex consonants do not exist in Bengali and are instead fronted to their postalveolar and alveolar equivalents.
- The voiceless palato-alveolar sibilant phoneme can be written as শ, ষ, or স, depending on the word.
- The voiced palato-alveolar affricate phoneme can be written in two ways, as য or জ. In many varieties of Bengali, the sounds z and dz are not distinct from this phoneme, but speakers who distinguish them may use the letters য and জ contrastively.
- Since the nasals ঞ ñô and ঙ ngô cannot occur at the beginning of a word in Bengali, their names are not ñô and ngô but ইঞ iñô and উঙ ungô respectively.
- Similarly, since the semivowel য় yô cannot occur at the beginning of a Bengali word, its name is not ôntôsthô yô but অন্তঃস্থ অ ôntôsthô ô.
- The pronunciation of ড় ṛô (ḍô with a zero) and ঢ় ṛhô differs from that of র rô, as in other Indic languages. This is especially true in the speech of the western and southern parts of Bengal, but less so in the dialects east of the Padma River. ড় and ঢ় were introduced to the inventory during the Vidyasagarian reform to indicate the retroflex flap in the pronunciation of ড ḍô and ঢ ḍhô in the middle or at the end of a word. It is an allophonic development in some Indic languages that is not present in Sanskrit. Yet in ordinary speech these letters are pronounced the same as র in modern Bengali.
Consonant conjuncts
Often, consonant conjuncts are not actually pronounced as would be implied by the pronunciation of the individual components. For example, adding ল lô underneath শ shô in Bengali creates the conjunct শ্ল, which is not pronounced shlô but slô in Bengali. Many conjuncts represent Sanskrit sounds that were lost centuries before modern Bengali was ever spoken as in জ্ঞ. It is a combination of জ ǰô and ঞ ñô but it is not pronounced "ǰñô" or "jnô". Instead, it is pronounced ggô in modern Bengali. Thus, as conjuncts often represent sounds that cannot be easily understood from the components, the following descriptions are concerned only with the construction of the conjunct, and not the resulting pronunciation.
Fused forms
Some consonants fuse in such a way that one stroke of the first consonant also serves as a stroke of the next.
- The consonants can be placed on top of one another, sharing their vertical line: ক্ক kkô গ্ন gnô গ্ল glô ন্ন nnô প্ন pnô প্প ppô ল্ল llô etc.
- As the last member of a conjunct, ব bô can hang on the vertical line under the preceding consonants, taking the shape of ব bô : গ্ব gbô ণ্ব "ṇbô" দ্ব "dbô" ল্ব lbô শ্ব "shbô".
- The consonants can also be placed side-by-side, sharing their vertical line: দ্দ ddô ন্দ ndô ব্দ bdô ব্জ bǰô প্ট pṭô শ্চ shchô শ্ছ shchhô, etc.
Approximated forms
- The consonants can be placed side-by-side, appearing unaltered: দ্গ dgô দ্ঘ dghô ড্ড ḍḍô.
- As the last member of a conjunct, ব bô can appear immediately to the right of the preceding consonant, taking the shape of ব bô : ধ্ব "dhbô" ব্ব bbô হ্ব "hbô".
Compressed forms
- As the first member of a conjunct, the consonants ঙ ngô চ chô ড ḍô and ব bô are often compressed and placed at the top-left of the following consonant, with little or no change to the basic shape: ঙ্ক্ষ "ngkṣô" ঙ্খ ngkhô ঙ্ঘ ngghô ঙ্ম ngmô চ্চ chchô চ্ছ chchhô চ্ঞ "chnô" ড্ঢ ḍḍhô ব্ব bbô.
- As the first member of a conjunct, ত tô is compressed and placed above the following consonant, with little or no change to the basic shape: ত্ন tnô ত্ম "tmô" ত্ব "tbô".
- As the first member of a conjunct, ম mô is compressed and simplified to a curved shape. It is placed above or to the top-left of the following consonant: ম্ন mnô ম্প mpô ম্ফ mfô ম্ব mbô ম্ভ mbhô ম্ম mmô ম্ল mlô.
- As the first member of a conjunct, ষ ṣô is compressed and simplified to an oval shape with a diagonal stroke through it. It is placed to the top-left of the following consonants: ষ্ক ṣkô ষ্ট ṣṭô ষ্ঠ ṣṭhô ষ্প ṣpô ষ্ফ ṣfô ষ্ম ṣmô.
- As the first member of a conjunct, স sô is compressed and simplified to a ribbon shape. It is placed above or to the top-left of the following consonant: স্ক skô স্খ skhô স্ট sṭô স্ত stô স্থ sthô স্ন snô স্প spô স্ফ sfô স্ব "sbô" স্ম "smô" স্ল slô.
Abbreviated forms
- As the first member of a conjunct, জ ǰô can lose its final down-stroke: জ্জ ǰǰô জ্ঞ "ǰñô" জ্ব "jbô".
- As the first member of a conjunct, ঞ ñô can lose its bottom half: ঞ্চ ñchô ঞ্ছ ñchhô ঞ্জ ñǰô ঞ্ঝ ñǰhô.
- As the last member of a conjunct, ঞ ñô can lose its left half : জ্ঞ "ǰñô".
- As the first member of a conjunct, ণ ṇô and প pô can lose their down-stroke: ণ্ঠ ṇṭhô ণ্ড ṇḍô প্ত ptô প্স psô.
- As the first member of a conjunct, ত tô and ভ bhô can lose their final upward tail: ত্ত ttô ত্থ tthô ত্র trô ভ্র bhrô.
- As the last member of a conjunct, থ thô can lose its final upstroke, taking the form of হ hô instead: ন্থ nthô স্থ sthô ম্থ mthô
- As the last member of a conjunct, ম mô can lose its initial down-stroke: ক্ম "kmô" গ্ম "gmô" ঙ্ম ngmô ট্ম "ṭmô" ণ্ম "ṇmô" ত্ম "tmô" দ্ম "dmô" ন্ম nmô ম্ম mmô শ্ম "shmô" ষ্ম ṣmô স্ম "smô".
- As the last member of a conjunct, স sô can lose its top half: ক্স ksô.
- As the last member of a conjunct ট ṭô, ড ḍô and ঢ ḍhô can lose their matra: প্ট pṭô ণ্ড ṇḍô ণ্ট ṇṭô ণ্ঢ ṇḍhô.
- As the last member of a conjunct ড ḍô can change its shape: ণ্ড ṇḍô
Variant forms
- As the first member of a conjunct, ঙ ngô can appear as a loop and curl: ঙ্ক ngkô ঙ্গ nggô.
- As the last member of a conjunct, the curled top of ধ dhô is replaced by a straight downstroke to the right, taking the form of ঝ ǰhô instead: গ্ধ gdhô দ্ধ ddhô ন্ধ ndhô ব্ধ bdhô.
- As the first member of a conjunct, র rô appears as a diagonal stroke above the following member: র্ক rkô র্খ rkhô র্গ rgô র্ঘ rghô, etc.
- As the last member of a conjunct, র rô appears as a wavy horizontal line under the previous member: খ্র khrô গ্র grô ঘ্র ghrô ব্র brô, etc.
- * In some fonts, certain conjuncts with রফলা rôfôla appear using the compressed form of the previous consonant: জ্র ǰrô ট্র ṭrô ঠ্র ṭhrô ড্র ḍrô ম্র mrô স্র srô.
- * In some fonts, certain conjuncts with রফলা rôfôla appear using the abbreviated form of the previous consonant: ক্র krô ত্র trô ভ্র bhrô.
- As the last member of a conjunct, য jô appears as a wavy vertical line to the right of the previous member: ক্য "kyô" খ্য "khyô" গ্য "gyô" ঘ্য "ghyô" etc.
- * In some fonts, certain conjuncts with যফলা jôfôla appear using special fused forms: দ্য "dyô" ন্য "nyô" শ্য "shyô" ষ্য "ṣyô" স্য "syô" হ্য "hyô".
Exceptions
- When followed by র rô or ত tô, ক kô takes on the same form as ত tô would with the addition of a curl to the right: ক্র krô, ক্ত ktô.
- When preceded by the abbreviated form of ঞ ñô, চ chô takes the shape of ব bô: ঞ্চ ñchô
- When preceded by another ট ṭô, ট is reduced to a leftward curl: ট্ট ṭṭô.
- When preceded by ষ ṣô, ণ ṇô appears as two loops to the right: ষ্ণ ṣṇô.
- As the first member of a conjunct, or when at the end of a word and followed by no vowel, ত tô can appear as ৎ: ৎস "tsô" ৎপ tpô ৎক tkô etc.
- When preceded by হ hô, ন nô appears as a curl to the right: হ্ন "hnô".
- Certain combinations must be memorised: ক্ষ "kṣô" হ্ম "hmô".
Certain compounds
- উ u
- * When following গ gô or শ shô, it takes on a variant form resembling the final tail of ও o: গু gu শু shu.
- * When following a ত tô that is already part of a conjunct with প pô, ন nô or স sô, it is fused with the ত to resemble ও o: ন্তু ntu স্তু stu প্তু ptu.
- * When following র rô, and in many fonts also following the variant রফলা rôfôla, it appears as an upward curl to the right of the preceding consonant as opposed to a downward loop below: রু ru গ্রু gru ত্রু tru থ্রু thru দ্রু dru ধ্রু dhru ব্রু bru ভ্রু bhru শ্রু shru.
- * When following হ hô, it appears as an extra curl: হু hu.
- ঊ ū
- * When following র rô, and in many fonts also following the variant রফলা rôfôla, it appears as a downstroke to the right of the preceding consonant as opposed to a downward hook below: রূ rū গ্রূ grū থ্রূ thrū দ্রূ drū ধ্রূ dhrū ভ্রূ bhrū শ্রূ shrū.
- ঋ ri
- * When following হ hô, it takes the variant shape of ঊ ū: হৃ hri.
- Conjuncts of three consonants also exist, and follow the same rules as above: স sô + ত tô + র rô = স্ত্র strô, ম mô + প pô + র rô = ম্প্র mprô, জ ǰô + জ ǰô + ব bô = জ্জ্ব "ǰǰbô", ক্ষ "kṣô" + ম mô = ক্ষ্ম "kṣmô".
- Theoretically, four-consonant conjuncts can also be created, as in র rô + স sô + ট ṭô + র rô = র্স্ট্র rsṭrô, but they are not found in native words.
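In Unicode text a conjunct is not stored as a single character. It is encoded as its component consonants joined by the hôsôntô (U+09CD), and the font chooses the fused, compressed, abbreviated or variant shape described above. The following Python sketch shows that encoding; the helper names `make_conjunct` and `split_conjunct` are hypothetical and introduced only for illustration.

```python
HASANTA = "\u09cd"  # ্ joins consonants into a conjunct in encoded text


def make_conjunct(*consonants):
    """Join consonant letters with hôsôntô to request a conjunct."""
    return HASANTA.join(consonants)


def split_conjunct(cluster):
    """Recover the component consonants of an encoded conjunct."""
    return cluster.split(HASANTA)


stro = make_conjunct("\u09b8", "\u09a4", "\u09b0")             # স + ত + র
rstro = make_conjunct("\u09b0", "\u09b8", "\u099f", "\u09b0")  # র + স + ট + র

# Show the conjunct next to the code points it is stored as.
print(stro, [f"U+{ord(c):04X}" for c in stro])
print(rstro, len(rstro), "code points")
print(split_conjunct("\u0995\u09cd\u09b7"))  # components of ক্ষ
```

Because the shape is selected at rendering time, the same code-point sequence can appear with a traditional ligature in one font and a more transparent form in another.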
Diacritics and other symbols
Symbol/Grapheme | Name | Function | Romanization | IPA transcription |
ৎ | খণ্ড ত khôndô tô | Special character. Final unaspirated dental | t | |
ং | অনুস্বার ônushshar | Diacritic. Final velar nasal | ng | |
ঃ | বিসর্গ bishôrgô | Diacritic. (1) Doubles the next consonant sound without the vowel: in দুঃখ dukkhô, the k of খ khô is repeated before the whole খ khô. (2) Represents an "h" sound at the end of a word, e.g. এঃ eh!, উঃ uh! (3) Silent in spellings like অন্তঃনগর ôntônôgôr "inter-city". (4) Also used to mark abbreviations, e.g. কিঃমিঃ for কিলোমিটার "kilometer" and ডাঃ for ডাক্তার dāktār "doctor". | h | |
ঁ | চন্দ্রবিন্দু chôndrôbindu | Diacritic. Vowel nasalization | ñ | |
্ | হসন্ত hôshôntô | Diacritic. Suppresses the inherent vowel | – | – |
ঽ | অবগ্রহ ôbôgrôhô | Special character or sign. Used for prolonging vowel sounds. Example 1: শুনঽঽঽ shunôôôô "listennnn...", where the inherent vowel ô of ন nô is prolonged. Example 2: কিঽঽঽ? kiiii? "Whatttt...?", where the vowel i attached to the consonant ক kô is prolonged. | – | – |
্য | যফলা jôfôla | Diacritic. Has two pronunciations in modern Bengali, depending on the position within the syllable of the consonant it is attached to. Example 1: when the consonant is syllable-initial, the jôfôla acts as a vowel, as in ত্যাগ. Example 2: when the consonant is syllable-final, the jôfôla doubles it, as in মুখ্য. Notably used in transliterating English words with the vowel sound of "black", e.g. ব্ল্যাক "black", and sometimes as a diacritic to indicate non-Bengali vowels of various kinds in transliterated foreign words, e.g. the schwa indicated by a jôfôla, the French u and the German umlaut ü as উ্য uyô, and the German umlaut ö as ও্য oyô or এ্য eyô. | ê / yô | |
্র | রফলা rôfôla | Diacritic. Pronounced as r following a consonant phoneme. | r | |
র্ক | রেফ ref/reph | Diacritic. Pronounced as r preceding a consonant phoneme. | r | |
্ব | বফলা bôfôla | Diacritic. Used only in spellings adopted from Sanskrit, with two different pronunciations depending on the position of the consonant it is attached to. Example 1: when the consonant is syllable-initial, the bôfôla remains silent, as in স্বাধীন. Example 2: when the consonant is syllable-final, the bôfôla doubles it, as in বিদ্বান and বিশ্ব. However, in certain Sanskrit sandhis such as ঋগ্বেদ, দিগ্বিজয়, উদ্বেগ and উদ্বৃত্ত the ব is pronounced, while usage with the consonant হ defies the phonological rules, as in আহ্বান and জিহ্বা. Also used in transliterating Islam-related Arabic words. Note: not all instances of ব bô used as the last member of a conjunct are bôfôla, for example in the words অম্বর ômbôr, লম্বা lômba, তিব্বত tibbôt, বাল্ব balb, etc. | - | |
৺ | ঈশ্বার ishshar | Sign. Represents the name of a deity; also written before the name of a deceased person | – | – |
ঀ | আঞ্জী/সিদ্ধিরস্তু anji/siddhirôstu | Sign. Used at the beginning of texts as an invocation | – | – |
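Several of the symbols in the table above have their own code points in the Unicode Bengali block, which can be inspected with Python's standard `unicodedata` module. The sketch below is illustrative: the list `SYMBOLS` and the helper `describe` are hypothetical names, and the names and categories printed depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# Code points for the single-character symbols in the table above:
# khôndô tô, ônushshar, bishôrgô, chôndrôbindu, hôsôntô, ôbôgrôhô,
# ishshar and anji.
SYMBOLS = ["\u09ce", "\u0982", "\u0983", "\u0981",
           "\u09cd", "\u09bd", "\u09fa", "\u0980"]


def describe(ch):
    """Return the code point, Unicode name and general category of ch."""
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{ord(ch):04X} {name} ({unicodedata.category(ch)})"


for ch in SYMBOLS:
    print(describe(ch))
```

The jôfôla, rôfôla, ref and bôfôla do not appear in this list because they are not separate characters; each is encoded as a hôsôntô combined with য, র or ব.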
Digits and numerals
The Bengali script has ten numerical digits. Bengali numerals have no horizontal headstroke or মাত্রা "matra".
Hindu-Arabic numerals | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
Bengali numerals | ০ | ১ | ২ | ৩ | ৪ | ৫ | ৬ | ৭ | ৮ | ৯ |
Numbers larger than 9 are written in Bengali using a positional base 10 numeral system. A period or dot is used to denote the decimal separator, which separates the integral and the fractional parts of a decimal number. When writing large numbers with many digits, commas are used as delimiters to group digits, indicating the thousand, the hundred thousand or lakh, and the ten million or hundred lakh or crore units. In other words, leftwards from the decimal separator, the first grouping consists of three digits, and the subsequent groupings always consist of two digits.
For example, the English number 17,557,345 will be written in traditional Bengali as ১,৭৫,৫৭,৩৪৫.
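The grouping rule can be sketched in a few lines of Python. This is a minimal illustration for non-negative integers only; the names `BENGALI_DIGITS`, `group_indian` and `to_bengali` are hypothetical and not part of any standard library.

```python
# Translation table from Hindu-Arabic digits to Bengali digits (U+09E6 to U+09EF).
BENGALI_DIGITS = str.maketrans(
    "0123456789",
    "\u09e6\u09e7\u09e8\u09e9\u09ea\u09eb\u09ec\u09ed\u09ee\u09ef",
)


def group_indian(n):
    """Group digits as one block of three, then blocks of two to its left."""
    s = str(n)
    if len(s) <= 3:
        return s
    head, tail = s[:-3], s[-3:]
    groups = []
    while len(head) > 2:
        groups.insert(0, head[-2:])  # peel off two digits at a time
        head = head[:-2]
    groups.insert(0, head)           # remaining one or two leading digits
    return ",".join(groups + [tail])


def to_bengali(n):
    """Format n with lakh/crore grouping and Bengali digits."""
    return group_indian(n).translate(BENGALI_DIGITS)


print(group_indian(17557345))  # 1,75,57,345
print(to_bengali(17557345))    # ১,৭৫,৫৭,৩৪৫
```

In the other direction, Python's built-in `int` already accepts Bengali digits, so `int("১২৩")` evaluates to 123 once any commas have been removed.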
Punctuation marks
Bengali punctuation marks, apart from the downstroke দাড়ি dari, the Bengali equivalent of a full stop, have been adopted from western scripts and their usage is similar: commas, semicolons, colons, quotation marks, etc. are the same as in English. Capital letters are absent in the Bengali script, so proper names are unmarked.
An apostrophe, known in Bengali as ঊর্ধ্বকমা urdhbôkôma "upper comma", is sometimes used to distinguish between homographs, as in পাটা pata "plank" and পাʼটা pa'ta "the leg". Sometimes, a hyphen is used for the same purpose.
Characteristics of the Bengali text
Bengali text is written and read horizontally, from left to right. The consonant graphemes and the full form of vowel graphemes fit into an imaginary rectangle of uniform size. The size of a consonant conjunct, regardless of its complexity, is deliberately maintained the same as that of a single consonant grapheme, so that diacritic vowel forms can be attached to it without any distortion. In a typical Bengali text, orthographic words (words as they are written) are separated from each other by an even spacing. Graphemes within a word are also evenly spaced, but that spacing is much narrower than the spacing between words.
Unlike in western scripts, in which the letter-forms stand on an invisible baseline, the Bengali letter-forms hang from a visible horizontal left-to-right headstroke called মাত্রা matra. The presence or absence of this matra can be important. For example, the letter ত tô and the numeral ৩ "3" are distinguishable only by the presence or absence of the matra, as is the case between the consonant cluster ত্র trô and the independent vowel এ e. The letter-forms also employ the concepts of letter-width and letter-height.
According to Bengali linguist Munier Chowdhury, there are about nine graphemes that are the most frequent in Bengali texts, shown with their percentage of appearance in the adjacent table.
Standardization
In the script, clusters of consonants are represented by different and sometimes quite irregular forms; thus, learning to read is complicated by the sheer size of the full set of letters and letter combinations, numbering about 350. While efforts at standardising the alphabet for the Bengali language continue in such notable centres as the Bangla Academy at Dhaka and the Pôshchimbônggô Bangla Akademi at Kolkata, it is still not quite uniform, as many people continue to use various archaic forms of letters, resulting in concurrent forms for the same sounds. Among the various regional variations within this script, only the Assamese and Bengali variations exist today in the formalised system.
It seems likely that standardisation of the alphabet will be greatly influenced by the need to typeset it on computers. The large alphabet can be represented, with a great deal of ingenuity, within the ASCII character set, omitting certain irregular conjuncts. Work has been underway since around 2001 to develop Unicode fonts, and it seems likely that it will split into two variants, traditional and modern. In this and other articles on Wikipedia dealing with the Bengali language, a Romanization scheme used by linguists specialising in Bengali phonology is included along with IPA transcription. A recent effort by the Government of West Bengal focused on simplifying the Bengali orthography in primary school texts.
There is yet to be a uniform standard collating sequence of Bengali graphemes. Experts in both Bangladesh and India are currently working towards a common solution for the problem.
Romanization
Romanization of Bengali is the representation of the Bengali language in the Latin script. Various romanization systems have been created for Bengali in recent years, but they have failed to represent the true Bengali phonetic sound. While different standards for romanisation have been proposed for Bengali, they have not been adopted with the degree of uniformity seen in languages such as Japanese or Sanskrit. The Bengali alphabet has often been included with the group of Brahmic scripts for romanisation, in which the true phonetic value of Bengali is never represented. Some of these schemes are the International Alphabet of Sanskrit Transliteration or "IAST system", "Indian languages Transliteration" or ITRANS, and the extension of IAST intended for non-Sanskrit languages of the Indian region called the National Library at Kolkata romanisation.
Sample texts
Article 1 of the Universal Declaration of Human Rights
Bengali in Bengali alphabet
Bengali in phonetic Romanization
Bengali in IPA
Gloss
Translation
Unicode
Bengali script was added to the Unicode Standard in October 1991 with the release of version 1.0.
The Unicode block for Bengali is U+0980–U+09FF:
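As a rough illustration of how the block range can be used, the following Python sketch tests whether characters fall inside U+0980–U+09FF. The names `BENGALI_BLOCK`, `in_bengali_block` and `bengali_ratio` are hypothetical, and the ratio is only a crude heuristic, not a language detector.

```python
BENGALI_BLOCK = range(0x0980, 0x0A00)  # U+0980 through U+09FF inclusive


def in_bengali_block(ch):
    """True if the single character ch lies in the Bengali Unicode block."""
    return ord(ch) in BENGALI_BLOCK


def bengali_ratio(text):
    """Fraction of non-space characters that fall in the Bengali block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(in_bengali_block(c) for c in chars) / len(chars)


print(in_bengali_block("\u0995"))  # ক, inside the block
print(in_bengali_block("k"))       # Latin letter, outside the block
print(bengali_ratio("\u09ac\u09be\u0982\u09b2\u09be abc"))
```

One caveat is that the দাড়ি dari is encoded at U+0964 in the Devanagari block and shared between scripts, so a check based only on this range does not count it as Bengali.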