ISO 639 macrolanguage


A macrolanguage is a book-keeping mechanism for the ISO 639 international standard for language codes. Macrolanguages are established to assist mapping between different sets of ISO language codes. Specifically, there may be a many-to-one correspondence between ISO 639-3, intended to identify all the thousands of languages of the world, and either of two other sets, ISO 639-1, established to identify languages in computer systems, and ISO 639-2, which encodes a few hundred languages for library cataloguing and bibliographic purposes. When such many-to-one ISO 639-2 codes are included in an ISO 639-3 context, they are called "macrolanguages" to distinguish them from the corresponding individual languages of ISO 639-3. According to the ISO,
ISO 639-3 is curated by SIL International; ISO 639-2 is curated by the Library of Congress.
The mapping often implies that it covers borderline cases where two language varieties may be considered either strongly divergent dialects of the same language or very closely related languages; it may also encompass situations in which language varieties are considered to be varieties of the same language on ethnic, cultural, and political grounds rather than for linguistic reasons. However, this is not its primary function, and the classification is not evenly applied.
For example, Chinese is a macrolanguage encompassing many languages that are not mutually intelligible, but the languages "Standard German", "Bavarian German", and other closely related languages do not form a macrolanguage, despite being more mutually intelligible. Other examples include Tajiki not being part of the Persian macrolanguage despite sharing much lexicon, and Urdu and Hindi not forming a macrolanguage despite forming a mutually intelligible dialect continuum. Even the dialects of Hindi are all treated as separate languages. In essence, ISO 639-2 and ISO 639-3 use different criteria for dividing language varieties into languages: ISO 639-2 relies more on shared writing systems and literature, whereas ISO 639-3 focuses on mutual intelligibility and shared lexicon. The macrolanguages exist within the ISO 639-3 code set to make mapping between the two sets easier.
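The many-to-one relationship described above can be pictured as a simple lookup from individual codes to a macrolanguage code. The following is a minimal sketch, not an official API: the names `INDIVIDUAL_TO_MACRO` and `to_macrolanguage` are hypothetical, and the dictionary holds only a small illustrative subset of the real mappings.

```python
# Illustrative subset only: individual ISO 639-3 code -> macrolanguage code.
# The dictionary and function names are hypothetical, chosen for this sketch.
INDIVIDUAL_TO_MACRO = {
    "nob": "nor",  # Norwegian Bokmal
    "nno": "nor",  # Norwegian Nynorsk
    "cmn": "zho",  # Mandarin Chinese
    "yue": "zho",  # Yue Chinese
}


def to_macrolanguage(code: str) -> str:
    """Return the enclosing macrolanguage code, or the code itself if none."""
    return INDIVIDUAL_TO_MACRO.get(code, code)


print(to_macrolanguage("nob"))  # nor
print(to_macrolanguage("cmn"))  # zho
print(to_macrolanguage("deu"))  # deu (Standard German has no macrolanguage)
print(to_macrolanguage("bar"))  # bar (Bavarian has no macrolanguage)
```

Codes with no macrolanguage, such as those for Standard German and Bavarian, simply map to themselves in this sketch.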
The use of macrolanguages was applied in Ethnologue, starting in the 16th edition. There are fifty-eight language codes in ISO 639-2 that are counted as macrolanguages in ISO 639-3, but new macrolanguages are no longer being created, as current databases are sufficient to indicate the relationships between codes.
Some of the macrolanguages had no individual language in ISO 639-2, e.g. "ara", but ISO 639-3 recognizes different varieties of Arabic as separate languages under some circumstances. Others, like "nor", already had their two individual parts in ISO 639-2. This means that some languages considered by ISO 639-2 to be dialects of one language are, in certain contexts, considered in ISO 639-3 to be individual languages themselves. This is an attempt to deal with varieties that may be linguistically distinct from each other but are treated by their speakers as forms of the same language, e.g. in cases of diglossia. For example,
ISO 639-2 also includes codes for collections of languages; these are not the same as macrolanguages. These collections of languages are excluded from ISO 639-3, because they never refer to individual languages. Most such codes are included in ISO 639-5.

Types of macrolanguages

This list only includes official data from https://iso639-3.sil.org/code_tables/macrolanguage_mappings/data.
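| ISO 639-1 | ISO 639-2 | ISO 639-3 | Number of individual languages | Name of macrolanguage |
|---|---|---|---|---|
| ak | aka | aka | 2 | Akan language |
| ar | ara | ara | 29 + retired 1 | Arabic language |
| ay | aym | aym | 2 | Aymara language |
| az | aze | aze | 2 | Azerbaijani language |
| | bal | bal | 3 | Baluchi language |
| | bik | bik | 8 + retired 1 | Bikol language |
| | | bnc | 5 | Bontok language |
| | bua | bua | 3 | Buriat language |
| | chm | chm | 2 | Mari language |
| cr | cre | cre | 6 | Cree language |
| | del | del | 2 | Delaware language |
| | den | den | 2 | Slavey language |
| | din | din | 5 | Dinka language |
| | doi | doi | 2 | Dogri language |
| et | est | est | 2 | Estonian language |
| fa | fas/per | fas | 2 | Persian language |
| ff | ful | ful | 9 | Fulah language |
| | gba | gba | 6 + retired 1 | Gbaya language |
| | gon | gon | 3 + retired 1 | Gondi language |
| | grb | grb | 5 | Grebo language |
| gn | grn | grn | 5 | Guaraní language |
| | hai | hai | 2 | Haida language |
| | | hbs | 4 | Serbo-Croatian |
| | hmn | hmn | 25 + retired 1 | Hmong language |
| iu | iku | iku | 2 | Inuktitut language |
| ik | ipk | ipk | 2 | Inupiaq language |
| | jrb | jrb | 5 | Judeo-Arabic languages |
| kr | kau | kau | 3 | Kanuri language |
| | | kln | 9 | Kalenjin languages |
| | kok | kok | 2 | Konkani language |
| kv | kom | kom | 2 | Komi language |
| kg | kon | kon | 3 | Kongo language |
| | kpe | kpe | 2 | Kpelle language |
| ku | kur | kur | 3 | Kurdish language |
| | lah | lah | 7 + retired 1 | Lahnda language |
| lv | lav | lav | 2 | Latvian language |
| | | luy | 14 | Luyia language |
| | man | man | 6 + retired 1 | Manding languages |
| mg | mlg | mlg | 11 + retired 1 | Malagasy language |
| mn | mon | mon | 2 | Mongolian language |
| ms | msa/may | msa | 36 + retired 1 | Malay language |
| | mwr | mwr | 6 | Marwari language |
| ne | nep | nep | 2 | Nepali language |
| no | nor | nor | 2 | Norwegian language |
| oj | oji | oji | 7 | Ojibwa language |
| or | ori | ori | 2 | Oriya language |
| om | orm | orm | 4 | Oromo language |
| ps | pus | pus | 3 | Pashto language |
| qu | que | que | 43 + retired 1 | Quechua language |
| | raj | raj | 6 | Rajasthani language |
| | rom | rom | 7 | Romany language |
| sq | sqi/alb | sqi | 4 | Albanian language |
| sc | srd | srd | 4 | Sardinian language |
| sw | swa | swa | 2 | Swahili language |
| | syr | syr | 2 | Syriac language |
| | tmh | tmh | 4 | Tuareg languages |
| uz | uzb | uzb | 2 | Uzbek language |
| yi | yid | yid | 2 | Yiddish language |
| | zap | zap | 57 + retired 1 | Zapotec language |
| za | zha | zha | 16 + retired 2 | Zhuang languages |
| zh | zho/chi | zho | 16 | Chinese language |
| | zza | zza | 2 | Zaza language |
| 33 | 58 | 62 | 440 + retired 13 | total codes |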
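As a sanity check, the totals in the last row of the table can be recomputed from the per-row counts. The sketch below copies the "number of individual languages" figures from the table into a dictionary keyed by ISO 639-3 macrolanguage code; the name `COUNTS` and the `(active, retired)` tuple layout are assumptions made for this example, and the keys bnc, hbs, kln and luy stand for the four rows that have no ISO 639-2 code.

```python
# (active, retired) member counts per macrolanguage, copied from the table.
# The variable name and tuple layout are choices made for this sketch.
COUNTS = {
    "aka": (2, 0), "ara": (29, 1), "aym": (2, 0), "aze": (2, 0),
    "bal": (3, 0), "bik": (8, 1), "bnc": (5, 0), "bua": (3, 0),
    "chm": (2, 0), "cre": (6, 0), "del": (2, 0), "den": (2, 0),
    "din": (5, 0), "doi": (2, 0), "est": (2, 0), "fas": (2, 0),
    "ful": (9, 0), "gba": (6, 1), "gon": (3, 1), "grb": (5, 0),
    "grn": (5, 0), "hai": (2, 0), "hbs": (4, 0), "hmn": (25, 1),
    "iku": (2, 0), "ipk": (2, 0), "jrb": (5, 0), "kau": (3, 0),
    "kln": (9, 0), "kok": (2, 0), "kom": (2, 0), "kon": (3, 0),
    "kpe": (2, 0), "kur": (3, 0), "lah": (7, 1), "lav": (2, 0),
    "luy": (14, 0), "man": (6, 1), "mlg": (11, 1), "mon": (2, 0),
    "msa": (36, 1), "mwr": (6, 0), "nep": (2, 0), "nor": (2, 0),
    "oji": (7, 0), "ori": (2, 0), "orm": (4, 0), "pus": (3, 0),
    "que": (43, 1), "raj": (6, 0), "rom": (7, 0), "sqi": (4, 0),
    "srd": (4, 0), "swa": (2, 0), "syr": (2, 0), "tmh": (4, 0),
    "uzb": (2, 0), "yid": (2, 0), "zap": (57, 1), "zha": (16, 2),
    "zho": (16, 0), "zza": (2, 0),
}

# Sum the active and retired member counts across all macrolanguages.
active_total = sum(active for active, _ in COUNTS.values())
retired_total = sum(retired for _, retired in COUNTS.values())

print(len(COUNTS), active_total, retired_total)  # 62 440 13
```

The recomputed figures agree with the table's total row: 62 macrolanguage codes, 440 individual languages, and 13 retired codes.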

List of macrolanguages and the individual languages

This is a complete list of the individual language codes that comprise the macrolanguages in the ISO 639-3 code tables as of 2020.
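A list of this kind can be derived mechanically from a table of (macrolanguage, individual language, status) rows. The following is a minimal sketch that assumes a tab-separated layout with the column names `M_Id`, `I_Id` and `I_Status` (with `A` for active and `R` for retired); that layout is an assumption about the published mapping file, not something stated in this article. The sample rows use codes from the qaa–qtz range, which is reserved for local use, so they are placeholders rather than real assignments.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the assumed three-column layout (M_Id, I_Id, I_Status).
# The qaa-qtz range is reserved for local use, so these rows are placeholders.
SAMPLE = (
    "M_Id\tI_Id\tI_Status\n"
    "qaa\tqab\tA\n"
    "qaa\tqac\tA\n"
    "qaa\tqad\tR\n"
    "qba\tqbb\tA\n"
)


def group_members(text):
    """Group member codes by macrolanguage, split into active and retired."""
    groups = defaultdict(lambda: {"active": [], "retired": []})
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        bucket = "active" if row["I_Status"] == "A" else "retired"
        groups[row["M_Id"]][bucket].append(row["I_Id"])
    return dict(groups)


groups = group_members(SAMPLE)
for macro, members in groups.items():
    # Mirrors the "N + retired M" notation used for the counts in this article.
    print(macro, len(members["active"]), "+ retired", len(members["retired"]))
```

Keeping retired codes in a separate bucket matches the way this article reports them, as codes that were "previously part of" a macrolanguage.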

aaa–ezz

aka

aka is the ISO 639-3 language code for Akan. Its ISO 639-1 code is ak. There are two individual language codes assigned:
ara is the ISO 639-3 language code for Arabic. Its ISO 639-1 code is ar. There are twenty-nine individual language codes assigned:
The following codes were previously part of ara:
aym is the ISO 639-3 language code for Aymara. Its ISO 639-1 code is ay. There are two individual language codes assigned:
aze is the ISO 639-3 language code for Azerbaijani. Its ISO 639-1 code is az. There are two individual language codes assigned:
bal is the ISO 639-3 language code for Baluchi. There are three individual language codes assigned:
bik is the ISO 639-3 language code for Bikol. There are eight individual language codes assigned:
The following code was previously part of bik:
bnc is the ISO 639-3 language code for Bontok. There are five individual language codes assigned:
bua is the ISO 639-3 language code for Buriat. There are three individual language codes assigned:
chm is the ISO 639-3 language code for Mari, a language spoken in Russia. There are two individual language codes assigned:
cre is the ISO 639-3 language code for Cree. Its ISO 639-1 code is cr. There are six individual language codes assigned:
In addition, there are six closely associated individual codes:
In addition, there is one other language, without an individual code, that is closely associated with, but not part of, this macrolanguage code:
del is the ISO 639-3 language code for Delaware. There are two individual language codes assigned:
den is the ISO 639-3 language code for Slave. There are two individual language codes assigned:
din is the ISO 639-3 language code for Dinka. There are five individual language codes assigned:
doi is the ISO 639-3 language code for Dogri. There are two individual language codes assigned:
est is the ISO 639-3 language code for Estonian. Its ISO 639-1 code is et. There are two individual language codes assigned:

fas

fas is the ISO 639-3 language code for Persian. Its ISO 639-1 code is fa. There are two individual language codes assigned:
ful is the ISO 639-2 and ISO 639-3 language code for Fulah. Its ISO 639-1 code is ff. There are nine individual language codes assigned for varieties of Fulah:
gba is the ISO 639-3 language code for Gbaya, spoken in the Central African Republic. There are six individual language codes assigned:
The following code was previously part of gba:
gon is the ISO 639-3 language code for Gondi. There are three individual language codes assigned:
The following code was previously part of gon:
grb is the ISO 639-3 language code for Grebo. There are five individual language codes assigned:
grn is the ISO 639-3 language code for Guarani. Its ISO 639-1 code is gn. There are five individual language codes assigned:
hai is the ISO 639-3 language code for Haida. There are two individual language codes assigned:
hbs is the ISO 639-3 language code for Serbo-Croatian. There are four individual language codes assigned:
hmn is the ISO 639-3 language code for Hmong. There are twenty-five individual language codes assigned:
The following code was previously part of hmn:
iku is the ISO 639-3 language code for Inuktitut. Its ISO 639-1 code is iu. There are two individual language codes assigned:
ipk is the ISO 639-3 language code for Inupiaq. Its ISO 639-1 code is ik. There are two individual language codes assigned:
jrb is the ISO 639-3 language code for Judeo-Arabic. There are five individual language codes assigned:

kau

kau is the ISO 639-2 and ISO 639-3 language code for Kanuri. Its ISO 639-1 code is kr. There are three individual language codes assigned in ISO 639-3 for varieties of Kanuri:
There are two other related languages that are not considered part of the macrolanguage under ISO 639:
kln is the ISO 639-3 language code for Kalenjin. There are nine individual language codes assigned:
kok is the ISO 639-3 language code for Konkani. There are two individual language codes assigned:
Both languages are referred to as Konkani by their respective speakers.

kom

kom is the ISO 639-3 language code for Komi. Its ISO 639-1 code is kv. There are two individual language codes assigned:
kon is the ISO 639-3 language code for Kongo. Its ISO 639-1 code is kg. There are three individual language codes assigned:
kpe is the ISO 639-3 language code for Kpelle. There are two individual language codes assigned:
kur is the ISO 639-3 language code for Kurdish. Its ISO 639-1 code is ku. There are three individual language codes assigned:
lah is the ISO 639-3 language code for Lahnda. There are seven individual language codes assigned.
Note that lah does not include Panjabi/Punjabi.
The following code was previously part of lah:
lav is the ISO 639-3 language code for Latvian. Its ISO 639-1 code is lv. There are two individual language codes assigned:
luy is the ISO 639-3 language code for Luyia. There are fourteen individual language codes assigned:
man is the ISO 639-3 language code for Mandingo. There are six individual language codes assigned:
The following codes were previously part of man:
mlg is the ISO 639-3 language code for Malagasy. Its ISO 639-1 code is mg. There are eleven individual language codes assigned:
The following codes were previously part of mlg:
mon is the ISO 639-3 language code for Mongolian. Its ISO 639-1 code is mn. There are two individual language codes assigned:
msa is the ISO 639-3 language code for Malay. Its ISO 639-1 code is ms. There are thirty-six individual language codes assigned:
The following code was previously part of msa:
mwr is the ISO 639-3 language code for Marwari. There are six individual language codes assigned:
nep is the ISO 639-3 language code for Nepali. Its ISO 639-1 code is ne. There are two individual language codes assigned:
nor is the ISO 639-3 language code for Norwegian. Its ISO 639-1 code is no. There are two individual language codes assigned:
oji is the ISO 639-3 language code for Ojibwa. Its ISO 639-1 code is oj. There are seven individual language codes assigned:
In addition, there are three closely associated individual codes:
In addition, there are two other languages, without individual codes, that are closely associated with, but not part of, this macrolanguage code:
ori is the ISO 639-3 language code for Oriya. Its ISO 639-1 code is or. There are two individual language codes assigned:
orm is the ISO 639-3 language code for Oromo. Its ISO 639-1 code is om. There are four individual language codes assigned:

pus

pus is the ISO 639-3 language code for Pushto. Its ISO 639-1 code is ps. There are three individual language codes assigned:
que is the ISO 639-3 language code for Quechua. Its ISO 639-1 code is qu. There are forty-three individual language codes assigned:
The following code was previously part of que:
raj is the ISO 639-3 language code for Rajasthani. There are six individual language codes assigned:
rom is the ISO 639-3 language code for Romany. There are seven individual language codes assigned:
In addition, there are eight individual codes that are not part of this macrolanguage; they are categorized as mixed languages:
In addition, there is a language without an individual code assigned, which is not part of this macrolanguage:
sqi is the ISO 639-3 language code for Albanian. Its ISO 639-1 code is sq. There are four individual language codes assigned:
srd is the ISO 639-3 language code for Sardinian. Its ISO 639-1 code is sc. There are four individual language codes assigned:
swa is the ISO 639-3 language code for Swahili. Its ISO 639-1 code is sw. There are two individual language codes assigned:
syr is the ISO 639-3 language code for Syriac. There are two individual language codes assigned:
tmh is the ISO 639-3 language code for Tamashek. There are four individual language codes assigned:
uzb is the ISO 639-3 language code for Uzbek. Its ISO 639-1 code is uz. There are two individual language codes assigned:
yid is the ISO 639-3 language code for Yiddish. Its ISO 639-1 code is yi. There are two individual language codes assigned:
zap is the ISO 639-3 language code for Zapotec. There are fifty-seven individual language codes assigned.
The following codes were previously part of zap:
In addition, there is an individual code that is not part of this macrolanguage because it is categorized as a historical language:
zha is the ISO 639-3 language code for Zhuang. Its ISO 639-1 code is za. There are sixteen individual language codes assigned:
The following codes were previously part of zha:
zho is the ISO 639-3 language code for Chinese. Its ISO 639-1 code is zh. There are sixteen individual language codes assigned, most of which are not actually languages but rather groups of Sinitic languages distinguished by isoglosses:
Although the Dungan language is a dialect of Mandarin, it is not listed under Chinese in ISO 639-3 due to its separate historical and cultural development.
ISO 639 also lists codes for Old Chinese and Late Middle Chinese. They are not listed under Chinese in ISO 639-3 because they are categorized as ancient and historical languages, respectively.

zza

zza is the ISO 639-3 language code for Zaza. There are two individual language codes assigned: