Language code


A language code is a code that assigns letters or numbers as identifiers or classifiers for languages. These codes may be used to organize library collections or presentations of data, to choose the correct localizations and translations in computing, and as a shorthand designation for longer forms of language names.

Difficulties of classification

Language code schemes attempt to classify the complex world of human languages, dialects, and variants. Most schemes make some compromises between being general and being complete enough to support specific dialects.
For example, Spanish is the main language of most countries in Central America and South America. The Spanish spoken in Mexico differs slightly from the Spanish spoken in Peru, and different regions of Mexico have slightly different dialects and accents of their own. A language code scheme might group all of these as "Spanish" for choosing a keyboard layout, group most of them as "Spanish" for general usage, or separate each dialect to allow region-specific idioms.
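The trade-off between general and dialect-specific grouping can be illustrated with a minimal Python sketch. It assumes tags of the form language-region, in the style of the IETF language tags listed under Common schemes. The sample tags, the country subtags MX, PE and ES (ISO 3166-1 codes), the mapping to the broader region 419, and the helper names group_key and group are all assumptions made for illustration, not part of any scheme.

```python
from collections import defaultdict

# Hypothetical sample data: one tag per speaker, "language-region".
SPEAKER_TAGS = ["es-MX", "es-PE", "es-ES", "es-MX"]

# Assumed mapping from a country subtag to a broader region subtag
# (419 is the UN M.49 code for Latin America and the Caribbean).
BROADER_REGION = {"MX": "419", "PE": "419"}


def group_key(tag, granularity):
    """Collapse a tag to the chosen granularity (hypothetical helper)."""
    language, _, region = tag.partition("-")
    if granularity == "language":      # e.g. choosing a keyboard layout
        return language
    if granularity == "region":        # e.g. general usage
        broader = BROADER_REGION.get(region)
        return f"{language}-{broader}" if broader else tag
    return tag                         # "dialect": keep each variety apart


def group(tags, granularity):
    """Group tags under the key produced by group_key."""
    groups = defaultdict(list)
    for tag in tags:
        groups[group_key(tag, granularity)].append(tag)
    return dict(groups)
```

With this sample data, grouping by "language" puts every speaker under es, grouping by "region" separates es-419 from es-ES, and grouping by "dialect" keeps es-MX, es-PE and es-ES apart. Which level is appropriate depends on the application, as the paragraph above describes.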

Common schemes

Some common language code schemes include:
Scheme: Glottolog codes
Notes: Created for minority languages as a scientific alternative to the industrial ISO 639‑3 standard. Intentionally do not resemble abbreviations.
Example for Spanish:
  • stan1288 – standard Spanish
  • olds1249 – Old Spanish
  • cast1243 – Castilic

Scheme: IETF language tag
Notes: An IETF best practice, currently specified by RFC 5646 and RFC 4647, for language tags easy to parse by computer. The tag system is extensible to region, dialect, and private designations.
Example for English:
  • en – English, as shortest ISO 639 code.
  • en-US – English as used in the United States
Source: IETF memo
Example for Spanish:
  • es – Spanish, as shortest ISO 639 code.
  • es-419 – Spanish appropriate for the Latin America and Caribbean region, using the UN M.49 region code

Scheme: ISO 639‑1
Notes: Two-letter code system made official in 2002, containing 136 codes. Many systems use two-letter ISO 639‑1 codes supplemented by three-letter ISO 639‑2 codes when no two-letter code is applicable. See: List of ISO 639-1 codes
Example for English:
  • en
Example for Spanish:
  • es – Spanish

Scheme: ISO 639‑2
Notes: Three-letter system of 464 codes. See: List of ISO 639-2 codes
Example for English:
  • eng – three-letter code
  • enm – Middle English, c. 1100–1500
  • ang – Old English, c. 450–1100
  • cpe – other English-based creoles and pidgins
Example for Spanish:
  • spa – Spanish

Scheme: ISO 639‑3
Notes: An extension of ISO 639‑2 to cover all known, living or dead, spoken or written languages in 7,589 entries. See: List of ISO 639-3 codes
Example for Spanish:
  • spa – Spanish
  • spq – Spanish, Loreto-Ucayali
  • ssp – Spanish sign language

Scheme: Linguasphere Register code-system
Notes: Two-digit + one to six letter Linguasphere Register code-system published in 2000, containing over 32,000 codes within 10 sectors of reference, covering the world's languages and speech communities. Navigate also the hierarchy of the Linguasphere Register code-system published online by hortensj-garden.org
Example for English: Within hierarchy of Linguasphere Register code-system:
  • 5= Indo-European phylosector
  • 52= Germanic phylozone
  • 52-A Germanic set
  • 52-AB English + Anglo-Creole chain
  • 52-ABA English net
  • 52-ABA-c Global English outer unit (52-ABA-ca to 52-ABA-cwe)
Compare: 52-ABA-a Scots + Northumbrian outer unit & 52-ABA-b "Anglo-English" outer unit
Example for Spanish: Within hierarchy of Linguasphere Register code-system:
  • 5= Indo-European phylosector
  • 51= Romanic phylozone
  • 51-A Romance set
  • 51-AA Romance chain
  • 51-AAA West Romance net
  • 51-AAA-b Español/Castellano outer unit (51-AAA-ba to 51-AAA-bkk)
Compare: 51-AAA-a Português + Galego outer unit & 51-AAA-c Astur + Leonés outer unit, etc.

Scheme: SIL codes
Notes: Codes created for use in the Ethnologue, a publication of SIL International that lists language statistics. The publication now uses ISO 639‑3 codes.
Example for English:
  • ENG
Example for Spanish:
  • SPN

Scheme: Verbix language codes
Notes: Constructed codes starting with old SIL codes and adding more information.
Example for English:
  • ENG
Example for Spanish:
  • SPN
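
Two ideas from the schemes listed above lend themselves to a short illustration: preferring the shortest ISO 639 code, and splitting an IETF language tag into its parts. The Python sketch below is a simplified illustration, not an implementation of RFC 5646 or RFC 4647. It assumes tags made only of a language subtag and an optional region subtag, whereas real tags may also carry script, variant, extension and private-use subtags. The function names shortest_code, split_tag and fallbacks are hypothetical, and the lookup table holds only the English and Spanish codes given above.

```python
# Two-letter codes keyed by three-letter code, limited to the examples
# listed in this article; a real system would load a full registry.
TWO_LETTER = {"eng": "en", "spa": "es"}


def shortest_code(three_letter):
    """Prefer the two-letter code; otherwise keep the three-letter one."""
    return TWO_LETTER.get(three_letter, three_letter)


def split_tag(tag):
    """Split a simple 'language[-region]' tag into (language, region)."""
    parts = tag.split("-")
    if len(parts) > 2:
        # Script, variant and other subtags are out of scope for this sketch.
        raise ValueError(f"unsupported tag in this sketch: {tag!r}")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) == 2 else None
    return language, region


def fallbacks(tag):
    """List the tag and its more general form, most specific first."""
    language, region = split_tag(tag)
    return [f"{language}-{region}", language] if region else [language]
```

For instance, shortest_code("spa") gives es, while shortest_code("enm") stays enm because this table holds no two-letter code for Middle English. fallbacks("es-419") gives es-419 followed by es, so an application with no Latin American Spanish resources could fall back to generic Spanish. Dropping subtags from the end of a tag in this way is loosely modelled on the lookup approach described in RFC 4647.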