Basic Latin (Unicode block)


The Basic Latin Unicode block, also called C0 Controls and Basic Latin, is the first block of the Unicode Standard and the only block whose characters are encoded in a single byte in UTF-8. The block contains all the letters and control codes of the ASCII encoding. It ranges from U+0000 to U+007F, contains 128 characters, and includes the C0 controls, ASCII punctuation and symbols, ASCII digits, the uppercase and lowercase letters of the English alphabet, and the Delete control character.
The Basic Latin block has been included in its present form since version 1.0.0 of the Unicode Standard, with no additions or alterations to the character repertoire.
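
The single-byte property can be checked directly. The following is a minimal Python sketch, not part of the standard itself; the variable names are illustrative. It encodes every code point in the block with UTF-8 and compares the result with the byte of the same numeric value.

```python
# Every code point in Basic Latin (U+0000..U+007F) should encode to exactly
# one UTF-8 byte whose value equals the code point itself.
basic_latin = range(0x0000, 0x0080)

single_byte = all(len(chr(cp).encode("utf-8")) == 1 for cp in basic_latin)
same_as_code_point = all(chr(cp).encode("utf-8") == bytes([cp]) for cp in basic_latin)

# U+0080 is the first code point outside the block; it needs two bytes.
first_outside = chr(0x0080).encode("utf-8")

print(len(basic_latin), single_byte, same_as_code_point, len(first_outside))
# prints: 128 True True 2
```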

Table of characters

Subheadings

The C0 Controls and Basic Latin block contains six subheadings.

C0 controls

The C0 Controls, referred to as C0 ASCII control codes in version 1.0, are inherited from ASCII and other 7-bit and 8-bit encoding schemes. The alias names for the C0 controls are taken from ISO/IEC 6429:1992.

ASCII punctuation and symbols

This subheading refers to standard punctuation characters, simple mathematical operators, and symbols like the dollar sign, percent, ampersand, underscore, and pipe.

ASCII digits

The ASCII Digits subheading contains the ten standard European digits, 0 to 9.

Uppercase Latin alphabet

The Uppercase Latin alphabet subheading contains the standard 26-letter unaccented Latin alphabet in the majuscule.

Lowercase Latin alphabet

The Lowercase Latin Alphabet subheading contains the standard 26-letter unaccented Latin alphabet in the minuscule.

Control character

The Control Character subheading contains the "Delete" character.

Number of symbols, letters and control codes

The table below shows the number of letters, symbols and control codes in each of the subheadings in the C0 Controls and Basic Latin block.

| Type of subheading | Number of symbols | Range of characters |
| --- | --- | --- |
| C0 controls | 32 control codes | U+0000 to U+001F |
| ASCII punctuation and symbols | 33 punctuation marks and symbols | U+0020 to U+002F, U+003A to U+0040, U+005B to U+0060 and U+007B to U+007E |
| ASCII digits | 10 digits | U+0030 to U+0039 |
| Uppercase Latin alphabet | 26 unaccented Latin letters in the majuscule | U+0041 to U+005A |
| Lowercase Latin alphabet | 26 unaccented Latin letters in the minuscule | U+0061 to U+007A |
| Control character | 1 control code, the "Delete" character | U+007F |
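
The counts in the table can be reproduced from the ranges alone. The sketch below is illustrative: the `SUBHEADINGS` list and the `subheading` helper are hypothetical names, with the ranges copied from the table. It assigns each of the 128 code points to a subheading and tallies the result.

```python
from collections import Counter

# Ranges copied from the table; each subheading maps to one or more
# inclusive (first, last) code point ranges.
SUBHEADINGS = [
    ("C0 controls", [(0x0000, 0x001F)]),
    ("ASCII punctuation and symbols",
     [(0x0020, 0x002F), (0x003A, 0x0040), (0x005B, 0x0060), (0x007B, 0x007E)]),
    ("ASCII digits", [(0x0030, 0x0039)]),
    ("Uppercase Latin alphabet", [(0x0041, 0x005A)]),
    ("Lowercase Latin alphabet", [(0x0061, 0x007A)]),
    ("Control character", [(0x007F, 0x007F)]),
]


def subheading(cp: int) -> str:
    """Return the subheading that contains code point cp (hypothetical helper)."""
    for name, ranges in SUBHEADINGS:
        if any(first <= cp <= last for first, last in ranges):
            return name
    raise ValueError(f"U+{cp:04X} is outside the Basic Latin block")


counts = Counter(subheading(cp) for cp in range(0x80))
for name, _ in SUBHEADINGS:
    print(f"{name}: {counts[name]}")
```

The six tallies are 32, 33, 10, 26, 26 and 1, which sum to the 128 characters of the block.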

Block

Variants

Several characters in the block have standardized variants, which are selected by following the character with a variation selector.
A variant is defined for a zero with a short diagonal stroke: U+0030 DIGIT ZERO, U+FE00 VS1.
Twelve characters can be followed by U+FE0E VS15 or U+FE0F VS16 to create emoji variants.
They are keycap base characters, for example #️⃣. The VS15 version is "text presentation" while the VS16 version is "emoji-style".
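
| U+ | 0023 | 002A | 0030 | 0031 | 0032 | 0033 | 0034 | 0035 | 0036 | 0037 | 0038 | 0039 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| base | # | * | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| base+VS15+keycap | #︎⃣ | *︎⃣ | 0︎⃣ | 1︎⃣ | 2︎⃣ | 3︎⃣ | 4︎⃣ | 5︎⃣ | 6︎⃣ | 7︎⃣ | 8︎⃣ | 9︎⃣ |
| base+VS16+keycap | #️⃣ | *️⃣ | 0️⃣ | 1️⃣ | 2️⃣ | 3️⃣ | 4️⃣ | 5️⃣ | 6️⃣ | 7️⃣ | 8️⃣ | 9️⃣ |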
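
The sequences described above can be assembled programmatically. This is a minimal sketch under stated assumptions: the `keycap` helper and the constant names are hypothetical, and the keycap mark is taken to be U+20E3 COMBINING ENCLOSING KEYCAP. Whether the text or emoji form is actually displayed depends on the font and renderer.

```python
VS1 = "\uFE00"     # VARIATION SELECTOR-1
VS15 = "\uFE0E"    # VARIATION SELECTOR-15, text presentation
VS16 = "\uFE0F"    # VARIATION SELECTOR-16, emoji presentation
KEYCAP = "\u20E3"  # COMBINING ENCLOSING KEYCAP (assumed keycap mark)

# The twelve keycap base characters in the Basic Latin block.
KEYCAP_BASES = "#*0123456789"


def keycap(base: str, emoji: bool = True) -> str:
    """Build base + variation selector + keycap (hypothetical helper)."""
    if len(base) != 1 or base not in KEYCAP_BASES:
        raise ValueError(f"{base!r} is not a keycap base character")
    return base + (VS16 if emoji else VS15) + KEYCAP


# Zero with a short diagonal stroke: U+0030 followed by VS1.
slashed_zero = "0" + VS1

text_style = [keycap(b, emoji=False) for b in KEYCAP_BASES]
emoji_style = [keycap(b, emoji=True) for b in KEYCAP_BASES]

print([f"U+{ord(c):04X}" for c in keycap("#")])
# prints: ['U+0023', 'U+FE0F', 'U+20E3']
```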

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Basic Latin block: