ARIB STD B24 character set


Volume 1 of the Association of Radio Industries and Businesses (ARIB) STD-B24 standard for Broadcast Markup Language specifies, amongst other details, a character encoding for use in Japanese-language broadcasting. The latest revision noted here is version 6.3.
It includes a number of ARIB extended characters not found in the base standards. It was the source standard for many symbol characters which were added to Unicode, including portions of the Miscellaneous Symbols, Enclosed Alphanumeric Supplement and Enclosed Ideographic Supplement blocks. Its contributions partially overlap the Unicode emoji, but were added a year earlier, in Unicode 5.2.
The ARIB STD-B62 standard, published in 2014, defines Unicode mappings for a selection of the B24 extended characters, as well as a few extended Kanji. It also includes a mapping of utilised characters outside the Basic Multilingual Plane to the BMP's private use area.

Sets and codes

The ARIB STD-B24 standard defines multiple character sets and a method of switching between them. These include a Kanji set, an Alphanumeric set, a Hiragana set, Katakana sets of two distinct layouts, and four mosaic sets. The sets are selected using ISO 2022 mechanisms for 94-sets, using the following codes:
| Set | Type | Code (column/row) | Code (hex) | Code (character) | Comments |
| Kanji | 2-byte | 4/2 | 42 | B | The escape code B used for the ARIB Kanji set is used for the 1983 version of JIS C 6226 in ISO-2022-JP. |
| Alphanumeric | 1-byte | 4/10 | 4A | J | JIS_C6220-ro. Similar to ASCII, with two assignments differing. Escape code J matches usage in ISO-2022-JP. |
| Proportional alphanumeric | 1-byte | 3/6 | 36 | 6 | JIS_C6220-ro. Similar to ASCII, with two assignments differing. |
| Hiragana | 1-byte | 3/0 | 30 | 0 | Hiragana themselves follow the same layout as row 4 of JIS X 0208, but without a lead byte. Also adds several additional assignments for punctuation. |
| Proportional Hiragana | 1-byte | 3/7 | 37 | 7 | Hiragana themselves follow the same layout as row 4 of JIS X 0208, but without a lead byte. Also adds several additional assignments for punctuation. |
| Katakana | 1-byte | 3/1 | 31 | 1 | Katakana themselves follow the same layout as row 5 of JIS X 0208, but without a lead byte. Also adds several additional assignments for punctuation. |
| Proportional Katakana | 1-byte | 3/8 | 38 | 8 | Katakana themselves follow the same layout as row 5 of JIS X 0208, but without a lead byte. Also adds several additional assignments for punctuation. |
| JIS X 0201 Katakana | 1-byte | 4/9 | 49 | I | JIS_C6220-jp. Escape code matches usage in ISO-2022-JP-3. |
| Mosaic A | 1-byte | 3/2 | 32 | 2 | Pseudographics |
| Mosaic B | 1-byte | 3/3 | 33 | 3 | Pseudographics |
| Mosaic C | 1-byte | 3/4 | 34 | 4 | Non-spacing pseudographics |
| Mosaic D | 1-byte | 3/5 | 35 | 5 | Non-spacing pseudographics |
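The column/row notation in the table maps to a byte value as column × 16 + row, so 4/10 is 0x4A ("J"). The following minimal sketch shows that conversion and builds a designation sequence for the G0 working set. It assumes the generic ISO 2022 forms ESC $ F for the two-byte Kanji set and ESC ( F for the one-byte sets; the names `FINAL_BYTES`, `final_byte` and `g0_designation` are hypothetical, and the B24 standard's own rules for the other working sets are not covered here.

```python
ESC = 0x1B

# Final bytes in column/row notation, copied from the table above.
FINAL_BYTES = {
    "Kanji": "4/2",
    "Alphanumeric": "4/10",
    "Proportional alphanumeric": "3/6",
    "Hiragana": "3/0",
    "Proportional Hiragana": "3/7",
    "Katakana": "3/1",
    "Proportional Katakana": "3/8",
    "JIS X 0201 Katakana": "4/9",
    "Mosaic A": "3/2",
    "Mosaic B": "3/3",
    "Mosaic C": "3/4",
    "Mosaic D": "3/5",
}


def final_byte(col_row: str) -> int:
    """Convert column/row notation such as '4/10' to a byte value."""
    col, row = (int(part) for part in col_row.split("/"))
    if not (0 <= col <= 15 and 0 <= row <= 15):
        raise ValueError(f"invalid column/row notation: {col_row!r}")
    return col * 16 + row


def g0_designation(set_name: str) -> bytes:
    """Build a G0 designation sequence (assumed generic ISO 2022 form)."""
    f = final_byte(FINAL_BYTES[set_name])
    if set_name == "Kanji":
        # Assumption: multi-byte set designated to G0 with ESC $ F.
        return bytes([ESC, 0x24, f])
    # Assumption: single-byte 94-set designated to G0 with ESC ( F.
    return bytes([ESC, 0x28, f])
```

Under these assumptions, `g0_designation("Kanji")` yields the same bytes as the ISO-2022-JP sequence ESC $ B, and `g0_designation("Alphanumeric")` yields ESC ( J.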

Code charts

Kanji (double-byte) set

This is a double-byte character set extending JIS X 0208.

Lead byte

The encoding bytes correspond to the row or cell number plus 0x20 (32 in decimal). Hence, the code set whose lead byte is 0x21 has a row number of 1, and its cell 1 has a continuation byte of 0x21, and so forth. Most of the code corresponds to JIS X 0208; exceptions are shown with a heavy border.
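The row/cell arithmetic described above can be sketched as a pair of small helpers. The function names are hypothetical; the only rule used is the 0x20 offset stated in the text.

```python
from typing import Tuple


def kuten_to_bytes(row: int, cell: int) -> bytes:
    """Map a row/cell pair (each 1..94) to its two encoding bytes."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must each be in 1..94")
    return bytes([row + 0x20, cell + 0x20])


def bytes_to_kuten(lead: int, trail: int) -> Tuple[int, int]:
    """Map a lead/continuation byte pair (each 0x21..0x7E) to row/cell."""
    if not (0x21 <= lead <= 0x7E and 0x21 <= trail <= 0x7E):
        raise ValueError("bytes must each be in 0x21..0x7E")
    return lead - 0x20, trail - 0x20
```

For example, row 1 cell 1 becomes the bytes 0x21 0x21, and row 90 cell 45 becomes 0x7A 0x4D.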

Character sets 0x21-0x74 (row numbers 1-84: punctuation, alphabets, numbers, Kana, Kanji)

Character set 0x7A (row number 90, traffic symbols)

Characters 90-45 through 90-63 and 90-66 through 90-84 are listed in the B24 standard only in table 7-10, and are also the only characters in rows 90 through 91 which are not transport-related symbols; this is noted in the B24 standard in an endnote to table 7-10. The remainder of the extensions are listed in both table 7-4 and table 7-10.

Character set 0x7B (row number 91, map symbols)

Characters from ARIB STD-B24 which were not retained in ARIB STD-B62 are shown shaded.

Character set 0x7C (row number 92, units, enclosed forms, list markers, arrows)

Characters from ARIB STD-B24 which were not retained in ARIB STD-B62 are shown shaded.

Character set 0x7D (row number 93, game and weather symbols, fractions, units, enclosed forms)

Characters from ARIB STD-B24 which were not retained in ARIB STD-B62 are shown shaded.

Character set 0x7E (row number 94, list markers)

Characters from ARIB STD-B24 which were not retained in ARIB STD-B62 are shown shaded.

Single-byte sets

Alphanumeric set

Differences from US-ASCII are shown with a heavy border.

Hiragana set

Character allocations not following row 4 of JIS X 0208 are shown with a heavy border.

Katakana set

Character allocations not following row 5 of JIS X 0208 are shown with a heavy border.

JIS X 0201 Katakana set

Mosaic sets

Shift_JIS variant

In addition to the modified ISO 2022 encoding, the B24 standard also specifies a Shift JIS encoding following JIS X 0208:1997, but with the addition of the extended characters in the kanji set.
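As a rough illustration, the usual JIS X 0208 to Shift JIS byte transform can be written as below. This is a sketch only: it assumes that the extended rows (90 through 94) are placed by the same arithmetic as the standard rows, which is not stated here and should be checked against the B24 standard. The function name is hypothetical.

```python
def kuten_to_shift_jis(row: int, cell: int) -> bytes:
    """Apply the standard JIS X 0208 row/cell to Shift JIS transform."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must each be in 1..94")
    j1, j2 = row + 0x20, cell + 0x20

    # Two rows share each lead byte; rows 63 and up skip to the 0xE0 range.
    s1 = ((j1 + 1) >> 1) + (0x70 if j1 < 0x5F else 0xB0)

    if j1 % 2:
        # Odd rows use the lower half of the trail byte range, skipping 0x7F.
        s2 = j2 + 0x1F
        if s2 >= 0x7F:
            s2 += 1
    else:
        # Even rows use the upper half of the trail byte range.
        s2 = j2 + 0x7E
    return bytes([s1, s2])
```

For the JIS X 0208 rows, the result agrees with Python's built-in `shift_jis` codec; for instance, row 16 cell 1 gives the same bytes as `"亜".encode("shift_jis")`. If the assumption about the extended rows holds, row 90 cell 45 would land at 0xED 0xCB.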
