European ordering rules


The European ordering rules (EOR, ENV 13710) define an ordering for strings written in languages that use the Latin, Greek and Cyrillic alphabets. The standard covers languages used by the European Union, the European Free Trade Association, and parts of the former Soviet Union. It is a tailoring of the Common Tailorable Template of ISO/IEC 14651. EOR can in turn be tailored for different languages, but in inter-European contexts it can be used without further tailoring.

Method

Like ISO/IEC 14651, on which it is based, EOR has four levels of weights.
Level 1 sorts the letters. The following Latin letters are covered by this level, in order:
The Greek alphabet has the following order:
Cyrillic script has the following order:
The order for the three alphabets is:
  1. Latin alphabet
  2. Greek alphabet
  3. Cyrillic alphabet
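The script order above can be illustrated with a minimal Python sketch. It is not the EOR weight table: the helper names `script_rank` and `script_key` are hypothetical, the script is guessed from the first word of each character's Unicode name, and plain code point order stands in for the real letter order within each alphabet.

```python
import unicodedata

# Rank of each script at level 1, following the order listed above.
SCRIPT_RANK = {"LATIN": 0, "GREEK": 1, "CYRILLIC": 2}

def script_rank(ch):
    """Return the script rank of a character (hypothetical helper)."""
    # Assumption: the first word of the Unicode character name
    # ("LATIN", "GREEK", "CYRILLIC") identifies the script.
    first_word = unicodedata.name(ch, "").split(" ")[0]
    # Characters from any other script sort after the three listed ones.
    return SCRIPT_RANK.get(first_word, len(SCRIPT_RANK))

def script_key(word):
    # Compare by script rank first, then by code point as a stand-in
    # for the real per-alphabet letter order.
    return [(script_rank(ch), ch) for ch in word.lower()]

words = ["яблоко", "zebra", "αλφα", "apple"]
print(sorted(words, key=script_key))
```

With this key, the Latin words come first, followed by the Greek word and then the Cyrillic word.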
The Georgian and Armenian alphabets are not included in ENV 13710. However, they are covered in CR 14400:2001 "European ordering rules – Ordering for Latin, Greek, Cyrillic, Georgian and Armenian scripts". All scripts encoded in ISO/IEC 10646 and Unicode are covered by ISO/IEC 14651 as well as the Unicode Collation Algorithm, both of which are available at no charge.
Level 2 orders the additions to the letters, such as diacritics and other variations. Letters with diacritical marks are ordered as variants of the base letter. Certain other special letters are likewise ordered as modifications of a base letter, and similar cases are treated in the same way.
Level 2 defines the following order of diacritics and other modifications:
  1. Acute accent
  2. Grave accent
  3. Breve
  4. Circumflex
  5. Háček
  6. Ring
  7. Trema
  8. Double acute accent
  9. Tilde
  10. Dot
  11. Cedilla
  12. Ogonek
  13. Macron
  14. Stroke
  15. Modified letter
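As a rough illustration of how the list above could act as a tie-breaker, the following Python sketch decomposes each letter into a base letter plus combining marks and ranks the marks in the listed order. Several points are assumptions rather than part of EOR as described here: the mapping from each list entry to a Unicode combining character (for example "Ring" to the combining ring above and "Dot" to the combining dot above), the rule that an unmarked letter sorts before any marked one, and the left-to-right comparison of marks. Entries 14 and 15 are left out because such letters do not decompose into a base letter plus a combining mark. The names `DIACRITIC_RANK` and `level2_key` are hypothetical.

```python
import unicodedata

# Combining marks ranked in the order of the list above (entries 1 to 13).
DIACRITIC_RANK = {
    "\u0301": 1,   # acute accent
    "\u0300": 2,   # grave accent
    "\u0306": 3,   # breve
    "\u0302": 4,   # circumflex
    "\u030C": 5,   # hacek (caron)
    "\u030A": 6,   # ring above
    "\u0308": 7,   # trema (diaeresis)
    "\u030B": 8,   # double acute accent
    "\u0303": 9,   # tilde
    "\u0307": 10,  # dot above
    "\u0327": 11,  # cedilla
    "\u0328": 12,  # ogonek
    "\u0304": 13,  # macron
}

def level2_key(word):
    """Sort key: base letters first, diacritics only as a tie-breaker."""
    bases = []  # level 1: base letters with case ignored
    marks = []  # level 2: one tuple of mark ranks per base letter
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            if marks:
                # Unknown marks sort after all listed ones (assumption).
                rank = DIACRITIC_RANK.get(ch, len(DIACRITIC_RANK) + 1)
                marks[-1].append(rank)
        else:
            bases.append(ch.lower())
            marks.append([])
    return ("".join(bases), tuple(tuple(m) for m in marks))

print(sorted(["ê", "è", "e", "é"], key=level2_key))
```

Because the base letters are compared first, a difference in diacritics only matters when two strings are otherwise identical at level 1.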
Level 3 distinguishes between capital and small letters, as in "Polish" and "polish".
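A case tie-breaker of this kind might be sketched in Python as follows. The text above does not say which case sorts first, so placing small letters before capitals is an assumption, as is the hypothetical function name `case_key`; a case-insensitive comparison stands in for levels 1 and 2.

```python
def case_key(word):
    # Stand-in for levels 1 and 2: compare while ignoring case.
    # Level 3: per-letter case flags break the remaining ties.
    # Assumption: small letters sort before capitals (False < True).
    return (word.casefold(), tuple(ch.isupper() for ch in word))

print(sorted(["Polish", "polish", "Polar"], key=case_key))
```

Here "Polar" still precedes both spellings of "polish", because case is consulted only when the letters themselves are equal.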
Level 4 concerns punctuation and whitespace characters. This level makes the distinction between "MacDonald" and "Mac Donald", "its" and "it's".
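The effect of treating punctuation and whitespace as a last-resort distinction can be sketched in Python. This is a simplification under stated assumptions: every non-alphanumeric character is treated as ignorable at the earlier levels, case is ignored altogether, a string without such characters sorts before an otherwise equal string that has them, and the function name `punctuation_key` is hypothetical.

```python
def punctuation_key(word):
    letters = []     # characters that take part in the earlier levels
    ignorables = []  # (position, character) pairs compared only at level 4
    for ch in word:
        if ch.isalnum():
            letters.append(ch.casefold())
        else:
            # Record where the punctuation or space occurred.
            ignorables.append((len(letters), ch))
    return ("".join(letters), tuple(ignorables))

words = ["Mac Donald", "itself", "it's", "MacDonald", "its"]
print(sorted(words, key=punctuation_key))
```

In this sketch "its" and "it's" end up adjacent, ahead of "itself", because the apostrophe is ignored until the final comparison.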
An optional, and usually omitted, fifth level can distinguish typographical differences, including whether the text is italic, normal or bold.