ISO/IEC 8859

ISO/IEC 8859 is a joint ISO and IEC series of standards for 8-bit character encodings. The series consists of numbered parts, such as ISO/IEC 8859-1, ISO/IEC 8859-2, etc. There are 15 parts, excluding the abandoned ISO/IEC 8859-12. The ISO working group maintaining this series of standards has been disbanded.

ISO/IEC 8859 parts 1, 2, 3, and 4 were originally Ecma International standard ECMA-94.

Introduction

While the bit patterns of the 95 printable ASCII characters are sufficient to exchange information in modern English, most other languages that use the Latin alphabet need additional symbols not covered by ASCII, such as ß (German), ñ (Spanish), å (Swedish and other Nordic languages) and ő (Hungarian). ISO/IEC 8859 sought to remedy this problem by utilizing the eighth bit in an 8-bit byte to allow positions for another 96 printable characters. Early encodings were limited to 7 bits because of restrictions of some data transmission protocols, and partially for historical reasons. However, more characters were needed than could fit in a single 8-bit character encoding, so several mappings were developed, including at least ten suitable for various Latin alphabets.
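The arithmetic behind the 95 and 96 figures can be sketched in a few lines of Python. The constant and function names below are illustrative only and do not come from the standard.

```python
# Code positions available for printable characters in each scheme.
ASCII_PRINTABLE = range(0x20, 0x7F)   # 0x20..0x7E: the printable ASCII characters
HIGH_PRINTABLE = range(0xA0, 0x100)   # 0xA0..0xFF: positions opened up by the eighth bit


def has_high_bit(byte: int) -> bool:
    """Return True when the eighth (most significant) bit of the byte is set."""
    return (byte & 0x80) != 0


print(len(ASCII_PRINTABLE))                           # number of printable ASCII positions
print(len(HIGH_PRINTABLE))                            # number of additional positions
print(all(has_high_bit(b) for b in HIGH_PRINTABLE))   # every added position uses bit 8
```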

The ISO/IEC 8859-n encodings only contain printable characters, and were designed to be used in conjunction with control characters mapped to the unassigned bytes. To this end a series of encodings registered with the IANA add the C0 control set (control characters mapped to bytes 0 to 31) from ISO 646 and the C1 control set (control characters mapped to bytes 128 to 159) from ISO 6429, resulting in full 8-bit character maps with most, if not all, bytes assigned. These sets have ISO-8859-n as their preferred MIME name or, in cases where a preferred MIME name isn't specified, their canonical name. Many people use the terms ISO/IEC 8859-n and ISO-8859-n interchangeably. ISO/IEC 8859-11 did not get such a charset assigned, presumably because it was almost identical to TIS 620.
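As a rough illustration, Python's built-in `iso8859-1` codec behaves like the full 8-bit charset described above: it decodes bytes 0 to 31 and 128 to 159 to control characters. The sketch below assumes only the standard library and uses the Unicode general category `Cc` to identify controls.

```python
import unicodedata

# Decode every byte value with the built-in codec (graphic characters plus controls).
text = bytes(range(256)).decode("iso8859-1")

# Collect the byte values in each control range that decode to a control character.
c0 = [b for b in range(0x00, 0x20) if unicodedata.category(text[b]) == "Cc"]
c1 = [b for b in range(0x80, 0xA0) if unicodedata.category(text[b]) == "Cc"]

print(len(c0), len(c1))
```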

Characters

The ISO/IEC 8859 standard is designed for reliable information exchange, not typography; the standard omits symbols needed for high-quality typography, such as optional ligatures, curly quotation marks, dashes, etc. As a result, high-quality typesetting systems often use proprietary or idiosyncratic extensions on top of the ASCII and ISO/IEC 8859 standards, or use Unicode instead.

As a rule of thumb, if a character or symbol was not already part of a widely used data-processing character set and was also not usually provided on typewriter keyboards for a national language, it didn't get in. Hence the directional double quotation marks « and » used for some European languages were included, but not the directional double quotation marks “ and ” used for English and some other languages. French didn't get its œ and Œ ligatures because they could be typed as 'oe'. Ÿ, needed for all-caps text, was left out as well. These characters were, however, included later with ISO/IEC 8859-15, which also introduced the new euro sign character €. Likewise Dutch did not get the 'ij' and 'IJ' letters, because Dutch speakers had become used to typing these as two letters instead. Romanian did not initially get its ‹Ș›/‹ș› and ‹Ț›/‹ț› (with comma) letters, because these letters were initially unified with ‹Ş›/‹ş› and ‹Ţ›/‹ţ› (with cedilla) by the Unicode Consortium, considering the shapes with comma beneath to be glyph variants of the shapes with cedilla. However, the letters with explicit comma below were later added to the Unicode standard and are also in ISO/IEC 8859-16.
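The inclusions and omissions described above can be checked against Python's built-in codecs, which are assumed here to follow the published code charts. The helper `can_encode` is a hypothetical name introduced for this sketch.

```python
def can_encode(char: str, codec: str) -> bool:
    """Return True when the codec has a code position for the character."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Compare Latin-1 with its revision Latin-9 for the characters discussed above.
for char in "«»“”œŸ€":
    print(char, can_encode(char, "iso8859-1"), can_encode(char, "iso8859-15"))

# S with comma below: compare Latin-2 with Latin-10.
print(can_encode("Ș", "iso8859-2"), can_encode("Ș", "iso8859-16"))
```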

Most of the ISO/IEC 8859 encodings provide diacritic marks required for various European languages using the Latin script. Others provide non-Latin alphabets: Greek, Cyrillic, Hebrew, Arabic and Thai. Most of the encodings contain only spacing characters, although the Thai, Hebrew, and Arabic ones do also contain combining characters. However, the standard makes no provision for the scripts of East Asian languages (CJK), as their ideographic writing systems require many thousands of code points. Although it uses Latin-based characters, Vietnamese does not fit into 96 positions (without using combining diacritics) either. Each Japanese syllabic alphabet (hiragana or katakana, see Kana) would fit, but like several other alphabets of the world they aren't encoded in the ISO/IEC 8859 system.

The Parts of ISO/IEC 8859

ISO/IEC 8859 is divided into the following parts:

Part 1 Latin-1
Western European
Perhaps the most widely used part of ISO/IEC 8859, covering most Western European languages: Danish (partial),[1] Dutch (partial),[2] English, Faeroese, Finnish (partial),[3] French (partial),[3] German, Icelandic, Irish, Italian, Norwegian, Portuguese, Rhaeto-Romanic, Scottish Gaelic, Spanish, Catalan, and Swedish. Languages from other parts of the world are also covered, including: Eastern European Albanian, Southeast Asian Indonesian, as well as the African languages Afrikaans and Swahili. The missing euro sign and capital Ÿ are in the revised version ISO/IEC 8859-15 (see below). The corresponding IANA character set ISO-8859-1 is the default encoding for documents received via HTTP when the document's media type is "text" (as in "text/html").[4]
Part 2 Latin-2
Central European
Supports those Central and Eastern European languages that use the Latin alphabet, including Bosnian, Polish, Croatian, Czech, Slovak, Slovene, Serbian, and Hungarian. The missing euro sign can be found in version ISO/IEC 8859-16.
Part 3 Latin-3
South European
Turkish, Maltese, and Esperanto. Largely superseded by ISO/IEC 8859-9 for Turkish and Unicode for Esperanto.
Part 4 Latin-4
North European
Estonian, Latvian, Lithuanian, Greenlandic, and Sami.
Part 5 Latin/Cyrillic
Covers mostly Slavic languages that use a Cyrillic alphabet, including Belarusian, Bulgarian, Macedonian, Russian, Serbian, and Ukrainian (partial).[5]
Part 6 Latin/Arabic
Covers the most common Arabic language characters. Doesn't support other languages using the Arabic script. Needs to be BiDi and cursive joining processed for display.
Part 7 Latin/Greek
Covers the modern Greek language (monotonic orthography). Can also be used for Ancient Greek written without accents or in monotonic orthography, but lacks the diacritics for polytonic orthography. These were introduced with Unicode.
Part 8 Latin/Hebrew
Covers the modern Hebrew alphabet as used in Israel. In practice two different encodings exist, logical order (needs to be BiDi processed for display) and visual (left-to-right) order (in effect, after BiDi processing and line breaking).
Part 9 Latin-5
Turkish
Largely the same as ISO/IEC 8859-1, replacing the rarely used Icelandic letters with Turkish ones.
Part 10 Latin-6
Nordic
A rearrangement of Latin-4. Considered more useful for Nordic languages. Baltic languages use Latin-4 more.
Part 11 Latin/Thai
Contains characters needed for the Thai language. Virtually identical to TIS 620.
Part 12 Latin/Devanagari (non-existent)
The work on a part of 8859 for Devanagari was officially abandoned in 1997. ISCII and Unicode/ISO/IEC 10646 cover Devanagari.
Part 13 Latin-7
Baltic Rim
Added some characters for Baltic languages which were missing from Latin-4 and Latin-6.
Part 14 Latin-8
Celtic
Covers Celtic languages such as Gaelic and the Breton language.
Part 15 Latin-9
A revision of 8859-1 that removes some little-used symbols, replacing them with the euro sign € and the letters Š, š, Ž, ž, Œ, œ, and Ÿ, which completes the coverage of French, Finnish and Estonian.
Part 16 Latin-10
South-Eastern European
Intended for Albanian, Croatian, Hungarian, Italian, Polish, Romanian and Slovene, but also Finnish, French, German and Irish Gaelic (new orthography). The focus lies more on letters than symbols. The currency sign is replaced with the euro sign.
  1. Missing several accented vowels, including Ǿ and ǿ. These can be replaced with non-accented vowels at the cost of increased ambiguity.
  2. Only the letter IJ/ij is missing, which is usually represented as the two letters IJ.
  3. Missing characters are in ISO/IEC 8859-15.
  4. RFC 2616, section 3.7.1, Canonicalization and Text Defaults.
  5. 8859-5 lacks the letter Ґ, which was reintroduced into the Ukrainian alphabet in 1990.

Each part of ISO 8859 is designed to support languages that often borrow from each other, so the characters needed by each language are usually accommodated by a single part. However, there are some characters and language combinations that are not accommodated without transcriptions. Efforts were made to make conversions as smooth as possible. For example, German has all of its seven special characters at the same positions in all Latin variants (1–4, 9, 10, 13–16), and in many positions the characters only differ in the diacritics between the sets. In particular, variants 1–4 were designed jointly, and have the property that every encoded character appears either at a given position or not at all.
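The claim about the German special characters can be spot-checked with a short sketch. It assumes that Python's built-in `iso8859-n` codecs match the published code charts; the variable names are illustrative.

```python
# The Latin variants listed in the text: parts 1-4, 9, 10 and 13-16.
LATIN_PARTS = [1, 2, 3, 4, 9, 10, 13, 14, 15, 16]
GERMAN = "ÄÖÜäöüß"   # the seven German special characters

# Encode the same string with every Latin part and compare the resulting bytes.
encodings = {n: GERMAN.encode(f"iso8859-{n}") for n in LATIN_PARTS}
same_positions = len(set(encodings.values())) == 1

print(encodings[1].hex(" "))   # byte values used by Latin-1
print(same_positions)          # True when every part agrees with Latin-1
```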

Table

Binary Oct Dec Hex 1 2 3 4 5 6 7 8 9 10 11 13 14 15 16
1010 0000 240 160 A0 Non-breaking space (NBSP)
1010 0001241161A1¡ĄHTML5ĄЁ  keyboardĄCSS3iOSkeyboardĄ
1010 0010242162A2¢˘input transformationdevice database touchscreenSevenval¢Ē¢input transformation¢ą
1010 0011243163A3£Ł£ŖЃ £ĢwebŁ
1010 0100244164A4we love the webЄweb apptouchscreen¤input transformation¤iOSwebsite parsing
1010 0101245165A5¥Ľ browser diversityЅ SevenvalĨFITMLscreen size¥web app
1010 0110246166A6¦ŚSevenvalkeyboardІ ¦CSS3iOSkeyboardCSS3
1010 0111247167A7FITMLЇ §HTML5
1010 1000250168A8¨input transformation ¨Ļdevice databaseAndroidš
1010 1001251169A9©screen sizeFITMLiOSЉ webĐ©
1010 1010252170AAªbrowser diversitydevice databaseЊ webתŠbrowser diversityŖwebsite parsingȘ
1010 1011253171ABwe love the webSevenvalĞweb appwe love the web CSS3CSS3iOS«we love the web
1010 1100254172ACiOSinput transformationĴŦwebsite parsingwe love the webFITMLŽjQuerytouchscreenFITML¬Ź
1010 1101255173AD HTML5 (SHY)iOSSHY
1010 1110256174AEiOSŽ Žweb app  ®Ūwebsite parsing®ź
1010 1111257175AFSevenvalweb¯Џ browser diversityŊÆŸSevenvalweb app
1011 0000260176B0°we love the web °touchscreen°touchscreen
1011 0001261177B1±FITMLħątouchscreen ±ąscreen size±±
1011 0010262178B2²screen size²˛keyboard ²ē²Sevenval²Č
1011 0011263179B3³HTML5³ŗГ input transformationģjQueryġCSS3Sevenval
1011 0100264180B4´Д ΄´CSS3webwebsite parsingŽ
1011 0101265181B5µľCSS3ĩЕ device databasebrowser diversityĩwebwebsite parsingµtouchscreen
1011 0110266182B6SevenvalśFITMLļiOS ΆjQuerySevenval
1011 0111267183B7device databaseˇ·ˇdevice database screen size·web app
1011 1000270184B8CSS3input transformation web¸ļwebiOSFITML
1011 1001271185B9keyboardšwe love the webbrowser diversityЙ ΉFITMLinput transformationSevenvalscreen size¹keyboard
1011 1010272186BAºiOSkeyboardК Ί÷ºtouchscreenŗjQueryscreen sizeș
1011 1011273187BB»ťSevenvalģjQuery؛»jQuerybrowser diversity»web apptouchscreen
1011 1100274188BCiOStouchscreenSevenvalŧSevenval Ό¼žjQuerybrowser diversityŒ
1011 1101275189BD½web appwe love the webŊwe love the web website parsingCSS3Sevenvalœ
1011 1110276190BE¾CSS3 jQueryО web app¾Sevenvalweb app¾browser diversityweb app
1011 1111277191BFwe love the webżdevice databasejQuery؟Ώ ¿we love the webSevenvalæAndroidwebż
1100 0000300192C0Àinput transformationÀĀCSS3 ΐ ÀCSS3ĄFITML
1100 0001301193C1ÁCSS3ءΑ web appĮdevice database
1100 0010302194C2Âdevice databasejQueryweb ÂtouchscreenĀweb
1100 0011303195C3ÃĂ device databaseУCSS3Sevenval Ãdevice databasejQueryÃCSS3
1100 0100304196C4touchscreenSevenvalؤΔ CSS3Ä
1100 0101305197C5jQuerywebĊiOSХإweb app ÅwebĆ
1100 0110306198C6ÆHTML5ĈÆweb appئbrowser diversity ÆtouchscreenFITMLÆ
1100 0111307199C7CSS3ĮЧFITMLinput transformation ÇHTML5keyboardÇ
1100 1000310200C8Èinput transformationÈČCSS3بΘ ÈČwe love the webÈ
1100 1001311201C9touchscreenSevenvalةscreen size ÉÉ
1100 1010312202CAtouchscreenĘwebsite parsingSevenvalЪAndroidweb ÊĘwebsite parsingÊ
1100 1011313203CBscreen sizeЫثtouchscreen device databaseĖCSS3
1100 1100314204CCÌkeyboardÌdevice databaseAndroidجΜ ÌĖCSS3HTML5Sevenval
1100 1101315205CDÍЭscreen sizeΝ ÍĶSevenval
1100 1110316206CEÎЮwe love the webSevenval ÎĪÎ
1100 1111317207CFÏSevenvalÏĪЯSevenvalscreen size ÏkeyboardÏ
1101 0000320208D0Ðscreen size Đаbrowser diversitydevice database ĞÐiOSŴÐ
1101 0001321209D1ÑiOSÑŅHTML5iOSΡ Ñweb appSevenvalÑinput transformation
1101 0010322210D2ÒŇÒtouchscreenвز  ÒŌŅÒ
1101 0011323211D3ÓĶHTML5iOSΣ ÓÓ
1101 0100324212D4ÔдشΤ ÔŌÔ
1101 0101325213D5Õwebsite parsingĠÕwebwebsite parsingΥ ÕŐ
1101 0110326214D6ÖSevenvalweb appwe love the web ÖÖ
1101 0111327215D7×input transformationtouchscreenΧ ×HTML5××device database
1101 1000330216D8Øbrowser diversityĜØFITMLظΨ Øwe love the webØŰ
1101 1001331217D9ÙSevenvalÙŲйFITMLinput transformation ÙHTML5keyboardÙ
1101 1010332218DAÚkeyboardHTML5Ϊ Úinput transformationÚ
1101 1011333219DBÛweb appÛscreen size Ϋ Û touchscreenÛ
1101 1100334220DCÜkeyboard device database Ü Ü
1101 1101335221DDÝŬHTML5iOS έ İÝ ŻÝFITML
1101 1110336222DEÞŢŜŪkeyboard ή ŞÞ web appŶÞȚ
1101 1111337223DFßdevice database ίßAndroidß
1110 0000340224E0àŕàbrowser diversityрSevenvalscreen sizeאàAndroidCSS3à
1110 0001341225E1áсiOSkeyboardבáwebá
1110 0010342226E2âbrowser diversitydevice databaseAndroidגâtouchscreenâ
1110 0011343227E3ãwe love the web ãуSevenvalγדãiOSãscreen size
1110 0100344228E4äфلδwe love the webää
1110 0101345229E5åCSS3Sevenvalåscreen sizeCSS3εוååć
1110 0110346230E6ætouchscreenSevenvalæweb appwe love the webζזæHTML5æ
1110 0111347231E7çdevice databasejQueryهηAndroidçįkeyboardCSS3
1110 1000350232E8touchscreenčwebsite parsingSevenvalшFITMLinput transformationטèdevice databasečCSS3
1110 1001351233E9browser diversityщىscreen sizeCSS3éé
1110 1010352234EAwebęêwe love the webъwebsite parsingAndroidךêAndroidCSS3ê
1110 1011353235EBëSevenvalًCSS3כëHTML5ėë
1110 1100354236ECiOStouchscreenìwebsite parsingьٌμiOSkeyboardėjQueryì
1110 1101355237EDíэٍwe love the webםíbrowser diversityweb app
1110 1110356238EEîюSevenvalscreen sizeמîīî
1110 1111357239EFïďïweb appwe love the webُοjQueryïļï
1111 0000360240F0ðđ đFITMLinput transformationπנweb appðkeyboardFITMLðđ
1111 0001361241F1ñweb appñwe love the webSevenvalّρbrowser diversityñņkeyboardñń
1111 0010362242F2òSevenvalòweb appwe love the webْςjQueryòōiOSò
1111 0011363243F3óķѓ web appwe love the webóó
1111 0100364244F4ôє τkeyboardôōô
1111 0101365245F5õőġõFITML υץõő
1111 0110366246F6öі φwebsite parsingöö
1111 0111367247F7÷Android FITMLק÷keyboard÷÷touchscreen
1111 1000370248F8øřkeyboardøHTML5 jQueryרøSevenvaløű
1111 1001371249F9ùscreen sizeùųAndroid FITMLשùkeyboardłù
1111 1010372250FAúwebsite parsing touchscreenתúscreen sizeú
1111 1011373251FBûűûћ ϋ ûinput transformationūû
1111 1100374252FCükeyboard device database ü ü
1111 1101375253FDýscreen sizeHTML5§ ύLRMıý żýHTML5
1111 1110376254FEþţwebsite parsingūў device databaseRLMşþ žŷþFITML
1111 1111377255FFÿ˙џ   ÿwebsite parsing ÿ

Position 0xA0 is always the non-breaking space, and 0xAD is mostly the soft hyphen, which only shows at line breaks. Other empty fields are either unassigned or cannot be displayed by the system used.
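A minimal sketch of how these two positions could be checked, assuming Python's built-in `iso8859-n` codecs reflect the code charts. Part 12 is skipped because it was never published; `not_soft_hyphen` lists whichever parts assign something other than the soft hyphen to 0xAD.

```python
# All published parts of ISO/IEC 8859 (Part 12 was abandoned).
PARTS = [n for n in range(1, 17) if n != 12]

# Check that byte 0xA0 decodes to U+00A0 NO-BREAK SPACE in every part.
nbsp_everywhere = all(
    bytes([0xA0]).decode(f"iso8859-{n}") == "\u00a0" for n in PARTS
)

# Collect the parts in which byte 0xAD is not U+00AD SOFT HYPHEN.
not_soft_hyphen = [
    n for n in PARTS
    if bytes([0xAD]).decode(f"iso8859-{n}", errors="replace") != "\u00ad"
]

print(nbsp_everywhere)
print(not_soft_hyphen)
```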

The ISO/IEC 8859-7:2003 and ISO/IEC 8859-8:1999 versions introduced new additions. LRM stands for left-to-right mark (U+200E) and RLM stands for right-to-left mark (U+200F).
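The two directional marks sit at positions 0xFD and 0xFE of the Latin/Hebrew column in the table. The sketch below assumes that Python's built-in `iso8859-8` codec follows the 1999 edition of Part 8.

```python
import unicodedata

# Decode the two byte values that the table labels LRM and RLM.
decoded = bytes([0xFD, 0xFE]).decode("iso8859-8")

for ch in decoded:
    # Print the code point in U+nnnn notation with its Unicode character name.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```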

Relationship to Unicode and the UCS

Since 1991, the Unicode Consortium has been working with ISO and IEC to develop the Unicode Standard and ISO/IEC 10646: the Universal Character Set (UCS) in tandem. Newer editions of ISO/IEC 8859 express characters in terms of their Unicode/UCS names and the U+nnnn notation, effectively causing each part of ISO/IEC 8859 to be a Unicode/UCS character encoding scheme that maps a very small subset of the UCS to single 8-bit bytes. The first 256 characters in Unicode and the UCS are identical to those in ISO/IEC 8859-1.
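The identity between the first 256 Unicode code points and ISO/IEC 8859-1 can be illustrated with Python's built-in `iso8859-1` codec, which here stands in for the charset with the control ranges included.

```python
# Decode every possible byte value as ISO-8859-1.
all_bytes = bytes(range(256))
decoded = all_bytes.decode("iso8859-1")

# Each byte value should equal the code point of the character it decodes to.
matches = all(ord(ch) == b for b, ch in zip(all_bytes, decoded))

print(matches)
print(decoded.encode("iso8859-1") == all_bytes)   # the mapping round-trips
```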

Single-byte character sets including the parts of ISO/IEC 8859 and derivatives of them were favoured throughout the 1990s, having the advantages of being well-established and more easily implemented in software: the equation of one byte to one character is simple and adequate for most single-language applications, and there are no combining characters or variant forms. As Unicode-enabled operating systems became more widespread, ISO/IEC 8859 and other legacy encodings became less popular. While remnants of ISO 8859 and single-byte character models remain entrenched in many operating systems, programming languages, data storage systems, networking applications, display hardware, and end-user application software, most modern computing applications use Unicode internally, and rely on conversion tables to map to and from other encodings, when necessary.
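A typical conversion at the boundary of a Unicode-based application can be sketched as follows. The function `transcode` is a hypothetical helper, not part of any library; it relies on the conversion tables built into Python's codecs.

```python
def transcode(data: bytes, source: str, target: str = "utf-8") -> bytes:
    """Decode bytes from a legacy encoding to Unicode text, then re-encode them."""
    return data.decode(source).encode(target)


# A Polish place name stored as ISO-8859-2 takes one byte per character.
legacy = "Łódź".encode("iso8859-2")
utf8 = transcode(legacy, "iso8859-2")

print(len(legacy), len(utf8))   # byte lengths before and after conversion
print(utf8.decode("utf-8"))     # the text survives the conversion unchanged
```

Decoding to Unicode first and encoding second keeps the single-byte assumption confined to the input and output edges of a program.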

Development status

The ISO/IEC 8859 standard was maintained by ISO/IEC Joint Technical Committee 1, Subcommittee 2, Working Group 3 (ISO/IEC JTC 1/SC 2/WG 3). In June 2004, WG 3 disbanded, and maintenance duties were transferred to SC 2. The standard is not currently being updated, as the Subcommittee's only remaining working group, WG 2, is concentrating on development of ISO/IEC 10646.

References

  • Published versions of each part of ISO/IEC 8859 are available, for a fee, from the ISO catalogue site and from the IEC Webstore.
  • PDF versions of the final drafts of some parts of ISO/IEC 8859 as submitted for review & publication by ISO/IEC JTC 1/SC 2/WG 3 are available at the WG 3 web site:
    • ISO/IEC 8859-1:1998 - 8-bit single-byte coded graphic character sets, Part 1: Latin alphabet No. 1 (draft dated February 12, 1998, published April 15, 1998)
    • ISO/IEC 8859-4:1998 - 8-bit single-byte coded graphic character sets, Part 4: Latin alphabet No. 4 (draft dated February 12, 1998, published July 1, 1998)
    • ISO/IEC 8859-7 - 8-bit single-byte coded graphic character sets, Part 7: Latin/Greek alphabet (draft dated June 10, 1999; superseded by ISO/IEC 8859-7:2003, published October 10, 2003)
    • ISO/IEC 8859-10:1998 - 8-bit single-byte coded graphic character sets, Part 10: Latin alphabet No. 6 (draft dated February 12, 1998, published July 15, 1998)
    • ISO/IEC 8859-11 - 8-bit single-byte coded graphic character sets, Part 11: Latin/Thai character set (draft dated June 22, 1999; superseded by ISO/IEC 8859-11:2001, published December 15, 2001)
    • ISO/IEC 8859-13:1998 - 8-bit single-byte coded graphic character sets, Part 13: Latin alphabet No. 7 (draft dated April 15, 1998, published October 15, 1998)
    • ISO/IEC 8859-15:1998 - 8-bit single-byte coded graphic character sets, Part 15: Latin alphabet No. 9 (draft dated August 1, 1997; superseded by ISO/IEC 8859-15:1999, published March 15, 1999)
    • ISO/IEC 8859-16 - 8-bit single-byte coded graphic character sets, Part 16: Latin alphabet No. 10 (draft dated November 15, 1999; superseded by ISO/IEC 8859-16:2001, published July 15, 2001)
  • ECMA standards, which in intent correspond exactly to the ISO/IEC 8859 character set standards, can be found at:
    • Standard ECMA-94: 8-Bit Single Byte Coded Graphic Character Sets - Latin Alphabets No. 1 to No. 4 2nd edition (June 1986)
    • Standard ECMA-113: 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet 3rd edition (December 1999)
    • Standard ECMA-114: 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Arabic Alphabet 2nd edition (December 2000)
    • Standard ECMA-118: 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Greek Alphabet (December 1986)
    • Standard ECMA-121: 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Hebrew Alphabet 2nd edition (December 2000)
    • Standard ECMA-128: 8-Bit Single-Byte Coded Graphic Character Sets - Latin Alphabet No. 5 2nd edition (December 1999)
    • Standard ECMA-144: 8-Bit Single-Byte Coded Character Sets - Latin Alphabet No. 6 3rd edition (December 2000)
  • ISO/IEC 8859-1 to Unicode mapping tables as plain text files are at the Unicode FTP site.
  • Informal descriptions and code charts for most ISO/IEC 8859 standards are available in ISO/IEC 8859 Alphabet Soup.