Link Codes


Microsoft refers to the IBM PC code pages as OEM code pages, and supplements them with its own "ANSI" code pages.

Most well-known code pages, excluding those for the CJK languages and Vietnamese, fit all their code points into 8 bits and involve nothing more than mapping each code point to a single bitmap; techniques such as combining characters and complex scripts are not involved.

The text mode of standard (VGA-compatible) PC graphics hardware is built around an 8-bit code page, though it is possible to use two at once at the cost of some color depth, and up to 8 may be stored in the display adaptor for easy switching. A selection of code pages could be loaded into such hardware. However, it is now commonplace for operating system vendors to provide their own character encoding and rendering systems that run in a graphics mode and bypass this system entirely. The character encodings used by these graphical systems (particularly MS-Windows) are sometimes called code pages as well.

The basis of the IBM PC code pages is ASCII, a 7-bit code representing 128 characters and control codes. In the past, 8-bit implementations of ASCII either set the top bit to zero or used it as a parity bit in network data transmissions. When this bit was instead made available for representing character data, another 128 characters and control codes could be represented. IBM used this extended range to encode characters used by various languages. No formal standard existed for these ‘extended character sets’; IBM merely referred to the variants as code pages, as it had always done for variants of EBCDIC encodings.
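The effect of the extended range can be sketched with Python's built-in codecs. This is only an illustration: the byte 0x9B and the three code pages (`cp437`, `cp850`, `cp1252`) are arbitrary choices made here, and `decode_under` is a hypothetical helper name, not part of any library.

```python
# One byte above 0x7F, interpreted under three 8-bit code pages.
# The byte value and the code pages are illustrative choices.
HIGH_BYTE = bytes([0x9B])
CODE_PAGES = ["cp437", "cp850", "cp1252"]


def decode_under(raw: bytes, code_pages=CODE_PAGES) -> dict:
    """Decode the same raw bytes under each named code page."""
    return {name: raw.decode(name) for name in code_pages}


high = decode_under(HIGH_BYTE)  # differs from one code page to the next
low = decode_under(b"A")        # 0x41 is in the shared 7-bit ASCII range

for name in CODE_PAGES:
    print(f"{name}: 0x9B -> {high[name]!r}, 0x41 -> {low[name]!r}")
```

The byte in the ASCII range decodes to the same character everywhere, while the byte with the top bit set depends entirely on which code page is assumed.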

Unicode is an effort to include all characters from previous code pages in a single character enumeration that can be used with a number of encoding schemes. In the process, duplicate characters are eliminated and new variants are introduced, such as Fullwidth ASCII. Most code page characters in the 0x00-0x7F range are the same in Unicode. The Unicode character set may be encoded using one of several schemes, which are called Unicode Transformation Format (UTF) schemes. Commonly used schemes include UTF-7, UTF-8, and UTF-16. The UTF schemes have been fitted as new code pages into the existing code page enumerations.
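A minimal sketch of these points, using only Python's standard library. The sample string and the set of legacy code pages compared are assumptions made for the example.

```python
import unicodedata

text = "A\u00e9\u20ac"  # 'A', 'e' with acute accent, euro sign

# Encode the same three characters under several UTF schemes.
encoded = {s: text.encode(s) for s in ("utf-7", "utf-8", "utf-16-le")}
for scheme, data in encoded.items():
    print(f"{scheme:10} {len(data)} bytes: {data.hex(' ')}")

# Bytes 0x00-0x7F decode to the same code points under these
# ASCII-based code pages as they do under plain ASCII.
ascii_bytes = bytes(range(0x80))
same_in_range = all(
    ascii_bytes.decode(cp) == ascii_bytes.decode("ascii")
    for cp in ("cp437", "cp1252", "latin-1")
)
print("0x00-0x7F identical:", same_in_range)

# Fullwidth 'A' (U+FF21) is a separate code point from ASCII 'A';
# NFKC compatibility normalization folds it back.
fullwidth_a = "\uff21"
folded = unicodedata.normalize("NFKC", fullwidth_a)
print(repr(fullwidth_a), "->", repr(folded))
```

Each scheme produces a different byte sequence for the same characters, and every one of them decodes back to the original string.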

These code pages were designed to be compatible with text modes provided by graphics adapters, including VGA-compatible text mode, that were used with MS-DOS and its clones. This limited code pages to 256 points, which often include box-drawing characters. Since the original IBM PC code page (number 437) was not really designed for international use, several incompatible variants emerged. Microsoft refers to these as the OEM code pages.

When dealing with legacy hardware and documents, it is often necessary to support these code pages, but use of newer standards, in particular Unicode, is encouraged.
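As a rough sketch of what such support can look like, the snippet below transcodes a few lines of code page 437 text, including box-drawing characters, to UTF-8 using Python's built-in `cp437` codec. The sample bytes and the helper name `cp437_to_utf8` are invented for this example.

```python
# Three lines of legacy text as raw code page 437 bytes:
# a small frame drawn with double-line box-drawing characters.
LEGACY_LINES = [
    bytes([0xC9, 0xCD, 0xCD, 0xCD, 0xCD, 0xBB]),
    bytes([0xBA]) + b" OK " + bytes([0xBA]),
    bytes([0xC8, 0xCD, 0xCD, 0xCD, 0xCD, 0xBC]),
]


def cp437_to_utf8(raw: bytes) -> bytes:
    """Decode CP437 bytes to Unicode text, then re-encode as UTF-8."""
    return raw.decode("cp437").encode("utf-8")


converted = [cp437_to_utf8(line) for line in LEGACY_LINES]
for line in converted:
    print(line.decode("utf-8"))
```

Decoding first and re-encoding second keeps the conversion explicit: the only step that depends on the legacy code page is the initial `decode`.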

The following code page numbers are specific to Microsoft Windows. IBM uses different numbers for these code pages.

Microsoft defined a number of code pages known as the ANSI code pages (the first one, 1252, was based on an apocryphal ANSI draft of what became ISO 8859-1). Code page 1252 is built on ISO 8859-1 but uses the range 0x80-0x9F for extra printable characters rather than the C1 control codes used in ISO 8859-1. Some of the others are based in part on other parts of ISO 8859 but are often rearranged to make them closer to 1252.
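The difference in the 0x80-0x9F range can be checked with Python's `cp1252` and `latin-1` (ISO 8859-1) codecs. This is a small sketch; the three sample bytes are chosen here for illustration.

```python
import unicodedata

# Three bytes from the 0x80-0x9F range.
raw = bytes([0x80, 0x93, 0x94])

as_1252 = raw.decode("cp1252")     # printable: euro sign and curly quotes
as_latin1 = raw.decode("latin-1")  # C1 control codes U+0080, U+0093, U+0094

print("cp1252 :", [f"U+{ord(c):04X}" for c in as_1252])
print("latin-1:", [f"U+{ord(c):04X}" for c in as_latin1])

# Every latin-1 result here is in Unicode category Cc (control).
latin1_all_controls = all(unicodedata.category(c) == "Cc" for c in as_latin1)

# From 0xA0 upward the two encodings agree byte for byte.
upper = bytes(range(0xA0, 0x100))
upper_matches = upper.decode("cp1252") == upper.decode("latin-1")
print("0xA0-0xFF identical:", upper_matches)
```

Text that is labelled ISO 8859-1 but contains bytes in this range is therefore often really code page 1252, which is why the choice of decoder matters for curly quotes and the euro sign.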

Microsoft recommends that applications use Unicode, encoded as UTF-8 or UTF-16, instead of these code pages.

