ANSI character set and equivalent Unicode and HTML characters
The ANSI set of 217 characters, also known as Windows-1252, was the standard for the core fonts supplied with US versions of Microsoft Windows up to and including Windows 95 and Windows NT 4. During the lifetime of those two products, Microsoft added the euro currency symbol, bringing the number of characters to 218, and introduced a new core set of Pan-European fonts containing the WGL4 (Windows Glyph List 4) character set, which has 652 characters.
If you use a version of Windows that is designed for a non-Latin script such as Arabic, Cyrillic, Greek, Hebrew or Thai to view a document that has been typed using the ANSI character set, then characters from that script may replace some of those in the 128–255 range. This problem will be resolved when Unicode becomes more widely used, because it provides a unique numeric identifier for each character. There are similar problems when transferring ANSI documents to DOS or Macintosh computers, because the DOS code pages and MacRoman arrange characters differently in the 128–255 range.
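The substitution described above can be reproduced with a short sketch. It assumes Python 3.9 or later and uses the standard-library codec names cp1252, cp1251 (Cyrillic), cp1253 (Greek), cp437 (a DOS code page) and mac_roman as stand-ins for the systems mentioned; the helper name decode_everywhere is hypothetical.

```python
# Decode one byte from the 128-255 range under several legacy encodings.
# The codec list is an illustrative assumption, not an exhaustive set of
# the Windows, DOS and Macintosh character sets.
CODECS = ["cp1252", "cp1251", "cp1253", "cp437", "mac_roman"]


def decode_everywhere(byte_value: int) -> dict[str, str]:
    """Return the character each codec assigns to the given byte value."""
    raw = bytes([byte_value])
    return {name: raw.decode(name) for name in CODECS}


# 0xE9 is the letter e-acute in the ANSI set (Windows-1252).
results = decode_everywhere(0xE9)
for name, char in results.items():
    print(f"{name:10} 0xE9 -> {char!r} (U+{ord(char):04X})")
```

The byte stays the same in every case; only the table used to interpret it changes, which is why the Unicode code point printed on each line differs.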
ANSI characters 32 to 127 correspond to those in the 7-bit ASCII character set, which forms the Basic Latin Unicode character range. Characters 160–255 correspond to those in the Latin-1 Supplement Unicode character range. Positions 128–159 in Latin-1 Supplement are reserved for controls, but most of them are used for printable characters in ANSI; the Unicode equivalents are noted in the table below. Entries in the “Entity” column are character entity references that can be used in HTML and should be interpreted correctly by Web browsers that support HTML 4.0.
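The three ranges can be checked with a minimal sketch, assuming that Python's built-in cp1252 codec is an acceptable model of the ANSI set. The function name cp1252_to_unicode is hypothetical, and positions the codec leaves undefined are reported as None.

```python
def cp1252_to_unicode(byte_value):
    """Return the Unicode character for a Windows-1252 byte, or None if unassigned."""
    try:
        return bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        return None


# 32-127: same code points as 7-bit ASCII (Basic Latin).
ascii_match = all(cp1252_to_unicode(b) == chr(b) for b in range(32, 128))

# 160-255: same code points as the Latin-1 Supplement range.
latin1_match = all(cp1252_to_unicode(b) == chr(b) for b in range(160, 256))

# 128-159: printable characters whose Unicode code points lie elsewhere.
remapped = {b: cp1252_to_unicode(b) for b in range(128, 160)}
printable = {b: ch for b, ch in remapped.items() if ch is not None}

print("32-127 identical to ASCII:    ", ascii_match)
print("160-255 identical to Latin-1: ", latin1_match)
for b, ch in printable.items():
    print(f"ANSI {b} -> {ch} U+{ord(ch):04X}")
```

Running it lists, for each assigned position in 128–159, the Unicode code point that replaces the control character reserved at that position.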
The characters that appear in the first column of the following table are generated from Unicode numeric character references, and so they should appear correctly in any Web browser that supports Unicode and that has suitable fonts available, regardless of the operating system.
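As an illustration of the two kinds of reference mentioned above, the following sketch builds a numeric character reference and, where one exists, a named entity reference for a character. It assumes Python's standard html module; the helper names numeric_reference and entity_reference are hypothetical, and html.entities.codepoint2name is used as the lookup table for entity names.

```python
import html
from html.entities import codepoint2name


def numeric_reference(char):
    """Build a decimal numeric character reference, e.g. the euro sign as &#8364;"""
    return f"&#{ord(char)};"


def entity_reference(char):
    """Build a named entity reference, or return None if the table has no name."""
    name = codepoint2name.get(ord(char))
    return f"&{name};" if name else None


for char in "\u20ac\u2122\u00e9":  # euro sign, trade mark sign, e-acute
    num = numeric_reference(char)
    ent = entity_reference(char)
    # html.unescape resolves both forms back to the same character.
    assert html.unescape(num) == char
    assert ent is None or html.unescape(ent) == char
    print(f"U+{ord(char):04X}  numeric: {num}  entity: {ent}")
```

The numeric form depends only on the Unicode code point, so it does not rely on the browser knowing a particular entity name.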
Detail here