Charsets
Constant Detail
cp_437
const string: cp_437
-
Conversion table from code page 437 to Unicode. Code page 437 is the character set of the original IBM PC.
cp_708
const string: cp_708
-
Conversion table from code page 708 to Unicode. Code page 708 was outlined by ASMO to write Arabic.
cp_720
const string: cp_720
-
Conversion table from code page 720 to Unicode. The MS-DOS code page 720 is used to write Arabic.
cp_737
const string: cp_737
-
Conversion table from code page 737 to Unicode. The MS-DOS code page 737 is used to write the Greek language.
cp_775
const string: cp_775
-
Conversion table from code page 775 to Unicode. The MS-DOS code page 775 is used to write the Estonian, Lithuanian and Latvian languages.
cp_850
const string: cp_850
-
Conversion table from code page 850 to Unicode. The MS-DOS code page 850 is used to write Western European languages.
cp_852
const string: cp_852
-
Conversion table from code page 852 to Unicode. The MS-DOS code page 852 is used to write Central European languages that use Latin script, such as Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian and Slovak.
cp_855
const string: cp_855
-
Conversion table from code page 855 to Unicode. The MS-DOS code page 855 is used to write Cyrillic script.
cp_857
const string: cp_857
-
Conversion table from code page 857 to Unicode. The MS-DOS code page 857 is used to write Turkish.
cp_858
const string: cp_858
-
Conversion table from code page 858 to Unicode. The MS-DOS code page 858 is used to write Western European languages.
cp_860
const string: cp_860
-
Conversion table from code page 860 to Unicode. The MS-DOS code page 860 is used to write Portuguese.
cp_861
const string: cp_861
-
Conversion table from code page 861 to Unicode. The MS-DOS code page 861 is used to write the Icelandic language.
cp_862
const string: cp_862
-
Conversion table from code page 862 to Unicode. The MS-DOS code page 862 is used to write Hebrew.
cp_863
const string: cp_863
-
Conversion table from code page 863 to Unicode. The MS-DOS code page 863 is used to write the French language.
cp_864
const string: cp_864
-
Conversion table from code page 864 to Unicode. The MS-DOS code page 864 is used to write Arabic.
cp_865
const string: cp_865
-
Conversion table from code page 865 to Unicode. The MS-DOS code page 865 is used to write Nordic languages.
cp_866
const string: cp_866
-
Conversion table from code page 866 to Unicode. The MS-DOS code page 866 is used to write Cyrillic script.
cp_869
const string: cp_869
-
Conversion table from code page 869 to Unicode. The MS-DOS code page 869 is used to write the Greek language.
cp_874
const string: cp_874
-
Conversion table from code page 874 to Unicode. The Windows code page 874 is used for the Thai language.
cp_907
const string: cp_907
-
Conversion table from code page 907 to Unicode. Code page 907 is used for encoding APL symbols.
cp_909
const string: cp_909
-
Conversion table from code page 909 to Unicode. Code page 909 is used for encoding APL symbols.
cp_1125
const string: cp_1125
-
Conversion table from code page 1125 to Unicode. The code page 1125 is used for the Ukrainian language.
cp_1250
const string: cp_1250
-
Conversion table from code page 1250 to Unicode. The Windows code page 1250 encodes the Latin alphabet for Central and Eastern European languages that use Latin script. It can be used for encoding German, Polish, Czech, Slovak, Hungarian, Slovene, Bosnian, Croatian, Serbian, Romanian and Albanian.
cp_1251
const string: cp_1251
-
Conversion table from code page 1251 to Unicode. The Windows code page 1251 encodes the Latin/Cyrillic alphabet. It can be used for encoding Russian, Bulgarian, Serbian and Macedonian.
cp_1252
const string: cp_1252
-
Conversion table from code page 1252 to Unicode. The Windows code page 1252 encodes the Latin alphabet for Western European languages. It is a superset of ISO-8859-1.
cp_1253
const string: cp_1253
-
Conversion table from code page 1253 to Unicode. The Windows code page 1253 encodes the Latin/Greek alphabet.
cp_1254
const string: cp_1254
-
Conversion table from code page 1254 to Unicode. The Windows code page 1254 covers the Turkish language.
cp_1255
const string: cp_1255
-
Conversion table from code page 1255 to Unicode. The Windows code page 1255 encodes the Latin/Hebrew alphabet.
cp_1256
const string: cp_1256
-
Conversion table from code page 1256 to Unicode. The Windows code page 1256 encodes the Latin/Arabic alphabet.
cp_1257
const string: cp_1257
-
Conversion table from code page 1257 to Unicode. The Windows code page 1257 covers the Baltic languages.
cp_1258
const string: cp_1258
-
Conversion table from code page 1258 to Unicode. The Windows code page 1258 covers the Vietnamese language.
iso_8859_1
const string: iso_8859_1
-
Conversion table from ISO-8859-1 (Latin-1) to Unicode. ISO-8859-1 is the character set for Western European languages. ISO-8859-1 defines the first 256 code point assignments in Unicode. It can be used for encoding Afrikaans, Albanian, Basque, Breton, Catalan, Danish, English, Faroese, Galician, German, Icelandic, Malay, Irish, Italian, Latin, Leonese, Luxembourgish, Norwegian, Occitan, Portuguese, Rhaeto-Romanic, Scottish Gaelic, Spanish, Swahili, Swedish and Walloon.
iso_8859_2
const string: iso_8859_2
-
Conversion table from ISO-8859-2 (Latin-2) to Unicode. ISO-8859-2 is the character set for Eastern European languages. It can be used for encoding Bosnian, Croatian, Czech, German, Hungarian, Polish, Serbian, Slovak, Slovene and Sorbian.
iso_8859_3
const string: iso_8859_3
-
Conversion table from ISO-8859-3 (Latin-3) to Unicode. ISO-8859-3 is the character set for South European languages. It can be used for encoding Turkish, Maltese and Esperanto.
iso_8859_4
const string: iso_8859_4
-
Conversion table from ISO-8859-4 (Latin-4) to Unicode. ISO-8859-4 is the character set for North European languages. It can be used for encoding Estonian, Latvian, Lithuanian, Greenlandic and Sami.
iso_8859_5
const string: iso_8859_5
-
Conversion table from ISO-8859-5 to Unicode. ISO-8859-5 is the character set for the Latin/Cyrillic alphabet. It can be used for encoding Bulgarian, Belarusian, Russian, Serbian and Macedonian.
iso_8859_6
const string: iso_8859_6
-
Conversion table from ISO-8859-6 to Unicode. ISO-8859-6 is the character set for the Latin/Arabic alphabet.
iso_8859_7
const string: iso_8859_7
-
Conversion table from ISO-8859-7 to Unicode. ISO-8859-7 is the character set for the Latin/Greek alphabet.
iso_8859_8
const string: iso_8859_8
-
Conversion table from ISO-8859-8 to Unicode. ISO-8859-8 is the character set for the Latin/Hebrew alphabet.
iso_8859_9
const string: iso_8859_9
-
Conversion table from ISO-8859-9 (Latin-5) to Unicode. ISO-8859-9 is the character set to cover the Turkish language.
iso_8859_10
const string: iso_8859_10
-
Conversion table from ISO-8859-10 (Latin-6) to Unicode. ISO-8859-10 is the character set to cover the Nordic languages.
iso_8859_11
const string: iso_8859_11
-
Conversion table from ISO-8859-11 to Unicode. ISO-8859-11 is the character set for the Latin/Thai alphabet.
iso_8859_13
const string: iso_8859_13
-
Conversion table from ISO-8859-13 (Latin-7) to Unicode. ISO-8859-13 is the character set to cover the Baltic languages.
iso_8859_14
const string: iso_8859_14
-
Conversion table from ISO-8859-14 (Latin-8) to Unicode. ISO-8859-14 is the character set to cover the Celtic languages. It can be used for encoding Irish, Manx, Scottish Gaelic, Welsh, Cornish and Breton.
iso_8859_15
const string: iso_8859_15
-
Conversion table from ISO-8859-15 (Latin-9) to Unicode. ISO-8859-15 is the character set for Western European languages. It can be used for encoding Afrikaans, Albanian, Breton, Catalan, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, German, Icelandic, Irish, Italian, Kurdish, Latin, Luxembourgish, Malay, Norwegian, Occitan, Portuguese, Rhaeto-Romanic, Scottish Gaelic, Scots, Spanish, Swahili, Swedish, Tagalog and Walloon.
iso_8859_16
const string: iso_8859_16
-
Conversion table from ISO-8859-16 (Latin-10) to Unicode. ISO-8859-16 is the character set for South-Eastern European languages. It can be used for encoding Albanian, Croatian, Hungarian, Polish, Romanian, Serbian and Slovenian, but also French, German, Italian and Irish Gaelic.
koi8_r
const string: koi8_r
-
Conversion table from KOI8-R encoding to Unicode. KOI8-R is an encoding used for Russian and Bulgarian.
koi8_u
const string: koi8_u
-
Conversion table from KOI8-U encoding to Unicode. KOI8-U is an encoding used for Ukrainian and Belarusian.
mik
const string: mik
-
Conversion table from MIK encoding to Unicode. MIK is an encoding used for the Bulgarian language.
tis_620
const string: tis_620
-
Conversion table from TIS-620 encoding to Unicode. TIS-620 is the Thai Industrial Standard encoding for the Thai language.
armscii_8
const string: armscii_8
-
Conversion table from ArmSCII-8 encoding to Unicode. ArmSCII-8 is an encoding for the Armenian alphabet.
geostd8
const string: geostd8
-
Conversion table from GEOSTD8 encoding to Unicode. GEOSTD8 is an encoding for the Georgian language.
jis_x_0201
const string: jis_x_0201
-
Conversion table from JIS X 0201 encoding to Unicode. JIS X 0201 is a Japanese Industrial Standard that combines ASCII (except backslash and tilde) with half-width kana (the half-width form of katakana).
viscii
const string: viscii
-
Conversion table from VISCII encoding to Unicode. VISCII is the Vietnamese Standard Code for Information Interchange.
ns_4551_1
const string: ns_4551_1
-
Conversion table from NS 4551-1 encoding to Unicode. NS 4551 version 1 is the national variant of ISO 646 for Norway.
cp_037
const string: cp_037
-
Conversion table from code page 37 to Unicode. Code page 37 is an EBCDIC code page with the full ISO-8859-1 (Latin-1) character set. This code page is used in Australia, Brazil, Canada, New Zealand, Portugal, South Africa and the USA.
cp_273
const string: cp_273
-
Conversion table from code page 273 to Unicode. Code page 273 is an EBCDIC code page with the full ISO-8859-1 (Latin-1) character set. This code page is used in Austria and Germany.
cp_277
const string: cp_277
-
Conversion table from code page 277 to Unicode. Code page 277 is an EBCDIC code page with the full ISO-8859-1 (Latin-1) character set. This code page is used in Denmark and Norway.
cp_280
const string: cp_280
-
Conversion table from code page 280 to Unicode. Code page 280 is an EBCDIC code page with the full ISO-8859-1 (Latin-1) character set. This code page is used in Italy.
cp_285
const string: cp_285
-
Conversion table from code page 285 to Unicode. Code page 285 is an EBCDIC code page with the full ISO-8859-1 (Latin-1) character set. This code page is used in Ireland and the United Kingdom.
cp_297
const string: cp_297
-
Conversion table from code page 297 to Unicode. Code page 297 is an EBCDIC code page with the full ISO-8859-1 (Latin-1) character set. This code page is used in France.
cp_500
const string: cp_500
-
Conversion table from code page 500 to Unicode. Code page 500 is an EBCDIC code page with the full ISO-8859-1 (Latin-1) character set. This code page is international.
cp_1047
const string: cp_1047
-
Conversion table from code page 1047 to Unicode. Code page 1047 is an EBCDIC code page with the full ISO-8859-1 (Latin-1) character set. This code page is used for Open Systems.
Function Detail
conv2unicode
const proc: conv2unicode (inout string: stri, in string: codePage)
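The signature above suggests that conv2unicode takes a string of bytes and one of the conversion-table constants and converts the string in place. A minimal sketch under that assumption follows; the sample string and the use of cp_437 are illustrative choices, and the program prints code points rather than characters so that it does not depend on the console encoding.

```seed7
$ include "seed7_05.s7i";
  include "charsets.s7i";

const proc: main is func
  local
    # Byte 142 (16#8E) is A with diaeresis in code page 437; the rest is ASCII.
    var string: stri is "\142;rger";
    var char: ch is ' ';
  begin
    # Assumption: stri is converted in place using the given conversion table.
    conv2unicode(stri, cp_437);
    # Print the resulting code points, separated by spaces.
    for ch range stri do
      write(ord(ch) <& " ");
    end for;
    writeln;
  end func;
```

If the table behaves as described in the constant detail, the first code point printed should be the Unicode value of the converted character instead of the raw byte value 142.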
conv2unicodeByName
const proc: conv2unicodeByName (inout string: stri, in var string: charset)
-
Convert a string from a charset encoding to UTF-32. When the function is called, stri is assumed to be a string of bytes encoded with the specified charset. When the function returns, stri contains a UTF-32 Unicode string. The 'charset' encoding is specified with an IANA/MIME charset name. This way the function can be used to convert encoded data for internet protocols such as NNTP.
- Raises:
- RANGE_ERROR - The charset is unknown.
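The following sketch shows one way conv2unicodeByName might be called with a charset name taken from, for example, a message header. The name "ISO-8859-1" is a registered IANA charset name, but whether this exact spelling is accepted by the library is an assumption here, as is the sample byte string.

```seed7
$ include "seed7_05.s7i";
  include "charsets.s7i";

const proc: main is func
  local
    # Bytes assumed to be ISO-8859-1 encoded (252 = u with diaeresis, 223 = sharp s).
    var string: stri is "Gr\252;\223;e";
  begin
    block
      # Assumption: "ISO-8859-1" is a charset name known to the library.
      conv2unicodeByName(stri, "ISO-8859-1");
      writeln("converted, length: " <& length(stri));
    exception
      catch RANGE_ERROR:
        # Documented behavior: an unknown charset raises RANGE_ERROR.
        writeln("unknown charset");
    end block;
  end func;
```

Catching RANGE_ERROR at the call site keeps a single unrecognized charset name from terminating a program that processes many messages.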