Conference FIDONEWS_OLD3, 30874 messages
Message 27756, 325 lines
Written 2012-04-09 07:23:14 by FidoNews Robot (2:2/2.0)
Subject: FidoNews 29:15 [02/06]: Ftsc Information
================================================
=================================================================
                        FTSC INFORMATION
=================================================================


**********************************************************************
FTSC                             FIDONET TECHNICAL STANDARDS COMMITTEE
**********************************************************************

Publication:    FTS-5003
Revision:       1
Title:          Character set definition in Fidonet messages
Author(s):      Peter Krefting (born Karlsson)
                Stas Degteff
                FTSC Administrator
Date:           7 April 2012
----------------------------------------------------------------------
Contents:
                1. Introduction
                2. Character set identification line
                3. Supported levels
                4. Known character set identifiers
                5. Obsolete identifiers
                6. Notes
----------------------------------------------------------------------

Status of this document
-----------------------

  This document is a Fidonet Standard (FTS).

  This document specifies a Fidonet standard for the Fidonet
  community.

  This document is released to the public domain, and may be used,
  copied or modified for any purpose whatever.

Abstract
--------

  This document defines the identifiers that are used to indicate the
  character sets and character encodings used within messages that are
  distributed in Fidonet.

  There have been many attempts at defining a common standard for the
  character encodings used in Fidonet distributed messages. The only
  one that has gained widespread use is the "CHRS" specification
  described in FSC-0054. This document describes current usage and
  standardises the parts of that specification that were ambiguously
  defined.

1. Introduction
---------------

  As Fidonet is an international network, one has to consider that
  not all people use English to write messages. Many languages have
  alphabets that are either bigger than the standard English alphabet,
  or completely different. To keep track of which character set and
  character encoding is used for a particular message, this document
  describes a way to identify this to message reading software.

1.1 Definition of terms.

  A character set is the collection of characters that is used to
  display text. Examples of a character set are the ASCII set, the
  LATIN 1 set and the universal character set or Unicode character
  set.

  An encoding scheme is the algorithm to transform characters from
  a specific character set into a number or set of numbers that can be
  stored and manipulated. An example is UTF-8, an encoding scheme
  for the universal character set.

  The distinction between character set and character encoding scheme
  is only meaningful for character sets that need more than one byte
  per character. In 7 and 8 bit character sets, each character is
  represented directly by a single value of at most 255. The
  universal character set used by Unicode, however, can be
  represented in more than one way: UTF-8, UTF-16 and UTF-32 are
  different encoding schemes for one and the same character set.
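
  As a small illustration of the distinction, the following Python
  sketch encodes one character from the universal character set with
  three encoding schemes and, for comparison, with two 8-bit character
  sets. The variable names are chosen for this example only.

```python
# One character from the universal character set.
ch = "\u00e4"  # LATIN SMALL LETTER A WITH DIAERESIS

# Three encoding schemes for the same character set.
utf8 = ch.encode("utf-8")       # two bytes
utf16 = ch.encode("utf-16-be")  # big-endian, no byte order mark
utf32 = ch.encode("utf-32-be")  # big-endian, no byte order mark

# In an 8-bit character set the character is a single byte value,
# but that value differs from one character set to another.
latin1 = ch.encode("latin-1")
cp437 = ch.encode("cp437")

print(utf8.hex(), utf16.hex(), utf32.hex(), latin1.hex(), cp437.hex())
```

  The same character therefore has five different byte
  representations here, which is why a message has to state which one
  it uses.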


2. Character set identification line
------------------------------------

  The character encoding of a message is specified in the "CHRS"
  control line.

  The CHRS control line is formatted as follows:

  ^ACHRS: <identifier> <level>

  Where <identifier> is a character string of no more than eight (8)
  ASCII characters identifying the character set or character encoding
  scheme used, and <level> is a positive integer value describing what
  level of CHRS the message is written in.

  For backwards compatibility, "CHARSET" may be treated as a synonym
  for "CHRS".

  Some implementations do not add the <level> field and some
  implementations erroneously present "UTF-8 2" instead of "UTF-8 4".
  Well mannered implementations should handle these situations
  gracefully when reading messages. The recommended way of doing this
  is to ignore the level parameter and only use the name of the
  identifier. In the future the level parameter may become obsolete.
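
  A tolerant reader for this control line could be sketched as
  follows in Python. The function name parse_chrs and the regular
  expression are illustrative assumptions, not part of this standard.
  The sketch accepts "CHARSET" as a synonym and treats the level as
  optional.

```python
import re

# "CHRS: <identifier> [<level>]" preceded by the ^A (0x01) character.
# "CHARSET" is accepted as a synonym for backwards compatibility.
# The identifier is limited to eight non-blank characters.
_CHRS_RE = re.compile(r"^\x01(?:CHRS|CHARSET):\s*(\S{1,8})(?:\s+(\d+))?\s*$")


def parse_chrs(line):
    """Return (identifier, level), or None if line is not a CHRS line.

    level is None when the sender omitted it. Callers are expected to
    select the character set from the identifier alone.
    """
    m = _CHRS_RE.match(line)
    if m is None:
        return None
    level = int(m.group(2)) if m.group(2) else None
    return m.group(1), level


print(parse_chrs("\x01CHRS: UTF-8 2"))  # level is wrong, identifier is usable
print(parse_chrs("\x01CHRS: CP866"))    # level is missing
```

  Returning the level without validating it keeps the reader
  compatible with messages that carry a wrong or missing level.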

  Incoming messages without "CHRS" control lines should be considered
  as being written in pure ASCII, but may be treated as being written
  in some default character set or character encoding scheme, such as
  IBM codepage 437, IBM codepage 866 or UTF-8. It is recommended that
  message readers offer the user the option of manually selecting a
  different character set or encoding scheme for these messages on a
  per-area, per-message or other basis.
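
  One possible way to apply this rule is sketched below in Python.
  The function decode_body and its default_codec parameter are
  hypothetical. The default stands for whatever the user has
  configured for the area or message, and "cp437" is only an example
  value.

```python
def decode_body(raw, codec=None, default_codec="cp437"):
    """Decode the raw bytes of a message body to text.

    codec         -- Python codec name derived from the CHRS line,
                     or None when the message has no CHRS line.
    default_codec -- user-selectable fallback (assumed setting).
    """
    if codec is not None:
        return raw.decode(codec, errors="replace")
    try:
        # No CHRS line: first assume pure ASCII.
        return raw.decode("ascii")
    except UnicodeDecodeError:
        # Eight-bit data without CHRS: use the configured default.
        return raw.decode(default_codec, errors="replace")


print(decode_body(b"plain text"))
print(decode_body(b"\x84", default_codec="cp866"))
```

  Keeping the fallback as a parameter makes it straightforward to
  offer a per-area or per-message override.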


3. Supported levels
-------------------

  These levels are the ones that are implemented in current software:

  Level 0
  -------

  This level is for messages containing pure seven bit ASCII only.
  Outgoing messages in pure ASCII need not be identified by a "CHRS"
  control line, but if they are, they should be indicated as
  "ASCII 1" (not "ASCII 0").

  Level 1
  -------

  First level of internationalisation, using seven bit character sets.
  Most of these are based on US ASCII, with minor internationalisation
  variations.

  Level 2
  -------

  Second level of internationalisation, using eight bit character
  sets.

  This level adds support for character sets that use "extended
  ASCII", i.e. codes with the most significant bit set. The character
  sets in level two are all based on ASCII (the codes 0-127 coincide
  with ASCII).
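
  The ASCII-based property can be checked mechanically. The Python
  sketch below tests whether the codes 0-127 of a codec coincide with
  ASCII. The helper name is_ascii_based is an assumption made for this
  example, and the codec names are Python's own, not CHRS identifiers.

```python
ASCII_BYTES = bytes(range(128))
ASCII_TEXT = ASCII_BYTES.decode("ascii")


def is_ascii_based(codec):
    """True if byte values 0-127 decode to the same characters as ASCII."""
    try:
        return ASCII_BYTES.decode(codec) == ASCII_TEXT
    except UnicodeDecodeError:
        return False


for name in ("cp437", "cp850", "cp866", "cp1252", "latin_1", "cp037"):
    print(name, is_ascii_based(name))
```

  The last codec in the loop, cp037, is an EBCDIC code page and is
  included only as a counterexample.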


  Level 3
  -------

  Level 3 is included just for completeness as it was mentioned in the
  proposals (FSC-0054 and FSP-1013, now FRL-1020) that this standard
  is based on.

  It seems level 3 was originally meant for 16 bit character sets but
  there never was an implementation and there may never be. This may
  have to do with the NULL byte being reserved in the Fidonet
  specifications as a termination character.
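
  The NULL byte issue can be illustrated with a short Python sketch.
  It only shows that a 16-bit encoding of plain ASCII text contains
  zero bytes while UTF-8 does not. It makes no claim about why level 3
  was never implemented.

```python
text = "Hello"

utf16 = text.encode("utf-16-le")  # 16 bits per character
utf8 = text.encode("utf-8")       # multi-byte encoding used by level 4

# Every ASCII character gets a zero high byte in UTF-16.
nulls_in_utf16 = utf16.count(b"\x00")
nulls_in_utf8 = utf8.count(b"\x00")

print(utf16, nulls_in_utf16)
print(utf8, nulls_in_utf8)
```

  Software that treats a zero byte as the end of a string would cut
  the 16-bit version off after its first character.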

  Level 3 is "reserved".


  Level 4
  -------

  Level 4 is for multi-byte character encodings. The only presently
  known implementation is UTF-8.


4. Known character set identifiers
----------------------------------

  This is a list of character set and character encoding scheme
  identifiers that are known to be in use or to have been in use in
  Fidonet. This list is not exhaustive, and an implementation is not
  required to support every identifier in it. It is perfectly all
  right for an implementation to only have partial support.

  A common method of implementation is to have a character translation
  table for each character set or character encoding identifier. The
  user can add or delete these tables to or from their configuration
  in order to add or remove support for a character set or encoding.
  The details are left to the implementation.
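
  A table-driven approach of this kind might look like the Python
  sketch below. The table shown is a partial illustration of the
  Swedish/Finnish seven-bit set covering only the six national
  letters. It is an assumption of this example and should be checked
  against the ISO 646 registration before real use. The names TABLES
  and translate are likewise hypothetical.

```python
# Partial, illustrative table: national letters that replace ASCII
# brackets and braces in the Swedish/Finnish seven-bit set.
SWEDISH_7BIT = str.maketrans({
    "[": "\u00c4", "\\": "\u00d6", "]": "\u00c5",  # upper case A-umlaut, O-umlaut, A-ring
    "{": "\u00e4", "|": "\u00f6", "}": "\u00e5",   # lower case a-umlaut, o-umlaut, a-ring
})

# One table per identifier; entries can be added to or removed from
# the configuration to change which identifiers are supported.
TABLES = {
    "SWEDISH": SWEDISH_7BIT,
    "FINNISH": SWEDISH_7BIT,
}


def translate(raw, identifier):
    """Decode seven-bit bytes and apply the table for the identifier."""
    text = raw.decode("ascii", errors="replace")
    table = TABLES.get(identifier)
    return text.translate(table) if table is not None else text


print(translate(b"R{ksm|rg}s", "SWEDISH"))
```

  An identifier without a table is passed through as plain ASCII,
  which corresponds to partial support.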


  Identifier  Character set
  ----------  -------------

  Level 1 character sets (seven-bit)

  Level 1 is no longer "current practice". The level 1 character
  sets are included here for backward compatibility.


  ASCII       ISO 646-1 (US ASCII)
  DUTCH       ISO 646 Dutch
  FINNISH     ISO 646-10 (Swedish/Finnish)
  FRENCH      ISO 646 French
  CANADIAN    ISO 646 Canadian
  GERMAN      ISO 646 German
  ITALIAN     ISO 646 Italian
  NORWEIG     ISO 646 Norwegian
  PORTU       ISO 646 Portuguese
  SPANISH     ISO 646 Spanish
  SWEDISH     ISO 646-10 (Swedish/Finnish)
  SWISS       ISO 646 Swiss
  UK          ISO 646 UK
  ISO-10      ISO 646-10 (Deprecated alias)


  Level 2 character sets (eight-bit, ASCII based)

  CP437       IBM codepage 437 (DOS Latin US)
  CP850       IBM codepage 850 (DOS Latin 1)
  CP852       IBM codepage 852 (DOS Latin 2)
  CP866       IBM codepage 866 (Cyrillic Russian)
  CP848       IBM codepage 848 (Cyrillic Ukrainian)
  CP1250      Windows, Eastern Europe
  CP1251      Windows, Cyrillic
  CP1252      Windows, Western Europe
  CP10000     Macintosh Roman character set

  LATIN-1     ISO 8859-1 (Western European)
  LATIN-2     ISO 8859-2 (Eastern European)
  LATIN-5     ISO 8859-9 (Turkish)
  LATIN-9     ISO 8859-15 (Western Europe with EURO sign)

  Level 2 obsolete character set identifiers (see note)

  IBMPC       IBM PC character sets for European
  +7_FIDO     IBM codepage 866, use CP866 instead
  MAC         Macintosh character set, use CPxxxxx instead


  Level 4

  UTF-8       UTF-8 encoding for the Unicode character set
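
  For software written in Python, many of the identifiers above can
  be mapped onto codecs from the standard library. The mapping below
  is a sketch and an assumption of this example, not part of this
  standard. Identifiers without an obvious standard-library codec,
  such as CP848 and the seven-bit national sets, are left out and
  would need their own translation tables.

```python
import codecs

# CHRS identifier -> Python standard library codec name (assumed mapping).
CHRS_TO_CODEC = {
    "ASCII": "ascii",
    "CP437": "cp437",
    "CP850": "cp850",
    "CP852": "cp852",
    "CP866": "cp866",
    "CP1250": "cp1250",
    "CP1251": "cp1251",
    "CP1252": "cp1252",
    "CP10000": "mac_roman",
    "LATIN-1": "latin_1",
    "LATIN-2": "iso8859_2",
    "LATIN-5": "iso8859_9",
    "LATIN-9": "iso8859_15",
    "UTF-8": "utf_8",
}


def codec_for(identifier):
    """Return a Python codec name for a CHRS identifier, or None."""
    return CHRS_TO_CODEC.get(identifier.upper())


# Check that every codec name is known to this Python installation.
for name in CHRS_TO_CODEC.values():
    codecs.lookup(name)

print(codec_for("CP866"), codec_for("CP848"))
```

  A return value of None signals that the identifier is unsupported,
  so the caller can fall back to a default or ask the user.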


5. Obsolete identifiers
-----------------------

  These identifiers must not be used when creating new messages.
  The following only applies to processing messages that were
  created using old software.

  Since the "IBMPC" identifier, initially used to indicate IBM
  codepage 437, eventually evolved into identifying "any IBM
  codepage", there exists in some implementations an additional
  control line, "CODEPAGE", identifying the message's codepage:

         ^ACODEPAGE: xxx

  This use is deprecated in favour of the "CPxxx" identifiers
  defined above. If found in incoming messages, however, it should
  be used as an override of the "CHRS: IBMPC" identifier.

  The character set "+7_FIDO" is sometimes used as an identifier for
  CP866. This use is deprecated, and "CP866" is recommended instead.
  Implementations should treat "+7_FIDO" as a synonym for "CP866".
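
  The two rules above can be combined into one normalisation step
  when reading old messages, sketched here in Python. The function
  name effective_identifier is hypothetical. Falling back to CP437
  for "IBMPC" without a CODEPAGE line is an assumption based on the
  original meaning of that identifier.

```python
def effective_identifier(chrs_ident, codepage=None):
    """Map obsolete identifiers to their current equivalents.

    chrs_ident -- identifier taken from the CHRS line
    codepage   -- value of a CODEPAGE control line, or None if absent
    """
    ident = chrs_ident.upper()
    if ident == "+7_FIDO":
        # Synonym for CP866.
        return "CP866"
    if ident == "IBMPC":
        if codepage:
            # CODEPAGE overrides "CHRS: IBMPC".
            return "CP" + codepage.strip()
        # Assumption: without CODEPAGE, use the original meaning.
        return "CP437"
    # CODEPAGE is only an override for IBMPC; otherwise it is ignored.
    return ident


print(effective_identifier("IBMPC", "866"))
print(effective_identifier("+7_FIDO"))
```

  The result can then be looked up like any current identifier.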

  The identifier "MAC" was formerly used for the Macintosh character
  set. It is deprecated in favour of CP10000 or another specific
  codepage identifier, for example CP10029 for Macintosh Central
  European or CP10007 for Macintosh Cyrillic.

  FSC-54 also defined control lines for style changes, "CHRC". These
  are not implemented in any software known to the authors, and are
  deprecated.

  Level 3, as defined by FSC-54, is also considered "not used" since
  there currently are no known implementations of it. All levels not
  documented here are considered reserved for future use.

6. Notes
--------

  The character set identifier applies to all parts of the message,
  including the header information and the control lines like origin
  and tear line.
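
  In practice this means that one codec is applied to every textual
  field of the message. The Python sketch below assumes a
  hypothetical dictionary of raw byte fields. The field names are
  examples and do not correspond to any particular message base
  format.

```python
def decode_message(fields, codec):
    """Decode all raw byte fields of a message with the same codec."""
    return {name: value.decode(codec, errors="replace")
            for name, value in fields.items()}


# Example message stored as bytes in IBM codepage 866.
raw = {
    "from": "\u0418\u0432\u0430\u043d".encode("cp866"),
    "subject": "\u0422\u0435\u0441\u0442".encode("cp866"),
    "body": "\u041f\u0440\u0438\u0432\u0435\u0442".encode("cp866"),
    "origin": " * Origin: \u0414\u043e\u043c (2:5080/102)".encode("cp866"),
}

message = decode_message(raw, "cp866")
print(message["subject"], message["origin"])
```

  Decoding only the body would leave names and subjects garbled in
  any eight-bit character set.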

  FSC-54 documents file formats for mapping files that could be used
  for storing character translation data. These are not documented
  here because they are specific to each software implementation.


  A. Author contact data
  ----------------------

  Peter Krefting (born Karlsson)  Fidonet: 2:203/0.222
                                  E-mail: peter@softwolves.pp.se
  Stas Degteff Fidonet: 2:5080/102
  FTSC Administrator: 2:2/20, administrator@ftsc.org

  B. Acknowledgements
  -------------------

  This document is largely based on FSP-1013 by Peter Karlsson, which
  in turn was based on FSC-0054 by Duncan McNutt.

  Peter Karlsson is now known as Peter Krefting and FSP-1013 has been
  filed in the FTSC reference library as FRL-1020.

  Significant modifications were submitted by Stas Degteff. Other
  FTSC members also provided valuable input.


  C. Revision History
  -------------------


  Rev.1, 20120407: First release.




-----------------------------------------------------------------

--- Azure/NewsPrep 3.0
 * Origin: Home of the Fidonews (2:2/2.0)