Fidonet echomail

Echo area OSDEBATE, 18996 messages

Message 2625, 325 lines
Written 2005-02-19 23:32:36 by Rich (1:379/45)
   Comment on message 2624 by Ellen K. (1:379/45)
Subject: Re: ESB / XML / Unicode vs 8-bit characters ?
=====================================================
From: "Rich" <@>

   The UTF in UTF-8/16/32 stands for Unicode Transformation Format.  You
can find these defined in section 2.5 of
http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf.
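
A minimal sketch of what "transformation format" means in practice, using Python purely for illustration (the language and the helper name encoded_sizes are assumptions, not something from this thread): the same Unicode code point is serialized to a different number of bytes by each format.

```python
# Encode the same code points with each Unicode Transformation Format.
# The "-le" codecs are used so that no byte order mark is prepended.
def encoded_sizes(ch: str) -> dict:
    """Return the encoded length in bytes of one character per format."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }

# An ASCII letter, a Spanish letter, and the euro sign.
for ch in ("A", "\u00f1", "\u20ac"):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

The character set is the same in all three cases; only the byte layout differs.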

   It's not clear to me how you are creating the XML from the templates.
If ANSI data is emitted into an XML document declared as UTF-8 then you
would have problems only for non-ASCII characters.  UTF-8 and Windows-1252
are identical for 0x00 to 0x7F, which is ASCII in both.
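
To make the ASCII versus non-ASCII point concrete, here is a small hedged sketch in Python (the sample strings and the helper decodes_as_utf8 are made up for illustration). It assumes the "ANSI" data is Windows-1252, which Python calls cp1252.

```python
ascii_text = "Hello, world"
spanish_text = "Se\u00f1or"  # "Señor", contains one non-ASCII character

# ASCII-only text is byte-for-byte identical in both encodings.
assert ascii_text.encode("cp1252") == ascii_text.encode("utf-8")

# Non-ASCII text is not: one byte in Windows-1252, two bytes in UTF-8.
ansi_bytes = spanish_text.encode("cp1252")
utf8_bytes = spanish_text.encode("utf-8")

def decodes_as_utf8(data: bytes) -> bool:
    """Report whether the bytes form a valid UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(ansi_bytes, decodes_as_utf8(ansi_bytes))
print(utf8_bytes, decodes_as_utf8(utf8_bytes))
```

A consumer that trusts a UTF-8 declaration would be in the same position as the decode call here: the Windows-1252 bytes for the accented letter are not valid UTF-8.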

   I do not know how SQL Server maps from char to nchar, specifically
what conversion is performed.  Also, in some (maybe all released) versions
of SQL Server nchar and nvarchar are encoded in UCS-2.  UCS-2 is a 16-bit
encoding like UTF-16.  It dates back to when Unicode was defined as having
2**16 characters instead of the 2**20+ that it has now.  You cannot express
characters >= U+10000 in UCS-2, not that you care about these.
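
The UCS-2 limit can be sketched as follows, again in Python as an illustrative assumption (utf16_code_units is a hypothetical helper). A character below U+10000 fits in one 16-bit unit, which is all UCS-2 has; a character at or above U+10000 needs two units (a surrogate pair) in UTF-16.

```python
def utf16_code_units(ch: str) -> list:
    """Split the UTF-16 (big-endian, no BOM) encoding into 16-bit units."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

bmp_char = "\u00f1"             # U+00F1, inside the 16-bit range
supplementary = "\U0001D11E"    # U+1D11E, outside the 16-bit range

print([hex(u) for u in utf16_code_units(bmp_char)])
print([hex(u) for u in utf16_code_units(supplementary)])
```

The second character comes out as the pair 0xD834, 0xDD1E; a strict UCS-2 consumer has no way to represent it as a single character.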

   I don't know whether the systems you describe being written in Java
makes a difference.  They can do what they want.  The native Java string
is Unicode, though I don't remember if it is UCS-2 or UTF-16.  My guess is
that it was once the former and is now the latter.  One of the documents on
this on Sun's site suggests that Java used UCS-2 until the recently released
1.5, which is the first to use UTF-16.

Rich

  "Ellen K." <72322.1016@compuserve.com> wrote in message
  news:aqag115606i9g8bmh3lst66une1f1sotth@4ax.com...
  UTF-8 is unicode?!?   Sheesh, all this time I thought it meant 8-bit.
  In fact I could swear I read that somewhere.

  My question was coming from the database perspective, where I always use
  char and varchar, as opposed to nchar and nvarchar.  I give the
  front-end guys little templates for creating the XML documents for all
  my SQL Server stored procedures that take XML input, and I always
  specify UTF-8 in the header... and my char and varchar columns always
  end up normal, so since you're now telling me UTF-8 is really unicode, I
  guess that would answer my question for XML data I would be getting from
  the apps...?    Or would the answer be different if the incoming XML is
  some other encoding?

  To simulate getting nvarchar data from somewhere, I just tried creating
  two dummy tables, one with an nvarchar column and the other with a
  varchar column, typed stuff into the nvarchar one, then inserted to the
  varchar one select from the nvarchar one and it looks normal.

  If all this means I was worrying about nothing, excellent!   OTOH, is
  there something I should be worrying about that I didn't ask?

  The only pieces whose names I know so far are Sonic and SalesForce, both
  of which are written in Java, if that makes any difference.  I know
  there is at least one other external piece but I think that is the next
  phase.

  On Sat, 19 Feb 2005 21:37:15 -0800, "Rich" <@> wrote in message
  <421821c1$1@w3.nls.net>:

  >   You need to be more specific than "8-bit characters".  There are
  > many 8-bit character encodings.  If you are using Windows to generate
  > your data you most likely are using Windows-1252 which is the default
  > 8-bit character set for U.S. English in Windows.  Windows supports many
  > 8-bit encodings so you could be using something else too.
  >
  >   Unicode is a character set not an encoding.  There are multiple
  > encodings the main ones being UTF-8, UTF-16, and UTF-32.  You can use
  > any of these for XML as well as non-Unicode encodings.  For
  > interoperability you should use Unicode preferably UTF-8.
  >
  >   What comes out when the XML is parsed depends on the XML parser.
  > XML is logically expressed in Unicode.  The Windows XML parsers provide
  > a Unicode interface.  Other parsers could do differently.
  >
  >Rich
  >
  >
  >  "Ellen K." <72322.1016@compuserve.com> wrote in message
  >  news:4o2g11pu048kafbdilg46u77vs5ls0be55@4ax.com...
  >  Our new enterprise system is going to be built around an Enterprise
  >  Service Bus.  I don't have the full specs yet but as I understand it the
  >  main apps (starting with SalesForce) are going to be out on the internet
  >  and the Sonic ESB will be the messaging piece.  There will be an
  >  Operational Data Store in house that will get updated every night on a
  >  batch basis from the main apps.
  >
  >  My data warehouse will continue to be the data warehouse and will remain
  >  in house.  The dimensions will stay the same but I might have to create
  >  separate measures for the data from the new apps and then create views
  >  to keep everything transparent to the users.
  >
  >  I'm thinking if we're going to have an ODS in house already, I may as
  >  well do the ETL from there.   But I'm worrying that the new data will
  >  probably be unicode (because Java defaults to that and SalesForce is
  >  written in Java).  Right now I am storing everything (except our blobs
  >  of course) in 8-bit characters.
  >
  >  Anyone here who's up on this stuff, can the XML that goes back and forth
  >  convert between unicode and 8-bit characters, or am I gonna have to
  >  redefine all my data?   For example, if unicode data is put into an XML
  >  document that specifies UTF-8, what comes out when the document is
  >  parsed?  How about vice versa?  If this is too simplistic to work, what
  >  is needed?
  >
  >  (We actually have no substantive need for unicode -- we are bilingual
  >  Spanish but all the special Spanish characters exist in the ascii
  >  character set.)

--- BBBS/NT v4.01 Flag-5
 * Origin: Barktopia BBS Site http://HarborWebs.com:8081 (1:379/45)