Text 5621, 420 lines
Written 2013-01-14 00:31:24 by FidoNews Robot (2:2/2.0)
Subject: FidoNews 30:02 [02/08]: General Articles
================================================
=================================================================
GENERAL ARTICLES
=================================================================
Some notes on the best place for the host name.
By Michiel van der Vlist 2:280/5555
The best place for IP connect info is the system name field. Why?
1) It is one of the oldest methods of listing IP info, if not THE
oldest. As a result, it is recognised by most if not all of the
IP mailer software that makes use of the nodelist. The INA flag
OTOH came into being much later in the game, so many of the
early IP mailers do not recognise it and never will, because they
are abandonware.
2) There is no backward compatibility problem. POTS/ISDN mailers
NEVER attempt to extract connect info from the system name field.
In fact they do nothing with the system name field at all.
Nodelist processing software often passes it on to the user for
his information, but the mailer ignores it. Whatever you put there,
it can do no harm to a POTS/ISDN mailer.
3) It is a logical place for a host name. Nomen est omen. The name
says it. A host name is the name by which the system can be addressed
in a network. If we wish to address a system on the InterNet
by NAME, what better place to put it than in the system NAME field?
Some claim there is a potential problem because a system name that is
just that, could be mistaken for a host name or e-mail address. Some
even claim it happens "often".
I say this is a FUD (Fear Uncertainty and Doubt) argument. There is
not a single reported incident where a system name was mistakenly
interpreted as a host name by an IP mailer. It is easy to see why
this should never happen when the specs are followed.
1) The "problem" only applies when the node carries IP flags. For a
POTS/ISDN node it does not matter what is in the system name field.
If it looks like a host name or e-mail address, there is no problem
because a POTS/ISDN mailer will never look there for connect info.
POTS/ISDN mailers look in the phone number field and nowhere else.
2) It is not all that hard to distinguish a host name, an email
address or a literal IPv4 or IPv6 address from a random string.
At the end of this article I publish the C routines I use to do
that. I wrote them over 7 years ago. I patched them up and made a
shell around them for testing. It felt good to fire up the compiler
and do a bit of programming again. Compile it for your favorite OS
or download the precompiled 16 bit DOS version from my system.
Play around with it...
3) The proper order to look for IP connect info is:
A) Parameter field of the flag for the protocol in use.
B) Parameter field of an INA flag if present.
C) The system name field.
Both the system name field and the INA flag only provide the default
connect information. If the node lists several IP protocols for
connectivity and it does not use the same host name or IP address for
all of them, the odd one out should have the info in the parameter
field of the flag in question. That overrides the default. So the
parameter field of the protocol flag is the first place to look. If
the connect info is all there, look no further. What is in the system
name field is irrelevant in that case.
When the info is not in the protocol flag but it is in the INA flag
look no further. What is in the system name field is irrelevant in
that case too.
So when it is not in the protocol flag and not in the INA flag, the
system name field is where it MUST be. An IP capable system not
having IP information in the protocol flag or an INA flag but listing
a system name that only looks like IP info, is simply a listing in
error as it lists IP flags, but no valid IP connect info.
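
The lookup order above can be sketched in a few lines of C. This is only an illustration under assumptions: the struct node_info and its field names are made up for the example, the host part of each flag parameter is assumed to have been split off already, and looks_like_ip() is a crude stand-in for a real check such as isip() from the listing at the end of this article.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, already parsed view of one nodelist entry.
   NULL or an empty string means "not present". */
struct node_info {
    const char *proto_host;   /* host part of the protocol flag parameter */
    const char *ina_host;     /* parameter of the INA flag, if any        */
    const char *system_name;  /* system name field                        */
};

static int present(const char *s)
{
    return s != NULL && s[0] != '\0';
}

/* Crude stand-in for isip(): contains a dot and no blanks. */
static int looks_like_ip(const char *s)
{
    return strchr(s, '.') != NULL && strchr(s, ' ') == NULL;
}

/* Follow the order A, B, C. Returns NULL when the node lists IP
   flags but no usable connect info, i.e. a listing in error. */
const char *connect_info(const struct node_info *n)
{
    if (present(n->proto_host)) return n->proto_host;       /* A */
    if (present(n->ina_host))   return n->ina_host;         /* B */
    if (present(n->system_name) && looks_like_ip(n->system_name))
        return n->system_name;                              /* C */
    return NULL;
}
```

Note that the system name is only inspected in step C, so a BBS name that merely resembles a host name does no harm as long as the protocol flag or an INA flag carries the real info.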
So what about the INA flag? Isn't that a better place for the host
name? I already mentioned that it is a relatively new flag that older
abandonware will not recognise. Other than that, it is my personal not so
humble opinion that the INA flag is an abomination that should never
have been introduced in the nodelist.
1) It was created for the wrong reasons. The trigger for its
conception was some idiot NC crashing the nodelist production by
tagging the same long host name to every IP protocol flag that he
could think of. The version of MAKENL in use at the time crashed
when a line was longer than 157 characters. Of course he could
simply have put the host name in the system name field and it
would have been processed without problems by MAKENL, but this
idiot wanted to make a statement. At the time there was a very vocal
group of sysops that refused to "sacrifice" - as they called it -
the place for the bronze plaque with the name of their BBS.
And instead of telling these selfish sysops that they can't always
get what they want, the powers that be gave in and adopted the INA
flag. The INA flag is a "technical" solution to what is basically a
social problem. Such a "solution" should evoke reverse peristaltic
movement in any technician worth his keyboard.
2) It does not "fit in" and that makes the life of programmers harder.
When we classify the IP flags, we can see two classes:
A) IP protocol flags, IBN, IFC, IFT, ITN, IVM
These may carry a host name and a port number as parameter.
B) E-mail flags. IEM, ITX, IUC, IMI, ISE
These may carry an e-mail address as parameter.
When writing a parser for a syntax checker, we do not want to hard
code all the flags, so that when a new protocol flag is introduced
we do not have to issue a new version of the programme. We want to
make the list of flags configurable. For that it is convenient if they
can be easily classified. Before the INA flag, we had two classes, A
and B as above. But the INA flag does not fit into either of these
classes. We have to create a whole new class for it:
C) Address carrying flag: INA.
It MUST carry a host name, but not a port number.
So now we have a new class for just ONE flag. And that list of one
flag will never grow, because one "address carrying flag" is all we
ever need. A whole new class for one flag. Argghh!
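
To make the classification concrete, here is a minimal table-driven sketch. The enum and function names are invented for this example, and the table is hard coded only to keep it short; in a real checker it would be read from a configuration file, which is the whole point of classifying the flags.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical class names for the three classes A, B and C. */
enum flag_class { FC_UNKNOWN, FC_PROTOCOL, FC_EMAIL, FC_ADDRESS };

struct flag_def {
    const char *name;
    enum flag_class cls;
};

/* Would normally come from a configuration file. */
static const struct flag_def flag_table[] = {
    {"IBN", FC_PROTOCOL}, {"IFC", FC_PROTOCOL}, {"IFT", FC_PROTOCOL},
    {"ITN", FC_PROTOCOL}, {"IVM", FC_PROTOCOL},
    {"IEM", FC_EMAIL}, {"ITX", FC_EMAIL}, {"IUC", FC_EMAIL},
    {"IMI", FC_EMAIL}, {"ISE", FC_EMAIL},
    {"INA", FC_ADDRESS}   /* a class of its own, with one member */
};

/* Classify a flag such as "IBN" or "IBN:example.com" by the part
   before the first colon. */
enum flag_class classify_flag(const char *flag)
{
    size_t len = strcspn(flag, ":");
    size_t i;

    for (i = 0; i < sizeof flag_table / sizeof flag_table[0]; i++) {
        if (strlen(flag_table[i].name) == len &&
            strncmp(flag_table[i].name, flag, len) == 0)
            return flag_table[i].cls;
    }
    return FC_UNKNOWN;
}
```

The last table entry shows the annoyance: the parser needs a third code path that exists for exactly one flag.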
What makes it even more sad is that the whole circus could have been
avoided by adopting a scheme that has been in use in fidonet for a
very long time: inherit from the predecessor. When listing multiple
node numbers as in via lines, we do not list the full number every
time. When the zone and net number are the same as those of the previous
number, we just list the node number and assume the zone and net
number are the same. Like:
2:280/0 1010 1011 1012 292/1 2 4
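
For illustration, expanding such an abbreviated list back into full addresses takes only a few lines. The function name expand_nodes() is made up for this sketch, it assumes a C99 compiler for snprintf(), and it handles zone:net/node addresses only (no points or domains).

```c
#include <stdio.h>
#include <string.h>

/* Expand an abbreviated node list into full zone:net/node addresses
   separated by single spaces. Returns the number of addresses
   written, or -1 on a malformed token, a missing predecessor or a
   full output buffer. */
int expand_nodes(const char *list, char *out, size_t outsz)
{
    int zone = -1, net = -1, node = -1, count = 0, consumed;
    size_t used = 0;
    char tok[32];

    if (outsz == 0) return -1;
    out[0] = '\0';
    while (sscanf(list, " %31s%n", tok, &consumed) == 1) {
        int z, n, f, len;
        char junk;

        list += consumed;
        if (sscanf(tok, "%d:%d/%d%c", &z, &n, &f, &junk) == 3) {
            zone = z; net = n; node = f;     /* full address       */
        } else if (sscanf(tok, "%d/%d%c", &n, &f, &junk) == 2) {
            net = n; node = f;               /* inherit the zone   */
        } else if (sscanf(tok, "%d%c", &f, &junk) == 1) {
            node = f;                        /* inherit zone + net */
        } else {
            return -1;
        }
        if (zone < 0 || net < 0 || node < 0) return -1;
        len = snprintf(out + used, outsz - used, "%s%d:%d/%d",
                       count ? " " : "", zone, net, node);
        if (len < 0 || (size_t)len >= outsz - used) return -1;
        used += (size_t)len;
        count++;
    }
    return count;
}
```

Fed the line above, this yields 2:280/0, 2:280/1010, 2:280/1011, 2:280/1012, 2:292/1, 2:292/2 and 2:292/4. It works because every parser understands every token in the list, so the predecessor is always available.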
We could have done the same for IP protocol flags. When a protocol
flag carries no host name, look at the flag that precedes it and take
it from there:
IBN:example.com,ITN:6023,IFC.
No need for a new class with only one flag. No need to sacrifice the
BBS name. Well, would have, could have...
The only problem is that it does not work. It does not work because
unlike in the example with the node numbers, the parser may not find
the predecessor. In the example above, it may not know the IBN flag
because it does not support that protocol. Had we agreed on a minimum
protocol for IP nodes that all must support, we would not have this
problem, but we did not and it is unlikely that we ever will. Would
have, could have... Exit the "inherit" method and back to square one.
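
The failure is easy to demonstrate. The sketch below is my own illustration, not code from any existing mailer: inherited_host() and its arguments are hypothetical, `known` stands for the space separated list of flags a given parser understands, and a parameter consisting of digits only is taken to be a port number.

```c
#include <ctype.h>
#include <string.h>

/* Is the flag name (len characters) in the space separated list? */
static int is_known(const char *known, const char *name, size_t len)
{
    const char *p = known;

    if (len == 0) return 0;
    while (*p) {
        size_t n = strcspn(p, " ");
        if (n == len && strncmp(p, name, len) == 0) return 1;
        p += n;
        while (*p == ' ') p++;
    }
    return 0;
}

static int all_digits(const char *s, size_t len)
{
    size_t i;

    if (len == 0) return 0;
    for (i = 0; i < len; i++)
        if (!isdigit((unsigned char)s[i])) return 0;
    return 1;
}

/* Find the host name for flag `want` in a comma separated flag list,
   inheriting it from the nearest preceding known flag that carries
   one. Flags the parser does not know are skipped, because it cannot
   tell what their parameters mean. Returns 1 and fills `host` on
   success, 0 otherwise. */
int inherited_host(const char *flags, const char *want,
                   const char *known, char *host, size_t hostsz)
{
    const char *p = flags;
    size_t wantlen = strlen(want);
    int have = 0;

    while (*p) {
        size_t toklen = strcspn(p, ",");
        size_t namelen = strcspn(p, ":,");

        if (is_known(known, p, namelen)) {
            if (namelen < toklen) {          /* flag has a parameter */
                const char *par = p + namelen + 1;
                size_t parlen = strcspn(par, ":,");
                if (parlen > 0 && parlen < hostsz &&
                    !all_digits(par, parlen)) {
                    memcpy(host, par, parlen);
                    host[parlen] = '\0';
                    have = 1;
                }
            }
            if (namelen == wantlen && strncmp(p, want, namelen) == 0)
                return have;
        }
        p += toklen;
        if (*p == ',') p++;
    }
    return 0;
}
```

With "IBN ITN IFC" as the known list, looking up ITN in the example above finds example.com. Take IBN out of the known list, as for a telnet-only mailer, and the same lookup comes back empty handed.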
So the system name field is still the best place for the host name.
And last but not least: what is the worst place for IP connect info?
That is the telephone number field. Why?
1) It creates a serious backward compatibility problem. No matter what
some claim, there are POTS/ISDN mailers out there that can NOT be
configured to never attempt to dial certain numbers. They can
translate numbers to add or change dial prefixes and such and dial
the translated number, but they can not be configured to not dial
at all. They are designed on the premise that anything other than
-Unpublished- in field six in the nodelist, is a valid dialable
telephone number. In the general case, there is really no way to
tell what will happen if a classic POTS mailer with a modem
attempts to dial a host name. If you are lucky it will do nothing.
If not, it may attempt to dial the letters a-d as the "digits" in
the fourth column of the DTMF tone table. Or it might attempt to
dial the letters a-z as translated into digits according to the
labelling on the buttons of a push button telephone. No way to
tell, but it may saddle someone with a hefty phone bill. A big
NONO in Fidonet. The telephone number field should contain the
string "-Unpublished-" or a valid dialable telephone number and
nothing else.
2) It creates a problem for dual capability nodes. You can not put a
host name AND a telephone number there.
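
To get a feel for the second scenario under point 1, the sketch below translates a string into the digits printed on the keys of a standard push button telephone (abc=2 ... wxyz=9). It is purely an illustration of what such a dialler might end up sending; keypad_digits() is a made-up name and I do not claim any particular mailer works this way.

```c
#include <ctype.h>
#include <stddef.h>

/* Translate the letters in `s` into the digits on the same telephone
   key. Digits are copied, everything else is dropped. The result is
   truncated to fit `out`. */
void keypad_digits(const char *s, char *out, size_t outsz)
{
    static const char map[] = "22233344455566677778889999"; /* a..z */
    size_t j = 0;

    if (outsz == 0) return;
    for (; *s && j + 1 < outsz; s++) {
        unsigned char c = (unsigned char)*s;
        int lc = tolower(c);

        if (isdigit(c))
            out[j++] = (char)c;
        else if (lc >= 'a' && lc <= 'z')
            out[j++] = map[lc - 'a'];
    }
    out[j] = '\0';
}
```

By this translation example.com becomes 3926753266, a string of digits that a modem will happily dial.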
Just for completeness let me mention the proposed method of listing
the IP info in the location field after the location separated by a
'#' character. That method was abandoned after a brief experimental
stage. Good riddance.
So here is the code I mentioned:
======= iptest.c =======
/* Test my IP detection routines. (c) 2008-2013 Michiel van der Vlist.
This code is released to the public under the GNU General
Public License. Feel free to use whatever is of use to you. */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#define boolean char
#define TRUE 1==1
#define FALSE !TRUE
#define NONE 0
#define PORTNR 1
#define IPQUAD 2
#define IPV6 3
#define HOSTNAME 4
#define EMAIL 5
int isip(char *s);
boolean isportnumber(char *s);
boolean isipquad(char *s);
boolean isipv6(char *s);
boolean ishostname(char *s);
boolean isemailaddress(char *s);
char sep=':';
int main(int argc, char *argv[])
{
int n;
char s[128];
if (argc>1) sep=argv[1][0];
printf("Test the IP detecting stuff.\n");
printf("By Michiel van der Vlist 2:280/5555\n");
printf("# or ^C to abort.\n");
while (TRUE)
{
printf("\nEnter a string: ");
/* bounded read so s[128] cannot overflow; also stop at end of input */
if (scanf("%127s",s)!=1) exit(0);
if (strcmp(s,"#")==0) exit(0);
n=isip(s);
switch (n)
{
case NONE: printf("%s is just a string.\n",s); break;
case PORTNR: printf("%s is a port number.\n",s); break;
case IPQUAD: printf("%s is an ip quad.\n",s); break;
case IPV6: printf("%s is an IPV6 address.\n",s); break;
case HOSTNAME: printf("%s is a host name.\n",s); break;
case EMAIL: printf("%s is an e-mail address.\n",s); break;
default: printf("%s does not parse.\n",s);
}
}
}
/* See if a string qualifies as a parameter for an IP flag */
int isip(char *s)
{
if (isportnumber(s)) return(PORTNR);
if (isipquad(s)) return(IPQUAD);
if (isipv6(s)) return(IPV6);
if (ishostname(s)) return(HOSTNAME);
if (isemailaddress(s)) return(EMAIL);
return(NONE);
}
/* See if a string qualifies as a port number
an up to five digit decimal number qualifies */
boolean isportnumber(char *s)
{
int i=0;
char c;
while ((c=s[i])!=0 && c!=':' && c!=',')
{
if (!isdigit(c) || i>4) return(FALSE);
i++;
}
return(i>0); /* an empty string is not a port number */
}
/* See if a string qualifies as an ip quad */
/* four numeric values in the range 0-255,0-255,0-255,0-255
separated by dots.
0.0.0.0 is invalid */
boolean isipquad(char *s)
{
int n1,n2,n3,n4,dots;
long unsigned int ipadr;
dots=sscanf(s,"%d.%d.%d.%d",&n1,&n2,&n3,&n4);
if (dots<4) return (FALSE); /* too few fields */
if (n1<0 || n2<0 || n3<0 || n4<0) return (FALSE);
if (n1>255 || n2>255 || n3>255 || n4>255) return (FALSE);
ipadr=(((unsigned long)n1)<<24)+(((unsigned long)n2)<<16)+
(((unsigned long)n3)<<8)+(unsigned long)n4;
if (ipadr==0) return(FALSE);
return(TRUE);
}
/* See if a string qualifies as an ipv6 address */
/* The separator character (colon in the IP world) is defined in the
global variable sep, default also a colon in this test program.
Eight hexadecimal values in the range 0-FFFF separated by a SEP
char. The string may be encased in square brackets and then the
separation character is always a colon. (':')
One or more successive zero values may once be represented by
two successive separators. F.e. 1234::6789:abcd expands to
1234:0:0:0:0:0:6789:abcd.
All zeros is invalid. */
boolean isipv6(char *s)
{
int i=0,j=0,n1,n2,n3,n4,n5,n6,n7,n8,seps=0,nrseps=0,doubleseppos=0;
char c,sp,t[40],tt[40];
char *b;
boolean encased=FALSE;
/* first see if it is encased */
if (s[i]=='[')
{
if (((b=strchr(s,']'))==NULL)) return(FALSE);
else
{
/* there must be one and only one closing bracket */
if(strchr(&b[1],']')!=NULL) return(FALSE);
encased=TRUE; i++; sp=':';
}
}
else
{
if (strchr(s,']')!=NULL) return(FALSE);
sp=sep;
}
/* first substitute the separator character by a comma, count them and
note the position of two successive separators on the fly */
while (s[i] && i<sizeof(t)-2 && (!encased || s[i]!=']'))
{
c=s[i];
if (c==sp)
{
c=',';
nrseps++;
if (s[i+1]==sp)
{
if (!doubleseppos) doubleseppos=j; else return FALSE;
}
}
i++;
t[j++]=c;
}
t[j]=0;
/* printf("%s %d %d\n",t,doubleseppos,nrseps); */
/* Now expand the string if there is a double separator present */
if (doubleseppos)
{
strcpy(tt,&t[doubleseppos+1]);
t[doubleseppos]=0;
for (i=0;i<8-nrseps;i++) strcat(t,",0");
strcat(t,tt);
}
seps=sscanf(t,"%x,%x,%x,%x,%x,%x,%x,%x",&n1,&n2,&n3,&n4,
&n5,&n6,&n7,&n8);
if (seps<8) return (FALSE); /* too few fields */
if (doubleseppos) printf("Expands as %s\n",t);
if ((n1+n2+n3+n4+n5+n6+n7+n8)==0) return (FALSE);
return (TRUE);
}
/* see if a string qualifies as a host name */
/* It must contain one or more dots. The dots may not be the first,
the last or the one but last character (no single letter TLDs).
Two dots may not immediately follow each other.
It must consist of the characters a..z, A..Z, 0..9, '.', '-'.
The first character must be a letter or a digit. It must contain
at least one letter */
boolean ishostname(char *s)
{
int i=1;
char c;
boolean dot=FALSE, letter=FALSE;
if (!isalnum(s[0])) return(FALSE);
while ((c=s[i])!=0 && c!=',' && c!=':')
{
if (isalpha(c)) letter=TRUE;
if (c=='.')
{
dot=TRUE;
if (s[i-1]=='.') return(FALSE);
}
if (!(isalnum(c) || c=='-' || c=='.')) return(FALSE);
i++;
}
/* i<2 guards the s[i-2] access for one character strings */
if (i<2 || s[i-1]=='.' || s[i-2]=='.') return(FALSE);
return(dot && letter);
}
/* see if a string qualifies as an email address */
/* It must contain exactly one at sign ('@') character that may not be
--- Azure/NewsPrep 3.0
* Origin: Home of the Fidonews (2:2/2.0)