
Re: doc/Etiquette



On Mon, 14 Dec 1998, Tomas Edwardsson wrote:
> I believe the following is no longer true?
> 4) {}|[]\

It is still true. As usual, though, what is _not_ said matters more than
what is: contrary to what somebody else said, this section does not cover
the case translation between these characters. The IRC RFC, incomplete as
it currently is, does cover it. IRC follows the RFC here, and for backwards
compatibility, imho, it should. The original reason for this particular
case is that IRC was invented and first used in Finland, where technical
limitations required the SF7 character set. That has not been true for
years, so beyond backwards compatibility there's no technical reason to
keep it.
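
To make the case translation concrete, here is a minimal sketch of
comparing two nicks the way the RFC describes, with {}| treated as the
lower-case forms of []\. The names irc_lower and same_nick are made up for
illustration and are not taken from any ircd or client.

```python
# Fold table, assuming the RFC's rule that {}| are the lower-case
# equivalents of []\ on top of the ordinary ASCII A-Z -> a-z mapping.
_RFC_FOLD = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ[]\\",
    "abcdefghijklmnopqrstuvwxyz{}|",
)


def irc_lower(name: str) -> str:
    """Return the name folded to lower case under the RFC-style mapping."""
    return name.translate(_RFC_FOLD)


def same_nick(a: str, b: str) -> bool:
    """True when two nicks are equal after RFC-style case folding."""
    return irc_lower(a) == irc_lower(b)
```

With this folding, same_nick("[nick]", "{NICK}") holds, which is exactly
the equivalence the section under discussion depends on.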

However, when this came up with another codebase, it turned out that IRC
client authors either haven't read the damn RFC or have considered this
too insignificant to implement. According to these reports, [nick] and
{nick} open separate chat windows, and so on. More embarrassing still,
some authors of services bots have "forgotten" this peculiarity of the IRC
protocol, even though it's written clearly in a central place in the RFC.
When this was discovered on DALnet (at the time the third-largest IRC
network, and still one of the servers of choice for independent networks),
the Services branch summarily decided it was "an ircd bug" and had it
changed in the server. This naturally resulted in the non-upgraded servers
crashing, causing total downtime...

My point, though, is that at large scale you can't count on it anymore:
the "backwards compatibility" has already been broken, if anybody even
cares about it anymore. There's no technical reason to keep it (i.e. mobs
of angry Finns won't come to kill you if you remove it...), but it _is_
one of the most clearly articulated points of the IRC protocol, and it's
not harmful unless client/service authors forget it. Now, of course, it's
tricky, because well-behaved clients/services need to be prepared for both
cases, using whatever trickery is needed to find out which one applies.
And this situation isn't going to disappear soon.
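
As a rough sketch of "being prepared for both cases", a client could keep
its chat windows keyed by a folded nick and choose the folding per server.
WindowTable and the server_folds_brackets flag are hypothetical; how the
client actually learns which behaviour a server has is left open here,
just as it is above.

```python
import string

# Plain ASCII folding: only A-Z are lowered.
ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)
# RFC-style folding: additionally treats []\ as upper-case forms of {}|.
RFC_FOLD = str.maketrans(
    string.ascii_uppercase + "[]\\",
    string.ascii_lowercase + "{}|",
)


class WindowTable:
    """Hypothetical client-side table of chat windows keyed by folded nick."""

    def __init__(self, server_folds_brackets: bool):
        # The caller decides which mapping this server uses.
        self._fold = RFC_FOLD if server_folds_brackets else ASCII_FOLD
        self._windows = {}

    def window_for(self, nick: str) -> str:
        """Return the window for a nick, creating it on first use."""
        key = nick.translate(self._fold)
        return self._windows.setdefault(key, "window:" + key)
```

On a server that follows the RFC, [nick] and {nick} then land in the same
window; on one that had the mapping removed, they stay separate.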

>    There are a lot of people from Japan as well, who use Kanji characters
> which may look quite exotic as well. As I don't know Kanji I don't
> even try to explain any of the characters.

IRC _should_ use UTF-8 encoded Unicode (in the client-to-client protocol;
it doesn't really affect the server protocol). This is of course up to
client authors, but using anything else is plain foolish. In fact, UTF-8
is now required if the IRC protocol ever wishes to enter the standards
track. Perhaps because of this it's also in use in Microsoft's IRCX, and I
expect future clients - hopefully - to start using it.

You should check the standards, but the short story is that for Americans
it doesn't really change anything, since UTF-8 is completely compatible
with 7-bit ASCII. This is also why it doesn't affect the server protocol.
The eighth bit signals that the following bytes carry further bits of the
Unicode character code; there are check bits, so most of the time you can
tell Latin-1 from UTF-8 without using CTCP signals. I'm still hoping
clients will voluntarily and quickly switch over to UTF-8 so we can solve
the character set problem well into the future... IRC is an international
venue if anything, and we can't really afford to go to a full-blown 16- or
32-bit protocol, for various reasons.
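
The "tell Latin-1 from UTF-8 most of the time" part can be sketched as a
strict UTF-8 decode with a Latin-1 fallback. decode_irc_text is a made-up
helper name, and this is only one possible heuristic, not something any
client is known to do.

```python
def decode_irc_text(raw: bytes) -> str:
    """Decode message text as UTF-8 if it is well formed, else as Latin-1."""
    try:
        # Strict decoding rejects bytes whose high-bit patterns do not form
        # valid UTF-8 sequences, which is usually the case for Latin-1 text.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this cannot fail.
        return raw.decode("latin-1")
```

Pure ASCII comes out the same either way. The guess can still be wrong
when a Latin-1 message happens to form valid UTF-8 sequences, hence "most
of the time".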

 -Donwulff