Sanskrit coding, cont. (64)

Willard McCarty (MCCARTY@VM.EPAS.UTORONTO.CA)
Tue, 25 Apr 89 22:50:23 EDT


Humanist Mailing List, Vol. 2, No. 895. Tuesday, 25 Apr 1989.

Date: Mon, 24 Apr 89 23:15:35 EDT
From: ipl cms <BOISVERT@vm.epas.utoronto.ca>
Subject: Re: Sanskrit coding, cont. (112)

In reply to Jamie's query on how libraries usually deal with the ASCII
character set when it comes to Indian languages, I would like to
remind him of a message from Dominik (I think...) in which he mentioned
a standard character set, the American National Standard for
Information Sciences: Extended Latin Alphabet Coded Character Set for
Bibliographic Use. I just got a copy of this yesterday; it seems
quite comprehensive and does not jeopardize the use of French, German,
Chinese, ... (almost every language one [or should I say "I"] can think of).

This scheme is the standard coding system used by all the libraries in
North America (I don't know whether European libraries have adopted it
as well), and it was designed for both an 8-bit and a 7-bit coding
scheme.

No matter what scheme is adopted for this large project, many people will
be dissatisfied, for it won't be compatible with their own. What we are
aiming for, however, is not to create "dukkha", but to build up the most
convenient coding system. Your own personal texts (mine included!) will
need to be altered if we want to adopt the eventual scheme, yet this can
easily be done by running our texts through a transformation filter
(which can easily be written in SNOBOL4+).
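
To make the idea of a transformation filter concrete, here is a minimal
sketch, written in Python rather than SNOBOL4+ purely for illustration.
The mapping table is invented: its codes come neither from the ANSI
standard nor from anyone's actual personal scheme, and the names
PERSONAL_TO_COMMON and convert are hypothetical. The point is only that
such a filter is a table lookup applied from left to right.

```python
# Hypothetical table: personal two-character codes -> codes of the
# (not yet decided) common scheme. All entries are invented examples.
PERSONAL_TO_COMMON = {
    "aa": "A", "ii": "I", "uu": "U",   # long vowels
    ".r": "R",                         # vocalic r
    ".t": "T", ".d": "D", ".n": "N",   # retroflex consonants
    ".s": "S", ".m": "M", ".h": "H",   # retroflex s, anusvara, visarga
}


def convert(text, table):
    """Rewrite text from left to right, replacing every code found in table.

    Longer codes are tried first, so a table that mixes code lengths
    still resolves each position to the longest matching code.
    """
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No code starts here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(convert("sa.msk.rta", PERSONAL_TO_COMMON))
    print(convert("nirvaa.na", PERSONAL_TO_COMMON))
```

Each of us would supply only the table for his or her own scheme; the
scanning routine stays the same.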

Another question that comes to mind is whether to use an 8-bit or a 7-bit
coding scheme. It seems that 8-bit would be the most convenient
for entering data, yet for sending data through E-mail, 7-bit
is required. We could use an 8-bit system and have a special filter
that converts the text into 7-bit for electronic exchange. Once
the scheme is elaborated, I will write this filter (i.e. one that
transforms the text from 8-bit to 7-bit and vice versa) and make
it available to whoever might find it useful. (SNOBOL4+ also has a
public domain version called SNORUN; we cannot write programs with
SNORUN, but it is possible to run them. The filter I am talking about
could be run with SNORUN.)
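
Since the scheme itself is not yet fixed, the following is only a sketch
of how an 8-bit to 7-bit filter and its inverse might work, again in
Python for illustration rather than SNOBOL4+. The escape convention
(a backslash followed by two hexadecimal digits) and the names to_7bit
and to_8bit are assumptions of mine, not part of the ANSI standard or of
any agreed scheme.

```python
ESC = ord("\\")  # assumed escape character; any 7-bit character would do


def to_7bit(data: bytes) -> bytes:
    """Replace each byte above 0x7F, and the escape itself, by \\XX (hex)."""
    out = bytearray()
    for b in data:
        if b >= 0x80 or b == ESC:
            out += b"\\%02X" % b
        else:
            out.append(b)
    return bytes(out)


def to_8bit(data: bytes) -> bytes:
    """Invert to_7bit. Input not produced by to_7bit may raise ValueError."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ESC:
            # The two characters after the escape are hex digits.
            out.append(int(data[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


if __name__ == "__main__":
    sample = bytes([0x61, 0xE1, 0x5C, 0x7A])  # 'a', one 8-bit code, '\', 'z'
    mailed = to_7bit(sample)
    print(mailed)
    print(to_8bit(mailed) == sample)
```

Escaping the backslash itself is what makes the conversion reversible:
a text that already contains a backslash comes back unchanged after the
round trip.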

I think we should seriously consider adapting the ANSI scheme (American
National Standard for Information Sciences), since it has been carefully
elaborated. Yet no Sanskritists (or not many) have heard of it.
What do you think?

I am leaving Toronto early on Thursday morning to go to Massachusetts
where I will be teaching a Pali language course. Unfortunately, it
seems I won't have access to E-Mail there. So if you wish to contact
me during the summer, please do so at the following address:
Mathieu Boisvert
c/o Diana Allen
16 Main Street
Shelburne Falls
MA. 01370 U.S.A.
(413) 625-2546

Thanks, Mathieu