Hi Steve:

Thanks for your hacking and input on Ch-series. It is an unfortunate fact that none of the native Chinese students who are studying CS have contributed 5% as much as you have. It is not surprising to encounter bugs in Ch-series along with some raw implementation. Such a package will actually need some two years' time to become sound. We Ph.D.'s in math and applied math (as I actually am) are really good designers. Programming is only a way to illustrate our ideas; Ch-series is one such example, and the programs I designed to solve mixed-boundary value problems in my dissertation are another.

The man pages you wrote are good, although some repeated sentences and site-dependent statements could be removed.

Professor X made some good points, but he never joined SCC and he'd never understand the motivations behind ch-series. X/X11 windows could be an addition to this package, but the package should never be based on them. ANSI mode is one of the new ideas in there. I agree with him on a standard file name extension for chtex, and I had given some thought to it. The reason I did not pick .tex as the output extension is that if you happen to have an input file with that extension, things are screwed up. On the other hand, .chtex could not be used as the input extension because DOS does not like it.

He said the error message without a trailing newline was a simple coding error. This is baseless. UNIX is simple, and most of the UNIX applications exit without a newline. I actually did sort through these error messages to add a \n, but some were missed. If ch-series were designed for UNIX only, then many things are done in a stupid way: lex is not used at all, there is no man page, the flags are not standard, and stdin < is not used (but stdout > is). But most of ch-series was developed on VMS.
You can have the flags in any position, such as

    %chtext -p test.tex test.PIN
    %chtext test.tex > test.PIN -p
    %chtext test.tex test.PIN -p

and you can have multiple flags in many ways, such as

    %chtext -p -s test.tex test.pin
    %chtext -ps test.tex test.pin
    %chtext -p test.tex test.pin -s

or ...

UNIX is simple-minded itself, without good standards in many things. This fact does not seem to be too familiar to Prof. X.

One good point of Prof. X's concerns the link of chfont*.tfm to a single file, which may not be easily generalized to other systems. But a chfont0.tfm is less than 2k.

Your colleagues (at least two of them) have said that ChTeX is not a true TeX preprocessor because it needs a special dvi2ps and it does not take Metafont. I pointed out before that TeX without Metafont can still be called TeX. If I tell you that ChTeX will not be dependent on a special dvi2ps or PS, I am prematurely releasing the coding secret. But definitely, that goal is to be met. Non-TeXperts, however, do not realize the hidden difficulties. Chinese has at least 7000 commonly used characters PER SET, but English only 127. Plain TeX uses 16 sets of basic fonts for 3 sizes, and LaTeX has over 50 for various sizes. First of all, it will take more effort than an individual can afford to generate all these sets of Chinese fonts, 7000 characters in each. Secondly, even if you have them, they can't be used without intelligent font screening. Memory in TeX and in PS printers is limited, and disk space in computers can be exhausted.

More to come later.

JB

p.s.: New posted today.
__________________________
Newsgroups: soc.culture.china,comp.text
Subject: Bug fixes in ChTeX
Expires:
References:
Sender:
Reply-To: jbw@unix.cis.pittsburgh.edu (Jingbai Wang)
Followup-To:
Distribution:
Organization: Univ. of Pittsburgh, Comp & Info Sys
Keywords: Merry Xmas and happy new year!
As the date gets nearer and nearer when the jbw accounts on the University of Pittsburgh UNIX and VMS machines vanish and I move to another location in the USA, I'd like to send out some more messages on the Ch-series package that many of you have already tried out. Since it is only a Beta version, there can be many bugs.

Bugs:
*****
chtext.c (Chinese inputter, recent versions) has a bug that can cause a core dump when using CiZu input. This is caused by an extra save_cizu() right below dePinYin(...) in main(...). Just remove that line and everything will be fine. This was found by wwang@... This bug did not exist in earlier versions; it appeared when I moved some global variables into functions after I received many complaints from Turbo C 2.0.

Comments:
*********
Among other things, Prof. Simpson of PSU has written up some quite useful man pages that I will incorporate into the Beta release, and people are paying too much attention to Chinese TeX, which is but a small function of chtex.c. Actually chtex -w (WStroff) is much more useful for users who do not need math and fancy tables or formats. It can serve as a plain Chinese->PS or ASCII->PS driver or as a formatter. In fact, all the documents in the release are formatted with it.

As far as ChTeX itself is concerned, it is only a minor feature of Ch-series. Once I get more time, I will make it independent of PostScript and of the special version of dvi2ps, as I pointed out to Lee at hawaii some time back. If macros are built, LaTeX can ride on it comfortably just like TeX.

Since Ch-series was not designed for UNIX only, little advantage is taken of UNIX. Graphics mode on PC and X/X11 should be supported, but the question is: is VGA going out soon? is X11 popular enough? am I a programmer or a designer? As I pointed out before, Ch-series is just designed to illustrate my new ideas in Chinese input and output, and they are believed to be the best ones existing today.
The implementations however can be much improved if there is a group of commercial kind or academic nature to cooperate with me. I will keep you informed after I settle down in the new location which is some 400 miles away from Pittsburgh. JB Wang From simpson@boole.math.psu.edu Wed Dec 20 09:57:24 1989 Received: from psuvax1.cs.psu.edu by unix.cis.pitt.edu (5.61/6.41) id AA07213; Wed, 20 Dec 89 09:57:21 -0500 Received: from boole.math.psu.edu by psuvax1.cs.psu.edu with SMTP (5.61/PSUCS-1.0) id AA15765; Wed, 20 Dec 89 09:55:46 -0500 Received: by boole.math.psu.edu (4.1/Psu2.1) id AA02476; Wed, 20 Dec 89 09:58:41 EST Date: Wed, 20 Dec 89 09:58:41 EST From: simpson@boole.math.psu.edu (Stephen G. Simpson) Message-Id: <8912201458.AA02476@boole.math.psu.edu> To: jbw@unix.cis.pitt.edu Subject: cctex man page Status: RO .TH CCTEX 1 12/19/89 .SH NAME cctex \- render a Chinese TeX file into Postscript .SH SYNOPSIS .I cctex foo.tex where .I foo.tex is a Chinese TeX file. .SH DESCRIPTION The purpose of .B cctex is to render a Chinese TeX file into Postscript for printing. The appropriate command is .I cctex foo.tex where .I foo.tex is the name of your Chinese TeX file. This filename must end in .I .tex. The output is another file called .I foo.ps which can be printed on a Postscript printer, using a command such as .I lpr foo.ps. .PP The .B cctex program is nothing more than a shell script which automates steps (2) through (5) of the procedure described in the .B chtex man page. .SH "TEX FILE FORMAT" The input file .I foo.tex is just like an ordinary .I .tex file, except that .I foo.tex may contain Chinese characters. The Chinese characters must be in CCDOS format. This means that each character is represented by a code consisting of two consecutive 8-bit bytes, and each of these two bytes has its high bit set to 1. The two-byte character codes are in accordance with the Guo2-Biao1 standard. 
Except for the presence of Chinese character codes, .I foo.tex looks just like an ordinary .I .tex file. In particular, .I foo.tex may invoke any of the usual TeX commands and macros. Simple examples of Chinese TeX files can be found with the .B chtex documentation (see below). .PP Except for the Chinese character codes, all of the bytes in .I foo.tex are in the standard ASCII range, from 1 to 127. Such a byte has its high bit set to 0. Thus .B cctex will be able to distinguish Chinese character codes from the other bytes in .I foo.tex. .PP In order to create .I foo.tex, it may be convenient to use a plain-vanilla CCDOS editor. Your editor should be able to produce a mixture of Chinese character codes and standard ASCII. For example, on a PC you can use Byx or Chinese Wordstar in "non-document" mode. On a Macintosh, analogous editors are available. Here on the Penn State Math Department Suns, you can use J. B. Wang's interesting program, .B chtext (not to be confused with .B chtex). Documentation for .B chtext can be found with the .B chtex documentation (see below). .SH DOCUMENTATION The .B cctex script calls several major programs, including .B chtex, tex, and .B dvi2ps. These programs have their own documentation, beginning with the appropriate man pages. Most directly relevant are the .B chtex man page and the documentation files in .I /home/boole/simpson/doc/chtex. .PP In order to use the Chinese TeX system as described here, you must have an account on the Penn State Mathematics Department Computer Network, math.psu.edu. If you have any questions, try reading the documentation. If that doesn't help, try sending mail to simpson@math.psu.edu. .SH AUTHORS The .B cctex program is a simple shell script which automates the use of .B chtex and .B dvi2ps to render Chinese TeX files into Postscript. The .B chtex program and the idea of Chinese TeX are due to J. B. Wang, jwang@pittvms.bitnet.
The .B dvi2ps program is a public domain dvi-to-Postscript converter, specially hacked by J. B. Wang to work with .B chtex. The .B cctex script is by S. G. Simpson, simpson@math.psu.edu. .SH BUGS The .B chtex program, which is the heart of .B cctex, is still under development and fairly buggy. If .B chtex runs into trouble, try hitting . From simpson@boole.math.psu.edu Wed Dec 20 09:57:50 1989 Received: from psuvax1.cs.psu.edu by unix.cis.pitt.edu (5.61/6.41) id AA07232; Wed, 20 Dec 89 09:57:48 -0500 Received: from boole.math.psu.edu by psuvax1.cs.psu.edu with SMTP (5.61/PSUCS-1.0) id AA15781; Wed, 20 Dec 89 09:56:11 -0500 Received: by boole.math.psu.edu (4.1/Psu2.1) id AA02480; Wed, 20 Dec 89 09:59:06 EST Date: Wed, 20 Dec 89 09:59:06 EST From: simpson@boole.math.psu.edu (Stephen G. Simpson) Message-Id: <8912201459.AA02480@boole.math.psu.edu> To: jbw@unix.cis.pitt.edu Subject: chtex man page Status: RO .TH CHTEX 1 12/16/89 .SH NAME chtex \- part of the Chinese TeX system .SH SYNOPSIS .B chtex \- a preprocessor for TeX documents containing Chinese .SH DESCRIPTION The .B chtex program is part of Chinese TeX, a system for printing TeX documents that contain Chinese characters. Chinese TeX is an extension of the pre-existing TeX system. In addition to TeX and the .B chtex program, Chinese TeX requires some additional software and a Postscript printer. Chinese TeX is available here on the Computer Network of the Mathematics Department at the Pennsylvania State University. .PP Apart from Chinese TeX, the .B chtex program can also be used as a preprocessor for Scribe and WStroff documents which contain Chinese characters. Those applications are not described in this man page. They are described in the documentation file .I chtex.doc which can be found elsewhere (see below). .PP In this man page, we confine ourselves to a brief outline of how to use .B chtex to print Chinese TeX documents. 
.SH USAGE Here is a 6-step outline of how to create and print a Chinese TeX document. Steps (2) through (5) have been automated in a shell script called .B cctex. For more information about .B cctex, see the appropriate man page. .PP (1) The first step is to create a Chinese TeX file called, for example, .I foo.tex. The file .I foo.tex is just like an ordinary .I .tex file, except that .I foo.tex may contain Chinese characters. .PP The Chinese characters in .I foo.tex must be in CCDOS format. This means that each character is represented by a code consisting of two consecutive 8-bit bytes, and each of these two bytes has its high bit set to 1. The two-byte character codes are in accordance with the Guo2-Biao1 standard. Except for the presence of Chinese character codes, .I foo.tex looks just like an ordinary .I .tex file. In particular, .I foo.tex may invoke any of the usual TeX commands and macros. Simple examples of Chinese TeX files can be found with the other .B chtex documentation (see below). .PP Except for the Chinese character codes, all of the bytes in .I foo.tex are in the standard ASCII range, from 1 to 127. Such a byte has its high bit set to 0. Thus .B chtex will be able to distinguish Chinese character codes from the other bytes in .I foo.tex. .PP In order to create .I foo.tex, it is probably convenient to use a plain-vanilla CCDOS editor. Your editor should be able to produce a mixture of Chinese character codes and standard ASCII. For example, on a PC you can use Byx or Chinese Wordstar in "non-document" mode. On a Macintosh, analogous editors are available. Here on the Math Department Suns, you can use an interesting program called .B chtext (not to be confused with .B chtex). Documentation for .B chtext can be found with the .B chtex documentation (see below). .PP (2) Once you have created .I foo.tex, it should take only a few minutes to render it into Postscript for printing. First, you must process .I foo.tex using .B chtex.
The appropriate command is .I chtex foo.tex and this will create two output files, .I foo.hdr and .I foo.chtex. Both of these files will be needed later. .PP (3) The next step is to process .I foo.chtex using the pre-existing TeX system. For example, you can issue the command .I tex foo.chtex. If successful, your .B tex command will produce another output file called .I foo.dvi. (For more information, see the .B tex man page.) .PP (4) The next step is to process .I foo.dvi using J. B. Wang's specially hacked version of .B dvi2ps, the public domain dvi-to-Postscript converter. The appropriate command is .I dvi2ps foo.dvi > foo.lps and of course this produces yet another output file, .I foo.lps. .PP (5) The next step is to combine .I foo.hdr and .I foo.lps into one big output file. The appropriate command is .I cat foo.hdr foo.lps > foo.ps and the output of this step is a complete Postscript file, .I foo.ps. .PP (6) The last step is to print the Postscript file on a Postscript printer. The appropriate command might be .I lpr foo.ps after which, finally, you can go to the printer and watch your finished document emerge. Good luck! .SH OPTIONS There are a number of options for .B chtex that are not covered in this man page. For further information, see the documentation files. .SH LIBRARIES Currently the Chinese TeX system is not really installed for general use. Instead, Stephen G. Simpson has installed it in his home directory, /home/boole/simpson. This implies that, before you can use the system, you will have to set certain environment variables appropriately so that your shell will be able to find .B chtex and .B dvi2ps, and so that .B chtex, tex, and .B dvi2ps will be able to find the library files that they need. .PP First, put Simpson's personal directory for executables into your path. 
The appropriate command might be .I set path=(/home/boole/simpson/bin/sun3 $path) or .I set path=(/home/boole/simpson/bin/sun4 $path) depending on whether you are logged in to a Sun 3 or a Sun 4. .PP In addition, you must put Simpson's personal TeX font directory into your TEXFONTS path. (TEXFONTS is the path of directories where .B tex looks for fonts. See the man page for .B tex.) The appropriate command might be .I " setenv TEXFONTS " .I " .:/home/boole/simpson/lib/tex/fonts:/usr/local/lib/tex/fonts" which is supposed to be all on one line. Finally, you should put Simpson's personal TeX macro directory into your TeX macro path. The appropriate command might be .I " setenv TEXINPUTS " .I " .:/home/boole/simpson/lib/tex/macros:/usr/local/lib/tex/macros" and again this is all supposed to go on one line. .SH DOCUMENTATION There is documentation for .B chtex and .B dvi2ps in .I /home/boole/simpson/doc. The source code, along with other useful information, is in .I /home/boole/simpson/src. The latest version of the .B chtex man page will be maintained as .I chtex.1 in .I /home/boole/simpson/man/man1 and .I /home/boole/simpson/man/cat1. There is also a man page for .B dvi2ps in the same directories. .PP In order to use the Chinese TeX system as described here, you must have an account on the Penn State Math Department Computer Network. If you have any questions, try reading the documentation. If that doesn't help, try sending mail to simpson@boole.math.psu.edu or to the author of .B chtex, J. B. Wang. .SH AUTHOR J. B. Wang .PP jwang@pittvms.bitnet .PP Dr. Wang would appreciate feedback from users. .SH BUGS This software is still under development and fairly buggy. If .B chtex gets into trouble, try hitting .
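Editorial sketch: step (1) of the USAGE section above describes the CCDOS convention, in which both bytes of a Chinese character code have the high bit set and every other byte is standard ASCII. The Python sketch below shows how a reader of such a file could separate the two. It is not taken from chtex. The function name `split_ccdos` is hypothetical, and the sample assumes that the Guo2-Biao1 codes in a CCDOS file match what Python's `gb2312` codec produces.

```python
def split_ccdos(data: bytes):
    """Split CCDOS bytes into ("ascii", b) and ("chinese", bb) tokens.

    Assumption: a byte with its high bit set always starts a two-byte
    Chinese character code whose second byte also has the high bit set.
    """
    tokens = []
    i = 0
    while i < len(data):
        if data[i] & 0x80:
            # High bit set: this byte and the next form one character code.
            pair = data[i:i + 2]
            if len(pair) < 2 or not pair[1] & 0x80:
                raise ValueError(f"malformed two-byte code at offset {i}")
            tokens.append(("chinese", pair))
            i += 2
        else:
            # High bit clear: an ordinary ASCII byte.
            tokens.append(("ascii", data[i:i + 1]))
            i += 1
    return tokens


# Sample input: one ASCII letter followed by two Chinese characters.
sample = b"a" + "汉字".encode("gb2312")
for kind, raw in split_ccdos(sample):
    print(kind, raw.hex())
```

A tool that clears the high bits turns each two-byte code into two unrelated ASCII bytes, so a check of this kind is also a quick way to see whether a file has been damaged in transit.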
From wwang@CS.BU.EDU Wed Dec 20 21:57:04 1989
Received: from BU-IT.BU.EDU by unix.cis.pitt.edu (5.61/6.41) id AA24431; Wed, 20 Dec 89 21:56:59 -0500
Received: from BUCSE.BU.EDU by bu-it.BU.EDU (5.58/4.7) id AA29056; Wed, 20 Dec 89 21:56:39 EST
Received: by bucse.bu.edu (5.31/4.7) id AA21649; Wed, 20 Dec 89 22:00:07 EST
Date: Wed, 20 Dec 89 22:00:07 EST
From: wwang@CS.BU.EDU
Message-Id: <8912210300.AA21649@bucse.bu.edu>
To: jbw@unix.cis.pitt.edu, jwang@pittvms.bitnet
Subject: a bug fix for chtext.c
Status: RO

Hello,

I have successfully loaded chtex.tarz into my account on a sun/280. It has been working well except for a bug (maybe known to you already) in chtext.c, in the part that handles translation of a pinyin file into Chinese. The 'save_cizu();' following 'dePinYin( **** );' seems redundant because 'dePinYin(***)' has done the save_cizu already. Also, with a null argument to save_cizu(), it causes a segmentation fault on my machine (maybe yours too). After I comment it out, it works fine. The following diff file is obtained by 'diff old.chext.c chtext.c':

    1878c1878,1881
    <       save_cizu();
    ---
    > /* The following line is deleted by Weiguo Wang (wwang@cs.bu.edu) 1989.12
    >    It causes segmentfault when in translation mode and a new Ci2 Zu3 is
    >    added. */
    > /*    save_cizu(); */

Also, I unpacked 'dvi2ps.tar'. Although the Chinese characters were printed fine using latex, the English letters were messed up. I guess it was due to mis-scaling or something like that. I am not quite sure. Since I don't have the privilege to move your 'tfm' files to /usr/local/lib/tex/fonts, I put them in my private dir and put the path into the environment variable 'TEXFONTS'. LaTeX didn't complain, though it would if I hadn't set up TEXFONTS right. I didn't use your LaTeX Style/ files, but I think your dvi2ps should work with normal styles like [12pt]{article}.
Just a little suggestion: in the next revision of the '*.doc', it would be of great help to folks like me if you gave instructions for installing the necessary files when one doesn't have all the privileges for the system directories.

One question: how can one change the size of Chinese characters when using LaTeX? I tried {\large \Zw da4 han4-zhi4}, but it didn't work.

So long, have a nice holiday!

-Weiguo

From westc!jhonig@relay.EU.net Wed Dec 20 22:09:16 1989
Received: from mcsun.eu.net by unix.cis.pitt.edu (5.61/6.41) id AA24663; Wed, 20 Dec 89 22:09:10 -0500
Received: by mcsun.EU.net with SMTP; Thu, 21 Dec 89 04:08:57 +0100 (MET)
Received: from westc by hp4nl.nluug.nl with UUCP via EUnet id AA24218 (5.58.1.14/2.14); Thu, 21 Dec 89 04:09:43 +0100
Received: by westc.UUCP; Thu, 21 Dec 89 00:43:43 +0100 (MET)
Date: Thu, 21 Dec 89 00:43:43 +0100
From: westc!jhonig@relay.EU.net (Job Honig)
Organisation: West Consulting bv Postbox 3318 2601 DH Delft, The Netherlands Phone: +31 15 123190 Fax: +31 15 147889
Message-Id: <8912202343.AA08048@westc.UUCP>
To: jbw@unix.cis.pitt.edu
Subject: ASCII->PS
Status: RO

LS,

Having read about your ascii2ps filter in the news, I would like to tell you that I am interested. Could you tell me how to obtain it?

Job Honig
West Consulting BV
Phoenixstraat 49
Delft
The NETHERLANDS

From Z.Wang@CS.UCL.AC.UK Fri Dec 22 06:13:25 1989
Received: from relay.cs.net by unix.cis.pitt.edu (5.61/6.41) id AA29313; Fri, 22 Dec 89 06:13:22 -0500
Message-Id: <8912221113.AA29313@unix.cis.pitt.edu>
Received: from cs.ucl.ac.uk by RELAY.CS.NET id aa20392; 22 Dec 89 5:12 EST
Received: from pyr1.cs.ucl.ac.uk by vs6.Cs.Ucl.AC.UK via Ethernet with SMTP id aa02804; 22 Dec 89 11:04 WET
To: jbw%unix.cis.pitt.edu@RELAY.CS.NET
Subject: ChTex
Date: Fri, 22 Dec 89 11:01:16 +0000
From: Zheng Wang (Ext: 3701)
Status: RO

Hi

I am very interested in your posting about ChTex. I am afraid that I missed your previous postings.
Could you please tell me where I should ftp the source and get relevant documents, if there are any?

Thank you very much

Zheng Wang

From simpson@boole.math.psu.edu Sun Dec 24 18:12:25 1989
Received: from psuvax1.cs.psu.edu by unix.cis.pitt.edu (5.61/6.41) id AA00305; Sun, 24 Dec 89 18:12:23 -0500
Received: from boole.math.psu.edu by psuvax1.cs.psu.edu with SMTP (5.61/PSUCS-1.0) id AA17524; Sun, 24 Dec 89 18:10:39 -0500
Received: by boole.math.psu.edu (4.1/Psu2.1) id AA05117; Sun, 24 Dec 89 18:13:41 EST
Date: Sun, 24 Dec 89 18:13:41 EST
From: simpson@boole.math.psu.edu (Stephen G. Simpson)
Message-Id: <8912242313.AA05117@boole.math.psu.edu>
To: jbw@unix.cis.pitt.edu
Subject: chtext coredumps
Status: RO

Today I was playing around with chtext some more and found that it won't work. Whenever I try to process a pinyin file using a command like chtext test.pin test, chtext goes through the character selection process and then aborts with the message "segmentation fault, core dumped." This happened on a Sun 3 and on a Sun 4. The file "test" is not created or is written with 0 bytes. This happened a few times before, but today it is happening all the time. Also, the dictionary doesn't seem to work today; combinations like wo3-men2 are not recognized. Any idea what might be causing these problems?

From simpson@boole.math.psu.edu Sun Dec 24 21:26:59 1989
Received: from psuvax1.cs.psu.edu by unix.cis.pitt.edu (5.61/6.41) id AA01476; Sun, 24 Dec 89 21:26:55 -0500
Received: from boole.math.psu.edu by psuvax1.cs.psu.edu with SMTP (5.61/PSUCS-1.0) id AA22373; Sun, 24 Dec 89 21:25:11 -0500
Received: by boole.math.psu.edu (4.1/Psu2.1) id AA05537; Sun, 24 Dec 89 21:28:13 EST
Date: Sun, 24 Dec 89 21:28:13 EST
From: simpson@boole.math.psu.edu (Stephen G. Simpson)
Message-Id: <8912250228.AA05537@boole.math.psu.edu>
To: jbw@unix.cis.pitt.edu
Subject: CHTEXTDIC environment variable
Status: RO

I think I found what was causing the coredumps earlier today.
I had the CHTEXTDIC variable set incorrectly to a complete pathname+filename. It is supposed to be only a pathname. I suggest you improve the code so that this error will not cause a coredump. Instead, an appropriate error message should be sent to the console when chtext first loads. Another suggestion: provide an environment variable so that the user can point to his own personal dictionary even if it is named something other than Ci3Dian3.chi. My mistake was because I thought CHTEXTDIC was that variable. From simpson@euler.math.psu.edu Mon Dec 25 00:12:21 1989 Received: from psuvax1.cs.psu.edu by unix.cis.pitt.edu (5.61/6.41) id AA02259; Mon, 25 Dec 89 00:12:18 -0500 Received: from euler.math.psu.edu by psuvax1.cs.psu.edu with SMTP (5.61/PSUCS-1.0) id AA25810; Mon, 25 Dec 89 00:10:32 -0500 Received: by euler.math.psu.edu (4.1/Psu2.1) id AA10394; Mon, 25 Dec 89 00:12:04 EST Date: Mon, 25 Dec 89 00:12:04 EST From: simpson@euler.math.psu.edu (Stephen G. Simpson) Message-Id: <8912250512.AA10394@euler.math.psu.edu> To: jbw@unix.cis.pitt.edu Subject: bug in chtext Status: RO When using chtext in interactive batch mode to translate a pinyin file into a CCDOS file, I notice one minor problem. After the character selection process is finished, the user is asked whether he wants to view his file. A notation [Y] appears, indicating I think that the default answer is "yes". However, when the user does a carriage return, nothing happens. It is then necessary to exit by Ctrl-C. From Z.Wang@CS.UCL.AC.UK Fri Dec 22 13:30:39 1989 Received: from relay.cs.net by unix.cis.pitt.edu (5.61/6.41) id AA07817; Fri, 22 Dec 89 13:30:36 -0500 Message-Id: <8912221830.AA07817@unix.cis.pitt.edu> Received: from cs.ucl.ac.uk by RELAY.CS.NET id aa25795; 22 Dec 89 12:19 EST To: Jingbai Wang Subject: Re: ChTex In-Reply-To: Your message of Fri, 22 Dec 89 11:55:03 -0500. 
<8912221655.AA05174@unix.cis.pitt.edu> Date: Fri, 22 Dec 89 18:08:11 +0000 From: Zheng Wang (Ext: 3701) Source-Info: toast.cs.ucl.ac.uk Status: RO Thanks for the reply. Have a nice Xmas. Cheers. Zheng From simpson@boole.math.psu.edu Sun Dec 24 12:24:06 1989 Received: from psuvax1.cs.psu.edu by unix.cis.pitt.edu (5.61/6.41) id AA28477; Sun, 24 Dec 89 12:24:02 -0500 Received: from boole.math.psu.edu by psuvax1.cs.psu.edu with SMTP (5.61/PSUCS-1.0) id AA06435; Sun, 24 Dec 89 12:22:01 -0500 Received: by boole.math.psu.edu (4.1/Psu2.1) id AA04904; Sun, 24 Dec 89 12:25:04 EST Date: Sun, 24 Dec 89 12:25:04 EST From: simpson@boole.math.psu.edu (Stephen G. Simpson) Message-Id: <8912241725.AA04904@boole.math.psu.edu> To: jbw@unix.cis.pitt.edu Subject: my man page for chtext Status: RO Here is a man page which I prepared for chtext. As explained in the next two messages, this man page is not really very professional. However, you might find it useful if you go to write a tutorial for chtext. --------------- chtext man page follows ----------------- .TH CHTEXT 1 "22 December 1989" .SH NAME chtext \- edit CCDOS files (mixed Chinese and ASCII) .SH SYNOPSIS .I chtext foo.pin foo (interactively convert a pinyin file to a CCDOS file) .I chtext foo foo.pin -p (convert a CCDOS file to an annotated pinyin file) .I chtext foo -e (edit a CCDOS file in line-editor mode) .SH DESCRIPTION .B Chtext is a fairly sophisticated program whose purpose is to help you create and modify CCDOS files. Rapid input is achieved by interactive batch conversion of pinyin syllables into Chinese characters. Multi-syllable combinations are stored in an external dictionary which is updated after each session. .SH "FILE FORMATS" .PP A CCDOS file is like a standard ASCII file except that, in addition to standard ASCII characters, it may also contain Chinese characters. Each Chinese character is represented by a code consisting of two consecutive 8-bit bytes, each of the two bytes having its high bit set to 1. 
These bytes are easily distinguished from standard ASCII bytes, since the latter have their high bits set to 0. The Chinese character codes are in accordance with the Guo2-Biao1 standard. .PP There is a particular kind of CCDOS file known as a Chinese TeX file. Such files are named with a .I .tex extension and contain TeX commands. They are used as input for the Chinese TeX system. Even if a CCDOS file .I foo does not contain any TeX commands, it can be renamed to .I foo.tex and processed for printing as if it were a Chinese TeX file. More information about Chinese TeX can be found in the .B chtex man page. (Note that .B chtex is not to be confused with .BR chtext .) .PP The purpose of .B chtext is to help you create CCDOS files. Roughly speaking, this is accomplished by conversion of pinyin files into CCDOS. A pinyin file is a standard ASCII file, i.e. it contains only standard ASCII characters. In particular, a pinyin file does not contain Chinese characters. The pinyin file format is further described in the tutorial below. .SH TUTORIAL .PP One limitation of .B chtext is that it runs only on a VT100 terminal or in a VT100 window. For instance, you can use .B chtext when dialing in from a PC with VT100 emulation software. Here on the Penn State Math Department Suns, you can get a VT100 window by running .B vtem inside a shelltool. For this, the simplest command is .I shelltool -Ww 80 -Wh 24 vtem & but you may consult the .B vtem man page for more details. .PP There are several different ways to use .BR chtext . In this man page, we explain how to use .B chtext in interactive batch mode. This way is the easiest and most efficient. Other ways to use .B chtext are described in the documentation (see below). .PP You begin by using a standard ASCII editor (such as .B emacs or .BR vi ) to create a pinyin file. This is a standard ASCII file which may contain both ASCII text and pinyin syllables. 
You then use .B chtext in interactive batch mode to go through the pinyin syllables and select a Chinese character to go with each of them. This selection process is required because typically a pinyin syllable matches several different Chinese characters. The result of the selection process is a CCDOS file. .PP For example, suppose that your pinyin file is called .I foo.pin and consists of the single line \\Zw han4 zi4 where \\Zw is an escape sequence meaning "begin Chinese mode," and han4 and zi4 are pinyin syllables. If you get into a VT100 window and issue the command .IR "chtext foo.pin foo" , you are presented with five Chinese characters, each of which has the pronunciation han4. If the character you want is not among those presented, press the space bar to see more characters. Fortunately, the character you want is the third one on the initial screen. You select it by pressing 3. Next, you are presented with five characters having the pronunciation zi4. You select the second one by pressing 2. Finally, the program asks if you would like to see the file that you have created. You type "yes". The program displays [dot-matrix rendering of the two selected Chinese characters, drawn with % symbols] and quits. You now have a small CCDOS file, .IR foo , consisting of the CCDOS codes for these two characters. .PP This method of input, creation of a pinyin file followed by interactive batch conversion of pinyin syllables into Chinese characters, is much faster and easier than the use of ordinary CCDOS editors such as Chinese Wordstar. .PP In addition to the \\Zw escape sequence, the pinyin file .I foo.pin may contain other escape sequences and .B chtext commands. For instance, \\As is an escape sequence meaning "begin ASCII mode," and # is a special character indicating a space in Chinese mode.
This permits you to mix Chinese and ASCII on the same line, as in \\As ASCII. \\Zw zhe4 shi4 zhong1 wen2.##\\As ASCII again. Other special characters and escape sequences are discussed below. .PP Note that an escape sequence such as \\Zw or \\As must be followed by a standard ASCII space. This space is in fact part of the escape sequence. .PP If you want to modify your CCDOS file, you can use .B chtext to convert it back to a pinyin file, which can then be edited using a standard ASCII editor. The pinyin syllables in the converted file will be capitalized and annotated to identify them as unique Chinese characters in the internal .B chtext dictionary. With the above example, the command .I chtext foo foo.PIN -p produces a new pinyin file, .IR foo.PIN , which looks something like \\As ASCII. \\Zw Zhe4zz Shi4rt Zhong1 Wen2.##\\As ASCII again. .PP You can use your plain ASCII editor to modify .I foo.PIN by moving text, deleting text, and adding text. The text which you add may include Chinese mode pinyin syllables, annotated or unannotated. When you have modified .I foo.PIN to your satisfaction, go to a VT100 window and issue the command .I chtext foo.PIN foo to convert foo.PIN back into a CCDOS file. This will involve selecting Chinese characters to go with the newly added, unannotated, pinyin syllables. .PP In practice, .I foo.PIN can be the same file as .IR foo.pin . .PP If you only want to modify the ASCII part of your CCDOS file, you needn't go through the above-described CCDOS-to-pinyin-to-CCDOS conversion process. Instead, you can directly edit your CCDOS file using a plain ASCII editor such as .BR emacs . Be warned however that .B vi will not work for this purpose. In fact, .B vi will destroy your CCDOS file by stripping the high bits. .PP (Note: On the Penn State Math Department Suns, .B emacs is called .BR gnuemacs .) .PP It is also possible to use .B chtext in line editor mode to directly edit Chinese characters in a CCDOS file. 
The appropriate command is
.IR "chtext foo -e" .
For this and much more information on how to use
.BR chtext ,
see the documentation.
.SH "EXTERNAL DICTIONARY"
.PP
During interactive pinyin processing,
.B chtext
is able to recognize many combinations consisting of two or more pinyin
syllables separated by hyphens. For example,
.B chtext
can recognize the two-syllable combination wo3-men2 ("we"). The advantage of
this is that wo3-men2 corresponds to a unique combination of two Chinese
characters, even though men2 by itself is ambiguous and could indicate any
of four different Chinese characters in the internal
.B chtext
dictionary. Thus, when
.B chtext
encounters wo3-men2, it is intelligent enough to forego the Chinese
character selection process. Instead,
.B chtext
presents you with the correct two-character combination. This feature saves
considerable time and effort.
.PP
In order to use this feature of
.BR chtext ,
you need a personal, external dictionary of multi-syllable pinyin
combinations. This dictionary is a standard ASCII file for which you have
read-write permission, located somewhere in your home directory. In
addition, you must set an environment variable CHTEXTDIC to point to your
personal dictionary file. For example, my personal dictionary file is named
.I chtext.dic
and is located in
.IR /home/boole/simpson/lib/ch .
To set the CHTEXTDIC environment variable, I use
.I setenv CHTEXTDIC /home/boole/simpson/lib/ch/chtext.dic
and this line is in my
.I .login
file.
.PP
Initially, you can copy
.I /usr/local/doc/ch/examples/chtext.dic
or someone else's CHTEXTDIC file into your own directory for your own use.
Subsequently, every time you use
.BR chtext ,
your CHTEXTDIC file will be updated at the end of each session,
automatically taking note of the multi-character combinations which you have
used. Thus, as time passes, your CHTEXTDIC file will become more
personalized. This will save you much time and effort.
.SH "ESCAPE SEQUENCES AND COMMANDS"
.PP
Your pinyin files may contain various escape sequences and special commands
which are recognized by
.BR chtext .
.PP
Each of the following escape sequences must be followed by a standard ASCII
space, which is in fact part of the escape sequence.

     \\As      begin ASCII mode
     \\Zw      begin Chinese mode
     \\CZw     begin pure Chinese mode
     \\aS      begin Chinese ASCII mode (ASCII characters in Guo2-Biao1 format)
     \\jp      begin Japanese Hiragana mode
     \\Jp      begin Japanese Katakana mode
     \\Gk      begin Greek mode
     \\Rs      begin Russian mode
     \\Sb1     begin Symbol mode 1
     \\Sb2     begin Symbol mode 2
.PP
The following
.B chtext
commands need not be followed by a space.

     #         print a space in Chinese mode
     \\#       print # in Chinese mode
     =         appended to a pinyin syllable, disables Chinese character selection
     \\=       print = in Chinese mode
     -         used as hyphen in multi-syllable combinations
     \\-       print - in Chinese mode
     \\@c      continue a long line
     \\Quiet   disable screen messages
     \\Show    enable screen messages
.PP
The following Chinese mode punctuation symbols are recognized by
.BR chtext .

     douh      (comma)
     dunh      (dun4-hao4, no English parallel)
     juh       (hollow period)
     maoh      (colon)
     fenh      (semicolon)
     wenh      (question mark)
     ganth     (exclamation point)
.SH DOCUMENTATION
.PP
J. B. Wang's documentation file "A GUIDE TO CHTEXT" is
.I chtext.doc
in the directory
.IR /usr/local/doc/ch .
There are some examples of CCDOS and pinyin files in
.IR /usr/local/doc/ch/examples .
.SH AUTHOR
Dr. J. B. Wang jwang@pittvms.bitnet. Dr. Wang would be pleased to hear of
any suggestions for improving
.BR chtext .
This man page is by S. G. Simpson, simpson@math.psu.edu.
.SH BUGS
The
.B chtext
program is still under development and fairly buggy.
From simpson@boole.math.psu.edu Sun Dec 24 12:25:46 1989
Received: from psuvax1.cs.psu.edu by unix.cis.pitt.edu (5.61/6.41)
	id AA28504; Sun, 24 Dec 89 12:25:42 -0500
Received: from boole.math.psu.edu by psuvax1.cs.psu.edu with SMTP (5.61/PSUCS-1.0)
	id AA06461; Sun, 24 Dec 89 12:23:54 -0500
Received: by boole.math.psu.edu (4.1/Psu2.1)
	id AA04910; Sun, 24 Dec 89 12:26:56 EST
Date: Sun, 24 Dec 89 12:26:56 EST
From: simpson@boole.math.psu.edu (Stephen G. Simpson)
Message-Id: <8912241726.AA04910@boole.math.psu.edu>
To: jbw@unix.cis.pitt.edu
Subject: more comments on the Ch package
Status: RO

As you know, I have installed the Ch-package on my own Sun. (This also entailed installing vttool and your version of dvi2ps.) The package works pretty well, and I think it will be useful for a number of people here.

There is another professor in our department who is more or less in charge of our department's network of Suns. Let's call him Professor X. I asked Professor X if it would be OK for me to reinstall the Ch-package on the servers, in /usr/local, for everyone's use. He replied, listing a number of objections.

I am not sure that he will finally agree to let me install the software in /usr/local. (On the other hand, I am not sure he has the authority to prevent me from doing so.) Therefore, I am a bit angry at Professor X.

On the other hand, some of Professor X's objections are thoughtful and perhaps you would be interested in learning of them. So, in the next message, I send you Professor X's remarks and my reply.

Some of Professor X's objections are not addressed to the Ch software, but rather to my man pages. In general, I agree with Professor X that it would be useful to have "professional style" man pages. I don't think that I would be the best person to write them.

I also agree with Professor X that it would be useful if the Ch package followed Unix filename and command syntax as much as possible.
Although I didn't mention this to Professor X, it is a Unix convention that programs read input from a file or the "standard input device," write output to the "standard output device," and take flags before filenames. For instance, in calling chtext with the -p flag, the syntax should be

     chtext -p filename > filename.pin

or

     chtext -p < filename > filename.pin

instead of what you have.

I also agree with Professor X that the names of the *input* files for chtex should be required to have some standard extension, perhaps .chtex. The extension should not be .tex since this would indicate a TeX file which could be processed by TeX. The chtex *input* files are definitely not true .tex files in this sense.

Another valid point, made by Professor X, is that chtex should be changed so that it is truly a TeX preprocessor. In other words, the *output* of chtex should be a true .tex file, which can be processed using tex to produce a true .dvi file. By a true .dvi file, I mean one which can be processed by any dvixxx driver, where xxx is the type of printer. Ideally, the system should be independent of Postscript. Even if the system depends on Postscript, it should not depend on having a particular dvi2ps driver. It should work with any dvi-to-Postscript driver.

Of course you know best how much effort would be required to make these improvements. I don't know enough about TeX or the internals of chtex to judge the difficulty of these improvements.

I hope that you take these remarks as constructive. I really like the Chinese TeX package, and I am going to fight hard to get it installed on our network for everyone to use.
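
[Illustration of the calling convention discussed in this message: flags first, input from a named file or from standard input, output to standard output. This is a minimal Python sketch and not chtext's actual code. The function names and the program name "chtext-sketch" are hypothetical, the -p flag is only parsed, and the conversion step is a pass-through placeholder.]

```python
import argparse
import sys


def parse_args(argv):
    """Parse a Unix-style command line: flags, then an optional input file."""
    parser = argparse.ArgumentParser(prog="chtext-sketch")  # hypothetical name
    parser.add_argument("-p", action="store_true",
                        help="stand-in for a flag such as chtext's -p")
    parser.add_argument("infile", nargs="?", default="-",
                        help="input file; omitted or '-' means standard input")
    return parser.parse_args(argv)


def run(argv, stdin=None, stdout=None):
    """Copy input to output; the real conversion is deliberately not shown."""
    if stdin is None:
        stdin = sys.stdin
    if stdout is None:
        stdout = sys.stdout
    args = parse_args(argv)
    source = stdin if args.infile == "-" else open(args.infile)
    try:
        for line in source:
            stdout.write(line)  # placeholder for the actual conversion
    finally:
        if source is not stdin:
            source.close()
    return args
```

With a filter written this way, both forms shown above work: the shell supplies the redirections, and the program itself never needs to be told an output filename.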
From simpson@boole.math.psu.edu Sun Dec 24 12:36:01 1989
Received: from psuvax1.cs.psu.edu by unix.cis.pitt.edu (5.61/6.41)
	id AA28580; Sun, 24 Dec 89 12:35:58 -0500
Received: from boole.math.psu.edu by psuvax1.cs.psu.edu with SMTP (5.61/PSUCS-1.0)
	id AA06707; Sun, 24 Dec 89 12:34:00 -0500
Received: by boole.math.psu.edu (4.1/Psu2.1)
	id AA04916; Sun, 24 Dec 89 12:37:02 EST
Date: Sun, 24 Dec 89 12:37:02 EST
From: simpson@boole.math.psu.edu (Stephen G. Simpson)
Message-Id: <8912241737.AA04916@boole.math.psu.edu>
To: jbw@unix.cis.pitt.edu
Subject: remarks by Professor X and my reply
Status: RO

---------------- remarks by Professor X ----------------------

Steve --

I looked over your Chinese TeX stuff and have some reservations about installing it. I'll describe these here and we can talk about it after Xmas.

1. It is definitely a buggy development release. The documentation says as much (from the man page: "If chtex gets into trouble, try hitting <return>."). I first tried it as "chtex file" when I should have said "chtex file.tex" (but the synopsis line on the man page doesn't say this, as it should), and it printed out the error message "Can't find the input file" without a trailing newline. This is a rather simple-minded coding error that indicates to me that the product has barely been tested. I really don't like installing software in /usr/local (which is for "officially installed, supported" software) which is in this state.

Other things that I didn't like about it in my very cursory appraisal:

(a) It takes a file which is not a TeX file but uses the .tex extension, and returns a file which is a TeX file, but with the .chtex extension. This is clearly wrong. If you use this you will no longer be able to see a file name like "letter.tex" and be sure it's a TeX file. The opposite approach (chtex converts a .chtex file to a .tex file) is clearly the correct convention.

(b) The name shouldn't be chtex since it doesn't operate on a TeX file and it doesn't output a DVI file (compare tex, latex, amstex, slitex, etc.). A name like chtex2tex or ch2tex would be more sensible.

(c) It seems that the output file, file.chtex, (which should be called file.tex in my view) is a standard TeX file, ready to feed to TeX to get a standard DVI file. This DVI file should then be printable or viewable or ready to post-process with any DVI driver as long as the appropriate fonts are available. However it seems that Wang is not making the fonts available except through his particular version of dvi2ps, which is a real shame. This implies that we have to maintain another dvi->postscript driver just for this purpose. Of course this driver will have troubles of its own, as almost every version of dvi2ps does.

(d) Tying chtext to a vt100 seems poor.

(e) The forty identical tfm files chfont*.tfm should obviously be hardlinks of a single file (not only to save space, which is minimal, but to ease maintenance).

(f) What about the problem we had where the output screwed up the LaserWriter and it had to be restarted? Has this been fixed?

2. I don't like the style of the man pages in several regards.

(a) Unix man pages have a particular telegraphic style which makes them difficult for beginners but quicker and easier for reference by knowledgeable users. Whether you like this or not, it is the convention, and consistency is valuable here. Beginner's documentation with lots of examples, encouragement, etc., belongs elsewhere (in /usr/local/doc). On the other hand the synopsis lines should be complete, in the conventional notation for synopsis lines.

(b) The man pages shouldn't be personalized as in "Chinese TeX is available here on the Computer Network of the Mathematics Department at the Pennsylvania State University." or "In order to use the Chinese TeX system as described here, you must have an account on the Penn State Math Department Computer Network."
These statements don't really add any information (they could be added to every man page), and they certainly detract from portability. I'm glad I didn't have to go through all the software I installed from Maryland and change the man pages to not refer to Maryland. The man pages certainly should not refer to a directory like /home/boole/simpson, since we don't want to have to remember to change the man page if you buy a new machine, move to a different university, or reorganize your home directory.

3. An entirely different sort of reservation is whether we should install this sort of software. I have mixed feelings on this. I think we have to be careful to remember that all this equipment was bought with University, State, and grant money in order to support our mathematics (research, teaching, etc.). I suspect that this system will be used mostly for purposes unrelated to mathematics (social, political, etc.). Do you agree? If it's only going to be used a little, it doesn't really matter. But if we're going to get to a situation where all the machines in 115 are in use and people can't get at them for a mathematical purpose because too many grad students are writing letters home, or whatever, or if newton is slowed to a crawl and mathematica can't be used because everyone is running chtex and tex, I would be very concerned. If our machines get overutilized because too many people are running mathematica or word-processing their theses, we can present that as an argument to get more machines; but not so if too many people are writing letters or posters. You will notice that there are no games in /usr/local. This is for a similar reason. What are your thoughts on this?

----------------- Simpson's reply to Professor X -------------------

For now I'll just comment on your remarks. As you say, let's talk after Christmas.

1. I have to agree with your assessment of the state of the software. It is indeed a development release.
The lack of a trailing newline is of course not a serious problem, and the business with the carriage return is really no problem at all (the same thing happens if you try to tex a file that isn't in quite the right format). On the other hand, in testing the software I found a number of other annoying glitches, things that should work but didn't. If all software in /usr/local must be certified bug-free, then this stuff certainly doesn't belong. I think that this is the strongest argument against installing the software.

On the file naming conventions, you are correct in asserting that they are not harmonious with latex, etc. However, I don't see this as a reason not to install the software. I have noticed many incongruities and peculiarities throughout Unix. If you like, I could rewrite the cctex script so that it would only accept a file with a .cctex extension, or whatever. I really don't see this as a serious problem.

I like this software for several reasons. First, it does what it claims to do, namely permits inclusion of Chinese in TeX documents. Actually, it does this rather well, and I don't think there is any other product that does it at all. Second, Wang's Chinese editor chtext (not to be confused with chtex) is very innovative, employing a number of clever techniques to speed up the task of entering Chinese text. Third, I like the idea of placing this tool into the hands of the many Chinese users of our system.

The fact that chtext works on a VT100 actually turns out to be an advantage in that you can run it from any PC. Obviously a graphical interface would be more pleasant, but I found that chtext works extremely well if used as I described in the man page, even on a PC dialed in at 2400 baud.

The problem where the output screwed up the LaserWriter occurred with some other Chinese software, not chtex. Chtex seems to be more sophisticated in the way it handles bitmapped fonts.

2. About the man pages, I am sorry that they are not up to your standards.
However, my object in writing them was to make it possible for people to use the software, and in that respect I think they will succeed. If necessary, I am willing to go through all of the documentation, extract the complete synopsis information, learn the synopsis line conventions, and draft the synopsis lines. However, if this is not absolutely necessary, it would be a waste of my valuable time, since there would be no reason for anybody to read these synopsis lines.

Your man page portability issues are easily addressed. The references to /home/boole/simpson will be unnecessary, indeed inappropriate, if we install the software in /usr/local. The references to Penn State are in no way essential and can be deleted.

3. The issue of whether to restrict the use of the computer network to mathematical research and teaching is not for me to decide. I am not sure that it is for you to decide. In any case, it seems to me that Chinese TeX is mathematically justifiable in terms of abstracts, theses, papers for Chinese math journals, curricula vitae, letters to colleagues -- all the things that TeX is normally used for. The idea that we should refrain from installing serious software because too many people might use it is incomprehensible to me. Games are of course a completely different matter.

Steve Simpson

From simpson@boole.math.psu.edu Sun Dec 24 12:38:26 1989
Received: from psuvax1.cs.psu.edu by unix.cis.pitt.edu (5.61/6.41)
	id AA28592; Sun, 24 Dec 89 12:38:24 -0500
Received: from boole.math.psu.edu by psuvax1.cs.psu.edu with SMTP (5.61/PSUCS-1.0)
	id AA06763; Sun, 24 Dec 89 12:36:46 -0500
Received: by boole.math.psu.edu (4.1/Psu2.1)
	id AA04923; Sun, 24 Dec 89 12:39:47 EST
Date: Sun, 24 Dec 89 12:39:47 EST
From: simpson@boole.math.psu.edu (Stephen G. Simpson)
Message-Id: <8912241739.AA04923@boole.math.psu.edu>
To: jbw@unix.cis.pitt.edu
Subject: Merry Christmas!
Status: RO

I hope your move to Stevens is pleasant and successful. Let's keep in touch.
Up with Chinese Democracy! Merry Christmas, Happy New Year!

From simpson@boole.math.psu.edu Tue Dec 26 17:25:12 1989
Received: from [128.118.6.2] by unix.cis.pitt.edu (5.61/6.41)
	id AA04155; Tue, 26 Dec 89 17:25:08 -0500
Received: from boole.math.psu.edu by psuvax1.cs.psu.edu with SMTP (5.61/PSUCS-1.0)
	id AA26895; Tue, 26 Dec 89 17:23:17 -0500
Received: by boole.math.psu.edu (4.1/Psu2.1)
	id AA06711; Tue, 26 Dec 89 17:26:23 EST
Date: Tue, 26 Dec 89 17:26:23 EST
From: simpson@boole.math.psu.edu (Stephen G. Simpson)
Message-Id: <8912262226.AA06711@boole.math.psu.edu>
To: jbw@unix.cis.pitt.edu
Subject: Ci2Dian3.chi problem
Status: RO

JB,

In testing chtext on my Sun, I am finding that the feature which automatically updates Ci2Dian3.chi doesn't work for me. I have tried several different configurations and I don't think it has ever worked correctly, not even once.

I compiled chtext using CiDianpath=/home/boole/simpson/lib/ch, and I have a copy of the original Ci2Dian3.chi in that directory, with permissions set to 666 (read-write for everybody). In addition I have a copy of Ci2Dian3.chi in /home/boole/simpson/lib/ch/my, with permissions set to 644 (read-write for me, read only for everybody else). My intention was that the 666 copy would be the default (for those who do not bother to set the CHTEXTDIC environment variable) and the 644 copy would be my personal one.

I created a small file with some two-character pinyins not in the original Ci2Dian3.chi. The file contains some names of Chinese friends, for instance wei3-yi2. I called the file friends.pin. I gave the command chtext friends.pin friends. The program correctly went through the file and allowed me to choose Chinese characters. It then wrote the friends file correctly. I expected it to add my friends' names to the dictionary. Instead, it dumped core (error message: segmentation fault, core dumped), copied a 0-byte file over the Ci2Dian3.chi, and stopped.
So, the program not only failed to update the dictionary, it even destroyed the dictionary. This is quite annoying. The above happened three times in a row, with CHTEXTDIC setenved to /home/boole/simpson/lib/ch, /home/boole/simpson/lib/ch/my, and unset. Am I doing something wrong? Or is this a serious bug in chtext?
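
[Illustration for the failure reported above, where a crash during the update left a 0-byte dictionary. One common way to guard against this is to write the new contents to a temporary file in the same directory and rename it over the original only after the write has succeeded. The following is a minimal Python sketch of that technique and is not chtext's code. The function name is hypothetical, and the dictionary is assumed, for illustration only, to be a file of newline-separated entries treated as raw bytes.]

```python
import os
import shutil
import tempfile


def update_dictionary(path, new_entries):
    """Add new_entries (bytes) to the file at path, replacing it atomically."""
    with open(path, "rb") as old:
        lines = old.read().splitlines()

    # Append only entries that are not already present.
    known = set(lines)
    for entry in new_entries:
        if entry not in known:
            lines.append(entry)
            known.add(entry)

    # Write the full new contents to a temporary file beside the original.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".dic-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for line in lines:
                tmp.write(line + b"\n")
            tmp.flush()
            os.fsync(tmp.fileno())
        shutil.copymode(path, tmp_path)  # keep the original permissions (e.g. 666 or 644)
        os.replace(tmp_path, path)       # the old file stays intact until this step
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the program fails at any point before the final rename, the original dictionary is left untouched and only the temporary file is lost.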