.TH GUAVA 1 "June 2000" "Guava 1.0.3" "Guava - Web Programming Tools"

.SH NAME
hss2html \- Converts an hss source file into HTML.

.SH SYNOPSIS
.B hss2html
[\fBoptions\fP]
.I input-file


.SH DESCRIPTION
.B hss2html
is a Perl program that converts a source file into HTML.  It calls the
C preprocessor
.BR cpp (1)
to do some of the work, so you get to 
.B #define
your own macros,
.B #include
other files, and use 
.B /*comments*/ 
just like you can in C.  It also
allows you to cross reference other files, and pre-defines a few handy
macros of its own.

.B hss2html
is also called from other programs in the Guava tools, for example the
.B htt2html 
template processing program, so that all the features of
.B hss2html
can be used in your 
.B htt2html
source files.

.SH OPTIONS
.IP "\fB-v\fP"
Be verbose.

.IP "\fB-version\fP"
Print version information and exit.

.IP "\fB-D\fImacro\fP[=\fIdefn\fP]"
Predefine \fImacro\fP, with definition \fIdefn\fP.  If \fIdefn\fP is
not specified, the value "1" will be used.

.IP "\fB-I\fIdir\fP"
Add \fIdir\fP to the list of directories to be searched when
processing
.BR #include .
The directory containing the source file is always searched.

.IP "\fB-o\fP \fIfile\fP"
The name of the output file.  If the supplied name does not have a
filename extension the output filename extension (as specified by
.BR -outext )
will be appended.

.IP "\fB-localroot\fP \fIdir\fP"
The directory to use as the local-root when processing the FULLREF
builtin.

.IP "\fB-remoteroot\fP \fIdir\fP"
The directory to use as the remote-root when processing the FULLREF
builtin.

.IP "\fB-x\fP \fIname\fP"
The full name of the cross reference file.  The default is
"index.xref".

.IP "\fB-M\fP"
Causes
.B hss2html 
to generate dependency information in a form suitable for use with
.BR make (1).
The dependency information is written to stdout and the program
terminates without creating any normal output.

The behaviour of
.B hss2html
differs slightly from that of the C preprocessor when creating
dependency information in that conditionals such as
.B #if 
are not considered, and any 
.BR #include " or " <IMPORT> 
statement in the input files will cause a dependency to be created,
regardless of whether that file will be used in the final output.

Use the
.B -M
option to the
.B htt2html
program to generate dependency information from multi-page source files.

.IP "\fB-MG\fP"
Treat missing included or imported files as generated files, rather
than stopping with an error message.  It is assumed that these files
will be created in the current directory.  If you specify
.B -MG
you must also specify
.BR -M .

.IP "\fB-MX\fP"
Consider the cross reference file for inclusion in the dependency list
generated by
.BR -M .
The cross reference file will only be added to the list if the <REF> tag
is found in the input files.  If you specify
.B -MX
you must also specify
.BR -M .

.IP "\fB-d\fP"
Compare the generated output file with any existing file of the same name
and replace only if there are changes.  This can help to avoid updating
files unnecessarily, for example, when a large number of files are generated
from a single source file by
.BR htt2html (1).

The 
.BR diff (1)
program is used to compare the files.  This can be modified using
the
.B -diffprog
option.


.SH INFREQUENTLY USED OPTIONS

.IP "\fB-k\fP"
Keep intermediate files.

.IP "\fB-b\fP"
Don't delete blank lines in the output.  The C preprocessor often
generates a lot of unnecessary blank lines, so you probably don't want
to use this option.

.IP "\fB-pwd\fP \fIdir\fP"
The directory to use as the pwd when processing the FULLREF builtin.
If
.B -pwd
is not specified the directory name will be obtained by calling the
.BR pwd (1) 
program.

.IP "\fB-def\fP \fIstring\fP"
The define string (default is "#define %NAME% %DEF%").  You'll only
need to change this if you use something other than the C
preprocessor.

The words 
.BR %NAME% " and " %DEF%
are insertion point markers that will be replaced with the name of the
macro and the definition when 
.B hss2html 
writes out the command line definitions supplied with the
.B -D
option during the preprocessing stage.

The define string
.I must
contain both of the insertion point markers, although the markers
themselves can be redefined using the
.BR -defnmark " and " -defdmark
options.

.IP "\fB-defnmark\fP \fImarker\fP"
Sets the insertion point marker for names in the
.B -def
command string.  The default is "%NAME%".  In the unlikely event that
the default marker conflicts with something in your
.B -def
string, you will have to use this option to specify an alternative.

.IP "\fB-defdmark\fP \fImarker\fP"
Sets the insertion point marker for definitions in the
.B -def
command string.  The default is "%DEF%".  In the unlikely event that
the default marker conflicts with something in your
.B -def
string, you will have to use this option to specify an alternative.

.IP "\fB-diffprog\fP \fIstring\fP"
The string used to invoke the comparison program when the 
.B -d
option is used.  The default is "diff -b -q %OLD% %NEW% >/dev/null".
If you do not have GNU diff or if you want to change its behaviour,
then you may need to change this setting.
.B hss2html
expects your replacement program to return zero if there are no
differences between the files, and a non-zero value otherwise.

The -b option to GNU diff tells it to ignore differences in the amount
of white space.  The -q option reduces the amount of output that diff
generates.

The words 
.BR %OLD% " and " %NEW%
are insertion point markers that will be replaced with the names of the
old and new output files when the command is invoked.

The diffprog string
.I must
contain both of the insertion point markers, although the markers
themselves can be redefined using the
.BR -diffomark " and " -diffnmark
options.

.IP "\fB-diffomark\fP \fImarker\fP"
Sets the insertion point marker for the old file name in the
.B -diffprog
command string.  The default is "%OLD%".  In the unlikely event that
the default marker conflicts with something in your
.B -diffprog
string, you will have to use this option to specify an alternative.

.IP "\fB-diffnmark\fP \fImarker\fP"
Sets the insertion point marker for the new file name in the
.B -diffprog
command string.  The default is "%NEW%".  In the unlikely event that
the default marker conflicts with something in your
.B -diffprog
string, you will have to use this option to specify an alternative.

.IP "\fB-cpp\fP \fIcommand\fP"
Sets the command line used to execute the C preprocessor.  Not all
.BR gcc (1)
installations have a stand-alone
.BR cpp (1) 
program, so the default command string uses
.B gcc
with the
.B -E
command line option to do the same thing.

The default setting is therefore:

.B "gcc -x c -traditional -E -P %OPTS% -o %OUTFILE% %INFILE%"

The three words
.BR %OPTS% ", " %OUTFILE% ", and " %INFILE%
are insertion point markers.  Before the command is executed they will
be replaced with the 
.B -cppopts
string, the name of the output file, and the name of the input file,
respectively.

The command string
.I must
contain all three of the insertion point markers, although the markers
themselves can be redefined using the
.BR -cppoptsmark ", " -cppoutmark ", and " -cppinmark
options.
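
The marker substitution described above can be sketched in Python.
This is only an illustration, not the code used by the Perl script:
the function name build_cpp_command, the file names, and the -undef
option value are all assumptions chosen for the example.

.nf
```python
def build_cpp_command(template, opts, outfile, infile,
                      opts_mark="%OPTS%", out_mark="%OUTFILE%",
                      in_mark="%INFILE%"):
    """Fill the three insertion point markers in a -cpp style command string."""
    # The command string must contain all three markers.
    for mark in (opts_mark, out_mark, in_mark):
        if mark not in template:
            raise ValueError("command string lacks marker " + mark)
    command = template.replace(opts_mark, opts)
    command = command.replace(out_mark, outfile)
    return command.replace(in_mark, infile)


# Default command string, as quoted in this manual page.
DEFAULT_CPP = "gcc -x c -traditional -E -P %OPTS% -o %OUTFILE% %INFILE%"

# Hypothetical option string and file names, for illustration only.
command = build_cpp_command(DEFAULT_CPP, "-undef", "page.2.hxx", "page.1.hxx")
print(command)
```
.fi

Passing different values for the three marker arguments mirrors what the
marker options do: the substitution is unchanged, only the text that is
searched for differs.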

.IP "\fB-cppopts\fP \fIopts\fP"
Options to pass to the C preprocessor.  Options will be inserted into
the CPP command line at the insertion point defined by the
.B -cppoptsmark
option.

.IP "\fB-cppoptsmark\fP \fImarker\fP"
Sets the insertion point marker for C preprocessor options in the
.B -cpp
command string.  The default is "%OPTS%".  In the unlikely event that
the default marker conflicts with something in your
.B -cpp
command string, you will have to use this option to specify an
alternative.

.IP "\fB-cppinmark\fP \fImarker\fP"
Sets the insertion point marker for the input file name in the
.B -cpp
command string.  The default is "%INFILE%".  In the unlikely event
that the default marker conflicts with something in your
.B -cpp
command string, you will have to use this option to specify an
alternative.

.IP "\fB-cppoutmark\fP \fImarker\fP"
Sets the insertion point marker for the output file name in the
.B -cpp
command string.  The default is "%OUTFILE%".  In the unlikely event
that the default marker conflicts with something in your
.B -cpp
command string, you will have to use this option to specify an
alternative.

.IP "\fB-p\fP"
Do not use
.BR cpp (1)
line control.

Line control allows error messages to correspond to the line in the
source file where they actually occur.  This is useful, so line
control is turned on by default.

ANSI
.BR cpp (1)
compatible line control codes are used.  If your preprocessor does not
understand these, you may need to use this option to turn off line
control.

.IP "\fB-intext\fP \fIext\fP"
The filename extension for intermediate files.  Default is ".hxx".

.IP "\fB-diffext\fP \fIext\fP"
The filename extension for intermediate output files when the
.B -d
option is used.  Default is ".dout".

.IP "\fB-outext\fP \fIext\fP"
The filename extension for output files.  Default is ".html".

.IP "\fB-htmlhdr\fP \fIstring\fP"
Set a string that will be written at the very beginning of the output
file.  This could be used to add a DOCTYPE line to HTML files, for
example.  The default is not to output anything.

.IP "\fB-stamp\fP"
Write the date to the output file in an HTML <!--comment-->.  This is
not enabled by default as it will interfere with the operation of the
.B -d
option.


.SH OPERATION

.B hss2html
reads an input file, and processes it in three stages.  There is a
pre-processing stage, after which the file is processed using the C
preprocessor (\fBcpp\fP), and finally a post-processing stage.

.SS Pre-processing
The pre-processing stage reads the input file and expands 
.B #include
files.  Any macro definitions made with the command line \fB-D\fP
option are prepended to the output.  Cross references are expanded
here too.

.SS The C pre-processor
The output from the pre-processing stage is sent to the C pre-processor
.BR cpp (1)
which does all the usual C pre-processor things.  C-style /*comments*/
are removed, macros are expanded, etc.  The only unusual thing is that
.B #include
files will already have been expanded by the
.B hss2html
pre-processing stage, so they will not be visible to
.BR cpp .

.SS Post-processing
The post-processing stage takes the output of the C pre-processor and
processes a few more built-in macros, which are described below.
Imported files are inserted into the output at this stage.

.SH CROSS REFERENCES
Cross references are particularly useful (and pretty much automatic)
when
.B hss2html
is driven by the
.B htt2html
program, when processing a multi-page document.  Nevertheless, they
can also be used when
.B hss2html
is called directly, by specifying a file containing cross reference
information using the \fB-x\fP command line option.  Cross references
are inserted in a document using the REF macro.

.SS The Cross Reference File Format
The cross reference file contains one or more sections, each
corresponding to another page or section of a larger document.  Each
section begins with a <PAGE> marker, followed by a number of key/value
pairs.

For example:

.RS
.DS
<PAGE>
.br
LABEL The first page
.br
NUMBER 1

<PAGE>
.br
LABEL The second page
.br
NUMBER 2
.DE
.RE

.SS The <REF> builtin macro. 

The 
.B <REF>
macro will output the value of the key named 
.IR output ,
for the first page that has a key/value pair matching
.I key
and
.I value
in the cross reference file:

.B "<REF:\fIoutput\fP,\fIkey\fP=\fIvalue\fP>"

Using the example cross reference file given in the section above, the
string:

.RS
.B "<REF:LABEL,NUMBER=1>"
.RE

will be replaced with:

.RS
.B "The first page"
.RE
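
The lookup can be sketched in Python as follows.  This is a minimal
illustration rather than the code in the Perl script: the function
names parse_xref and lookup_ref are hypothetical, and the sketch
assumes that each key is separated from its value by the first space
on the line, which the example file suggests but this page does not
state.

.nf
```python
def parse_xref(text):
    """Split cross reference text into one dict of key/value pairs per page."""
    pages = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == "<PAGE>":
            pages.append({})  # a <PAGE> marker starts a new section
        elif pages:
            # Assumption: the key ends at the first space on the line.
            key, _, value = line.partition(" ")
            pages[-1][key] = value.strip()
    return pages


def lookup_ref(pages, output, key, value):
    """Return the output key of the first page where key equals value."""
    for page in pages:
        if page.get(key) == value:
            return page.get(output)
    return None


XREF = """
<PAGE>
LABEL The first page
NUMBER 1

<PAGE>
LABEL The second page
NUMBER 2
"""

print(lookup_ref(parse_xref(XREF), "LABEL", "NUMBER", "1"))
```
.fi

With the example file, the call prints the label of the first page.
Returning None when no page matches is a choice made for this sketch.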


.SH BUILTIN MACROS
There are a number of built in macros which are expanded in the 
post-processing stage of
.BR hss2html .

.SS <DATE\fR[\fP=\*(lq\fIformat\fP\*(rq\fR]\fP> 

The 
.B <DATE> 
macro expands to the current date and time, as specified by the
.I format
string.  The format string is almost identical to the format string
passed to the POSIX
.BR strftime (3)
function, in C and Perl.  For example, the macro:

.RS
.B "<DATE=\*(lq%A, %B %d<TH>, %Y\*(rq>"
.RE

will be replaced with something like:

.RS
.B Thursday, September 30th, 1999
.RE

The format string can be omitted, in which case the output will be in
the format produced by the
.BR ctime (3)
function call.  For example:

.RS
.B "<DATE>"
.RE

will be replaced with something like:

.RS
.B Thu Sep 30 20:33:01 1999
.RE

A small difference between this format string and the POSIX format
string is the availability of the
.B <TH>
mini-macro, which is demonstrated in the example above.

.B <TH>
can \fIonly\fP be used in the format string of the
.B <DATE>
macro, and expands to one of \*(lqst\*(rq, \*(lqnd\*(rq, \*(lqrd\*(rq,
or \*(lqth\*(rq, according to the current day of the month.
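
The choice of suffix can be sketched in Python.  The function name
day_suffix is hypothetical, and the sketch follows the usual English
rule rather than quoting the Perl script.

.nf
```python
def day_suffix(day):
    """Return the English ordinal suffix for a day of the month (1 to 31)."""
    if 11 <= day <= 13:
        return "th"  # 11th, 12th and 13th are irregular
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")


print("30" + day_suffix(30))
```
.fi

The special case for 11 to 13 matters because those days end in 1, 2
and 3 but still take the suffix used by most other days.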

.SS <IMPORT=\*(lq\fIfilespec\fP\*(rq>

The
.B IMPORT
macro imports the contents of a file and inserts it into the output.
The file specified will be searched for using the include path (as set
by the
.B -I
command line option), in exactly the same way that the #include
directive is processed.  The difference between #include and <IMPORT>
is that imported files are copied verbatim into the output, with no
further processing taking place on the contents.

.SS <FULLREF=\*(lq\fIfilespec\fP\*(rq>

The 
.B FULLREF 
macro operates in one of two modes, depending upon whether the macro
.B LOCAL
has been defined, e.g. using 
.B -DLOCAL
on the command line.

In LOCAL mode, the 
.B FULLREF
macro simply expands to its \fIfilespec\fP
argument.

In non-LOCAL mode, the
.B FULLREF
macro first converts any relative components of the \fIfilespec\fP
argument into an absolute path, using the current working directory,
or the value specified by the
.B -pwd
option.

The next stage of the operation uses the LOCALROOT and REMOTEROOT
strings, which must have been supplied on the command line using the
.BR -localroot " and " -remoteroot
options, or alternatively by defining the macros 
.BR LOCALROOT " and " REMOTEROOT .
The command line options will override the macros, if both are
supplied.

Having created an absolute path name, the system checks that the path
exists beneath the LOCALROOT.  This is done by testing that the
absolute path name begins with the LOCALROOT string.  It does not test
for the existence of an actual file or directory.

Finally, the LOCALROOT part of the absolute name is replaced with the
REMOTEROOT string.

For example, if your web site is at

.RS
.B http://www.test.site/
.RE

and your source code is being built in the directory

.RS
.B /home/sdm/Guava/test/
.RE

you should set the command line options:

.RS
.B -remoteroot "http://www.test.site/"
.br
.B -localroot "/home/sdm/Guava/test/"
.RE

Then, when building source code in 
.BR /home/sdm/Guava/test/subdir ,
the macro:

.RS
.B <FULLREF="../index.html">
.RE

will be replaced with the string:

.RS
.B http://www.test.site/index.html
.RE
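
The steps of the non-LOCAL mode can be sketched in Python.  This is an
illustration under stated assumptions, not the code in the Perl
script: the function name full_ref and its parameter names are
hypothetical, and relative components are resolved here with
posixpath.normpath, which may differ in detail from what the script
does.

.nf
```python
import posixpath


def full_ref(filespec, pwd, localroot, remoteroot, local=False):
    """Map a file reference to a remote URL, following the steps above."""
    if local:
        return filespec  # LOCAL mode: the argument is passed through unchanged
    # Resolve relative components against the working directory.
    absolute = posixpath.normpath(posixpath.join(pwd, filespec))
    # Prefix test only; there is no check that the file actually exists.
    if not absolute.startswith(localroot):
        raise ValueError(absolute + " is not beneath " + localroot)
    # Swap the local root prefix for the remote root.
    return remoteroot + absolute[len(localroot):]


print(full_ref("../index.html",
               "/home/sdm/Guava/test/subdir",
               "/home/sdm/Guava/test/",
               "http://www.test.site/"))
```
.fi

Because the check is a plain string prefix test, the trailing slash on
the local root matters in this sketch: a path that resolves to the
local root directory itself, with no trailing slash, would be rejected.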


.SS <APOS>
.SS <QUOTE>
Sometimes the C preprocessor gets confused with apostrophes (') and
quotes (\*(lq), because it assumes that they mark the beginning of a
character constant.  This can cause the C preprocessor to abort, with
a diagnostic like "unterminated character constant".  By default,
.B hss2html
passes the
.B -traditional
option to GNU
.BR cpp (1) 
which avoids the abort, but still appears not to guarantee the
expected behaviour for text like:

.RS
.B #define MACRO work
.br
.B This test doesn't MACRO as well as it should.
.RE

which will fail to expand the macro due to the presence of the 
apostrophe in "doesn't".

To work around these problems use the <APOS> and <QUOTE> builtin
macros, which are expanded to the appropriate characters in the
post-processing stage, after the C preprocessor has been run.

The behaviour of GNU 
.B cpp
and the 
.B -traditional 
option is explained in detail in the 
.BR info (1)
pages for 
.BR cpp .

.SS <NOSP>

The
.B <NOSP> 
builtin macro absorbs spaces, so that the string:

.RS
.B There are no  <NOSP>  spaces here.
.RE

becomes:

.RS
.B There are nospaces here.
.RE

This can be useful if the C preprocessor inserts spaces into its
output where you do not want them, for example, when it expands
macros.

Another use of 
.B <NOSP> 
is to protect strings that would otherwise be expanded as macros.  So:

.RS
.B #define MACRO string
.br
.B MACR<NOSP>O
.RE

will be output as:

.RS
.B MACRO
.RE


.SS <SP>

The 
.B <SP>
builtin macro inserts a single space into the output.  It is
occasionally useful in conjunction with
.BR <NOSP> ,
for example, to ensure that there is only a single space at a certain
point:

.RS
.B #define MACRO1 one
.br
.B #define MACRO2 two
.br
.B MACRO1 <NOSP><SP> MACRO2
.RE

will give:

.RS
.B one two
.RE

You can also use
.B <SP>
to protect a blank line inside HTML's
.B <PRE> 
tags, since 
.B <SP> 
is expanded after blank lines are removed from the output.  Note that
.B <NOSP>
will not work for this purpose, since it absorbs all whitespace,
including newlines.

.SS <REF:\fIoutput\fP,\fIkey\fP=\fIvalue\fP>  

The 
.B <REF> 
macro is described in the
.B Cross References 
section.

.SS Adding new macros to hss2html.

Perl programmers could easily add new macros to
.B hss2html
by adding them to 
.B sub Postprocess
in the Perl script.  Patches to the
.B Guava
tools are welcomed.

.SH FILES
The input file for 
.B hss2html
must be specified on the command line.  If the input filename has an
extension (a dot followed by characters at the end of the filename),
the names of intermediate and output files will be created by
replacing the extension.  If the filename has no extension, the
intermediate and output file names will be created by appending the
appropriate extensions to the input filename.

Two intermediate files will always be created.  The first is the
output from the preprocessing stage, and will have an extension
something like
.B .1.hxx
where the 
.B hxx
part is the default extension which can be modified using the
.B -intext
option.  The second file is the output of the C preprocessor, and
will have an extension like
.BR .2.hxx .

The final output file is the output of the postprocessing stage, and
uses the filename extension specified by the
.B -outext
option, which defaults to ".html".
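
The naming rules can be sketched in Python.  The function name
derived_names is hypothetical, and the exact form of the intermediate
extensions is taken from the examples above, which are themselves
only approximate.

.nf
```python
import os.path


def derived_names(input_name, intext=".hxx", outext=".html"):
    """Return the two intermediate file names and the output file name."""
    # splitext leaves the name untouched when there is no extension, so the
    # new extensions are appended in that case and substituted otherwise.
    stem, _old_ext = os.path.splitext(input_name)
    return (stem + ".1" + intext, stem + ".2" + intext, stem + outext)


print(derived_names("page.hss"))
```
.fi

For an input file named page.hss this gives page.1.hxx, page.2.hxx and
page.html; an input file named page, with no extension, gives the same
three names.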

If the
.B -d
option is used, the output will be written to a third intermediate
file before being compared to any existing file with the output file
name.  The existing file is replaced only if the contents of the two
files differ, or if there is no existing file with the output file
name.  The intermediate file name uses the file extension specified by
the
.B -diffext
option, which defaults to ".dout".

All intermediate files will be deleted automatically unless the 
.B -k
option is given.


.SH BUGS
.B hss2html
does not parse HTML, so it might not always do the right thing.  For
example it will expand macros and remove blank lines from verbatim
sections of HTML.  The description of the
.BR <NOSP> " and " <SP>
builtin macros suggests workarounds for some of these problems.

There are too many options that you'll probably never use.

.SH AUTHOR
Steve Morphet <smorphet@iee.org>
.br
http://www.users.globalnet.co.uk/~morphet

.SH LICENCE

.B hss2html
and the 
.B Guava
tools are Copyright (C) 1999-2000 S Morphet <smorphet@iee.org>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or (at
your option) any later version.
 
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
General Public License for more details.

.SH "SEE ALSO"
.BR htt2html (1),
.BR websrccopy (1),
.BR webbuilder (1),
.BR cpp (1),
.BR gcc (1),
.BR diff (1),
.BR make (1),
.BR perl (1)

