From unknown Sun Nov  8 04:08:43 1992
SUB: Marpa, prototype of a Hacker's Parser for Perl
SUM: jeffrey@netcom.com (Jeffrey Kegler)->Perl-Users@fuggles.acc.Virginia.EDU

Following my signature is a shar archive of the working prototype of
the Marpa parser.  Marpa is the name of the parse engine itself --
it's a library of perl routines.  Milarepa is a program which uses a
subset of Marpa's capabilities to take a file of BNF and associated
actions (in Perl!), and parse them into a Perl program which parses
its input standard, performing the actions.  Both milarepa.pl and the
Perl program it writes on its standard output use Marpa.  milarepa.pl
both produces examples of Marpa usage, and is itself a moderately
sophisticated one.

For example, test2.mr describes a calculator in the Milarepa language,
a straightforward combination of BNF and Perl.  Running the command
"perl milarepa.pl < test2.mr > testmr2.pl" creates the calculator, and
"perl testmr2.pl" runs it.  The example just adds and multiplies but a
glance at test2.mr should show how easily it could be extended into a
very powerful calculator.

The files in the package are

marpa.pl    -- the heart of the Marpa prototype, its parsing routines
milarepa.pl -- creates simple Marpa compilers or interpreters
                from BNF and Perl
test.mr     -- the Milarepa code for an arithmetic expression compiler
mrtest.pl   -- The above compiler compiled into a Perl script which
               prints the parse for simple arithmetic expressions
test2.mr    -- the Milarepa code for an arithmetic expression interpreter
mrtest2.pl  -- The above interpreter compiled into a perl script which
		evaluates simple arithmetic expressions
tilopa.pl   -- A simple lexer used by Milarepa
naropa.pl   -- Other simple routines used by Milarepa 
test.pl     -- A simple program using Marpa
test2.pl    -- Another simple program using Marpa

This is intended to be a real hacker's parser.  It is not restricted
to LR(k), and the parse logic follows directly from the BNF.  It
handles ambiguous grammars, ambiguous tokens (tokens which were not
positively identified by the lexer) and allows the programmer to
change the start symbol.  There is no fixed distinction between
terminals and non-terminals, that is, a symbol can both match the
input AND be on the left hand side of a production.  Multiple Marpa
grammars are allowed in a single perl program.  The grammar is
extensible.  The BNF may have productions added (or, as an extension,
deleted) after parsing has begun.

Since, unlike LR parsers, Marpa's logic follows directly from the BNF,
hackers should be able to invent tricks.  For example, the order in
which productions are tested is controlled by the programmer.  He can
perform error handling by inserting special error productions into the
grammar, which detect those cases he wishes to report ("Missing comma
in list", etc.).  These productions can come after the others, so that
no input ever fails to parse in the strict sense, some merely return
special "error parses" which indicate the problem.  It does not bother
Marpa if the new productions introduce ambiguities into the grammar.

Of course, I happily offer Marpa under the same terms of free
redistributability that I was offered perl.

There are two restrictions on the grammar, neither of which I believe
will prevent Marpa from handling any grammar of practical use.  First,
the grammar may not be left recursive.  Left recursion makes Marpa
recurse infinitely.  A later version will detect left recursion and
stop the parse with an error.

Second, the input must be divided into sentences of a finite maximum
length.  This restriction is unusual in parsing theory, but is easy to
apply to any grammar of practical interest.  A language where the
parseable entities ran many pages would not be readable or writable by
humans, and even the most obscure computer languages must be divided
into pieces by the human being reading them, otherwise they would not
be comprehensible at all.

With this restriction, Marpa runs very fast.  I have done a C language
prototype of Marpa, and it chomps down large, highly ambiguous
sentences of an English subset rapidly.  The theoretical speed, with
division into sentences of maximum fixed length, is linear, or O(n).
The C code works by pushing pointers onto and off of stacks, and runs
very fast.

This implementation is not fast, since it prototypes Marpa in Perl.  I
am now seeking help in converting Marpa into part of Perl.

Any parser needs help in lexing, and in evaluating the semantics of
the results.  Perl's power in these areas makes it an ideal place to
embed Marpa.

The current documentation stinks.  It forms the rest of this message.

This interface didn't come out very "perl-ish", and that may be due to
my habits of thought.  I would like an interface that seemed more
"perl-ish".

&createGrammar() -- returns a grammar ID for a new grammar, to be
built with registerAlternate() and registerSymbol() calls.  The
grammarID scalar is actually just an integer.  The new grammar becomes
the current grammar.

&setGrammar($gid) -- sets the current grammar to gid.  All Marpa
routines affect only the current grammar.

&registerSymbol($name, $pattern) -- a string giving the name and
another the search pattern.  The name is to be used by another set of
routines (not described) which will use names instead of symbol IDs.
The pattern is only used by the default lexer supplied with Marpa.
Returns the symbol ID of a newly created symbol in the current
grammar.

&registerAlternate($value, $lhs, @rhs) -- returns the production ID of
a new production in the current grammar.  The $lhs is the symbol ID of
its left hand side, and @rhs is a list (possibly empty) of symbol IDs
for the right hand side.  Value is the string to be evaluated when
translating the parse tree for this grammar.

&registerToken($value, @sidList) -- used to build the sentence to be
parsed.  $value is the string that will be passed up (unevaluated) to
the upper levels of the parse tree.  The @sidList is the list of
symbol IDs which are possible choices for this token.

&parse($sid) -- the main routine.  Returns the result of evaluating
the parse tree, or undefined if there were no more parses.  Repeated
calls return evaluations of alternative parse trees, if the grammar
parses the sentence ambiguously.  $sid is the start symbol to use.
The code in this routine is the heart of the parser, and is very hard
to figure out.  The algorithm comes from the two volume _The Theory of
Parsing, Translation and Compiling_ by Aho & Ullman, and I don't
recommend you bother trying to figure out what's going on here unless
you read that section of the book (Vol. I, pp. 289-297).  The code in
the other routines should be accessible to the determined Perl hacker.

&clearParse() -- for parsing ambiguous grammars, &parse save its
intermediate results.  This routine clears them, and deletes all
tokens in the current sentence.

Marpa, Milarepa, Tilopa and Naropa are the names of Tibetan saints.
Marpa the Translator was instrumental in bringing Indian Buddhist
texts to Tibet, which task involved three dangerous journeys across
the Himalayas, extensive fund raising, great scholarship, linguistic
ability and deep spiritual development.

Cheers!

Jeffrey Kegler, Independent UNIX Consultant, Algorists, Inc.
jeffrey@algor2.ALGORISTS.COM or uunet!algor2!jeffrey
137 E Fremont AVE #122, Sunnyvale CA 94087
"No wonder the gods smile so seldom -- we so often fail to notice."
>From _Stardance_ by Spider and Jeanne Robinson

#!/bin/sh
# This is a shell archive (produced by shar 3.49)
# To extract the files from this archive, save it to a file, remove
# everything above the "!/bin/sh" line above, and type "sh file_name".
#
# made 11/06/1992 11:54 UTC by jeffrey@netcom
# Source directory /u25/jeffrey/marpa
#
# existing files will NOT be overwritten unless -c is specified
#
# This shar contains:
# length  mode       name
# ------ ---------- ------------------------------------------
#   7038 -rw-r--r-- marpa.pl
#   4221 -rw-r--r-- milarepa.pl
#    888 -rw-r--r-- mrtest.pl
#   1101 -rw-r--r-- mrtest2.pl
#    966 -rw-r--r-- naropa.pl
#    865 -rw-r--r-- test.pl
#    700 -rw-r--r-- test2.pl
#    813 -rw-r--r-- tilopa.pl
#    160 -rw-r--r-- test.mr
#    244 -rw-r--r-- test2.mr
#
# ============= marpa.pl ==============
if test -f 'marpa.pl' -a X"$1" != X"-c"; then
	echo 'x - skipping marpa.pl (File already exists)'
else
echo 'x - extracting marpa.pl (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'marpa.pl' &&
package Marpa;
X
$debug = 0;
X
$currentGrammar = 1;
$nextGrammar = 2;
X
$nextAlternate = 1;
$nextSymbol = 1;
X
$listTemplate = "l*";
X
sub main'createGrammar
{
X    local($ret) = $nextGrammar++;
X    return $ret;
}
X
sub main'setGrammar
{
X    local($gid) = @_;
X    $currentGrammar = $gid;
}
X
sub main'clearParse
{
X    @TokenValue = ();
X    @Token = ();
X    @L1Position = ();
X    @L1 = ();
X    @L2 = ();
X    $inProgress = 0;
}
X
sub main'registerSymbol
{
X    local($name, $pattern) = @_;
X    local($ret) = $nextSymbol++;
X    $symbolName{$currentGrammar, $ret} = $name;
X    $symbolID{$currentGrammar, $name} = $ret;
X    $symbolPattern{$currentGrammar, $ret} = $pattern if defined $pattern;
X    return $ret;
}
X
sub main'requireSymbol
{
X    local($name) = @_;
X    local($ret) = $symbolID{$currentGrammar, $name};
X    $ret = &main'registerSymbol($name) unless defined $ret;
X    return $ret;
}
X
sub main'requirePattern
{
X    local($name, $pattern) = @_;
X    local($sid) = &main'requireSymbol($name);
X    $symbolPattern{$currentGrammar, $sid} = $pattern;
X    return $sid;
}
X
sub main'symbolName
{
X    local($sid) = @_;
X    return $symbolName{$currentGrammar, $sid};
}
X
sub main'terminatorSymbol
{
X    local($sid) = @_;
X    $terminator{$currentGrammar, $sid} = 1;
}
X
sub main'discardSymbol
{
X    local($sid) = @_;
X    $discard{$currentGrammar, $sid} = 1;
}
X
sub main'registerAlternate
{
X    local($value, $lhs, @rhs) = @_;
X    local($alt) = $nextAlternate++;
X    print "Registering alternate $alt, $lhs = " . join(",", @rhs) .
X	" -> \"$value\"\n" if $debug;
X    $Value{$currentGrammar, $alt} = $value;
X    $LHS{$currentGrammar, $alt} = $lhs;
X    $RHS{$currentGrammar, $alt} = pack($listTemplate, @rhs);
X    local(@altList) = unpack($listTemplate,
X	$Alternate{$currentGrammar, $lhs});
X    push(@altList, $alt);
X    $Alternate{$currentGrammar, $lhs} =
X	pack($listTemplate, sort { $a <=> $b } @altList);
X    return $alt;
}
X
sub main'registerToken
{
X    package Marpa;
X    local($value, @symbolList) = @_;
X
X    push(@TokenValue, $value);
X    push(@Token, pack($listTemplate, @symbolList));
}
X
$inProgress = 0;
X
$normal = 1;
$backtrack = 2;
$fail = 3;
$succeed = 4;
X
sub elementOf
{
X    local($element, @set) = @_;
X
X    foreach $setElement (@set)
X    {
X	return 1 if $element == $setElement;
X    }
X    return 0;
}
X
%evalValues = ();
X
sub main'value
{
X    local($sid, $occurrence) = @_;
X    $occurrence = 1 unless defined $occurrence;
X    local($ret) = $evalValues{$sid, $occurrence};
X    $ret = "[?value($sid, occurrence)?]" unless defined $ret;
X    return $ret;
}
X
sub main'v
{
X    local($name, $occurrence) = @_;
X    $sid = &main'requireSymbol($name);
X    return &main'value($sid, $occurrence);
}
X
sub symbolEval
{
X    local($ret);
X    local($L1pos) = $L1Position[$evalPosition];
X    local($L1) = $L1[$evalPosition];
X    print "Starting \$evalPosition=$evalPosition," .
X	"\$L1pos=$L1pos," .
X	"\$L1=$L1\n" if $debug;
X    $evalPosition++;
X    return $TokenValue[$L1pos] if $L1pos >= $[;
X    local(@RHSCount) = ();
X    local(@RHS) = unpack($listTemplate, $RHS{$currentGrammar, $L1});
X    local(%Values) = ();
X    local($i);
X    for ($i = $[; $i <= $#RHS; $i++)
X    {
X	local($sym) = $RHS[$i];
X	local($val) = &symbolEval();
X	return $ret if $parseReject;
X	$RHSCount[$sym] = 0 unless defined $RHSCount[$sym];
X	$RHSCount[$sym]++;
X	$Values{$sym, $RHSCount[$sym]} = $val;
X    }
X    print "Evaluating \$L1pos=$L1pos," .  "\$L1=$L1\n" if $debug;
X    %evalValues = %Values;
X    $evalString = $Value{$currentGrammar, $L1};
X    print "Evaluating \"$evalString\"\n" if $debug;
X    die "Marpa: no value, production $L1, grammar $currentGrammar\n"
X	unless defined $evalString;
X    package main;
X    die "Marpa: unable to eval \"$Marpa'evalString\", $@\n"
X	unless $Marpa'ret = eval $Marpa'evalString;
X    package Marpa;
}
X
sub main'parse
{
X    local($ret);
X    local($startSymbol) = @_;
X
X    if ($inProgress)
X    {
X	$state = $backtrack;
X    } else {
X	$state = $normal;
X	$position = $[;
X	@L1 = ();
X	@L1Position = ();
X	@L2 = ($startSymbol);
X	$inProgress++;
X    }
X
X    config: for (;;)
X    {
X	if ($debug)
X	{
X	    if ($state == $normal)
X	    {
X		print "q";
X	    } elsif ($state == $backtrack)
X	    {
X		print "b";
X	    } elsif ($state == $succeed)
X	    {
X		print "t";
X	    } else
X	    {
X		print "?";
X	    }
X	    print ",";
X	    print $position+1;
X	    print ",";
X	    if ($#L1 < $[) { print "e"; }
X	    else {
X		local($i);
X		for ($i=$[; $i<=$#L1; $i++)
X		{
X		    if ($L1Position[$i] >= $[)
X		    {
X			print $symbolName{$currentGrammar, $L1[$i]};
X		    } else {
X			local($symbol) = $LHS{$currentGrammar, $L1[$i]};
X			local($altList) = $Alternate{$currentGrammar, $symbol};
X			local(@altList) = unpack($listTemplate, $altList);
X			local($j);
X			altCount: for ($j = $[; $j <= $#altList; $j++)
X			{
X			    last altCount if $altList[$j] == $L1[$i];
X			}
X		    print join("",
X			"<$symbolName{$currentGrammar, $symbol}",
X			($j+1),
X			">"
X			);
X		    }
X		}
X	    }
X	    print ",";
X	    if ($#L2 < $[) { print "e" if $state == $succeed; }
X	    else {
X		local($i);
X		for ($i=$#L2; $i>=$[; $i--)
X		{
X		    print "<$symbolName{$currentGrammar, $L2[$i]}>";
X		}
X	    }
X	    print "$" unless $state == $succeed;
X	    print "\n";
X	}
X
X	if ($state == $fail)
X	{
X	    return $ret;
X	}
X
X	if ($state == $normal)
X	{
X	    if ($#L2 == $[ - 1)
X	    {
X		$state = $position > $#Token ? $succeed : $backtrack;
X		next config;
X	    }
X
X	    if (&elementOf($L2[$#L2],
X		unpack($listTemplate, $Token[$position])))
X	    {
X		push(@L1, pop(@L2));
X		push(@L1Position, $position);
X		$position++;
X		next config;
X	    }
X
X	    local($altList) = $Alternate{$currentGrammar, $L2[$#L2]};
X	    if (defined $altList)
X	    {
X		local(@altList) = unpack($listTemplate, $altList);
X		local($alt) = shift @altList;
X		push(@L1, $alt);
X		push(@L1Position, $[ - 1);
X		pop(@L2);
X		push(@L2, reverse unpack($listTemplate,
X		    $RHS{$currentGrammar, $alt}));
X		next config;
X	    }
X
X	    $state = $backtrack;
X	    next config;
X
X	} # end of if NORMAL
X
X	if ($state == $backtrack)
X	{
X	    if ($#L1 < $[)
X	    {
X		$state = $fail;
X		next config;
X	    }
X
X	    if ($L1Position[$#L1] >= $[)
X	    {
X		$position--;
X		push(@L2, pop(@L1));
X		pop(@L1Position);
X		next config;
X	    }
X
X	    $oldAlt = pop(@L1);
X	    pop(@L1Position);
X	    foreach $element (unpack($listTemplate,
X		$RHS{$currentGrammar, $oldAlt}))
X	    {
X		pop(@L2);
X	    }
X
X	    @altList = unpack($listTemplate,
X		$Alternate{$currentGrammar, $LHS{$currentGrammar, $oldAlt}});
X
X	    altPass: while ($element = shift(@altList))
X	    {
X		last altPass if $element == $oldAlt;
X	    }
X
X	    $newAlt = shift(@altList);
X
X	    if (defined $newAlt)
X	    {
X		$state = $normal;
X		push(@L1, $newAlt);
X		push(@L1Position, $[ - 1);
X		push(@L2, reverse unpack($listTemplate,
X		    $RHS{$currentGrammar, $newAlt}));
X		next config;
X	    }
X
X	    push(@L2, $LHS{$currentGrammar, $oldAlt});
X	    next config;
X
X	} # end of if BACKTRACK
X
X	last config;
X
X    }
X
X    $evalPosition = $[;
X    $parseReject = 0;
X    local($val) = &symbolEval();
X    $ret = $val if (!$parseReject);
X    return $ret;
}
X
1;
X
package main;
SHAR_EOF
chmod 0644 marpa.pl ||
echo 'restore of marpa.pl failed'
Wc_c="`wc -c < 'marpa.pl'`"
test 7038 -eq "$Wc_c" ||
	echo 'marpa.pl: original size 7038, current size' "$Wc_c"
fi
# ============= milarepa.pl ==============
if test -f 'milarepa.pl' -a X"$1" != X"-c"; then
	echo 'x - skipping milarepa.pl (File already exists)'
else
echo 'x - extracting milarepa.pl (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'milarepa.pl' &&
require "naropa.pl";
X
$S_Semicolon = &main'registerSymbol("semicolon", ";");
&main'terminatorSymbol($S_Semicolon);
X
$S_StringPiece =
X    &main'registerSymbol("string piece", "\"((\\\\.)|([^\\\"]))*\"");
X
$S_Whitespace = &main'registerSymbol("whitespace", "\\s+");
&main'discardSymbol($S_Whitespace);
X
$S_Comment = &main'registerSymbol("comment", "#[^\n]*\n");
&main'discardSymbol($S_Comment);
X
$S_ConcatenateSign = &main'registerSymbol("concatenate sign", "\\\.");
$S_NameComponent = &main'registerSymbol("name component", "\\w+");
$S_Tilde = &main'registerSymbol("tilde", "~");
$S_ProduceSign = &main'registerSymbol("produce sign", "::=");
X
# action ::= lex pattern | production;
$S_Action = &main'registerSymbol("action");
$S_LexPattern = &main'registerSymbol("lex pattern");
$S_Production = &main'registerSymbol("production");
&main'registerAlternate('&value($S_LexPattern)', $S_Action, $S_LexPattern);
&main'registerAlternate('&value($S_Production)', $S_Action, $S_Production);
X
# lex pattern ::= symbol name . tilde . string
$S_SymbolName = &main'registerSymbol("symbol name");
&main'registerAlternate(
X    qq/join("",
X	"&requirePattern('",
X	    &value($S_SymbolName),
X	    "', ",
X	    &value($S_StringPiece),
X	    ");\n")/,
X    $S_LexPattern, $S_SymbolName, $S_Tilde, $S_StringPiece);
X
# symbol name ::= name component list
$S_NameComponentList = &main'registerSymbol("name component list");
&main'registerAlternate(
X    '&value($S_NameComponentList)',
X    $S_SymbolName, $S_NameComponentList);
X
# name component list ::=
#     name component |
#     name component . name component list;
&main'registerAlternate('&value($S_NameComponent)',
X    $S_NameComponentList, $S_NameComponent);
&main'registerAlternate(
X    'join(" ",&value($S_NameComponent),&value($S_NameComponentList))',
X    $S_NameComponentList, $S_NameComponent, $S_NameComponentList);
X
# string ::= string piece list
$S_String = &main'registerSymbol("string");
$S_StringPieceList = &main'registerSymbol("string piece list");
&main'registerAlternate(
X    '&value($S_StringPieceList)',
X    $S_String, $S_StringPieceList);
X
# string piece list ::=
#    string piece |
#    string piece . string piece list;
&main'registerAlternate(
X    '&value($S_StringPiece)',
X    $S_StringPieceList, $S_StringPiece);
&main'registerAlternate(
X    'join("", &value($S_StringPiece), &value($S_StringPieceList))',
X    $S_StringPieceList, $S_StringPiece, $S_StringPieceList);
X
# production ::=
#    production proper |
#    production proper . production action;
$S_ProductionProper = &main'registerSymbol("production proper");
$S_ProductionAction = &main'registerSymbol("production action");
&registerAlternate(
X    qq/join("",
X	'&defaultAction(\n  ',
X	&value($S_ProductionProper),
X	');\n')/,
X    $S_Production, $S_ProductionProper);
&registerAlternate(
X    qq/join("",
X	'&registerAlternate(\n  ',
X	&value($S_ProductionAction),
X	'\n   ,',
X	&value($S_ProductionProper),
X	');\n')/,
X    $S_Production, $S_ProductionProper, $S_ProductionAction);
X
# production action ::= string piece;
&main'registerAlternate(
X    '&value($S_StringPiece)',
X    $S_ProductionAction, $S_StringPiece);
X
# production proper ::= lhs . produce sign . rhs;
$S_LHS = &main'registerSymbol("lhs");
$S_RHS = &main'registerSymbol("rhs");
&registerAlternate(
X    qq/join('',
X	'&requireSymbol("',
X	&value($S_LHS),
X	'")\n',
X	&value($S_RHS))/,
X    $S_ProductionProper, $S_LHS, $S_ProduceSign, $S_RHS
);
X
# lhs ::= symbol name;
&registerAlternate('&value($S_SymbolName)', $S_LHS, $S_SymbolName);
X
# rhs ::= symbol list;
$S_SymbolList = &main'registerSymbol("symbol list");
&registerAlternate(
X    qq/&value($S_SymbolList)/,
X    $S_RHS, $S_SymbolList
);
X
# symbol list ::=
#    symbol name |
#    symbol name . concatenate symbol . symbol list;
&main'registerAlternate(
X    qq/join(''
X	,'  ,&requireSymbol("'
X	,&value($S_SymbolName)
X	,'")\n'
X    )/,
X    $S_SymbolList, $S_SymbolName);
&registerAlternate(
X    qq/join('',
X	'  ,&requireSymbol("',
X	&value($S_SymbolName),
X	'")\n',
X	&value($S_SymbolList)
X	)/,
X    $S_SymbolList, $S_SymbolName,
X    $S_ConcatenateSign, $S_SymbolList);
X
print "\nrequire 'naropa.pl';\n";
X
# $Marpa'debug = 1;
&naropa($S_Action);
X
print "\n&naropa(&requireSymbol('start symbol'));\n";
SHAR_EOF
chmod 0644 milarepa.pl ||
echo 'restore of milarepa.pl failed'
Wc_c="`wc -c < 'milarepa.pl'`"
test 4221 -eq "$Wc_c" ||
	echo 'milarepa.pl: original size 4221, current size' "$Wc_c"
fi
# ============= mrtest.pl ==============
if test -f 'mrtest.pl' -a X"$1" != X"-c"; then
	echo 'x - skipping mrtest.pl (File already exists)'
else
echo 'x - extracting mrtest.pl (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'mrtest.pl' &&
X
require 'naropa.pl';
# "start symbol ::= E;"
&defaultAction(
X  &requireSymbol("start symbol")
X  ,&requireSymbol("E")
);
X
# "E ::= T;"
&defaultAction(
X  &requireSymbol("E")
X  ,&requireSymbol("T")
);
X
# "E ::= T . plus sign . E;"
&defaultAction(
X  &requireSymbol("E")
X  ,&requireSymbol("T")
X  ,&requireSymbol("plus sign")
X  ,&requireSymbol("E")
);
X
# "T ::= F ;"
&defaultAction(
X  &requireSymbol("T")
X  ,&requireSymbol("F")
);
X
# "T ::= F . times sign . T;"
&defaultAction(
X  &requireSymbol("T")
X  ,&requireSymbol("F")
X  ,&requireSymbol("times sign")
X  ,&requireSymbol("T")
);
X
# "F ::= number;"
&defaultAction(
X  &requireSymbol("F")
X  ,&requireSymbol("number")
);
X
# "number ~ "\\d+";"
&requirePattern('number', "\\d+");
X
# "plus sign ~ "\\+";"
&requirePattern('plus sign', "\\+");
X
# "times sign ~ "\\*";"
&requirePattern('times sign', "\\*");
X
X
&naropa(&requireSymbol('start symbol'));
SHAR_EOF
chmod 0644 mrtest.pl ||
echo 'restore of mrtest.pl failed'
Wc_c="`wc -c < 'mrtest.pl'`"
test 888 -eq "$Wc_c" ||
	echo 'mrtest.pl: original size 888, current size' "$Wc_c"
fi
# ============= mrtest2.pl ==============
if test -f 'mrtest2.pl' -a X"$1" != X"-c"; then
	echo 'x - skipping mrtest2.pl (File already exists)'
else
echo 'x - extracting mrtest2.pl (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'mrtest2.pl' &&
X
require 'naropa.pl';
# "start symbol ::= E  "&v('E')";"
&registerAlternate(
X  "&v('E')"
X   ,&requireSymbol("start symbol")
X  ,&requireSymbol("E")
);
X
# "E ::= T  "&v('T')";"
&registerAlternate(
X  "&v('T')"
X   ,&requireSymbol("E")
X  ,&requireSymbol("T")
);
X
# "E ::= T . plus sign . E  "&v('T')+&v('E')";"
&registerAlternate(
X  "&v('T')+&v('E')"
X   ,&requireSymbol("E")
X  ,&requireSymbol("T")
X  ,&requireSymbol("plus sign")
X  ,&requireSymbol("E")
);
X
# "T ::= F  "&v('F')";"
&registerAlternate(
X  "&v('F')"
X   ,&requireSymbol("T")
X  ,&requireSymbol("F")
);
X
# "T ::= F . times sign . T "&v('F')*&v('T')";"
&registerAlternate(
X  "&v('F')*&v('T')"
X   ,&requireSymbol("T")
X  ,&requireSymbol("F")
X  ,&requireSymbol("times sign")
X  ,&requireSymbol("T")
);
X
# "F ::= number "&v('number')";"
&registerAlternate(
X  "&v('number')"
X   ,&requireSymbol("F")
X  ,&requireSymbol("number")
);
X
# "number ~ "\\d+";"
&requirePattern('number', "\\d+");
X
# "plus sign ~ "\\+";"
&requirePattern('plus sign', "\\+");
X
# "times sign ~ "\\*";"
&requirePattern('times sign', "\\*");
X
X
&naropa(&requireSymbol('start symbol'));
SHAR_EOF
chmod 0644 mrtest2.pl ||
echo 'restore of mrtest2.pl failed'
Wc_c="`wc -c < 'mrtest2.pl'`"
test 1101 -eq "$Wc_c" ||
	echo 'mrtest2.pl: original size 1101, current size' "$Wc_c"
fi
# ============= naropa.pl ==============
if test -f 'naropa.pl' -a X"$1" != X"-c"; then
	echo 'x - skipping naropa.pl (File already exists)'
else
echo 'x - extracting naropa.pl (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'naropa.pl' &&
require "tilopa.pl";
X
sub defaultAction
{
X    local($lhs, @rhs) = @_;
X    local(@action);
X    push(@action, "join('','", &main'symbolName($lhs), "=',");
X    local(@rhsList) = ();
X    foreach $rhs (@rhs)
X    {
X	push(@rhsList, "'<'", "&value($rhs)", "'>'", "','");
X    }
X    pop(@rhsList); # pop extra comma
X    push(@action, join(",", @rhsList));
X    push(@action, ")");
X    &main'registerAlternate(join('', @action), $lhs, @rhs);
}
X
sub naropa
{
X    local($top) = @_;
X    local($line);
X    line: while ($line = <main'STDIN>)
X    {
X	chop $line;
X	while (length($line))
X	{
X	    print "# \"$line\"\n";
X	    local($before) = length($line);
X	    $line = &main'lex($line);
X	    local($after) = length($line);
X	    local($ret);
X	    if ($after == $before)
X	    {
X		$ret = "Lexer failed!!!, line=\"$line\"";
X		$line = "";
X	    } else
X	    {
X		$ret = "Parse failed!!!"
X		    unless $ret = &main'parse($top);
X	    }
X	    print "$ret\n";
X	    &main'clearParse();
X	}
X    }
}
X
1;
SHAR_EOF
chmod 0644 naropa.pl ||
echo 'restore of naropa.pl failed'
Wc_c="`wc -c < 'naropa.pl'`"
test 966 -eq "$Wc_c" ||
	echo 'naropa.pl: original size 966, current size' "$Wc_c"
fi
# ============= test.pl ==============
if test -f 'test.pl' -a X"$1" != X"-c"; then
	echo 'x - skipping test.pl (File already exists)'
else
echo 'x - extracting test.pl (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'test.pl' &&
require "marpa.pl";
X
$SID_E = &registerSymbol("E");
$SID_Plus = &registerSymbol("+", "\+");
$SID_T = &registerSymbol("T");
$SID_Times = &registerSymbol("*", "\*");
$SID_F = &registerSymbol("F");
$SID_a = &registerSymbol("a", "\d*");
X
&registerAlternate("'E(' . &value($SID_T) . ',' . " .
X	" &value($SID_Plus) . ',' . &value($SID_E) . ')'",
X    $SID_E, $SID_T, $SID_Plus, $SID_E);
&registerAlternate("'E(' . &value($SID_T) . ')'",
X    $SID_E, $SID_T);
&registerAlternate("'T(' . &value($SID_F) . ',' . " .
X	" &value($SID_Times) . ',' &value($SID_T) . ')'",
X    $SID_T, $SID_F, $SID_Times, $SID_T);
&registerAlternate("'T(' . &value($SID_F) . ')'",
X    $SID_T, $SID_F);
&registerAlternate("'F(' . &value($SID_a) . ')'",
X    $SID_F, $SID_a);
X
&registerToken("a", $SID_a);
&registerToken("+", $SID_Plus);
&registerToken("a", $SID_a);
X
print &parse($SID_E);
print "\n";
SHAR_EOF
chmod 0644 test.pl ||
echo 'restore of test.pl failed'
Wc_c="`wc -c < 'test.pl'`"
test 865 -eq "$Wc_c" ||
	echo 'test.pl: original size 865, current size' "$Wc_c"
fi
# ============= test2.pl ==============
if test -f 'test2.pl' -a X"$1" != X"-c"; then
	echo 'x - skipping test2.pl (File already exists)'
else
echo 'x - extracting test2.pl (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'test2.pl' &&
require "marpa.pl";
X
$SID_E = &registerSymbol("E");
$SID_Plus = &registerSymbol("+");
$SID_T = &registerSymbol("T");
$SID_Times = &registerSymbol("*");
$SID_F = &registerSymbol("F");
$SID_a = &registerSymbol("a");
X
&registerAlternate("&value($SID_T)+&value($SID_E)",
X    $SID_E, $SID_T, $SID_Plus, $SID_E);
&registerAlternate("&value($SID_T)",
X    $SID_E, $SID_T);
&registerAlternate("&value($SID_F)*&value($SID_T)",
X    $SID_T, $SID_F, $SID_Times, $SID_T);
&registerAlternate("&value($SID_F)",
X    $SID_T, $SID_F);
&registerAlternate("&value($SID_a)",
X    $SID_F, $SID_a);
X
&registerToken("5", $SID_a);
&registerToken("+", $SID_Plus);
&registerToken("6", $SID_a);
X
print &parse($SID_E);
print "\n";
SHAR_EOF
chmod 0644 test2.pl ||
echo 'restore of test2.pl failed'
Wc_c="`wc -c < 'test2.pl'`"
test 700 -eq "$Wc_c" ||
	echo 'test2.pl: original size 700, current size' "$Wc_c"
fi
# ============= tilopa.pl ==============
if test -f 'tilopa.pl' -a X"$1" != X"-c"; then
	echo 'x - skipping tilopa.pl (File already exists)'
else
echo 'x - extracting tilopa.pl (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'tilopa.pl' &&
require "marpa.pl";
X
package Tilopa;
X
$debug = 0;
X
sub main'lex
{
X    local($input) = @_;
X
X    local($key);
X    inputpass: while (length($input))
X    {
X	study $input;
X	pattern: foreach $key (keys %Marpa'symbolPattern)
X	{
X	    local($grammar, $sid) = split($;, $key);
X	    next pattern unless $grammar == $Marpa'currentGrammar;
X	    local($pattern) = $Marpa'symbolPattern{$key};
X	    print "Trying pattern \"$pattern\"\n" if $debug;
X	    next pattern unless $input =~ /^$pattern/;
X	    $input = $';
X	    next inputpass if $Marpa'discard{$grammar, $sid};
X	    last inputpass if $Marpa'terminator{$grammar, $sid};
X	    &main'registerToken($&, $sid);
X	    print "Registering $& as $Marpa'symbolName{$grammar, $sid}\n"
X		if $debug;
X	    next inputpass;
X	}
X	last inputpass;
X    }
X    return $input;
}
X
1;
package main;
SHAR_EOF
chmod 0644 tilopa.pl ||
echo 'restore of tilopa.pl failed'
Wc_c="`wc -c < 'tilopa.pl'`"
test 813 -eq "$Wc_c" ||
	echo 'tilopa.pl: original size 813, current size' "$Wc_c"
fi
# ============= test.mr ==============
if test -f 'test.mr' -a X"$1" != X"-c"; then
	echo 'x - skipping test.mr (File already exists)'
else
echo 'x - extracting test.mr (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'test.mr' &&
start symbol ::= E;
E ::= T;
E ::= T . plus sign . E;
T ::= F ;
T ::= F . times sign . T;
F ::= number;
number ~ "\\d+";
plus sign ~ "\\+";
times sign ~ "\\*";
SHAR_EOF
chmod 0644 test.mr ||
echo 'restore of test.mr failed'
Wc_c="`wc -c < 'test.mr'`"
test 160 -eq "$Wc_c" ||
	echo 'test.mr: original size 160, current size' "$Wc_c"
fi
# ============= test2.mr ==============
if test -f 'test2.mr' -a X"$1" != X"-c"; then
	echo 'x - skipping test2.mr (File already exists)'
else
echo 'x - extracting test2.mr (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'test2.mr' &&
start symbol ::= E  "&v('E')";
E ::= T  "&v('T')";
E ::= T . plus sign . E  "&v('T')+&v('E')";
T ::= F  "&v('F')";
T ::= F . times sign . T "&v('F')*&v('T')";
F ::= number "&v('number')";
number ~ "\\d+";
plus sign ~ "\\+";
times sign ~ "\\*";
SHAR_EOF
chmod 0644 test2.mr ||
echo 'restore of test2.mr failed'
Wc_c="`wc -c < 'test2.mr'`"
test 244 -eq "$Wc_c" ||
	echo 'test2.mr: original size 244, current size' "$Wc_c"
fi
exit 0
-- 
Jeffrey Kegler, Independent UNIX Consultant, Algorists, Inc.
jeffrey@algor2.ALGORISTS.COM or uunet!algor2!jeffrey
137 E Fremont AVE #122, Sunnyvale CA 94087
"Nitwit ideas are for emergencies.  You use them when you've got
nothing else to try.  If they work, they go in the Book.  Otherwise
you follow the Book, which is largely a collection of nitwit ideas
that worked." from the _Mote in God's Eye_ by Larry Niven and Jerry
Pournelle


