1. Textual positions of tokens:

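yylex reports the text position of each token in the global yylloc:
  yylloc.first_line
	.first_column
	.last_line
	.last_column

Its type is YYLTYPE.

In actions you can then use @n, which refers to the location (the
struct above) of the n-th component of the rule.

Using this feature slows the parser down noticeably.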
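
A minimal sketch of the scanner side, assuming a hand-written yylex
that splits words on whitespace.  The YYLTYPE typedef here is a
stand-in for the one bison would declare, and scan_init, next_char,
cur_line and cur_col are hypothetical names.  Columns are 1-based and
last_column is inclusive; that is a choice of this sketch, not a rule.

```c
/* Stand-in for the YYLTYPE that the generated parser would declare. */
typedef struct {
    int first_line, first_column;
    int last_line, last_column;
} YYLTYPE;

YYLTYPE yylloc;

/* Hypothetical scanner state: remaining input and current position. */
static const char *input = "";
static int cur_line = 1, cur_col = 1;

void scan_init(const char *text) {
    input = text;
    cur_line = 1;
    cur_col = 1;
}

/* Consume one character and keep the line/column counters in step. */
static void next_char(void) {
    if (*input == '\n') {
        cur_line++;
        cur_col = 1;
    } else {
        cur_col++;
    }
    input++;
}

static int is_space(char c) {
    return c == ' ' || c == '\t' || c == '\n';
}

/* Returns 0 at end of input, 1 for a word; fills in yylloc for the word. */
int yylex(void) {
    while (is_space(*input))
        next_char();
    if (*input == '\0')
        return 0;

    yylloc.first_line = cur_line;
    yylloc.first_column = cur_col;
    while (*input != '\0' && !is_space(*input)) {
        yylloc.last_line = cur_line;     /* position of the last char seen */
        yylloc.last_column = cur_col;
        next_char();
    }
    return 1;
}
```

An action could then report something like @1.first_line for the
first component of its rule.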



2. Stack overflow

When the parser stack overflows, yyerror is called to report the
overflow and yyparse returns non-zero.

Define YYMAXDEPTH as an integer to raise the limit on the stack size.
You can go ahead and give a massive value; the stack is only allocated
as necessary.  The default is 10000.  YYINITDEPTH is the amount
initially allocated (default 200).



3. Perfect hash functions

Motivation

Read the manual

Basically, a perfect hash function generator gives you a function to
look up your keywords with no collisions.

If you are doing any real case (programming language, assembler, etc.)
then this is a really important tool.

Gillions of options.
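
To make the idea concrete, here is a hand-built sketch of the kind of
lookup function such a generator (GNU gperf, say; the notes do not
name the tool, so that is an assumption) produces.  The keyword set,
the token codes, and the hash formula are all made up for
illustration; the formula was picked by hand so that these five
keywords land in five different slots.

```c
#include <string.h>

/* Hypothetical token codes; TOK_NONE means "not a keyword". */
enum { TOK_NONE = 0, TOK_IF, TOK_ELSE, TOK_WHILE, TOK_FOR, TOK_RETURN };

struct keyword {
    const char *name;
    int token;
};

#define TABLE_SIZE 9

/* Slot numbers follow from keyword_hash below; unused slots stay NULL. */
static const struct keyword keyword_table[TABLE_SIZE] = {
    [0] = { "while",  TOK_WHILE  },
    [2] = { "if",     TOK_IF     },
    [3] = { "for",    TOK_FOR    },
    [5] = { "return", TOK_RETURN },
    [8] = { "else",   TOK_ELSE   },
};

/* Length + first char + last char, reduced to a table index (len >= 1). */
static unsigned keyword_hash(const char *s, size_t len) {
    unsigned h = (unsigned)len
               + (unsigned char)s[0]
               + (unsigned char)s[len - 1];
    return h % TABLE_SIZE;
}

/* One hash and at most one string comparison per lookup. */
int keyword_lookup(const char *s) {
    size_t len = strlen(s);
    if (len == 0)
        return TOK_NONE;
    const struct keyword *k = &keyword_table[keyword_hash(s, len)];
    if (k->name != NULL && strcmp(k->name, s) == 0)
        return k->token;
    return TOK_NONE;
}
```

The scanner would call keyword_lookup on every identifier-shaped word
and fall back to a plain identifier token when it returns TOK_NONE.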



4. Debugging

Use bison -v to write a report giving the meaning of each state and so forth.

Define YYDEBUG or use -t to compile in the tracing code.
Turn on tracing by setting yydebug to non-zero (using a debugger, say).
Trace information then gets written for:
   Each token returned by yylex
   Each shift, giving the complete contents of the stack
   Each reduce, giving the reduction and the complete stack
If you want the trace to print semantic values, then define the macro
   YYPRINT (stream, token code, token value)


5. The Central Dogma of Molecular Biology

A.  DNA   ->   mRNA   ->  protein
B.  Text  ->   sounds   ->   conceptualization
C.  Character stream   ->   Token stream   ->   parsed result (syntax tree?)



a. proteins affect DNA transcription
a. Semantic info in token types: the lexer looks at the state left by
   earlier parsing (e.g. which names are typedefs) to decide what a
   thing is.
   A wrinkle in C: a declaration can redeclare a typedef name if an
     explicit type is named earlier in the declaration:
       typedef int foo, bar, lose;
       static foo (bar);        /* redeclare bar as static variable */
       static int foo (lose);   /* redeclare foo as function */
     This forces duplication of the rules, so that the special cases
     can be allowed:

initdcl:
          declarator maybeasm '='
          init
        | declarator maybeasm
        ;

notype_initdcl:
          notype_declarator maybeasm '='
          init
        | notype_declarator maybeasm
        ;
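
A minimal sketch of the feedback in (a), assuming the parser records
typedef names in a small table that the lexer consults.  The names
declare_typedef and classify_name, the table, and the token codes are
hypothetical; a real compiler would use its symbol table and handle
scopes.

```c
#include <string.h>

/* Hypothetical token codes for the two kinds of name. */
enum { IDENTIFIER = 1, TYPENAME = 2 };

#define MAX_TYPEDEFS 32

/* Names are stored by pointer, so the caller must keep them alive. */
static const char *typedef_names[MAX_TYPEDEFS];
static int typedef_count = 0;

/* Parser side: called from the action that finishes a typedef. */
void declare_typedef(const char *name) {
    if (typedef_count < MAX_TYPEDEFS)
        typedef_names[typedef_count++] = name;
}

/* Lexer side: the token type depends on what the parser has seen so far. */
int classify_name(const char *name) {
    for (int i = 0; i < typedef_count; i++) {
        if (strcmp(typedef_names[i], name) == 0)
            return TYPENAME;
    }
    return IDENTIFIER;
}
```

The same spelling therefore comes back as IDENTIFIER before its
typedef and as TYPENAME after it.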

b. Lexical tie-ins: bison actions that set flags which alter the
scanning of tokens.  For example, a hex ( .... ) construct that
changes the interpretation of literals.  If you do error recovery,
then you must be very careful that the flag is also reset (the
context closed) correctly in the error handling.
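
A sketch of such a tie-in, assuming a global flag (called hexflag
here) that the parser sets and the scanner reads.  scan_literal stands
in for the scanner code that converts a literal, and
parse_hex_construct is a hypothetical stand-in for the actions around
hex ( .... ), including the error path.

```c
#include <stdlib.h>

/* Shared flag: set by parser actions, read by the scanner. */
int hexflag = 0;

/* Scanner side: the same text converts differently depending on the flag. */
long scan_literal(const char *text) {
    return strtol(text, NULL, hexflag ? 16 : 10);
}

/*
 * Parser side (simulated): what the actions around "hex ( literal )" do.
 * Returns the literal's value, or -1 when an error is simulated.
 */
long parse_hex_construct(const char *literal_text, int simulate_error) {
    long value;

    hexflag = 1;                 /* action run once "hex (" has been seen */
    if (simulate_error)
        value = -1;              /* stands in for the error rule firing */
    else
        value = scan_literal(literal_text);
    hexflag = 0;                 /* reset on the normal AND the error path */
    return value;
}
```

If the reset were missing on the error path, every later literal would
silently be read as hex.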

c. the global context of the cytoplasm, cf bootstrapping parsers--but
   only needed once per planet.

d. some of the genetic code is not arbitrary, but parts of it are, and
   those parts are coded for by the DNA.  

e. Mitochondrial and Chloroplastic? DNA
   other parsers, with other lexers -- use the features that let you
     have multiple parsers and lexers in one program by renaming the
     external symbols (yyparse, yylex, yylval, ...), e.g. bison's -p
     option

f. line numbers [if through global variables] are a direct
   manipulation of the world by the scanner (cf tRNA or mRNA)

Moral: be flexible.  Keep in mind that there are backdoors for almost
       everything.  Don't get boxed in by the Central Dogma, but at
       the same time, try to conform to it before reaching for other
       solutions (e.g., the better handling of lexical position in
       bison).
