In file ../include/EST_SCFG.h:

class EST_SCFG_traintest

A class used to train (and test) SCFGs; it is an extension of EST_SCFG.


Public Methods

void test_corpus ()
Test the current grammar against the current corpus and print a summary.
void test_crossbrackets ()
Test the current grammar against the current corpus.
void load_corpus (const EST_String &filename)
Load a corpus from the given file.
void train_inout (int passes, int startpass, int checkpoint, int spread, const EST_String &outfile)
Train a grammar using the loaded corpus.


Inherited from EST_SCFG:

Public Methods

Constructor and initialisation functions

EST_SCFG(LISP rules)
Initialize from a set of rules

utility functions

void set_rules(LISP rules)
Set (or reset) rules from external source after construction
LISP get_rules()
Return rules as LISP list
SCFGRuleList rules
The rules themselves
void find_terms_nonterms(EST_StrList &nt, EST_StrList &t, LISP rules)
Find the terminals and nonterminals in the given grammar, adding them to the appropriate given string lists
EST_String nonterminal(int p) const
Convert nonterminal index to string form
EST_String terminal(int m) const
Convert terminal index to string form
int nonterminal(const EST_String &p) const
Convert nonterminal string to index
int terminal(const EST_String &m) const
Convert terminal string to index
int num_nonterminals() const
Number of nonterminals
int num_terminals() const
Number of terminals
double prob_B(int p, int q, int r) const
The rule probability of given binary rule
double prob_U(int p, int m) const
The rule probability of given unary rule
void set_rule_prob_cache()
(re-)set rule probability caches

file i/o functions

EST_read_status load(const EST_String &filename)
Load grammar from named file
EST_write_status save(const EST_String &filename)
Save current grammar to named file


Documentation

A class used to train (and test) SCFGs; it is an extension of EST_SCFG.

This offers an implementation of Pereira and Schabes, "Inside-Outside Reestimation from Partially Bracketed Corpora", ACL 1992.

An SCFG may be trained from a corpus (optionally containing brackets) over a series of passes, reestimating the grammar probabilities after each pass. This class extends EST_SCFG by adding support for a bracketed corpus and various indexes for efficient use of the grammar.
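To make the reestimation idea more concrete, here is a minimal sketch of the inside pass, the first half of Inside-Outside. It assumes binary rules of the form p -> q r and unary rules of the form p -> m, as the prob_B and prob_U accessors suggest. It does not use the library: ToyGrammar, inside_prob and example_grammar are hypothetical names, the map-based rule storage is an assumption, and the bracketing constraints of the partially bracketed variant are omitted.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for a grammar; not the EST_SCFG implementation.
struct ToyGrammar {
    int num_nonterminals = 0;
    std::map<std::tuple<int, int, int>, double> binary; // p -> q r
    std::map<std::pair<int, int>, double> unary;        // p -> m (terminal)

    double prob_B(int p, int q, int r) const {
        auto it = binary.find(std::make_tuple(p, q, r));
        return it == binary.end() ? 0.0 : it->second;
    }
    double prob_U(int p, int m) const {
        auto it = unary.find(std::make_pair(p, m));
        return it == unary.end() ? 0.0 : it->second;
    }
};

// Inside probability that nonterminal `root` derives the whole sentence.
// `words` holds terminal indices; chart[i][j][p] covers words i..j-1.
double inside_prob(const ToyGrammar &g, const std::vector<int> &words, int root)
{
    const int n = static_cast<int>(words.size());
    if (n == 0) return 0.0;
    const int N = g.num_nonterminals;
    std::vector<std::vector<std::vector<double>>> chart(
        n + 1, std::vector<std::vector<double>>(
                   n + 1, std::vector<double>(N, 0.0)));

    // Spans of length 1 are filled from the unary (lexical) rules.
    for (int i = 0; i < n; ++i)
        for (int p = 0; p < N; ++p)
            chart[i][i + 1][p] = g.prob_U(p, words[i]);

    // Longer spans sum over every binary rule and every split point k.
    for (int len = 2; len <= n; ++len)
        for (int i = 0; i + len <= n; ++i) {
            const int j = i + len;
            for (const auto &rule : g.binary) {
                int p, q, r;
                std::tie(p, q, r) = rule.first;
                double sum = 0.0;
                for (int k = i + 1; k < j; ++k)
                    sum += chart[i][k][q] * chart[k][j][r];
                chart[i][j][p] += rule.second * sum;
            }
        }
    return chart[0][n][root];
}

// Example grammar: S -> S S (0.4), S -> a (0.6); S and a both have index 0.
ToyGrammar example_grammar()
{
    ToyGrammar g;
    g.num_nonterminals = 1;
    g.binary[std::make_tuple(0, 0, 0)] = 0.4;
    g.unary[std::make_pair(0, 0)] = 0.6;
    return g;
}
```

With the example grammar, the two-word sentence "a a" has a single derivation, so inside_prob(example_grammar(), {0, 0}, 0) is 0.4 * 0.6 * 0.6. In the partially bracketed setting, spans that cross a corpus bracket would simply be skipped when filling the chart.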

void test_corpus()
Test the current grammar against the current corpus and print a summary.

Only the cross-entropy measure is reported.
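The exact formula used by test_corpus is not given here. As an illustration only, one common definition of corpus cross entropy is the negative base-2 log probability per word. In the sketch below, SentenceScore and cross_entropy_bits_per_word are hypothetical names, and skipping unparsable sentences is an assumption rather than documented behaviour.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-sentence result: parse probability and sentence length.
struct SentenceScore {
    double prob;
    std::size_t num_words;
};

// Negative log2 probability per word over all scored sentences.
// Sentences with zero probability or no words are skipped (an assumption).
double cross_entropy_bits_per_word(const std::vector<SentenceScore> &scores)
{
    double log_sum = 0.0;
    std::size_t words = 0;
    for (const auto &s : scores) {
        if (s.prob <= 0.0 || s.num_words == 0)
            continue;
        log_sum += std::log2(s.prob);
        words += s.num_words;
    }
    return words == 0 ? 0.0 : -log_sum / static_cast<double>(words);
}
```

Lower values mean the grammar assigns higher probability to the corpus, which is why the measure can be tracked across training passes.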

void test_crossbrackets()
Test the current grammar against the current corpus.

The summary includes the percentage of cross-bracketing accuracy and the percentage of fully correct parses.
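How test_crossbrackets scores a parse is not spelled out here. The following sketch shows one plausible reading: a bracket in the parse counts as correct if it does not cross any bracket in the corpus. Span, spans_cross and percent_non_crossing are hypothetical names, and brackets are assumed to be half-open word ranges.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// A bracket as a half-open range [first, second) of word positions.
using Span = std::pair<int, int>;

// Two spans cross when they overlap but neither contains the other.
bool spans_cross(const Span &a, const Span &b)
{
    return (a.first < b.first && b.first < a.second && a.second < b.second) ||
           (b.first < a.first && a.first < b.second && b.second < a.second);
}

// Percentage of parsed brackets that cross no reference bracket.
double percent_non_crossing(const std::vector<Span> &parsed,
                            const std::vector<Span> &reference)
{
    if (parsed.empty())
        return 100.0;
    int ok = 0;
    for (const auto &p : parsed) {
        bool crosses = false;
        for (const auto &r : reference)
            if (spans_cross(p, r)) {
                crosses = true;
                break;
            }
        if (!crosses)
            ++ok;
    }
    return 100.0 * ok / static_cast<double>(parsed.size());
}
```

Under this reading, a parse would be fully correct when every one of its brackets is non-crossing, that is, when the function returns 100.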

void load_corpus(const EST_String &filename)
Load a corpus from the given file.

Each sentence in the corpus should be enclosed in parentheses. Additional parentheses may be used to denote phrasing within a sentence. The corpus is read using the LISP reader, so LISP conventions apply; notably, single quotes should appear within double quotes.
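As an illustration of the format described above, a corpus might contain lines such as ((the cat) (sat down)) and (the dog "'s" bone); these sentences are invented for this example. The sketch below is a simple pre-check that could be run before loading. It is not the LISP reader: count_sentences is a hypothetical helper that counts top-level parenthesised sentences and treats the contents of double-quoted strings as opaque.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the number of top-level parenthesised sentences in `text`,
// or -1 if the parentheses or double quotes are unbalanced.
int count_sentences(const std::string &text)
{
    int depth = 0;
    int sentences = 0;
    bool in_string = false;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (in_string) {
            if (c == '\\' && i + 1 < text.size())
                ++i;                 // skip the escaped character
            else if (c == '"')
                in_string = false;   // closing quote
        } else if (c == '"') {
            in_string = true;        // opening quote
        } else if (c == '(') {
            ++depth;
        } else if (c == ')') {
            if (depth == 0)
                return -1;           // close without a matching open
            if (--depth == 0)
                ++sentences;         // a top-level sentence just ended
        }
    }
    return (depth == 0 && !in_string) ? sentences : -1;
}
```

A check like this catches a missing parenthesis early, before a long training run is started on a malformed corpus.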

void train_inout(int passes, int startpass, int checkpoint, int spread, const EST_String &outfile)
Train a grammar using the loaded corpus.

Parameters:
passes - the number of training passes desired.
startpass - the pass from which to start.
checkpoint - save the grammar every n passes.
spread - the percentage of the corpus to use on each pass; the portion used cycles through the corpus from one pass to the next.
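The parameter descriptions leave the exact selection and checkpoint rules open. The sketch below shows one way such a scheme could work. It is an assumption for illustration and not the library's code: sentence_in_pass and is_checkpoint are hypothetical helpers, passes are numbered from 0, and a spread outside 1..99 is taken to mean the whole corpus.

```cpp
#include <cassert>

// Hypothetical: is sentence `i` used on pass `pass` when `spread` percent
// of the corpus is wanted? The selected window shifts by `spread` sentences
// in every block of 100 on each pass, so it cycles through the corpus.
bool sentence_in_pass(int i, int pass, int spread)
{
    if (spread <= 0 || spread >= 100)
        return true; // assumed to mean "use everything"
    return ((i + pass * spread) % 100) < spread;
}

// Hypothetical: save the grammar after every `checkpoint` completed passes.
bool is_checkpoint(int pass, int checkpoint)
{
    return checkpoint > 0 && ((pass + 1) % checkpoint) == 0;
}
```

With spread set to 25 under this scheme, pass 0 uses sentences 0 to 24 of each block of 100, pass 1 uses sentences 75 to 99, and after four passes every sentence has been used once.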


This class has no child classes.



This page is part of the Edinburgh Speech Tools Library documentation
Copyright University of Edinburgh 1997
Contact: speech_tools@cstr.ed.ac.uk