This is Info file pm.info, produced by Makeinfo version 1.68 from the
input file bigpm.texi.


File: pm.info,  Node: DBIx/DBSchema/DBD/Pg,  Next: DBIx/DBSchema/DBD/mysql,  Prev: DBIx/DBSchema/DBD,  Up: Module List

PostgreSQL native driver for DBIx::DBSchema
*******************************************

NAME
====

   DBIx::DBSchema::DBD::Pg - PostgreSQL native driver for DBIx::DBSchema

SYNOPSIS
========

   use DBI; use DBIx::DBSchema;

   $dbh = DBI->connect('dbi:Pg:dbname=database', 'user', 'pass'); $schema
= new_native DBIx::DBSchema $dbh;

DESCRIPTION
===========

   This module implements a PostgreSQL-native driver for DBIx::DBSchema.

AUTHOR
======

   Ivan Kohler <ivan-dbix-dbschema@420.am>

COPYRIGHT
=========

   Copyright (c) 2000 Ivan Kohler Copyright (c) 2000 Mail Abuse Prevention
System LLC All rights reserved.  This program is free software; you can
redistribute it and/or modify it under the same terms as Perl itself.

BUGS
====

   Yes.

   columns doesn't return column default information.

SEE ALSO
========

   *Note DBIx/DBSchema: DBIx/DBSchema,, *Note DBIx/DBSchema/DBD:
DBIx/DBSchema/DBD,, `DBI' in this node, `DBI::DBD' in this node


File: pm.info,  Node: DBIx/DBSchema/DBD/mysql,  Next: DBIx/DBSchema/Table,  Prev: DBIx/DBSchema/DBD/Pg,  Up: Module List

MySQL native driver for DBIx::DBSchema
**************************************

NAME
====

   DBIx::DBSchema::DBD::mysql - MySQL native driver for DBIx::DBSchema

SYNOPSIS
========

   use DBI; use DBIx::DBSchema;

   $dbh = DBI->connect('dbi:mysql:database', 'user', 'pass'); $schema =
new_native DBIx::DBSchema $dbh;

DESCRIPTION
===========

   This module implements a MySQL-native driver for DBIx::DBSchema.

AUTHOR
======

   Ivan Kohler <ivan-dbix-dbschema@420.am>

COPYRIGHT
=========

   Copyright (c) 2000 Ivan Kohler Copyright (c) 2000 Mail Abuse Prevention
System LLC All rights reserved.  This program is free software; you can
redistribute it and/or modify it under the same terms as Perl itself.

BUGS
====

SEE ALSO
========

   *Note DBIx/DBSchema: DBIx/DBSchema,, *Note DBIx/DBSchema/DBD:
DBIx/DBSchema/DBD,, `DBI' in this node, `DBI::DBD' in this node


File: pm.info,  Node: DBIx/DBSchema/Table,  Next: DBIx/DataSource,  Prev: DBIx/DBSchema/DBD/mysql,  Up: Module List

Table objects
*************

NAME
====

   DBIx::DBSchema::Table - Table objects

SYNOPSIS
========

     use DBIx::DBSchema::Table;

     $table = new DBIx::DBSchema::Table (
       "table_name",
       "primary_key",
       $dbix_dbschema_colgroup_unique_object,
       $dbix_dbschema_colgroup_index_object,
       @dbix_dbschema_column_objects,
     );

     $table->addcolumn ( $dbix_dbschema_column_object );

     $table_name = $table->name;
     $table->name("table_name");

     $primary_key = $table->primary_key;
     $table->primary_key("primary_key");

     $dbix_dbschema_colgroup_unique_object = $table->unique;
     $table->unique( $dbix_dbschema__colgroup_unique_object );

     $dbix_dbschema_colgroup_index_object = $table->index;
     $table->index( $dbix_dbschema_colgroup_index_object );

     @column_names = $table->columns;

     $dbix_dbschema_column_object = $table->column("column");

     #preferred
     @sql_statements = $table->sql_create_table $dbh;
     @sql_statements = $table->sql_create_table $datasrc, $username, $password;

     #possible problems
     @sql_statements = $table->sql_create_table $datasrc;
     @sql_statements = $table->sql_create_table;

DESCRIPTION
===========

   DBIx::DBSchema::Table objects represent a single database table.

METHODS
=======

new [ TABLE_NAME [ , PRIMARY_KEY [ , UNIQUE [ , INDEX [ , COLUMN... ] ] ] ] ]
     Creates a new DBIx::DBSchema::Table object.  TABLE_NAME is the name
     of the table.  PRIMARY_KEY is the primary key (may be empty).  UNIQUE
     is a DBIx::DBSchema::ColGroup::Unique object (see *Note
     DBIx/DBSchema/ColGroup/Unique: DBIx/DBSchema/ColGroup/Unique,).
     INDEX is a DBIx::DBSchema::ColGroup::Index object (see *Note
     DBIx/DBSchema/ColGroup/Index: DBIx/DBSchema/ColGroup/Index,).  The
     rest of the arguments should be DBIx::DBSchema::Column objects (see
     *Note DBIx/DBSchema/Column: DBIx/DBSchema/Column,).

new_odbc DATABASE_HANDLE TABLE_NAME
     Creates a new DBIx::DBSchema::Table object from the supplied DBI
     database handle for the specified table.  This uses the experimental
     DBI type_info method to create a table with standard (ODBC) SQL
     column types that most closely correspond to any non-portable column
     types.   Use this to import a schema that you wish to use with many
     different database engines.  Although primary key and (unique) index
     information will only be imported from databases with
     DBIx::DBSchema::DBD drivers (currently MySQL and PostgreSQL), import
     of column names and attributes *should* work for any database.

new_native DATABASE_HANDLE TABLE_NAME
     Creates a new DBIx::DBSchema::Table object from the supplied DBI
     database handle for the specified table.  This uses database-native
     methods to read the schema, and will preserve any non-portable column
     types.  The method is only available if there is a
     DBIx::DBSchema::DBD for the corresponding database engine (currently,
     MySQL and PostgreSQL).

addcolumn COLUMN
     Adds this DBIx::DBSchema::Column object.

name [ TABLE_NAME ]
     Returns or sets the table name.

primary_key [ PRIMARY_KEY ]
     Returns or sets the primary key.

unique [ UNIQUE ]
     Returns or sets the DBIx::DBSchema::ColGroup::Unique object.

index [ INDEX ]
     Returns or sets the DBIx::DBSchema::ColGroup::Index object.

columns
     Returns a list consisting of the names of all columns.

column COLUMN_NAME
     Returns the column object (see *Note DBIx/DBSchema/Column:
     DBIx/DBSchema/Column,) for the specified COLUMN_NAME.

sql_create_table [ DATABASE_HANDLE | DATA_SOURCE [ USERNAME PASSWORD [ ATTR ] ] ]
     Returns a list of SQL statments to create this table.

     The data source can be specified by passing an open DBI database
     handle, or by passing the DBI data source name, username and password.

     Although the username and password are optional, it is best to call
     this method with a database handle or data source including a valid
     username and password - a DBI connection will be opened and the
     quoting and type mapping will be more reliable.

     If passed a DBI data source (or handle) such as `DBI:mysql:database',
     will use MySQL- or PostgreSQL-specific syntax.  Non-standard syntax
     for other engines (if applicable) may also be supported in the future.

AUTHOR
======

   Ivan Kohler <ivan-dbix-dbschema@420.am>

COPYRIGHT
=========

   Copyright (c) 2000 Ivan Kohler Copyright (c) 2000 Mail Abuse Prevention
System LLC All rights reserved.  This program is free software; you can
redistribute it and/or modify it under the same terms as Perl itself.

BUGS
====

   sql_create_table() has database-specific foo that probably ought to be
abstracted into the DBIx::DBSchema::DBD:: modules.

   sql_create_table may change or destroy the object's data.  If you need
to use the object after sql_create_table, make a copy beforehand.

   Some of the logic in new_odbc might be better abstracted into Column.pm
etc.

SEE ALSO
========

   *Note DBIx/DBSchema: DBIx/DBSchema,, *Note
DBIx/DBSchema/ColGroup/Unique: DBIx/DBSchema/ColGroup/Unique,, *Note
DBIx/DBSchema/ColGroup/Index: DBIx/DBSchema/ColGroup/Index,, *Note
DBIx/DBSchema/Column: DBIx/DBSchema/Column,, `DBI' in this node


File: pm.info,  Node: DBIx/DataSource,  Next: DBIx/DataSource/Driver,  Prev: DBIx/DBSchema/Table,  Up: Module List

Database-independant create and drop functions
**********************************************

NAME
====

   DBIx::DataSource - Database-independant create and drop functions

SYNOPSIS
========

     use DBIx::DataSource qw( create_database drop_database );

     create_database( $data_source, $username, $password )
       or warn $DBIx::DataSource::errstr;

     drop_database( $data_source, $username, $password )
       or warn $DBIx::DataSource::errstr;

DESCRIPTION
===========

   This module implements create_database and drop_database functions for
databases.  It aims to provide a common interface to database creation and
deletion regardless of the actual database being used.

   Currently supported databases are MySQL and PostgreSQL.  Assistance
adding support for other databases is welcomed and relatively simple - see
*Note DBIx/DataSource/Driver: DBIx/DataSource/Driver,.

FUNCTIONS
=========

create_database DATA_SOURCE USERNAME PASSWORD
     Create the database specified by the given DBI data source.

drop_database DATA_SOURCE
     Drop the database specified by the given DBI data source.

AUTHOR
======

   Ivan Kohler <ivan-dbix-datasource@420.am>

COPYRIGHT
=========

   Copyright (c) 2000 Ivan Kohler Copyright (c) 2000 Mail Abuse Prevention
System LLC All rights reserved.  This program is free software; you can
redistribute it and/or modify it under the same terms as Perl itself.

BUGS
====

   If DBI data sources were objects, these functions would be methods.

SEE ALSO
========

   *Note DBIx/DataSource/Driver: DBIx/DataSource/Driver,, *Note
DBIx/DataSource/mysql: DBIx/DataSource/mysql,, *Note DBIx/DataSource/Pg:
DBIx/DataSource/Pg,, `DBI' in this node


File: pm.info,  Node: DBIx/DataSource/Driver,  Next: DBIx/DataSource/Pg,  Prev: DBIx/DataSource,  Up: Module List

Driver Writer's Guide and base class
************************************

NAME
====

   DBIx::DataSource::Driver - Driver Writer's Guide and base class

SYNOPSIS
========

     perldoc DBIx::DataSource::Driver;

     or

     package DBIx::DataSource::FooBase;
     use DBIx::DataSource::Driver;
     @ISA = qw( DBIx::DataSource::Driver );

DESCRIPTION
===========

   To implement a driver for your database:

   1) If you can create a database with an SQL command through DBI/DBD,
simply    provide a parse_dsn class method which returns a list consisting
of the    *actual* data source to use in DBI->connect and the SQL.

     package DBIx::DataSource::NewDatabase;
     use DBIx::DataSource::Driver;
     @ISA = qw( DBIx::DataSource::Driver );

     sub parse_dsn {
       my( $class, $action, $dsn ) = @_;

     # $action is `create' or `drop'
     # for example, if you parse parse $dsn for $database,
     # $sql = "$action $database";

     # you can die on errors - it'll be caught

     ( $new_dsn, $sql );
            }

   2) Otherwise, you'll need to write create_database and drop_database
class methods.

     package DBIx::DataSource::NewDatabase;

     sub create_database {
       my( $class, $dsn, $user, $pass ) = @_;

     # for success, return true
     # for failure, die (it'll be caught)
            }

     sub drop_database {
       my( $class, $dsn, $user, $pass ) = @_;

     # for success, return true
     # for failure, die (it'll be caught)
            }

AUTHOR
======

   Ivan Kohler <ivan-dbix-datasource@420.am>

COPYRIGHT
=========

   Copyright (c) 2000 Ivan Kohler Copyright (c) 2000 Mail Abuse Prevention
System LLC All rights reserved.  This program is free software; you can
redistribute it and/or modify it under the same terms as Perl itself.

BUGS
====

SEE ALSO
========

   *Note DBIx/DataSource: DBIx/DataSource,, *Note DBIx/DataSource/mysql:
DBIx/DataSource/mysql,, *Note DBIx/DataSource/Pg: DBIx/DataSource/Pg,,
`DBI' in this node


File: pm.info,  Node: DBIx/DataSource/Pg,  Next: DBIx/DataSource/mysql,  Prev: DBIx/DataSource/Driver,  Up: Module List

PostgreSQL driver for DBIx::DataSource
**************************************

NAME
====

   DBIx::DataSource::Pg - PostgreSQL driver for DBIx::DataSource

SYNOPSIS
========

     use DBIx::DataSource;

     use DBIx::DataSource qw( create_database drop_database );

     create_database( "dbi:Pg:dbname=$dbname", $username, $password )
       or warn $DBIx::DataSource::errstr;

     create_database( "dbi:Pg:dbname=$dbname;host=$host;port=$port",
                      $username, $password )
       or warn $DBIx::DataSource::errstr;

     drop_database( "dbi:Pg:dbname=$dbname", $username, $password )
       or warn $DBIx::DataSource::errstr;

     drop_database( "dbi:Pg:dbname=$dbname;host=$host;port=$port",
                     $username, $password )
       or warn $DBIx::DataSource::errstr;

DESCRIPTION
===========

   This is the PostgresSQL driver for DBIx::DataSource.

AUTHOR
======

   Ivan Kohler <ivan-dbix-datasource@420.am>

COPYRIGHT
=========

   Copyright (c) 2000 Ivan Kohler Copyright (c) 2000 Mail Abuse Prevention
System LLC All rights reserved.  This program is free software; you can
redistribute it and/or modify it under the same terms as Perl itself.

BUGS
====

SEE ALSO
========

   *Note DBIx/DataSource/Driver: DBIx/DataSource/Driver,, *Note
DBIx/DataSource: DBIx/DataSource,, `DBD::Pg' in this node, `DBI' in this
node


File: pm.info,  Node: DBIx/DataSource/mysql,  Next: DBIx/Formatter,  Prev: DBIx/DataSource/Pg,  Up: Module List

MySQL driver for DBIx::DataSource
*********************************

NAME
====

   DBIx::DataSource::mysql - MySQL driver for DBIx::DataSource

SYNOPSIS
========

     use DBIx::DataSource;

     use DBIx::DataSource qw( create_database drop_database );

     create_database( "dbi:mysql:$database", $username, $password )
       or warn $DBIx::DataSource::errstr;

     create_database( "dbi:mysql:database=$database;host=$hostname;port=$port",
                      $username, $password )
       or warn $DBIx::DataSource::errstr;

     drop_database( "dbi:mysql:$database", $username, $password )
       or warn $DBIx::DataSource::errstr;

     drop_database( "dbi:mysql:database=$database;host=$hostname;port=$port",
                     $username, $password )
       or warn $DBIx::DataSource::errstr;

DESCRIPTION
===========

   This is the MySQL driver for DBIx::DataSource.

AUTHOR
======

   Ivan Kohler <ivan-dbix-datasource@420.am>

COPYRIGHT
=========

   Copyright (c) 2000 Ivan Kohler Copyright (c) 2000 Mail Abuse Prevention
System LLC All rights reserved.  This program is free software; you can
redistribute it and/or modify it under the same terms as Perl itself.

BUGS
====

SEE ALSO
========

   *Note DBIx/DataSource/Driver: DBIx/DataSource/Driver,, *Note
DBIx/DataSource: DBIx/DataSource,, *Note DBD/mysql: DBD/mysql,, `DBI' in
this node


File: pm.info,  Node: DBIx/Formatter,  Next: DBIx/FullTextSearch,  Prev: DBIx/DataSource/mysql,  Up: Module List

Module to perform report generation via query DBI
*************************************************

NAME
====

   Formatter - Module to perform report generation via query DBI

SYNOPSIS
========

     use Formatter;

     format FMT_HEADER=
     **************************************************
     *             FIRST PAGE OF REPORT               *
     **************************************************
     .

     format FMT_TTITLE=
     **************************************************
     * SOC : @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< *
             $DEAZIEND
     **************************************************
     PROG    CDAZIEND CDDIPEND
     ----    -------- --------
     .

     format FMT_BODY=
     @<<<    @<<<     @<<<<<
     $FMT->line,     $CDAZIEND,$CDDIPEND
     .

     format FMT_CD1LVSTR=
     TOT]--> @<<<<<<< @<<<<<<< @<<<<<<<
        111,     111,     111
     .
     format FMT_BTITLE=
     --------------------------------------------------
     @</@</@<                                    P.@<<<
     $DAY,$MONTH,$YEAR,                      $FMT->page
     .

     $BREAKS[0]="CDAZIEND";
     $BREAKS[1]="CDDIPEND";

     $FMT=new Formatter(
     'DBI_DRIVER'           => 'Oracle',
     'DBI_DATABASE'         => 'database',
     'DBI_USERNAME'         => 'dbusername',
     'DBI_PASSWORD'         => 'dbpassword',
     'DBD_QUERY'            => 'SELECT * FROM ANAGRA WHERE CLPERRIF=199901 ORDER BY CDAZIEND',
     'BREAKS'               => \@BREAKS,
     'BREAKS_SKIP_PAGE'     =>  {
         CD1LVSTR => 1,
         CDCCOSTO => 0
     },
     'FORMAT_PAGESIZE'      => 40,
     'FORMAT_LINESIZE'      => 50,
     'FORMAT_FORMFEED'      => "\f",
     'FORMAT_HEADER'        => *FMT_HEADER,
     'FORMAT_TTITLE'        => *FMT_TTITLE,
     'FORMAT_BTITLE'        => *FMT_BTITLE,
     'FORMAT_BTITLE_HEIGHT' => 2,
     'FORMAT_BODY'          => *FMT_BODY,
     'FORMAT_BREAKS'        =>  {
         CD1LVSTR => *FMT_CD1LVSTR,
     },
     'EVENT_PREHEADER'      => \&PREHEADER,
     'EVENT_POSTHEADER'     => \&POSTHEADER,
     'EVENT_PRETTITLE'      => \&PRETTITLE,
     'EVENT_POSTTTITLE'     => \&POSTTTITLE,
     'EVENT_PREBODY'        => \&PREBODY,
     'EVENT_POSTBODY'       => \&POSTBODY,
     'EVENT_PREBTITLE'      => \&PREBTITLE,
     'EVENT_POSTBTITLE'     => \&POSTBTITLE,
     'EVENT_ALLBREAKS'      => \&BREAKALL,
     'EVENT_BREAKS'         =>  {
         CDAZIEND => \&CDAZIEND,
         CDDIPEND => \&CDDIPEND
     }
         );

     $FMT->generate();

     sub PREHEADER    {do something before header print out}
     sub POSTHEADER   {do something after header print out}
     sub PRETTITLE    {do something before top title print out}
     sub POSTTTITLE   {do something after top title print out}
     sub PREBTITLE    {do something before bottom title print out}
     sub POSTBTITLE   {do something after bottom title print out}
     sub PREBODY      {do something before body print out}
     sub POSTBODY     {do something after body print out}

     $FMT->ofmt("Print out a line during report generation",">");

DESCRIPTION
===========

     Formatter module perform report generation based on DBI query.

DEFINITION
==========

Function new
------------

   The <new> function create the Formatter object and configure it for all
parameters required for report generation.  One by one parameter
definition:

     * DBI_DRIVER        => Specify wath driver you would use
                            for connect to your database.
                            (See DBD drivers for specific)
     * DBI_DATABASE      => Specify database name (or instance).
                            In union of last parameter it create
                            connection string : dbi:DBI_DRIVER:DBI_DATABASE
     * DBI_USERNAME      => Specify database username
     * DBI_PASSWORD      => Specify database DBI_USERNAME password
     * DBD_QUERY         => Point to a string where Sql Query are located
     * BREAKS            => Point to an array containing fields that cause a
                            break in the report
     * BREAKS_SKIP_PAGE  => Point to an hash that specify if a new page are
                            performed when the break FORMAT ar printed
     * FORMAT_PAGESIZE   => Specify page height in character
     * FORMAT_LINESIZE   => Specify line width in character (only for outf use)
     * FORMAT_FORMFEED   => Specify formfeed sequence when a formfeed or newpage
                            are requested
     * FORMAT_HEADER     => Point to filehandle of the header definition for the
                            report
     * FORMAT_TTITLE     => Point to filehandle of the top title definition for
                            the report
     * FORMAT_BODY       => Point to filehandle of the body definition for the
                            report
     * FORMAT_BTITLE     => Point to filehandle of the bottom title definition for
                            the report
     * FORMAT_BTITLE_HEIGHT => Height in lines of BTITLE
     * FORMAT_BREAKS     => Point to an hash containig break fields related to its
                            format filehandle definition
     * EVENT_PREHEDER    => Point to the subroutine that is called before header
                            generation
     * EVENT_POSTHEDER   => Point to the subroutine that is called after header
                            generation
     * EVENT_PRETTITLE   => Point to the subroutine that is called before top title
                            generation
     * EVENT_POSTTTITLE  => Point to the subroutine that is called after top title
                            generation
     * EVENT_PREBODY     => Point to the subroutine that is called before body
                            generation
     * EVENT_POSTBODY    => Point to the subroutine that is called after body
                            generation
     * EVENT_PREBTITLE   => Point to the subroutine that is called before bottom title
                            generation
     * EVENT_POSTBTITLE  => Point to the subroutine that is called after bottom title
                            generation
     * EVENT_BREAKS      => Point to structure that contains break fields related to
                            subroutine to execute when the break is berformed

DBI_DRIVER
----------

   DBI_DRIVER specify driver to use for connection within Database via
DBI::DBD module.

   Ex.

     * Oracle  (Oracle database)
     * CSV     (Comma separated database)
     * Pg      (PostgreSQL database)
     ... ecc.

   For specific look at DBD::<Driver>

DBI_DATABASE
------------

   The clause DBI_DATABASE can change from db to db , look at DBD::Driver
for your specific.  Example parameter are:

     * ORACLE_SID            for Oracle database
     * f_dir=/csv/data       for CSV file specify directory location of text-file-table
     * dbname=your db name   for PostgreSQL database
     * DSN                   for ADO db connection

DBI_USERNAME DBI_PASSWORD
-------------------------

   Specify in order username and password of the user granted to use
database

DBD_QUERY
---------

   Here you can pass your query (SQL) , that is fetched for report
generation.  Example to pass query are:

     * Example 1 (Direct via parameters)
             $FMT=new Formatter(
             'DBI_DRIVER'           => 'Oracle',
             'DBI_DATABASE'         => 'database',
             'DBI_USERNAME'         => 'dbusername',
             'DBI_PASSWORD'         => 'dbpassword',
             'DBD_QUERY'            => 'SELECT * FROM ANAGRA WHERE CLPERRIF=199901 ORDER BY CDAZIEND',
              ...

     * Example 2 (Via variable)
             $query = qq {
                 SELECT
                     *
                 FROM
                     ANAGRA
                 WHERE
                         CLPERRIF=199901
                     AND
                         CDAZIEND=345
                 ORDER BY
                     CDAZIEND
             };

     $FMT=new Formatter(
     'DBI_DRIVER'           => 'Oracle',
     'DBI_DATABASE'         => 'database',
     'DBI_USERNAME'         => 'dbusername',
     'DBI_PASSWORD'         => 'dbpassword',
     'DBD_QUERY'            => $query,
     ...

BREAKS
------

   Specify an array containing fields that cause a break in the report.
When a breaks is performed 2 step are executed, first an EVENT_BREAKS is
called if defined , second a FORMAT_BREAKS is printed out if defined.  For
convenience if you would like to generate a new format when a field change
you can use first statement because is called after FORMAT_BREAKS (and if
they are subtotal are printed before), using the putformat function in the
event routine, for example:

     format FMT_DEPARTEMENT=
     TOTALS FOR DEPARTEMENT ARE ]--> @<<<<<<< @<<<<<<< @<<<<<<<
                                     111,     111,     111
     .

     format FMT_SUBDEPARTEMENT
     *********************************
     * @|||||||||||||||||||||||||||| *
       $SUBDEPARTEMENT
     *********************************
     .

     $BREAKS[0]="DEPARTEMENT";
     $BREAKS[1]="SUBDEPARTEMENT";

     $FMT=Formatter->new (
             ...
             'BREAKS'               => \@BREAKS,
             'BREAKS_SKIP_PAGE'     =>  {
             DEPARTEMENT => 1,
             SUBDEPARTEMENT => 0
             },
             'FORMAT_BREAKS'        =>  {
             DEPARTEMENT => *FMT_DEPARTEMENT
             },
             'EVENT_BREAKS'         =>  {
             SUBDEPARTEMENT => \&MySubDepartement
             },
             ...
     );

   sub MySubDepartement {$FMT->putformat(*FMT_SUBDEPARTEMENT)};

   In this way a possible output is :
******************************************************* *
REPORT BY SUBDEPARTMENT              *
*******************************************************
*********************************    *           CHEMICALS           *
*********************************    10    20    40    10    20    40
10    20    40    *********************************    *
PHARMACIA           *    *********************************    10    20
40    10    20    40    10    20    40

   TOTALS FOR DEPARTEMENT ARE ]-> 111 111 111

BREAKS_SKIP_PAGE
----------------

   This parameter specify only if a new page is called after a break
FORMAT is writed.

FORMAT_PAGESIZE
---------------

   (Optional-Deafult=60)

   Specify how many lines are printed for each piece of paper.

FORMAT_LINESIZE
---------------

   (Optional-Default=130)

   Specify how many character are counted by an outf function to perform
alignement of text.

FORMAT_FORMFEED
---------------

   (Optional-Deafult=\f)

   Specify the form feed sequence that are called whem a newpage or a
formfeed function are called ( and alse a new page of report is required)
Default value are CTRL-L (\f)

FORMAT_HEADER
-------------

   (Optional)

   This parameter point to the format fileheader for HEADER page.
Definition of format specific can be found in Perl documentation.  Header
page is printed only one time at beginning of the report and normally
include general specification or purpose of report.

   Example:

     format MY_HEADER=
     ******************************************
     * DATE    : @</@</@<<<                   *
                 $dd,$mm,$yyyy
     * PURPOSE : Statistique about user login *
     ******************************************
     .

FORMAT_TTITLE
-------------

   (Optional)

   Identical to HEADER definition.  TTITLE is printed on every change of
page in the top of the report page

FORMAT_BODY
-----------

   (Optional)

   Identical to HEADER definition.  BODY is printed on every change of
value in fetch of query statament, values of query are passed and
traslated to real variable in the main caller program.

   Example :     if query is SELECT NAME,SURNAME,ADDRESS FROM ADDRESSBOOK
  in yours format values $NAME,$SURNAME,$ADDRESS are created and updated
  on every fetch.

     format MYBODY=
     @<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
     $NAME,             $SURNAME            $ADDRESS
     .

FORMAT_BTITLE
-------------

   (Optional)

   Identical to HEADER definition.  BTITLE is printed on every change of
page in the bottom of the report page

FORMAT_BTITLE_HEIGHT
--------------------

   (Option if FORMAT_BTITLE not specified)

   This parameter specify how many lines are printed by the FORMAT_BTITLE
definition.  I have put this parameter beacuse i don't know how to inform
automatically my package in what lines is used by a format definition.  If
anyone know how to make it possible please contact me.

FORMAT_BREAKS
-------------

   This parameter point to a structure where are located by fields format
filehandle definitions.  For use see the BREAKS specifics.

EVENT_PRE...
------------

   All EVENT_PRE... (EVENT_PREHEADER,EVENT_PRETTITLE ...) parameters
points to a routine defined by the user that is called before the event
handle is performed.  For example if you would like to change a value in
the top title before is print out you need to create a sub for example
named MyBeforeTopTitle in which you change a value , and then you pass
this reference to the EVENT_PRETTITLE parameter :

     'EVENT_PRETTITLE' => \&MyBeforeTopTitle

   Remember that all values created by the fetch statement are already
updated when all events are generated

EVENT_POST...
-------------

   Is identical to the last but is applied after the event handle occurs.

EVENT_BREAKS
------------

   This parameter point to a structure that specify by fields events that
can be generated when a break is performed after a FORMAT_BREAKS is
printed out.

FUNCTIONS
=========

Function generate
-----------------

   This function call all parameters and build the report.

Function ofmt
-------------

   This function place a text in the report over formatting definition ,
is only for special case in which is impossible to place text on format
definition (if you found when please contact me).
Formatter->ofmt ("Text to print out",position flag,position character); *
Position flag The position flag specify where text are printed , possible
values are:     <   Text are aligned to left of the line     >   Text are
aligned to right of the line     |   Text are aligned in the middle of the
line     C   Enable Position

   * Position character The position character specify the position in the
line where the text are printed

Function page
-------------

   The page function return number of page in this moment

Function line
-------------

   The line function return the line position in the moment

Function newpage
----------------

   The new page function send a complete new page to the report this is
the next sequence :

     * Print the bottom title in the current page
     * Send the form feed sequence
     * Print the top title in the new page

Function formfeed
-----------------

   The formfeed function send only the form feed sequence to the report
and not perform the title generation.

AUTHOR
======

     Vecchio Fabrizio <jacote@tiscalinet.it>

SEE ALSO
========

   `DBI' in this node


File: pm.info,  Node: DBIx/FullTextSearch,  Next: DBIx/FullTextSearch/StopList,  Prev: DBIx/Formatter,  Up: Module List

Indexing documents with MySQL as storage
****************************************

NAME
====

   DBIx::FullTextSearch - Indexing documents with MySQL as storage

SYNOPSIS
========

     use DBIx::FullTextSearch;
     use DBI;
     # connect to database (regular DBI)
     my $dbh = DBI->connect('dbi:mysql:database', 'user', 'passwd');

     # create a new stoplist
     my $sl = DBIx::FullTextSearch::StopList->create_default($dbh, 'sl_en', 'English');

     # create a new index with default english stoplist and english stemmer
     my $fts = DBIx::FullTextSearch->create($dbh, 'fts_web_1',
     		frontend => 'string', backend => 'blob',
     		stoplist => 'sl_en', stemmer => 'en-us');
     # or open existing one
     # my $fts = DBIx::FullTextSearch->open($dbh, 'fts_web_1');

     # index documents
     $fts->index_document('krtek', 'krtek leze pod zemi');
     $fts->index_document('jezek', 'Jezek ma ostre bodliny.');

     # search for matches
     my @docs = $fts->contains('foo');
     my @docs = $fts->econtains('+foo', '-Bar');
     my @docs = $fts->search('+foo -Bar');

DESCRIPTION
===========

   DBIx::FullTextSearch is a pure man's solution for indexing contents of
documents.  It uses the MySQL database to store the information about
words and documents and provides Perl interface for indexing new documents,
making changes and searching for matches.  For DBIx::FullTextSearch, a
document is nearly anything - Perl scalar, file, Web document, database
field.

   The basic style of interface is shown above. What you need is a MySQL
database and a DBI with DBD::mysql. Then you create a DBIx::FullTextSearch
index - a set of tables that maintain all necessary information. Once
created it can be accessed many times, either for updating the index
(adding documents) or searching.

   DBIx::FullTextSearch uses one basic table to store parameters of the
index. Second table is used to store the actual information about
documents and words, and depending on the type of the index (specified
during index creation) there may be more tables to store additional
information (like conversion from external string names (eg. URL's) to
internal numeric form). For a user, these internal thingies and internal
behaviour of the index are not important. The important part is the API,
the methods to index document and ask questions about words in documents.
However, certain understanding of how it all works may be usefull when you
are deciding if this module is for you and what type of index will best
suit your needs.

Frontends
---------

   From the user, application point of view, the DBIx::FullTextSearch
index stores documents that are named in a certain way, allows adding new
documents, and provides methods to ask: "give me list of names of
documents that contain this list of words". The DBIx::FullTextSearch index
doesn't store the documents itself. Instead, it stores information about
words in the documents in such a structured way that it makes easy and
fast to look up what documents contain certain words and return names of
the documents.

   DBIx::FullTextSearch provides a couple of predefined frontend classes
that specify various types of documents (and the way they relate to their
names).

default
     By default, user specifies the integer number of the document and the
     content (body) of the document. The code would for example read

          $fts->index_document(53, 'zastavujeme vyplaty vkladu');

     and DBIx::FullTextSearch will remember that the document 53 contains
     three words.  When looking for all documents containing word (string)
     vklad, a call

          my @docs = $fts->contains('vklad%');

     would return numbers of all documents containing words starting with
     'vklad', 53 among them.

     So here it's user's responsibility to maintain a relation between the
     document numbers and their content, to know that a document 53 is
     about vklady. Perhaps the documents are already stored somewhere and
     have inique numeric id.

string
     Frontend string allows the user to specify the names of the documents
     as strings, instead of numbers. Still the user has to specify both the
     name of the document and the content:

          $fts->index_document('foobar',
          		'the quick brown fox jumped over lazy dog!');

     After that,

          $fts->contains('dog')

     will return 'foobar' as one of the names of documents with word 'dog'
     in it.

file
     To index files, use the frontend file. Here the content of the
     document is clearly the content of the file specified by the
     filename, so in a call to index_document, only the name is needed -
     the content of the file is read by the DBIx::FullTextSearch
     transparently:

          $fts->index_document('/usr/doc/FAQ/Linux-FAQ');
          my @files = $fts->contains('penguin');

url
     Web document can be indexed by the frontend url. DBIx::FullTextSearch
     uses *Note LWP: LWP, to get the document and then parses it normally:

          $fts->index_document('http://www.perl.com/');

table
     You can have a DBIx::FullTextSearch index that indexes char or blob
     fields in MySQL table. Since MySQL doesn't support triggers, you have
     to call the index_document method of DBIx::FullTextSearch any time
     something changes in the table. So the sequence probably will be

          $dbh->do('insert into the_table (id, data, other_fields)
          	values (?, ?, ?)', {}, $name, $data, $date_or_something);
          $fts->index_document($name);

     When calling contains, the id (name) of the record will be returned.
     If the id in the_table is numeric, it's directly used as the internal
     numeric id, otherwise a string's way of converting the id to numeric
     form is used.

     When creating this index, you'll have to pass it three additionial
     options, table_name, column_name, and column_id_name.  You may use
     the optional column_process option to pre-process data in the
     specified columns.

   The structure of DBIx::FullTextSearch is very flexible and adding new
frontend (what will be indexed) is very easy.

Backends
--------

   While frontend specifies what is indexed and how the user sees the
collection of documents, backend is about low level database way of
actually storing the information in the tables. Three types are available:

blob
     For each word, a blob holding list of all documents containing that
     word is stored in the table, with the count (number of occurencies)
     associated with each document number. That makes it for very compact
     storage. Since the document names (for example URL) are internally
     converted to numbers, storing and fetching the data is fast. However,
     updating the information is very slow, since information concerning
     one document is spread across all table, without any direct database
     access.  Updating a document (or merely reindexing it) requires
     update of all blobs, which is slow.

     The list of documents is stored sorted by document name so that
     fetching an information about a document for one word is relatively
     easy, still a need to update (or at least scan) all records in the
     table makes this storage unsuitable for collections of documents that
     often change.

column
     The column backend stores a word/document pair in database fields,
     indexing both, thus allowing both fast retrieval and updates - it's
     easy to delete all records describing one document and insert new
     ones.  However, the database indexes that have to be maintained are
     large.

     Both blob and column backends only store a count - number of
     occurencies of the word in the document (and even this can be switched
     off, yielding just a yes/no information about the word's presence).
     This allows questions like

          all documents containing words 'voda' or 'Mattoni'
          	but not a word 'kyselka'

     but you cannot ask whether a document contains a phrase 'kyselka
     Mattoni' because such information is not maintained by these types of
     backends.

phrase
     To allow phrase matching, a phrase backend is available. For each word
     and document number it stores a blob of lists of positions of the word
     in the document. A query

          $fts->contains('kyselk% Mattoni');

     then only returns those documents (document names/numbers) where word
     kyselka (or kyselky, or so) is just before word Mattoni.

Mixing frontends and backends
-----------------------------

   Any frontend can be used with any backend in one DBIx::FullTextSearch
index. You can index Web documents with url frontend and phrase backend to
be able to find phrases in the documents. And you can use the default,
number based document scheme with blob backend to use the disk space as
efficiently as possible - this is usefull for example for mailing-list
archives, where we need to index huge number of documents that do not
change at all.

   Finding optimal combination is very important and may require some
analysis of the document collection and manipulation, as well as the speed
and storage requirements. Benchmarking on actual target platform is very
useful during the design phase.

METHODS
=======

   The following methods are available on the user side as
DBIx::FullTextSearch API.

create
          my $fts = DBIx::FullTextSearch->create($dbh, $index_name, %opts);

     The class method create creates index of given name (the name of the
     index is the name of its basic parameter table) and all necessary
     tables, returns an object - newly created index. The options that may
     be specified after the index name define the frontend and backend
     types, storage parameters (how many bits for what values), etc. See
     below for list of create options and discussion of their use.

open
          my $fts = DBIx::FullTextSearch->open($dbh, $index_name);

     Opens and returns object, accessing specifies DBIx::FullTextSearch
     index. Since all the index parameters and information are stored in
     the `$index_name' table (including names of all other needed tables),
     the database handler and the name of the parameter table are the only
     needed arguments.

index_document
          $fts->index_document(45, 'Sleva pri nakupu stribra.');
          $fts->index_document('http://www.mozilla.org/');
          $fts->index_document('http://www.mozilla.org/','This is the mozilla web site');

     For the default and string frontends, two arguments are expected - the
     name (number or string) of the document and its content. For file,
     url, and table frontends the content is optional.  Any content that
     you pass will be appended to the content from the file, URL, or
     database table.

delete_document
          $fts->delete_document('http://www.mozilla.org/');

     Removes information about document from the index. Note that for blob
     backend this is very time consuming process.

contains
          my @docs = $fts->contains('sleva', 'strib%');

     Returns list of names (numbers or strings, depending on the frontend)
     of documents that contain some of specified words.

econtains
          my @docs = $fts->contains('foo', '+bar%', '-koo');

     Econtains stands for extended contains and allows words to be prefixed
     by plus or minus signs to specify that the word must or mustn't be
     present in the document for it to match.

contains_hashref, econtains_hashref
     Similar to contains and econtains, only instead of list of document
     names, there methods return a hash reference to a hash where keys are
     the document names and values are the number of occurencies of the
     words.

search
          my @docs = $fts->search(qq{+"this is a phrase" -koo +bar foo});

     This is a wrapper to econtains which takes a user input string and
     parses it into can-include, must-include, and must-not-include words
     and phrases.

drop
     Removes all tables associated with the index, including the base
     parameter table. Effectivelly destroying the index form the database.

          $fts->drop;

empty
     Emptys the index so you can reindex the data.

          $fts->empty;

INDEX OPTIONS
=============

   Here we list the options that may be passed to
DBIx::FullTextSearch->create call.  These allow to specify the style and
storage parameters in great detail.

backend
     The backend type, default blob, possible values blob, column and
     phrase (see above for explanation).

frontend
     The frontend type. The default frontend requires the user to specify
     numeric id of the document together with the content of the document,
     other possible values are string, file and url (see above for more
     info).

word_length
     Maximum length of words that may be indexed, default 30.

data_table
     Name of the table where the actual data about word/document relation
     is stored. By default, the name of the index (of the base table) with
     _data suffix is used.

name_length
     Any frontend that uses strings as names of documents needs to maintain
     a conversion table from these names to internal integer ids. This
     value specifies maximum length of these string names (URLs, file
     names, ...).

blob_direct_fetch
     Only for blob backend. When looking for information about specific
     document in the list stored in the blob, the blob backend uses
     division of interval to find the correct place in the blob. When the
     interval gets equal or shorter that this value, all values are
     fetched from the database and the final search is done in Perl code
     sequentially.

word_id_bits
     With column or `phase' backends, DBIx::FullTextSearch maintains a
     numeric id for each word to optimize the space requirements. The
     word_id_bits parameter specifies the number of bits to reserve for
     this conversion and thus effectively limits number of distinct words
     that may be indexed. The default is 16 bits and possible values are
     8, 16, 24 or 32 bits.

word_id_table
     Name of the table that holds conversion from words to their numeric id
     (for column and phrase backends). By default is the name of the index
     with _words suffix.

doc_id_bits
     A number of bits to hold a numeric id of the document (that is either
     provided by the user (with default frontend) or generated by the
     module to accomplish the conversion from the string name of the
     document). This value limits the maximum number of documents to hold.
     The default is 16 bits and possible values are 8, 16 and 32 bits for
     blob backend and 8, 16, 24 and 32 bits for column and phrase backends.

doc_id_table
     Name of the table that holds conversion from string names of documents
     to their numeric id, by default the name of the index with _docid
     suffix.

count_bits
     Number of bits reserved for storing number of occurencies of each word
     in the document. The default is 8 and possible values are the same as
     with doc_id_bits.

position_bits
     With `phrase backend', DBIx::FullTextSearch stores positions of each
     word of the documents. This value specifies how much space should be
     reserved for this purpose. The default is 32 bits and possible values
     are 8, 16 or 32 bits. This value limits the maximum number of words
     of each document that can be stored.

index_splitter
     DBIx::FullTextSearch allows the user to provide any Perl code that
     will be used to split the content of the document to words when
     indexing documents.  The code will be evalled inside of the
     DBIx::FullTextSearch code. The default is

          /(\w{2,$word_length})/g

     and shows that the input is stored in the variable $data and the code
     may access any other variable available in the perl_and_index_data_*
     methods (see source), especially `$word_length' to get the maximum
     length of words and `$backend' to get the backend object.

     The default value also shows that by default, the minimum length of
     words indexed is 2.

search_splitter
     This is similar to the index_splitter method, except that it is used
     in the `contains_hashref' method when searching for documents instead
     of when indexing documents.  The default is

          /([a-zA-Z_0-9]{2,$word_length}\*?)/g

     Which, unlike the default index_splitter, allows for the wild card
     character (*).

filter
     The output words of splitter (and also any parameter of (e)contains*
     methods) are send to filter that may do further processing. Filter is
     again a Perl code, the default is

          map { lc $_ }

     showing that the filter operates on input list and by default does
     conversion to lowercase (yielding case insensitive index).

init_env
     Because user defined splitter or filter may depend on other things
     that it is reasonable to set before the actual procession of words,
     you can use yet another Perl hook to set things up. The default is

          use locale

stoplist
     This is the name of a *Note DBIx/FullTextSearch/StopList:
     DBIx/FullTextSearch/StopList, object that is used for stop words.

stemmer
     If this option is set, then word stemming will be enabled in the
     indexing and searching.

     The value is the name of a *Note Lingua/Stem: Lingua/Stem, recognized
     locale.  Currently, 'en', 'en-us' and 'en-uk' are the only recognized
     locales.  All locale identifiers are converted to lowercase.

table_name
     For table frontend; this is the name of the table that will be
     indexed.

column_name
     For table frontend; this is a reference to an array of columns in the
     table_name that contains the documents - data to be indexed. It can
     also have a form table.column that will be used if the table_name
     option is not specified.

column_id_name
     For table frontend; this is the name of the field in table_name that
     holds names (ids) of the records. If not specified, a field that has
     primary key on it is used. If this field is numeric, it's values are
     directly used as identifiers, otherwise a conversion to numeric values
     is made.

ERROR HANDLING
==============

   The create and open methods return the DBIx::FullTextSearch object on
success, upon failure they return undef and set error message in
`$DBIx::FullTextSearch::errstr' variable.

   All other methods return reasonable (documented above) value on success,
failure is signalized by unreasonable (typically undef or null) return
value; the error message may then be retrieved by `$fts->errstr' method
call.

VERSION
=======

   This documentation describes DBIx::FullTextSearch module version 0.55.

BUGS
====

   Error handling needs more polishing.

   We do not check if the stored values are larger that specified by the
*_bits parameters.

   No CGI administration tool at the moment.

   No scoring algorithm implemented.

DEVELOPMENT
===========

   These modules are under active development.  Currently a PostgreSQL
version is being developed.  If you would like to contribute, please
e-mail tj@anidea.com

   There are two mailing lists for this module, one for users, and another
for developers.  To subscribe, visit
http://sourceforge.net/mail/?group_id=8645

AUTHOR
======

   (Original) Jan Pazdziora, adelton@fi.muni.cz,
http://www.fi.muni.cz/~adelton/ at Faculty of Informatics, Masaryk
University in Brno, Czech Republic

   (Current Maintainer) Thomas J. Mather, tj@anidea.com,
http://www.thoughtstore.com/tjmather/ New York, NY, USA

CREDITS
=======

   Fixes, Bug Reports have been generously provided by:

     Ade Olonoh
     Andrew Turner
     Tarik Alkasab

   Thanks!

COPYRIGHT
=========

   All rights reserved. This package is free software; you can
redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO
========

   `DBI' in this node, ftsadmin, `DBIx::FullTextSearch::StopWord' in this
node

OTHER PRODUCTS and why I've written this module
===============================================

   I'm aware of `DBIx::TextIndex' module and about UdmSearch utility, and
about htdig and glimpse on the non-database side of the world.

   To me, using a database gives reasonable maintenance benefits. With
products that use their own files to store the information (even if the
storage algorithms are efficient and well thought of), you always struggle
with permissions on files and directories for various users, with files
that somebody accidently deleted or mungled, and making the index
available remotely is not trivial.

   That's why I've wanted a module that will use a database as a storage
backend. With MySQL, you get remote access and access control for free,
and on many web servers MySQL is part of the standard equipment. So using
it for text indexes seemed natural.

   However, existing *Note DBIx/TextIndex: DBIx/TextIndex, and UdmSearch
are too narrow-aimed to me. The first only supports indexing of data that
is stored in the database, but you may not always want or need to store
the documents in the database as well. The UdmSearch on the other hand is
only for web documents, making it unsuitable for indexing mailing-list
archives or local data.

   I believe that DBIx::FullTextSearch is reasonably flexible and still
very efficient. It doesn't enforce its own idea of what is good for you -
the number of options is big and you can always extend the module with
your own backend of frontend if you feel that those provided are not
sufficient. Or you can extend existing by adding one or two parameters
that will add new features. Of course, patches are always welcome.
DBIx::FullTextSearch is a tool that can be deployed in many projects. It's
not a complete environment since different people have different needs. On
the other hand, the methods that it provides make it easy to build a
complete solution on top of this in very short course of time.

   I was primarily inspired by the ConText cartrige of Oracle server. Since
MySQL doesn't support triggers, it showed up that Perl interface will be
needed. Of course, porting this module to (for example) PostgreSQL should
be easy, so different name is probably needed. On the other hand, the code
is sometimes very MySQL specific to make the module work efficiently, so I
didn't want a name that would suggest that it's a generic tool that will
work with any SQL database.


