This is Info file pm.info, produced by Makeinfo version 1.68 from the
input file bigpm.texi.


File: pm.info,  Node: PDL/Internals,  Next: PDL/Intro,  Prev: PDL/Indexing,  Up: Module List

description of the current internals
************************************

NAME
====

   PDL::Internals - description of the current internals

DESCRIPTION
===========

Intro
-----

   This document explains various aspects of the current implementation of
PDL. If you just want to use PDL for something, you definitely do not need
to read this. Even if you want to interface your C routines to PDL or
create new PDL::PP functions (see *Note PDL/PP: PDL/PP,), you do not need
to read this (though it may be informative). This document is primarily
intended for people interested in debugging or changing the internals of
PDL. To read this, a good understanding of the C language and of
programming and data structures in general is required, as well as some
understanding of perl. If you read through this document and understand
all of it, are able to point to what any part of this document refers to
in the PDL core sources, and additionally struggle to understand PDL::PP,
you will be awarded the title "PDL Guru" (of course, the current version
of this document is so incomplete that this is not yet the case).

   Warning: If it seems that this document has gotten out of date, please
inform the PerlDL developers email list (address in the README file) about
it. This may well happen.

Piddles
-------

   Currently, a pdl data object is a hash ref which contains the element
PDL, which is a pointer to a pdl structure, as well as some other fields.
The file `Core.pm' uses some of these fields and the file `pdlhash.c'
converts these to C when necessary.

   The pdl struct is defined in `pdl.h' and the meanings of the fields are:

magicno
     A magic number, used to check whether something really is a piddle
     when debugging.

state
     Various flags about the state of the pdl, such as whether the parents
     of this pdl have been altered at some point.

trans
     Where this pdl was obtained from. This pointer may be null, in which
     case this pdl is not getting any dataflow from anywhere.  Note,
     however, that being non-null does not mean that data is flowing:

          $a = pdl 2,3,4;
          $b = pdl 4,5,6;
          $c = $a + $b;     # Note: no dataflow (not asked for)

     Here, the trans field in `$c' contains a pointer to a transformation.
     Only when `$a' or `$b' is changed is the transformation destroyed and
     the field cleared. To see whether data is flowing, check the flags
     field of the trans struct.

vafftrans
     This is intended for speeding up e.g. the chaining of affine
     transformations. See `pdlapi.c' for the code handling this.  Also,
     `slices.pd' defines some things that work with, or for, this.

sv
     Pointer to the hash object. May be null if this pdl does not have a
     perl counterpart.

datasv, data
     The field datasv is a pointer to the perl SV containing the data
     string, and data points to the data itself.  These may be null
     before the pdl is finally physicalized.

nvals
     How many values there are in data.

datatype
     The type of the data stored in the data vector.

dims, ndims
     The dimensions of this pdl. Remember to physicalize the pdl before
     using.

dimincs
     As an optimization, an increment for each dimension is stored here.
     These are required to correspond exactly to dims.  If you want to
     optimize for affine transformations, use the trans or vafftrans
     fields.

threadids, nthreadids
     This is where the threading tags are stored. The way this works is
     that ndims and dims hold all dimensions of the pdl, including
     threaded dimensions. The real dimensions of the pdl extend from 0 to
     `threadids[0]-1', the thread dimensions with id 0 extend from
     `threadids[0]' to `threadids[1]-1', and the thread dimensions with the
     last id extend from `threadids[nthreadids-1]' to
     `threadids[nthreadids]-1'.  For example, if a pdl has dimensions
     `(2,3,4,5)' (= 120 elements) and `nthreadids==2' and
     `threadids={1,3,4}', there is one "real" dimension with size 2, two
     dimensions with threadid 0 (sizes 3 and 4), and the dimension with
     size 5 has threadid 1.

progenitor, future_me
     See the section on families below

children
     The children of this pdl i.e. where data is flowing to from this pdl.

living_for
     XXX Not quite clear right now. Has to do with families


     To avoid mallocs, there is a suitable amount of space already
     allocated for each pointer in this pdl, with the ideology that if you
     have more than six-dimensional data you must be willing to settle for
     a little more overhead.

magic
     If this pdl is magical (e.g. if it is bound to something), this
     pointer is non-null and you must call the appropriate magic-handling
     routines when using the pdl.

hdrsv
     A "header" SV * that can be set and accessed from outside.  Can be
     used to include any perl object in a piddle.
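
   The relationship between dims, dimincs and threadids described above
can be illustrated with a small standalone C sketch.  It does not use the
real pdl struct: the helper names `default_dimincs', `dims_with_threadid'
and `n_real_dims' are hypothetical, and the increments assume a freshly
allocated, contiguous pdl in which the first dimension varies fastest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fill incs[] with the default increments for a
 * contiguous array whose first dimension varies fastest.  Returns the
 * total number of elements (what nvals would hold). */
static size_t default_dimincs(const size_t *dims, size_t ndims, size_t *incs)
{
    size_t inc = 1;
    for (size_t i = 0; i < ndims; i++) {
        incs[i] = inc;
        inc *= dims[i];
    }
    return inc;
}

/* Hypothetical helper: report which slots of dims[] carry thread id `id`.
 * Following the text, they run from threadids[id] to threadids[id+1]-1,
 * so threadids[] must hold nthreadids+1 entries. */
static void dims_with_threadid(const size_t *threadids, size_t nthreadids,
                               size_t id, size_t *first, size_t *count)
{
    assert(id < nthreadids);
    *first = threadids[id];
    *count = threadids[id + 1] - threadids[id];
}

/* The "real" (non-thread) dimensions occupy slots 0 .. threadids[0]-1. */
static size_t n_real_dims(const size_t *threadids)
{
    return threadids[0];
}
```

   For the example in the text (dimensions `(2,3,4,5)' and
`threadids={1,3,4}'), these helpers give increments 1, 2, 6 and 24, one
real dimension, two dimensions with thread id 0 starting at slot 1, and
one dimension with thread id 1 at slot 3.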

Transformations
---------------

   Each transformation has a virtual table which contains various
information about that transformation. Usually transformations are
generated with PDL::PP, so it's better to see that documentation (*Note
PDL/PP: PDL/PP,).

Freeing
-------

   Currently, not much is freed, especially when dataflow is done.  This
is bound to change pretty soon.

Threading
---------

   The file `pdlthread.c' handles most of the threading matters.  The
threading is encapsulated in a structure declared in `pdlthread.h'.

Accessing children and parents of a piddle
------------------------------------------

   The file `Basic/Core/pdlapi.h.PL' contains useful routines for
manipulating the pdl structure (it's probably easier to read
`Basic/Core/pdlapi.h' once you've performed a build of PDL).

   An example of processing the children of a piddle is provided by the
`baddata' method of PDL::Bad (only available if you have compiled PDL with
the WITH_BADVAL option set to 1, but still useful as an example!).

   Consider the following situation:

     perldl> $a = rvals(7,7,Centre=>[3,4]);
     perldl> $b = $a->slice('2:4,3:5');
     perldl> ? vars
     PDL variables in package main::

     Name         Type   Dimension       Flow  State          Mem
     ----------------------------------------------------------------
     $a           Double D [7,7]                P            0.38Kb
     $b           Double D [3,3]                VC           0.00Kb

   Now, if I suddenly decide that `$a' should be flagged as possibly
containing bad values, using

     perldl> $a->baddata(1)

   then I want the state of `$b' (its child) to be changed as well, so
that I get a 'B' in the State field:

     perldl> ? vars
     PDL variables in package main::

     Name         Type   Dimension       Flow  State          Mem
     ----------------------------------------------------------------
     $a           Double D [7,7]                PB           0.38Kb
     $b           Double D [3,3]                VCB          0.00Kb

   This bit of magic is performed by the `propogate_badflag' function,
which is listed below:

     /* newval = 1 means set flag, 0 means clear it */
     /* thanks to Christian Soeller for this */

     void propogate_badflag( pdl *it, int newval ) {
         PDL_DECL_CHILDLOOP(it)
         PDL_START_CHILDLOOP(it)
         {
             pdl_trans *trans = PDL_CHILDLOOP_THISCHILD(it);
             int i;
             for( i = trans->vtable->nparents;
                  i < trans->vtable->npdls;
                  i++ ) {
                 pdl *child = trans->pdls[i];

                 if ( newval ) child->state |=  PDL_BADVAL;
                 else          child->state &= ~PDL_BADVAL;

                 /* make sure we propogate to grandchildren, etc */
                 propogate_badflag( child, newval );

             } /* for: i */
         }
         PDL_END_CHILDLOOP(it)
     } /* propogate_badflag */

   Given a piddle (`pdl *it'), the routine loops through each `pdl_trans'
structure, where access to this structure is provided by the
`PDL_CHILDLOOP_THISCHILD' macro.  The children of the piddle are stored in
the `pdls' array, after the *parents*, hence the loop from `i =
...nparents' to `i = ...npdls - 1'.  Once we have the pointer to the
child piddle, we can do what we want to it (here we change the value of
the state variable, but the details are unimportant).  What *is* important
is that we call `propogate_badflag' on this piddle, to ensure we loop
through its children. This recursion ensures we get to all the *offspring*
of a particular piddle.
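
   The layout of the `pdls' array and the recursion can be tried out
without building PDL by using a toy model.  The sketch below is not PDL
code: the `toy_pdl' and `toy_trans' structs, the `TOY_BADVAL' flag and the
`toy_link' helper are all hypothetical stand-ins, and the counts that PDL
keeps in the vtable are stored directly in the toy transformation.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BADVAL      0x1   /* hypothetical flag bit, not PDL's value */
#define TOY_MAXPDLS     4
#define TOY_MAXCHILDREN 4

struct toy_trans;

typedef struct toy_pdl {
    int state;
    size_t nchildren;
    struct toy_trans *children[TOY_MAXCHILDREN];
} toy_pdl;

typedef struct toy_trans {
    size_t nparents;              /* pdls[0 .. nparents-1] are parents */
    size_t npdls;                 /* pdls[nparents .. npdls-1] are children */
    toy_pdl *pdls[TOY_MAXPDLS];
} toy_trans;

/* Set or clear the flag on every descendant of `it` (but not on `it`). */
static void toy_propagate_badflag(toy_pdl *it, int newval)
{
    for (size_t t = 0; t < it->nchildren; t++) {
        toy_trans *trans = it->children[t];
        for (size_t i = trans->nparents; i < trans->npdls; i++) {
            toy_pdl *child = trans->pdls[i];
            if (newval) child->state |=  TOY_BADVAL;
            else        child->state &= ~TOY_BADVAL;
            /* recurse so that grandchildren are reached as well */
            toy_propagate_badflag(child, newval);
        }
    }
}

/* Hypothetical helper: connect one parent to one child through trans. */
static void toy_link(toy_trans *trans, toy_pdl *parent, toy_pdl *child)
{
    assert(parent->nchildren < TOY_MAXCHILDREN);
    trans->nparents = 1;
    trans->npdls = 2;
    trans->pdls[0] = parent;
    trans->pdls[1] = child;
    parent->children[parent->nchildren++] = trans;
}
```

   As in the listing above, the routine only touches the offspring; the
caller is expected to change the flag of the starting piddle itself.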

   *THE FOLLOWING NEEDS TO BE CHECKED*.

   Access to *parents* is similar, with the for loop replaced by:

     for( i = 0;
          i < trans->vtable->nparents;
          i++ ) {

AUTHOR
======

   Copyright(C) 1997 Tuomas J. Lukka (lukka@fas.harvard.edu), 2000 Doug
Burke (burke@ifa.hawaii.edu).

   Redistribution in the same form is allowed but reprinting requires a
permission from the author.


File: pm.info,  Node: PDL/Intro,  Next: PDL/Lite,  Prev: PDL/Internals,  Up: Module List

Introduction to the Perl Data Language
**************************************

NAME
====

   PDL::Intro - Introduction to the Perl Data Language

   Version 2.0

   "Why is it that we entertain the belief that for every purpose odd
numbers are the most effectual?" - Pliny the Elder.

   *Karl Glazebrook [karlglazebrook@yahoo.com]*

DESCRIPTION
===========

   Perl is an extremely good and versatile scripting language, well suited
to beginners and to rapid prototyping. However, until recently it did
not support data structures which allowed it to do fast number crunching.

   With the development of Perl v5, however, Perl acquired 'Objects'. To
put it simply, users can define their own special data types, and write
custom routines to manipulate them either in low level languages (C and
Fortran) or in Perl itself.

   This has been fully exploited by the PerlDL developers. The 'PDL'
module is a complete Object-Oriented extension to Perl (although you don't
have to know what an object is to use it) which allows large N-dimensional
data sets, such as large images, spectra, time series, etc., to be stored
*efficiently* and manipulated *en masse*.  For example, with the PDL
module we can write the perl code `$a=$b+$c', where $b and $c are large
datasets (e.g. 2048x2048 images), and get the result in only a fraction of
a second.

   PDL variables (or 'piddles' as they have come to be known) support a
wide range of fundamental data types - arrays can be bytes, short integers
(signed or unsigned), long integers, floats or double precision floats.
And because of the Object-Oriented nature of PDL new customised datatypes
can be derived from them.

   As well as the PDL modules, that can be used by normal perl programs,
PerlDL comes with a command line perl shell, called *perldl*, which
supports command line editing. In combination with the various PDL
graphics modules this allows data to be easily played with and visualised.

SYNOPSIS
========

   This manual page provides a general introduction to the underlying
philosophy of PDL and practical examples on how to use it. For details,
see:

*Note PDL/Intro: PDL/Intro,
     This document

*Note PDL/Impatient: PDL/Impatient,
     Quick summary - PDL for the impatient

*Note PDL/Philosophy: PDL/Philosophy,
     Why another matrix language?

*Note PDL/Indexing: PDL/Indexing,
     An introduction to using smart indices in PDL.

`PDL::Slice' in this node
     A reference guide to the same.

*Note PDL/PP: PDL/PP,
     A utility for generating extension in C language for use with PDL
     easily.

*Note PDL/FAQ: PDL/FAQ,
     The Frequently Asked Questions list for PDL.

*Note PDL/Tips: PDL/Tips,
     Small tips and tricks for writing idiomatic PDL code.

*Note PDL/Internals: PDL/Internals,
     How does it all work?

*Note PDL/Dataflow: PDL/Dataflow,
     Tuomas has been too lazy to document this yet.

AUTHOR
======

   Copyright (C) Karl Glazebrook (karlglazebrook@yahoo.com), Tuomas J.
Lukka, (lukka@husc.harvard.edu) and Christian Soeller
(c.soeller@auckland.ac.nz) 1997 to 2000.  Commercial reproduction of this
documentation in a different format is forbidden.


File: pm.info,  Node: PDL/Lite,  Next: PDL/LiteF,  Prev: PDL/Intro,  Up: Module List

minimum PDL module OO loader
****************************

NAME
====

   PDL::Lite - minimum PDL module OO loader

DESCRIPTION
===========

   Loads the smallest possible set of modules for PDL to work, without
importing any functions into the current namespace. This is the absolute
minimum set for PDL.

   Though no functions are defined (apart from a few always exported by
Core), you can still use method syntax, viz:

     $x->wibble(42);

SYNOPSIS
========

     use PDL::Lite; # Is equivalent to the following:

     use PDL::Core '';
     use PDL::Ops '';
     use PDL::Primitive '';
     use PDL::Ufunc '';
     use PDL::Basic '';
     use PDL::Slices '';
     use PDL::Bad '';
     use PDL::Version;


File: pm.info,  Node: PDL/LiteF,  Next: PDL/Math,  Prev: PDL/Lite,  Up: Module List

minimum PDL module function loader
**********************************

NAME
====

   PDL::LiteF - minimum PDL module function loader

DESCRIPTION
===========

   Loads the smallest possible set of modules for PDL to work, making the
functions available in the current namespace. If you want something even
smaller, see the PDL::Lite module (*Note PDL/Lite: PDL/Lite,).

SYNOPSIS
========

     use PDL::LiteF; # Is equivalent to the following:

     use PDL::Core;
     use PDL::Ops;
     use PDL::Primitive;
     use PDL::Ufunc;
     use PDL::Basic;
     use PDL::Slices;
     use PDL::Bad;
     use PDL::Version;


File: pm.info,  Node: PDL/Math,  Next: PDL/Objects,  Prev: PDL/LiteF,  Up: Module List

extended mathematical operations and special functions
******************************************************

NAME
====

   PDL::Math - extended mathematical operations and special functions

SYNOPSIS
========

     use PDL::Math;

     use PDL::Graphics::TriD;
     imag3d [SURF2D,bessj0(rvals(zeroes(50,50))/2)];

DESCRIPTION
===========

   This module extends PDL with more advanced mathematical functions than
provided by standard Perl.

   All the functions have one input pdl, and one output, unless otherwise
stated.

   The functions are usually available from the system maths library.
If they are not (this is determined when PDL is compiled), a version from
the Cephes math library is used.

FUNCTIONS
=========

acos
----

asin
----

atan
----

cosh
----

sinh
----

tan
---

tanh
----

ceil
----

floor
-----

rint
----

pow
---

acosh
-----

asinh
-----

atanh
-----

erf
---

erfc
----

bessj0
------

bessj1
------

bessy0
------

bessy1
------

bessjn
------

bessyn
------

lgamma
------

   This returns 2 piddles - the first set gives the log(gamma) values,
while the second set, of integer values, gives the sign of the gamma
function.  This is useful for determining factorials, amongst other things.

badmask
-------

   badmask can be run with `$a' inplace:

     badmask($a->inplace,0);
     $a->inplace->badmask(0);

isfinite
--------

erfi
----

ndtri
-----

svd
---

polyroots
---------

eigens
------

simq
----

   `$a' is an `n x n' matrix (i.e., a vector of length `n*n'), stored
row-wise: that is, `a(i,j) = a[ij]', where `ij = i*n + j'.  While this is
the transpose of the normal column-wise storage, this corresponds to
normal PDL usage.  The contents of matrix a may be altered (but may be
required for subsequent calls with flag = -1).

   `$b', `$x' and `$ips' are vectors of length `n'.

   Set `flag=0' to solve.  Set `flag=-1' to do a new back substitution for
a different `$b' vector using the same a matrix previously reduced when
`flag=0' (the `$ips' vector generated in the previous solution is also
required).
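
   The row-wise storage rule `ij = i*n + j' can be checked with a short
C sketch.  The function names `rowwise_index' and `rowwise_matvec' are
hypothetical and are not part of PDL or Cephes; the matrix-vector product
is only there to exercise the layout.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (i, j) in an n x n matrix stored row-wise,
 * i.e. ij = i*n + j as in the text. */
static size_t rowwise_index(size_t i, size_t j, size_t n)
{
    assert(i < n && j < n);
    return i * n + j;
}

/* Hypothetical helper: multiply the row-wise matrix a (n x n) by the
 * vector x and write the result to b. */
static void rowwise_matvec(const double *a, const double *x,
                           double *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        b[i] = 0.0;
        for (size_t j = 0; j < n; j++)
            b[i] += a[rowwise_index(i, j, n)] * x[j];
    }
}
```

   With `n = 2' and the flat array `{1, 2, 3, 4}', element `a(1,0)' sits
at offset 2, so the rows are `(1, 2)' and `(3, 4)'.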

squaretotri
-----------

BUGS
====

   Hasn't been tested on all platforms to ensure Cephes versions are
picked up automatically and used correctly.

AUTHOR
======

   Copyright (C) R.J.R. Williams 1997 (rjrw@ast.leeds.ac.uk), Karl
Glazebrook (kgb@aaoepp.aao.gov.au) and Tuomas J. Lukka
(Tuomas.Lukka@helsinki.fi).

   All rights reserved. There is no warranty. You are allowed to
redistribute this software / documentation under certain conditions. For
details, see the file COPYING in the PDL distribution. If this file is
separated from the PDL distribution, the copyright notice should be
included in the file.


File: pm.info,  Node: PDL/Objects,  Next: PDL/Ops,  Prev: PDL/Math,  Up: Module List

Object-Orientation, what is it and how to exploit it
****************************************************

NAME
====

   PDL::Objects - Object-Orientation, what is it and how to exploit it

DESCRIPTION
===========

   This still needs to be written properly.

Inheritance
-----------

   There are basically two reasons for subclassing piddles.  The first is
simply that you want to be able to use your own routines like

     $piddle->something()

   but don't want to mess up the PDL namespace (a worthy goal, indeed!).
The other is that you wish to provide special handling of some functions
or more information about the data the piddle contains.  In the first
case, you can do with

     package BAR;
     @ISA=qw/PDL/;
     sub foo {my($this) = @_; fiddle;}

     package main;
     $a = PDL::pdl(BAR,5);
     $a->foo();

   However, because a PDL object is an opaque reference to a C struct, it
is not possible to extend the PDL class with e.g. extra data via
subclassing.  To circumvent this problem PerlDL has built-in support to
extend the PDL class via the *has-a* relation for blessed hashes.  You can
make *HAS-A* behave like *IS-A* simply by assigning the `PDL' object to
the attribute named PDL and redefining the method initialize().

     package FOO;

     @FOO::ISA = qw(PDL);
     sub initialize {
         my $class = shift;
         my $self = {
                 creation_time => time(),  # necessary extension :-)
                 PDL => null,             # used to store PDL object
                 };
         bless $self, $class;
     }

   All PDL constructors will call initialize() to make sure that your
extensions are added automatically.  The `PDL' attribute is used by
perlDL to store the PDL object, and all PDL methods use this attribute
automatically if they are called with a blessed hash reference instead of
a PDL object (a blessed scalar).

   Do remember that if you subclass a class that is subclassed from a
piddle, you need to call SUPER::initialize.

   NEED STUFF ABOUT CODE REFs!!

Examples
--------

   You can find some simple examples of PDL subclassing in the PDL
distribution test-case files. Look in `t/subclass2.t', `t/subclass3.t',
etc.

Output Auto-Creation and Subclassed Objects
-------------------------------------------

   For PDL functions where the output is created and returned, PDL will
call either the subclassed object's initialize or its copy method to
create the output object. (See *Note PDL/Indexing: PDL/Indexing, for a
discussion of Output Auto-Creation.) This behavior is summarized as
follows:

   * For Simple functions, defined as having a signature of

          func( a(), [o]b() )

     PDL will call $a->copy to create the output object.

     In the spirit of the perl philosophy of making *Easy Things Easy*,
     this behavior enables PDL-subclassed objects to be written without
     having to overload the many simple PDL functions in this category.

     The file t/subclass4.t in the PDL Distribution tests for this
     behavior.  See that file for an example.

   * For other functions, PDL will call $class->initialize to create the
     output object.  Where $class is the class name of the first argument
     supplied to the function.

     For these more complex cases, it is difficult to second-guess the
     subclassed object's designer to know whether a copy or an initialize
     is appropriate. So for these cases, $class->initialize is called by
     default. If this is not appropriate for you, overload the function in
     your subclass and do whatever is appropriate in the overloaded
     function's code.

AUTHOR
======

   Copyright (C) Karl Glazebrook (kgb@aaoepp.aao.gov.au), Tuomas J. Lukka,
(lukka@husc.harvard.edu) and Christian Soeller (c.soeller@auckland.ac.nz)
2000.  Commercial reproduction of this documentation in a different format
is forbidden.


File: pm.info,  Node: PDL/Ops,  Next: PDL/Opt/Simplex,  Prev: PDL/Objects,  Up: Module List

Fundamental mathematical operators
**********************************

NAME
====

   PDL::Ops - Fundamental mathematical operators

DESCRIPTION
===========

   This module provides the functions used by PDL to overload the basic
mathematical operators (`+ - / *' etc.) and functions (`sin sqrt' etc.)

   It also includes the function log10, which should be a perl function so
that we can overload it!

SYNOPSIS
========

   none

FUNCTIONS
=========

plus
----

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary + operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

mult
----

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary * operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

minus
-----

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary - operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

divide
------

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary / operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

gt
--

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary `>' operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

lt
--

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary `<' operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

le
--

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary `<=' operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

ge
--

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary `>=' operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

eq
--

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary `==' operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

ne
--

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary `!=' operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

shiftleft
---------

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary `<<' operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

shiftright
----------

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary `>>' operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

or2
---

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary | operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

and2
----

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary & operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

xor
---

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary ^ operator.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

bitnot
------

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the unary `~' operator/function.

power
-----

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary `**' function.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

atan2
-----

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary atan2 function.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

modulo
------

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary % function.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

spaceship
---------

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the binary `<=>' function.  Note that when
calling this function explicitly you need to supply a third argument that
should generally be zero (see first example).  This restriction is
expected to go away in future releases.

sqrt
----

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the unary sqrt operator/function.

abs
---

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the unary abs operator/function.

sin
---

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the unary sin operator/function.

cos
---

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the unary cos operator/function.

not
---

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the unary ! operator/function.

exp
---

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the unary exp operator/function.

log
---

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the unary log operator/function.

log10
-----

   It can be made to work inplace with the `$a->inplace' syntax.  This
function is used to overload the unary log10 operator/function.

assgn
-----

AUTHOR
======

   Tuomas J. Lukka (lukka@fas.harvard.edu), Karl Glazebrook
(kgb@aaoepp.aao.gov.au), Doug Hunt (dhunt@ucar.edu), Christian Soeller
(c.soeller@auckland.ac.nz), and Doug Burke (burke@ifa.hawaii.edu).


File: pm.info,  Node: PDL/Opt/Simplex,  Next: PDL/Options,  Prev: PDL/Ops,  Up: Module List

Simplex optimization routines
*****************************

NAME
====

   PDL::Opt::Simplex - Simplex optimization routines

SYNOPSIS
========

     use PDL::Opt::Simplex;

     ($optimum,$ssize) = simplex($init,$initsize,$minsize,
     		 $maxiter,
     		 sub {evaluate_func_at($_[0])},
     		 sub {display_simplex($_[0])}
     		 );

DESCRIPTION
===========

   This package implements the commonly used simplex optimization
algorithm. The basic idea of the algorithm is to move a "simplex" of N+1
points in the N-dimensional search space according to certain rules. The
main benefit of the algorithm is that you do not need to calculate the
derivatives of your function.

   $init is a 1D vector holding the initial values of the N fitted
parameters, $optimum is a vector holding the final solution.

   $initsize is the size of $init (more...)

   $minsize is some sort of convergence criterion (more...)  - e.g.
$minsize = 1e-6

   The sub is assumed to understand more than one dimension and threading.
Its signature is 'inp(nparams); [ret]out()'. An example would be

     sub evaluate_func_at {
     	my($xv) = @_;
     	my $x1 = $xv->slice("(0)");
     	my $x2 = $xv->slice("(1)");
     	return $x1**4 + ($x2-5)**4 + $x1*$x2;
     }

   Here $xv is a vector holding the current values of the parameters being
fitted which are then sliced out explicitly as $x1 and $x2.
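
   To make the role of the evaluation function concrete, here is a
minimal C sketch of the same test function applied to the N+1 vertices of
a simplex for N = 2.  It is not the PDL::Opt::Simplex implementation: the
helper `rank_vertices' is hypothetical, the sketch assumes the routine
minimises (so the lowest value is "best"), and it loops over the vertices
explicitly where the Perl sub relies on threading.

```c
#include <assert.h>
#include <stddef.h>

#define NPARAMS 2                 /* N fitted parameters */
#define NVERTS  (NPARAMS + 1)     /* a simplex has N+1 vertices */

/* Same function as the Perl example: x1**4 + (x2-5)**4 + x1*x2. */
static double evaluate_func_at(const double xv[NPARAMS])
{
    double x1 = xv[0];
    double d  = xv[1] - 5.0;
    return x1 * x1 * x1 * x1 + d * d * d * d + x1 * xv[1];
}

/* Hypothetical helper: evaluate every vertex and report the indices of
 * the best (lowest) and worst (highest) function values. */
static void rank_vertices(const double simplex[NVERTS][NPARAMS],
                          size_t *best, size_t *worst)
{
    double fbest, fworst;
    assert(best != NULL && worst != NULL);
    *best = *worst = 0;
    fbest = fworst = evaluate_func_at(simplex[0]);
    for (size_t v = 1; v < NVERTS; v++) {
        double f = evaluate_func_at(simplex[v]);
        if (f < fbest)  { fbest = f;  *best = v; }
        if (f > fworst) { fworst = f; *worst = v; }
    }
}
```

   For the vertices (0,5), (1,5) and (0,6) the function values are 0, 6
and 1, so the first vertex is the best and the second the worst.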

   $ssize gives a very, very approximate estimate of how close we might be
- it might be miles wrong. It is the Euclidean distance between the best
and the worst vertices. If it is not very small, the algorithm has not
converged.

FUNCTIONS
=========

simplex
-------

   See module `PDL::Opt::Simplex' for more information.

CAVEATS
=======

   Do not use the simplex method if your function has local minima.  It
will not work. Use genetic algorithms or simulated annealing or conjugate
gradient or momentum gradient descent.

   They will not really work either but they are not guaranteed not to
work ;) (if you have infinite time, simulated annealing is guaranteed to
work but only after it has visited every point in your space).

SEE ALSO
========

   Ron Shaffer's chemometrics web page and references therein:
`http://chem1.nrl.navy.mil/~shaffer/chemoweb.html'.

   Numerical Recipes (bla bla bla XXX ref).

   The demonstration (Examples/Simplex/tsimp.pl and tsimp2.pl).

AUTHOR
======

   Copyright(C) 1997 Tuomas J. Lukka.  All rights reserved. There is no
warranty. You are allowed to redistribute this software / documentation
under certain conditions. For details, see the file COPYING in the PDL
distribution. If this file is separated from the PDL distribution, the
copyright notice should be included in the file.


File: pm.info,  Node: PDL/Options,  Next: PDL/PP,  Prev: PDL/Opt/Simplex,  Up: Module List

simplifies option passing by hash in PerlDL
*******************************************

NAME
====

   PDL::Options - simplifies option passing by hash in PerlDL

SYNOPSIS
========

     use PDL::Options;

     %hash = parse( \%defaults, \%user_options);

     use PDL::Options ();

     $opt = new PDL::Options;
     $opt = new PDL::Options ( \%defaults );

     $opt->defaults ( \%defaults );
     $opt->synonyms ( { 'COLOR' => 'COLOUR' } );

     $hashref = $opt->defaults;

     $opt->options ( \%user_options );

     $hashref = $opt->options;

     $opt->incremental(1);
     $opt->full_options(0);

DESCRIPTION
===========

   Object to simplify option passing for PerlDL subroutines.  Allows you
to merge user-defined options with defaults.  A simplified (non-OO)
interface is provided.

Utility functions
=================

ifhref
------

     parse({Ext => 'TIF'}, ifhref($opt));

   Returns the argument if it is a hash ref, otherwise returns an empty
hash ref. Useful in conjunction with parse to return just the default
values if the argument is not a hash ref.
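
   The behaviour described above can be sketched in a few lines of plain
Perl. The name ifhref_sketch is hypothetical and the body is an assumption
based only on the description, not a copy of the module's code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for ifhref(): pass a hash ref through unchanged
# and replace anything else with a fresh, empty hash ref.
sub ifhref_sketch {
    my ($arg) = @_;
    return ( defined $arg && ref $arg eq 'HASH' ) ? $arg : {};
}

my $from_hash   = ifhref_sketch( { Ext => 'GIF' } );
my $from_string = ifhref_sketch('not a hash');
my $from_undef  = ifhref_sketch(undef);

# Number of keys in each result.
printf "%d %d %d\n",
    scalar( keys %$from_hash ),
    scalar( keys %$from_string ),
    scalar( keys %$from_undef );    # prints 1 0 0
```

   A routine can therefore accept an optional trailing options argument
and hand it to parse without first checking whether the caller supplied
one.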

NON-OO INTERFACE
================

   A simplified non-object oriented interface is provided.  These routines
are exported into the caller's namespace by default.

parse( \%defaults, \%user_options)
     This will parse user options by using the defaults.  The following
     settings are used for parsing: the options are case-sensitive, a
     default synonym table is consulted (currently just contains a synonym
     for COLOUR), minimum-matching is turned on, translation of values is
     not performed.

     A hash (not hash reference) containing the processed options is
     returned.

          %options = parse( { LINE => 1, COLOUR => 'red'}, { COLOR => 'blue'});

iparse( \%defaults, \%user_options)
     Same as parse but matching is case insensitive.

METHODS
=======

   The following methods are available to PDL::Options objects.

new()
     Constructor. Creates the object. With an optional argument can also
     set the default options.

extend (\%options)
     This will copy the existing options object and extend it with the
     requested extra options.

defaults( \%defaults )
     Method to set or return the current defaults. The argument should be
     a reference to a hash. The hash reference is returned if no arguments
     are supplied.

     The current values are reset whenever the defaults are changed.

add_synonym (\%synonyms)
     Method to add another synonym to an option set. The argument should be
     a reference to a hash.

add_translation (\%translation)
     Method to add another translation rule to an option set.  The
     argument should be a reference to a hash.

synonyms( \%synonyms )
     Method to set or return the current synonyms. The argument should be
     a reference to a hash. The hash reference is returned if no arguments
     are supplied.

     This allows you to provide alternate keywords (such as allowing
     'COLOR' as an option when your defaults uses 'COLOUR').

current
     Returns the current state of the options. This is returned as a hash
     reference (although it is not a reference to the actual hash stored
     in the object). If full_options() is true the full options hash is
     returned, if full_options() is false only the modified options are
     returned (as set by the last call to options()).

translation
     Provide translation of options to more specific values that are
     recognised by the program. This allows, for example, the automatic
     translation of the string 'red' to '#ff0000'.

     This method can be used to set up the dictionary, which is a hash
     reference with the following structure:

          OPTIONA => {
                      'string1' => decode1,
                      'string2' => decode2
                     },
          OPTIONB => {
                      's4' => decodeb1,
                     }
          etc....

     Where OPTION? corresponds to the top level option name as stored in
     the defaults hash (eg LINECOLOR) and the anonymous hashes provide
     the translation from string1 ('red') to decode1 ('#ff0000').

     An options string will be translated automatically during the main
     options() processing if autotrans() is set to true. Else translation
     can be initiated by the user using the translate() method.

incremental
     Specifies whether the user defined options will be treated as
     additions to the current state of the object (1) or modifications to
     the default values only (0).

     Can be used to set or return this value.  Default is false.

full_options
     Governs whether a complete set of options is returned (ie defaults +
     expanded user options), true, or if just the expanded user options
     are returned, false (ie the values specified by the user).

     This can be useful when you are only interested in the changes to the
     options rather than knowing the full state. (For example, if defaults
     contains keys for COLOUR and LINESTYLE and the user supplied a key of
     COL, you may simply be interested in the modification to COLOUR
     rather than the state of LINESTYLE and COLOUR.)

     Default is true.

casesens
     Specifies whether the user defined options will be processed
     independent of case (0) or not (1). Default is to be case insensitive.

     Can be used to set or return this value.

minmatch
     Specifies whether the user defined options will be minimum matched
     with the defaults (1) or whether the user defined options should match
     the default keys exactly. The default is true (1).

     If a particular key matches exactly (within the constraints imposed
     by case sensitivity) this key will always be taken as correct even
     if others are similar. For example COL would match COL and COLOUR but
     this implementation will always return COL in this case. (Note that
     for CO it will match both COL and COLOUR and pick one at random.)

     Can be used to set or return this value.

autotrans
     Specifies whether the user defined options will be processed via the
     translate() method immediately following the main options parsing.
     Default is to autotranslate (1).

     Can be used to set or return this value.

casesenstrans
     Specifies whether the keys in the options hash will be matched
     insensitive of case (0) during translation() or not (1). Default is
     to be case insensitive.

     Can be used to set or return this value.

minmatchtrans
     Specifies whether the keys in the options hash  will be minimum
     matched during translation(). Default is false (0).

     If a particular key matches exactly (within the constraints imposed
     by case sensitivity) this key will always be taken as correct even
     if others are similar. For example COL would match COL and COLOUR but
     this implementation will always return COL in this case. (Note that
     for CO it will match both COL and COLOUR and pick one at random.)

     Can be used to set or return this value.

warnonmissing
     Turn on or off the warning message printed when an option is not in
     the options hash. Turning it off can be convenient when a user passes
     a set of options that has to be parsed by several different option
     objects down the line.

debug
     Turn on or off debug messages. Default is off (0).  Can be used to
     set or return this value.

options
     Takes a set of user-defined options (as a reference to a hash) and
     merges them with the current state (or the defaults; depends on the
     state of incremental()).

     The user-supplied keys will be compared with the defaults.  Case
     sensitivity and minimum matching can be configured using the
     minmatch() and casesens() methods.

     A warning is raised if keys present in the user options are not
     present in the defaults, unless warnonmissing() has been turned off.

     A reference to a hash containing the merged options is returned.

          $merged = $opt->options( { COL => 'red', Width => 1});

     The state of the object can be retrieved after this by using the
     current() method or by using the options() method with no arguments.
     If full_options() is true, all options are returned (defaults plus
     overrides), if full_options() is false then only the modified options
     are returned.

     Synonyms are supported if they have been configured via the synonyms()
     method.

translate
     Translate the current option values (eg those set via the options()
     method) using the provided translation().

     This method updates the current state of the object and returns the
     updated options hash as a reference.

          $ref = $opt->translate;
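
   To make the interplay of synonyms, minimum matching, case
insensitivity and translation more concrete, here is a plain-Perl sketch
of the lookup rules described above. It does not use PDL::Options; the
helper names match_key_sketch and merge_options_sketch are hypothetical.
It also simplifies in three places: an ambiguous prefix resolves to the
first candidate in sorted order rather than an arbitrary one, synonyms
are looked up case-sensitively, and only user-supplied values are
translated.

```perl
use strict;
use warnings;

# Hypothetical helper: resolve one user key against the default keys.
# An exact match (ignoring case) wins; otherwise the first default key
# (in sorted order) that starts with the user key is taken.
sub match_key_sketch {
    my ( $user_key, @default_keys ) = @_;
    my $wanted = lc $user_key;
    my @exact  = grep { lc($_) eq $wanted } @default_keys;
    return $exact[0] if @exact;
    my @prefix = grep { index( lc($_), $wanted ) == 0 } sort @default_keys;
    return $prefix[0];    # undef when nothing matches
}

# Hypothetical helper: synonym lookup, key matching, merge, translation.
sub merge_options_sketch {
    my ( $defaults, $synonyms, $translation, $user ) = @_;
    my %merged = %$defaults;
    for my $key ( sort keys %$user ) {
        my $name = exists $synonyms->{$key} ? $synonyms->{$key} : $key;
        my $real = match_key_sketch( $name, keys %$defaults );
        if ( !defined $real ) {
            warn "Option '$key' is not a known option\n";
            next;
        }
        my $value = $user->{$key};
        my $table = $translation->{$real};
        $value = $table->{$value} if $table && exists $table->{$value};
        $merged{$real} = $value;
    }
    return \%merged;
}

my $result = merge_options_sketch(
    { Colour => 'red', LineStyle => 'dashed', LineWidth => 1 },
    { Color  => 'Colour' },
    {
        Colour    => { blue  => '#0000ff', red => '#ff0000', green => '#00ff00' },
        LineStyle => { solid => 1, dashed => 2, dotted => 3 },
    },
    { Color => 'green', lines => 'solid' },
);

print join( ', ', map { "$_ => $result->{$_}" } sort keys %$result ), "\n";
# prints Colour => #00ff00, LineStyle => 1, LineWidth => 1
```

   The user key 'lines' is not a default key, but it is a
case-insensitive prefix of LineStyle only, so it resolves unambiguously.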

EXAMPLE
=======

   Two examples are shown. The first uses the simplified interface and the
second uses the object-oriented interface.

Non-OO
======

     use PDL::Options (':Func');

     %options = parse( {
     		   LINE => 1,
     		   COLOUR => 'red',
     		  },
     		  {
     		   COLOR => 'blue'
     		  }
     		);

   This will return a hash containing

     %options = (
                  LINE => 1,
                  COLOUR => 'blue'
                )

Object oriented
===============

   The following example will try to show the main points:

     use PDL::Options ();

     # Create new object and supply defaults
     $opt = new PDL::Options( { Colour    => 'red',
                                LineStyle => 'dashed',
                                LineWidth => 1
                              }
                            );

     # Create synonyms
     $opt->synonyms( { Color => 'Colour' } );

     # Create translation dictionary
     $opt->translation( { Colour => {
                            'blue'  => '#0000ff',
                            'red'   => '#ff0000',
                            'green' => '#00ff00'
                          },
                          LineStyle => {
                            'solid'  => 1,
                            'dashed' => 2,
                            'dotted' => 3
                          }
                        }
                      );

     # Generate and parse test hash
     $options = $opt->options( { Color => 'green',
                                 lines => 'solid',
                               }
                             );

   When this code is run, $options will be the reference to a hash
containing the following:

     Colour => '#00ff00',
     LineStyle => 1,
     LineWidth => 1

   If full_options() was set to false (0), $options would be a reference
to a hash containing:

     Colour => '#00ff00',
     LineStyle => 1

   Minimum matching and case insensitivity can be configured for both the
initial parsing and for the subsequent translating. The translation can be
turned off if not desired.

   Currently synonyms are not available for the translation although this
could be added quite simply.

AUTHOR
======

   Copyright (C) Tim Jenness 1998 (t.jenness@jach.hawaii.edu).  All rights
reserved. There is no warranty. You are allowed to redistribute this
software / documentation under certain conditions. For details, see the
file COPYING in the PDL distribution. If this file is separated from the
PDL distribution, the copyright notice should be included in the file.


File: pm.info,  Node: PDL/PP,  Next: PDL/PP/Dump,  Prev: PDL/Options,  Up: Module List

Generate PDL routines from concise descriptions
***********************************************

NAME
====

   PDL::PP - Generate PDL routines from concise descriptions

SYNOPSIS
========

   e.g.

     pp_def(
     	'sumover',
     	Pars => 'a(n); [o]b();',
     	Code => 'double tmp=0;
     		loop(n) %{ tmp += $a(); %}
     		$b() = tmp;
     		'
     );

     pp_done();

DESCRIPTION
===========

   In much of what follows we will assume familiarity of the reader with
the concepts of implicit and explicit threading and index manipulations
within PDL. If you have not yet heard of these concepts or are not very
comfortable with them it is time to check *Note PDL/Indexing:
PDL/Indexing,.

   As you may appreciate from its name PDL::PP is a Pre-Processor, i.e.
it expands code via substitutions to make real C-code (well, actually it
outputs XS code (See *perlxs*) but that is very close to C).

Overview
========

   Why do we need PP? Several reasons: firstly, we want to be able to
generate subroutine code for each of the PDL datatypes (PDL_Byte,
PDL_Short, etc.) AUTOMATICALLY.  Secondly, when referring to slices of
PDL arrays in Perl (e.g. `$a->slice('0:10:2,:')' or other things such as
transposes) it is nice to be able to do this transparently and to be able
to do this 'in-place' - i.e, not to have to make a memory copy of the
section. PP handles all the necessary element and offset arithmetic for
you. There are also the notions of threading (repeated calling of the same
routine for multiple slices, see *Note PDL/Indexing: PDL/Indexing,) and
dataflow (see *Note PDL/Dataflow: PDL/Dataflow,) which use of PP allows.

   So how do you use PP? Well for the most part you just write ordinary C
code except for special PP constructs which take the form:

     $something(something else)

   or:

     PPfunction %{
       <stuff>
     %}

   The most important PP construct is the form `$array()'. Consider the
very simple PP function to sum the elements of a 1D vector (in fact this is
very similar to the actual code used by 'sumover'):

     pp_def('sumit',
             Pars => 'a(n);  [o]b();',
             Code => '
             	double tmp;
             	tmp = 0;
             	loop(n) %{
              	  tmp += $a();
              	%}
              	$b() = tmp;
     ');

   What's going on? The `Pars =>' line is very important for PP - it
specifies all the arguments and their dimensionality. We call this the
signature of the PP function (compare also the explanations in *Note
PDL/Indexing: PDL/Indexing,).  In this case the routine takes a 1-D
vector as input and returns a 0-D scalar as output.  The `$a()' PP
construct is used to access elements of the array a(n) for you - PP fills
in all the required C code.

   [Aside: since PP uses `$var()' for its parsing you must single-quote
all Code=> arguments since you don't want perl to interpolate `$var()' into
another string - i.e. don't use "" unless you know what you are doing!
Tjl: it's usually easiest to use single quotes and
'something'.$interpolatable.'somethingelse']

   In the simple case here where all elements are accessed the PP construct
`loop(n) %{ ... %}' is used to loop over all elements in dimension n.
Note this feature of PP: ALL DIMENSIONS ARE SPECIFIED BY NAME.

   This is made clearer if we avoid the PP loop() construct and write the
loop explicitly using conventional C:

     pp_def('sumit',
             Pars => 'a(n);  [o]b();',
             Code => '
             	int i,n_size;
             	double tmp;
             	n_size = $SIZE(n);
             	tmp = 0;
             	for(i=0; i<n_size; i++) {
              	  tmp += $a(n=>i);
              	}
              	$b() = tmp;
     ');

   which does the same as before, except more long-windedly.  You can see
to get element i of a() we say `$a(n=>i)' - we are specifying the
dimension by name n. In 2D we might say:

     Pars=>'a(m,n);',
        ...
        tmp += $a(m=>i,n=>j);
        ...

   The syntax 'm=>i' borrows from Perl hashes (which are in fact used in
the implementation of PP). One could also say `$a(n=>j,m=>i)' as order is
not important.

   You can also see in the above example the use of another PP construct -
$SIZE(n) to get the length of the dimension n.

   It should, however, be noted that you shouldn't write an explicit C
loop when you could have used the PP loop construct: PDL::PP checks the
loop limits for you automatically, using loop makes the code more
concise, etc. But there are certainly situations where you need explicit
control of the loop and now you know how to do it ;).

   To revisit 'Why PP?' - the above code for sumit() will be generated for
each data-type. It will operate on slices of arrays 'in-place'. It will
thread automatically - e.g. if a 2D array is given it will be called
repeatedly for each 1D row (again check *Note PDL/Indexing: PDL/Indexing,
for the details of threading).  And then b() will be a 1D array of sums of
each row.  We could call it with $a->xchg(0,1) to sum the columns instead.
And Dataflow tracing etc. will be available.
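
   The effect of threading on sumit can be imitated in plain Perl with
nested arrays. This is only a sketch of the behaviour, not of the
mechanism: the names sumit_core_sketch, sumit_threaded_sketch and
xchg_sketch are hypothetical, and unlike PDL the stand-in for xchg copies
the data instead of operating in place.

```perl
use strict;
use warnings;

# Stand-in for the body of sumit(), signature 'a(n); [o]b()'.
sub sumit_core_sketch {
    my ($row) = @_;
    my $tmp = 0;
    $tmp += $_ for @$row;    # loop(n) %{ tmp += $a(); %}
    return $tmp;             # $b() = tmp;
}

# Stand-in for the thread loop: one call of the core per 1D row.
sub sumit_threaded_sketch {
    my ($rows) = @_;
    return [ map { sumit_core_sketch($_) } @$rows ];
}

# Stand-in for xchg(0,1): swap the two dimensions (copies the data).
sub xchg_sketch {
    my ($rows) = @_;
    my $ncols = @{ $rows->[0] };
    return [ map { my $c = $_; [ map { $_->[$c] } @$rows ] } 0 .. $ncols - 1 ];
}

my $a2d = [ [ 1, 2, 3 ], [ 4, 5, 6 ] ];
print "@{ sumit_threaded_sketch($a2d) }\n";                   # prints 6 15
print "@{ sumit_threaded_sketch( xchg_sketch($a2d) ) }\n";    # prints 5 7 9
```

   The core routine only ever sees a 1D vector; everything that makes it
work on 2D input lives in the surrounding loop, which is the part PP
generates for you.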

   You can see PP saves the programmer from writing a lot of needlessly
repetitive C-code - in our opinion this is one of the best features of PDL
making writing new C subroutines for PDL an amazingly concise exercise. A
second reason is the ability to make PP expand your concise code
definitions into different C code based on the needs of the computer
architecture in question. Imagine for example you are lucky to have a
supercomputer at your hands; in that case you want PDL::PP certainly to
generate code that takes advantage of the vectorising/parallel computing
features of your machine (this is a project for the future). In any case, the
bottom line is that your unchanged code should still expand to working XS
code even if the internals of PDL changed.

   Also, because you are generating the code in an actual Perl script,
there are many fun things that you can do. Let's say that you need to
write both sumit (as above) and multit. With a little bit of inventiveness,
we can do

     for({Name => 'sumit', Init => '0', Op => '+='},
         {Name => 'multit', Init => '1', Op => '*='}) {
     	   pp_def($_->{Name},
     		   Pars => 'a(n);  [o]b();',
     		   Code => '
     			double tmp;
     			tmp = '.$_->{Init}.';
     			loop(n) %{
     			  tmp '.$_->{Op}.' $a();
     			%}
     			$b() = tmp;
     	   ');
     }

   which defines both the functions easily. Now, if you later need to
change the signature or dimensionality or whatever, you only need to
change one place in your code.  Yeah, sure, your editor does have 'cut and
paste' and 'search and replace' but it's still less bothersome and
definitely more difficult to forget just one place and have strange bugs
creep in.  Also, adding 'orit' (bitwise or) later is a one-liner.

   And remember, you really have perl's full abilities with you - you can
very easily read any input file and make routines from the information in
that file. For simple cases like the above, the author (Tjl) currently
favors the hash syntax like the above - it's not many more characters
than the corresponding array syntax but much easier to understand and
change.

   We should mention here also the ability to get the pointer to the
beginning of the data in memory - a prerequisite for interfacing PDL to
some libraries. This is handled with the `$P(var)' directive, see below.

   So, after this quick overview of the general flavour of programming PDL
routines using PDL::PP let's summarise in which circumstances you should
actually use this preprocessor/precompiler. You should use PDL::PP if you
want to

   * interface PDL to some external library

   * write some algorithm that would be slow if coded in perl (this is not
     as often as you think; take a look at threading and dataflow first).

   * be a PDL developer (and even then it's not obligatory)

WARNING
=======

   Because of its architecture, PDL::PP can be both flexible and easy to
use (yet exuberantly complicated) at the same time. Currently, part of the
problem is that error messages are not very informative and if something
goes wrong, you'd better know what you are doing and be able to hack your
way through the internals (or be able to figure out by trial and error
what is wrong with your args to `pp_def').

   An alternative, of course, is to ask someone about it (e.g., through the
mailing lists).

ABANDON ALL HOPE, YE WHO ENTER HERE (DESCRIPTION)
=================================================

   Now that you have some idea how to use `pp_def' to define new PDL
functions it is time to explain the general syntax of `pp_def'.  `pp_def'
takes as arguments first the name of the function you are defining and
then a hash list that can contain various keys.

   Based on these keys PP generates XS code and a .pm file. The function
`pp_done' (see example in the SYNOPSIS) is used to tell PDL::PP that there
are no more definitions in this file and it is time to generate the .xs and
.pm file.

   As a consequence, there may be several pp_def() calls inside a file (by
convention files with PP code have the extension .pd or .pp) but generally
only one pp_done().

   There are two main different types of usage of pp_def(), the 'data
operation' and 'slice operation' prototypes.

   The 'data operation' is used to take some data, mangle it and output
some other data; this includes for example the '+' operation, matrix
inverse, sumover etc and all the examples we have talked about in this
document so far. Implicit and explicit threading and the creation of the
result are taken care of automatically in those operations. You can even
do dataflow with `sumit', sumover, etc (don't be dismayed if you don't
understand the concept of dataflow in PDL very well yet; it is still very
much experimental).

   The 'slice operation' is a different kind of operation: in a slice
operation, you are not changing any data, you are defining correspondences
between different elements of two piddles (examples include the index
manipulation/slicing function definitions in the file `slices.pd' that is
part of the PDL distribution; but beware, this is not introductory level
stuff).

   If PDL was compiled with support for bad values (ie `WITH_BADVAL => 1'),
then additional keys are required for `pp_def', as explained below.

   If you are just interested in communicating with some external library
(for example some linear algebra/matrix library), you'll usually want the
'data operation' so we are going to discuss that first.

Data operation
==============

A simple example
----------------

   In the data operation, you must know what dimensions of data you need.
First, an example with scalars:

     pp_def('add',
     	Pars => 'a(); b(); [o]c();',
     	Code => '$c() = $a() + $b();'
     );

   That looks a little strange but let's dissect it. The first line is
easy: we're defining a routine with the name 'add'.  The second line
simply declares our parameters and the parentheses mean that they are
scalars. We call the string that defines our parameters and their
dimensionality the signature of that function. For its relevance with
regard to threading and index manipulations check the *Note PDL/Indexing:
PDL/Indexing, manpage.

   The third line is the actual operation. You need to use the dollar
signs and parentheses to refer to your parameters (this will probably
change at some point in the future, once a good syntax is found).

   These lines are all that is necessary to actually define the function
for PDL (well, actually it isn't; you additionally need to write a
Makefile.PL (see below) and build the module (something like 'perl
Makefile.PL; make'); but let's ignore that for the moment). So now you can
do

     use MyModule;
     $a = pdl 2,3,4;
     $b = pdl 5;

     $c = add($a,$b);
     # or
     add($a,$b,($c=null)); # Alternative form, useful if $c has been
                           # preset to something big, not useful here.

   and have threading work correctly (the result is $c == [7 8 9]).

The Pars section: the signature of a PP function
------------------------------------------------

   Seeing the above example code you will most probably ask: what is this
strange `$c=null' syntax in the second call to our new add function? If
you take another look at the definition of add you will notice that the
third argument c is flagged with the qualifier `[o]' which tells PDL::PP
that this is an output argument. So the above call to add means 'create a
new $c from scratch with correct dimensions' - null is a special token for
'empty piddle' (you might ask why we haven't used the value undef to flag
this instead of the PDL specific null; we are currently thinking about it
;).

   [This should be explained in some other section of the manual as well!!]
The reason for having this syntax as an alternative is that if you have
really huge piddles, you can do

     $c = PDL->null;
     for(some long loop) {
     	# munge a,b
     	add($a,$b,$c);
     	# munge c, put something back to a,b
     }

   and avoid allocating and deallocating $c each time. It is allocated
once at the first add() and thereafter the memory stays until $c is
destroyed.

   If you just say

     $c =  add($a,$b);

   the code generated by PP will automatically fill in `$c=null' and return
the result. If you want to learn more about the reasons why PDL::PP
supports this style where output arguments are given as last arguments
check the *Note PDL/Indexing: PDL/Indexing, manpage.

   `[o]' is not the only qualifier a pdl argument can have in the
signature.  Another important qualifier is the `[t]' option which flags a
pdl as temporary.  What does that mean? You tell PDL::PP that this pdl is
only used for temporary results in the course of the calculation and you
are not interested in its value after the computation has been completed.
But why should PDL::PP want to know about this in the first place?  The
reason is closely related to the concepts of pdl auto creation (you heard
about that above) and implicit threading. If you use implicit threading
the dimensionality of automatically created pdls is actually larger than
that specified in the signature. Pdls flagged with `[o]' will be created
so that they have the additional dimensions as required by the number of
implicit thread dimensions. When creating a temporary pdl, however, it
will always only be made big enough so that it can hold the result for one
iteration in a threadloop, i.e. as large as required by the signature.  So
less memory is wasted when you flag a pdl as temporary. Secondly, you can
use output auto creation with temporary pdls even when you are using
explicit threading which is forbidden for normal output pdls flagged with
`[o]' (see *Note PDL/Indexing: PDL/Indexing,).

   Here is an example where we use the [t] qualifier. We define the
function `callf' that calls a C routine f which needs a temporary array of
the same size and type as the array a (sorry about the forward reference
for `$P'; it's a pointer access, see below) :

     pp_def('callf',
     	Pars => 'a(n); [t] tmp(n); [o] b()',
     	Code => 'int ns = $SIZE(n);
     		 f($P(a),$P(b),$P(tmp),ns);
     		'
     );

Argument dimensions and the signature
-------------------------------------

   Now we have just talked about dimensions of pdls and the signature. How
are they related? Let's say that we want to add a scalar + the index
number to a vector:

     pp_def('add2',
     	Pars => 'a(n); b(); [o]c(n);',
     	Code => 'loop(n) %{
     			$c() = $a() + $b() + n;
     		 %}'
     );

   There are several points to notice here: first, the Pars argument now
contains the n argument to show that we have a single dimension in a and
c. It is important to note that dimensions are actual entities that are
accessed by name so this declares a and c to have the *same* first
dimension. In most PP definitions the size of named dimensions will be
set from the respective dimensions of non-output pdls (those with no `[o]'
flag) but sometimes you might want to set the size of a named dimension
explicitly through an integer parameter. See below in the description of
the OtherPars section how that works.

Type conversions and the signature
----------------------------------

   The signature also determines the type conversions that will be
performed when a PP function is invoked. So what happens when we invoke
one of our previously defined functions with pdls of different type, e.g.

     add2($a,$b,($ret=null));

   where $a is of type `PDL_Float' and $b of type `PDL_Short'? With the
signature as shown in the definition of `add2' above the datatype of the
operation (as determined at runtime) is that of the pdl with the 'highest'
type (sequence is byte < short < ushort < long < float < double). In the
add2 example the datatype of the operation is float ($a has that
datatype). All pdl arguments are then type converted to that datatype
(they are not converted inplace but a copy with the right type is created
if a pdl argument doesn't have the type of the operation).  Null pdls
don't contribute a type in the determination of the type of the operation.
However, they will be created with the datatype of the operation; here,
for example, $ret will be of type float. You should be aware of these
rules when calling PP functions with pdls of different types to take the
additional storage and runtime requirements into account.
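
   The rule for choosing the datatype of the operation can be sketched in
plain Perl as follows. The type names and their order are taken from the
sequence quoted above; the helper name operation_type_sketch is
hypothetical, and null pdls are represented by undef.

```perl
use strict;
use warnings;

# Rank of each type in the promotion sequence quoted in the text.
my @TYPE_ORDER = qw(byte short ushort long float double);
my %TYPE_RANK  = map { $TYPE_ORDER[$_] => $_ } 0 .. $#TYPE_ORDER;

# Hypothetical helper: each argument is a type name, or undef for a
# null pdl, which contributes nothing to the decision.
sub operation_type_sketch {
    my @arg_types = @_;
    my $best;
    for my $type ( grep { defined $_ } @arg_types ) {
        die "unknown type '$type'\n" unless exists $TYPE_RANK{$type};
        $best = $type
            if !defined $best || $TYPE_RANK{$type} > $TYPE_RANK{$best};
    }
    return $best;
}

# add2($a, $b, ($ret = null)) with $a float, $b short and $ret null
my $op_type = operation_type_sketch( 'float', 'short', undef );
print "operation type: $op_type\n";    # prints operation type: float
```

   Every argument whose type differs from the result (here $b) is the
one that gets copied and converted, and the null output is created with
that same type.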

   These type conversions are correct for most functions you normally
define with `pp_def'. However, there are certain cases where slightly
modified type conversion behaviour is desired. For these cases additional
qualifiers in the signature can be used to specify the desired properties
with regard to type conversion. These qualifiers can be combined with
those we have encountered already (the *creation qualifiers* `[o]' and
`[t]'). Let's go through the list of qualifiers that change type
conversion behaviour.

   The most important is the int qualifier which comes in handy when a pdl
argument represents indices into another pdl. Let's take a look at an
example from `PDL::Ufunc':

     pp_def('maximum_ind',
     	  Pars => 'a(n); int [o] b()',
     	  Code => '$GENERIC() cur;
     		   int curind;
     		   loop(n) %{
     		    if (!n || $a() > cur) {cur = $a(); curind = n;}
     	 	   %}
     	 	   $b() = curind;',
     );

   The function maximum_ind finds the index of the largest element of a
vector. If you look at the signature you notice that the output argument b
has been declared with the additional int qualifier.  This has the
following consequences for type conversions: regardless of the type of the
input pdl a the output pdl b will be of type `PDL_Long' which makes sense
since b will represent an index into a. Furthermore, if you call the
function with an existing output pdl b its type will not influence the
datatype of the operation (see above). Hence, even if a is of a smaller
type than b it will not be converted to match the type of b but stays
untouched, which saves memory and CPU cycles and is the right thing to do
when b represents indices. Also note that you can use the 'int' qualifier
together with other qualifiers (the `[o]' and `[t]' qualifiers). Order is
significant - type qualifiers precede creation qualifiers (`[o]' and
`[t]').

   The above example also demonstrates typical usage of the $GENERIC()
macro.  It expands to the current type in a so called generic loop. What
is a generic loop? As you already heard a PP function has a runtime
datatype as determined by the type of the pdl arguments it has been
invoked with.  The PP generated XS code for this function therefore
contains a switch like `switch (type) {case PDL_Byte: ... case PDL_Double:
...}' that selects a case based on the runtime datatype of the function
(it's called a type "loop" because there is a loop in PP code that
generates the cases).  In any case your code is inserted once for each PDL
type into this switch statement. The $GENERIC() macro just expands to the
respective type in each copy of your parsed code in this switch statement,
e.g., in the `case PDL_Byte' section `$GENERIC()' will expand to
`PDL_Byte' (so cur is declared as a PDL_Byte) and so on for the other
case statements. This makes it a useful macro for declaring variables
that hold values of pdls in your code.

   There are a couple of other qualifiers with similar effects as int.
For your convenience there are the float and double qualifiers with
analogous consequences on type conversions as int. Let's assume you have a
*very* large array for which you want to compute row and column sums with
an equivalent of the sumover function.  However, with the normal
definition of sumover you might run into problems when your data is, e.g.
of type short. A call like

     sumover($large_pdl,($sums = null));

   will result in `$sums' being of type short, which is therefore prone to
overflow errors if `$large_pdl' is a very large array. On the other hand
calling

     @dims = $large_pdl->dims; shift @dims;
     sumover($large_pdl,($sums = zeroes(double,@dims)));

   is not a good alternative either. Now we don't have overflow problems
with `$sums' but at the expense of a type conversion of `$large_pdl' to
double, something bad if this is really a large pdl. That's where double
comes in handy:

     pp_def('sumoverd',
     	 Pars => 'a(n); double [o] b()',
     	 Code => 'double tmp=0;
     		  loop(n) %{ tmp += $a(); %}
     		  $b() = tmp;',
     );

   This gets us around the type conversion and overflow problems. Again,
analogous to the int qualifier, double results in b always being of type
double regardless of the type of a, without leading to a type conversion
of a as a side effect.
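
   The difference between the two accumulators can be tried out in plain
C. The following is only a sketch with hypothetical function names
(`sum_in_short', `sum_in_double'); it assumes 16-bit short data,
represented here by `int16_t'. Both functions read the data in its
original type; only the type of `tmp' differs:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Accumulate in the element type, as a plain sumover on short data
   would. Converting an out-of-range value back to int16_t is
   implementation-defined in C; on common platforms it wraps around. */
int16_t sum_in_short(const int16_t *a, size_t n)
{
    int16_t tmp = 0;
    size_t i;

    assert(a != NULL || n == 0);
    for (i = 0; i < n; i++)
        tmp = (int16_t)(tmp + a[i]);
    return tmp;
}

/* Accumulate in double while still reading int16_t elements: no
   converted copy of the input array is ever made. */
double sum_in_double(const int16_t *a, size_t n)
{
    double tmp = 0.0;
    size_t i;

    assert(a != NULL || n == 0);
    for (i = 0; i < n; i++)
        tmp += a[i];
    return tmp;
}
```

   With four elements of value 20000 the true sum is 80000, which a
16-bit accumulator cannot represent, while the double accumulator holds
it exactly.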

   Finally, there are the `type+' qualifiers where type is one of int or
float. What does that mean? Let's illustrate the `int+' qualifier with
the actual definition of sumover:

     pp_def('sumover',
     	 Pars => 'a(n); int+ [o] b()',
     	 Code => '$GENERIC(b) tmp=0;
     		  loop(n) %{ tmp += $a(); %}
     		  $b() = tmp;',
     );

   As we had already seen for the int, float and double qualifiers, a pdl
marked with a `type+' qualifier does not influence the datatype of the pdl
operation. Its meaning is "make this pdl at least of type type or higher,
as required by the type of the operation". In the sumover example this
means that when you call the function with an a of type PDL_Short the
output pdl will be of type PDL_Long (just as would have been the case with
the int qualifier). This again tries to avoid overflow problems when using
small datatypes (e.g. byte images).  However, when the datatype of the
operation is higher than the type specified in the `type+' qualifier b
will be created with the datatype of the operation, e.g. when a is of type
double then b will be double as well. We hope you agree that this is
sensible behaviour for sumover. It should be obvious how the `float+'
qualifier works by analogy.  It may become necessary to be able to specify
a set of alternative types for the parameters. However, this will probably
not be implemented until someone comes up with a reasonable use for it.

   Note that we now had to specify the `$GENERIC' macro with the name of
the pdl to derive the type from that argument. Why is that? If you
carefully followed our explanations you will have realised that in some
cases b will have a different type than the type of the operation.
Calling the `$GENERIC' macro with b as argument makes sure that the type
will always be the same as that of b in that part of the generic loop.
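
   The rule for the fixed and the `type+' qualifiers can be written down
in a few lines of C. This is a sketch of the rule as described above, not
PDL's implementation; the enum and function names are hypothetical, and
the ordering of the tags is assumed to follow the type letters `BSULFD':

```c
#include <assert.h>

/* Hypothetical type tags, ordered from lowest to highest. */
typedef enum {
    TQ_BYTE, TQ_SHORT, TQ_USHORT, TQ_LONG, TQ_FLOAT, TQ_DOUBLE
} tq_type;

/* Fixed qualifier such as `int' or `double': the argument always
   gets the stated type, whatever the type of the operation. */
tq_type tq_resolve_fixed(tq_type op_type, tq_type stated)
{
    assert(op_type <= TQ_DOUBLE && stated <= TQ_DOUBLE);
    return stated;
}

/* `type+' qualifier: at least the stated type, or the type of the
   operation if that is higher. */
tq_type tq_resolve_plus(tq_type op_type, tq_type stated)
{
    assert(op_type <= TQ_DOUBLE && stated <= TQ_DOUBLE);
    return op_type > stated ? op_type : stated;
}
```

   For `int+ [o] b()' and an a of type short this yields long, and for an
a of type double it yields double, matching the sumover behaviour
described above.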

   This is about all there is to say about the Pars section in a `pp_def'
call. You should remember that this section defines the signature of a PP
defined function, that you can use several options to qualify certain
arguments as output and temporary args, and that all dimensions that you
can later refer to in the Code section are defined by name.

   It is important that you understand the meaning of the signature since
in the latest PDL versions you can use it to define threaded functions
from within perl, i.e. what we call *perl level threading*. Please check
*Note PDL/Indexing: PDL/Indexing, for details.

The Code section
----------------

   The Code section contains the actual XS code that will be in the
innermost part of a threadloop (if you don't know what a thread loop is
then you still haven't read *Note PDL/Indexing: PDL/Indexing,; do it now
;) after any PP macros (like `$GENERIC') and PP functions have been
expanded (like the loop function we are going to explain next).

   Let's quickly reiterate the sumover example:

     pp_def('sumover',
     	 Pars => 'a(n); int+ [o] b()',
     	 Code => '$GENERIC(b) tmp=0;
     		  loop(n) %{ tmp += $a(); %}
     		  $b() = tmp;',
     );

   The loop construct in the Code section also refers to the dimension
name so you don't need to specify any limits: the loop is correctly sized
and everything is done for you, again.

   Next, there is the surprising fact that `$a()' and `$b()' do not
contain the index. This is not necessary because we're looping over n and
both variables know which dimensions they have so they automatically know
they're being looped over.

   This feature comes in very handy in many places and makes for much
shorter code. Of course, there are times when you want to circumvent this;
here is a function which symmetrizes a matrix and serves as an example of
how to code explicit looping:

     pp_def('symm',
     	Pars => 'a(n,n); [o]c(n,n);',
     	Code => 'loop(n) %{
     			int n2;
     			for(n2=n; n2<$SIZE(n); n2++) {
     				$c(n0 => n, n1 => n2) =
     				$c(n0 => n2, n1 => n) =
     				 $a(n0 => n, n1 => n2);
     			}
     		%}
     	'
     );

   Let's dissect what is happening. Firstly, what is this function
supposed to do? From its signature you see that it takes a 2D matrix with
equal numbers of columns and rows and outputs a matrix of the same size.
From a given input matrix $a it computes a symmetric output matrix $c
(symmetric in the matrix sense that A^T = A where ^T means matrix
transpose, or in PDL parlance $c == $c->xchg(0,1)). It does this by using
only the values on and below the diagonal of $a. In the output matrix $c
all values on and below the diagonal are the same as those in $a while
those above the diagonal are a mirror image of those below the diagonal
(above and below are here interpreted in the way that PDL prints 2D pdls).
If this explanation still sounds a bit strange just go ahead, make a
little file into which you write this definition, build the new PDL
extension (see section on Makefiles for PP code) and try it out with a
couple of examples.

   Having explained what the function is supposed to do there are a couple
of points worth noting from the syntactical point of view. First, we get
the size of the dimension named n again by using the `$SIZE' macro.
Second, there are suddenly these funny `n0' and `n1' index names in the
code though the signature defines only the dimension n. Why this? The
reason becomes clear when you note that both the first and second
dimension of $a and $c are named n in the signature of `symm'. This tells
PDL::PP that the first and second dimension of these arguments should have
the same size. Otherwise the generated function will raise a runtime error.
However, now in an access to `$a' and $c PDL::PP cannot figure out which
index n refers to any more just from the name of the index.  Therefore,
the indices with equal dimension names get numbered from left to right
starting at 0, e.g. in the above example `n0' refers to the first
dimension of `$a' and $c, `n1' to the second and so on.
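
   For readers who prefer to see the index arithmetic spelled out, here is
a plain C sketch of the same double loop. It assumes that element
(n0, n1) of an n-by-n array is stored at offset n0 + n*n1, i.e. with the
first dimension varying fastest; the macro `SYMM_AT' and the function
`symm_sketch' are hypothetical names used only here:

```c
#include <assert.h>
#include <stddef.h>

/* Element (n0, n1) of an n-by-n array, first dimension fastest. */
#define SYMM_AT(p, n, n0, n1) ((p)[(n0) + (size_t)(n) * (n1)])

void symm_sketch(const double *a, double *c, size_t n)
{
    size_t i, j;

    assert(n == 0 || (a != NULL && c != NULL));
    for (i = 0; i < n; i++) {               /* loop(n) %{ ... %}      */
        for (j = i; j < n; j++) {           /* n2 = n .. $SIZE(n)-1   */
            double v = SYMM_AT(a, n, i, j); /* $a(n0 => n, n1 => n2)  */
            SYMM_AT(c, n, i, j) = v;        /* $c(n0 => n, n1 => n2)  */
            SYMM_AT(c, n, j, i) = v;        /* $c(n0 => n2, n1 => n)  */
        }
    }
}
```

   Only elements of a whose second index is greater than or equal to the
first are ever read; every other element of c is filled in by mirroring.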

   In all examples so far, we have only used the Pars and Code members of
the hash that was passed to `pp_def'. There are certainly other keys that
are recognised by PDL::PP and we will hear about some of them in the
course of this document. Find a (non-exhaustive) list of keys in Appendix
A.  A list of macros and PP functions (we have encountered only some of
those in the examples so far) that are expanded in values of the hash
argument to `pp_def' is summarised in Appendix B.

   At this point, it might be appropriate to mention that PDL::PP is not a
completely static, well designed set of routines (as Tuomas puts it: "stop
thinking of PP as a set of routines carved in stone") but rather a
collection of things that the PDL::PP author (Tuomas J. Lukka) considered
he would have to write often into his PDL extension routines. PP tries to
be expandable so that in the future, as new needs arise, new common code
can be abstracted back into it. If you want to learn more on why you might
want to change PDL::PP and how to do it check the section on PDL::PP
internals.

Handling bad values
-------------------

   If you do not have bad-value support compiled into PDL you can ignore
this section and the related keys: BadCode, HandleBad, ...  (try printing
out the value of $PDL::Bad::Status - if it equals 0 then move straight on).

   There are several keys and macros used when writing code to handle bad
values. The first one is the HandleBad key:

HandleBad => 0
     This flags a pp-routine as *NOT* handling bad values. If this routine
     is sent piddles with their `badflag' set, then a warning message is
     printed to STDOUT and the piddles are processed as if the value used
     to represent bad values is a valid number. The `badflag' value is not
     propagated to the output piddles.

     An example of when this is used is for FFT routines, which generally
     do not have a way of ignoring part of the data.

HandleBad => 1
     This causes PDL::PP to write extra code that ensures the BadCode
     section is used, and that the `$ISBAD()' macro (and its brethren)
     work.

HandleBad is not given
     If any of the input piddles have their `badflag' set, then the output
     piddles will have their `badflag' set, but any supplied BadCode is
     ignored.

   The value of HandleBad is used to define the contents of the BadDoc
key, if it is not given.

   To handle bad values, code must be written somewhat differently; for
instance,

     $c() = $a() + $b();

   becomes something like

     if ( $a() != BADVAL && $b() != BADVAL ) {
        $c() = $a() + $b();
     } else {
        $c() = BADVAL;
     }

   However, we only want the second version if bad values are present in
the input piddles (and that bad-value support is wanted!) - otherwise we
actually want the original code. This is where the BadCode key comes in;
you use it to specify the code to execute if bad values may be present,
and PP uses both it and the Code section to create something like:

     if ( bad_values_are_present ) {
        fancy_threadloop_stuff {
           BadCode
        }
     } else {
        fancy_threadloop_stuff {
           Code
        }
     }

   This approach means that there is virtually no overhead when bad values
are not present (i.e. the badflag routine of PDL::Bad returns 0).
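
   The two-branch structure can be mimicked in ordinary C. The sketch
below is an illustration only: the function name `add_sketch' and its
parameters are hypothetical, the bad value is passed in explicitly, and a
simple loop over n elements stands in for the threadloop:

```c
#include <assert.h>
#include <stddef.h>

void add_sketch(const double *a, const double *b, double *c, size_t n,
                int bad_values_are_present, double badval)
{
    size_t i;

    assert(n == 0 || (a != NULL && b != NULL && c != NULL));
    if (bad_values_are_present) {
        /* "BadCode": every element is tested against the bad value */
        for (i = 0; i < n; i++) {
            if (a[i] != badval && b[i] != badval)
                c[i] = a[i] + b[i];
            else
                c[i] = badval;
        }
    } else {
        /* "Code": the plain loop, with no per-element test */
        for (i = 0; i < n; i++)
            c[i] = a[i] + b[i];
    }
}
```

   The flag is examined once per call rather than once per element, so
data without bad values runs through exactly the same loop as it would
without bad-value support.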

   The BadCode section can use the same macros and looping constructs as
the Code section.  However, it wouldn't be much use without the following
additional macros:

$ISBAD(var)
     To check whether a piddle's value is bad, use the `$ISBAD' macro:

          if ( $ISBAD(a()) ) { printf("a() is bad\n"); }

     You can also access given elements of a piddle:

          if ( $ISBAD(a(n=>l)) ) { printf("element %d of a() is bad\n", l); }

$ISGOOD(var)
     This is the opposite of the `$ISBAD' macro.

$SETBAD(var)
     For when you want to set an element of a piddle bad.

$ISBADVAR(c_var,pdl)
     If you have cached the value of a piddle `$a()' into a c-variable
     (foo say), then to check whether it is bad, use `$ISBADVAR(foo,a)'.

$ISGOODVAR(c_var,pdl)
     As above, but this time checking that the cached value isn't bad.

$SETBADVAR(c_var,pdl)
     To copy the bad value for a piddle into a c variable, use
     `$SETBADVAR(foo,a)'.

   *TODO:* mention `$PPISBAD()' etc macros.

   Using these macros, the above code could be specified as:

     Code => '$c() = $a() + $b();',
     BadCode => '
        if ( $ISBAD(a()) || $ISBAD(b()) ) {
           $SETBAD(c());
        } else {
           $c() = $a() + $b();
        }',

   Since this is perl, TMTOWTDI, so you could also write:

     BadCode => '
        if ( $ISGOOD(a()) && $ISGOOD(b()) ) {
           $c() = $a() + $b();
        } else {
           $SETBAD(c());
        }',

   If you want access to the value of the badflag for a given piddle, you
can use the `$PDLSTATExxxx()' macros:

$PDLSTATEISBAD(pdl)
$PDLSTATEISGOOD(pdl)
$PDLSTATESETBAD(pdl)
$PDLSTATESETGOOD(pdl)

   *TODO:* mention the FindBadStatusCode and CopyBadStatusCode options to
`pp_def', as well as the BadDoc key.

Interfacing your own/library functions using PP
-----------------------------------------------

   Now, consider the following: you have your own C function (that may in
fact be part of some library you want to interface to PDL) which takes as
arguments two pointers to vectors of double:

     void myfunc(int n,double *v1,double *v2);

   The correct way of defining the PDL function is

     pp_def('myfunc',
     	Pars => 'a(n); [o]b(n);',
     	GenericTypes => [D],
     	Code => 'myfunc($SIZE(n),$P(a),$P(b));'
     );

   The `$P(par)' syntax returns a pointer to the first element and the
other elements are guaranteed to lie after that.

   Notice that here it is possible to make many mistakes. First, $SIZE(n)
must be used instead of n. Second, you shouldn't put any loops in this
code. Third, here we encounter a new hash key recognised by PDL::PP : the
GenericTypes declaration tells PDL::PP to ONLY GENERATE THE TYPELOOP FOR
THE LIST OF TYPES SPECIFIED, in this case double. This has two advantages.
Firstly, the size of the compiled code is reduced vastly. Secondly, if
non-double arguments are passed to `myfunc()', PDL will automatically
convert them to double before passing them to the external C routine and
convert them back afterwards.
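
   What "convert, call, convert back" amounts to can be sketched in plain
C. This is not how PDL performs the conversion internally; it only shows
the effect seen by the library routine. `myfunc_sketch' is a hypothetical
stand-in for the library function (it simply doubles each element so that
the sketch can be run), and `call_on_floats' is likewise a made-up name:

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the double-only library routine. */
void myfunc_sketch(int n, double *v1, double *v2)
{
    int i;
    for (i = 0; i < n; i++)
        v2[i] = 2.0 * v1[i];
}

/* Call the double-only routine on float data: convert the input,
   call the routine, convert the output back.
   Returns 0 on success, -1 if memory could not be allocated. */
int call_on_floats(int n, const float *in, float *out)
{
    double *tmp_in, *tmp_out;
    int i;

    assert(n >= 0);
    if (n == 0)
        return 0;
    tmp_in = malloc((size_t)n * sizeof *tmp_in);
    tmp_out = malloc((size_t)n * sizeof *tmp_out);
    if (tmp_in == NULL || tmp_out == NULL) {
        free(tmp_in);
        free(tmp_out);
        return -1;
    }
    for (i = 0; i < n; i++)
        tmp_in[i] = (double)in[i];
    myfunc_sketch(n, tmp_in, tmp_out);
    for (i = 0; i < n; i++)
        out[i] = (float)tmp_out[i];
    free(tmp_in);
    free(tmp_out);
    return 0;
}
```

   The library routine only ever sees contiguous double data, which is
exactly what a pointer obtained with `$P' is expected to provide.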

   One can also use Pars to qualify the types of individual arguments.
Thus one could also write this as:

     pp_def('myfunc',
     	Pars => 'double a(n); double [o]b(n);',
     	Code => 'myfunc($SIZE(n),$P(a),$P(b));'
     );

   The type specification in Pars exempts the argument from variation in
the typeloop - rather it is automatically converted to and from the type
specified. This is obviously useful in a more general example, e.g.:

     void myfunc(int n,float *v1,long *v2);

     pp_def('myfunc',
     	Pars => 'float a(n); long [o]b(n);',
     	GenericTypes => [F],
     	Code => 'myfunc($SIZE(n),$P(a),$P(b));'
     );

   Note we still use GenericTypes to reduce the size of the type loop.
Obviously PP could in principle spot this and do it automatically, though
the code has yet to attain that level of sophistication!

   Finally, note that when types are converted automatically one MUST use
the `[o]' qualifier for output variables or your hard-won changes will get
optimised away by PP!

   If you interface a large library you can automate the interfacing even
further. Perl can help you again(!) in doing this. In many libraries you
have certain calling conventions. This can be exploited. In short, you can
write a little parser (which is really not difficult in perl) that then
generates the calls to `pp_def' from parsed descriptions of the functions
in that library. For an example, please check the *Slatec* interface in
the `Lib' tree of the PDL distribution. If you want to check (during
debugging) which calls to PP functions your perl code generated, a little
helper package comes in handy which replaces the PP functions by
identically named ones that dump their arguments to stdout.

   Just say

     perl -MPDL::PP::Dump myfile.pd

   to see the calls to `pp_def' and friends. Try it with `ops.pd' and
`slatec.pd'. If you're interested (or want to enhance it), the source is
in `Basic/Gen/PP/Dump.pm'.

Other macros and functions in the Code section
----------------------------------------------

   Macros: So far we have encountered the `$SIZE', `$GENERIC' and `$P'
macros.  Now we are going to quickly explain the other macros that are
expanded in the Code section of PDL::PP along with examples of their usage.

`$T'
     The `$T' macro is used for type switches. This is very useful when
     you have to use different external (e.g. library) functions depending
     on the input type of arguments. The general syntax is

          $Ttypeletters(type_alternatives)

     where `typeletters' is a permutation of a subset of the letters
     `BSULFD' which stand for Byte, Short, Ushort, etc. and
     `type_alternatives' are the expansions when the type of the PP
     operation is equal to that indicated by the respective letter. Let's
     illustrate this incomprehensible description by an example. Assume
     you have two C functions with prototypes

          void float_func(float *in, float *out);
          void double_func(double *in, double *out);

     which do basically the same thing but one accepts float and the other
     double pointers. You could interface them to PDL by defining a generic
     function `foofunc' (which will call the correct function depending on
     the type of the transformation):

          pp_def('foofunc',
          	Pars => ' a(n); [o] b();',
          	Code => ' $TFD(float_func,double_func) ($P(a),$P(b));',
          	GenericTypes => [F,D],
          );

     Please note that you can't say

          Code => ' $TFD(float,double)_func ($P(a),$P(b));'

     since the `$T' macro expands with trailing spaces, analogously to C
     preprocessor macros.  The slightly longer form illustrated above is
     correct.  If you really want brevity, you can of course do

          '$TBSULFD('.(join ',',map {"long_identifier_name_$_"}
          	qw/byt short unseigned lounge flotte dubble/).');'

`$PP'
     The `$PP' macro is used for a so called *physical pointer access*. The
     *physical* refers to some internal optimisations of PDL (for those who
     are familiar with the PDL core we are talking about the vaffine
     optimisations). This macro is mainly for internal use and you
     shouldn't need to use it in any of your normal code.

`$COMP' (and the OtherPars section)
     The `$COMP' macro is used to access non-pdl values in the code
     section. Its name is derived from the implementation of
     transformations in PDL. The variables you can refer to using `$COMP'
     are members of the "compiled" structure that represents the PDL
     transformation in question but does not yet contain any information
     about dimensions (for further details check *Note PDL/Internals:
     PDL/Internals,). However, you can treat `$COMP' just as a black box
     without knowing anything about the implementation of transformations
     in PDL. So when would you use this macro? Its main usage is to access
     values of arguments that are declared in the OtherPars section of a
     `pp_def' definition. But then you haven't heard about the OtherPars
     key yet?!  Let's have another example that illustrates typical usage
     of both new features:

          pp_def('pnmout',
          	Pars => 'a(m)',
          	OtherPars => "char* fd",
          	GenericTypes => [B,U,S,L],
          	Code => 'PerlIO *fp;
          		 IO *io;
          		 int len;

          		 io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
          		 if (!io || !(fp = IoIFP(io)))
          			croak("Can\'t figure out FP");

          		 len = $SIZE(m) * sizeof($GENERIC());
          		 if (PerlIO_write(fp,$P(a),len) != len)
          			croak("Error writing pnm file");
          	');

     This function is used to write data from a pdl to a file. The file
     descriptor is passed as a string into this function. This parameter
     does not go into the Pars section since it cannot be usefully treated
     like a pdl but rather into the aptly named OtherPars section.
     Parameters in the OtherPars section follow those in the Pars section
     when invoking the function, i.e.

          open FILE,">out.dat" or die "couldn't open out.dat";
          pnmout($pdl,'FILE');

     When you want to access this parameter inside the code section you
     have to tell PP by using the `$COMP' macro, i.e. you write
     `$COMP(fd)' as in the example. Otherwise PP wouldn't know that the fd
     you are referring to is the same as that specified in the OtherPars
     section.

     Another use for the OtherPars section is to set a named dimension in
     the signature. Let's have an example how that is done:

          pp_def('setdim',
          	Pars => '[o] a(n)',
          	OtherPars => 'int ns => n',
          	Code => 'loop(n) %{ $a() = n; %}',
          );

     This says that the named dimension n will be initialised from the
     value of the *other parameter* `ns' which is of integer type (I guess
     you have realised that we use the `CType From => named_dim' syntax).
     Now you can call this function in the usual way:

          setdim(($a=null),5);
          print $a;
            [ 0 1 2 3 4 ]

     Admittedly this function is not very useful but it demonstrates how it
     works. If you call the function with an existing pdl you don't need
     to explicitly specify the size of n since PDL::PP can figure it out
     from the dimensions of the non-null pdl. In that case you just give
     the dimension parameter as `-1':

          $a = hist($b);
          setdim($a,-1);

     That should do it.

   The only PP function that we have used in the examples so far is loop.
Additionally, there are currently two other functions which are recognised
in the Code section:

threadloop
     As we heard above the signature of a PP defined function defines the
     dimensions of all the pdl arguments involved in a *primitive*
     operation.  However, you often call the functions that you defined
     with PP with pdls that have more dimensions than those specified in
     the signature. In this case the primitive operation is performed on
     all subslices of appropriate dimensionality in what is called a
     threadloop (see also overview above and *Note PDL/Indexing:
     PDL/Indexing,). Assuming you have some notion of this concept you
     will probably appreciate that the operation specified in the code
     section should be optimised since this is the tightest loop inside a
     threadloop.  However, if you revisit the example where we define the
     pnmout function, you will quickly realise that looking up the IO file
     descriptor in the inner threadloop is not very efficient when writing
     a pdl with many rows. A better approach would be to look up the IO
     descriptor once outside the threadloop and use its value then inside
     the tightest threadloop. This is exactly where the threadloop
     function comes in handy. Here is an improved definition of pnmout
     which uses this function:

          pp_def('pnmout',
          	Pars => 'a(m)',
          	OtherPars => "char* fd",
          	GenericTypes => [B,U,S,L],
          	Code => 'PerlIO *fp;
          		 IO *io;
          		 int len;

          		 io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
          		 if (!io || !(fp = IoIFP(io)))
          			croak("Can\'t figure out FP");

          		 len = $SIZE(m) * sizeof($GENERIC());

          		 threadloop %{
          		     if (PerlIO_write(fp,$P(a),len) != len)
          			croak("Error writing pnm file");
          		 %}
          	');

     This works as follows. Normally the C code you write inside the Code
     section is placed inside a threadloop (i.e., PP generates the
     appropriate wrapping XS code around it). However, when you explicitly
     use the threadloop function, PDL::PP recognises this and doesn't wrap
     your code with an additional threadloop. This has the effect that
     code you write outside the threadloop is only executed once per
     transformation and just the code within the surrounding `%{ ... %}'
     pair is placed within the tightest threadloop. This also comes in
     handy when you want to perform a decision (or any other code,
     especially CPU intensive code) only once per thread, i.e.

          pp_addhdr('
            #define RAW 0
            #define ASCII 1
          ');
          pp_def('do_raworascii',
          	 Pars => 'a(); b(); [o]c()',
          	 OtherPars => 'int mode',
               Code => ' switch ($COMP(mode)) {
          		    case RAW:
          			threadloop %{
                                    /* do raw stuff */
                                %}
          		        break;
          		    case ASCII:
          			threadloop %{
                                    /* do ASCII stuff */
                                %}
          		        break;
          		    default:
          			croak("unknown mode");
          		   }'
           );

types
     The types function works similarly to the `$T' macro. However, with
     the types function the code in the following block (delimited by `%{'
     and `%}' as usual) is executed for all those cases in which the
     datatype of the operation is *any of* the types represented by the
     letters in the argument to types, e.g.

          Code => '...

          types(BSUL) %{
          		 /* do integer type operation */
                       %}
          types(FD) %{
          		 /* do floating point operation */
          %}
                       ...'
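
   The effect of an explicit threadloop, namely that a decision is taken
once per call while only the innermost work is repeated, can be sketched
in plain C. The names below are hypothetical; each row of `data' stands
for one iteration of the threadloop, and the two arithmetic operations
are mere placeholders for the "raw" and "ASCII" work:

```c
#include <assert.h>
#include <stddef.h>

enum { TL_RAW = 0, TL_ASCII = 1 };

/* data holds nrows rows of m elements each.
   Returns 0 on success and -1 for an unknown mode. */
int tl_process_rows(double *data, size_t nrows, size_t m, int mode)
{
    size_t r, i;

    assert(nrows == 0 || m == 0 || data != NULL);
    switch (mode) {                       /* decided once per call */
    case TL_RAW:
        for (r = 0; r < nrows; r++)       /* threadloop %{         */
            for (i = 0; i < m; i++)
                data[r * m + i] *= 2.0;   /* placeholder work      */
        break;                            /* %}                    */
    case TL_ASCII:
        for (r = 0; r < nrows; r++)       /* threadloop %{         */
            for (i = 0; i < m; i++)
                data[r * m + i] += 1.0;   /* placeholder work      */
        break;                            /* %}                    */
    default:
        return -1;
    }
    return 0;
}
```

   Had the switch been placed inside the row loop, the mode would be
tested once per row instead of once per call.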

Other useful PP keys in data operation definitions
--------------------------------------------------

   You have already heard about the OtherPars key. Currently, there are not
many other keys for a data operation that will be useful in normal
(whatever that is) PP programming. In fact, it would be interesting to
hear about a case where you think you need more than what is provided at
the moment.  Please speak up on one of the PDL mailing lists. Most other
keys recognised by `pp_def' are only really useful for what we call *slice
operations* (see also above).

   One feature that is definitely planned is support for a variable number
of arguments, which will be a little tricky.

   An incomplete list of the available keys:

Inplace
     Setting this key marks the routine as working inplace - ie the input
     and output piddles are the same. An example is `$a->inplace->sqrt()'
     (or `sqrt(inplace($a))').

    Inplace => 1
          Use when the routine is a unary function, such as sqrt.

    Inplace => ['a']
          If there is more than one input piddle, specify the name of
          the one that can be changed inplace using an array reference.

    Inplace => ['a','b']
          If there is more than one output piddle, specify the name of
          the input piddle and output piddle in a 2-element array
          reference. This probably isn't needed, but is left in for
          completeness.

     If bad values are being used, care must be taken to ensure the
     propagation of the badflag when inplace is being used; consider this
     excerpt from `Basic/Bad/bad.pd':

          pp_def('replacebad',HandleBad => 1,
            Pars => 'a(); [o]b();',
            OtherPars => 'double newval',
            Inplace => 1,
            CopyBadStatusCode =>
            '/* propogate badflag if inplace AND it has changed */
             if ( a == b && $ISPDLSTATEBAD(a) )
               PDL->propogate_badflag( b, 0 );

          /* always make sure the output is "good" */
          $SETPDLSTATEGOOD(b);
              ',
              ...

     Since this routine removes all bad values, the output piddle has
     its bad flag cleared. If run inplace (so `a == b'), then we have to
     tell all the children of a that the bad flag has been cleared (to
     save time we make sure that we call `PDL->propogate_badflag' only if
     the input piddle had its bad flag set).

     NOTE: one idea is that the documentation for the routine could be
     automatically flagged to indicate that it can be executed inplace, ie
     something similar to how HandleBad sets BadDoc if it's not supplied
     (it's not an ideal solution).

Other PDL::PP functions to support concise package definition
-------------------------------------------------------------

   So far, we have described the `pp_def' and `pp_done' functions. PDL::PP
exports a few other functions to aid you in writing concise PDL extension
package definitions.

pp_addhdr
.........

   Often when you interface library functions as in the above example you
have to include additional C include files. Since the XS file is generated
by PP we need some means to make PP insert the appropriate include
directives in the right place into the generated XS file.  To this end
there is the pp_addhdr function. This is also the function to use when you
want to define some C functions for internal use by some of the XS
functions (which are mostly functions defined by `pp_def').  By including
these functions here you make sure that PDL::PP inserts your code before
the point where the actual XS module section begins and will therefore be
left untouched by xsubpp (cf. *perlxs* and *perlxstut* manpages).

   A typical call would be

     pp_addhdr('
     #include <unistd.h>       /* we need defs of XXXX */
     #include "libprotos.h"    /* prototypes of library functions */
     #include "mylocaldecs.h"  /* Local decs */

     static void do_the_real_work(PDL_Byte * in, PDL_Byte * out, int n)
     {
     	/* do some calculations with the data */
     }
     ');

   This ensures that all the constants and prototypes you need will be
properly included and that you can use the internal functions defined here
in the `pp_def's, e.g.:

     pp_def('barfoo',
     	 Pars => ' a(n); [o] b(n)',
     	 GenericTypes => [B],
     	 Code => ' int ns = $SIZE(n);
     		   do_the_real_work($P(a),$P(b),ns);
                    ',
     );

pp_addpm
........

   In many cases the actual PP code (meaning the arguments to `pp_def'
calls) is only part of the package you are currently implementing. Often
there is additional perl code and XS code you would normally have written
into the pm and XS files which are now automatically generated by PP. So
how to get this stuff into those dynamically generated files? Fortunately,
there are a couple of functions, generally called `pp_addXXX' that assist
you in doing this.

   Let's assume you have additional perl code that should go into the
generated pm-file. This is easily achieved with the pp_addpm command:

     pp_addpm(<<'EOD');

     =head1 NAME

     PDL::Lib::Mylib -- a PDL interface to the Mylib library

     =head1 DESCRIPTION

     This package implements an interface to the Mylib package with full
     threading and indexing support (see L<PDL::Indexing>).

     =cut

     use PGPLOT;

     =head2 use_myfunc

     This function applies the myfunc operation to all the
     elements of the input pdl regardless of dimensions
     and returns the sum of the result.

     =cut

     sub use_myfunc {
     	my $pdl = shift;
     	myfunc($pdl->clump(-1),(my $res=null));
     	return $res->sum;
     }

     EOD

pp_add_exported
...............

   You have probably got the idea. In some cases you also want to export
your additional functions. To avoid getting into trouble with PP which
also messes around with the `@EXPORT' array you just tell PP to add your
functions to the list of exported functions:

     pp_add_exported('', 'use_myfunc gethynx');

   Note the initial empty string argument (reason for it?).

pp_add_isa
..........

   The pp_add_isa command works like the pp_add_exported function.
The arguments to pp_add_isa are added to the @ISA list, e.g.

     pp_add_isa(' Some::Other::Class ');

pp_addxs
........

   Sometimes you want to add extra XS code of your own (that is generally
not involved with any threading/indexing issues but supplies some other
functionality you want to access from the perl side) to the generated XS
file, for example

     pp_addxs('','

     # Determine endianness of machine

     int
     isbigendian()
        CODE:
          unsigned short i;
          PDL_Byte *b;

     i = 42; b = (PDL_Byte*) (void*) &i;

     if (*b == 42)
        RETVAL = 0;
     else if (*(b+1) == 42)
        RETVAL = 1;
     else
        croak("Impossible - machine is neither big nor little endian!!\n");
     OUTPUT:
       RETVAL
       ');

   pp_add_exported and pp_addxs in particular should be used with care. PP
uses PDL::Exporter, hence letting PP export your functions means that they
get added to the standard list of functions exported by default (the list
defined by the export tag ":Func"). If you use pp_addxs you shouldn't try
to do anything that involves threading or indexing directly. PP is much
better at generating the appropriate code from your definitions.
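
   The byte test used in the XS example above can also be tried outside
of XS. The following stand-alone C function is a sketch with a
hypothetical name; it uses `memcpy' to inspect the bytes of the value
instead of a pointer cast, and returns -1 rather than calling croak when
neither byte order is recognised:

```c
#include <assert.h>
#include <string.h>

/* Returns 0 on a little-endian machine, 1 on a big-endian machine
   and -1 if the byte order could not be determined. */
int is_big_endian_sketch(void)
{
    unsigned short i = 42;
    unsigned char b[sizeof i];

    assert(sizeof i >= 2);
    memcpy(b, &i, sizeof i);
    if (b[0] == 42)
        return 0;               /* low-order byte stored first */
    if (b[sizeof i - 1] == 42)
        return 1;               /* low-order byte stored last  */
    return -1;
}
```

   Code of this kind involves no pdls at all, which is what makes it a
reasonable candidate for pp_addxs.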

pp_add_boot
...........

   Finally, you may want to add some code to the BOOT section of the XS
file (if you don't know what that is check *perlxs*). This is easily done
with the pp_add_boot command:

     pp_add_boot(<<EOB);
     	descrip = mylib_initialize(KEEP_OPEN);

     	if (descrip == NULL)
     	   croak("Can't initialize library");

     	GlobalStruc->descrip = descrip;
     	GlobalStruc->maxfiles = 200;
     EOB

pp_export_nothing
.................

   By default, PP.pm puts all subs defined using the pp_def function into
the output .pm file's EXPORT list. This can create problems if you are
creating a subclassed object where you don't want any methods exported
(i.e. the methods will only be called using the $object->method syntax).

   For these cases you can call pp_export_nothing() to clear out the
export list. Example (at the end of the .pd file):

     pp_export_nothing();
     pp_done();

pp_core_importList
..................

   By default, PP.pm puts the 'use Core;' line into the output .pm file.
This imports Core's exported names into the current namespace, which can
create problems if you are over-riding one of Core's methods in the
current file.  You end up getting messages like "Warning: sub sumover
redefined in file subclass.pm" when running the program.

   For these cases the pp_core_importList can be used to change what is
imported from Core.pm.  For example:

     pp_core_importList('()')

   This would result in

     use Core();

   being generated in the output .pm file. This would result in no names
being imported from Core.pm. Similarly, calling

     pp_core_importList(' qw/ barf /')

   would result in

     use Core qw/ barf/;

   being generated in the output .pm file. This would result in just 'barf'
being imported from Core.pm.

Slice operation
===============

   The slice operation section of this manual is provided using dataflow
and lazy evaluation: when you need it, ask Tjl to write it.  A delivery
within a week of my receiving the email is 95% probable and a two-week
delivery is 99% probable.

   And anyway, the slice operations require a much more intimate knowledge
of PDL internals than the data operations. Furthermore, the complexity of
the issues involved is considerably higher than that in the average data
operation. If you would like to convince yourself of this fact take a look
at the `Basic/Slices/slices.pd' file in the PDL distribution :-).
Nevertheless, functions generated using the slice operations are at the
heart of the index manipulation and dataflow capabilities of PDL.

   Also, there are a lot of dirty issues with virtual piddles and vaffines
which we shall entirely skip here.

Slices and bad values
---------------------

   Slice operations need to be able to handle bad values (if support is
compiled into PDL). The easiest thing to do is look at
`Basic/Slices/slices.pd' to see how this works.

   Along with `BadCode', there are also the `BadBackCode' and
`BadRedoDimsCode' keys for `pp_def'. However, any `EquivCPOffsCode' should
not need changing, since any changes are absorbed into the definition of
the `$EQUIVCPOFFS()' macro (i.e. it is handled automatically by PDL::PP).

USEFUL ROUTINES
===============

   The PDL Core structure, defined in `Basic/Core/pdlcore.h.PL', contains
pointers to a number of routines that may be useful to you.  The majority
of these routines deal with manipulating piddles, but some are more
general:

PDL->qsort_B( PDL_Byte *xx, int a, int b )
     Sort the array `xx' between the indices a and b.  There are also
     versions for the other PDL datatypes, with postfix `_S', `_U', `_L',
     `_F', and `_D'.  Any module using this must ensure that `PDL::Ufunc'
     is loaded.

PDL->qsort_ind_B( PDL_Byte *xx, int *ix, int a, int b )
     As for `PDL->qsort_B', but this time sorting the indices rather than
     the data.

   The routine med2d in `Lib/Image2D/image2d.pd' shows how such routines
are used.
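
   To make the difference between sorting the data and sorting the
indices concrete, here is a minimal stand-alone sketch in C. It is not
the PDL implementation: `byte_t' and `sort_ind_byte' are hypothetical
names, and the bounds a and b are assumed to be inclusive.

```c
/* Hypothetical stand-in for PDL_Byte; the real typedef lives in the
   PDL headers. */
typedef unsigned char byte_t;

/* Reorder ix[a..b] (inclusive bounds, an assumption of this sketch) so
   that xx[ix[a]] <= ... <= xx[ix[b]].  The data in xx is never moved. */
static void sort_ind_byte(const byte_t *xx, int *ix, int a, int b)
{
    int i = a, j = b, tmp;
    byte_t pivot;

    if (a >= b)
        return;
    pivot = xx[ix[a + (b - a) / 2]];
    while (i <= j) {
        while (xx[ix[i]] < pivot) i++;
        while (xx[ix[j]] > pivot) j--;
        if (i <= j) {
            tmp = ix[i]; ix[i] = ix[j]; ix[j] = tmp;   /* swap indices only */
            i++; j--;
        }
    }
    if (a < j) sort_ind_byte(xx, ix, a, j);
    if (i < b) sort_ind_byte(xx, ix, i, b);
}
```

   The caller fills `ix' with 0, 1, ..., n-1 beforehand; afterwards
`xx[ix[k]]' walks the data in ascending order while `xx' itself is
unchanged.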

MAKEFILES FOR PP FILES
======================

   If you are going to generate a package from your PP file (typical file
extensions are `.pd' or `.pp' for the files containing PP code) it is
easiest and safest to leave generation of the appropriate commands to the
Makefile. In the following we will outline the typical format of a perl
Makefile to automatically build and install your package from a
description in a PP file. Most of the rules to build the xs, pm and other
required files from the PP file are already predefined in the
PDL::Core::Dev package. We just have to tell MakeMaker to use it.

   In most cases you can define your Makefile like

     # Makefile.PL for a package defined by PP code.

     use PDL::Core::Dev;            # Pick up development utilities
     use ExtUtils::MakeMaker;

     $package = ["mylib.pd", "Mylib", "PDL::Lib::Mylib"];
     %hash = pdlpp_stdargs($package);
     $hash{OBJECT} .= ' additional_Ccode$(OBJ_EXT) ';
     $hash{clean}->{FILES} .= ' todelete_Ccode$(OBJ_EXT) ';
     $hash{'VERSION_FROM'} = 'mylib.pd';
     WriteMakefile(%hash);

     sub MY::postamble { pdlpp_postamble($package); }

   Here, the list in $package contains, in order: the PP source file name,
the prefix for the generated files and the full package name.  You can
modify the hash in whatever way you like but it would be reasonable to
stay within some limits so that your package will continue to work with
later versions of PDL.

   If you don't want to use prepackaged arguments, here is a generic
Makefile.PL that you can adapt for your own needs:

     # Makefile.PL for a package defined by PP code.

     use PDL::Core::Dev;            # Pick up development utilities
     use ExtUtils::MakeMaker;

     WriteMakefile(
      'NAME'         => 'PDL::Lib::Mylib',
      'VERSION_FROM' => 'mylib.pd',
      'TYPEMAPS'     => [&PDL_TYPEMAP()],
      'OBJECT'       => 'Mylib$(OBJ_EXT) additional_Ccode$(OBJ_EXT)',
      'PM'           => { 'Mylib.pm' => '$(INST_LIBDIR)/Mylib.pm' },
      'INC'          => &PDL_INCLUDE(), # add include dirs as required by your lib
      'LIBS'         => [''],   # add link directives as necessary
      'clean'        => {'FILES'  =>
                          'Mylib.pm Mylib.xs Mylib$(OBJ_EXT) ' .
                          'additional_Ccode$(OBJ_EXT)'},
     );

     # Add genpp rule; this will invoke PDL::PP on our PP file
     # the argument is an array reference where the array has three string elements:
     #   arg1: name of the source file that contains the PP code
     #   arg2: basename of the xs and pm files to be generated
     #   arg3: name of the package that is to be generated
     sub MY::postamble { pdlpp_postamble(["mylib.pd", "Mylib", "PDL::Lib::Mylib"]); }

   To make life even easier PDL::Core::Dev defines the function
`pdlpp_stdargs' that returns a hash with default values that can be passed
(either directly or after appropriate modification) to a call to
WriteMakefile.  Currently, `pdlpp_stdargs' returns a hash where the keys
are filled in as follows:

     (
      'NAME'     => $mod,
      'TYPEMAPS' => [&PDL_TYPEMAP()],
      'OBJECT'   => "$pref\$(OBJ_EXT)",
      'PM'       => {"$pref.pm" => "\$(INST_LIBDIR)/$pref.pm"},
      'MAN3PODS' => {"$src" => "\$(INST_MAN3DIR)/$mod.\$(MAN3EXT)"},
      'INC'      => &PDL_INCLUDE(),
      'LIBS'     => [''],
      'clean'    => {'FILES'  => "$pref.xs $pref.pm $pref\$(OBJ_EXT)"},
     )

   Here, `$src' is the name of the source file with PP code, `$pref' the
prefix for the generated .pm and .xs files and `$mod' the name of the
extension module to generate.

INTERNALS
=========

   The internals of the current version consist of a large table which
gives the rules according to which things are translated and the subs
which implement these rules.

   Later on, it would be good to make the table modifiable by the user so
that different things may be tried.

   [Meta comment: here will hopefully be more in the future; currently,
your best bet will be to read the source code :-( or ask on the list (try
the latter first) ]

Appendix A: Some keys recognised by PDL::PP
===========================================

   Unless otherwise specified, the arguments are strings. Keys marked with
(bad) are only used if bad-value support is compiled into PDL.

Pars
     define the signature of your function

OtherPars
     arguments which are not pdls. Default: nothing.

Code
     the actual code that implements the functionality; several PP macros
     and PP functions are recognised in the string value

HandleBad (bad)
     If set to 1, the routine is assumed to support bad values and the
     code in the BadCode key is used if bad values are present; it also
     sets things up so that the `$ISBAD()' etc. macros can be used.  If set
     to 0, causes the routine to print a warning if any of the input
     piddles have their bad flag set.

BadCode (bad)
     Give the code to be used if bad values may be present in the input
     piddles.  Only used if `HandleBad => 1'.

GenericTypes
     An array reference. The array may contain any subset of the strings
     `B', `S', `U', `L', `F' and `D', which specify which types your
     operation will accept.  This is very useful (and important!) when
     interfacing an external library.  Default: [qw/B S U L F D/]

Inplace
     Mark a function as being able to work inplace.

          Inplace => 1          if  Pars => 'a(); [o]b();'
          Inplace => ['a']      if  Pars => 'a(); b(); [o]c();'
          Inplace => ['a','b']  if  Pars => 'a(); b(); [o]c(); [o]d();'

     If bad values are being used, care must be taken to ensure the
     propagation of the badflag when inplace is being used; for instance
     see the code for `replacebad' in `Basic/Bad/bad.pd'.

Doc
     Used to specify a documentation string in Pod format. See PDL::Doc
     for information on PDL documentation conventions. Note: in the
     special case where the PP 'Doc' string is one line this is implicitly
     used for the quick reference AND the documentation!

     If the Doc field is omitted PP will generate default documentation
     (after all it knows about the Signature).

     If you really want the function NOT to be documented in any way at
     this point (e.g. for an internal routine, or because you are doing
     it elsewhere in the code) explicitly specify `Doc=>undef'.

BadDoc (bad)
     Contains the text returned by the badinfo command (in `perldl') or
     the -b switch to the `pdldoc' shell script. In many cases, you will
     not need to specify this, since the information can be automatically
     created by PDL::PP. However, as befits computer-generated text, it's
     rather stilted; it may be much better to do it yourself!

Appendix B: PP macros and functions
===================================

Macros
------

   Macros labelled by (bad) are only used if bad-value support is compiled
into PDL.

$*variablename_from_sig*()
     access a pdl (by its name) that was specified in the signature

$COMP(x)
     access a value in the private data structure of this transformation
     (mainly used to access an argument that is specified in the
     `OtherPars' section)

$SIZE(n)
     replaced at runtime by the actual size of a *named* dimension (as
     specified in the signature)

$GENERIC()
     replaced by the C type that is equal to the runtime type of the
     operation

$P(a)
     a pointer access to the PDL named a in the signature. Useful for
     interfacing to C functions

$PP(a)
     a physical pointer access to pdl a; mainly for internal use

$TXXX(Alternative,Alternative)
     expansion alternatives according to runtime type of operation, where
     XXX is some string that is matched by `/[BSULFD+]/'.

$PDL(a)
     return a pointer to the pdl data structure (pdl *) of piddle a

$ISBAD(a()) (bad)
     returns true if the value stored in `a()' equals the bad value for
     this piddle.  Requires HandleBad being set to 1.

$ISGOOD(a()) (bad)
     returns true if the value stored in `a()' does not equal the bad value
     for this piddle.  Requires HandleBad being set to 1.

$SETBAD(a()) (bad)
     Sets `a()' to equal the bad value for this piddle.  Requires
     HandleBad being set to 1.
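
   The three bad-value macros can be pictured as comparisons against, and
assignments of, a sentinel value. The sketch below is plain C rather than
PP code and only mirrors the descriptions above: the helper names and the
idea of passing the bad value as an explicit argument are assumptions
made for illustration, not what PDL::PP generates.

```c
#include <stddef.h>

/* Hypothetical helpers mirroring the descriptions of $ISBAD(),
   $ISGOOD() and $SETBAD(); the bad value is passed in explicitly. */
static int  is_bad(short v, short badval)   { return v == badval; }
static int  is_good(short v, short badval)  { return !is_bad(v, badval); }
static void set_bad(short *v, short badval) { *v = badval; }

/* Sum the good elements of a[0..n-1] into *out and return how many
   elements were good.  If none were, *out is itself marked bad.
   Overflow of the short result is not handled in this sketch. */
static size_t sum_good(const short *a, size_t n, short badval, short *out)
{
    size_t i, ngood = 0;
    long sum = 0;

    for (i = 0; i < n; i++) {
        if (is_good(a[i], badval)) {
            sum += a[i];
            ngood++;
        }
    }
    if (ngood == 0)
        set_bad(out, badval);
    else
        *out = (short) sum;
    return ngood;
}
```

   The pattern of skipping bad inputs and marking the output bad when no
good input remains is one common convention, chosen here only to exercise
all three helpers.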

Functions
---------

`loop(DIMS) %{ ... %}'
     loop over named dimensions; limits are generated automatically by PP

`threadloop %{ ... %}'
     enclose following code in a threadloop

`types(TYPES) %{ ... %}'
     execute following code if type of operation is any of `TYPES'
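
   As a rough mental model, `loop(n)' behaves like a C `for' loop over a
named dimension and `threadloop' like an enclosing loop over whatever
extra dimensions the arguments carry. The sketch below is hand-written C
for a hypothetical signature `a(n); [o]b()'. It assumes contiguous data
and a single extra dimension, which is a simplification of this
illustration; the function name `sum_rows' is hypothetical.

```c
#include <stddef.h>

/* Conceptual equivalent of
       threadloop %{ loop(n) %{ ... %} %}
   for a hypothetical signature 'a(n); [o]b()'.
   a holds nthreads rows of n_size contiguous doubles; b receives one
   value per row. */
static void sum_rows(const double *a, size_t n_size, size_t nthreads,
                     double *b)
{
    size_t t, n;

    for (t = 0; t < nthreads; t++) {      /* threadloop: once per extra-dimension element */
        double acc = 0.0;
        for (n = 0; n < n_size; n++)      /* loop(n): limit is the size of dimension n */
            acc += a[t * n_size + n];
        b[t] = acc;
    }
}
```

   In PP code neither loop bound is written by hand: as stated above, the
limits are generated automatically.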

SEE ALSO
========

   *PDL*

   For the concepts of threading and slicing check *Note PDL/Indexing:
PDL/Indexing,.

   *Note PDL/Internals: PDL/Internals,

   *Note PDL/BadValues: PDL/BadValues, for information on bad values

   *perlxs*, *perlxstut*

CURRENTLY UNDOCUMENTED
======================

   RedoDimsCode, $RESIZE()

BUGS
====

   PDL::PP is still, even in its rewritten form, too complicated.  It
needs to be rethought a little as well as deconvoluted and modularized
some more (e.g. all the NS things).

   After the rewrite, this can happen little by little, though.

Undocumented functions
----------------------

   The following functions have been added since this manual was written
and are as yet undocumented

pp_export_nothing
pp_core_importList
pp_beginwrap
pp_setversion
pp_addbegin

AUTHOR
======

   Copyright(C) 1997 Tuomas J. Lukka (lukka@fas.harvard.edu), Karl
Glaazebrook (kgb@aaocbn1.aao.GOV.AU) and Christian Soeller
(c.soeller@auckland.ac.nz). All rights reserved. Although destined for
release as a man page with the standard PDL distribution, it is not public
domain. Permission is granted to freely distribute verbatim copies of this
document provided that no modifications outside of formatting be made, and
that this notice remain intact.  You are permitted and encouraged to use
its code and derivatives thereof in your own source code for fun or for
profit as you see fit.