This is /home/pdm/install/Python-2.1/Doc/lib/python-lib.info, produced
by makeinfo version 4.0 from lib.texi.

   April 15, 2001		2.1


File: python-lib.info,  Node: StringIO,  Next: cStringIO,  Prev: fpformat,  Up: String Services

Read and write strings as files
===============================

   Read and write strings as if they were files.

   This module implements a file-like class, `StringIO', that reads and
writes a string buffer (also known as a _memory file_).  See the
description of file objects for the available operations (section *Note
File Objects::).

`StringIO([buffer])'
     When a `StringIO' object is created, it can be initialized to an
     existing string by passing the string to the constructor.  If no
     string is given, the `StringIO' will start empty.

     The `StringIO' object can accept either Unicode or 8-bit strings,
     but mixing the two may take some care.  If both are used, 8-bit
     strings that cannot be interpreted as 7-bit ASCII (i.e., that use
     the 8th bit) will cause a `UnicodeError' to be raised when
     `getvalue()' is called.

   The following methods of `StringIO' objects require special mention:

`getvalue()'
     Retrieve the entire contents of the "file" at any time before the
     `StringIO' object's `close()' method is called.  See the note
     above for information about mixing Unicode and 8-bit strings; such
     mixing can cause this method to raise `UnicodeError'.

`close()'
     Free the memory buffer.
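
   As a minimal sketch of the constructor and the two methods above, the
following reads from a pre-filled buffer and then builds up a second one
with `write()'.  The `io' fallback import is an assumption added so the
sketch also runs on interpreters that no longer ship the `StringIO'
module; it is not part of this module's documentation.

```python
try:
    from StringIO import StringIO   # the module described in this section
except ImportError:
    from io import StringIO         # assumption: fallback for newer interpreters

# Initialize the buffer from an existing string and read it like a file.
source = StringIO(u"first line\nsecond line\n")
first = source.readline()
rest = source.read()

# Start empty and fill the buffer with write().
sink = StringIO()
sink.write(u"spam ")
sink.write(u"and eggs")
contents = sink.getvalue()   # must be called before close()
sink.close()                 # frees the memory buffer

print(repr(first))
print(repr(rest))
print(repr(contents))
```

   Using only Unicode strings, as here, sidesteps the mixing problem
described for `getvalue()'.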


File: python-lib.info,  Node: cStringIO,  Next: codecs,  Prev: StringIO,  Up: String Services

Faster version of `StringIO'
============================

   Faster version of `StringIO', but not subclassable.

   This module was documented by Jim Fulton <jfulton@digicool.com>.
This section was written by Fred L. Drake, Jr. <fdrake@acm.org>.

   The module `cStringIO' provides an interface similar to that of the
`StringIO' module.  Heavy use of `StringIO.StringIO' objects can be
made more efficient by using the function `StringIO()' from this module
instead.

   Since this module provides a factory function which returns objects
of built-in types, there's no way to build your own version using
subclassing.  Use the original `StringIO' module in that case.

   Unlike the memory files implemented by the `StringIO' module, those
provided by this module are not able to accept Unicode strings that
cannot be encoded as plain ASCII strings.

   The following data objects are provided as well:

`InputType'
     The type object of the objects created by calling `StringIO' with
     a string parameter.

`OutputType'
     The type object of the objects returned by calling `StringIO' with
     no parameters.

   There is a C API to the module as well; refer to the module source
for more information.


File: python-lib.info,  Node: codecs,  Next: unicodedata,  Prev: cStringIO,  Up: String Services

Codec registry and base classes
===============================

   Encode and decode data and streams.

   This module was documented by Marc-Andre Lemburg <mal@lemburg.com>.
This section was written by Marc-Andre Lemburg <mal@lemburg.com>.

   This module defines base classes for standard Python codecs (encoders
and decoders) and provides access to the internal Python codec registry
which manages the codec lookup process.

   It defines the following functions:

`register(search_function)'
     Register a codec search function. Search functions are expected to
     take one argument, the encoding name in all lower case letters, and
     return a tuple of functions `(ENCODER, DECODER, STREAM_READER,
     STREAM_WRITER)' taking the following arguments:

     ENCODER and DECODER: These must be functions or methods which have
     the same interface as the `encode()'/`decode()' methods of Codec
     instances (see Codec Interface). The functions/methods are
     expected to work in a stateless mode.

     STREAM_READER and STREAM_WRITER: These have to be factory
     functions providing the following interface:

     `factory(STREAM, ERRORS='strict')'

     The factory functions must return objects providing the interfaces
     defined by the base classes `StreamWriter' and `StreamReader',
     respectively. Stream codecs can maintain state.

     Possible values for errors are `'strict'' (raise an exception in
     case of an encoding error), `'replace'' (replace malformed data
     with a suitable replacement marker, such as `?') and `'ignore''
     (ignore malformed data and continue without further notice).

     In case a search function cannot find a given encoding, it should
     return `None'.

`lookup(encoding)'
     Looks up a codec tuple in the Python codec registry and returns the
     function tuple as defined above.

     Encodings are first looked up in the registry's cache. If not
     found, the list of registered search functions is scanned. If no
     codecs tuple is found, a `LookupError' is raised. Otherwise, the
     codecs tuple is stored in the cache and returned to the caller.
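
   The interplay of `register()' and `lookup()' can be sketched as
follows.  The search function `demo_search' and the encoding name
`demo_utf8' are made up for illustration; the sketch simply hands back
the function tuple of the existing UTF-8 codec rather than implementing
a new one.

```python
import codecs

def demo_search(encoding):
    """Hypothetical search function: knows only the made-up name 'demo_utf8'."""
    if encoding == "demo_utf8":
        # Reuse the function tuple of an existing codec.
        return codecs.lookup("utf-8")
    return None   # unknown encoding: signal "not found" to the registry

codecs.register(demo_search)

# lookup() returns (encoder, decoder, stream_reader, stream_writer).
encoder, decoder, stream_reader, stream_writer = codecs.lookup("demo_utf8")
encoded, consumed = encoder(u"caf\xe9")   # stateless: (output, length consumed)
decoded, used = decoder(encoded)

# No search function knows this name, so LookupError is raised.
try:
    codecs.lookup("no_such_encoding_for_demo")
    unknown_found = True
except LookupError:
    unknown_found = False
```

   Because results are cached by the registry, `demo_search' is normally
consulted only the first time `demo_utf8' is looked up.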

   To simplify working with encoded files or streams, the module also
defines these utility functions:

`open(filename, mode[, encoding[, errors[, buffering]]])'
     Open an encoded file using the given MODE and return a wrapped
     version providing transparent encoding/decoding.

     *Note:* The wrapped version will only accept the object format
     defined by the codecs, i.e. Unicode objects for most built-in
     codecs.  Output is also codec-dependent and will usually be
     Unicode as well.

     ENCODING specifies the encoding which is to be used for the
     file.

     ERRORS may be given to define the error handling. It defaults to
     `'strict'' which causes a `ValueError' to be raised in case an
     encoding error occurs.

     BUFFERING has the same meaning as for the built-in `open()'
     function.  It defaults to line buffered.

`EncodedFile(file, input[, output[, errors]])'
     Return a wrapped version of file which provides transparent
     encoding translation.

     Strings written to the wrapped file are interpreted according to
     the given INPUT encoding and then written to the original file as
     strings using the OUTPUT encoding. The intermediate encoding will
     usually be Unicode but depends on the specified codecs.

     If OUTPUT is not given, it defaults to INPUT.

     ERRORS may be given to define the error handling. It defaults to
     `'strict'', which causes `ValueError' to be raised in case an
     encoding error occurs.
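
   The two utility functions can be tried out with a short sketch like
the one below.  The temporary directory, the file name `demo.txt' and
the use of `io.BytesIO' as the wrapped file are assumptions chosen to
keep the example self-contained.

```python
import codecs
import io
import os
import tempfile

# codecs.open(): write Unicode text that is stored on disk as UTF-8.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.txt")   # hypothetical file name
out = codecs.open(path, "w", "utf-8")
out.write(u"gr\xfc\xdf dich\n")
out.close()

# Reading through codecs.open() decodes back to Unicode.
inp = codecs.open(path, "r", "utf-8")
text = inp.read()
inp.close()

# Reading the same file without a codec shows the encoded bytes.
raw_file = open(path, "rb")
raw_bytes = raw_file.read()
raw_file.close()

# EncodedFile(): strings written are taken to be Latin-1 and are
# stored in the wrapped file as UTF-8.
backing = io.BytesIO()
wrapped = codecs.EncodedFile(backing, "latin-1", "utf-8")
wrapped.write(b"caf\xe9")
stored = backing.getvalue()

os.remove(path)
os.rmdir(workdir)
```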

   The module also provides the following constants which are useful
for reading and writing to platform dependent files:

`BOM'

`BOM_BE'

`BOM_LE'

`BOM32_BE'

`BOM32_LE'

`BOM64_BE'

`BOM64_LE'
     These constants define the byte order marks (BOM) used in data
     streams to indicate the byte order used in the stream or file.
     `BOM' is either `BOM_BE' or `BOM_LE' depending on the platform's
     native byte order, while the others represent big endian (`_BE'
     suffix) and little endian (`_LE' suffix) byte order using 32-bit
     and 64-bit encodings.
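
   One way these constants might be used is to guess the byte order of
incoming data from its first bytes.  The helper `byte_order_from_bom' is
hypothetical and deliberately simple: it checks only `BOM_BE' and
`BOM_LE' and makes no attempt to tell the wider marks apart.

```python
import codecs
import sys

def byte_order_from_bom(data):
    """Return 'big', 'little', or None, judging by a leading BOM_BE/BOM_LE.

    Hypothetical helper: only the two basic marks are checked.
    """
    if data.startswith(codecs.BOM_BE):
        return "big"
    if data.startswith(codecs.BOM_LE):
        return "little"
    return None

# BOM is whichever of BOM_BE / BOM_LE matches the platform's byte order.
native = byte_order_from_bom(codecs.BOM)
print(native)
```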

   See also:

<http://sourceforge.net/projects/python-codecs/>
     A SourceForge project working on additional support for Asian
     codecs for use with Python.  They are in the early stages of
     development at the time of this writing -- look in their FTP area
     for downloadable files.

* Menu:

* Codec Base Classes::


File: python-lib.info,  Node: Codec Base Classes,  Prev: codecs,  Up: codecs

Codec Base Classes
------------------

   The `codecs' module defines a set of base classes which define the
interface and can also be used to easily write your own codecs for use
in Python.

   Each codec has to define four interfaces to make it usable as a
codec in Python: stateless encoder, stateless decoder, stream reader and
stream writer. The stream readers and writers typically reuse the
stateless encoder/decoder to implement the file protocols.

   The `Codec' class defines the interface for stateless
encoders/decoders.

   To simplify and standardize error handling, the `encode()' and
`decode()' methods may implement different error handling schemes by
providing the ERRORS string argument.  The following string values are
defined and implemented by all standard Python codecs:

Value                                Meaning
------                               -----
'strict'                             Raise `ValueError' (or a subclass);
                                     this is the default.
'ignore'                             Ignore the character and continue
                                     with the next.
'replace'                            Replace with a suitable replacement
                                     character; Python will use the
                                     official U+FFFD REPLACEMENT
                                     CHARACTER for the built-in Unicode
                                     codecs.
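
   The three values can be compared side by side with the ASCII codec,
which cannot represent the last character of the sample string.  This is
only an illustrative sketch; the sample text is arbitrary.

```python
text = u"caf\xe9"   # the final character is outside ASCII

# 'ignore' drops the offending character, 'replace' substitutes a marker.
ignored = text.encode("ascii", "ignore")
replaced = text.encode("ascii", "replace")

# 'strict' raises ValueError (in practice a subclass such as UnicodeError).
try:
    text.encode("ascii", "strict")
    strict_failed = False
except ValueError:
    strict_failed = True

# When decoding, 'replace' inserts U+FFFD REPLACEMENT CHARACTER.
decoded = b"caf\xe9".decode("ascii", "replace")
```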

* Menu:

* Codec Objects::
* StreamWriter Objects::
* StreamReader Objects::
* StreamReaderWriter Objects::
* StreamRecoder Objects::


File: python-lib.info,  Node: Codec Objects,  Next: StreamWriter Objects,  Prev: Codec Base Classes,  Up: Codec Base Classes

Codec Objects
.............

   The `Codec' class defines these methods which also define the
function interfaces of the stateless encoder and decoder:

`encode(input[, errors])'
     Encodes the object INPUT and returns a tuple (output object,
     length consumed).

     ERRORS defines the error handling to apply. It defaults to
     `'strict'' handling.

     The method may not store state in the `Codec' instance. Use
     `StreamCodec' for codecs which have to keep state in order to make
     encoding/decoding efficient.

     The encoder must be able to handle zero length input and return an
     empty object of the output object type in this situation.

`decode(input[, errors])'
     Decodes the object INPUT and returns a tuple (output object,
     length consumed).

     INPUT must be an object which provides the `bf_getreadbuf' buffer
     slot.  Python strings, buffer objects and memory mapped files are
     examples of objects providing this slot.

     ERRORS defines the error handling to apply. It defaults to
     `'strict'' handling.

     The method may not store state in the `Codec' instance. Use
     `StreamCodec' for codecs which have to keep state in order to make
     encoding/decoding efficient.

     The decoder must be able to handle zero length input and return an
     empty object of the output object type in this situation.
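
   A minimal sketch of a class following this interface is shown below.
`ReverseCodec' is a toy invented for illustration (it "encodes" a string
by reversing it); it keeps no state between calls and returns an empty
object of the output type for zero length input.

```python
import codecs

class ReverseCodec(codecs.Codec):
    """Toy stateless codec (hypothetical): reverses its input."""

    def encode(self, input, errors="strict"):
        # Return (output object, length consumed); nothing is stored on self.
        return input[::-1], len(input)

    def decode(self, input, errors="strict"):
        return input[::-1], len(input)

codec = ReverseCodec()
encoded, consumed = codec.encode(u"stressed")
decoded, used = codec.decode(encoded)
empty, nothing_consumed = codec.encode(u"")
```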

   The `StreamWriter' and `StreamReader' classes provide generic
working interfaces which can be used to implement new encoding
submodules very easily. See `encodings.utf_8' for an example of how
this is done.


File: python-lib.info,  Node: StreamWriter Objects,  Next: StreamReader Objects,  Prev: Codec Objects,  Up: Codec Base Classes

StreamWriter Objects
....................

   The `StreamWriter' class is a subclass of `Codec' and defines the
following methods which every stream writer must define in order to be
compatible with the Python codec registry.

`StreamWriter(stream[, errors])'
     Constructor for a `StreamWriter' instance.

     All stream writers must provide this constructor interface. They
     are free to add additional keyword arguments, but only the ones
     defined here are used by the Python codec registry.

     STREAM must be a file-like object open for writing (binary) data.

     The `StreamWriter' may implement different error handling schemes
     by providing the ERRORS keyword argument. These parameters are
     defined:

        * `'strict'' Raise `ValueError' (or a subclass); this is the
          default.

        * `'ignore'' Ignore the character and continue with the next.

        * `'replace'' Replace with a suitable replacement character.

`write(object)'
     Writes the object's contents encoded to the stream.

`writelines(list)'
     Writes the concatenated list of strings to the stream (possibly by
     reusing the `write()' method).

`reset()'
     Flushes and resets the codec buffers used for keeping state.

     Calling this method should ensure that the data on the output is
     put into a clean state, that allows appending of new fresh data
     without having to rescan the whole stream to recover state.

   In addition to the above methods, the `StreamWriter' must also
inherit all other methods and attributes from the underlying stream.
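
   As a sketch of the writer interface, the following takes the stream
writer factory from the tuple returned by `codecs.lookup()' and wraps an
in-memory stream.  Using `io.BytesIO' as the underlying stream is an
assumption made to keep the example self-contained.

```python
import codecs
import io

raw = io.BytesIO()
writer_factory = codecs.lookup("utf-8")[3]   # STREAM_WRITER entry of the tuple
writer = writer_factory(raw, "strict")

writer.write(u"gr\xfc\xdf\n")
writer.writelines([u"one\n", u"two\n"])
writer.reset()

data = raw.getvalue()
# tell() is not defined by the writer itself; it comes from the stream.
position = writer.tell()
```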


File: python-lib.info,  Node: StreamReader Objects,  Next: StreamReaderWriter Objects,  Prev: StreamWriter Objects,  Up: Codec Base Classes

StreamReader Objects
....................

   The `StreamReader' class is a subclass of `Codec' and defines the
following methods which every stream reader must define in order to be
compatible with the Python codec registry.

`StreamReader(stream[, errors])'
     Constructor for a `StreamReader' instance.

     All stream readers must provide this constructor interface. They
     are free to add additional keyword arguments, but only the ones
     defined here are used by the Python codec registry.

     STREAM must be a file-like object open for reading (binary) data.

     The `StreamReader' may implement different error handling schemes
     by providing the ERRORS keyword argument. These parameters are
     defined:

        * `'strict'' Raise `ValueError' (or a subclass); this is the
          default.

        * `'ignore'' Ignore the character and continue with the next.

        * `'replace'' Replace with a suitable replacement character.

`read([size])'
     Decodes data from the stream and returns the resulting object.

     SIZE indicates the approximate maximum number of bytes to read
     from the stream for decoding purposes. The decoder can modify this
     setting as appropriate. The default value -1 indicates to read and
     decode as much as possible.  SIZE is intended to prevent having to
     decode huge files in one step.

     The method should use a greedy read strategy meaning that it should
     read as much data as is allowed within the definition of the
     encoding and the given size, e.g.  if optional encoding endings or
     state markers are available on the stream, these should be read
     too.

`readline([size])'
     Read one line from the input stream and return the decoded data.

     Note: Unlike the `readlines()' method, this method inherits the
     line breaking knowledge from the underlying stream's `readline()'
     method - there is currently no support for line breaking using the
     codec decoder due to lack of line buffering.  Subclasses should
     however, if possible, try to implement this method using their own
     knowledge of line breaking.

     SIZE, if given, is passed as size argument to the stream's
     `readline()' method.

`readlines([sizehint])'
     Read all lines available on the input stream and return them as a
     list of lines.

     Line breaks are implemented using the codec's decoder method and
     are included in the list entries.

     SIZEHINT, if given, is passed as SIZE argument to the stream's
     `read()' method.

`reset()'
     Resets the codec buffers used for keeping state.

     Note that no stream repositioning should take place.  This method
     is primarily intended to be able to recover from decoding errors.

   In addition to the above methods, the `StreamReader' must also
inherit all other methods and attributes from the underlying stream.
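
   The reading methods can be exercised with a sketch along these lines.
The sample text and the use of `io.BytesIO' as the underlying stream are
assumptions; the reader factory is the STREAM_READER entry of the tuple
returned by `codecs.lookup()'.

```python
import codecs
import io

payload = u"alpha\nb\xeata\ngamma\n".encode("utf-8")
reader_factory = codecs.lookup("utf-8")[2]   # STREAM_READER entry of the tuple

# readline() returns one decoded line, readlines() the remaining ones.
reader = reader_factory(io.BytesIO(payload), "strict")
first_line = reader.readline()
remaining = reader.readlines()

# read() without a size decodes as much as possible in one step.
whole = reader_factory(io.BytesIO(payload)).read()
```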

   The next two base classes are included for convenience. They are not
needed by the codec registry, but may prove useful in practice.


File: python-lib.info,  Node: StreamReaderWriter Objects,  Next: StreamRecoder Objects,  Prev: StreamReader Objects,  Up: Codec Base Classes

StreamReaderWriter Objects
..........................

   The `StreamReaderWriter' allows wrapping streams which work in both
read and write modes.

   The design is such that one can use the factory functions returned by
the `lookup()' function to construct the instance.

`StreamReaderWriter(stream, Reader, Writer, errors)'
     Creates a `StreamReaderWriter' instance.  STREAM must be a
     file-like object.  READER and WRITER must be factory functions or
     classes providing the `StreamReader' and `StreamWriter' interface
     resp.  Error handling is done in the same way as defined for the
     stream readers and writers.

   `StreamReaderWriter' instances define the combined interfaces of
`StreamReader' and `StreamWriter' classes. They inherit all other
methods and attributes from the underlying stream.
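
   A short sketch of constructing such an instance from the factory
functions returned by `lookup()' follows.  The in-memory `io.BytesIO'
stream is an assumption standing in for a file opened for both reading
and writing.

```python
import codecs
import io

info = codecs.lookup("utf-8")
raw = io.BytesIO()

# info[2] and info[3] are the stream reader and stream writer factories.
stream = codecs.StreamReaderWriter(raw, info[2], info[3], "strict")

stream.write(u"h\xe9llo\n")   # encoded on the way into the stream
stream.seek(0)                # rewind before reading back
text = stream.read()          # decoded on the way out
stored = raw.getvalue()
```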


File: python-lib.info,  Node: StreamRecoder Objects,  Prev: StreamReaderWriter Objects,  Up: Codec Base Classes

StreamRecoder Objects
.....................

   The `StreamRecoder' provides a frontend-backend view of encoding
data which is sometimes useful when dealing with different encoding
environments.

   The design is such that one can use the factory functions returned by
the `lookup()' function to construct the instance.

`StreamRecoder(stream, encode, decode, Reader, Writer, errors)'
     Creates a `StreamRecoder' instance which implements a two-way
     conversion: ENCODE and DECODE work on the frontend (the input to
     `read()' and output of `write()') while READER and WRITER work on
     the backend (reading and writing to the stream).

     You can use these objects to do transparent direct recodings from
     e.g. Latin-1 to UTF-8 and back.

     STREAM must be a file-like object.

     ENCODE, DECODE must adhere to the `Codec' interface, READER,
     WRITER must be factory functions or classes providing objects of
     the `StreamReader' and `StreamWriter' interface respectively.

     ENCODE and DECODE are needed for the frontend translation, READER
     and WRITER for the backend translation.  The intermediate format
     used is determined by the two sets of codecs, e.g. the Unicode
     codecs will use Unicode as intermediate encoding.

     Error handling is done in the same way as defined for the stream
     readers and writers.

   `StreamRecoder' instances define the combined interfaces of
`StreamReader' and `StreamWriter' classes. They inherit all other
methods and attributes from the underlying stream.
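
   The Latin-1 to UTF-8 recoding mentioned above might be set up roughly
as follows.  The helper `make_recoder' and the `io.BytesIO' streams are
assumptions for illustration: the frontend speaks Latin-1, the backend
stream holds UTF-8.

```python
import codecs
import io

latin1 = codecs.lookup("latin-1")   # frontend codec
utf8 = codecs.lookup("utf-8")       # backend codec

def make_recoder(stream):
    """Hypothetical helper: Latin-1 on the frontend, UTF-8 on the stream."""
    return codecs.StreamRecoder(
        stream,
        latin1[0], latin1[1],   # ENCODE, DECODE for the frontend
        utf8[2], utf8[3],       # READER, WRITER for the backend
        "strict",
    )

# Writing: Latin-1 data goes in, UTF-8 data lands on the stream.
backend = io.BytesIO()
make_recoder(backend).write(b"caf\xe9")
stored = backend.getvalue()

# Reading: UTF-8 data on the stream comes back as Latin-1.
frontend_bytes = make_recoder(io.BytesIO(stored)).read()
```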


File: python-lib.info,  Node: unicodedata,  Prev: codecs,  Up: String Services

Unicode Database
================

   Access the Unicode Database.

   This module was documented by Marc-Andre Lemburg <mal@lemburg.com>.
This section was written by Marc-Andre Lemburg <mal@lemburg.com>.

   This module provides access to the Unicode Character Database which
defines character properties for all Unicode characters. The data in
this database is based on the `UnicodeData.txt' file version 3.0.0
which is publicly available from <ftp://ftp.unicode.org/>.

   The module uses the same names and symbols as defined by the
UnicodeData File Format 3.0.0 (see
<http://www.unicode.org/Public/UNIDATA/UnicodeData.html>).  It defines
the following functions:

`lookup(name)'
     Look up character by name.  If a character with the given name is
     found, return the corresponding Unicode character.  If not found,
     `KeyError' is raised.

`name(unichr[, default])'
     Returns the name assigned to the Unicode character UNICHR as a
     string. If no name is defined, DEFAULT is returned, or, if not
     given, `ValueError' is raised.

`decimal(unichr[, default])'
     Returns the decimal value assigned to the Unicode character UNICHR
     as integer. If no such value is defined, DEFAULT is returned, or,
     if not given, `ValueError' is raised.

`digit(unichr[, default])'
     Returns the digit value assigned to the Unicode character UNICHR
     as integer. If no such value is defined, DEFAULT is returned, or,
     if not given, `ValueError' is raised.

`numeric(unichr[, default])'
     Returns the numeric value assigned to the Unicode character UNICHR
     as float. If no such value is defined, DEFAULT is returned, or, if
     not given, `ValueError' is raised.

`category(unichr)'
     Returns the general category assigned to the Unicode character
     UNICHR as string.

`bidirectional(unichr)'
     Returns the bidirectional category assigned to the Unicode
     character UNICHR as string. If no such value is defined, an empty
     string is returned.

`combining(unichr)'
     Returns the canonical combining class assigned to the Unicode
     character UNICHR as integer. Returns `0' if no combining class is
     defined.

`mirrored(unichr)'
     Returns the mirrored property assigned to the Unicode character
     UNICHR as integer. Returns `1' if the character has been
     identified as a "mirrored" character in bidirectional text, `0'
     otherwise.

`decomposition(unichr)'
     Returns the character decomposition mapping assigned to the Unicode
     character UNICHR as string. An empty string is returned in case no
     such mapping is defined.
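
   The functions above can be tried on a handful of characters, as in
this sketch.  The characters were picked only because their properties
are easy to recognize; `"unnamed"' is an arbitrary DEFAULT value.

```python
import unicodedata

# Name lookup in both directions.
e_acute = unicodedata.lookup("LATIN SMALL LETTER E WITH ACUTE")
e_name = unicodedata.name(e_acute)
fallback = unicodedata.name(u"\x00", "unnamed")   # control character: no name

# Numeric properties.
seven = unicodedata.decimal(u"7")
superscript_two = unicodedata.digit(u"\xb2")              # SUPERSCRIPT TWO
no_decimal = unicodedata.decimal(u"\xb2", None)           # has no decimal value
one_half = unicodedata.numeric(u"\xbd")                   # VULGAR FRACTION ONE HALF

# Classification properties.
cat_A = unicodedata.category(u"A")
bidi_A = unicodedata.bidirectional(u"A")
acute_class = unicodedata.combining(u"\u0301")            # COMBINING ACUTE ACCENT
paren_mirrored = unicodedata.mirrored(u"(")
e_decomposition = unicodedata.decomposition(e_acute)
```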


File: python-lib.info,  Node: Miscellaneous Services,  Next: Generic Operating System Services,  Prev: String Services,  Up: Top

Miscellaneous Services
**********************

   The modules described in this chapter provide miscellaneous services
that are available in all Python versions.  Here's an overview:

* Menu:

* doctest::
* unittest::
* math::
* cmath::
* random::
* whrandom::
* bisect::
* array::
* ConfigParser::
* fileinput::
* xreadlines::
* calendar::
* cmd::
* shlex::


File: python-lib.info,  Node: doctest,  Next: unittest,  Prev: Miscellaneous Services,  Up: Miscellaneous Services

Test docstrings represent reality
=================================

   A framework for verifying examples in docstrings.

   This module was documented by Tim Peters
<tim_one@users.sourceforge.net>.
This section was written by Tim Peters <tim_one@users.sourceforge.net>.
This section was written by Moshe Zadka <moshez@debian.org>.

   The `doctest' module searches a module's docstrings for text that
looks like an interactive Python session, then executes all such
sessions to verify they still work exactly as shown.  Here's a complete
but small example:

     """
     This is module example.
     
     Example supplies one function, factorial.  For example,
     
     >>> factorial(5)
     120
     """
     
     def factorial(n):
         """Return the factorial of n, an exact integer >= 0.
     
         If the result is small enough to fit in an int, return an int.
         Else return a long.
     
         >>> [factorial(n) for n in range(6)]
         [1, 1, 2, 6, 24, 120]
         >>> [factorial(long(n)) for n in range(6)]
         [1, 1, 2, 6, 24, 120]
         >>> factorial(30)
         265252859812191058636308480000000L
         >>> factorial(30L)
         265252859812191058636308480000000L
         >>> factorial(-1)
         Traceback (most recent call last):
             ...
         ValueError: n must be >= 0
     
         Factorials of floats are OK, but the float must be an exact integer:
         >>> factorial(30.1)
         Traceback (most recent call last):
             ...
         ValueError: n must be exact integer
         >>> factorial(30.0)
         265252859812191058636308480000000L
     
         It must also not be ridiculously large:
         >>> factorial(1e100)
         Traceback (most recent call last):
             ...
         OverflowError: n too large
         """


         import math
         if not n >= 0:
             raise ValueError("n must be >= 0")
         if math.floor(n) != n:
             raise ValueError("n must be exact integer")
         if n+1 == n:  # e.g., 1e300
             raise OverflowError("n too large")
         result = 1
         factor = 2
         while factor <= n:
             try:
                 result *= factor
             except OverflowError:
                 result *= long(factor)
             factor += 1
         return result
     
     def _test():
         import doctest, example
         return doctest.testmod(example)
     
     if __name__ == "__main__":
         _test()

   If you run `example.py' directly from the command line, doctest works
its magic:

     $ python example.py
     $

   There's no output!  That's normal, and it means all the examples
worked.  Pass `-v' to the script, and doctest prints a detailed log of
what it's trying, and prints a summary at the end:

     $ python example.py -v
     Running example.__doc__
     Trying: factorial(5)
     Expecting: 120
     ok
     0 of 1 examples failed in example.__doc__
     Running example.factorial.__doc__
     Trying: [factorial(n) for n in range(6)]
     Expecting: [1, 1, 2, 6, 24, 120]
     ok
     Trying: [factorial(long(n)) for n in range(6)]
     Expecting: [1, 1, 2, 6, 24, 120]
     ok
     Trying: factorial(30)
     Expecting: 265252859812191058636308480000000L
     ok

   And so on, eventually ending with:

     Trying: factorial(1e100)
     Expecting:
     Traceback (most recent call last):
         ...
     OverflowError: n too large
     ok
     0 of 8 examples failed in example.factorial.__doc__
     2 items passed all tests:
        1 tests in example
        8 tests in example.factorial
     9 tests in 2 items.
     9 passed and 0 failed.
     Test passed.
     $

   That's all you need to know to start making productive use of
doctest!  Jump in.  The docstrings in doctest.py contain detailed
information about all aspects of doctest, and we'll just cover the more
important points here.

* Menu:

* Normal Usage::
* Which Docstrings Are Examined?::
* What's the Execution Context?::
* What About Exceptions?::
* Advanced Usage::
* How are Docstring Examples Recognized?::
* Warnings::
* Soapbox::


File: python-lib.info,  Node: Normal Usage,  Next: Which Docstrings Are Examined?,  Prev: doctest,  Up: doctest

Normal Usage
------------

   In normal use, end each module `M' with:

     def _test():
         import doctest, M           # replace M with your module's name
         return doctest.testmod(M)   # ditto
     
     if __name__ == "__main__":
         _test()

   Then running the module as a script causes the examples in the
docstrings to get executed and verified:

     python M.py

   This won't display anything unless an example fails, in which case
the failing example(s) and the cause(s) of the failure(s) are printed
to stdout, and the final line of output is `'Test failed.''.

   Run it with the `-v' switch instead:

     python M.py -v

   and a detailed report of all examples tried is printed to `stdout',
along with assorted summaries at the end.

   You can force verbose mode by passing `verbose=1' to testmod, or
prohibit it by passing `verbose=0'.  In either of those cases,
`sys.argv' is not examined by testmod.

   In any case, testmod returns a 2-tuple of ints `(F, T)', where F is
the number of docstring examples that failed and T is the total number
of docstring examples attempted.
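
   The return value can be inspected with a sketch like the following.
Instead of a real module `M', it builds a throwaway module object named
`demo' and supplies two one-line examples through `__test__'; both the
module and the examples are made up, and the sketch assumes an
interpreter where `types.ModuleType' can be called to create a module.

```python
import doctest
import types

demo = types.ModuleType("demo")   # hypothetical stand-in for your module M
demo.__test__ = {
    "passes": ">>> 1 + 1\n2\n",
    "fails": ">>> 1 + 1\n3\n",    # deliberately wrong expected output
}

# verbose=0 prohibits verbose mode, so only the failure is reported.
failed, attempted = doctest.testmod(demo, verbose=0)
```

   Running this prints a report for the failing example to stdout, and
the returned pair counts one failure out of two examples.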


File: python-lib.info,  Node: Which Docstrings Are Examined?,  Next: What's the Execution Context?,  Prev: Normal Usage,  Up: doctest

Which Docstrings Are Examined?
------------------------------

   See `doctest.py' for all the details.  They're unsurprising:  the
module docstring, and all function, class and method docstrings are
searched, with the exception of docstrings attached to objects with
private names.

   In addition, if `M.__test__' exists and "is true", it must be a
dict, and each entry maps a (string) name to a function object, class
object, or string.  Function and class object docstrings found from
`M.__test__' are searched even if the name is private, and strings are
searched directly as if they were docstrings.  In output, a key `K' in
`M.__test__' appears with name

           <name of M>.__test__.K

   Any classes found are recursively searched similarly, to test
docstrings in their contained methods and nested classes.  While
private names reached from `M''s globals are skipped, all names reached
from `M.__test__' are searched.
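
   A small sketch of a `__test__' dict follows.  The module object
`demo' and the function `_double' are invented for illustration, and the
sketch assumes an interpreter where `types.ModuleType' can be called to
create a module.  One entry maps to a function object with a private
name, the other to a plain string.

```python
import doctest
import types

def _double(x):
    """Hypothetical helper with a private name.

    >>> _double(21)
    42
    """
    return 2 * x

demo = types.ModuleType("demo")   # hypothetical stand-in for module M
demo._double = _double            # make the name visible to the example
demo.__test__ = {
    "double": _double,                   # function: its docstring is searched
    "arithmetic": ">>> 2 ** 5\n32\n",    # string: searched as if a docstring
}

failed, attempted = doctest.testmod(demo, verbose=0)
```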


File: python-lib.info,  Node: What's the Execution Context?,  Next: What About Exceptions?,  Prev: Which Docstrings Are Examined?,  Up: doctest

What's the Execution Context?
-----------------------------

   By default, each time testmod finds a docstring to test, it uses a
_copy_ of `M''s globals, so that running tests on a module doesn't
change the module's real globals, and so that one test in `M' can't
leave behind crumbs that accidentally allow another test to work.  This
means examples can freely use any names defined at top-level in `M',
and names defined earlier in the docstring being run.  It also means
that sloppy imports (see below) can cause examples in external
docstrings to use globals inappropriate for them.

   You can force use of your own dict as the execution context by
passing `globs=your_dict' to `testmod()' instead.  Presumably this
would be a copy of `M.__dict__' merged with the globals from other
imported modules.
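
   Passing your own dict might look like this sketch.  The module
object `demo', the name `answer' and the dict `context' are assumptions
made up for the example; the docstring example only works because
`answer' is supplied through `globs'.

```python
import doctest
import types

demo = types.ModuleType("demo")   # hypothetical stand-in for module M
demo.__test__ = {"uses_globs": ">>> answer * 2\n84\n"}

# 'answer' is not defined in the module; it comes from the globs dict.
context = {"answer": 42}
failed, attempted = doctest.testmod(demo, globs=context, verbose=0)
```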


File: python-lib.info,  Node: What About Exceptions?,  Next: Advanced Usage,  Prev: What's the Execution Context?,  Up: doctest

What About Exceptions?
----------------------

   No problem, as long as the only output generated by the example is
the traceback itself.  For example:

     >>> [1, 2, 3].remove(42)
     Traceback (most recent call last):
       File "<stdin>", line 1, in ?
     ValueError: list.remove(x): x not in list
     >>>

   Note that only the exception type and value are compared
(specifically, only the last line in the traceback).  The various
"File" lines in between can be left out (unless they add significantly
to the documentation value of the example).
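
   The following sketch runs one such example through `testmod()'.  The
module object `demo' is a made-up stand-in, and the body of the
traceback is abbreviated to `...' since only the final line is
compared.

```python
import doctest
import types

demo = types.ModuleType("demo")   # hypothetical stand-in for module M
demo.__test__ = {
    "exception": (
        ">>> raise ValueError('n must be >= 0')\n"
        "Traceback (most recent call last):\n"
        "    ...\n"
        "ValueError: n must be >= 0\n"
    ),
}

failed, attempted = doctest.testmod(demo, verbose=0)
```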


File: python-lib.info,  Node: Advanced Usage,  Next: How are Docstring Examples Recognized?,  Prev: What About Exceptions?,  Up: doctest

Advanced Usage
--------------

   `testmod()' actually creates a local instance of class `Tester',
runs appropriate methods of that class, and merges the results into
global `Tester' instance `master'.

   You can create your own instances of `Tester', and so build your own
policies, or even run methods of `master' directly.  See
`Tester.__doc__' for details.


File: python-lib.info,  Node: How are Docstring Examples Recognized?,  Next: Warnings,  Prev: Advanced Usage,  Up: doctest

How are Docstring Examples Recognized?
--------------------------------------

   In most cases a copy-and-paste of an interactive console session
works fine -- just make sure the leading whitespace is rigidly
consistent (you can mix tabs and spaces if you're too lazy to do it
right, but doctest is not in the business of guessing what you think a
tab means).

     >>> # comments are ignored
     >>> x = 12
     >>> x
     12
     >>> if x == 13:
     ...     print "yes"
     ... else:
     ...     print "no"
     ...     print "NO"
     ...     print "NO!!!"
     ...
     no
     NO
     NO!!!
     >>>

   Any expected output must immediately follow the final `'>>> '' or
`'... '' line containing the code, and the expected output (if any)
extends to the next `'>>> '' or all-whitespace line.

   The fine print:

   * Expected output cannot contain an all-whitespace line, since such a
     line is taken to signal the end of expected output.

   * Output to stdout is captured, but not output to stderr (exception
     tracebacks are captured via a different means).

   * If you continue a line via backslashing in an interactive session,
     or for any other reason use a backslash, you need to double the
     backslash in the docstring version.  This is simply because you're
     in a string, and so the backslash must be escaped for it to
     survive intact.  Like:

          >>> if "yes" == \\
          ...     "y" +   \\
          ...     "es":
          ...     print 'yes'
          yes

   * The starting column doesn't matter:

            >>> assert "Easy!"
                  >>> import math
                      >>> math.floor(1.9)
                      1.0

     and as many leading whitespace characters are stripped from the
     expected output as appeared in the initial `'>>> '' line that
     triggered it.
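
   The recognition rules above can be tried out with a minimal sketch.
It assumes a current Python interpreter whose `doctest' module provides
`DocTestParser' and `DocTestRunner' (neither is described in this
section); the function `double()' is purely illustrative.

```python
import doctest


def double(x):
    """Return twice x.

    Expected output starts right after the '>>> ' line and ends at the
    next blank line.

    >>> double(21)
    42

        >>> double("ab")  # a different starting column is fine
        'abab'
    """
    return 2 * x


# Extract the examples from the docstring and run them quietly.
parser = doctest.DocTestParser()
test = parser.get_doctest(double.__doc__, {"double": double},
                          "double", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)  # a (failed, attempted) pair
```

   If both examples are recognized and pass, `results.failed' is 0 and
`results.attempted' is 2.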


File: python-lib.info,  Node: Warnings,  Next: Soapbox,  Prev: How are Docstring Examples Recognized?,  Up: doctest

Warnings
--------

  1. Sloppy imports can cause trouble; e.g., if you do

          from XYZ import XYZclass

     then `XYZclass' is a name in `M.__dict__' too, and doctest has no
     way to know that `XYZclass' wasn't _defined_ in `M'.  So it may
     try to execute the examples in `XYZclass''s docstring, and those
     in turn may require a different set of globals to work correctly.
     I prefer to do "`import *'"-friendly imports, a la

          from XYZ import XYZclass as _XYZclass

     and then the leading underscore makes `_XYZclass' a private name so
     testmod skips it by default.  Other approaches are described in
     `doctest.py'.

  2. `doctest' is serious about requiring exact matches in expected
     output.  If even a single character doesn't match, the test fails.
     This will probably surprise you a few times, as you learn exactly
     what Python does and doesn't guarantee about output.  For example,
     when printing a dict, Python doesn't guarantee that the key-value
     pairs will be printed in any particular order, so a test like

          >>> foo()
          {"Hermione": "hippogryph", "Harry": "broomstick"}
          >>>

     is vulnerable!  One workaround is to do

          >>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
          1
          >>>

     instead.  Another is to do

          >>> d = foo().items()
          >>> d.sort()
          >>> d
          [('Harry', 'broomstick'), ('Hermione', 'hippogryph')]

     There are others, but you get the idea.

     Another bad idea is to print things that embed an object address,
     like

          >>> id(1.0) # certain to fail some of the time
          7948648
          >>>

     Floating-point numbers are also subject to small output variations
     across platforms, because Python defers to the platform C library
     for float formatting, and C libraries vary widely in quality here.

          >>> 1./7  # risky
          0.14285714285714285
          >>> print 1./7 # safer
          0.142857142857
          >>> print round(1./7, 6) # much safer
          0.142857

     Numbers of the form `I/2.**J' are safe across all platforms, and I
     often contrive doctest examples to produce numbers of that form:

          >>> 3./4  # utterly safe
          0.75

     Simple fractions are also easier for people to understand, and
     that makes for better documentation.
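
   The workarounds above can be collected into one hedged sketch.  It
assumes a current Python interpreter: `sorted()', `DocTestParser' and
`DocTestRunner' are not described in this section, and `foo()' is the
hypothetical function from the dict example.

```python
import doctest


def foo():
    # Hypothetical function whose result is a dict.
    return {"Hermione": "hippogryph", "Harry": "broomstick"}


# Examples written so that the printed output does not depend on dict
# ordering or on platform-specific float formatting.
ROBUST_EXAMPLES = """
>>> sorted(foo().items())
[('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
>>> round(1.0 / 7, 6)
0.142857
>>> 3.0 / 4
0.75
"""

parser = doctest.DocTestParser()
test = parser.get_doctest(ROBUST_EXAMPLES, {"foo": foo},
                          "robust_examples", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
```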


File: python-lib.info,  Node: Soapbox,  Prev: Warnings,  Up: doctest

Soapbox
-------

   The first word in doctest is "doc", and that's why the author wrote
doctest:  to keep documentation up to date.  It so happens that doctest
makes a pleasant unit testing environment, but that's not its primary
purpose.

   Choose docstring examples with care.  There's an art to this that
needs to be learned -- it may not be natural at first.  Examples should
add genuine value to the documentation.  A good example can often be
worth many words.  If possible, show just a few normal cases, show
endcases, show interesting subtle cases, and show an example of each
kind of exception that can be raised.  You're probably testing for
endcases and subtle cases anyway in an interactive shell:  doctest
wants to make it as easy as possible to capture those sessions, and
will verify they continue to work as designed forever after.

   If done with care, the examples will be invaluable for your users,
and will pay back the time it takes to collect them many times over as
the years go by and "things change".  I'm still amazed at how often one
of my doctest examples stops working after a "harmless" change.

   For exhaustive testing, or testing boring cases that add no value to
the docs, define a `__test__' dict instead.  That's what it's for.
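
   A rough sketch of a `__test__' dict follows.  To stay self-contained
it builds a throwaway module object named `demo' (an assumption made
for illustration) with `types.ModuleType'; in practice `__test__' would
simply be a global in your own module.  It also assumes a current
interpreter, where `testmod()' accepts a `verbose' argument and returns
a (failed, attempted) pair.

```python
import doctest
import types

# A stand-in module built at run time, so the sketch needs no files.
demo = types.ModuleType("demo")

# Boring, exhaustive cases live here instead of cluttering docstrings.
demo.__test__ = {
    "boring_cases": """
        >>> abs(-1)
        1
        >>> abs(0)
        0
        >>> abs(1)
        1
    """,
}

results = doctest.testmod(demo, verbose=False)
```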


File: python-lib.info,  Node: unittest,  Next: math,  Prev: doctest,  Up: Miscellaneous Services

Unit testing framework
======================

   Unit testing framework for Python.  This module was documented by
Steve Purcell <stephen_purcell@yahoo.com>.  This section was written by
Steve Purcell <stephen_purcell@yahoo.com> and Fred L. Drake, Jr.
<fdrake@acm.org>.

   The Python unit testing framework, often referred to as "PyUnit," is a
Python language version of JUnit, by Kent Beck and Erich Gamma.  JUnit
is, in turn, a Java version of Kent's Smalltalk testing framework.
Each is the de facto standard unit testing framework for its respective
language.

   PyUnit supports test automation, sharing of setup and shutdown code
for tests, aggregation of tests into collections, and independence of
the tests from the reporting framework.  The `unittest' module provides
classes that make it easy to support these qualities for a set of tests.

   To achieve this, PyUnit supports some important concepts:

"test fixture"
     A "test fixture" represents the preparation needed to perform one
     or more tests, and any associated cleanup actions.  This may
     involve, for example, creating temporary or proxy databases,
     directories, or starting a server process.

"test case"
     A "test case" is the smallest unit of testing.  It checks for a
     specific response to a particular set of inputs.  PyUnit provides a
     base class, `TestCase', which may be used to create new test cases.

"test suite"
     A "test suite" is a collection of test cases, test suites, or
     both.  It is used to aggregate tests that should be executed
     together.

"test runner"
     A "test runner" is a component which orchestrates the execution of
     tests and provides the outcome to the user.  The runner may use a
     graphical interface, a textual interface, or return a special
     value to indicate the results of executing the tests.

   The test case and test fixture concepts are supported through the
`TestCase' and `FunctionTestCase' classes; the former should be used
when creating new tests, and the latter can be used when integrating
existing test code with a PyUnit-driven framework.  When building test
fixtures using `TestCase', the `setUp()' and `tearDown()' methods can
be overridden to provide initialization and cleanup for the fixture.
With `FunctionTestCase', existing functions can be passed to the
constructor for these purposes.  When the test is run, the fixture
initialization is run first; if it succeeds, the cleanup method is run
after the test has been executed, regardless of the outcome of the
test.  Each instance of the `TestCase' will only be used to run a
single test method, so a new fixture is created for each test.

   Test suites are implemented by the `TestSuite' class.  This class
allows individual tests and test suites to be aggregated; when the
suite is executed, all tests added directly to the suite and in "child"
test suites are run.

   A test runner is an object that provides a single method, `run()',
which accepts a `TestCase' or `TestSuite' object as a parameter, and
returns a result object.  The class `TestResult' is provided for use as
the result object.  PyUnit provides the `TextTestRunner' as an example
test runner which reports test results on the standard error stream by
default.  Alternate runners can be implemented for other environments
(such as graphical environments) without any need to derive from a
specific class.
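
   The four concepts can be seen working together in the following
minimal sketch.  It assumes a current Python interpreter (the `io'
module and the `assertEqual()' spelling are used), and the class
`ListFixtureTestCase' is invented for illustration.

```python
import io
import unittest


class ListFixtureTestCase(unittest.TestCase):
    """Test case whose fixture is a small list, rebuilt for every test."""

    def setUp(self):
        # Fixture initialization: runs before each test method.
        self.items = [3, 1, 2]

    def tearDown(self):
        # Fixture cleanup: runs after each test method.
        self.items = None

    def testSort(self):
        self.items.sort()
        self.assertEqual(self.items, [1, 2, 3])

    def testAppend(self):
        self.items.append(4)
        self.assertEqual(len(self.items), 4)


# Test suite: aggregate the two test cases.
suite = unittest.TestSuite()
suite.addTest(ListFixtureTestCase("testSort"))
suite.addTest(ListFixtureTestCase("testAppend"))

# Test runner: report to an in-memory stream instead of standard error.
stream = io.StringIO()
result = unittest.TextTestRunner(stream=stream).run(suite)
```

   Each test method gets its own instance and therefore its own fresh
list, so `testAppend' is unaffected by the sorting done in `testSort'.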

   See also:

`PyUnit Web Site'
     The source for further information on PyUnit.

`Simple Smalltalk Testing: With Patterns'
     Kent Beck's original paper on testing frameworks using the pattern
     shared by `unittest'.

* Menu:

* Organizing test code::
* Re-using old test code::
* Classes and functions 2::
* TestCase Objects::
* TestSuite Objects::
* TestResult Objects::
* TestLoader Objects::


File: python-lib.info,  Node: Organizing test code,  Next: Re-using old test code,  Prev: unittest,  Up: unittest

Organizing test code
--------------------

   The basic building blocks of unit testing are "test cases" -- single
scenarios that must be set up and checked for correctness.  In PyUnit,
test cases are represented by instances of the `TestCase' class in the
`unittest' module. To make your own test cases you must write
subclasses of `TestCase', or use `FunctionTestCase'.

   An instance of a `TestCase'-derived class is an object that can
completely run a single test method, together with optional set-up and
tidy-up code.

   The testing code of a `TestCase' instance should be entirely self
contained, such that it can be run either in isolation or in arbitrary
combination with any number of other test cases.

   The simplest test case subclass will simply override the `runTest()'
method in order to perform specific testing code:

     import unittest
     
     class DefaultWidgetSizeTestCase(unittest.TestCase):
         def runTest(self):
             widget = Widget("The widget")
             self.failUnless(widget.size() == (50,50), 'incorrect default size')

   Note that in order to test something, we use one of the `assert*()'
or `fail*()' methods provided by the `TestCase' base class.  If the
test fails when the test case runs, an exception will be raised, and
the testing framework will identify the test case as a "failure".
Other exceptions that do not arise from checks made through the
`assert*()' and `fail*()' methods are identified by the testing
framework as "errors".

   The way to run a test case will be described later.  For now, note
that to construct an instance of such a test case, we call its
constructor without arguments:

     testCase = DefaultWidgetSizeTestCase()

   Now, such test cases can be numerous, and their set-up can be
repetitive.  In the above case, constructing a "Widget" in each of 100
Widget test case subclasses would mean unsightly duplication.

   Luckily, we can factor out such set-up code by implementing a method
called `setUp()', which the testing framework will automatically call
for us when we run the test:

     import unittest
     
     class SimpleWidgetTestCase(unittest.TestCase):
         def setUp(self):
             self.widget = Widget("The widget")
     
     class DefaultWidgetSizeTestCase(SimpleWidgetTestCase):
         def runTest(self):
             self.failUnless(self.widget.size() == (50,50),
                             'incorrect default size')
     
     class WidgetResizeTestCase(SimpleWidgetTestCase):
         def runTest(self):
             self.widget.resize(100,150)
             self.failUnless(self.widget.size() == (100,150),
                             'wrong size after resize')

   If the `setUp()' method raises an exception while the test is
running, the framework will consider the test to have suffered an
error, and the `runTest()' method will not be executed.

   Similarly, we can provide a `tearDown()' method that tidies up after
the `runTest()' method has been run:

     import unittest
     
     class SimpleWidgetTestCase(unittest.TestCase):
         def setUp(self):
             self.widget = Widget("The widget")
     
         def tearDown(self):
             self.widget.dispose()
             self.widget = None

   If `setUp()' succeeded, the `tearDown()' method will be run
regardless of whether or not `runTest()' succeeded.

   Such a working environment for the testing code is called a
"fixture".

   Often, many small test cases will use the same fixture.  In this
case, we would end up subclassing `SimpleWidgetTestCase' into many
small one-method classes such as `DefaultWidgetSizeTestCase'.  This is
time-consuming and discouraging, so in the same vein as JUnit, PyUnit
provides a simpler mechanism:

     import unittest
     
     class WidgetTestCase(unittest.TestCase):
         def setUp(self):
             self.widget = Widget("The widget")
     
         def tearDown(self):
             self.widget.dispose()
             self.widget = None
     
         def testDefaultSize(self):
             self.failUnless(self.widget.size() == (50,50),
                             'incorrect default size')
     
         def testResize(self):
             self.widget.resize(100,150)
             self.failUnless(self.widget.size() == (100,150),
                             'wrong size after resize')

   Here we have not provided a `runTest()' method, but have instead
provided two different test methods.  Class instances will now each run
one of the `test*()' methods, with `self.widget' created and destroyed
separately for each instance.  When creating an instance we must
specify the test method it is to run.  We do this by passing the method
name in the constructor:

     defaultSizeTestCase = WidgetTestCase("testDefaultSize")
     resizeTestCase = WidgetTestCase("testResize")

   Test case instances are grouped together according to the features
they test.  PyUnit provides a mechanism for this: the "test suite",
represented by the class `TestSuite' in the `unittest' module:

     widgetTestSuite = unittest.TestSuite()
     widgetTestSuite.addTest(WidgetTestCase("testDefaultSize"))
     widgetTestSuite.addTest(WidgetTestCase("testResize"))

   For the ease of running tests, as we will see later, it is a good
idea to provide in each test module a callable object that returns a
pre-built test suite:

     def suite():
         suite = unittest.TestSuite()
         suite.addTest(WidgetTestCase("testDefaultSize"))
         suite.addTest(WidgetTestCase("testResize"))
         return suite

   or even:

     class WidgetTestSuite(unittest.TestSuite):
         def __init__(self):
             unittest.TestSuite.__init__(self,map(WidgetTestCase,
                                                   ("testDefaultSize",
                                                    "testResize")))

   (The latter is admittedly not for the faint-hearted!)

   Since it is a common pattern to create a `TestCase' subclass with
many similarly named test functions, there is a convenience function
called `makeSuite()' provided in the `unittest' module that constructs
a test suite that comprises all of the test cases in a test case class:

     suite = unittest.makeSuite(WidgetTestCase,'test')

   Note that when using the `makeSuite()' function, the order in which
the various test cases will be run by the test suite is the order
determined by sorting the test function names using the `cmp()'
built-in function.
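
   The effect of name-based ordering can be illustrated with a hedged
sketch.  It uses a `TestLoader' object rather than `makeSuite()', on
the assumption that the loader sorts names the same way and because
`makeSuite()' may be unavailable on newer interpreters; the class
`OrderingTestCase' is invented for illustration.

```python
import unittest


class OrderingTestCase(unittest.TestCase):
    # Defined out of alphabetical order on purpose.
    def testZebra(self):
        pass

    def testApple(self):
        pass

    def testMango(self):
        pass

    def helper(self):
        # Not collected: the name lacks the 'test' prefix.
        pass


loader = unittest.TestLoader()
# Names come back sorted, not in definition order.
names = list(loader.getTestCaseNames(OrderingTestCase))
suite = loader.loadTestsFromTestCase(OrderingTestCase)
```

   Because the order depends only on the method names, tests should not
rely on running in the order in which they were written.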

   Often it is desirable to group suites of test cases together, so as
to run tests for the whole system at once.  This is easy, since
`TestSuite' instances can be added to a `TestSuite' just as `TestCase'
instances can be added to a `TestSuite':

     suite1 = module1.TheTestSuite()
     suite2 = module2.TheTestSuite()
     alltests = unittest.TestSuite((suite1, suite2))
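
   A self-contained sketch of nested suites follows; the two test case
classes stand in for the hypothetical `module1' and `module2' above.

```python
import unittest


class AlphaTestCase(unittest.TestCase):
    def testOne(self):
        self.assertEqual(1, 1)

    def testTwo(self):
        self.assertEqual(2, 2)


class BetaTestCase(unittest.TestCase):
    def testThree(self):
        self.assertEqual(3, 3)


suite1 = unittest.TestSuite([AlphaTestCase("testOne"),
                             AlphaTestCase("testTwo")])
suite2 = unittest.TestSuite([BetaTestCase("testThree")])

# A suite of suites: running it runs every test in the child suites.
alltests = unittest.TestSuite((suite1, suite2))
total = alltests.countTestCases()

result = unittest.TestResult()
alltests.run(result)
```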

   You can place the definitions of test cases and test suites in the
same modules as the code they are to test (e.g. `widget.py'), but there
are several advantages to placing the test code in a separate module,
such as `widgettests.py':

   * The test module can be run standalone from the command line.

   * The test code can more easily be separated from shipped code.

   * There is less temptation to change test code to fit the code it
     tests without a good reason.

   * Test code should be modified much less frequently than the code it
     tests.

   * Tested code can be refactored more easily.

   * Tests for modules written in C must be in separate modules anyway,
     so why not be consistent?

   * If the testing strategy changes, there is no need to change the
     source code.


File: python-lib.info,  Node: Re-using old test code,  Next: Classes and functions 2,  Prev: Organizing test code,  Up: unittest

Re-using old test code
----------------------

   Some users will find that they have existing test code that they
would like to run from PyUnit, without converting every old test
function to a `TestCase' subclass.

   For this reason, PyUnit provides a `FunctionTestCase' class.  This
subclass of `TestCase' can be used to wrap an existing test function.
Set-up and tear-down functions can also optionally be wrapped.

   Given the following test function:

     def testSomething():
         something = makeSomething()
         assert something.name is not None
         # ...

   one can create an equivalent test case instance as follows:

     testcase = unittest.FunctionTestCase(testSomething)

   If there are additional set-up and tear-down methods that should be
called as part of the test case's operation, they can also be provided:

     testcase = unittest.FunctionTestCase(testSomething,
                                          setUp=makeSomethingDB,
                                          tearDown=deleteSomethingDB)

   *Note:*  PyUnit supports the use of `AssertionError' as an indicator
of test failure, but does not recommend it.  Future versions may treat
`AssertionError' differently.

