This is /home/pdm/install/Python-2.1/Doc/lib/python-lib.info, produced
by makeinfo version 4.0 from lib.texi.

   April 15, 2001		2.1


File: python-lib.info,  Node: MultiFile Objects,  Next: MultiFile Example,  Prev: multifile,  Up: multifile

MultiFile Objects
-----------------

   A `MultiFile' instance has the following methods:

`readline(str)'
     Read a line.  If the line is data (not a section-divider or
     end-marker or real EOF) return it.  If the line matches the
     most-recently-stacked boundary, return `''' and set `self.last' to
     1 or 0 according as the match is or is not an end-marker.  If the
     line matches any other stacked boundary, raise an error.  On
     encountering end-of-file on the underlying stream object, the
     method raises `Error' unless all boundaries have been popped.

`readlines(str)'
     Return all lines remaining in this part as a list of strings.

`read()'
     Read all lines, up to the next section.  Return them as a single
     (multiline) string.  Note that this doesn't take a size argument!

`seek(pos[, whence])'
     Seek.  Seek indices are relative to the start of the current
     section.  The POS and WHENCE arguments are interpreted as for a
     file seek.

`tell()'
     Return the file position relative to the start of the current
     section.

`next()'
     Skip lines to the next section (that is, read lines until a
     section-divider or end-marker has been consumed).  Return true if
     there is such a section, false if an end-marker is seen.  Re-enable
     the most-recently-pushed boundary.

`is_data(str)'
     Return true if STR is data and false if it might be a section
     boundary.  As written, it tests for a prefix other than `'--'' at
     start of line (which all MIME boundaries have) but it is declared
     so it can be overridden in derived classes.

     Note that this test is intended as a fast guard for the real
     boundary tests; if it always returns false it will merely slow
     processing, not cause it to fail.

`push(str)'
     Push a boundary string.  When an appropriately decorated version of
     this boundary is found as an input line, it will be interpreted as
     a section-divider or end-marker.  All subsequent reads will return
     the empty string to indicate end-of-file, until a call to `pop()'
     removes the boundary or a `next()' call reenables it.

     It is possible to push more than one boundary.  Encountering the
     most-recently-pushed boundary will return EOF; encountering any
     other boundary will raise an error.

`pop()'
     Pop a section boundary.  This boundary will no longer be
     interpreted as EOF.

`section_divider(str)'
     Turn a boundary into a section-divider line.  By default, this
     method prepends `'--'' (which MIME section boundaries have) but
     it is declared so it can be overridden in derived classes.  This
     method need not append LF or CR-LF, as comparison with the result
     ignores trailing whitespace.

`end_marker(str)'
     Turn a boundary string into an end-marker line.  By default, this
     method prepends `'--'' and appends `'--'' (like a
     MIME-multipart end-of-message marker) but it is declared so it can
     be overridden in derived classes.  This method need not append
     LF or CR-LF, as comparison with the result ignores trailing
     whitespace.

   Finally, `MultiFile' instances have two public instance variables:

`level'
     Nesting depth of the current part.

`last'
     True if the last end-of-file was for an end-of-message marker.
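
   The default decoration and matching rules described above can be
sketched in a few lines.  This is a standalone illustration, not the
module's own code: the `classify()' helper and its return values are
hypothetical, and only the `'--'' prefix and suffix and the
trailing-whitespace rule are taken from the descriptions of
`is_data()', `section_divider()' and `end_marker()'.

```python
def section_divider(boundary):
    # Default decoration: prepend '--' to the boundary string.
    return "--" + boundary


def end_marker(boundary):
    # Default decoration: prepend and append '--'.
    return "--" + boundary + "--"


def is_data(line):
    # Fast guard: a line that does not start with '--' cannot be a boundary.
    return line[:2] != "--"


def classify(line, boundary):
    """Hypothetical helper: label a line as 'data', 'divider' or 'end'."""
    if is_data(line):
        return "data"
    # Trailing whitespace (LF or CR-LF) is ignored in the comparison.
    marker = line.rstrip()
    if marker == section_divider(boundary):
        return "divider"
    if marker == end_marker(boundary):
        return "end"
    # Starts with '--' but matches neither form, so it is ordinary data.
    return "data"


for sample in ["hello\n", "--XYZ\r\n", "--XYZ--\n", "--something else\n"]:
    print(repr(sample), classify(sample, "XYZ"))
```

   Overriding the two decoration functions in a derived class is what
lets the same matching logic handle boundary styles other than MIME's.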


File: python-lib.info,  Node: MultiFile Example,  Prev: MultiFile Objects,  Up: multifile

`MultiFile' Example
-------------------

   This section was written by Skip Montanaro <skip@mojam.com>.

     import mimetools
     import multifile
     import StringIO
     
     def extract_mime_part_matching(stream, mimetype):
         """Return the first element in a multipart MIME message on stream
         matching mimetype."""
     
         msg = mimetools.Message(stream)
         msgtype = msg.gettype()
         params = msg.getplist()
     
         data = StringIO.StringIO()
         if msgtype[:10] == "multipart/":
     
             file = multifile.MultiFile(stream)
             file.push(msg.getparam("boundary"))
             while file.next():
                 submsg = mimetools.Message(file)
                 try:
                     data = StringIO.StringIO()
                     mimetools.decode(file, data, submsg.getencoding())
                 except ValueError:
                     continue
                 if submsg.gettype() == mimetype:
                     break
             file.pop()
         return data.getvalue()


File: python-lib.info,  Node: binhex,  Next: uu,  Prev: multifile,  Up: Internet Data Handling

Encode and decode binhex4 files
===============================

   Encode and decode files in binhex4 format.

   This module encodes and decodes files in binhex4 format, a format
allowing representation of Macintosh files in ASCII. On the Macintosh,
both forks of a file and the finder information are encoded (or
decoded); on other platforms only the data fork is handled.

   The `binhex' module defines the following functions:

`binhex(input, output)'
     Convert a binary file with filename INPUT to binhex file OUTPUT.
     The OUTPUT parameter can either be a filename or a file-like
     object (any object supporting a `write()' and `close()' method).

`hexbin(input[, output])'
     Decode a binhex file INPUT. INPUT may be a filename or a file-like
     object supporting `read()' and `close()' methods.  The resulting
     file is written to a file named OUTPUT, unless the argument is
     omitted in which case the output filename is read from the binhex
     file.

   See also:

   *Note binascii:: Support module containing ASCII-to-binary and
binary-to-ASCII conversions.

* Menu:

* Notes::


File: python-lib.info,  Node: Notes,  Prev: binhex,  Up: binhex

Notes
-----

   There is an alternative, more powerful interface to the coder and
decoder; see the source for details.

   If you code or decode text files on non-Macintosh platforms they will
still use the Macintosh newline convention (carriage-return as end of
line).

   As of this writing, `hexbin()' appears to not work in all cases.


File: python-lib.info,  Node: uu,  Next: binascii,  Prev: binhex,  Up: Internet Data Handling

Encode and decode uuencode files
================================

   Encode and decode files in uuencode format.

   This module was documented by Lance Ellinghouse.

   This module encodes and decodes files in uuencode format, allowing
arbitrary binary data to be transferred over ASCII-only connections.
Wherever a file argument is expected, the methods accept a file-like
object.  For backwards compatibility, a string containing a pathname is
also accepted, and the corresponding file will be opened for reading
and writing; the pathname `'-'' is understood to mean the standard
input or output.  However, this interface is deprecated; it's better
for the caller to open the file itself, and be sure that, when
required, the mode is `'rb'' or `'wb'' on Windows or DOS.

   This code was contributed by Lance Ellinghouse, and modified by Jack
Jansen.

   The `uu' module defines the following functions:

`encode(in_file, out_file[, name[, mode]])'
     Uuencode file IN_FILE into file OUT_FILE.  The uuencoded file will
     have the header specifying NAME and MODE as the defaults for the
     results of decoding the file. The defaults are taken from
     IN_FILE, or `'-'' and `0666' respectively.

`decode(in_file[, out_file[, mode]])'
     This call decodes uuencoded file IN_FILE placing the result on
     file OUT_FILE. If OUT_FILE is a pathname, MODE is used to set the
     permission bits if the file must be created. Defaults for OUT_FILE
     and MODE are taken from the uuencode header.
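
   To make the file layout concrete, here is a rough sketch of a
uuencoded stream built from `binascii.b2a_uu()' and read back with
`binascii.a2b_uu()'.  It is not the `uu' module's implementation: the
`uuencode()' and `uudecode()' helpers are hypothetical, they work on
in-memory data rather than files, and the code assumes a current
Python 3 interpreter, where these functions operate on bytes objects.

```python
import binascii


def uuencode(data, name="-", mode=0o666):
    """Hypothetical helper: return DATA as a complete uuencoded stream."""
    # Header line carrying the defaults used when decoding.
    lines = [("begin %o %s\n" % (mode, name)).encode("ascii")]
    # Body lines hold at most 45 binary bytes each.
    for start in range(0, len(data), 45):
        lines.append(binascii.b2a_uu(data[start:start + 45]))
    # A zero-length line followed by the trailer.
    lines.append(b" \nend\n")
    return b"".join(lines)


def uudecode(stream):
    """Hypothetical helper: decode a stream produced by uuencode()."""
    chunks = []
    for line in stream.splitlines(True)[1:]:   # skip the 'begin' header
        if line.strip() == b"end":
            break
        chunks.append(binascii.a2b_uu(line))
    return b"".join(chunks)


payload = bytes(range(100))
stream = uuencode(payload)
print(stream.decode("ascii"))
print(uudecode(stream) == payload)
```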

   See also:

   *Note binascii:: Support module containing ASCII-to-binary and
binary-to-ASCII conversions.


File: python-lib.info,  Node: binascii,  Next: xdrlib,  Prev: uu,  Up: Internet Data Handling

Convert between binary and ASCII
================================

   Tools for converting between binary and various ASCII-encoded binary
representations.

   The `binascii' module contains a number of methods to convert
between binary and various ASCII-encoded binary representations.
Normally, you will not use these functions directly but use wrapper
modules like `uu' or `binhex' instead; this module exists solely
because bit-manipulation of large amounts of data is slow in Python.

   The `binascii' module defines the following functions:

`a2b_uu(string)'
     Convert a single line of uuencoded data back to binary and return
     the binary data. Lines normally contain 45 (binary) bytes, except
     for the last line. Line data may be followed by whitespace.

`b2a_uu(data)'
     Convert binary data to a line of ASCII characters; the return value
     is the converted line, including a newline char. The length of
     DATA should be at most 45.

`a2b_base64(string)'
     Convert a block of base64 data back to binary and return the
     binary data. More than one line may be passed at a time.

`b2a_base64(data)'
     Convert binary data to a line of ASCII characters in base64 coding.
     The return value is the converted line, including a newline char.
     The length of DATA should be at most 57 to adhere to the base64
     standard.
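
     The 57-byte limit follows from the arithmetic of the encoding:
     every 3 input bytes become 4 output characters, so 57 bytes
     yield a 76-character line.  A minimal sketch of encoding a
     longer buffer one line at a time, assuming a current Python 3
     interpreter where these functions take and return bytes objects
     (the variable names are illustrative only):

```python
import binascii

data = bytes(range(200))

# Encode in 57-byte pieces so that each output line is 76 characters
# long, plus the newline appended by b2a_base64().
lines = [binascii.b2a_base64(data[i:i + 57])
         for i in range(0, len(data), 57)]
encoded = b"".join(lines)

# a2b_base64() accepts more than one line at a time.
decoded = binascii.a2b_base64(encoded)

print(len(lines[0]))
print(decoded == data)
```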

`a2b_hqx(string)'
     Convert binhex4 formatted ASCII data to binary, without doing
     RLE-decompression. The string should contain a complete number of
     binary bytes, or (in case of the last portion of the binhex4 data)
     have the remaining bits zero.

`rledecode_hqx(data)'
     Perform RLE-decompression on the data, as per the binhex4
     standard. The algorithm uses `0x90' after a byte as a repeat
     indicator, followed by a count. A count of `0' specifies a byte
     value of `0x90'. The routine returns the decompressed data, unless
     the input data ends in an orphaned repeat indicator, in which
     case the `Incomplete' exception is raised.

`rlecode_hqx(data)'
     Perform binhex4 style RLE-compression on DATA and return the
     result.

`b2a_hqx(data)'
     Perform binhex4 binary-to-ASCII translation and return the
     resulting string. The argument should already be RLE-coded, and
     have a length divisible by 3 (except possibly the last fragment).

`crc_hqx(data, crc)'
     Compute the binhex4 crc value of DATA, starting with an initial
     CRC and returning the result.

`crc32(data[, crc])'
     Compute CRC-32, the 32-bit checksum of data, starting with an
     initial crc.  This is consistent with the ZIP file checksum.  Use
     as follows:
              print binascii.crc32("hello world")
              # Or, in two pieces:
              crc = binascii.crc32("hello")
              crc = binascii.crc32(" world", crc)
              print crc

`b2a_hex(data)'

`hexlify(data)'
     Return the hexadecimal representation of the binary DATA.  Every
     byte of DATA is converted into the corresponding 2-digit hex
     representation.  The resulting string is therefore twice as long as
     the length of DATA.

`a2b_hex(hexstr)'

`unhexlify(hexstr)'
     Return the binary data represented by the hexadecimal string
     HEXSTR.  This function is the inverse of `b2a_hex()'.  HEXSTR must
     contain an even number of hexadecimal digits (which can be upper
     or lower case), otherwise a `TypeError' is raised.
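
     A short round trip through the two hexadecimal conversions,
     shown as a sketch for a current Python 3 interpreter (an
     assumption; there the functions take and return bytes objects):

```python
import binascii

raw = b"\x00\xffAZ"

# Each input byte becomes two hexadecimal digits.
hexed = binascii.hexlify(raw)
print(hexed)
print(len(hexed) == 2 * len(raw))

# Upper-case and lower-case digits are both accepted when decoding.
restored = binascii.unhexlify(b"00FF415A")
print(restored == raw)
```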

`Error'
     Exception raised on errors. These are usually programming errors.

`Incomplete'
     Exception raised on incomplete data. These are usually not
     programming errors, but may be handled by reading a little more
     data and trying again.

   See also:

   *Note base64:: Support for base64 encoding used in MIME email
messages.

   *Note binhex:: Support for the binhex format used on the Macintosh.

   *Note uu:: Support for UU encoding used on UNIX.


File: python-lib.info,  Node: xdrlib,  Next: mailcap,  Prev: binascii,  Up: Internet Data Handling

Encode and decode XDR data
==========================

   Encoders and decoders for the External Data Representation (XDR).

   The `xdrlib' module supports the External Data Representation
Standard as described in RFC 1014, written by Sun Microsystems, Inc.
June 1987.  It supports most of the data types described in the RFC.

   The `xdrlib' module defines two classes, one for packing variables
into XDR representation, and another for unpacking from XDR
representation.  There are also two exception classes.

`Packer()'
     `Packer' is the class for packing data into XDR representation.
     The `Packer' class is instantiated with no arguments.

`Unpacker(data)'
     `Unpacker' is the complementary class which unpacks XDR data
     values from a string buffer.  The input buffer is given as DATA.

   See also:

*RFC1014 XDR: External Data Representation Standard*
     This RFC defined the encoding of data which was XDR at the time
     this module was originally written.  It has apparently been
     obsoleted by RFC 1832.

*RFC1832 XDR: External Data Representation Standard*
     Newer RFC that provides a revised definition of XDR.

* Menu:

* Packer Objects::
* Unpacker Objects::
* Exceptions::


File: python-lib.info,  Node: Packer Objects,  Next: Unpacker Objects,  Prev: xdrlib,  Up: xdrlib

Packer Objects
--------------

   `Packer' instances have the following methods:

`get_buffer()'
     Returns the current pack buffer as a string.

`reset()'
     Resets the pack buffer to the empty string.

   In general, you can pack any of the most common XDR data types by
calling the appropriate `pack_TYPE()' method.  Each method takes a
single argument, the value to pack.  The following simple data type
packing methods are supported: `pack_uint()', `pack_int()',
`pack_enum()', `pack_bool()', `pack_uhyper()', and `pack_hyper()'.

`pack_float(value)'
     Packs the single-precision floating point number VALUE.

`pack_double(value)'
     Packs the double-precision floating point number VALUE.

   The following methods support packing strings, bytes, and opaque
data:

`pack_fstring(n, s)'
     Packs a fixed length string, S.  N is the length of the string but
     it is _not_ packed into the data buffer.  The string is padded
     with null bytes if necessary to guarantee 4 byte alignment.

`pack_fopaque(n, data)'
     Packs a fixed length opaque data stream, similarly to
     `pack_fstring()'.

`pack_string(s)'
     Packs a variable length string, S.  The length of the string is
     first packed as an unsigned integer, then the string data is packed
     with `pack_fstring()'.

`pack_opaque(data)'
     Packs a variable length opaque data string, similarly to
     `pack_string()'.

`pack_bytes(bytes)'
     Packs a variable length byte stream, similarly to `pack_string()'.

   The following methods support packing arrays and lists:

`pack_list(list, pack_item)'
     Packs a LIST of homogeneous items.  This method is useful for
     lists with an indeterminate size; i.e. the size is not available
     until the entire list has been walked.  For each item in the list,
     an unsigned integer `1' is packed first, followed by the data value
     from the list.  PACK_ITEM is the function that is called to pack
     the individual item.  At the end of the list, an unsigned integer
     `0' is packed.

     For example, to pack a list of integers, the code might appear like
     this:

          import xdrlib
          p = xdrlib.Packer()
          p.pack_list([1, 2, 3], p.pack_int)

`pack_farray(n, array, pack_item)'
     Packs a fixed length list (ARRAY) of homogeneous items.  N is the
     length of the list; it is _not_ packed into the buffer, but a
     `ValueError' exception is raised if `len(ARRAY)' is not equal to
     N.  As above, PACK_ITEM is the function used to pack each element.

`pack_array(list, pack_item)'
     Packs a variable length LIST of homogeneous items.  First, the
     length of the list is packed as an unsigned integer, then each
     element is packed as in `pack_farray()' above.
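
   The byte layouts described above can be illustrated without the
module itself.  The following is a hand-written sketch built on
`struct', not `xdrlib''s implementation; the function names merely
mirror the `Packer' methods, each function returns the packed bytes
instead of appending to a buffer, and the code assumes a current
Python 3 interpreter.  XDR integers are 4 bytes, big-endian.

```python
import struct


def pack_uint(value):
    return struct.pack(">I", value)      # 4-byte big-endian unsigned


def pack_int(value):
    return struct.pack(">i", value)      # 4-byte big-endian signed


def pack_fstring(n, s):
    # N is not written to the output; the data is only padded with
    # null bytes up to a multiple of 4.
    if len(s) != n:
        raise ValueError("wrong string length")
    return s + b"\0" * ((4 - n % 4) % 4)


def pack_string(s):
    # Length first, as an unsigned integer, then the padded data.
    return pack_uint(len(s)) + pack_fstring(len(s), s)


def pack_list(items, pack_item):
    # A flag of 1 precedes every item; a flag of 0 ends the list.
    out = b""
    for item in items:
        out += pack_uint(1) + pack_item(item)
    return out + pack_uint(0)


print(pack_string(b"hello"))
print(pack_list([1, 2, 3], pack_int))
```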


File: python-lib.info,  Node: Unpacker Objects,  Next: Exceptions,  Prev: Packer Objects,  Up: xdrlib

Unpacker Objects
----------------

   The `Unpacker' class offers the following methods:

`reset(data)'
     Resets the string buffer with the given DATA.

`get_position()'
     Returns the current unpack position in the data buffer.

`set_position(position)'
     Sets the data buffer unpack position to POSITION.  You should be
     careful about using `get_position()' and `set_position()'.

`get_buffer()'
     Returns the current unpack data buffer as a string.

`done()'
     Indicates unpack completion.  Raises an `Error' exception if all
     of the data has not been unpacked.

   In addition, every data type that can be packed with a `Packer', can
be unpacked with an `Unpacker'.  Unpacking methods are of the form
`unpack_TYPE()', and take no arguments.  They return the unpacked
object.

`unpack_float()'
     Unpacks a single-precision floating point number.

`unpack_double()'
     Unpacks a double-precision floating point number, similarly to
     `unpack_float()'.

   In addition, the following methods unpack strings, bytes, and opaque
data:

`unpack_fstring(n)'
     Unpacks and returns a fixed length string.  N is the number of
     characters expected.  Padding with null bytes to guarantee 4 byte
     alignment is assumed.

`unpack_fopaque(n)'
     Unpacks and returns a fixed length opaque data stream, similarly to
     `unpack_fstring()'.

`unpack_string()'
     Unpacks and returns a variable length string.  The length of the
     string is first unpacked as an unsigned integer, then the string
     data is unpacked with `unpack_fstring()'.

`unpack_opaque()'
     Unpacks and returns a variable length opaque data string,
     similarly to `unpack_string()'.

`unpack_bytes()'
     Unpacks and returns a variable length byte stream, similarly to
     `unpack_string()'.

   The following methods support unpacking arrays and lists:

`unpack_list(unpack_item)'
     Unpacks and returns a list of homogeneous items.  The list is
     unpacked one element at a time by first unpacking an unsigned
     integer flag.  If the flag is `1', then the item is unpacked and
     appended to the list.  A flag of `0' indicates the end of the
     list.  UNPACK_ITEM is the function that is called to unpack the
     items.

`unpack_farray(n, unpack_item)'
     Unpacks and returns (as a list) a fixed length array of homogeneous
     items.  N is number of list elements to expect in the buffer.  As
     above, UNPACK_ITEM is the function used to unpack each element.

`unpack_array(unpack_item)'
     Unpacks and returns a variable length LIST of homogeneous items.
     First, the length of the list is unpacked as an unsigned integer,
     then each element is unpacked as in `unpack_farray()' above.
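
   The reading side can be sketched the same way.  The `Reader' class
below is hypothetical and built on `struct'; it is not the `Unpacker'
class, although its method names follow the ones documented above.
It assumes a current Python 3 interpreter and big-endian 4-byte XDR
integers.

```python
import struct


class Reader:
    """Hypothetical stand-in showing how the unpack methods walk a buffer."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def unpack_uint(self):
        (value,) = struct.unpack_from(">I", self.data, self.pos)
        self.pos += 4
        return value

    def unpack_fstring(self, n):
        start = self.pos
        # Skip the null padding that rounds N up to a multiple of 4.
        self.pos += (n + 3) // 4 * 4
        return self.data[start:start + n]

    def unpack_string(self):
        # The length comes first, then the padded data.
        return self.unpack_fstring(self.unpack_uint())

    def unpack_list(self, unpack_item):
        items = []
        while True:
            flag = self.unpack_uint()
            if flag == 0:            # end of list
                return items
            if flag != 1:
                raise ValueError("bad list flag: %r" % flag)
            items.append(unpack_item())


r = Reader(b"\x00\x00\x00\x05hello\x00\x00\x00")
print(r.unpack_string(), r.pos)

packed = struct.pack(">5I", 1, 10, 1, 20, 0)
r = Reader(packed)
print(r.unpack_list(r.unpack_uint))
```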


File: python-lib.info,  Node: Exceptions,  Prev: Unpacker Objects,  Up: xdrlib

Exceptions
----------

   Exceptions in this module are coded as class instances:

`Error'
     The base exception class.  `Error' has a single public data member
     `msg' containing the description of the error.

`ConversionError'
     Class derived from `Error'.  Contains no additional instance
     variables.

   Here is an example of how you would catch one of these exceptions:

     import xdrlib
     p = xdrlib.Packer()
     try:
         p.pack_double(8.01)
     except xdrlib.ConversionError, instance:
         print 'packing the double failed:', instance.msg


File: python-lib.info,  Node: mailcap,  Next: mimetypes,  Prev: xdrlib,  Up: Internet Data Handling

Mailcap file handling.
======================

   Mailcap file handling.

   Mailcap files are used to configure how MIME-aware applications such
as mail readers and Web browsers react to files with different MIME
types. (The name "mailcap" is derived from the phrase "mail
capability".)  For example, a mailcap file might contain a line like
`video/mpeg; xmpeg %s'.  Then, if the user encounters an email message
or Web document with the MIME type `video/mpeg', `%s' will be replaced
by a filename (usually one belonging to a temporary file) and the
`xmpeg' program can be automatically started to view the file.

   The mailcap format is documented in RFC 1524, "A User Agent
Configuration Mechanism For Multimedia Mail Format Information," but is
not an Internet standard.  However, mailcap files are supported on most
UNIX systems.

`findmatch(caps, MIMEtype[, key[, filename[, plist]]])'
     Return a 2-tuple; the first element is a string containing the
     command line to be executed (which can be passed to
     `os.system()'), and the second element is the mailcap entry for a
     given MIME type.  If no matching MIME type can be found, `(None,
     None)' is returned.

     KEY is the name of the field desired, which represents the type of
     activity to be performed; the default value is 'view', since in the
     most common case you simply want to view the body of the MIME-typed
     data.  Other possible values might be 'compose' and 'edit', if you
     wanted to create a new body of the given MIME type or alter the
     existing body data.  See RFC 1524 for a complete list of these
     fields.

     FILENAME is the filename to be substituted for `%s' in the command
     line; the default value is `'/dev/null'' which is almost certainly
     not what you want, so usually you'll override it by specifying a
     filename.

     PLIST can be a list containing named parameters; the default value
     is simply an empty list.  Each entry in the list must be a string
     containing the parameter name, an equals sign (`='), and the
     parameter's value.  Mailcap entries can contain named parameters
     like `%{foo}', which will be replaced by the value of the
     parameter named 'foo'.  For example, if the command line
     `showpartial %{id} %{number} %{total}' was in a mailcap file, and
     PLIST was set to `['id=1', 'number=2', 'total=3']', the resulting
     command line would be `"showpartial 1 2 3"'.

     In a mailcap file, the "test" field can optionally be specified to
     test some external condition (e.g., the machine architecture, or
     the window system in use) to determine whether or not the mailcap
     line applies.  `findmatch()' will automatically check such
     conditions and skip the entry if the check fails.
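
     The substitution of `%s' and `%{name}' described above can be
     sketched as follows.  The `expand_command()' helper is
     hypothetical and is not the module's own routine; it handles
     only these two escapes, and it replaces an unknown parameter
     name with an empty string, which is a choice made for this
     sketch.

```python
import re


def expand_command(template, filename="/dev/null", plist=()):
    """Hypothetical helper: fill in %s and %{name} in a mailcap command."""
    # Each plist entry has the form 'name=value'.
    params = {}
    for entry in plist:
        name, _, value = entry.partition("=")
        params[name] = value

    def replace(match):
        if match.group(0) == "%s":
            return filename
        return params.get(match.group(1), "")

    return re.sub(r"%s|%\{([^}]*)\}", replace, template)


print(expand_command("xmpeg %s", filename="/tmp/tmp1223"))
print(expand_command("showpartial %{id} %{number} %{total}",
                     plist=["id=1", "number=2", "total=3"]))
```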

`getcaps()'
     Returns a dictionary mapping MIME types to a list of mailcap file
     entries. This dictionary must be passed to the `findmatch()'
     function.  An entry is stored as a list of dictionaries, but it
     shouldn't be necessary to know the details of this representation.

     The information is derived from all of the mailcap files found on
     the system. Settings in the user's mailcap file `$HOME/.mailcap'
     will override settings in the system mailcap files `/etc/mailcap',
     `/usr/etc/mailcap', and `/usr/local/etc/mailcap'.

   An example usage:
     >>> import mailcap
     >>> d=mailcap.getcaps()
     >>> mailcap.findmatch(d, 'video/mpeg', filename='/tmp/tmp1223')
     ('xmpeg /tmp/tmp1223', {'view': 'xmpeg %s'})


File: python-lib.info,  Node: mimetypes,  Next: base64,  Prev: mailcap,  Up: Internet Data Handling

Map filenames to MIME types
===========================

   Mapping of filename extensions to MIME types.

   This section was written by Fred L. Drake, Jr. <fdrake@acm.org>.

   The `mimetypes' module converts between a filename or URL and the MIME
type associated with the filename extension.  Conversions are provided
from filename to MIME type and from MIME type to filename extension;
encodings are not supported for the latter conversion.

   The functions described below provide the primary interface for this
module.  If the module has not been initialized, they will call
`init()'.

`guess_type(filename)'
     Guess the type of a file based on its filename or URL, given by
     FILENAME.  The return value is a tuple `(TYPE, ENCODING)' where
     TYPE is `None' if the type can't be guessed (no or unknown suffix)
     or a string of the form `'TYPE/SUBTYPE'', usable for a MIME
     `content-type' header; and encoding is `None' for no encoding or
     the name of the program used to encode (e.g. `compress' or
     `gzip').  The encoding is suitable for use as a `content-encoding'
     header, _not_ as a `content-transfer-encoding' header.  The
     mappings are table driven.  Encoding suffixes are case sensitive;
     type suffixes are first tried case sensitive, then case
     insensitive.

`guess_extension(type)'
     Guess the extension for a file based on its MIME type, given by
     TYPE.  The return value is a string giving a filename extension,
     including the leading dot (`.').  The extension is not guaranteed
     to have been associated with any particular data stream, but would
     be mapped to the MIME type TYPE by `guess_type()'.  If no
     extension can be guessed for TYPE, `None' is returned.
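
   A brief usage sketch of the two guessing functions, assuming a
current Python 3 interpreter and its default type map (the file
names are made up for illustration):

```python
import mimetypes

# A plain extension gives a type and no encoding.
html_type, html_encoding = mimetypes.guess_type("index.html")
print(html_type, html_encoding)

# A compressed archive gives both a type and an encoding.
tar_type, tar_encoding = mimetypes.guess_type("backup.tar.gz")
print(tar_type, tar_encoding)

# An unknown suffix cannot be guessed.
print(mimetypes.guess_type("notes.unknown-suffix"))

# The reverse direction returns an extension with its leading dot.
ext = mimetypes.guess_extension("text/html")
print(ext)
```

   Which of several candidate extensions `guess_extension()' returns
for a given type depends on the type map in use.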

   Some additional functions and data items are available for
controlling the behavior of the module.

`init([files])'
     Initialize the internal data structures.  If given, FILES must be
     a sequence of file names which should be used to augment the
     default type map.  If omitted, the file names to use are taken from
     `knownfiles'.  Each file named in FILES or `knownfiles' takes
     precedence over those named before it.  Calling `init()'
     repeatedly is allowed.

`read_mime_types(filename)'
     Load the type map given in the file FILENAME, if it exists.  The
     type map is returned as a dictionary mapping filename extensions,
     including the leading dot (`.'), to strings of the form
     `'TYPE/SUBTYPE''.  If the file FILENAME does not exist or cannot
     be read, `None' is returned.

`inited'
     Flag indicating whether or not the global data structures have been
     initialized.  This is set to true by `init()'.

`knownfiles'
     List of type map file names commonly installed.  These files are
     typically named `mime.types' and are installed in different
     locations by different packages.

`suffix_map'
     Dictionary mapping suffixes to suffixes.  This is used to allow
     recognition of encoded files for which the encoding and the type
     are indicated by the same extension.  For example, the `.tgz'
     extension is mapped to `.tar.gz' to allow the encoding and type to
     be recognized separately.

`encodings_map'
     Dictionary mapping filename extensions to encoding types.

`types_map'
     Dictionary mapping filename extensions to MIME types.
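
   The three dictionaries work together when a name such as
`backup.tgz' is examined.  A small sketch, assuming a current Python
3 interpreter and the default tables; the file name is illustrative:

```python
import mimetypes

# '.tgz' is rewritten to '.tar.gz' so that the encoding and the type
# can be looked up separately.
print(mimetypes.suffix_map[".tgz"])

# '.gz' names an encoding, '.tar' names a type.
print(mimetypes.encodings_map[".gz"])
print(mimetypes.types_map[".tar"])

# guess_type() combines the three lookups.
result = mimetypes.guess_type("backup.tgz")
print(result)
```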


File: python-lib.info,  Node: base64,  Next: quopri,  Prev: mimetypes,  Up: Internet Data Handling

Encode and decode MIME base64 data
==================================

   Encode and decode files using the MIME base64 encoding.

   This module performs base64 encoding and decoding of arbitrary binary
strings into text strings that can be safely emailed or posted.  The
encoding scheme is defined in RFC 1521 (_MIME (Multipurpose Internet
Mail Extensions) Part One: Mechanisms for Specifying and Describing the
Format of Internet Message Bodies_, section 5.2, "Base64
Content-Transfer-Encoding") and is used for MIME email and various
other Internet-related applications; it is not the same as the output
produced by the `uuencode' program.  For example, the string
`'www.python.org'' is encoded as the string `'d3d3LnB5dGhvbi5vcmc=\n''.

`decode(input, output)'
     Decode the contents of the INPUT file and write the resulting
     binary data to the OUTPUT file.  INPUT and OUTPUT must either be
     file objects or objects that mimic the file object interface.
     INPUT will be read until `INPUT.read()' returns an empty string.

`decodestring(s)'
     Decode the string S, which must contain one or more lines of
     base64 encoded data, and return a string containing the resulting
     binary data.

`encode(input, output)'
     Encode the contents of the INPUT file and write the resulting
     base64 encoded data to the OUTPUT file.  INPUT and OUTPUT must
     either be file objects or objects that mimic the file object
     interface. INPUT will be read until `INPUT.read()' returns an
     empty string.

`encodestring(s)'
     Encode the string S, which can contain arbitrary binary data, and
     return a string containing one or more lines of base64 encoded
     data.
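
   The `'www.python.org'' example from the introduction can be
reproduced as follows.  This sketch assumes a current Python 3
interpreter, where the string functions are spelled `encodebytes()'
and `decodebytes()' and operate on bytes objects; the file-based
functions are shown with in-memory `io.BytesIO' objects standing in
for real files.

```python
import base64
import io

encoded = base64.encodebytes(b"www.python.org")
print(encoded)

decoded = base64.decodebytes(encoded)
print(decoded)

# The file-based interface reads INPUT until it is exhausted and
# writes the result to OUTPUT.
source = io.BytesIO(b"www.python.org")
target = io.BytesIO()
base64.encode(source, target)
print(target.getvalue() == encoded)
```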

   See also:

   *Note binascii:: Support module containing ASCII-to-binary and
binary-to-ASCII conversions.

*RFC1521 MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies*
     Section 5.2, "Base64 Content-Transfer-Encoding," provides the
     definition of the base64 encoding.


File: python-lib.info,  Node: quopri,  Next: mailbox,  Prev: base64,  Up: Internet Data Handling

Encode and decode MIME quoted-printable data
============================================

   Encode and decode files using the MIME quoted-printable encoding.

   This module performs quoted-printable transport encoding and
decoding, as defined in RFC 1521: "MIME (Multipurpose Internet Mail
Extensions) Part One".  The quoted-printable encoding is designed for
data where there are relatively few nonprintable characters; the base64
encoding scheme available via the `base64' module is more compact if
there are many such characters, as when sending a graphics file.

`decode(input, output)'
     Decode the contents of the INPUT file and write the resulting
     decoded binary data to the OUTPUT file.  INPUT and OUTPUT must
     either be file objects or objects that mimic the file object
     interface. INPUT will be read until `INPUT.read()' returns an
     empty string.

`encode(input, output, quotetabs)'
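     Encode the contents of the INPUT file and write the resulting
     quoted-printable data to the OUTPUT file.  INPUT and OUTPUT must
     either be file objects or objects that mimic the file object
     interface. INPUT will be read until `INPUT.read()' returns an
     empty string.  QUOTETABS is a flag which controls whether
     embedded spaces and tabs are encoded; when true, such embedded
     whitespace is encoded, and when false it is left unencoded.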
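
   A round-trip sketch using in-memory file objects, assuming a
current Python 3 interpreter where the data is handled as bytes.  The
`encode_bytes()' wrapper is a hypothetical convenience for this
example and not part of the module.

```python
import io
import quopri

original = b"caf\xe9 au lait\n"


def encode_bytes(data, quotetabs):
    """Hypothetical wrapper: run quopri.encode() on in-memory data."""
    output = io.BytesIO()
    quopri.encode(io.BytesIO(data), output, quotetabs)
    return output.getvalue()


# With a false third argument, embedded spaces are left alone.
plain = encode_bytes(original, False)
print(plain)

# With a true third argument, embedded spaces are encoded as well.
quoted = encode_bytes(original, True)
print(quoted)

decoded = io.BytesIO()
quopri.decode(io.BytesIO(quoted), decoded)
print(decoded.getvalue() == original)
```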

   See also:

   *Note mimify:: General utilities for processing of MIME messages.


File: python-lib.info,  Node: mailbox,  Next: mhlib,  Prev: quopri,  Up: Internet Data Handling

Read various mailbox formats
============================

   Read various mailbox formats.

   This module defines a number of classes that allow easy and uniform
access to mail messages in a (UNIX) mailbox.

`UnixMailbox(fp[, factory])'
     Access to a classic UNIX-style mailbox, where all messages are
     contained in a single file and separated by `From ' (a.k.a.
     `From_') lines.  The file object FP points to the mailbox file.
     The optional FACTORY parameter is a callable that should create
     new message objects.  FACTORY is called with one argument, FP, by
     the `next()' method of the mailbox object.  The default is the
     `rfc822.Message' class (see the `rfc822' module).

     For maximum portability, messages in a UNIX-style mailbox are
     separated by any line that begins exactly with the string `'From
     '' (note the trailing space) if preceded by exactly two newlines.
     Because of the wide range of variations in practice, nothing else
     on the From_ line should be considered.  However, the current
     implementation doesn't check for the leading two newlines.  This is
     usually fine for most applications.

     The `UnixMailbox' class implements a stricter version of From_
     line checking, using a regular expression that usually correctly
     matches From_ delimiters.  It considers messages to be
     separated by `From NAME TIME' lines.  For maximum portability, use
     the `PortableUnixMailbox' class instead.  This class is identical
     to `UnixMailbox' except that individual messages are separated by
     only `From ' lines.

     For more information, see .

`PortableUnixMailbox(fp[, factory])'
     A less-strict version of `UnixMailbox', which considers only the
     `From ' at the beginning of the line separating messages.  The
     "NAME TIME" portion of the From line is ignored, to protect
     against some variations that are observed in practice.  This works
     since lines in the message which begin with `'From '' are quoted
     by mail handling software well before delivery.

`MmdfMailbox(fp[, factory])'
     Access an MMDF-style mailbox, where all messages are contained in
     a single file and separated by lines consisting of 4 control-A
     characters.  The file object FP points to the mailbox file.
     Optional FACTORY is as with the `UnixMailbox' class.

`MHMailbox(dirname[, factory])'
     Access an MH mailbox, a directory with each message in a separate
     file with a numeric name.  The name of the mailbox directory is
     passed in DIRNAME.  FACTORY is as with the `UnixMailbox' class.

`Maildir(dirname[, factory])'
     Access a Qmail mail directory.  All new and current mail for the
     mailbox specified by DIRNAME is made available.  FACTORY is as
     with the `UnixMailbox' class.

`BabylMailbox(fp[, factory])'
     Access a Babyl mailbox, which is similar to an MMDF mailbox.  In
     Babyl format, each message has two sets of headers, the _original_
     headers and the _visible_ headers.  The original headers appear
     before a line containing only `'*** EOOH ***''
     (End-Of-Original-Headers) and the visible headers appear after the
     `EOOH' line.  Babyl-compliant mail readers will show you only the
     visible headers, and `BabylMailbox' objects will return messages
     containing only the visible headers.  You'll have to do your own
     parsing of the mailbox file to get at the original headers.  Mail
     messages start with the EOOH line and end with a line containing
     only `'\037\014''.  FACTORY is as with the `UnixMailbox' class.
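
   The `From ' separator rule described for `PortableUnixMailbox' above
can be sketched in a few lines.  This is an illustration only, not the
module's implementation: `split_on_from_lines' is a hypothetical helper,
the sample mailbox text is made up, and the check for two preceding
newlines is skipped.

```python
import io

def split_on_from_lines(fp):
    """Yield each message as one string, splitting on 'From ' lines.

    Hypothetical helper: only the leading 'From ' (with its trailing
    space) is examined; the rest of the separator line is ignored.
    """
    current = None
    for line in fp:
        if line.startswith("From "):
            if current is not None:
                yield "".join(current)
            current = []  # the separator line itself is not kept
        elif current is not None:
            current.append(line)
    if current is not None:
        yield "".join(current)

# Made-up mailbox contents holding two messages.
sample = io.StringIO(
    "From alice Mon Apr 16 10:00:00 2001\n"
    "Subject: one\n"
    "\n"
    "first body\n"
    "\n"
    "From bob Mon Apr 16 11:00:00 2001\n"
    "Subject: two\n"
    "\n"
    "second body\n"
)
messages = list(split_on_from_lines(sample))
```

   A stricter reader in the spirit of `UnixMailbox' would additionally
test the remainder of each separator line against a pattern before
accepting it as a delimiter.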

* Menu:

* Mailbox Objects::


File: python-lib.info,  Node: Mailbox Objects,  Prev: mailbox,  Up: mailbox

Mailbox Objects
---------------

   All implementations of Mailbox objects have one externally visible
method:

`next()'
     Return the next message in the mailbox, created with the optional
     FACTORY argument passed into the mailbox object's constructor.  By
     default this is an `rfc822.Message' object (see the `rfc822'
     module).  Depending on the mailbox implementation the FP attribute
     of this object may be a true file object or a class instance
     simulating a file object, taking care of things like message
     boundaries if multiple mail messages are contained in a single
     file, etc.  If no more messages are available, this method returns
     `None'.
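
   Because `next()' returns `None' once the mailbox is exhausted, a
caller can simply loop until that value appears.  The sketch below
assumes only that convention; `drain' is a hypothetical helper and
`ListMailbox' is a made-up stand-in used so the example runs without a
mailbox file.

```python
def drain(mailbox):
    """Collect every message by calling next() until it returns None."""
    messages = []
    while True:
        message = mailbox.next()
        if message is None:
            break
        messages.append(message)
    return messages

class ListMailbox:
    """Hypothetical stand-in that follows the same next() convention."""

    def __init__(self, items):
        self._items = list(items)

    def next(self):
        # Hand out items in order, then None when nothing is left.
        if self._items:
            return self._items.pop(0)
        return None

collected = drain(ListMailbox(["first", "second"]))
```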


File: python-lib.info,  Node: mhlib,  Next: mimify,  Prev: mailbox,  Up: Internet Data Handling

Access to MH mailboxes
======================

   Manipulate MH mailboxes from Python.

   The `mhlib' module provides a Python interface to MH folders and
their contents.

   The module contains three basic classes, `MH', which represents a
particular collection of folders, `Folder', which represents a single
folder, and `Message', which represents a single message.

`MH([path[, profile]])'
     `MH' represents a collection of MH folders.

`Folder(mh, name)'
     The `Folder' class represents a single folder and its messages.

`Message(folder, number[, name])'
     `Message' objects represent individual messages in a folder.  The
     Message class is derived from `mimetools.Message'.

* Menu:

* MH Objects::
* Folder Objects::
* Message Objects 2::


File: python-lib.info,  Node: MH Objects,  Next: Folder Objects,  Prev: mhlib,  Up: mhlib

MH Objects
----------

   `MH' instances have the following methods:

`error(format[, ...])'
     Print an error message - can be overridden.

`getprofile(key)'
     Return a profile entry (`None' if not set).

`getpath()'
     Return the mailbox pathname.

`getcontext()'
     Return the current folder name.

`setcontext(name)'
     Set the current folder name.

`listfolders()'
     Return a list of top-level folders.

`listallfolders()'
     Return a list of all folders.

`listsubfolders(name)'
     Return a list of direct subfolders of the given folder.

`listallsubfolders(name)'
     Return a list of all subfolders of the given folder.

`makefolder(name)'
     Create a new folder.

`deletefolder(name)'
     Delete a folder - must have no subfolders.

`openfolder(name)'
     Return a new open folder object.


File: python-lib.info,  Node: Folder Objects,  Next: Message Objects 2,  Prev: MH Objects,  Up: mhlib

Folder Objects
--------------

   `Folder' instances represent open folders and have the following
methods:

`error(format[, ...])'
     Print an error message - can be overridden.

`getfullname()'
     Return the folder's full pathname.

`getsequencesfilename()'
     Return the full pathname of the folder's sequences file.

`getmessagefilename(n)'
     Return the full pathname of message N of the folder.

`listmessages()'
     Return a list of messages in the folder (as numbers).

`getcurrent()'
     Return the current message number.

`setcurrent(n)'
     Set the current message number to N.

`parsesequence(seq)'
     Parse msgs syntax into list of messages.

`getlast()'
     Get last message, or `0' if no messages are in the folder.

`setlast(n)'
     Set last message (internal use only).

`getsequences()'
     Return dictionary of sequences in folder.  The sequence names are
     used as keys, and the values are the lists of message numbers in
     the sequences.

`putsequences(dict)'
     Write the dictionary of sequences {name: list} back to the folder.

`removemessages(list)'
     Remove messages in list from folder.

`refilemessages(list, tofolder)'
     Move messages in list to other folder.

`movemessage(n, tofolder, ton)'
     Move one message to a given destination in another folder.

`copymessage(n, tofolder, ton)'
     Copy one message to a given destination in another folder.
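
   To illustrate what `listmessages()' and `getlast()' report, the
sketch below works directly on a directory laid out like an MH folder,
that is, one file per message with a numeric name.  It does not use
`mhlib'; both helper functions are hypothetical, and the temporary
folder contents are made up.

```python
import os
import tempfile

def list_message_numbers(folder_path):
    """Return the numeric file names in folder_path as sorted integers."""
    numbers = [int(name) for name in os.listdir(folder_path)
               if name.isdigit()]
    numbers.sort()
    return numbers

def last_message_number(folder_path):
    """Return the highest message number, or 0 for an empty folder."""
    numbers = list_message_numbers(folder_path)
    if numbers:
        return numbers[-1]
    return 0

# Build a throwaway folder with three messages and one non-message file.
with tempfile.TemporaryDirectory() as folder:
    for name in ("1", "2", "10", "notes.txt"):
        with open(os.path.join(folder, name), "w") as handle:
            handle.write("")
    found = list_message_numbers(folder)
    last = last_message_number(folder)
```

   Sorting the names as integers rather than as strings keeps message
10 after message 2.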


File: python-lib.info,  Node: Message Objects 2,  Prev: Folder Objects,  Up: mhlib

Message Objects
---------------

   The `Message' class adds one method to those of `mimetools.Message':

`openmessage(n)'
     Return a new open message object (costs a file descriptor).


File: python-lib.info,  Node: mimify,  Next: netrc,  Prev: mhlib,  Up: Internet Data Handling

MIME processing of mail messages
================================

   Mimification and unmimification of mail messages.

   The mimify module defines two functions to convert mail messages to
and from MIME format.  The mail message can be either a simple message
or a so-called multipart message.  Each part is treated separately.
Mimifying (a part of) a message entails encoding the message as
quoted-printable if it contains any characters that cannot be
represented using 7-bit ASCII.  Unmimifying (a part of) a message
entails undoing the quoted-printable encoding.  Mimify and unmimify are
especially useful when a message has to be edited before being sent.
Typical use would be:

     unmimify message
     edit message
     mimify message
     send message

   The module defines the following user-callable functions and
user-settable variables:

`mimify(infile, outfile)'
     Copy the message in INFILE to OUTFILE, converting parts to
     quoted-printable and adding MIME mail headers when necessary.
     INFILE and OUTFILE can be file objects (actually, any object that
     has a `readline()' method (for INFILE) or a `write()' method (for
     OUTFILE)) or strings naming the files.  If INFILE and OUTFILE are
     both strings, they may have the same value.

`unmimify(infile, outfile[, decode_base64])'
     Copy the message in INFILE to OUTFILE, decoding all
     quoted-printable parts.  INFILE and OUTFILE can be file objects
     (actually, any object that has a `readline()' method (for INFILE)
     or a `write()' method (for OUTFILE)) or strings naming the files.
     If INFILE and OUTFILE are both strings, they may have the same
     value.  If the DECODE_BASE64 argument is provided and tests true,
     any parts that are coded in the base64 encoding are decoded as
     well.

`mime_decode_header(line)'
     Return a decoded version of the encoded header line in LINE.

`mime_encode_header(line)'
     Return a MIME-encoded version of the header line in LINE.

`MAXLEN'
     By default, a part will be encoded as quoted-printable when it
     contains any non-ASCII characters (i.e., characters with the 8th
     bit set), or if there are any lines longer than `MAXLEN' characters
     (default value 200).

`CHARSET'
     When not specified in the mail headers, a character set must be
     filled in.  The string used is stored in `CHARSET', and the default
     value is ISO-8859-1 (also known as Latin1 (latin-one)).
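
   The rule described under `MAXLEN' can be sketched as a small
predicate.  This is not the module's own code: `needs_quoted_printable'
is a hypothetical helper, it assumes the part is given as a list of
byte strings in a modern Python, and it assumes the line terminator is
not counted toward the line length.

```python
MAXLEN = 200  # default value documented for MAXLEN

def needs_quoted_printable(lines, maxlen=MAXLEN):
    """Return True if a part should be encoded under the documented rule.

    Hypothetical helper: a part qualifies when any line is longer than
    maxlen characters or contains a byte with the 8th bit set.
    """
    for line in lines:
        # Assumption: the trailing line terminator is not counted.
        if len(line.rstrip(b"\r\n")) > maxlen:
            return True
        if any(byte > 127 for byte in line):
            return True
    return False

plain = needs_quoted_printable([b"plain ASCII text\n"])
accented = needs_quoted_printable([b"caf\xe9\n"])
too_long = needs_quoted_printable([b"x" * 201 + b"\n"])
```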

   This module can also be used from the command line.  Usage is as
follows:
     mimify.py -e [-l length] [infile [outfile]]
     mimify.py -d [-b] [infile [outfile]]

   to encode (mimify) and decode (unmimify) respectively.  INFILE
defaults to standard input, OUTFILE defaults to standard output.  The
same file can be specified for input and output.

   If the *-l* option is given when encoding, any part containing lines
longer than the specified LENGTH will be encoded.

   If the *-b* option is given when decoding, any base64 parts will be
decoded as well.

   See also:

   *Note quopri:: Encode and decode MIME quoted-printable files.


File: python-lib.info,  Node: netrc,  Next: robotparser,  Prev: mimify,  Up: Internet Data Handling

netrc file processing
=====================

   Loading of `.netrc' files.  This module was documented by Eric S.
Raymond <esr@snark.thyrsus.com>.
This section was written by Eric S. Raymond <esr@snark.thyrsus.com>.
_Added in Python version 1.5.2_

   The `netrc' class parses and encapsulates the netrc file format used
by the UNIX `ftp' program and other FTP clients.

`netrc([file])'
     A `netrc' instance or subclass instance encapsulates data from a
     netrc file.  The initialization argument, if present, specifies the
     file to parse.  If no argument is given, the file `.netrc' in the
     user's home directory will be read.  Parse errors will raise
     `NetrcParseError' with diagnostic information including the file
     name, line number, and terminating token.

`NetrcParseError'
     Exception raised by the `netrc' class when syntactical errors are
     encountered in source text.  Instances of this exception provide
     three interesting attributes:  `msg' is a textual explanation of
     the error, `filename' is the name of the source file, and `lineno'
     gives the line number on which the error was found.

* Menu:

* netrc Objects::


File: python-lib.info,  Node: netrc Objects,  Prev: netrc,  Up: netrc

netrc Objects
-------------

   A `netrc' instance has the following methods:

`authenticators(host)'
     Return a 3-tuple `(LOGIN, ACCOUNT, PASSWORD)' of authenticators
     for HOST.  If the netrc file did not contain an entry for the
     given host, return the tuple associated with the `default' entry.
     If neither matching host nor default entry is available, return
     `None'.

`__repr__()'
     Dump the class data as a string in the format of a netrc file.
     (This discards comments and may reorder the entries.)

   Instances of `netrc' have public instance variables:

`hosts'
     Dictionary mapping host names to `(LOGIN, ACCOUNT, PASSWORD)'
     tuples.  The `default' entry, if any, is represented as a
     pseudo-host by that name.

`macros'
     Dictionary mapping macro names to string lists.
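
   A minimal sketch of these methods and variables follows.  It writes
a throwaway netrc file so that the example does not depend on the
user's own `.netrc'; the host names and credentials are made up, and
the use of the `tempfile' module is an assumption of the example, not
something the `netrc' module requires.

```python
import netrc
import os
import tempfile

# Write a small netrc file with one host entry and a default entry.
with tempfile.NamedTemporaryFile("w", suffix=".netrc",
                                 delete=False) as handle:
    handle.write("machine ftp.example.org login alice "
                 "account staff password s3cret\n")
    handle.write("default login anonymous password guest\n")
    path = handle.name

try:
    info = netrc.netrc(path)
    # A host listed in the file yields its own authenticators.
    known = info.authenticators("ftp.example.org")
    # An unlisted host falls back to the `default' entry.
    fallback = info.authenticators("other.example.org")
    # The default entry appears in `hosts' as a pseudo-host.
    host_names = sorted(info.hosts.keys())
finally:
    os.remove(path)
```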


File: python-lib.info,  Node: robotparser,  Prev: netrc,  Up: Internet Data Handling

Parser for robots.txt
=====================

   Accepts as input a list of lines or a URL that refers to a
robots.txt file, parses the input, then builds a set of rules from it
and answers questions about the fetchability of other URLs.

   This section was written by Skip Montanaro <skip@mojam.com>.

   This module provides a single class, `RobotFileParser', which answers
questions about whether or not a particular user agent can fetch a URL
on the web site that published the `robots.txt' file.  For more details
on the structure of `robots.txt' files, see
<http://info.webcrawler.com/mak/projects/robots/norobots.html>.

`RobotFileParser()'
     This class provides a set of methods to read, parse and answer
     questions about a single `robots.txt' file.

    `set_url(url)'
          Sets the URL referring to a `robots.txt' file.

    `read()'
          Reads the `robots.txt' URL and feeds it to the parser.

    `parse(lines)'
          Parses the lines argument.

    `can_fetch(useragent, url)'
          Returns true if the USERAGENT is allowed to fetch the URL
          according to the rules contained in the parsed `robots.txt'
          file.

    `mtime()'
          Returns the time the `robots.txt' file was last fetched.
          This is useful for long-running web spiders that need to
          check for new `robots.txt' files periodically.

    `modified()'
          Sets the time the `robots.txt' file was last fetched to the
          current time.

   The following example demonstrates basic use of the RobotFileParser
class.

     >>> import robotparser
     >>> rp = robotparser.RobotFileParser()
     >>> rp.set_url("http://www.musi-cal.com/robots.txt")
     >>> rp.read()
     >>> rp.can_fetch("*", "http://www.musi-cal.com/cgi-bin/search?city=San+Francisco")
     0
     >>> rp.can_fetch("*", "http://www.musi-cal.com/")
     1
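
   The `parse()' method can also be exercised without network access by
handing it the lines directly.  The sketch below assumes made-up rules
and URLs; the fallback import is an assumption for later Python
versions, where the module is found under `urllib', and where
`can_fetch()' may return a Boolean value rather than 1 or 0.

```python
try:
    import robotparser  # module name used in this manual
except ImportError:
    # Assumption: later Python versions provide it as urllib.robotparser.
    from urllib import robotparser

rp = robotparser.RobotFileParser()

# Feed the rules as a list of lines instead of fetching a URL.
rp.parse([
    "User-agent: *",
    "Disallow: /cgi-bin/",
])

blocked = rp.can_fetch("*", "http://www.example.com/cgi-bin/search")
allowed = rp.can_fetch("*", "http://www.example.com/index.html")
```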


File: python-lib.info,  Node: Structured Markup Processing Tools,  Next: Multimedia Services,  Prev: Internet Data Handling,  Up: Top

Structured Markup Processing Tools
**********************************

   Python supports a variety of modules to work with various forms of
structured data markup.  This includes modules to work with the
Standard Generalized Markup Language (SGML) and the Hypertext Markup
Language (HTML), and several interfaces for working with the Extensible
Markup Language (XML).

* Menu:

* sgmllib::
* htmllib::
* htmlentitydefs::
* xml.parsers.expat::
* xml.dom::
* xml.dom.minidom::
* xml.dom.pulldom::
* xml.sax::
* xml.sax.handler::
* xml.sax.saxutils::
* xml.sax.xmlreader::
* xmllib::

