This is Info file /home/pdm/tmp/Python-1.5.2p1/Doc/tut/python-tut.info,
produced by Makeinfo version 1.68 from the input file tut.texi.

   July 6, 1999			1.5.2


File: python-tut.info,  Node: Documentation Strings,  Prev: Lambda Forms,  Up: More on Defining Functions

Documentation Strings
---------------------

   There are emerging conventions about the content and formatting of
documentation strings.

   The first line should always be a short, concise summary of the
object's purpose.  For brevity, it should not explicitly state the
object's name or type, since these are available by other means (except
if the name happens to be a verb describing a function's operation).
This line should begin with a capital letter and end with a period.

   If there are more lines in the documentation string, the second line
should be blank, visually separating the summary from the rest of the
description.  The following lines should be one or more paragraphs
describing the object's calling conventions, its side effects, etc.

   The Python parser does not strip indentation from multi-line string
literals, so tools that process documentation have to strip
indentation.  This is done using the following convention.  The first
non-blank line *after* the first line of the string determines the
amount of indentation for the entire documentation string.  (We can't
use the first line since it is generally adjacent to the string's
opening quotes so its indentation is not apparent in the string
literal.)  Whitespace "equivalent" to this indentation is then stripped
from the start of all lines of the string.  Lines that are indented
less should not occur, but if they occur all their leading whitespace
should be stripped.  Equivalence of whitespace should be tested after
expansion of tabs (to 8 spaces, normally).
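
   As a rough illustration of the convention just described, here is a
minimal sketch of such a stripping routine.  The function name
`strip_doc_indent' is hypothetical, and the code assumes a current
Python 3 interpreter (with string methods) rather than the interpreter
used for the sessions in this tutorial:

```python
def strip_doc_indent(doc):
    """Strip indentation from a documentation string (illustrative sketch)."""
    # Expand tabs first so that whitespace can be compared column by column.
    lines = doc.expandtabs(8).split("\n")

    # The first non-blank line after the first line sets the indentation.
    indent = 0
    for line in lines[1:]:
        body = line.lstrip(" ")
        if body:
            indent = len(line) - len(body)
            break

    # The first line sits next to the opening quotes; strip it completely.
    result = [lines[0].strip()]
    for line in lines[1:]:
        leading = len(line) - len(line.lstrip(" "))
        # Lines indented less than `indent` lose all their leading whitespace.
        result.append(line[min(indent, leading):].rstrip())
    return "\n".join(result)
```

   Lines indented more than the reference line keep their extra
indentation, so nested examples inside a documentation string survive.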


File: python-tut.info,  Node: Data Structures,  Next: Modules,  Prev: More Control Flow Tools,  Up: Top

Data Structures
***************

   This chapter describes some things you've learned about already in
more detail, and adds some new things as well.

* Menu:

* More on Lists::
* del statement::
* Tuples and Sequences::
* Dictionaries::
* More on Conditions::
* Comparing Sequences and Other Types::


File: python-tut.info,  Node: More on Lists,  Next: del statement,  Prev: Data Structures,  Up: Data Structures

More on Lists
=============

   The list data type has some more methods.  Here are all of the
methods of list objects:

``insert(i, x)''
     Insert an item at a given position.  The first argument is the
     index of the element before which to insert, so `a.insert(0, x)'
     inserts at the front of the list, and `a.insert(len(a), x)' is
     equivalent to `a.append(x)'.

``append(x)''
     Append an item to the list; equivalent to `a.insert(len(a), x)'.

``index(x)''
     Return the index in the list of the first item whose value is `x'.
     It is an error if there is no such item.

``remove(x)''
     Remove the first item from the list whose value is `x'.  It is an
     error if there is no such item.

``sort()''
     Sort the items of the list, in place.

``reverse()''
     Reverse the elements of the list, in place.

``count(x)''
     Return the number of times `x' appears in the list.

   An example that uses all list methods:

     >>> a = [66.6, 333, 333, 1, 1234.5]
     >>> print a.count(333), a.count(66.6), a.count('x')
     2 1 0
     >>> a.insert(2, -1)
     >>> a.append(333)
     >>> a
     [66.6, 333, -1, 333, 1, 1234.5, 333]
     >>> a.index(333)
     1
     >>> a.remove(333)
     >>> a
     [66.6, -1, 333, 1, 1234.5, 333]
     >>> a.reverse()
     >>> a
     [333, 1234.5, 1, 333, -1, 66.6]
     >>> a.sort()
     >>> a
     [-1, 1, 66.6, 333, 333, 1234.5]

* Menu:

* Functional Programming Tools::


File: python-tut.info,  Node: Functional Programming Tools,  Prev: More on Lists,  Up: More on Lists

Functional Programming Tools
----------------------------

   There are three built-in functions that are very useful when used
with lists: `filter()', `map()', and `reduce()'.

   `filter(FUNCTION, SEQUENCE)' returns a sequence (of the same type,
if possible) consisting of those items from the sequence for which
`FUNCTION(ITEM)' is true.  For example, to compute some primes:

     >>> def f(x): return x % 2 != 0 and x % 3 != 0
     ...
     >>> filter(f, range(2, 25))
     [5, 7, 11, 13, 17, 19, 23]

   `map(FUNCTION, SEQUENCE)' calls `FUNCTION(ITEM)' for each of the
sequence's items and returns a list of the return values.  For example,
to compute some cubes:

     >>> def cube(x): return x*x*x
     ...
     >>> map(cube, range(1, 11))
     [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]

   More than one sequence may be passed; the function must then have as
many arguments as there are sequences and is called with the
corresponding item from each sequence (or `None' if some sequence is
shorter than another).  If `None' is passed for the function, a
function returning its argument(s) is substituted.

   Combining these two special cases, we see that `map(None, LIST1,
LIST2)' is a convenient way of turning a pair of lists into a list of
pairs.  For example:

     >>> seq = range(8)
     >>> def square(x): return x*x
     ...
     >>> map(None, seq, map(square, seq))
     [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49)]
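
   The two special cases described above can be pictured with a plain
loop.  The helper `my_map' below is a hypothetical stand-in written for
illustration, not the implementation of the built-in; it assumes a
current Python 3 interpreter and at least one sequence argument:

```python
def my_map(func, *seqs):
    """Sketch of map() as described above: pad short sequences with None."""
    longest = max([len(s) for s in seqs])
    result = []
    for i in range(longest):
        # Collect the i-th item of every sequence, or None once one runs out.
        args = []
        for s in seqs:
            if i < len(s):
                args.append(s[i])
            else:
                args.append(None)
        if func is None:
            # No function given: hand back the argument(s) unchanged.
            if len(args) == 1:
                result.append(args[0])
            else:
                result.append(tuple(args))
        else:
            result.append(func(*args))
    return result
```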

   `reduce(FUNC, SEQUENCE)' returns a single value constructed by
calling the binary function FUNC on the first two items of the
sequence, then on the result and the next item, and so on.  For
example, to compute the sum of the numbers 1 through 10:

     >>> def add(x,y): return x+y
     ...
     >>> reduce(add, range(1, 11))
     55

   If there's only one item in the sequence, its value is returned; if
the sequence is empty, an exception is raised.

   A third argument can be passed to indicate the starting value.  In
this case the starting value is returned for an empty sequence, and the
function is first applied to the starting value and the first sequence
item, then to the result and the next item, and so on.  For example,

     >>> def sum(seq):
     ...     def add(x,y): return x+y
     ...     return reduce(add, seq, 0)
     ...
     >>> sum(range(1, 11))
     55
     >>> sum([])
     0
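
   The behaviour of `reduce()', with and without a starting value, can
be sketched as an explicit loop.  The name `my_reduce' and the sentinel
`_missing' are hypothetical, the exception type chosen for the empty
case is an assumption, and the code is written for a current Python 3
interpreter:

```python
_missing = object()   # hypothetical marker meaning "no starting value given"

def my_reduce(func, seq, start=_missing):
    """Sketch of reduce(): fold a binary function over a sequence."""
    items = list(seq)
    if start is _missing:
        if not items:
            # Nothing to return for an empty sequence without a start value.
            raise TypeError("my_reduce() of empty sequence with no starting value")
        result = items[0]
        items = items[1:]
    else:
        result = start
    for item in items:
        # Combine the running result with the next item.
        result = func(result, item)
    return result
```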


File: python-tut.info,  Node: del statement,  Next: Tuples and Sequences,  Prev: More on Lists,  Up: Data Structures

The `del' statement
===================

   There is a way to remove an item from a list given its index instead
of its value: the `del' statement.  This can also be used to remove
slices from a list (which we did earlier by assignment of an empty list
to the slice).  For example:

     >>> a
     [-1, 1, 66.6, 333, 333, 1234.5]
     >>> del a[0]
     >>> a
     [1, 66.6, 333, 333, 1234.5]
     >>> del a[2:4]
     >>> a
     [1, 66.6, 1234.5]

   `del' can also be used to delete entire variables:

     >>> del a

   Referencing the name `a' hereafter is an error (at least until
another value is assigned to it).  We'll find other uses for `del'
later.


File: python-tut.info,  Node: Tuples and Sequences,  Next: Dictionaries,  Prev: del statement,  Up: Data Structures

Tuples and Sequences
====================

   We saw that lists and strings have many common properties, e.g.,
indexing and slicing operations.  They are two examples of *sequence*
data types.  Since Python is an evolving language, other sequence data
types may be added.  There is also another standard sequence data type:
the *tuple*.

   A tuple consists of a number of values separated by commas, for
instance:

     >>> t = 12345, 54321, 'hello!'
     >>> t[0]
     12345
     >>> t
     (12345, 54321, 'hello!')
     >>> # Tuples may be nested:
     ... u = t, (1, 2, 3, 4, 5)
     >>> u
     ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))

   As you see, on output tuples are always enclosed in parentheses, so
that nested tuples are interpreted correctly; they may be input with or
without surrounding parentheses, although often parentheses are
necessary anyway (if the tuple is part of a larger expression).

   Tuples have many uses, e.g., (x, y) coordinate pairs, employee
records from a database, etc.  Tuples, like strings, are immutable: it
is not possible to assign to the individual items of a tuple (you can
simulate much of the same effect with slicing and concatenation,
though).
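
   A short sketch of that workaround, assuming a current Python 3
interpreter; the variable names are chosen only for illustration.
Instead of assigning to an item, a new tuple is built from slices of
the old one:

```python
t = (12345, 54321, 'hello!')

# Direct item assignment is rejected because tuples are immutable.
try:
    t[1] = 99
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Build a new tuple from the pieces around the item to "replace".
t2 = t[:1] + (99,) + t[2:]
```

   The original tuple `t' is left untouched; `t2' is a separate object.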

   A special problem is the construction of tuples containing 0 or 1
items: the syntax has some extra quirks to accommodate these.  Empty
tuples are constructed by an empty pair of parentheses; a tuple with
one item is constructed by following a value with a comma (it is not
sufficient to enclose a single value in parentheses).  Ugly, but
effective.  For example:

     >>> empty = ()
     >>> singleton = 'hello',    # <-- note trailing comma
     >>> len(empty)
     0
     >>> len(singleton)
     1
     >>> singleton
     ('hello',)

   The statement `t = 12345, 54321, 'hello!'' is an example of *tuple
packing*: the values `12345', `54321' and `'hello!'' are packed
together in a tuple.  The reverse operation is also possible, e.g.:

     >>> x, y, z = t

   This is called, appropriately enough, *tuple unpacking*.  Tuple
unpacking requires that the list of variables on the left have the same
number of elements as the length of the tuple.  Note that multiple
assignment is really just a combination of tuple packing and tuple
unpacking!
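
   For instance, swapping two variables can be spelled out as one
packing step followed by one unpacking step.  This is a small sketch
for a current Python 3 interpreter; the intermediate name `pair' exists
only to make the two steps visible:

```python
a, b = 1, 2

pair = b, a        # tuple packing: builds the tuple (2, 1)
a, b = pair        # tuple unpacking: a becomes 2, b becomes 1

# Unpacking fails when the number of names does not match the length.
try:
    x, y = (1, 2, 3)
    length_mismatch = False
except ValueError:
    length_mismatch = True
```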

   Occasionally, the corresponding operation on lists is useful: *list
unpacking*.  This is supported by enclosing the list of variables in
square brackets:

     >>> a = ['spam', 'eggs', 100, 1234]
     >>> [a1, a2, a3, a4] = a


File: python-tut.info,  Node: Dictionaries,  Next: More on Conditions,  Prev: Tuples and Sequences,  Up: Data Structures

Dictionaries
============

   Another useful data type built into Python is the *dictionary*.
Dictionaries are sometimes found in other languages as "associative
memories" or "associative arrays".  Unlike sequences, which are indexed
by a range of numbers, dictionaries are indexed by *keys*, which can be
any immutable type; strings and numbers can always be keys.  Tuples can
be used as keys if they contain only strings, numbers, or tuples.  You
can't use lists as keys, since lists can be modified in place using
their `append()' method.
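
   The following sketch, written for a current Python 3 interpreter,
shows which of these key types a dictionary accepts.  The dictionary
name `grid' and the flag variables are made up for the example:

```python
grid = {}
grid['name'] = 'string key'
grid[42] = 'number key'
grid[(0, 0)] = 'tuple key'
grid[(1, ('a', 2))] = 'nested tuple key'

# A list is rejected as a key ...
try:
    grid[[0, 1]] = 'list key'
    list_key_accepted = True
except TypeError:
    list_key_accepted = False

# ... and so is a tuple that contains a list.
try:
    grid[(1, [2])] = 'tuple holding a list'
    mixed_key_accepted = True
except TypeError:
    mixed_key_accepted = False
```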

   It is best to think of a dictionary as an unordered set of
*key:value* pairs, with the requirement that the keys are unique
(within one dictionary).  A pair of braces creates an empty dictionary:
`{}'.  Placing a comma-separated list of key:value pairs within the
braces adds initial key:value pairs to the dictionary; this is also the
way dictionaries are written on output.

   The main operations on a dictionary are storing a value with some key
and extracting the value given the key.  It is also possible to delete
a key:value pair with `del'.  If you store using a key that is already
in use, the old value associated with that key is forgotten.  It is an
error to extract a value using a non-existent key.

   The `keys()' method of a dictionary object returns a list of all the
keys used in the dictionary, in arbitrary order (if you want it sorted,
just apply the `sort()' method to the list of keys).  To check whether
a single key is in the dictionary, use the `has_key()' method of the
dictionary.

   Here is a small example using a dictionary:

     >>> tel = {'jack': 4098, 'sape': 4139}
     >>> tel['guido'] = 4127
     >>> tel
     {'sape': 4139, 'guido': 4127, 'jack': 4098}
     >>> tel['jack']
     4098
     >>> del tel['sape']
     >>> tel['irv'] = 4127
     >>> tel
     {'guido': 4127, 'irv': 4127, 'jack': 4098}
     >>> tel.keys()
     ['guido', 'irv', 'jack']
     >>> tel.has_key('guido')
     1


File: python-tut.info,  Node: More on Conditions,  Next: Comparing Sequences and Other Types,  Prev: Dictionaries,  Up: Data Structures

More on Conditions
==================

   The conditions used in `while' and `if' statements above can contain
other operators besides comparisons.

   The comparison operators `in' and `not in' check whether a value
occurs (does not occur) in a sequence.  The operators `is' and `is not'
compare whether two objects are really the same object; this only
matters for mutable objects like lists.  All comparison operators have
the same priority, which is lower than that of all numerical operators.

   Comparisons can be chained: e.g., `a < b == c' tests whether `a' is
less than `b' and moreover `b' equals `c'.

   Comparisons may be combined by the Boolean operators `and' and `or',
and the outcome of a comparison (or of any other Boolean expression)
may be negated with `not'.  These all have lower priorities than
comparison operators again; between them, `not' has the highest
priority, and `or' the lowest, so that `A and not B or C' is equivalent
to `(A and (not B)) or C'.  Of course, parentheses can be used to
express the desired composition.
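
   The stated equivalence can be checked mechanically by trying every
combination of truth values.  This is only a sketch for a current
Python 3 interpreter; the names `mismatches' and `value' are
illustrative:

```python
mismatches = []
for A in (0, 1):
    for B in (0, 1):
        for C in (0, 1):
            implicit = A and not B or C
            explicit = (A and (not B)) or C
            if implicit != explicit:
                mismatches.append((A, B, C))

# `not` binds more loosely than comparison, which binds more loosely
# than arithmetic, so this reads as: not ((1 + 1) == 2)
value = not 1 + 1 == 2
```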

   The Boolean operators `and' and `or' are so-called *shortcut*
operators: their arguments are evaluated from left to right, and
evaluation stops as soon as the outcome is determined.  E.g., if `A'
and `C' are true but `B' is false, `A and B and C' does not evaluate
the expression `C'.  In general, the return value of a shortcut operator,
when used as a general value and not as a Boolean, is the last
evaluated argument.
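
   One way to watch the shortcut behaviour is to record which operands
actually get evaluated.  The helper `check' and the list `calls' are
hypothetical names used only in this sketch, which assumes a current
Python 3 interpreter:

```python
calls = []

def check(name, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(name)
    return value

# B is false, so evaluation stops there and C is never touched.
result = check('A', 1) and check('B', 0) and check('C', 1)
```

   Afterwards `calls' holds only `'A'' and `'B'', and `result' is the
value of the last operand evaluated, namely 0.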

   It is possible to assign the result of a comparison or other Boolean
expression to a variable.  For example,

     >>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
     >>> non_null = string1 or string2 or string3
     >>> non_null
     'Trondheim'

   Note that in Python, unlike C, assignment cannot occur inside
expressions.


File: python-tut.info,  Node: Comparing Sequences and Other Types,  Prev: More on Conditions,  Up: Data Structures

Comparing Sequences and Other Types
===================================

   Sequence objects may be compared to other objects with the same
sequence type.  The comparison uses *lexicographical* ordering: first
the first two items are compared, and if they differ this determines
the outcome of the comparison; if they are equal, the next two items
are compared, and so on, until either sequence is exhausted.  If two
items to be compared are themselves sequences of the same type, the
lexicographical comparison is carried out recursively.  If all items of
two sequences compare equal, the sequences are considered equal.  If
one sequence is an initial subsequence of the other, the shorter
sequence is the smaller one.  Lexicographical ordering for strings uses
the ASCII ordering for individual characters.  Some examples of
comparisons between sequences with the same types:

     (1, 2, 3)              < (1, 2, 4)
     [1, 2, 3]              < [1, 2, 4]
     'ABC' < 'C' < 'Pascal' < 'Python'
     (1, 2, 3, 4)           < (1, 2, 4)
     (1, 2)                 < (1, 2, -1)
     (1, 2, 3)             == (1.0, 2.0, 3.0)
     (1, 2, ('aa', 'ab'))   < (1, 2, ('abc', 'a'), 4)
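
   The ordering rule can be written out as a short function.  The name
`lex_less' is hypothetical and the code is a sketch for a current
Python 3 interpreter, covering only sequences of the same type as
described above; it is not how the interpreter itself is implemented:

```python
def lex_less(a, b):
    """Return true if sequence a sorts before sequence b lexicographically."""
    for x, y in zip(a, b):
        if x == y:
            continue                    # equal so far, look at the next pair
        if isinstance(x, (list, tuple)) and type(x) is type(y):
            return lex_less(x, y)       # nested sequences compare recursively
        return x < y                    # the first differing pair decides
    # One sequence is an initial subsequence of the other (or they are equal):
    # the shorter one is the smaller one.
    return len(a) < len(b)
```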

   Note that comparing objects of different types is legal.  The outcome
is deterministic but arbitrary: the types are ordered by their name.
Thus, a list is always smaller than a string, a string is always
smaller than a tuple, etc.  Mixed numeric types are compared according
to their numeric value, so 0 equals 0.0, etc.(1)

   ---------- Footnotes ----------

   (1)  The rules for comparing objects of different types should not
be relied upon; they may change in a future version of the language.


File: python-tut.info,  Node: Modules,  Next: Input and Output,  Prev: Data Structures,  Up: Top

Modules
*******

   If you quit from the Python interpreter and enter it again, the
definitions you have made (functions and variables) are lost.
Therefore, if you want to write a somewhat longer program, you are
better off using a text editor to prepare the input for the interpreter
and running it with that file as input instead.  This is known as
creating a *script*.  As your program gets longer, you may want to
split it into several files for easier maintenance.  You may also want
to use a handy function that you've written in several programs without
copying its definition into each program.

   To support this, Python has a way to put definitions in a file and
use them in a script or in an interactive instance of the interpreter.
Such a file is called a *module*; definitions from a module can be
*imported* into other modules or into the *main* module (the collection
of variables that you have access to in a script executed at the top
level and in calculator mode).

   A module is a file containing Python definitions and statements.  The
file name is the module name with the suffix `.py' appended.  Within a
module, the module's name (as a string) is available as the value of
the global variable `__name__'.  For instance, use your favorite text
editor to create a file called `fibo.py' in the current directory with
the following contents:

     # Fibonacci numbers module
     
     def fib(n):    # write Fibonacci series up to n
         a, b = 0, 1
         while b < n:
             print b,
             a, b = b, a+b
     
     def fib2(n): # return Fibonacci series up to n
         result = []
         a, b = 0, 1
         while b < n:
             result.append(b)
             a, b = b, a+b
         return result

   Now enter the Python interpreter and import this module with the
following command:

     >>> import fibo

   This does not enter the names of the functions defined in `fibo'
directly in the current symbol table; it only enters the module name
`fibo' there.  Using the module name you can access the functions:

     >>> fibo.fib(1000)
     1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
     >>> fibo.fib2(100)
     [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
     >>> fibo.__name__
     'fibo'

   If you intend to use a function often you can assign it to a local
name:

     >>> fib = fibo.fib
     >>> fib(500)
     1 1 2 3 5 8 13 21 34 55 89 144 233 377

* Menu:

* More on Modules::
* Standard Modules::
* dir Function::
* Packages::


File: python-tut.info,  Node: More on Modules,  Next: Standard Modules,  Prev: Modules,  Up: Modules

More on Modules
===============

   A module can contain executable statements as well as function
definitions.  These statements are intended to initialize the module.
They are executed only the *first* time the module is imported
somewhere.(1)

   Each module has its own private symbol table, which is used as the
global symbol table by all functions defined in the module.  Thus, the
author of a module can use global variables in the module without
worrying about accidental clashes with a user's global variables.  On
the other hand, if you know what you are doing you can touch a module's
global variables with the same notation used to refer to its functions,
`modname.itemname'.

   Modules can import other modules.  It is customary but not required
to place all `import' statements at the beginning of a module (or
script, for that matter).  The imported module names are placed in the
importing module's global symbol table.

   There is a variant of the `import' statement that imports names from
a module directly into the importing module's symbol table.  For
example:

     >>> from fibo import fib, fib2
     >>> fib(500)
     1 1 2 3 5 8 13 21 34 55 89 144 233 377

   This does not introduce the module name from which the imports are
taken in the local symbol table (so in the example, `fibo' is not
defined).

   There is even a variant to import all names that a module defines:

     >>> from fibo import *
     >>> fib(500)
     1 1 2 3 5 8 13 21 34 55 89 144 233 377

   This imports all names except those beginning with an underscore
(`_').

* Menu:

* Module Search Path::
* Compiled Python files::

   ---------- Footnotes ----------

   (1)  In fact function definitions are also `statements' that are
`executed'; the execution enters the function name in the module's
global symbol table.


File: python-tut.info,  Node: Module Search Path,  Next: Compiled Python files,  Prev: More on Modules,  Up: More on Modules

The Module Search Path
----------------------

   When a module named `spam' is imported, the interpreter searches for
a file named `spam.py' in the current directory, and then in the list
of directories specified by the environment variable `PYTHONPATH'.
This has the same syntax as the shell variable `PATH', i.e., a list of
directory names.  When `PYTHONPATH' is not set, or when the file is not
found there, the search continues in an installation-dependent default
path; on UNIX, this is usually `.:/usr/local/lib/python'.

   Actually, modules are searched in the list of directories given by
the variable `sys.path' which is initialized from the directory
containing the input script (or the current directory), `PYTHONPATH'
and the installation-dependent default.  This allows Python programs
that know what they're doing to modify or replace the module search
path.  See the section on Standard Modules later.
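
   The search order can be approximated by a loop over a list of
directories.  The function `find_module_file' is a hypothetical helper
for illustration; the real import machinery does more (for example it
also handles built-in and compiled modules).  The sketch assumes a
current Python 3 interpreter:

```python
import os
import sys

def find_module_file(name, path=None):
    """Return the first NAME.py found along path, or None if there is none."""
    if path is None:
        path = sys.path
    for directory in path:
        # An empty entry stands for the current directory.
        candidate = os.path.join(directory or os.curdir, name + ".py")
        if os.path.isfile(candidate):
            return candidate
    return None
```

   Because the first match wins, a directory placed earlier in the list
hides modules of the same name in later directories.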


File: python-tut.info,  Node: Compiled Python files,  Prev: Module Search Path,  Up: More on Modules

"Compiled" Python files
-----------------------

   As an important speed-up of the start-up time for short programs that
use a lot of standard modules, if a file called `spam.pyc' exists in
the directory where `spam.py' is found, this is assumed to contain an
already-"byte-compiled" version of the module `spam'.  The modification
time of the version of `spam.py' used to create `spam.pyc' is recorded
in `spam.pyc', and the file is ignored if these don't match.

   Normally, you don't need to do anything to create the `spam.pyc'
file.  Whenever `spam.py' is successfully compiled, an attempt is made
to write the compiled version to `spam.pyc'.  It is not an error if
this attempt fails; if for any reason the file is not written
completely, the resulting `spam.pyc' file will be recognized as invalid
and thus ignored later.  The contents of the `spam.pyc' file are
platform independent, so a Python module directory can be shared by
machines of different architectures.

   Some tips for experts:

   * When the Python interpreter is invoked with the `-O' flag,
     optimized code is generated and stored in `.pyo' files.  The
     optimizer currently doesn't help much; it only removes `assert'
     statements and `SET_LINENO' instructions.  When `-O' is used,
     *all* bytecode is optimized; `.pyc' files are ignored and `.py'
     files are compiled to optimized bytecode.

   * Passing two `-O' flags to the Python interpreter (`-OO') will
     cause the bytecode compiler to perform optimizations that could in
     some rare cases result in malfunctioning programs.  Currently only
     `__doc__' strings are removed from the bytecode, resulting in more
     compact `.pyo' files.  Since some programs may rely on having
     these available, you should only use this option if you know what
     you're doing.

   * A program doesn't run any faster when it is read from a `.pyc' or
     `.pyo' file than when it is read from a `.py' file; the only thing
     that's faster about `.pyc' or `.pyo' files is the speed with which
     they are loaded.

   * When a script is run by giving its name on the command line, the
     bytecode for the script is never written to a `.pyc' or `.pyo'
     file.  Thus, the startup time of a script may be reduced by moving
     most of its code to a module and having a small bootstrap script
     that imports that module.

   * It is possible to have a file called `spam.pyc' (or `spam.pyo'
     when `-O' is used) without a module `spam.py' in the same directory.
     This can be used to distribute a library of Python code in a form
     that is moderately hard to reverse engineer.

   * The module `compileall' can create `.pyc' files (or `.pyo' files
     when `-O' is used) for all modules in a directory.


File: python-tut.info,  Node: Standard Modules,  Next: dir Function,  Prev: More on Modules,  Up: Modules

Standard Modules
================

   Python comes with a library of standard modules, described in a
separate document, the *Python Library Reference* ("Library Reference"
hereafter).  Some modules are built into the interpreter; these provide
access to operations that are not part of the core of the language but
are nevertheless built in, either for efficiency or to provide access
to operating system primitives such as system calls.  The set of such
modules is a configuration option; e.g., the `amoeba' module is only
provided on systems that somehow support Amoeba primitives.  One
particular module deserves some attention: `sys', which is built into
every Python interpreter.  The variables `sys.ps1' and `sys.ps2' define
the strings used as primary and secondary prompts:

     >>> import sys
     >>> sys.ps1
     '>>> '
     >>> sys.ps2
     '... '
     >>> sys.ps1 = 'C> '
     C> print 'Yuck!'
     Yuck!
     C>

   These two variables are only defined if the interpreter is in
interactive mode.

   The variable `sys.path' is a list of strings that determine the
interpreter's search path for modules. It is initialized to a default
path taken from the environment variable `PYTHONPATH', or from a
built-in default if `PYTHONPATH' is not set.  You can modify it using
standard list operations, e.g.:

     >>> import sys
     >>> sys.path.append('/ufs/guido/lib/python')


File: python-tut.info,  Node: dir Function,  Next: Packages,  Prev: Standard Modules,  Up: Modules

The `dir()' Function
====================

   The built-in function `dir()' is used to find out which names a
module defines.  It returns a sorted list of strings:

     >>> import fibo, sys
     >>> dir(fibo)
     ['__name__', 'fib', 'fib2']
     >>> dir(sys)
     ['__name__', 'argv', 'builtin_module_names', 'copyright', 'exit',
     'maxint', 'modules', 'path', 'ps1', 'ps2', 'setprofile', 'settrace',
     'stderr', 'stdin', 'stdout', 'version']

   Without arguments, `dir()' lists the names you have defined
currently:

     >>> a = [1, 2, 3, 4, 5]
     >>> import fibo, sys
     >>> fib = fibo.fib
     >>> dir()
     ['__name__', 'a', 'fib', 'fibo', 'sys']

   Note that it lists all types of names: variables, modules,
functions, etc.

   `dir()' does not list the names of built-in functions and variables.
If you want a list of those, they are defined in the standard module
`__builtin__':

     >>> import __builtin__
     >>> dir(__builtin__)
     ['AccessError', 'AttributeError', 'ConflictError', 'EOFError', 'IOError',
     'ImportError', 'IndexError', 'KeyError', 'KeyboardInterrupt',
     'MemoryError', 'NameError', 'None', 'OverflowError', 'RuntimeError',
     'SyntaxError', 'SystemError', 'SystemExit', 'TypeError', 'ValueError',
     'ZeroDivisionError', '__name__', 'abs', 'apply', 'chr', 'cmp', 'coerce',
     'compile', 'dir', 'divmod', 'eval', 'execfile', 'filter', 'float',
     'getattr', 'hasattr', 'hash', 'hex', 'id', 'input', 'int', 'len', 'long',
     'map', 'max', 'min', 'oct', 'open', 'ord', 'pow', 'range', 'raw_input',
     'reduce', 'reload', 'repr', 'round', 'setattr', 'str', 'type', 'xrange']


File: python-tut.info,  Node: Packages,  Prev: dir Function,  Up: Modules

Packages
========

   Packages are a way of structuring Python's module namespace by using
"dotted module names".  For example, the module name `A.B' designates a
submodule named `B' in a package named `A'.  Just like the use of
modules saves the authors of different modules from having to worry
about each other's global variable names, the use of dotted module
names saves the authors of multi-module packages like NumPy or PIL from
having to worry about each other's module names.

   Suppose you want to design a collection of modules (a "package") for
the uniform handling of sound files and sound data.  There are many
different sound file formats (usually recognized by their extension,
e.g. `.wav', `.aiff', `.au'), so you may need to create and maintain a
growing collection of modules for the conversion between the various
file formats.  There are also many different operations you might want
to perform on sound data (e.g. mixing, adding echo, applying an
equalizer function, creating an artificial stereo effect), so in
addition you will be writing a never-ending stream of modules to
perform these operations.  Here's a possible structure for your package
(expressed in terms of a hierarchical filesystem):

     Sound/                          Top-level package
           __init__.py               Initialize the sound package
           Formats/                  Subpackage for file format conversions
                   __init__.py
                   wavread.py
                   wavwrite.py
                   aiffread.py
                   aiffwrite.py
                   auread.py
                   auwrite.py
                   ...
           Effects/                  Subpackage for sound effects
                   __init__.py
                   echo.py
                   surround.py
                   reverse.py
                   ...
           Filters/                  Subpackage for filters
                   __init__.py
                   equalizer.py
                   vocoder.py
                   karaoke.py
                   ...

   The `__init__.py' files are required to make Python treat the
directories as containing packages; this is done to prevent directories
with a common name, such as `string', from unintentionally hiding valid
modules that occur later on the module search path. In the simplest
case, `__init__.py' can just be an empty file, but it can also execute
initialization code for the package or set the `__all__' variable,
described later.

   Users of the package can import individual modules from the package,
for example:

     import Sound.Effects.echo

   This loads the submodule `Sound.Effects.echo'.  It must be referenced
with its full name, e.g.

     Sound.Effects.echo.echofilter(input, output, delay=0.7, atten=4)

   An alternative way of importing the submodule is:

     from Sound.Effects import echo

   This also loads the submodule `echo', and makes it available without
its package prefix, so it can be used as follows:

     echo.echofilter(input, output, delay=0.7, atten=4)

   Yet another variation is to import the desired function or variable
directly:

     from Sound.Effects.echo import echofilter

   Again, this loads the submodule `echo', but this makes its function
`echofilter' directly available:

     echofilter(input, output, delay=0.7, atten=4)

   Note that when using `from PACKAGE import ITEM', the item can be
either a submodule (or subpackage) of the package, or some other name
defined in the package, like a function, class or variable.  The
`import' statement first tests whether the item is defined in the
package; if not, it assumes it is a module and attempts to load it.  If
it fails to find it, `ImportError' is raised.

   Conversely, when using syntax like `import ITEM.SUBITEM.SUBSUBITEM',
each item except for the last must be a package; the last item can be a
module or a package but can't be a class or function or variable
defined in the previous item.
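
   To try the three import forms without a real sound library, one can
build a throwaway copy of part of the layout in a temporary directory.
Everything in this sketch is an assumption made for illustration: the
body of `echofilter' is a placeholder, and the code targets a current
Python 3 interpreter (it uses `tempfile' and `importlib'):

```python
import importlib
import os
import sys
import tempfile

# Create Sound/Effects/echo.py with the required __init__.py files.
root = tempfile.mkdtemp()
effects = os.path.join(root, "Sound", "Effects")
os.makedirs(effects)
for directory in (os.path.join(root, "Sound"), effects):
    open(os.path.join(directory, "__init__.py"), "w").close()
with open(os.path.join(effects, "echo.py"), "w") as f:
    f.write("def echofilter(data, delay=0.7, atten=4):\n"
            "    return ('echo', data, delay, atten)\n")

sys.path.insert(0, root)          # make the new package findable
importlib.invalidate_caches()

import Sound.Effects.echo                  # use as Sound.Effects.echo.echofilter
from Sound.Effects import echo             # use as echo.echofilter
from Sound.Effects.echo import echofilter  # use as echofilter

# The last item of a dotted import must be a module or package,
# so naming a function there fails.
try:
    importlib.import_module("Sound.Effects.echo.echofilter")
    dotted_import_failed = False
except ImportError:
    dotted_import_failed = True
```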

* Menu:

* Importing * From a Package::
* Intra-package References::


File: python-tut.info,  Node: Importing * From a Package,  Next: Intra-package References,  Prev: Packages,  Up: Packages

Importing * From a Package
--------------------------

   Now what happens when the user writes `from Sound.Effects import *'?
Ideally, one would hope that this somehow goes out to the filesystem,
finds which submodules are present in the package, and imports them
all.  Unfortunately, this operation does not work very well on Mac and
Windows platforms, where the filesystem does not always have accurate
information about the case of a filename!  On these platforms, there is
no guaranteed way to know whether a file `ECHO.PY' should be imported
as a module `echo', `Echo' or `ECHO'.  (For example, Windows 95 has the
annoying practice of showing all file names with a capitalized first
letter.)  The DOS 8+3 filename restriction adds another interesting
problem for long module names.

   The only solution is for the package author to provide an explicit
index of the package.  The import statement uses the following
convention: if a package's `__init__.py' code defines a list named
`__all__', it is taken to be the list of module names that should be
imported when `from PACKAGE import *' is encountered.  It is up to the
package author to keep this list up-to-date when a new version of the
package is released.  Package authors may also decide not to support
it, if they don't see a use for importing * from their package.  For
example, the file `Sound/Effects/__init__.py' could contain the
following code:

     __all__ = ["echo", "surround", "reverse"]

   This would mean that `from Sound.Effects import *' would import the
three named submodules of the `Sound.Effects' package.
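
   The effect of `__all__' can be checked with a small experiment.
This is only a sketch: it writes empty stand-in modules to a scratch
directory, adds an extra module `karaoke' (a made-up name) that is not
listed in `__all__', and assumes a present-day Python 3 interpreter.

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
effects = os.path.join(root, 'Sound', 'Effects')
os.makedirs(effects)
files = {
    os.path.join(root, 'Sound', '__init__.py'): '',
    os.path.join(effects, '__init__.py'):
        '__all__ = ["echo", "surround", "reverse"]\n',
    os.path.join(effects, 'echo.py'): '',
    os.path.join(effects, 'surround.py'): '',
    os.path.join(effects, 'reverse.py'): '',
    # Hypothetical extra module: present on disk but not in __all__.
    os.path.join(effects, 'karaoke.py'): '',
}
for path in files:
    f = open(path, 'w')
    f.write(files[path])
    f.close()
sys.path.insert(0, root)

# Run the star import in a fresh namespace so the result is easy
# to inspect.
namespace = {}
exec('from Sound.Effects import *', namespace)
imported = sorted(name for name in namespace
                  if not name.startswith('__'))
# imported == ['echo', 'reverse', 'surround']
```

   Only the three listed submodules arrive in the namespace; `karaoke'
is ignored because the package index does not mention it.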

   If `__all__' is not defined, the statement `from Sound.Effects
import *' does *not* import all submodules from the package
`Sound.Effects' into the current namespace; it only ensures that the
package `Sound.Effects' has been imported (possibly running its
initialization code, `__init__.py') and then imports whatever names are
defined in the package.  This includes any names defined (and
submodules explicitly loaded) by `__init__.py'.  It also includes any
submodules of the package that were explicitly loaded by previous
import statements, e.g.

     import Sound.Effects.echo
     import Sound.Effects.surround
     from Sound.Effects import *

   In this example, the `echo' and `surround' modules are imported into
the current namespace because they are defined in the `Sound.Effects'
package when the `from...import' statement is executed.  (This also
works when `__all__' is defined.)
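
   The behaviour without `__all__' can be sketched the same way.  The
package below has an empty `__init__.py' and two empty stand-in
submodules; only `echo' is loaded before the star import.  The code
assumes a present-day Python 3 interpreter.

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
effects = os.path.join(root, 'Sound', 'Effects')
os.makedirs(effects)
for name in [os.path.join(root, 'Sound', '__init__.py'),
             os.path.join(effects, '__init__.py'),   # no __all__ here
             os.path.join(effects, 'echo.py'),
             os.path.join(effects, 'surround.py')]:
    open(name, 'w').close()
sys.path.insert(0, root)

import Sound.Effects.echo       # explicitly loaded beforehand

namespace = {}
exec('from Sound.Effects import *', namespace)
has_echo = 'echo' in namespace           # true: it was already loaded
has_surround = 'surround' in namespace   # false: it was never loaded
```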

   Note that in general the practice of importing * from a module or
package is frowned upon, since it often makes code hard to read.
However, it is okay to use it to save typing in interactive sessions,
and certain modules are designed to export only names that follow
certain patterns.

   Remember, there is nothing wrong with using `from Package import
specific_submodule'!  In fact, this is the recommended notation unless
the importing module needs to use submodules with the same name from
different packages.


File: python-tut.info,  Node: Intra-package References,  Prev: Importing * From a Package,  Up: Packages

Intra-package References
------------------------

   The submodules often need to refer to each other.  For example, the
`surround' module might use the `echo' module.  In fact, such references
are so common that the `import' statement first looks in the containing
package before looking in the standard module search path.  Thus, the
surround module can simply use `import echo' or `from echo import
echofilter'.  If the imported module is not found in the current
package (the package of which the current module is a submodule), the
`import' statement looks for a top-level module with the given name.

   When packages are structured into subpackages (as with the `Sound'
package in the example), there's no shortcut to refer to submodules of
sibling packages - the full name of the subpackage must be used.  For
example, if the module `Sound.Filters.vocoder' needs to use the `echo'
module in the `Sound.Effects' package, it can use `from Sound.Effects
import echo'.
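
   A sketch of the sibling-package case, using the full name as
described above.  The function bodies and the name `vocode()' are
invented for the illustration, and the code assumes a present-day
Python 3 interpreter.

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
sound = os.path.join(root, 'Sound')
os.makedirs(os.path.join(sound, 'Effects'))
os.makedirs(os.path.join(sound, 'Filters'))
files = {
    os.path.join(sound, '__init__.py'): '',
    os.path.join(sound, 'Effects', '__init__.py'): '',
    os.path.join(sound, 'Filters', '__init__.py'): '',
    # Placeholder implementation of the echo module.
    os.path.join(sound, 'Effects', 'echo.py'):
        'def echofilter(signal):\n'
        '    return "echo(" + signal + ")"\n',
    # vocoder reaches its sibling package through the full name.
    os.path.join(sound, 'Filters', 'vocoder.py'):
        'from Sound.Effects import echo\n'
        '\n'
        'def vocode(signal):\n'
        '    return echo.echofilter(signal)\n',
}
for path in files:
    f = open(path, 'w')
    f.write(files[path])
    f.close()
sys.path.insert(0, root)

from Sound.Filters import vocoder
result = vocoder.vocode('voice')
# result == 'echo(voice)'
```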


File: python-tut.info,  Node: Input and Output,  Next: Errors and Exceptions,  Prev: Modules,  Up: Top

Input and Output
****************

   There are several ways to present the output of a program; data can
be printed in a human-readable form, or written to a file for future
use.  This chapter will discuss some of the possibilities.

* Menu:

* Fancier Output Formatting::
* Reading and Writing Files::


File: python-tut.info,  Node: Fancier Output Formatting,  Next: Reading and Writing Files,  Prev: Input and Output,  Up: Input and Output

Fancier Output Formatting
=========================

   So far we've encountered two ways of writing values: *expression
statements* and the `print' statement.  (A third way is using the
`write()' method of file objects; the standard output file can be
referenced as `sys.stdout'.  See the Library Reference for more
information on this.)

   Often you'll want more control over the formatting of your output
than simply printing space-separated values.  There are two ways to
format your output.  The first way is to do all the string handling
yourself: using string slicing and concatenation operations you can
create any lay-out you can imagine.  The standard module `string'
contains some useful operations for padding strings to a given column
width; these will be discussed shortly.  The second way is to use the
`%' operator with a string as the left argument.  `%' interprets the
left argument as a C `sprintf()'-style format string to be applied to
the right argument, and returns the string resulting from this
formatting operation.

   One question remains, of course: how do you convert values to
strings?  Luckily, Python has a way to convert any value to a string:
pass it to the `repr()' function, or just write the value between
reverse quotes (```').  Some examples:

     >>> x = 10 * 3.14
     >>> y = 200*200
     >>> s = 'The value of x is ' + `x` + ', and y is ' + `y` + '...'
     >>> print s
     The value of x is 31.4, and y is 40000...
     >>> # Reverse quotes work on other types besides numbers:
     ... p = [x, y]
     >>> ps = repr(p)
     >>> ps
     '[31.4, 40000]'
     >>> # Converting a string adds string quotes and backslashes:
     ... hello = 'hello, world\n'
     >>> hellos = `hello`
     >>> print hellos
     'hello, world\012'
     >>> # The argument of reverse quotes may be a tuple:
     ... `x, y, ('spam', 'eggs')`
     "(31.4, 40000, ('spam', 'eggs'))"

   Here are two ways to write a table of squares and cubes:

     >>> import string
     >>> for x in range(1, 11):
     ...     print string.rjust(`x`, 2), string.rjust(`x*x`, 3),
     ...     # Note trailing comma on previous line
     ...     print string.rjust(`x*x*x`, 4)
     ...
      1   1    1
      2   4    8
      3   9   27
      4  16   64
      5  25  125
      6  36  216
      7  49  343
      8  64  512
      9  81  729
     10 100 1000
     >>> for x in range(1,11):
     ...     print '%2d %3d %4d' % (x, x*x, x*x*x)
     ...
      1   1    1
      2   4    8
      3   9   27
      4  16   64
      5  25  125
      6  36  216
      7  49  343
      8  64  512
      9  81  729
     10 100 1000

   (Note that one space between each column was added by the way
`print' works: it always adds spaces between its arguments.)

   This example demonstrates the function `string.rjust()', which
right-justifies a string in a field of a given width by padding it with
spaces on the left.  There are similar functions `string.ljust()' and
`string.center()'.  These functions do not write anything; they just
return a new string.  If the input string is too long, they don't
truncate it, but return it unchanged; this will mess up your column
lay-out but that's usually better than the alternative, which would be
lying about a value.  (If you really want truncation you can always add
a slice operation, as in `string.ljust(x, n)[0:n]'.)
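
   The pad-then-slice idea can be sketched without the `string' module
at all.  The helper name `fit()' is hypothetical; it pads on the right
and then cuts the result to exactly N characters.

```python
def fit(s, n):
    # Pad s with n spaces on the right, then keep the first n
    # characters: short strings are padded, long ones are truncated.
    return (s + ' ' * n)[0:n]

short = fit('spam', 6)              # 'spam  '
clipped = fit('spam and eggs', 6)   # 'spam a'
```

   Truncating this way silently drops data, which is exactly the
"lying about a value" the paragraph above warns about, so it is best
kept for cases where a fixed column width matters more than the
content.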

   There is another function, `string.zfill()', which pads a numeric
string on the left with zeros.  It understands about plus and minus
signs:

     >>> string.zfill('12', 5)
     '00012'
     >>> string.zfill('-3.14', 7)
     '-003.14'
     >>> string.zfill('3.14159265359', 5)
     '3.14159265359'

   Using the `%' operator looks like this:

     >>> import math
     >>> print 'The value of PI is approximately %5.3f.' % math.pi
     The value of PI is approximately 3.142.

   If there is more than one format in the string you pass a tuple as
right operand, e.g.

     >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
     >>> for name, phone in table.items():
     ...     print '%-10s ==> %10d' % (name, phone)
     ...
     Jack       ==>       4098
     Dcab       ==>    8637678
     Sjoerd     ==>       4127

   Most formats work exactly as in C and require that you pass the
proper type; however, if you don't you get an exception, not a core
dump.  The `%s' format is more relaxed: if the corresponding argument is
not a string object, it is converted to string using the `str()'
built-in function.  Using `*' to pass the width or precision in as a
separate (integer) argument is supported.  The C formats `%n' and `%p'
are not supported.
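
   A few small cases illustrate these points.  The values are
arbitrary; the variable names are only for the illustration.

```python
# '*' takes the width or precision from the argument tuple.
padded = '%*d' % (6, 42)             # width 6: '    42'
rounded = '%.*f' % (2, 3.14159)      # precision 2: '3.14'

# %s accepts any object and converts it with str().
relaxed = '%s and %s' % (42, [1, 2])   # '42 and [1, 2]'

# Passing the wrong type to %d raises an exception.
try:
    '%d' % 'spam'
    outcome = 'no error'
except TypeError:
    outcome = 'TypeError'
```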

   If you have a really long format string that you don't want to split
up, it would be nice if you could reference the variables to be
formatted by name instead of by position.  This can be done by using an
extension of C formats using the form `%(name)format', e.g.

     >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
     >>> print 'Jack: %(Jack)d; Sjoerd: %(Sjoerd)d; Dcab: %(Dcab)d' % table
     Jack: 4098; Sjoerd: 4127; Dcab: 8637678

   This is particularly useful in combination with the new built-in
`vars()' function, which returns a dictionary containing all local
variables.
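
   For instance, inside a function `vars()' lets the format string
name the function's own arguments.  The function `describe()' is a
hypothetical example, not part of any library.

```python
def describe(name, phone):
    # vars() with no argument returns the local variables as a
    # dictionary, so the format can refer to them by name.
    return '%(name)-10s ==> %(phone)10d' % vars()

line = describe('Jack', 4098)
# line == 'Jack       ==>       4098'
```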


File: python-tut.info,  Node: Reading and Writing Files,  Prev: Fancier Output Formatting,  Up: Input and Output

Reading and Writing Files
=========================

   `open()' returns a file object, and is most commonly used with two
arguments: `open(FILENAME, MODE)'.

     >>> f=open('/tmp/workfile', 'w')
     >>> print f
     <open file '/tmp/workfile', mode 'w' at 80a0960>

   The first argument is a string containing the filename.  The second
argument is another string containing a few characters describing the
way in which the file will be used.  MODE can be `'r'' when the file
will only be read, `'w'' for only writing (an existing file with the
same name will be erased), and `'a'' opens the file for appending; any
data written to the file is automatically added to the end.  `'r+''
opens the file for both reading and writing.  The MODE argument is
optional; `'r'' will be assumed if it's omitted.
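
   The modes can be tried out on a scratch file.  In this sketch the
file name `workfile' and its temporary directory are placeholders.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')  # scratch file

f = open(path, 'w')      # 'w' creates the file, erasing old contents
f.write('first\n')
f.close()

f = open(path, 'a')      # 'a' adds new data to the end
f.write('second\n')
f.close()

f = open(path)           # MODE omitted, so 'r' is assumed
contents = f.read()
f.close()
# contents == 'first\nsecond\n'
```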

   On Windows and the Macintosh, `'b'' appended to the mode opens the
file in binary mode, so there are also modes like `'rb'', `'wb'', and
`'r+b''.  Windows makes a distinction between text and binary files;
the end-of-line characters in text files are automatically altered
slightly when data is read or written.  This behind-the-scenes
modification to file data is fine for ASCII text files, but it'll
corrupt binary data like that in JPEGs or `.EXE' files.  Be very
careful to use binary mode when reading and writing such files.  (Note
that the precise semantics of text mode on the Macintosh depends on the
underlying C library being used.)

* Menu:

* Methods of File Objects::
* pickle Module::


File: python-tut.info,  Node: Methods of File Objects,  Next: pickle Module,  Prev: Reading and Writing Files,  Up: Reading and Writing Files

Methods of File Objects
-----------------------

   The rest of the examples in this section will assume that a file
object called `f' has already been created.

   To read a file's contents, call `f.read(SIZE)', which reads some
quantity of data and returns it as a string.  SIZE is an optional
numeric argument.  When SIZE is omitted or negative, the entire
contents of the file will be read and returned; it's your problem if
the file is twice as large as your machine's memory.  Otherwise, at
most SIZE bytes are read and returned.  If the end of the file has been
reached, `f.read()' will return an empty string (`""').

     >>> f.read()
     'This is the entire file.\012'
     >>> f.read()
     ''
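
   Passing SIZE is the usual way to process a large file piece by
piece.  The loop below is a sketch with an arbitrary chunk size of 8
and a scratch file created just for the demonstration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')  # scratch file
f = open(path, 'w')
f.write('0123456789' * 3)       # 30 characters of sample data
f.close()

f = open(path)
chunks = []
while 1:
    chunk = f.read(8)           # at most 8 characters per call
    if chunk == '':             # empty string: end of file reached
        break
    chunks.append(chunk)
f.close()
# chunks == ['01234567', '89012345', '67890123', '456789']
```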

   `f.readline()' reads a single line from the file; a newline
character (`\n') is left at the end of the string, and is only omitted
on the last line of the file if the file doesn't end in a newline.
This makes the return value unambiguous; if `f.readline()' returns an
empty string, the end of the file has been reached, while a blank line
is represented by `'\n'', a string containing only a single newline.

     >>> f.readline()
     'This is the first line of the file.\012'
     >>> f.readline()
     'Second line of the file\012'
     >>> f.readline()
     ''
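
   Because only the end of the file yields an empty string, a loop can
read line by line without confusing a blank line with the end.  The
sample contents below are invented for the sketch.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')  # scratch file
f = open(path, 'w')
f.write('first\n\nlast line, no newline')
f.close()

f = open(path)
lines = []
while 1:
    line = f.readline()
    if line == '':              # only true at the end of the file
        break
    lines.append(line)          # a blank line arrives as '\n'
f.close()
# lines == ['first\n', '\n', 'last line, no newline']
```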

   `f.readlines()' uses `f.readline()' repeatedly, and returns a list
containing all the lines of data in the file.

     >>> f.readlines()
     ['This is the first line of the file.\012', 'Second line of the file\012']

   `f.write(STRING)' writes the contents of STRING to the file,
returning `None'.

     >>> f.write('This is a test\n')

   `f.tell()' returns an integer giving the file object's current
position in the file, measured in bytes from the beginning of the file.
To change the file object's position, use `f.seek(OFFSET, FROM_WHAT)'.
The position is computed from adding OFFSET to a reference point; the
reference point is selected by the FROM_WHAT argument.  A FROM_WHAT
value of 0 measures from the beginning of the file, 1 uses the current
file position, and 2 uses the end of the file as the reference point.
FROM_WHAT can be omitted and defaults to 0, using the beginning of the
file as the reference point.

     >>> f=open('/tmp/workfile', 'r+')
     >>> f.write('0123456789abcdef')
     >>> f.seek(5)     # Go to the 6th byte in the file
     >>> f.read(1)
     '5'
     >>> f.seek(-3, 2) # Go to the 3rd byte before the end
     >>> f.read(1)
     'd'
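
   The three reference points can be exercised together.  This sketch
assumes a present-day Python 3 interpreter, where seeking relative to
the current position or the end requires a file opened in binary mode
and the data is therefore written as a byte string.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')  # scratch file
f = open(path, 'w+b')           # binary mode, reading and writing
f.write(b'0123456789abcdef')

f.seek(5)                       # FROM_WHAT defaults to 0: the start
start = f.read(1)               # b'5'
position = f.tell()             # 6, just past the byte that was read

f.seek(2, 1)                    # 2 bytes on from the current position
middle = f.read(1)              # b'8'

f.seek(-3, 2)                   # 3 bytes back from the end
end = f.read(1)                 # b'd'
f.close()
```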

   When you're done with a file, call `f.close()' to close it and free
up any system resources taken up by the open file.  After calling
`f.close()', attempts to use the file object will automatically fail.

     >>> f.close()
     >>> f.read()
     Traceback (innermost last):
       File "<stdin>", line 1, in ?
     ValueError: I/O operation on closed file

   File objects have some additional methods, such as `isatty()' and
`truncate()' which are less frequently used; consult the Library
Reference for a complete guide to file objects.


File: python-tut.info,  Node: pickle Module,  Prev: Methods of File Objects,  Up: Reading and Writing Files

The `pickle' Module
-------------------

   Strings can easily be written to and read from a file.  Numbers take a
bit more effort, since the `read()' method only returns strings, which
will have to be passed to a function like `string.atoi()', which takes
a string like `'123'' and returns its numeric value 123.  However, when
you want to save more complex data types like lists, dictionaries, or
class instances, things get a lot more complicated.
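
   The round trip for a single number can be sketched as follows.  The
built-in `int()' stands in here for `string.atoi()', which is an
assumption that holds on present-day interpreters.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')  # scratch file

f = open(path, 'w')
f.write(str(123) + '\n')        # numbers must be converted to text
f.close()

f = open(path)
text = f.readline()             # comes back as the string '123\n'
f.close()
value = int(text)               # convert back; whitespace is ignored
# value == 123
```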

   Rather than have users be constantly writing and debugging code to
save complicated data types, Python provides a standard module called
`pickle'.  This is an amazing module that can take almost any Python
object (even some forms of Python code!), and convert it to a string
representation; this process is called "pickling".  Reconstructing the
object from the string representation is called "unpickling".  Between
pickling and unpickling, the string representing the object may have
been stored in a file or data, or sent over a network connection to
some distant machine.

   If you have an object `x', and a file object `f' that's been opened
for writing, the simplest way to pickle the object takes only one line
of code:

     pickle.dump(x, f)

   To unpickle the object again, if `f' is a file object which has been
opened for reading:

     x = pickle.load(f)

   (There are other variants of this, used when pickling many objects or
when you don't want to write the pickled data to a file; consult the
complete documentation for `pickle' in the Library Reference.)
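
   A complete round trip might look like the sketch below.  The file
name `data.pkl' and the sample dictionary are invented, and the file
is opened in binary mode, which present-day interpreters require for
pickled data.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.pkl')  # scratch file
x = {'name': 'Sjoerd', 'phones': [4127, 4098]}

f = open(path, 'wb')
pickle.dump(x, f)               # pickle the object into the file
f.close()

f = open(path, 'rb')
y = pickle.load(f)              # rebuild an equal object from the file
f.close()
# y == x, but y is a new object rather than the original one
```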

   `pickle' is the standard way to make Python objects which can be
stored and reused by other programs or by a future invocation of the
same program; the technical term for this is a "persistent" object.
Because `pickle' is so widely used, many authors who write Python
extensions take care to ensure that new data types such as matrices can
be properly pickled and unpickled.


File: python-tut.info,  Node: Errors and Exceptions,  Next: Classes,  Prev: Input and Output,  Up: Top

Errors and Exceptions
*********************

   Until now error messages have only been mentioned in passing, but if
you have tried out the examples you have probably seen some.  There are
(at least) two distinguishable kinds of errors: *syntax errors* and
*exceptions*.

* Menu:

* Syntax Errors::
* Exceptions::
* Handling Exceptions::
* Raising Exceptions::
* User-defined Exceptions::
* Defining Clean-up Actions::


File: python-tut.info,  Node: Syntax Errors,  Next: Exceptions,  Prev: Errors and Exceptions,  Up: Errors and Exceptions

Syntax Errors
=============

   Syntax errors, also known as parsing errors, are perhaps the most
common kind of complaint you get while you are still learning Python:

     >>> while 1 print 'Hello world'
       File "<stdin>", line 1
         while 1 print 'Hello world'
                     ^
     SyntaxError: invalid syntax

   The parser repeats the offending line and displays a little `arrow'
pointing at the earliest point in the line where the error was detected.
The error is caused by (or at least detected at) the token *preceding*
the arrow: in the example, the error is detected at the keyword
`print', since a colon (`:') is missing before it.  File name and line
number are printed so you know where to look in case the input came
from a script.
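
   The same information is available to a program.  As a sketch,
assuming a present-day Python 3 interpreter, the built-in `compile()'
raises `SyntaxError' for the faulty line, and the exception carries the
file name and line number.  The name `example.py' is a placeholder.

```python
source = "while 1 print('Hello world')\n"   # colon missing after 1

try:
    compile(source, 'example.py', 'exec')
    report = None
except SyntaxError as err:
    # filename and lineno say where to look in the offending input.
    report = (err.filename, err.lineno)
# report == ('example.py', 1)
```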

