This is Info file /home/pdm/tmp/Python-1.5.2p1/Doc/ref/python-ref.info,
produced by Makeinfo version 1.68 from the input file ref.texi.

   July 6, 1999			1.5.2


File: python-ref.info,  Node: Top,  Next: Front Matter,  Prev: (dir),  Up: (dir)

Python Reference Manual
***********************

* Menu:

* Front Matter::
* Introduction::
* Lexical analysis::
* Data model::
* Execution model::
* Expressions::
* Simple statements::
* Compound statements::
* Top-level components::
* Module Index::
* Class-Exception-Object Index::
* Function-Method-Variable Index::
* Miscellaneous Index::


File: python-ref.info,  Node: Front Matter,  Next: Introduction,  Prev: Top,  Up: Top

Front Matter
************

   Copyright (C) 1991-1995 by Stichting Mathematisch Centrum,
Amsterdam, The Netherlands.

   All Rights Reserved

   Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the names of Stichting Mathematisch
Centrum or CWI or Corporation for National Research Initiatives or CNRI
not be used in advertising or publicity pertaining to distribution of
the software without specific, written prior permission.

   While CWI is the initial source for this software, a modified version
is made available by the Corporation for National Research Initiatives
(CNRI) at the Internet address `ftp://ftp.python.org'.

   STICHTING MATHEMATISCH CENTRUM AND CNRI DISCLAIM ALL WARRANTIES WITH
REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH
CENTRUM OR CNRI BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL
DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.

     Python is an interpreted, object-oriented, high-level programming
     language with dynamic semantics.  Its high-level built-in data
     structures, combined with dynamic typing and dynamic binding, make
     it very attractive for rapid application development, as well as
     for use as a scripting or glue language to connect existing
     components together.  Python's simple, easy to learn syntax
     emphasizes readability and therefore reduces the cost of program
     maintenance.  Python supports modules and packages, which
     encourages program modularity and code reuse.  The Python
     interpreter and the extensive standard library are available in
     source or binary form without charge for all major platforms, and
     can be freely distributed.

     This reference manual describes the syntax and "core semantics" of
     the language.  It is terse, but attempts to be exact and complete.
     The semantics of non-essential built-in object types and of the
     built-in functions and modules are described in the *Python
     Library Reference*.  For an informal introduction to the language,
     see the *Python Tutorial*.  For C or C++ programmers, two
     additional manuals exist: *Extending and Embedding the Python
     Interpreter* describes the high-level picture of how to write a
     Python extension module, and the *Python/C API Reference Manual*
     describes the interfaces available to C/C++ programmers in detail.



File: python-ref.info,  Node: Introduction,  Next: Lexical analysis,  Prev: Front Matter,  Up: Top

Introduction
************

   This reference manual describes the Python programming language.  It
is not intended as a tutorial.

   While I am trying to be as precise as possible, I chose to use
English rather than formal specifications for everything except syntax
and lexical analysis.  This should make the document more understandable
to the average reader, but will leave room for ambiguities.
Consequently, if you were coming from Mars and tried to re-implement
Python from this document alone, you might have to guess things and in
fact you would probably end up implementing quite a different language.
On the other hand, if you are using Python and wonder what the precise
rules about a particular area of the language are, you should
definitely be able to find them here.  If you would like to see a more
formal definition of the language, maybe you could volunteer your
time -- or invent a cloning machine :-).

   It is dangerous to add too many implementation details to a language
reference document -- the implementation may change, and other
implementations of the same language may work differently.  On the
other hand, there is currently only one Python implementation in
widespread use (although a second one now exists!), and its particular
quirks are sometimes worth mentioning, especially where the
implementation imposes additional limitations.  Therefore, you'll find
short "implementation notes" sprinkled throughout the text.

   Every Python implementation comes with a number of built-in and
standard modules.  These are not documented here, but in the separate
*Python Library Reference* document.  A few built-in modules are
mentioned when they interact in a significant way with the language
definition.

* Menu:

* Notation::


File: python-ref.info,  Node: Notation,  Prev: Introduction,  Up: Introduction

Notation
========

   The descriptions of lexical analysis and syntax use a modified BNF
grammar notation.  This uses the following style of definition:

     name:           lc_letter (lc_letter | "_")*
     lc_letter:      "a"..."z"

   The first line says that a `name' is an `lc_letter' followed by a
sequence of zero or more `lc_letter's and underscores.  An `lc_letter'
in turn is any of the single characters `a' through `z'.  (This rule is
actually adhered to for the names defined in lexical and grammar rules
in this document.)
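
   As an illustration only, the `name' rule above can be transcribed
into a regular expression.  The sketch below assumes a modern Python 3
interpreter and its `re' module, neither of which is described in this
manual; the helper `is_name' is hypothetical.

```python
import re

# name:      lc_letter (lc_letter | "_")*
# lc_letter: "a"..."z"
NAME_RULE = re.compile(r"[a-z][a-z_]*")

def is_name(text):
    """Return True if text matches the `name' rule (hypothetical helper)."""
    return NAME_RULE.fullmatch(text) is not None

print(is_name("lc_letter"))   # a rule name used in this document
print(is_name("_private"))    # begins with an underscore, so not a `name'
```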

   Each rule begins with a name (which is the name defined by the rule)
and a colon.  A vertical bar (`|') is used to separate alternatives; it
is the least binding operator in this notation.  A star (`*') means
zero or more repetitions of the preceding item; likewise, a plus (`+')
means one or more repetitions, and a phrase enclosed in square brackets
(`[ ]') means zero or one occurrences (in other words, the enclosed
phrase is optional).  The `*' and `+' operators bind as tightly as
possible; parentheses are used for grouping.  Literal strings are
enclosed in quotes.  White space is only meaningful to separate tokens.
Rules are normally contained on a single line; rules with many
alternatives may instead be formatted with each line after the first
beginning with a vertical bar.

   In lexical definitions (as in the example above), two more
conventions are used: two literal characters separated by three dots
mean a choice of any single character in the given (inclusive) range of
ASCII characters.  A phrase between angle brackets (`<...>') gives an
informal description of the symbol defined; e.g., this could be used to
describe the notion of `control character' if needed.

   Even though the notation used is almost the same, there is a big
difference between the meaning of lexical and syntactic definitions: a
lexical definition operates on the individual characters of the input
source, while a syntax definition operates on the stream of tokens
generated by the lexical analysis.  All uses of BNF in the next chapter
("Lexical Analysis") are lexical definitions; uses in subsequent
chapters are syntactic definitions.


File: python-ref.info,  Node: Lexical analysis,  Next: Data model,  Prev: Introduction,  Up: Top

Lexical analysis
****************

   A Python program is read by a *parser*.  Input to the parser is a
stream of *tokens*, generated by the *lexical analyzer*.  This chapter
describes how the lexical analyzer breaks a file into tokens.

   Python uses the 7-bit ASCII character set for program text and string
literals. 8-bit characters may be used in string literals and comments
but their interpretation is platform dependent; the proper way to
insert 8-bit characters in string literals is by using octal or
hexadecimal escape sequences.

   The run-time character set depends on the I/O devices connected to
the program but is generally a superset of ASCII.

   *Future compatibility note:* It may be tempting to assume that the
character set for 8-bit characters is ISO Latin-1 (an ASCII superset
that covers most western languages that use the Latin alphabet), but it
is possible that in the future Unicode text editors will become common.
These generally use the UTF-8 encoding, which is also an ASCII
superset, but with very different use for the characters with ordinals
128-255.  While there is no consensus on this subject yet, it is unwise
to assume either Latin-1 or UTF-8, even though the current
implementation appears to favor Latin-1.  This applies both to the
source character set and the run-time character set.

* Menu:

* Line structure::
* Other tokens::
* Identifiers and keywords::
* Literals::
* Operators::
* Delimiters::


File: python-ref.info,  Node: Line structure,  Next: Other tokens,  Prev: Lexical analysis,  Up: Lexical analysis

Line structure
==============

   A Python program is divided into a number of *logical lines*.

* Menu:

* Logical lines::
* Physical lines::
* Comments::
* Explicit line joining::
* Implicit line joining::
* Blank lines blank line::
* Indentation::
* Whitespace between tokens::


File: python-ref.info,  Node: Logical lines,  Next: Physical lines,  Prev: Line structure,  Up: Line structure

Logical lines
-------------

   The end of a logical line is represented by the token NEWLINE.
Statements cannot cross logical line boundaries except where NEWLINE is
allowed by the syntax (e.g., between statements in compound statements).
A logical line is constructed from one or more *physical lines* by
following the explicit or implicit *line joining* rules.


File: python-ref.info,  Node: Physical lines,  Next: Comments,  Prev: Logical lines,  Up: Line structure

Physical lines
--------------

   A physical line ends in whatever the current platform's convention is
for terminating lines.  On UNIX, this is the ASCII LF (linefeed)
character.  On DOS/Windows, it is the ASCII sequence CR LF (return
followed by linefeed).  On Macintosh, it is the ASCII CR (return)
character.


File: python-ref.info,  Node: Comments,  Next: Explicit line joining,  Prev: Physical lines,  Up: Line structure

Comments
--------

   A comment starts with a hash character (`#') that is not part of a
string literal, and ends at the end of the physical line.  A comment
signifies the end of the logical line unless the implicit line joining
rules are invoked.  Comments are ignored by the syntax; they are not
tokens.


File: python-ref.info,  Node: Explicit line joining,  Next: Implicit line joining,  Prev: Comments,  Up: Line structure

Explicit line joining
---------------------

   Two or more physical lines may be joined into logical lines using
backslash characters (`\'), as follows: when a physical line ends in a
backslash that is not part of a string literal or comment, it is joined
with the following line, forming a single logical line, deleting the
backslash and the following end-of-line character.  For example:

     if 1900 < year < 2100 and 1 <= month <= 12 \
        and 1 <= day <= 31 and 0 <= hour < 24 \
        and 0 <= minute < 60 and 0 <= second < 60:   # Looks like a valid date
             return 1

   A line ending in a backslash cannot carry a comment.  A backslash
does not continue a comment.  A backslash does not continue a token
except for string literals (i.e., tokens other than string literals
cannot be split across physical lines using a backslash).  A backslash
is illegal elsewhere on a line outside a string literal.


File: python-ref.info,  Node: Implicit line joining,  Next: Blank lines blank line,  Prev: Explicit line joining,  Up: Line structure

Implicit line joining
---------------------

   Expressions in parentheses, square brackets or curly braces can be
split over more than one physical line without using backslashes.  For
example:

     month_names = ['Januari', 'Februari', 'Maart',      # These are the
                    'April',   'Mei',      'Juni',       # Dutch names
                    'Juli',    'Augustus', 'September',  # for the months
                    'Oktober', 'November', 'December']   # of the year

   Implicitly continued lines can carry comments.  The indentation of
the continuation lines is not important.  Blank continuation lines are
allowed.  There is no NEWLINE token between implicit continuation
lines.  Implicitly continued lines can also occur within triple-quoted
strings (see below); in that case they cannot carry comments.
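
   The effect of both joining rules can be observed by counting NEWLINE
tokens.  The sketch below assumes the `tokenize' and `io' modules of a
modern Python 3 standard library, which differ from the 1.5.2 library;
the helper `count_logical_lines' is hypothetical.

```python
import io
import tokenize

def count_logical_lines(source):
    """Count NEWLINE tokens, i.e. logical lines, in a source string."""
    readline = io.StringIO(source).readline
    return sum(1 for tok in tokenize.generate_tokens(readline)
               if tok.type == tokenize.NEWLINE)

# Two physical lines joined explicitly with a backslash.
explicit = "total = 1 + \\\n        2\n"

# Three physical lines (one of them blank) joined implicitly by brackets;
# the continuation lines carry comments.
implicit = "names = ['a',   # first\n\n         'b']   # second\n"

print(count_logical_lines(explicit), count_logical_lines(implicit))
```

   Each sample spans several physical lines but yields a single NEWLINE
token, since no NEWLINE is generated between joined lines.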


File: python-ref.info,  Node: Blank lines blank line,  Next: Indentation,  Prev: Implicit line joining,  Up: Line structure

Blank lines
-----------

   A logical line that contains only spaces, tabs, formfeeds and
possibly a comment, is ignored (i.e., no NEWLINE token is generated).
During interactive input of statements, handling of a blank line may
differ depending on the implementation of the read-eval-print loop.  In
the standard implementation, an entirely blank logical line (i.e. one
containing not even whitespace or a comment) terminates a multi-line
statement.


File: python-ref.info,  Node: Indentation,  Next: Whitespace between tokens,  Prev: Blank lines blank line,  Up: Line structure

Indentation
-----------

   Leading whitespace (spaces and tabs) at the beginning of a logical
line is used to compute the indentation level of the line, which in
turn is used to determine the grouping of statements.

   First, tabs are replaced (from left to right) by one to eight spaces
such that the total number of characters up to and including the
replacement is a multiple of eight (this is intended to be the same
rule as used by UNIX).  The total number of spaces preceding the first
non-blank character then determines the line's indentation.
Indentation cannot be split over multiple physical lines using
backslashes; the whitespace up to the first backslash determines the
indentation.
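
   The tab rule can be sketched with the string method `expandtabs',
which pads each tab to the next multiple of the given column width.
The helper `indentation' is hypothetical and models only the counting
rule; it ignores formfeeds, and a modern Python 3 interpreter (assumed
here) applies additional consistency checks to mixed tabs and spaces.

```python
def indentation(line):
    """Return the indentation of a line, expanding tabs to 8 columns."""
    expanded = line.expandtabs(8)
    # Count the spaces that precede the first non-blank character.
    return len(expanded) - len(expanded.lstrip(" "))

print(indentation("\tx"))       # one tab
print(indentation("   \tx"))    # three spaces, then a tab
print(indentation("\t  x"))     # a tab, then two spaces
```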

   *Cross-platform compatibility note:* because of the nature of text
editors on non-UNIX platforms, it is unwise to use a mixture of spaces
and tabs for the indentation in a single source file.

   A formfeed character may be present at the start of the line; it will
be ignored for the indentation calculations above.  Formfeed characters
occurring elsewhere in the leading whitespace have an undefined effect
(for instance, they may reset the space count to zero).

   The indentation levels of consecutive lines are used to generate
INDENT and DEDENT tokens, using a stack, as follows.

   Before the first line of the file is read, a single zero is pushed on
the stack; this will never be popped off again.  The numbers pushed on
the stack will always be strictly increasing from bottom to top.  At
the beginning of each logical line, the line's indentation level is
compared to the top of the stack.  If it is equal, nothing happens.  If
it is larger, it is pushed on the stack, and one INDENT token is
generated.  If it is smaller, it *must* be one of the numbers occurring
on the stack; all numbers on the stack that are larger are popped off,
and for each number popped off a DEDENT token is generated.  At the end
of the file, a DEDENT token is generated for each number remaining on
the stack that is larger than zero.

   Here is an example of a correctly (though confusingly) indented piece
of Python code:

     def perm(l):
             # Compute the list of all permutations of l
         if len(l) <= 1:
                       return [l]
         r = []
         for i in range(len(l)):
                  s = l[:i] + l[i+1:]
                  p = perm(s)
                  for x in p:
                   r.append(l[i:i+1] + x)
         return r

   The following example shows various indentation errors:

          def perm(l):                       # error: first line indented
         for i in range(len(l)):             # error: not indented
             s = l[:i] + l[i+1:]
                 p = perm(l[:i] + l[i+1:])   # error: unexpected indent
                 for x in p:
                         r.append(l[i:i+1] + x)
                     return r                # error: inconsistent dedent

   (Actually, the first three errors are detected by the parser; only
the last error is found by the lexical analyzer -- the indentation of
`return r' does not match a level popped off the stack.)
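
   The stack algorithm described above can be sketched as follows.
This is a minimal model, not the actual lexical analyzer: it takes a
list of indentation levels, one per logical line, and the function name
`indent_tokens' is hypothetical.

```python
def indent_tokens(levels):
    """Return the INDENT/DEDENT tokens produced for the given levels."""
    stack = [0]                 # the initial zero is never popped
    tokens = []
    for level in levels:
        if level > stack[-1]:
            stack.append(level)
            tokens.append("INDENT")
        elif level < stack[-1]:
            if level not in stack:
                raise ValueError("inconsistent dedent")
            while stack[-1] > level:
                stack.pop()
                tokens.append("DEDENT")
        # An equal level generates nothing.
    while stack[-1] > 0:        # end of file: close every open block
        stack.pop()
        tokens.append("DEDENT")
    return tokens

# Levels of the correctly indented `perm' example, relative to `def'
# (the comment-only line generates no tokens and is left out).
perm_levels = [0, 4, 18, 4, 4, 13, 13, 13, 14, 4]
print(indent_tokens(perm_levels))
```

   For the `perm' example this yields four INDENT and four DEDENT
tokens.  A level that is smaller than the top of the stack but does not
occur on it, such as the final `return r' of the erroneous example, is
rejected.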


File: python-ref.info,  Node: Whitespace between tokens,  Prev: Indentation,  Up: Line structure

Whitespace between tokens
-------------------------

   Except at the beginning of a logical line or in string literals, the
whitespace characters space, tab and formfeed can be used
interchangeably to separate tokens.  Whitespace is needed between two
tokens only if their concatenation could otherwise be interpreted as a
different token (e.g., ab is one token, but a b is two tokens).


File: python-ref.info,  Node: Other tokens,  Next: Identifiers and keywords,  Prev: Line structure,  Up: Lexical analysis

Other tokens
============

   Besides NEWLINE, INDENT and DEDENT, the following categories of
tokens exist: *identifiers*, *keywords*, *literals*, *operators*, and
*delimiters*.  Whitespace characters (other than line terminators,
discussed earlier) are not tokens, but serve to delimit tokens.  Where
ambiguity exists, a token comprises the longest possible string that
forms a legal token, when read from left to right.


File: python-ref.info,  Node: Identifiers and keywords,  Next: Literals,  Prev: Other tokens,  Up: Lexical analysis

Identifiers and keywords
========================

   Identifiers (also referred to as *names*) are described by the
following lexical definitions:

     identifier:     (letter|"_") (letter|digit|"_")*
     letter:         lowercase | uppercase
     lowercase:      "a"..."z"
     uppercase:      "A"..."Z"
     digit:          "0"..."9"

   Identifiers are unlimited in length.  Case is significant.

* Menu:

* Keywords::
* Reserved classes of identifiers::


File: python-ref.info,  Node: Keywords,  Next: Reserved classes of identifiers,  Prev: Identifiers and keywords,  Up: Identifiers and keywords

Keywords
--------

   The following identifiers are used as reserved words, or *keywords*
of the language, and cannot be used as ordinary identifiers.  They must
be spelled exactly as written here:

     and       del       for       is        raise
     assert    elif      from      lambda    return
     break     else      global    not       try
     class     except    if        or        while
     continue  exec      import    pass
     def       finally   in        print
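
   The identifier rule and the keyword list can be combined into a
small check.  This is a sketch assuming a modern Python 3 interpreter;
the names `KEYWORDS_1_5_2' and `is_ordinary_identifier' are
hypothetical, and the keyword set is copied from the table above rather
than taken from the running interpreter.

```python
import re

# The 28 reserved words listed above.
KEYWORDS_1_5_2 = frozenset("""
and assert break class continue def del elif else except exec finally
for from global if import in is lambda not or pass print raise return
try while
""".split())

# identifier: (letter|"_") (letter|digit|"_")*
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_ordinary_identifier(text):
    """True if text is an identifier that is not a reserved word."""
    return (IDENTIFIER.fullmatch(text) is not None
            and text not in KEYWORDS_1_5_2)

print(is_ordinary_identifier("print"))   # a keyword in this version
print(is_ordinary_identifier("Print"))   # case is significant
```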


File: python-ref.info,  Node: Reserved classes of identifiers,  Prev: Keywords,  Up: Identifiers and keywords

Reserved classes of identifiers
-------------------------------

   Certain classes of identifiers (besides keywords) have special
meanings.  These are:

Form                     Meaning                  Notes                    
------                   -----                    -----                    
_*                       Not imported by `from    (1)                      
                         MODULE import *'                                  
__*__                    System-defined name                               
__*                      Class-private name                                
                         mangling                                          

   (XXX need section references here.)

   Note:

`(1)'
     The special identifier `_' is used in the interactive interpreter
     to store the result of the last evaluation; it is stored in the
     `__builtin__' module.  When not in interactive mode, `_' has no
     special meaning and is not defined.
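
   As a brief illustration of the `__*' form: inside a class body, a
name with two leading underscores is stored under a name prefixed with
the class name.  The exact mangled spelling shown in the comment is
stated from general knowledge of Python, not from the table above, and
the class `Box' is hypothetical.

```python
class Box:
    def __init__(self):
        self.__secret = 42      # class-private name

    def reveal(self):
        return self.__secret    # mangled the same way inside the class

box = Box()
stored_names = sorted(vars(box))
print(stored_names)             # the attribute is stored as _Box__secret
```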


File: python-ref.info,  Node: Literals,  Next: Operators,  Prev: Identifiers and keywords,  Up: Lexical analysis

Literals
========

   Literals are notations for constant values of some built-in types.

* Menu:

* String literals::
* String literal concatenation::
* Numeric literals::
* Integer and long integer literals::
* Floating point literals::
* Imaginary literals::


File: python-ref.info,  Node: String literals,  Next: String literal concatenation,  Prev: Literals,  Up: Literals

String literals
---------------

   String literals are described by the following lexical definitions:

     stringliteral:   shortstring | longstring
     shortstring:     "'" shortstringitem* "'" | '"' shortstringitem* '"'
     longstring:      "'''" longstringitem* "'''" | '"""' longstringitem* '"""'
     shortstringitem: shortstringchar | escapeseq
     longstringitem:  longstringchar | escapeseq
     shortstringchar: <any ASCII character except "\" or newline or the quote>
     longstringchar:  <any ASCII character except "\">
     escapeseq:       "\" <any ASCII character>

   In plain English: String literals can be enclosed in matching single
quotes (`'') or double quotes (`"').  They can also be enclosed in
matching groups of three single or double quotes (these are generally
referred to as *triple-quoted strings*).  The backslash (`\') character
is used to escape characters that otherwise have a special meaning,
such as newline, backslash itself, or the quote character.  String
literals may optionally be prefixed with a letter `r' or `R'; such
strings are called raw strings and use different rules for backslash
escape sequences.

   In triple-quoted strings, unescaped newlines and quotes are allowed
(and are retained), except that three unescaped quotes in a row
terminate the string.  (A "quote" is the character used to open the
string, i.e. either `'' or `"'.)

   Unless an `r' or `R' prefix is present, escape sequences in strings
are interpreted according to rules similar to those used by Standard C.
The recognized escape sequences are:

Escape Sequence                      Meaning                              
------                               -----                                
\NEWLINE                             Ignored                              
\\                                   Backslash (`\')                      
\'                                   Single quote (`'')                   
\"                                   Double quote (`"')                   
\a                                   ASCII Bell (BEL)                     
\b                                   ASCII Backspace (BS)                 
\f                                   ASCII Formfeed (FF)                  
\n                                   ASCII Linefeed (LF)                  
\r                                   ASCII Carriage Return (CR)           
\t                                   ASCII Horizontal Tab (TAB)           
\v                                   ASCII Vertical Tab (VT)              
\OOO                                 ASCII character with octal value     
                                     *ooo*                                
\xHH...                              ASCII character with hex value       
                                     *hh...*                              

   In strict compatibility with Standard C, up to three octal digits are
accepted, but an unlimited number of hex digits is taken to be part of
the hex escape (and then the lower 8 bits of the resulting hex number
are used in 8-bit implementations).

   Unlike Standard C, all unrecognized escape sequences are left in the
string unchanged, i.e., *the backslash is left in the string.*  (This
behavior is useful when debugging: if an escape sequence is mistyped,
the resulting output is more easily recognized as broken.)

   When an `r' or `R' prefix is present, backslashes are still used to
quote the following character, but *all backslashes are left in the
string*.  For example, the string literal `r"\n"' consists of two
characters: a backslash and a lowercase `n'.  String quotes can be
escaped with a backslash, but the backslash remains in the string; for
example, `r"\""' is a valid string literal consisting of two
characters: a backslash and a double quote; `r"\"' is not a valid
string literal (even a raw string cannot end in an odd number of
backslashes).  Specifically, *a raw string cannot end in a single
backslash* (since the backslash would escape the following quote
character).
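
   The raw string examples above can be checked directly.  The
variable names are hypothetical, and the sketch assumes that a modern
Python 3 interpreter treats these particular literals the same way.

```python
newline_escape = "\n"     # one character: a linefeed
raw_newline = r"\n"       # two characters: a backslash and a lowercase n
raw_quote = r"\""         # two characters: a backslash and a double quote

print(len(newline_escape), len(raw_newline), len(raw_quote))
```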


File: python-ref.info,  Node: String literal concatenation,  Next: Numeric literals,  Prev: String literals,  Up: Literals

String literal concatenation
----------------------------

   Multiple adjacent string literals (delimited by whitespace), possibly
using different quoting conventions, are allowed, and their meaning is
the same as their concatenation.  Thus, `"hello" 'world'' is equivalent
to `"helloworld"'.  This feature can be used to reduce the number of
backslashes needed, to split long strings conveniently across long
lines, or even to add comments to parts of strings, for example:

     re.compile("[A-Za-z_]"       # letter or underscore
                "[A-Za-z0-9_]*"   # letter, digit or underscore
               )

   Note that this feature is defined at the syntactical level, but
implemented at compile time.  The `+' operator must be used to
concatenate string expressions at run time.  Also note that literal
concatenation can use different quoting styles for each component (even
mixing raw strings and triple quoted strings).
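
   A short sketch of mixed quoting styles, with hypothetical variable
names; the pieces are joined when the code is compiled, so no `+'
operator appears.

```python
greeting = "hello" 'world'

pattern = (r"\d+"         # raw string: one or more digits
           """-"""        # triple-quoted string: a literal hyphen
           '[a-z]')       # single-quoted string: one lowercase letter

print(greeting)
print(pattern)
```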


File: python-ref.info,  Node: Numeric literals,  Next: Integer and long integer literals,  Prev: String literal concatenation,  Up: Literals

Numeric literals
----------------

   There are four types of numeric literals: plain integers, long
integers, floating point numbers, and imaginary numbers.  There are no
complex literals (complex numbers can be formed by adding a real number
and an imaginary number).

   Note that numeric literals do not include a sign; a phrase like `-1'
is actually an expression composed of the unary operator ``-'' and the
literal `1'.


File: python-ref.info,  Node: Integer and long integer literals,  Next: Floating point literals,  Prev: Numeric literals,  Up: Literals

Integer and long integer literals
---------------------------------

   Integer and long integer literals are described by the following
lexical definitions:

     longinteger:    integer ("l"|"L")
     integer:        decimalinteger | octinteger | hexinteger
     decimalinteger: nonzerodigit digit* | "0"
     octinteger:     "0" octdigit+
     hexinteger:     "0" ("x"|"X") hexdigit+
     nonzerodigit:   "1"..."9"
     octdigit:       "0"..."7"
     hexdigit:        digit|"a"..."f"|"A"..."F"

   Although both lower case `l' and upper case `L' are allowed as suffix
for long integers, it is strongly recommended to always use `L', since
the letter `l' looks too much like the digit `1'.

   Plain integer decimal literals must be at most 2147483647 (i.e., the
largest positive integer, using 32-bit arithmetic).  Plain octal and
hexadecimal literals may be as large as 4294967295, but values larger
than 2147483647 are converted to a negative value by subtracting
4294967296.  There is no limit for long integer literals apart from
what can be stored in available memory.

   Some examples of plain and long integer literals:

     7     2147483647                        0177    0x80000000
     3L    79228162514264337593543950336L    0377L   0x100000000L
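
   The conversion rule for large plain octal and hexadecimal literals
is simple arithmetic.  The sketch below models it in a modern Python 3
interpreter (an assumption; there integers are unbounded and the `0177'
and `L' spellings are not accepted, so `int()' with an explicit base is
used instead).  The function name is hypothetical.

```python
def plain_oct_or_hex_value(magnitude):
    """Model the value of a plain octal/hex literal in 32-bit arithmetic."""
    if magnitude > 4294967295:
        raise OverflowError("too large for a plain integer literal")
    if magnitude > 2147483647:
        return magnitude - 4294967296    # wraps to a negative value
    return magnitude

print(plain_oct_or_hex_value(int("0177", 8)))         # octal 0177
print(plain_oct_or_hex_value(int("0x80000000", 16)))  # hex 0x80000000
```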


File: python-ref.info,  Node: Floating point literals,  Next: Imaginary literals,  Prev: Integer and long integer literals,  Up: Literals

Floating point literals
-----------------------

   Floating point literals are described by the following lexical
definitions:

     floatnumber:    pointfloat | exponentfloat
     pointfloat:     [intpart] fraction | intpart "."
     exponentfloat:  (nonzerodigit digit* | pointfloat) exponent
     intpart:        nonzerodigit digit* | "0"
     fraction:       "." digit+
     exponent:       ("e"|"E") ["+"|"-"] digit+

   Note that the integer part of a floating point number cannot look
like an octal integer.  The allowed range of floating point literals is
implementation-dependent.  Some examples of floating point literals:

     3.14    10.    .001    1e100    3.14e-10

   Note that numeric literals do not include a sign; a phrase like `-1'
is actually an expression composed of the operator `-' and the literal
`1'.
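
   The lexical definitions above can be transcribed rule by rule into a
regular expression.  This is a sketch assuming the `re' module of a
modern Python 3 interpreter; the names `FLOATNUMBER' and
`is_floatnumber' are hypothetical, and the pattern follows the grammar
of this section, not that of any later Python version.

```python
import re

INTPART = r"(?:[1-9][0-9]*|0)"
FRACTION = r"\.[0-9]+"
EXPONENT = r"[eE][+-]?[0-9]+"
POINTFLOAT = "(?:" + INTPART + "?" + FRACTION + "|" + INTPART + r"\.)"
EXPONENTFLOAT = "(?:[1-9][0-9]*|" + POINTFLOAT + ")" + EXPONENT
FLOATNUMBER = re.compile("(?:" + EXPONENTFLOAT + "|" + POINTFLOAT + ")")

def is_floatnumber(text):
    """True if text is a floating point literal under the rules above."""
    return FLOATNUMBER.fullmatch(text) is not None

for sample in ["3.14", "10.", ".001", "1e100", "3.14e-10", "012.5"]:
    print(sample, is_floatnumber(sample))
```

   The last sample is rejected because its integer part looks like an
octal integer.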


File: python-ref.info,  Node: Imaginary literals,  Prev: Floating point literals,  Up: Literals

Imaginary literals
------------------

   Imaginary literals are described by the following lexical
definitions:

     imagnumber:     (floatnumber | intpart) ("j"|"J")

   An imaginary literal yields a complex number with a real part of
0.0.  Complex numbers are represented as a pair of floating point
numbers and have the same restrictions on their range.  To create a
complex number with a nonzero real part, add a floating point number to
it, e.g., `(3+4j)'.  Some examples of imaginary literals:

     3.14j   10.j    10j     .001j   1e100j  3.14e-10j
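
   A brief sketch with hypothetical variable names: an imaginary
literal alone has a zero real part, and adding a number to it gives a
complex number with both parts set.

```python
pure = 10j          # imaginary literal: the real part is 0.0
z = 3 + 4j          # an integer added to an imaginary literal

print(pure.real, pure.imag)
print(z.real, z.imag)
```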


File: python-ref.info,  Node: Operators,  Next: Delimiters,  Prev: Literals,  Up: Lexical analysis

Operators
=========

   The following tokens are operators:

     +       -       *       **      /       %
     <<      >>      &       |       ^       ~
     <       >       <=      >=      ==      !=      <>

   The comparison operators `<>' and `!=' are alternate spellings of
the same operator.  `!=' is the preferred spelling; `<>' is obsolescent.


File: python-ref.info,  Node: Delimiters,  Prev: Operators,  Up: Lexical analysis

Delimiters
==========

   The following tokens serve as delimiters in the grammar:

     (       )       [       ]       {       }
     ,       :       .       `       =       ;

   The period can also occur in floating-point and imaginary literals.
A sequence of three periods has a special meaning as an ellipsis in
slices.

   The following printing ASCII characters have special meaning as part
of other tokens or are otherwise significant to the lexical analyzer:

     '       "       #       \

   The following printing ASCII characters are not used in Python.
Their occurrence outside string literals and comments is an
unconditional error:

     @       $       ?


File: python-ref.info,  Node: Data model,  Next: Execution model,  Prev: Lexical analysis,  Up: Top

Data model
**********

* Menu:

* Objects::
* standard type hierarchy::
* Special method names::


File: python-ref.info,  Node: Objects,  Next: standard type hierarchy,  Prev: Data model,  Up: Data model

Objects, values and types
=========================

   "Objects" are Python's abstraction for data.  All data in a Python
program is represented by objects or by relations between objects.  (In
a sense, and in conformance to Von Neumann's model of a "stored program
computer," code is also represented by objects.)

   Every object has an identity, a type and a value.  An object's
*identity* never changes once it has been created; you may think of it
as the object's address in memory.  The ``is'' operator compares the
identity of two objects; the `id()' function returns an integer
representing its identity (currently implemented as its address).  An
object's *type* is also unchangeable.  It determines the operations
that an object supports (e.g., "does it have a length?") and also
defines the possible values for objects of that type.  The `type()'
function returns an object's type (which is an object itself).  The
*value* of some objects can change.  Objects whose value can change are
said to be *mutable*; objects whose value is unchangeable once they are
created are called *immutable*.  (The value of an immutable container
object that contains a reference to a mutable object can change when
the latter's value is changed; however the container is still
considered immutable, because the collection of objects it contains
cannot be changed.  So, immutability is not strictly the same as having
an unchangeable value, it is more subtle.)  An object's mutability is
determined by its type; for instance, numbers, strings and tuples are
immutable, while dictionaries and lists are mutable.
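
   The parenthetical remark about immutable containers can be made
concrete with a tuple that holds a list.  The variable names are
hypothetical.

```python
inner = [1, 2]
container = (inner, "x")        # an immutable tuple holding a mutable list

identity_before = id(container[0])
inner.append(3)                 # changes the list, not the tuple

# The tuple still refers to the very same list object...
same_object = id(container[0]) == identity_before
# ...but its value, as seen through comparison, has changed.
print(same_object, container == ([1, 2, 3], "x"))

try:
    container[0] = []           # rebinding an item of a tuple is refused
except TypeError:
    print("tuples do not support item assignment")
```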

   Objects are never explicitly destroyed; however, when they become
unreachable they may be garbage-collected.  An implementation is
allowed to postpone garbage collection or omit it altogether -- it is a
matter of implementation quality how garbage collection is implemented,
as long as no objects are collected that are still reachable.
(Implementation note: the current implementation uses a
reference-counting scheme which collects most objects as soon as they
become unreachable, but never collects garbage containing circular
references.)

   Note that the use of the implementation's tracing or debugging
facilities may keep objects alive that would normally be collectable.
Also note that catching an exception with a ``try'...`except''
statement may keep objects alive.

   Some objects contain references to "external" resources such as open
files or windows.  It is understood that these resources are freed when
the object is garbage-collected, but since garbage collection is not
guaranteed to happen, such objects also provide an explicit way to
release the external resource, usually a `close()' method.  Programs
are strongly recommended to explicitly close such objects.  The
``try'...`finally'' statement provides a convenient way to do this.
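
   A minimal sketch of this recommendation, using a file object.  The
helper `first_line' is hypothetical, and the demonstration assumes the
`tempfile' and `os' modules of a modern Python 3 standard library.

```python
import os
import tempfile

def first_line(path):
    """Return the first line of a file, always closing the file."""
    f = open(path)
    try:
        return f.readline()
    finally:
        f.close()       # runs whether or not readline() raised

# Demonstration with a throwaway file.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"spam\neggs\n")
finally:
    os.close(fd)
try:
    result = first_line(path)
finally:
    os.remove(path)

print(repr(result))
```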

   Some objects contain references to other objects; these are called
*containers*.  Examples of containers are tuples, lists and
dictionaries.  The references are part of a container's value.  In most
cases, when we talk about the value of a container, we imply the
values, not the identities of the contained objects; however, when we
talk about the mutability of a container, only the identities of the
immediately contained objects are implied.  So, if an immutable
container (like a tuple) contains a reference to a mutable object, its
value changes if that mutable object is changed.

   Types affect almost all aspects of object behavior.  Even the
importance of object identity is affected in some sense: for immutable
types, operations that compute new values may actually return a
reference to any existing object with the same type and value, while
for mutable objects this is not allowed.  E.g., after `a = 1; b = 1',
`a' and `b' may or may not refer to the same object with the value one,
depending on the implementation, but after `c = []; d = []', `c' and `d'
are guaranteed to refer to two different, unique, newly created empty
lists.  (Note that `c = d = []' assigns the same object to both `c' and
`d'.)
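
   The guarantee for mutable objects can be checked with the `is'
operator.  The variable names are hypothetical, and no claim is made
here about the `a = 1; b = 1' case, which depends on the
implementation.

```python
c = []; d = []      # two separate, newly created empty lists
e = f = []          # one list, bound to two names

c.append(1)         # does not affect d
e.append(2)         # visible through f as well

print(c is d, e is f)
print(d, f)
```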