This is /home/pdm/install/Python-2.1/Doc/ref/python-ref.info, produced
by makeinfo version 4.0 from ref.texi.

   April 15, 2001		2.1


File: python-ref.info,  Node: Top,  Next: Front Matter,  Prev: (dir),  Up: (dir)

Python Reference Manual
***********************

* Menu:

* Front Matter::
* Introduction::
* Lexical analysis::
* Data model::
* Execution model::
* Expressions::
* Simple statements::
* Compound statements::
* Top-level components::
* Future statements and nested scopes::
* Module Index::
* Class-Exception-Object Index::
* Function-Method-Variable Index::
* Miscellaneous Index::


File: python-ref.info,  Node: Front Matter,  Next: Introduction,  Prev: Top,  Up: Top

Front Matter
************

   Copyright (C) 2001 Python Software Foundation.  All rights reserved.

   Copyright (C) 2000 BeOpen.com.  All rights reserved.

   Copyright (C) 1995-2000 Corporation for National Research
Initiatives.  All rights reserved.

   Copyright (C) 1991-1995 Stichting Mathematisch Centrum.  All rights
reserved.

        *BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1*

  1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an
     office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the
     Individual or Organization ("Licensee") accessing and otherwise
     using this software in source or binary form and its associated
     documentation ("the Software").

  2. Subject to the terms and conditions of this BeOpen Python License
     Agreement, BeOpen hereby grants Licensee a non-exclusive,
     royalty-free, world-wide license to reproduce, analyze, test,
     perform and/or display publicly, prepare derivative works,
     distribute, and otherwise use the Software alone or in any
     derivative version, provided, however, that the BeOpen Python
     License is retained in the Software, alone or in any derivative
     version prepared by Licensee.

  3. BeOpen is making the Software available to Licensee on an "AS IS"
     basis.  BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
     IMPLIED.  BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO
     AND DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR
     FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE
     WILL NOT INFRINGE ANY THIRD PARTY RIGHTS.

  4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE
     SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR
     LOSS AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE,
     OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY
     THEREOF.

  5. This License Agreement will automatically terminate upon a material
     breach of its terms and conditions.

  6. This License Agreement shall be governed by and interpreted in all
     respects by the law of the State of California, excluding conflict
     of law provisions.  Nothing in this License Agreement shall be
     deemed to create any relationship of agency, partnership, or joint
     venture between BeOpen and Licensee.  This License Agreement does
     not grant permission to use BeOpen trademarks or trade names in a
     trademark sense to endorse or promote products or services of
     Licensee, or any third party.  As an exception, the "BeOpen
     Python" logos available at http://www.pythonlabs.com/logos.html
     may be used according to the permissions granted on that web page.

  7. By copying, installing or otherwise using the software, Licensee
     agrees to be bound by the terms and conditions of this License
     Agreement.

          *CNRI OPEN SOURCE GPL-COMPATIBLE LICENSE AGREEMENT*

   Python 1.6.1 is made available subject to the terms and conditions in
CNRI's License Agreement.  This Agreement together with Python 1.6.1 may
be located on the Internet using the following unique, persistent
identifier (known as a handle): 1895.22/1013.  This Agreement may also
be obtained from a proxy server on the Internet using the following
URL: <http://hdl.handle.net/1895.22/1013>.

              *CWI PERMISSIONS STATEMENT AND DISCLAIMER*

   Copyright (C) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam,
The Netherlands.  All rights reserved.

   Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the name of Stichting Mathematisch
Centrum or CWI not be used in advertising or publicity pertaining to
distribution of the software without specific, written prior permission.

   STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD
TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
AND FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE
FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

     Python is an interpreted, object-oriented, high-level programming
     language with dynamic semantics.  Its high-level built-in data
     structures, combined with dynamic typing and dynamic binding, make
     it very attractive for rapid application development, as well as
     for use as a scripting or glue language to connect existing
     components together.  Python's simple, easy to learn syntax
     emphasizes readability and therefore reduces the cost of program
     maintenance.  Python supports modules and packages, which
     encourages program modularity and code reuse.  The Python
     interpreter and the extensive standard library are available in
     source or binary form without charge for all major platforms, and
     can be freely distributed.

     This reference manual describes the syntax and "core semantics" of
     the language.  It is terse, but attempts to be exact and complete.
     The semantics of non-essential built-in object types and of the
     built-in functions and modules are described in the `Python Library
     Reference'.  For an informal introduction to the language, see the
     `Python Tutorial'.  For C or C++ programmers, two additional
     manuals exist: `Extending and Embedding the Python Interpreter'
     describes the high-level picture of how to write a Python
     extension module, and the `Python/C API Reference Manual'
     describes the interfaces available to C/C++ programmers in detail.



File: python-ref.info,  Node: Introduction,  Next: Lexical analysis,  Prev: Front Matter,  Up: Top

Introduction
************

   This reference manual describes the Python programming language.  It
is not intended as a tutorial.

   While I am trying to be as precise as possible, I chose to use
English rather than formal specifications for everything except syntax
and lexical analysis.  This should make the document more understandable
to the average reader, but will leave room for ambiguities.
Consequently, if you were coming from Mars and tried to re-implement
Python from this document alone, you might have to guess things and in
fact you would probably end up implementing quite a different language.
On the other hand, if you are using Python and wonder what the precise
rules about a particular area of the language are, you should
definitely be able to find them here.  If you would like to see a more
formal definition of the language, maybe you could volunteer your time
-- or invent a cloning machine :-).

   It is dangerous to add too many implementation details to a language
reference document -- the implementation may change, and other
implementations of the same language may work differently.  On the
other hand, there is currently only one Python implementation in
widespread use (although a second one now exists!), and its particular
quirks are sometimes worth being mentioned, especially where the
implementation imposes additional limitations.  Therefore, you'll find
short "implementation notes" sprinkled throughout the text.

   Every Python implementation comes with a number of built-in and
standard modules.  These are not documented here, but in the separate
`Python Library Reference' document.  A few built-in modules are
mentioned when they interact in a significant way with the language
definition.

* Menu:

* Notation::


File: python-ref.info,  Node: Notation,  Prev: Introduction,  Up: Introduction

Notation
========

   The descriptions of lexical analysis and syntax use a modified BNF
grammar notation.  This uses the following style of definition:

     name:           lc_letter (lc_letter | "_")*
     lc_letter:      "a"..."z"

   The first line says that a `name' is an `lc_letter' followed by a
sequence of zero or more `lc_letter's and underscores.  An `lc_letter'
in turn is any of the single characters `a' through `z'.  (This rule is
actually adhered to for the names defined in lexical and grammar rules
in this document.)
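
   As a rough illustration only, the two rules above can be transcribed
into a regular expression.  The names `NAME_RULE' and `is_name' are
hypothetical helpers introduced for this sketch, and the code assumes a
current Python 3 interpreter rather than the release described here.

```python
import re

# Transcription of the example rules:
#   name:      lc_letter (lc_letter | "_")*
#   lc_letter: "a"..."z"
NAME_RULE = re.compile(r"[a-z][a-z_]*")


def is_name(text):
    """Return True if the whole of `text` matches the `name` rule."""
    return NAME_RULE.fullmatch(text) is not None
```

   Under this sketch, `is_name("lc_letter")' is true, while
`is_name("_name")' and `is_name("Name")' are false because the first
character must be an `lc_letter'.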

   Each rule begins with a name (which is the name defined by the rule)
and a colon.  A vertical bar (`|') is used to separate alternatives; it
is the least binding operator in this notation.  A star (`*') means
zero or more repetitions of the preceding item; likewise, a plus (`+')
means one or more repetitions, and a phrase enclosed in square brackets
(`[ ]') means zero or one occurrences (in other words, the enclosed
phrase is optional).  The `*' and `+' operators bind as tightly as
possible; parentheses are used for grouping.  Literal strings are
enclosed in quotes.  White space is only meaningful to separate tokens.
Rules are normally contained on a single line; rules with many
alternatives may be formatted alternatively with each line after the
first beginning with a vertical bar.

   In lexical definitions (as in the example above), two more
conventions are used: two literal characters separated by three dots
mean a choice of any single character in the given (inclusive) range of
ASCII characters.  A phrase between angle brackets (`<...>') gives an
informal description of the symbol defined; e.g., this could be used to
describe the notion of `control character' if needed.

   Even though the notation used is almost the same, there is a big
difference between the meaning of lexical and syntactic definitions: a
lexical definition operates on the individual characters of the input
source, while a syntax definition operates on the stream of tokens
generated by the lexical analysis.  All uses of BNF in the next chapter
("Lexical Analysis") are lexical definitions; uses in subsequent
chapters are syntactic definitions.


File: python-ref.info,  Node: Lexical analysis,  Next: Data model,  Prev: Introduction,  Up: Top

Lexical analysis
****************

   A Python program is read by a _parser_.  Input to the parser is a
stream of _tokens_, generated by the _lexical analyzer_.  This chapter
describes how the lexical analyzer breaks a file into tokens.

   Python uses the 7-bit ASCII character set for program text and string
literals. 8-bit characters may be used in string literals and comments
but their interpretation is platform dependent; the proper way to
insert 8-bit characters in string literals is by using octal or
hexadecimal escape sequences.

   The run-time character set depends on the I/O devices connected to
the program but is generally a superset of ASCII.

   *Future compatibility note:* It may be tempting to assume that the
character set for 8-bit characters is ISO Latin-1 (an ASCII superset
that covers most western languages that use the Latin alphabet), but it
is possible that in the future Unicode text editors will become common.
These generally use the UTF-8 encoding, which is also an ASCII
superset, but with very different use for the characters with ordinals
128-255.  While there is no consensus on this subject yet, it is unwise
to assume either Latin-1 or UTF-8, even though the current
implementation appears to favor Latin-1.  This applies both to the
source character set and the run-time character set.

* Menu:

* Line structure::
* Other tokens::
* Identifiers and keywords::
* Literals::
* Operators::
* Delimiters::


File: python-ref.info,  Node: Line structure,  Next: Other tokens,  Prev: Lexical analysis,  Up: Lexical analysis

Line structure
==============

   A Python program is divided into a number of _logical lines_.

* Menu:

* Logical lines::
* Physical lines::
* Comments::
* Explicit line joining::
* Implicit line joining::
* Blank lines blank line::
* Indentation::
* Whitespace between tokens::


File: python-ref.info,  Node: Logical lines,  Next: Physical lines,  Prev: Line structure,  Up: Line structure

Logical lines
-------------

   The end of a logical line is represented by the token NEWLINE.
Statements cannot cross logical line boundaries except where NEWLINE is
allowed by the syntax (e.g., between statements in compound statements).
A logical line is constructed from one or more _physical lines_ by
following the explicit or implicit _line joining_ rules.


File: python-ref.info,  Node: Physical lines,  Next: Comments,  Prev: Logical lines,  Up: Line structure

Physical lines
--------------

   A physical line ends in whatever the current platform's convention is
for terminating lines.  On UNIX, this is the ASCII LF (linefeed)
character.  On DOS/Windows, it is the ASCII sequence CR LF (return
followed by linefeed).  On Macintosh, it is the ASCII CR (return)
character.


File: python-ref.info,  Node: Comments,  Next: Explicit line joining,  Prev: Physical lines,  Up: Line structure

Comments
--------

   A comment starts with a hash character (`#') that is not part of a
string literal, and ends at the end of the physical line.  A comment
signifies the end of the logical line unless the implicit line joining
rules are invoked.  Comments are ignored by the syntax; they are not
tokens.


File: python-ref.info,  Node: Explicit line joining,  Next: Implicit line joining,  Prev: Comments,  Up: Line structure

Explicit line joining
---------------------

   Two or more physical lines may be joined into logical lines using
backslash characters (`\'), as follows: when a physical line ends in a
backslash that is not part of a string literal or comment, it is joined
with the following line to form a single logical line, deleting the
backslash and the following end-of-line character.  For example:

     if 1900 < year < 2100 and 1 <= month <= 12 \
        and 1 <= day <= 31 and 0 <= hour < 24 \
        and 0 <= minute < 60 and 0 <= second < 60:   # Looks like a valid date
             return 1

   A line ending in a backslash cannot carry a comment.  A backslash
does not continue a comment.  A backslash does not continue a token
except for string literals (i.e., tokens other than string literals
cannot be split across physical lines using a backslash).  A backslash
is illegal elsewhere on a line outside a string literal.


File: python-ref.info,  Node: Implicit line joining,  Next: Blank lines blank line,  Prev: Explicit line joining,  Up: Line structure

Implicit line joining
---------------------

   Expressions in parentheses, square brackets or curly braces can be
split over more than one physical line without using backslashes.  For
example:

     month_names = ['Januari', 'Februari', 'Maart',      # These are the
                    'April',   'Mei',      'Juni',       # Dutch names
                    'Juli',    'Augustus', 'September',  # for the months
                    'Oktober', 'November', 'December']   # of the year

   Implicitly continued lines can carry comments.  The indentation of
the continuation lines is not important.  Blank continuation lines are
allowed.  There is no NEWLINE token between implicit continuation
lines.  Implicitly continued lines can also occur within triple-quoted
strings (see below); in that case they cannot carry comments.
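
   The absence of a NEWLINE token between implicitly continued lines
can be observed with the standard `tokenize' module.  This is a sketch
that assumes a current Python 3 interpreter, whose tokenizer reports
line breaks inside brackets as a separate non-logical `NL' token; the
names `SOURCE' and `count_logical_lines' are hypothetical.

```python
import io
import tokenize

# Two logical lines: the first spans three physical lines (one of them
# blank) inside square brackets and carries a comment.
SOURCE = (
    "month_names = ['Januari', 'Februari',   # a comment\n"
    "\n"
    "         'Maart']\n"
    "x = 1\n"
)


def count_logical_lines(source):
    """Count NEWLINE tokens, i.e. the ends of logical lines."""
    readline = io.StringIO(source).readline
    return sum(1 for tok in tokenize.generate_tokens(readline)
               if tok.type == tokenize.NEWLINE)


logical_lines = count_logical_lines(SOURCE)
```

   Here `logical_lines' is 2 even though `SOURCE' contains four
physical lines.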


File: python-ref.info,  Node: Blank lines blank line,  Next: Indentation,  Prev: Implicit line joining,  Up: Line structure

Blank lines
-----------

   A logical line that contains only spaces, tabs, formfeeds and
possibly a comment, is ignored (i.e., no NEWLINE token is generated).
During interactive input of statements, handling of a blank line may
differ depending on the implementation of the read-eval-print loop.  In
the standard implementation, an entirely blank logical line (i.e. one
containing not even whitespace or a comment) terminates a multi-line
statement.


File: python-ref.info,  Node: Indentation,  Next: Whitespace between tokens,  Prev: Blank lines blank line,  Up: Line structure

Indentation
-----------

   Leading whitespace (spaces and tabs) at the beginning of a logical
line is used to compute the indentation level of the line, which in
turn is used to determine the grouping of statements.

   First, tabs are replaced (from left to right) by one to eight spaces
such that the total number of characters up to and including the
replacement is a multiple of eight (this is intended to be the same
rule as used by UNIX).  The total number of spaces preceding the first
non-blank character then determines the line's indentation.
Indentation cannot be split over multiple physical lines using
backslashes; the whitespace up to the first backslash determines the
indentation.
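
   The tab replacement rule can be sketched as a small function.  This
is an illustration of the rule as stated, not the tokenizer's actual
code; `indentation_width' is a hypothetical name, and the special
handling of formfeed characters is deliberately left out.

```python
def indentation_width(line, tabsize=8):
    """Return the indentation of `line`, expanding tabs to tab stops."""
    width = 0
    for ch in line:
        if ch == " ":
            width += 1
        elif ch == "\t":
            # Advance to the next multiple of `tabsize` (one to eight
            # columns when tabsize is 8).
            width += tabsize - (width % tabsize)
        else:
            break  # first non-blank character ends the indentation
    return width
```

   For instance, a line starting with two spaces and a tab has the same
indentation (8) as a line starting with a single tab, which is one
reason mixing the two is easy to get wrong.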

   *Cross-platform compatibility note:* because of the nature of text
editors on non-UNIX platforms, it is unwise to use a mixture of spaces
and tabs for the indentation in a single source file.

   A formfeed character may be present at the start of the line; it will
be ignored for the indentation calculations above.  Formfeed characters
occurring elsewhere in the leading whitespace have an undefined effect
(for instance, they may reset the space count to zero).

   The indentation levels of consecutive lines are used to generate
INDENT and DEDENT tokens, using a stack, as follows.

   Before the first line of the file is read, a single zero is pushed on
the stack; this will never be popped off again.  The numbers pushed on
the stack will always be strictly increasing from bottom to top.  At
the beginning of each logical line, the line's indentation level is
compared to the top of the stack.  If it is equal, nothing happens.  If
it is larger, it is pushed on the stack, and one INDENT token is
generated.  If it is smaller, it _must_ be one of the numbers occurring
on the stack; all numbers on the stack that are larger are popped off,
and for each number popped off a DEDENT token is generated.  At the end
of the file, a DEDENT token is generated for each number remaining on
the stack that is larger than zero.
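
   The stack algorithm just described can be sketched as follows.  The
function takes the indentation level of each logical line and returns
a flat list of markers; `indent_tokens' and the `"LINE"' marker are
hypothetical names for this illustration, and raising `ValueError' on
an inconsistent dedent is an assumption of the sketch.

```python
def indent_tokens(levels):
    """Return INDENT/DEDENT/LINE markers for a list of indentation levels."""
    stack = [0]          # the initial zero is never popped
    out = []
    for level in levels:
        if level > stack[-1]:
            stack.append(level)
            out.append("INDENT")
        elif level < stack[-1]:
            if level not in stack:
                raise ValueError("inconsistent dedent to column %d" % level)
            while stack[-1] > level:
                stack.pop()
                out.append("DEDENT")
        out.append("LINE")   # stands for the tokens of the line itself
    # At end of file, close every block that is still open.
    while stack[-1] > 0:
        stack.pop()
        out.append("DEDENT")
    return out
```

   With this sketch, the levels `[0, 4, 8, 0]' produce two INDENT
markers followed later by two DEDENT markers before the last line,
while `[0, 8, 4]' is rejected because 4 never appeared on the stack.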

   Here is an example of a correctly (though confusingly) indented piece
of Python code:

     def perm(l):
             # Compute the list of all permutations of l
         if len(l) <= 1:
                       return [l]
         r = []
         for i in range(len(l)):
                  s = l[:i] + l[i+1:]
                  p = perm(s)
                  for x in p:
                   r.append(l[i:i+1] + x)
         return r

   The following example shows various indentation errors:

          def perm(l):                       # error: first line indented
         for i in range(len(l)):             # error: not indented
             s = l[:i] + l[i+1:]
                 p = perm(l[:i] + l[i+1:])   # error: unexpected indent
                 for x in p:
                         r.append(l[i:i+1] + x)
                     return r                # error: inconsistent dedent

   (Actually, the first three errors are detected by the parser; only
the last error is found by the lexical analyzer -- the indentation of
`return r' does not match a level popped off the stack.)


File: python-ref.info,  Node: Whitespace between tokens,  Prev: Indentation,  Up: Line structure

Whitespace between tokens
-------------------------

   Except at the beginning of a logical line or in string literals, the
whitespace characters space, tab and formfeed can be used
interchangeably to separate tokens.  Whitespace is needed between two
tokens only if their concatenation could otherwise be interpreted as a
different token (e.g., ab is one token, but a b is two tokens).


File: python-ref.info,  Node: Other tokens,  Next: Identifiers and keywords,  Prev: Line structure,  Up: Lexical analysis

Other tokens
============

   Besides NEWLINE, INDENT and DEDENT, the following categories of
tokens exist: _identifiers_, _keywords_, _literals_, _operators_, and
_delimiters_.  Whitespace characters (other than line terminators,
discussed earlier) are not tokens, but serve to delimit tokens.  Where
ambiguity exists, a token comprises the longest possible string that
forms a legal token, when read from left to right.
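
   The longest-match rule can be observed with the standard `tokenize'
module.  This is a sketch that assumes a current Python 3 interpreter;
`token_strings' is a hypothetical helper that keeps only names,
numbers and operator or delimiter tokens.

```python
import io
import tokenize


def token_strings(source):
    """Return the text of the NAME, NUMBER and OP tokens in `source`."""
    readline = io.StringIO(source).readline
    wanted = (tokenize.NAME, tokenize.NUMBER, tokenize.OP)
    return [tok.string for tok in tokenize.generate_tokens(readline)
            if tok.type in wanted]
```

   For example, `token_strings("a<<=b\n")' gives `a', `<<=' and `b':
the characters `<<=' form one token rather than `<', `<' and `=' or
`<<' and `='.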


File: python-ref.info,  Node: Identifiers and keywords,  Next: Literals,  Prev: Other tokens,  Up: Lexical analysis

Identifiers and keywords
========================

   Identifiers (also referred to as _names_) are described by the
following lexical definitions:

     identifier:     (letter|"_") (letter|digit|"_")*
     letter:         lowercase | uppercase
     lowercase:      "a"..."z"
     uppercase:      "A"..."Z"
     digit:          "0"..."9"

   Identifiers are unlimited in length.  Case is significant.

* Menu:

* Keywords::
* Reserved classes of identifiers::


File: python-ref.info,  Node: Keywords,  Next: Reserved classes of identifiers,  Prev: Identifiers and keywords,  Up: Identifiers and keywords

Keywords
--------

   The following identifiers are used as reserved words, or _keywords_
of the language, and cannot be used as ordinary identifiers.  They must
be spelled exactly as written here:

     and       del       for       is        raise
     assert    elif      from      lambda    return
     break     else      global    not       try
     class     except    if        or        while
     continue  exec      import    pass
     def       finally   in        print
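
   As an illustration, the identifier grammar from the preceding
section and the keyword list above can be combined into a simple
check.  The names `IDENTIFIER', `KEYWORDS_2_1' and
`is_ordinary_identifier' are hypothetical; the keyword set is copied
from the table above and differs from the keyword list of later Python
versions.

```python
import re

# identifier: (letter|"_") (letter|digit|"_")*
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# The 28 reserved words listed in this section.
KEYWORDS_2_1 = frozenset("""
    and assert break class continue def del elif else except exec
    finally for from global if import in is lambda not or pass print
    raise return try while
""".split())


def is_ordinary_identifier(text):
    """True if `text` is an identifier and not a reserved word."""
    return IDENTIFIER.fullmatch(text) is not None and text not in KEYWORDS_2_1
```

   Because case is significant, `Print' passes this check while `print'
does not.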


File: python-ref.info,  Node: Reserved classes of identifiers,  Prev: Keywords,  Up: Identifiers and keywords

Reserved classes of identifiers
-------------------------------

   Certain classes of identifiers (besides keywords) have special
meanings.  These are:

Form                     Meaning                  Notes
------                   -----                    -----
_*                       Not imported by `from    (1)
                         MODULE import *'         
__*__                    System-defined name      
__*                      Class-private name       
                         mangling                 

   (XXX need section references here.)

   Note:

`(1)'
     The special identifier `_' is used in the interactive interpreter
     to store the result of the last evaluation; it is stored in the
     `__builtin__' module.  When not in interactive mode, `_' has no
     special meaning and is not defined.
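
   The `__*' form can be illustrated with a small class.  This is a
minimal sketch; the class `Account' and its attributes are hypothetical.
Inside a class body, a name with two leading underscores that does not
also end in two underscores is rewritten with a `_ClassName' prefix.

```python
class Account:
    def __init__(self):
        self.__balance = 0       # mangled: stored as _Account__balance
        self.__special__ = "x"   # __*__ form: left unchanged


acct = Account()
mangled_present = hasattr(acct, "_Account__balance")
unmangled_present = hasattr(acct, "__balance")
```

   Here `mangled_present' is true and `unmangled_present' is false,
since the attribute only exists under its mangled name.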


File: python-ref.info,  Node: Literals,  Next: Operators,  Prev: Identifiers and keywords,  Up: Lexical analysis

Literals
========

   Literals are notations for constant values of some built-in types.

* Menu:

* String literals::
* String literal concatenation::
* Unicode literals::
* Numeric literals::
* Integer and long integer literals::
* Floating point literals::
* Imaginary literals::


File: python-ref.info,  Node: String literals,  Next: String literal concatenation,  Prev: Literals,  Up: Literals

String literals
---------------

   String literals are described by the following lexical definitions:

     stringliteral:   shortstring | longstring
     shortstring:     "'" shortstringitem* "'" | '"' shortstringitem* '"'
     longstring:      "'''" longstringitem* "'''" | '"""' longstringitem* '"""'
     shortstringitem: shortstringchar | escapeseq
     longstringitem:  longstringchar | escapeseq
     shortstringchar: <any ASCII character except "\" or newline or the quote>
     longstringchar:  <any ASCII character except "\">
     escapeseq:       "\" <any ASCII character>

   In plain English: String literals can be enclosed in matching single
quotes (`'') or double quotes (`"').  They can also be enclosed in
matching groups of three single or double quotes (these are generally
referred to as _triple-quoted strings_).  The backslash (`\') character
is used to escape characters that otherwise have a special meaning,
such as newline, backslash itself, or the quote character.  String
literals may optionally be prefixed with a letter `r' or `R'; such
strings are called "raw strings" and use different rules for backslash
escape sequences.  A prefix of `u' or `U' makes the string a Unicode
string.  Unicode strings use the Unicode character set as defined by
the Unicode Consortium and ISO 10646.  Some additional escape
sequences, described below, are available in Unicode strings.

   In triple-quoted strings, unescaped newlines and quotes are allowed
(and are retained), except that three unescaped quotes in a row
terminate the string.  (A "quote" is the character used to open the
string, i.e. either `'' or `"'.)

   Unless an `r' or `R' prefix is present, escape sequences in strings
are interpreted according to rules similar to those used by Standard C.
The recognized escape sequences are:

Escape Sequence                      Meaning
------                               -----
\NEWLINE                             Ignored
\\                                   Backslash (`\')
\'                                   Single quote (`'')
\"                                   Double quote (`"')
\a                                   ASCII Bell (BEL)
\b                                   ASCII Backspace (BS)
\f                                   ASCII Formfeed (FF)
\n                                   ASCII Linefeed (LF)
\N{NAME}                             Character named NAME in the Unicode
                                     database (Unicode only)
\r                                   ASCII Carriage Return (CR)
\t                                   ASCII Horizontal Tab (TAB)
\uXXXX                               Character with 16-bit hex value
                                     XXXX (Unicode only)
\UXXXXXXXX                           Character with 32-bit hex value
                                     XXXXXXXX (Unicode only)
\v                                   ASCII Vertical Tab (VT)
\OOO                                 ASCII character with octal value OOO
\xHH                                 ASCII character with hex value HH

   As in Standard C, up to three octal digits are accepted.  However,
exactly two hex digits are taken in hex escapes.

   Unlike Standard C, all unrecognized escape sequences are left in the
string unchanged, i.e., _the backslash is left in the string_.  (This
behavior is useful when debugging: if an escape sequence is mistyped,
the resulting output is more easily recognized as broken.)  It is also
important to note that the escape sequences marked as "(Unicode only)"
in the table above fall into the category of unrecognized escapes for
non-Unicode string literals.

   When an `r' or `R' prefix is present, a character following a
backslash is included in the string without change, and _all
backslashes are left in the string_.  For example, the string literal
`r"\n"' consists of two characters: a backslash and a lowercase `n'.
String quotes can be escaped with a backslash, but the backslash
remains in the string; for example, `r"\""' is a valid string literal
consisting of two characters: a backslash and a double quote; `r"\"' is
not a valid string literal (even a raw string cannot end in an odd
number of backslashes).  Specifically, _a raw string cannot end in a
single backslash_ (since the backslash would escape the following quote
character).  Note also that a single backslash followed by a newline is
interpreted as those two characters as part of the string, _not_ as a
line continuation.
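
   A few of the rules above, shown side by side.  This is only a small
sketch with hypothetical variable names; it uses escape sequences from
the table and the raw string examples from the text.

```python
newline = "\n"        # one character: a linefeed
raw_newline = r"\n"   # two characters: a backslash and a lowercase n
raw_quote = r"\""     # two characters: a backslash and a double quote
hex_a = "\x41"        # hex escape, exactly two hex digits
octal_a = "\101"      # octal escape, up to three octal digits

lengths = (len(newline), len(raw_newline), len(raw_quote))
```

   Here `lengths' is `(1, 2, 2)', and both `hex_a' and `octal_a' are the
single character `A'.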


File: python-ref.info,  Node: String literal concatenation,  Next: Unicode literals,  Prev: String literals,  Up: Literals

String literal concatenation
----------------------------

   Multiple adjacent string literals (delimited by whitespace), possibly
using different quoting conventions, are allowed, and their meaning is
the same as their concatenation.  Thus, `"hello" 'world'' is equivalent
to `"helloworld"'.  This feature can be used to reduce the number of
backslashes needed, to split long strings conveniently across long
lines, or even to add comments to parts of strings, for example:

     re.compile("[A-Za-z_]"       # letter or underscore
                "[A-Za-z0-9_]*"   # letter, digit or underscore
               )

   Note that this feature is defined at the syntactical level, but
implemented at compile time.  The `+' operator must be used to
concatenate string expressions at run time.  Also note that literal
concatenation can use different quoting styles for each component (even
mixing raw strings and triple quoted strings).
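
   A short sketch of mixed quoting styles in one concatenation; the
variable names are hypothetical.

```python
greeting = "hello" 'world'

# A raw string, a triple-quoted string and a plain string, each on its
# own line with a comment, joined into a single literal.
pattern = (r"\d+"      # raw piece: backslash kept as written
           """-"""     # triple-quoted piece
           '[a-z]')    # single-quoted piece
```

   The value of `greeting' is `helloworld', and `pattern' is the same
string as the single raw literal `r"\d+-[a-z]"'.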


File: python-ref.info,  Node: Unicode literals,  Next: Numeric literals,  Prev: String literal concatenation,  Up: Literals

Unicode literals
----------------

   XXX explain more here...


File: python-ref.info,  Node: Numeric literals,  Next: Integer and long integer literals,  Prev: Unicode literals,  Up: Literals

Numeric literals
----------------

   There are four types of numeric literals: plain integers, long
integers, floating point numbers, and imaginary numbers.  There are no
complex literals (complex numbers can be formed by adding a real number
and an imaginary number).

   Note that numeric literals do not include a sign; a phrase like `-1'
is actually an expression composed of the unary operator `-' and the
literal `1'.
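
   That `-1' is an expression rather than a literal can be checked with
the `ast' module.  This sketch assumes a current Python 3 interpreter
(the module does not exist in the release described here), and the
variable names are hypothetical.

```python
import ast

# Parse the text "-1" as an expression and inspect the top-level node.
node = ast.parse("-1", mode="eval").body

is_unary_minus = (isinstance(node, ast.UnaryOp)
                  and isinstance(node.op, ast.USub))
operand_value = ast.literal_eval(node.operand)
```

   The parse tree is a unary minus applied to an operand, and
`operand_value' is the unsigned literal 1.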


File: python-ref.info,  Node: Integer and long integer literals,  Next: Floating point literals,  Prev: Numeric literals,  Up: Literals

Integer and long integer literals
---------------------------------

   Integer and long integer literals are described by the following
lexical definitions:

     longinteger:    integer ("l"|"L")
     integer:        decimalinteger | octinteger | hexinteger
     decimalinteger: nonzerodigit digit* | "0"
     octinteger:     "0" octdigit+
     hexinteger:     "0" ("x"|"X") hexdigit+
     nonzerodigit:   "1"..."9"
     octdigit:       "0"..."7"
     hexdigit:        digit|"a"..."f"|"A"..."F"

   Although both lower case `l' and upper case `L' are allowed as a suffix
for long integers, it is strongly recommended to always use `L', since
the letter `l' looks too much like the digit `1'.

   Plain integer decimal literals must be at most 2147483647 (i.e., the
largest positive integer, using 32-bit arithmetic).  Plain octal and
hexadecimal literals may be as large as 4294967295, but values larger
than 2147483647 are converted to a negative value by subtracting
4294967296.  There is no limit for long integer literals apart from
what can be stored in available memory.

   Some examples of plain and long integer literals:

     7     2147483647                        0177    0x80000000
     3L    79228162514264337593543950336L    0377L   0x100000000L
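
   The conversion rule for plain octal and hexadecimal literals amounts
to the following arithmetic.  This is a sketch of the rule as stated
above, assuming 32-bit plain integers; `plain_int_value' is a
hypothetical helper, and later Python versions do not perform this
wrap-around.

```python
def plain_int_value(unsigned_value):
    """Value of a plain octal/hex literal under the 32-bit rule above."""
    if not 0 <= unsigned_value <= 4294967295:
        raise ValueError("too large for a plain integer literal")
    if unsigned_value > 2147483647:
        # Larger values wrap around to a negative number.
        return unsigned_value - 4294967296
    return unsigned_value
```

   Under this rule the literal `0x80000000' from the examples denotes
-2147483648, and `0xFFFFFFFF' would denote -1.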


File: python-ref.info,  Node: Floating point literals,  Next: Imaginary literals,  Prev: Integer and long integer literals,  Up: Literals

Floating point literals
-----------------------

   Floating point literals are described by the following lexical
definitions:

     floatnumber:    pointfloat | exponentfloat
     pointfloat:     [intpart] fraction | intpart "."
     exponentfloat:  (nonzerodigit digit* | pointfloat) exponent
     intpart:        nonzerodigit digit* | "0"
     fraction:       "." digit+
     exponent:       ("e"|"E") ["+"|"-"] digit+

   Note that the integer part of a floating point number cannot look
like an octal integer.  The exponent may look like an octal literal,
but it is always interpreted using radix 10.  For example, `1e010' is
legal and denotes the same value as `1e10', while `07.1' is a syntax
error.  The allowed range of floating point literals is
implementation-dependent.  Some examples of floating point literals:

     3.14    10.    .001    1e100    3.14e-10

   Note that numeric literals do not include a sign; a phrase like `-1'
is actually an expression composed of the operator `-' and the literal
`1'.
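
   Both points can be checked with a short sketch.  It assumes the
`tokenize' and `io' modules as found in current Python 3 releases,
which are newer than the interpreter this manual describes; the helper
name is hypothetical.

```python
import io
import tokenize

def number_and_operator_tokens(source):
    """Return (token name, text) pairs for NUMBER and OP tokens only."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokens
            if tok.type in (tokenize.NUMBER, tokenize.OP)]

# "-1" is two tokens: the operator "-" followed by the literal "1".
minus_one = number_and_operator_tokens("-1\n")

# The exponent "010" is read in radix 10, so both literals are equal.
same_value = (1e010 == 1e10)
```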


File: python-ref.info,  Node: Imaginary literals,  Prev: Floating point literals,  Up: Literals

Imaginary literals
------------------

   Imaginary literals are described by the following lexical
definitions:

     imagnumber:     (floatnumber | intpart) ("j"|"J")

   An imaginary literal yields a complex number with a real part of
0.0.  Complex numbers are represented as a pair of floating point
numbers and have the same restrictions on their range.  To create a
complex number with a nonzero real part, add a floating point number to
it, e.g., `(3+4j)'.  Some examples of imaginary literals:

     3.14j   10.j    10j     .001j   1e100j  3.14e-10j
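
   A brief sketch of the rules above; the variable names are
illustrative only.

```python
# An imaginary literal alone has a real part of 0.0.
z = 4j

# Adding a number to it produces a nonzero real part.
w = 3 + 4j

parts_of_z = (z.real, z.imag)
parts_of_w = (w.real, w.imag)
```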


File: python-ref.info,  Node: Operators,  Next: Delimiters,  Prev: Literals,  Up: Lexical analysis

Operators
=========

   The following tokens are operators:

     +       -       *       **      /       %
     <<      >>      &       |       ^       ~
     <       >       <=      >=      ==      !=      <>

   The comparison operators `<>' and `!=' are alternate spellings of
the same operator.  `!=' is the preferred spelling; `<>' is obsolescent.


File: python-ref.info,  Node: Delimiters,  Prev: Operators,  Up: Lexical analysis

Delimiters
==========

   The following tokens serve as delimiters in the grammar:

     (       )       [       ]       {       }
     ,       :       .       `       =       ;
     +=      -=      *=      /=      %=      **=
     &=      |=      ^=      >>=     <<=

   The period can also occur in floating-point and imaginary literals.
A sequence of three periods has a special meaning as an ellipsis in
slices.  The second half of the list, the augmented assignment
operators, serve lexically as delimiters, but also perform an operation.
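
   The following sketch shows an augmented assignment performing its
operation and an ellipsis appearing inside a subscript.  The class
`ShowIndex' is hypothetical; it merely returns the index it receives
so that the ellipsis can be observed.

```python
class ShowIndex:
    """Hypothetical class that returns whatever index it is given."""
    def __getitem__(self, index):
        return index

total = 10
total += 5      # lexically a delimiter, but it also performs an addition
total **= 2     # likewise for the power operator

# Three periods inside a subscript denote the ellipsis object.
index = ShowIndex()[1, ..., 2]
```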

   The following printing ASCII characters have special meaning as part
of other tokens or are otherwise significant to the lexical analyzer:

     '       "       #       \

   The following printing ASCII characters are not used in Python.
Their occurrence outside string literals and comments is an
unconditional error:

     @       $       ?


File: python-ref.info,  Node: Data model,  Next: Execution model,  Prev: Lexical analysis,  Up: Top

Data model
**********

* Menu:

* Objects::
* standard type hierarchy::
* Special method names::


File: python-ref.info,  Node: Objects,  Next: standard type hierarchy,  Prev: Data model,  Up: Data model

Objects, values and types
=========================

   "Objects" are Python's abstraction for data.  All data in a Python
program is represented by objects or by relations between objects.  (In
a sense, and in conformance to Von Neumann's model of a "stored program
computer," code is also represented by objects.)

   Every object has an identity, a type and a value.  An object's
_identity_ never changes once it has been created; you may think of it
as the object's address in memory.  The `is' operator compares the
identity of two objects; the `id()' function returns an integer
representing an object's identity (currently implemented as its
address).  An
object's "type" is also unchangeable.  It determines the operations
that an object supports (e.g., "does it have a length?") and also
defines the possible values for objects of that type.  The `type()'
function returns an object's type (which is an object itself).  The
_value_ of some objects can change.  Objects whose value can change are
said to be _mutable_; objects whose value is unchangeable once they are
created are called _immutable_.  (The value of an immutable container
object that contains a reference to a mutable object can change when
the latter's value is changed; however the container is still
considered immutable, because the collection of objects it contains
cannot be changed.  So, immutability is not strictly the same as having
an unchangeable value; it is more subtle.)  An object's mutability is
determined by its type; for instance, numbers, strings and tuples are
immutable, while dictionaries and lists are mutable.
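
   As a minimal sketch of identity, type and value (all names below are
illustrative):

```python
numbers = [1, 2]
alias = numbers          # a second name for the same object
copy = [1, 2]            # a distinct object with an equal value

same_object = alias is numbers
equal_but_distinct = (copy == numbers) and (copy is not numbers)

# Changing a mutable object's value leaves its identity unchanged.
identity_before = id(numbers)
numbers.append(3)
identity_after = id(numbers)

# Strings are immutable: item assignment is rejected.
text = "abc"
try:
    text[0] = "x"
    string_is_mutable = True
except TypeError:
    string_is_mutable = False

# type() returns the object's type, which is itself an object.
list_type = type(numbers)
```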

   Objects are never explicitly destroyed; however, when they become
unreachable they may be garbage-collected.  An implementation is
allowed to postpone garbage collection or omit it altogether -- it is a
matter of implementation quality how garbage collection is implemented,
as long as no objects are collected that are still reachable.
(Implementation note: the current implementation uses a
reference-counting scheme with (optional) delayed detection of
cyclically linked garbage, which collects most objects as soon as they
become unreachable, but is not guaranteed to collect garbage containing
circular references.  See the documentation of the `gc' module for
information on controlling the collection of cyclic garbage.)
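
   The sketch below builds a reference cycle and asks the cycle
detector to run.  It assumes a CPython interpreter with the `gc' and
`weakref' modules as found in current releases; other implementations
are free to behave differently, as stated above.  The class `Node' is
hypothetical.

```python
import gc
import weakref

class Node:
    """Hypothetical class used only to build a reference cycle."""
    pass

first = Node()
second = Node()
first.other = second     # first refers to second ...
second.other = first     # ... and second refers back to first

# A weak reference observes the object without keeping it alive.
probe = weakref.ref(first)

# After this, the two objects are reachable only from each other, so
# reference counting alone cannot reclaim them.
del first, second

gc.collect()             # request a cycle-detection pass
cycle_collected = probe() is None
```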

   Note that the use of the implementation's tracing or debugging
facilities may keep objects alive that would normally be collectable.
Also note that catching an exception with a `try'...`except'
statement may keep objects alive.

   Some objects contain references to "external" resources such as open
files or windows.  It is understood that these resources are freed when
the object is garbage-collected, but since garbage collection is not
guaranteed to happen, such objects also provide an explicit way to
release the external resource, usually a `close()' method.  It is
strongly recommended that programs explicitly close such objects.  The
`try'...`finally' statement provides a convenient way to do this.
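
   A minimal sketch of this pattern, using a temporary file so that the
example leaves nothing behind.  The use of the `tempfile' and `os'
modules is an assumption of the sketch, not something this section
prescribes.

```python
import os
import tempfile

# Create a scratch file and release the low-level descriptor at once.
descriptor, path = tempfile.mkstemp()
os.close(descriptor)

f = open(path, "w")
try:
    f.write("hello\n")
finally:
    # Runs whether or not write() raised, so the file is always closed.
    f.close()

closed_explicitly = f.closed
os.remove(path)
```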

   Some objects contain references to other objects; these are called
_containers_.  Examples of containers are tuples, lists and
dictionaries.  The references are part of a container's value.  In most
cases, when we talk about the value of a container, we imply the
values, not the identities of the contained objects; however, when we
talk about the mutability of a container, only the identities of the
immediately contained objects are implied.  So, if an immutable
container (like a tuple) contains a reference to a mutable object, its
value changes if that mutable object is changed.
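
   The distinction can be sketched with a tuple that holds a list (the
names are illustrative):

```python
inner = [1]
pair = (inner, "fixed")
snapshot = (list(inner), "fixed")   # independent copy of the value

# Changing the contained list changes the value of the tuple ...
inner.append(2)
value_changed = pair != snapshot

# ... but the tuple still contains the very same object.
same_member = pair[0] is inner

# The tuple itself cannot be made to contain a different object.
try:
    pair[0] = []
    rebinding_allowed = True
except TypeError:
    rebinding_allowed = False
```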

   Types affect almost all aspects of object behavior.  Even the
importance of object identity is affected in some sense: for immutable
types, operations that compute new values may actually return a
reference to any existing object with the same type and value, while
for mutable objects this is not allowed.  E.g., after `a = 1; b = 1',
`a' and `b' may or may not refer to the same object with the value one,
depending on the implementation, but after `c = []; d = []', `c' and `d'
are guaranteed to refer to two different, unique, newly created empty
lists.  (Note that `c = d = []' assigns the same object to both `c' and
`d'.)
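
   The guarantee for mutable objects can be checked directly.  No
assertion is made here about `a = 1; b = 1', since that outcome
depends on the implementation.

```python
c = []
d = []
distinct_lists = c is not d      # two separate, newly created lists

e = f = []
shared_list = e is f             # one list bound to two names

e.append(1)                      # visible through both names
```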

