This is Info file pm.info, produced by Makeinfo version 1.68 from the
input file bigpm.texi.


File: pm.info,  Node: XML/DOM/Element,  Next: XML/DOM/ElementDecl,  Prev: XML/DOM/DocumentType,  Up: Module List

An XML element node in XML::DOM
*******************************

NAME
====

   XML::DOM::Element - An XML element node in XML::DOM

DESCRIPTION
===========

   XML::DOM::Element extends *Note XML/DOM/Node: XML/DOM/Node,.

   The vast majority of objects (apart from text) that authors encounter
when traversing a document are Element nodes. Assume the following XML
document:

     <elementExample id="demo">
       <subelement1/>
       <subelement2><subsubelement/></subelement2>
     </elementExample>

   When represented using DOM, the top node is an Element node for
"elementExample", which contains two child Element nodes, one for
"subelement1" and one for "subelement2". "subelement1" contains no child
nodes.

   Elements may have attributes associated with them; since the Element
interface inherits from Node, the generic Node interface method
getAttributes may be used to retrieve the set of all attributes for an
element. There are methods on the Element interface to retrieve either an
Attr object by name or an attribute value by name. In XML, where an
attribute value may contain entity references, an Attr object should be
retrieved to examine the possibly fairly complex sub-tree representing the
attribute value. On the other hand, in HTML, where all attributes have
simple string values, methods to directly access an attribute value can
safely be used as a convenience.

METHODS
-------

getTagName
     The name of the element. For example, in:

          <elementExample id="demo">
                  ...
          </elementExample>

     tagName has the value "elementExample". Note that this is
     case-preserving in XML, as are all of the operations of the DOM.

getAttribute (name)
     Retrieves an attribute value by name.

     Return Value: The Attr value as a string, or the empty string if that
     attribute does not have a specified or default value.

setAttribute (name, value)
     Adds a new attribute. If an attribute with that name is already
     present in the element, its value is changed to be that of the value
     parameter. This value is a simple string; it is not parsed as it is
     being set, so any markup (such as syntax to be recognized as an
     entity reference) is treated as literal text and needs to be
     appropriately escaped by the implementation when it is written out.
     In order to assign an attribute value that contains entity
     references, the user must create an Attr node plus any Text and
     EntityReference nodes, build the appropriate subtree, and use
     setAttributeNode to assign it as the value of an attribute.

     DOMExceptions:

        * INVALID_CHARACTER_ERR

          Raised if the specified name contains an invalid character.

        * NO_MODIFICATION_ALLOWED_ERR

          Raised if this node is readonly.

removeAttribute (name)
     Removes an attribute by name. If the removed attribute has a default
     value it is immediately replaced.

     DOMExceptions:

        * NO_MODIFICATION_ALLOWED_ERR

          Raised if this node is readonly.

getAttributeNode (name)
     Retrieves an Attr node by name.

     Return Value: The Attr node with the specified attribute name or undef
     if there is no such attribute.

setAttributeNode (newAttr)
     Adds a new attribute. If an attribute with that name is already
     present in the element, it is replaced by the new one.

     Return Value: If the newAttr attribute replaces an existing attribute
     with the same name, the previously existing Attr node is returned,
     otherwise undef is returned.

     DOMExceptions:

        * WRONG_DOCUMENT_ERR

          Raised if newAttr was created from a different document than the
          one that created the element.

        * NO_MODIFICATION_ALLOWED_ERR

          Raised if this node is readonly.

        * INUSE_ATTRIBUTE_ERR

          Raised if newAttr is already an attribute of another Element
          object. The DOM user must explicitly clone Attr nodes to re-use
          them in other elements.

removeAttributeNode (oldAttr)
     Removes the specified attribute. If the removed Attr has a default
     value it is immediately replaced. If the Attr already is the default
     value, nothing happens and nothing is returned.

     Parameters:  *oldAttr*  The Attr node to remove from the attribute
     list.

     Return Value: The Attr node that was removed.

     DOMExceptions:

        * NO_MODIFICATION_ALLOWED_ERR

          Raised if this node is readonly.

        * NOT_FOUND_ERR

          Raised if oldAttr is not an attribute of the element.
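
     The attribute methods above might be combined as in the following
     minimal sketch. It assumes XML::DOM is installed; the sample
     document and the variable names are purely illustrative.

```perl
use strict;
use warnings;
use XML::DOM;

# Illustrative document; only the "id" attribute comes from the example above.
my $doc  = XML::DOM::Parser->new->parse('<elementExample id="demo"/>');
my $elem = $doc->getDocumentElement;

# getAttribute returns the value as a plain string ...
my $id = $elem->getAttribute('id');
# ... and the empty string when the attribute is absent.
my $missing = $elem->getAttribute('nope');

# setAttribute adds (or overwrites) an attribute with a literal string value.
$elem->setAttribute('lang', 'en');

# getAttributeNode returns the Attr node itself, or undef if there is none.
my $attr = $elem->getAttributeNode('lang');
my $lang = $attr->getValue;

# removeAttribute drops the attribute by name.
$elem->removeAttribute('id');
my $gone = $elem->getAttributeNode('id');

print $elem->toString, "\n";
```

     The string accessors are the convenient choice when the value is
     known to be simple text; the Attr node is needed only when the
     value's subtree has to be inspected.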

Additional methods not in the DOM Spec
--------------------------------------

setTagName (newTagName)
     Sets the tag name of the Element. Note that this method is not
     portable between DOM implementations.

     DOMExceptions:

        * INVALID_CHARACTER_ERR

          Raised if the specified name contains an invalid character.

check ($checker)
     Uses the specified *Note XML/Checker: XML/Checker, to validate the
     document.  NOTE: an XML::Checker must be supplied. The checker can be
     created in different ways, e.g. when parsing a document with
     XML::DOM::ValParser, or with XML::DOM::Document::createChecker().
     See *Note XML/Checker: XML/Checker, for more info.


File: pm.info,  Node: XML/DOM/ElementDecl,  Next: XML/DOM/Entity,  Prev: XML/DOM/Element,  Up: Module List

An XML ELEMENT declaration in XML::DOM
**************************************

NAME
====

   XML::DOM::ElementDecl - An XML ELEMENT declaration in XML::DOM

DESCRIPTION
===========

   XML::DOM::ElementDecl extends *Note XML/DOM/Node: XML/DOM/Node, but is
not part of the DOM Level 1 specification.

   This node represents an Element declaration, e.g.

     <!ELEMENT address (street+, city, state, zip, country?)>

METHODS
-------

getName
     Returns the Element tagName.

getModel and setModel (model)
     Returns and sets the model as a string, e.g.  "(street+, city, state,
     zip, country?)" in the above example.


File: pm.info,  Node: XML/DOM/Entity,  Next: XML/DOM/EntityReference,  Prev: XML/DOM/ElementDecl,  Up: Module List

An XML ENTITY in XML::DOM
*************************

NAME
====

   XML::DOM::Entity - An XML ENTITY in XML::DOM

DESCRIPTION
===========

   XML::DOM::Entity extends *Note XML/DOM/Node: XML/DOM/Node,.

   This node represents an Entity declaration, e.g.

     <!ENTITY % draft 'INCLUDE'>

     <!ENTITY hatch-pic SYSTEM "../grafix/OpenHatch.gif" NDATA gif>

   The first is a parameter entity, referenced as `%draft;'. The second
is a general (regular) entity, referenced as `&hatch-pic;'.

METHODS
-------

getNotationName
     Returns the name of the notation for the entity.

     *Not Implemented*: The DOM Spec says that for unparsed entities this
     is the name of the notation for the entity, and for parsed entities
     it is null. (This implementation does not support unparsed entities.)

getSysId
     Returns the system id, or undef.

getPubId
     Returns the public id, or undef.

Additional methods not in the DOM Spec
--------------------------------------

isParameterEntity
     Returns whether it is a parameter entity (%ent;) or a general entity
     (&ent;).

getValue
     Returns the entity value.

getNdata
     Returns the NDATA declaration (for general unparsed entities), or
     undef.


File: pm.info,  Node: XML/DOM/EntityReference,  Next: XML/DOM/NamedNodeMap,  Prev: XML/DOM/Entity,  Up: Module List

An XML ENTITY reference in XML::DOM
***********************************

NAME
====

   XML::DOM::EntityReference - An XML ENTITY reference in XML::DOM

DESCRIPTION
===========

   XML::DOM::EntityReference extends *Note XML/DOM/Node: XML/DOM/Node,.

   EntityReference objects may be inserted into the structure model when
an entity reference is in the source document, or when the user wishes to
insert an entity reference. Note that character references and references
to predefined entities are considered to be expanded by the HTML or XML
processor so that characters are represented by their Unicode equivalent
rather than by an entity reference. Moreover, the XML processor may
completely expand references to entities while building the structure
model, instead of providing EntityReference objects. If it does provide
such objects, then for a given EntityReference node, it may be that there
is no Entity node representing the referenced entity; but if such an
Entity exists, then the child list of the EntityReference node is the same
as that of the Entity node. As with the Entity node, all descendants of the
EntityReference are readonly.

   The resolution of the children of the EntityReference (the replacement
value of the referenced Entity) may be lazily evaluated; actions by the
user (such as calling the childNodes method on the EntityReference node)
are assumed to trigger the evaluation.


File: pm.info,  Node: XML/DOM/NamedNodeMap,  Next: XML/DOM/Node,  Prev: XML/DOM/EntityReference,  Up: Module List

A hash table interface for XML::DOM
***********************************

NAME
====

   XML::DOM::NamedNodeMap - A hash table interface for XML::DOM

DESCRIPTION
===========

   Objects implementing the NamedNodeMap interface are used to represent
collections of nodes that can be accessed by name. Note that NamedNodeMap
does not inherit from NodeList; NamedNodeMaps are not maintained in any
particular order. Objects contained in an object implementing NamedNodeMap
may also be accessed by an ordinal index, but this is simply to allow
convenient enumeration of the contents of a NamedNodeMap, and does not
imply that the DOM specifies an order to these Nodes.

   Note that in this implementation, the objects added to a NamedNodeMap
are kept in order.

METHODS
-------

getNamedItem (name)
     Retrieves a node specified by name.

     Return Value: A Node (of any type) with the specified name, or undef
     if the specified name did not identify any node in the map.

setNamedItem (arg)
     Adds a node using its nodeName attribute.

     As the nodeName attribute is used to derive the name which the node
     must be stored under, multiple nodes of certain types (those that
     have a "special" string value) cannot be stored as the names would
     clash. This is seen as preferable to allowing nodes to be aliased.

     Parameters:  *arg*  A node to store in a named node map.

     The node will later be accessible using the value of the nodeName
     attribute of the node. If a node with that name is already present in
     the map, it is replaced by the new one.

     Return Value: If the new Node replaces an existing node with the same
     name the previously existing Node is returned, otherwise undef is
     returned.

     DOMExceptions:

        * WRONG_DOCUMENT_ERR

          Raised if arg was created from a different document than the one
          that created the NamedNodeMap.

        * NO_MODIFICATION_ALLOWED_ERR

          Raised if this NamedNodeMap is readonly.

        * INUSE_ATTRIBUTE_ERR

          Raised if arg is an Attr that is already an attribute of another
          Element object.  The DOM user must explicitly clone Attr nodes
          to re-use them in other elements.

removeNamedItem (name)
     Removes a node specified by name. If the removed node is an Attr with
     a default value it is immediately replaced.

     Return Value: The node removed from the map or undef if no node with
     such a name exists.

     DOMExceptions:

        * NOT_FOUND_ERR

          Raised if there is no node named name in the map.

item (index)
     Returns the indexth item in the map. If index is greater than or
     equal to the number of nodes in the map, this returns undef.

     Return Value: The node at the indexth position in the NamedNodeMap, or
     undef if that is not a valid index.

getLength
     Returns the number of nodes in the map. The range of valid child node
     indices is 0 to length-1 inclusive.

Additional methods not in the DOM Spec
--------------------------------------

getValues
     Returns a NodeList with the nodes contained in the NamedNodeMap.  The
     NodeList is "live", in that it reflects changes made to the
     NamedNodeMap.

     When this method is called in a list context, it returns a regular
     perl list containing the values. Note that this list is not "live".
     E.g.

          @list = $map->getValues;        # returns a perl list
          $nodelist = $map->getValues;    # returns a NodeList (object ref.)
          for my $val ($map->getValues)   # iterate over the values

getChildIndex (node)
     Returns the index of the node in the NodeList as returned by
     getValues, or -1 if the node is not in the NamedNodeMap.

dispose
     Removes all circular references in this NamedNodeMap and its
     descendants so the objects can be claimed for garbage collection. The
     objects should not be used afterwards.
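
     As a rough illustration of the NamedNodeMap methods, the sketch
     below walks the attribute map of an element. It assumes XML::DOM is
     installed; the document and variable names are made up for the
     example.

```perl
use strict;
use warnings;
use XML::DOM;

my $doc   = XML::DOM::Parser->new->parse('<item id="42" lang="en"/>');
my $attrs = $doc->getDocumentElement->getAttributes;   # a NamedNodeMap

# Access by name; undef is returned when the name is not in the map.
my $lang = $attrs->getNamedItem('lang')->getValue;
my $none = $attrs->getNamedItem('missing');

# Access by ordinal index, from 0 to getLength - 1.
my $count = $attrs->getLength;
for my $i (0 .. $count - 1) {
    my $node = $attrs->item($i);
    printf "%d: %s=%s\n", $i, $node->getName, $node->getValue;
}

# In list context getValues returns a plain (non-live) perl list.
my @names = sort map { $_->getName } $attrs->getValues;
```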


File: pm.info,  Node: XML/DOM/Node,  Next: XML/DOM/NodeList,  Prev: XML/DOM/NamedNodeMap,  Up: Module List

Super class of all nodes in XML::DOM
************************************

NAME
====

   XML::DOM::Node - Super class of all nodes in XML::DOM

DESCRIPTION
===========

   XML::DOM::Node is the super class of all nodes in an XML::DOM document.
This means that all nodes that subclass XML::DOM::Node also inherit all
the methods that XML::DOM::Node implements.

GLOBAL VARIABLES
----------------

@NodeNames
     The variable @XML::DOM::Node::NodeNames maps the node type constants
     to strings.  It is used by XML::DOM::Node::getNodeTypeName.

METHODS
-------

getNodeType
     Return an integer indicating the node type. See XML::DOM constants.

getNodeName
     Return a property or a hardcoded string, depending on the node type.
     Here are the corresponding functions or values:

          Attr                    getName
          AttDef                  getName
          AttlistDecl             getName
          CDATASection            "#cdata-section"
          Comment                 "#comment"
          Document                "#document"
          DocumentType            getNodeName
          DocumentFragment        "#document-fragment"
          Element                 getTagName
          ElementDecl             getName
          EntityReference         getEntityName
          Entity                  getNotationName
          Notation                getName
          ProcessingInstruction   getTarget
          Text                    "#text"
          XMLDecl                 "#xml-declaration"

     *Not In DOM Spec*: AttDef, AttlistDecl, ElementDecl and XMLDecl were
     added for completeness.

getNodeValue and setNodeValue (value)
     Returns a string or undef, depending on the node type. This method is
     provided for completeness. In other languages it saves the programmer
     an upcast.  The value is either available through some other method
     defined in the subclass, or else undef is returned. Here are the
     corresponding methods: Attr::getValue, Text::getData,
     CDATASection::getData, Comment::getData,
     ProcessingInstruction::getData.

getParentNode and setParentNode (parentNode)
     The parent of this node. All nodes, except Document,
     DocumentFragment, and Attr may have a parent. However, if a node has
     just been created and not yet added to the tree, or if it has been
     removed from the tree, this is undef.

getChildNodes
     A NodeList that contains all children of this node. If there are no
     children, this is a NodeList containing no nodes. The content of the
     returned NodeList is "live" in the sense that, for instance, changes
     to the children of the node object that it was created from are
     immediately reflected in the nodes returned by the NodeList
     accessors; it is not a static snapshot of the content of the node.
     This is true for every NodeList, including the ones returned by the
     getElementsByTagName method.

     NOTE: this implementation does not return a "live" NodeList for
     getElementsByTagName. See CAVEATS in *Note XML/DOM: XML/DOM,.

     When this method is called in a list context, it returns a regular
     perl list containing the child nodes. Note that this list is not
     "live". E.g.

          @list = $node->getChildNodes;        # returns a perl list
          $nodelist = $node->getChildNodes;    # returns a NodeList (object reference)
          for my $kid ($node->getChildNodes)   # iterate over the children of $node

getFirstChild
     The first child of this node. If there is no such node, this returns
     undef.

getLastChild
     The last child of this node. If there is no such node, this returns
     undef.

getPreviousSibling
     The node immediately preceding this node. If there is no such node,
     this returns undef.

getNextSibling
     The node immediately following this node. If there is no such node,
     this returns undef.
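
     The child and sibling accessors above are enough to walk a node's
     children in either direction without building a list first. A
     minimal sketch, assuming XML::DOM is installed and using an
     invented document:

```perl
use strict;
use warnings;
use XML::DOM;

# No whitespace between the tags, so the children are elements only.
my $doc  = XML::DOM::Parser->new->parse('<list><a/><b/><c/></list>');
my $root = $doc->getDocumentElement;

# Forward: start at the first child and follow getNextSibling until undef.
my @forward;
for (my $n = $root->getFirstChild; defined $n; $n = $n->getNextSibling) {
    push @forward, $n->getNodeName;
}

# Backward: start at the last child and follow getPreviousSibling.
my @backward;
for (my $n = $root->getLastChild; defined $n; $n = $n->getPreviousSibling) {
    push @backward, $n->getNodeName;
}

print "@forward\n@backward\n";
```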

getAttributes
     A NamedNodeMap containing the attributes (Attr nodes) of this node
     (if it is an Element) or undef otherwise.  Note that adding/removing
     attributes from the returned object, also adds/removes attributes
     from the Element node that the NamedNodeMap came from.

getOwnerDocument
     The Document object associated with this node. This is also the
     Document object used to create new nodes. When this node is a
     Document this is undef.

insertBefore (newChild, refChild)
     Inserts the node newChild before the existing child node refChild. If
     refChild is undef, insert newChild at the end of the list of children.

     If newChild is a DocumentFragment object, all of its children are
     inserted, in the same order, before refChild. If the newChild is
     already in the tree, it is first removed.

     Return Value: The node being inserted.

     DOMExceptions:

        * HIERARCHY_REQUEST_ERR

          Raised if this node is of a type that does not allow children of
          the type of the newChild node, or if the node to insert is one
          of this node's ancestors.

        * WRONG_DOCUMENT_ERR

          Raised if newChild was created from a different document than
          the one that created this node.

        * NO_MODIFICATION_ALLOWED_ERR

          Raised if this node is readonly.

        * NOT_FOUND_ERR

          Raised if refChild is not a child of this node.

replaceChild (newChild, oldChild)
     Replaces the child node oldChild with newChild in the list of
     children, and returns the oldChild node. If the newChild is already
     in the tree, it is first removed.

     Return Value: The node replaced.

     DOMExceptions:

        * HIERARCHY_REQUEST_ERR

          Raised if this node is of a type that does not allow children of
          the type of the newChild node, or if the node to put in is one
          of this node's ancestors.

        * WRONG_DOCUMENT_ERR

          Raised if newChild was created from a different document than
          the one that created this node.

        * NO_MODIFICATION_ALLOWED_ERR

          Raised if this node is readonly.

        * NOT_FOUND_ERR

          Raised if oldChild is not a child of this node.

removeChild (oldChild)
     Removes the child node indicated by oldChild from the list of
     children, and returns it.

     Return Value: The node removed.

     DOMExceptions:

        * NO_MODIFICATION_ALLOWED_ERR

          Raised if this node is readonly.

        * NOT_FOUND_ERR

          Raised if oldChild is not a child of this node.

appendChild (newChild)
     Adds the node newChild to the end of the list of children of this
     node. If the newChild is already in the tree, it is first removed. If
     it is a DocumentFragment object, the entire contents of the document
     fragment are moved into the child list of this node.

     Return Value: The node added.

     DOMExceptions:

        * HIERARCHY_REQUEST_ERR

          Raised if this node is of a type that does not allow children of
          the type of the newChild node, or if the node to append is one
          of this node's ancestors.

        * WRONG_DOCUMENT_ERR

          Raised if newChild was created from a different document than
          the one that created this node.

        * NO_MODIFICATION_ALLOWED_ERR

          Raised if this node is readonly.

hasChildNodes
     This is a convenience method to allow easy determination of whether a
     node has any children.

     Return Value: 1 if the node has any children, 0 otherwise.

cloneNode (deep)
     Returns a duplicate of this node, i.e., serves as a generic copy
     constructor for nodes. The duplicate node has no parent
     (getParentNode returns undef).

     Cloning an Element copies all attributes and their values, including
     those generated by the XML processor to represent defaulted
     attributes, but this method does not copy any text it contains unless
     it is a deep clone, since the text is contained in a child Text node.
     Cloning any other type of node simply returns a copy of this node.

     Parameters:  *deep*   If true, recursively clone the subtree under
     the specified node.  If false, clone only the node itself (and its
     attributes, if it is an Element).

     Return Value: The duplicate node.
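
     The tree-editing methods (appendChild, insertBefore, removeChild)
     and cloneNode might be used together as sketched below. This
     assumes XML::DOM is installed; the element names are invented for
     illustration.

```perl
use strict;
use warnings;
use XML::DOM;

my $doc  = XML::DOM::Parser->new->parse('<root/>');
my $root = $doc->getDocumentElement;

# New nodes are created by the owning Document, then attached.
my $second = $doc->createElement('second');
$root->appendChild($second);

my $first = $doc->createElement('first');
$root->insertBefore($first, $second);    # <first/> now precedes <second/>
$first->setAttribute('k', 'v');
$first->addText('hello');

# A shallow clone copies the element and its attributes only;
# a deep clone also copies the child Text node.
my $shallow = $first->cloneNode(0);
my $deep    = $first->cloneNode(1);

# removeChild detaches the node and returns it.
my $removed   = $root->removeChild($second);
my @remaining = $root->getChildNodes;

print $root->toString, "\n";
```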

normalize
     Puts all Text nodes in the full depth of the sub-tree underneath this
     Element into a "normal" form where only markup (e.g., tags, comments,
     processing instructions, CDATA sections, and entity references)
     separates Text nodes, i.e., there are no adjacent Text nodes. This
     can be used to ensure that the DOM view of a document is the same as
     if it were saved and re-loaded, and is useful when operations (such as
     XPointer lookups) that depend on a particular document tree structure
     are to be used.

     *Not In DOM Spec*: In the DOM Spec this method is defined in the
     Element and Document class interfaces only, but it doesn't hurt to
     have it here...
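
     To make the effect of normalize concrete, the sketch below creates
     two adjacent Text nodes and then merges them. It assumes XML::DOM
     is installed; the element name and strings are illustrative only.

```perl
use strict;
use warnings;
use XML::DOM;

my $doc = XML::DOM::Parser->new->parse('<p/>');
my $p   = $doc->getDocumentElement;

# Appending Text nodes one by one leaves them adjacent but separate.
$p->appendChild($doc->createTextNode('Hello, '));
$p->appendChild($doc->createTextNode('world'));
my @before = $p->getChildNodes;     # list context: plain perl list

# normalize merges adjacent Text nodes into one.
$p->normalize;
my @after = $p->getChildNodes;
my $text  = $p->getFirstChild->getData;

print scalar(@before), " -> ", scalar(@after), ": $text\n";
```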

getElementsByTagName (name [, recurse])
     Returns a NodeList of all descendant elements with a given tag name,
     in the order in which they would be encountered in a preorder
     traversal of the Element tree.

     Parameters:  *name*  The name of the tag to match on. The special
     value "*" matches all tags.

     *recurse*  Whether it should return only direct child nodes (0) or
     any descendant that matches the tag name (1). This argument is
     optional and defaults to 1. It is not part of the DOM spec.

     Return Value: A list of matching Element nodes.

     NOTE: this implementation does not return a "live" NodeList for
     getElementsByTagName. See CAVEATS in *Note XML/DOM: XML/DOM,.

     When this method is called in a list context, it returns a regular
     perl list containing the result nodes. E.g.

          @list = $node->getElementsByTagName("tag");       # returns a perl list
          $nodelist = $node->getElementsByTagName("tag");   # returns a NodeList (object ref.)
          for my $elem ($node->getElementsByTagName("tag")) # iterate over the result nodes
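
     The difference the optional recurse argument makes can be seen in
     a small sketch like the following. It assumes XML::DOM is
     installed; the document is invented for the example.

```perl
use strict;
use warnings;
use XML::DOM;

my $doc = XML::DOM::Parser->new->parse(
    '<doc><item/><group><item/><item/></group></doc>');
my $root = $doc->getDocumentElement;

# Default (recurse = 1): every descendant <item>, in preorder.
my @all = $root->getElementsByTagName('item');

# recurse = 0: only <item> elements that are direct children of $root.
my @direct = $root->getElementsByTagName('item', 0);

# "*" matches every tag name.
my @everything = $root->getElementsByTagName('*');

# Scalar context yields a NodeList object instead of a perl list.
my $list = $root->getElementsByTagName('item');

printf "all=%d direct=%d everything=%d\n",
    scalar(@all), scalar(@direct), scalar(@everything);
```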

Additional methods not in the DOM Spec
--------------------------------------

getNodeTypeName
     Return the string describing the node type.  E.g. returns
     "ELEMENT_NODE" if getNodeType returns ELEMENT_NODE.  It uses
     @XML::DOM::Node::NodeNames.
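
     A short sketch of how getNodeType, getNodeTypeName and getNodeName
     relate, assuming XML::DOM is installed (the document is
     illustrative, and the constant is written fully qualified here):

```perl
use strict;
use warnings;
use XML::DOM;

my $doc  = XML::DOM::Parser->new->parse('<note>text</note>');
my $elem = $doc->getDocumentElement;
my $text = $elem->getFirstChild;

# getNodeType returns the integer constant; getNodeTypeName its name.
my $is_element = ($elem->getNodeType == XML::DOM::ELEMENT_NODE());

my @report;
for my $node ($elem, $text) {
    push @report, join ' ', $node->getNodeTypeName, $node->getNodeName;
}
print "$_\n" for @report;
```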

toString
     Returns the entire subtree as a string.

printToFile (filename)
     Prints the entire subtree to the file with the specified filename.

     Croaks: if the file could not be opened for writing.

printToFileHandle (handle)
     Prints the entire subtree to the file handle.  E.g. to print to
     STDOUT:

          $node->printToFileHandle (\*STDOUT);

print (obj)
     Prints the entire subtree using the object's print method. E.g. to
     print to a FileHandle object:

          use FileHandle;

          $f = new FileHandle ("file.out", "w");
          $node->print ($f);

getChildIndex (child)
     Returns the index of the child node in the list returned by
     getChildNodes.

     Return Value: the index or -1 if the node is not found.

getChildAtIndex (index)
     Returns the child node at the specified index or undef.

addText (text)
     Appends the specified string to the last child if it is a Text node,
     or else appends a new Text node (with the specified text.)

     Return Value: the last child if it was a Text node or else the new
     Text node.

dispose
     Removes all circular references in this node and its descendants so
     the objects can be claimed for garbage collection. The objects should
     not be used afterwards.

setOwnerDocument (doc)
     Sets the ownerDocument property of this node and all its children (and
     attributes etc.) to the specified document.  This allows the user to
     cut and paste document subtrees between different
     XML::DOM::Documents. The node should be removed from the original
     document first, before calling setOwnerDocument.

     This method does nothing when called on a Document node.
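
     The cut-and-paste sequence described above could look roughly like
     this. The sketch assumes XML::DOM is installed; both documents and
     all names are invented for illustration.

```perl
use strict;
use warnings;
use XML::DOM;

my $source = XML::DOM::Parser->new->parse('<a><moved>payload</moved></a>');
my $target = XML::DOM::Parser->new->parse('<b/>');

# 1. Remove the subtree from its original document.
my $src_root = $source->getDocumentElement;
my $subtree  = $src_root->removeChild($src_root->getFirstChild);

# 2. Re-home the node and everything below it.
$subtree->setOwnerDocument($target);

# 3. Attach it in the other document.
$target->getDocumentElement->appendChild($subtree);

print $target->toString;
```

     Without step 2, the appendChild call would be expected to fail with
     WRONG_DOCUMENT_ERR, as listed under appendChild.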

isAncestor (parent)
     Returns 1 if parent is an ancestor of this node or if it is this node
     itself.

expandEntityRefs (str)
     Expands all the entity references in the string and returns the
     result.  The entity references can be character references (e.g.
     "&#123;" or "&#x1fc2;"), default entity references ("&quot;", "&gt;",
     "&lt;", "&apos;" and "&amp;") or entity references defined in Entity
     objects as part of the DocumentType of the owning Document. Character
     references are expanded into UTF-8.  Parameter entity references
     (e.g. %ent;) are not expanded.
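
     As a small illustration, the sketch below expands default entity
     references and one decimal character reference. It assumes XML::DOM
     is installed and that the document has no DocumentType, so only the
     built-in entities are involved; the input string is made up.

```perl
use strict;
use warnings;
use XML::DOM;

my $doc  = XML::DOM::Parser->new->parse('<root/>');
my $root = $doc->getDocumentElement;

# "&lt;", "&gt;" and "&amp;" are default entities; "&#65;" is the
# character reference for "A".
my $expanded = $root->expandEntityRefs('&lt;b&gt; &amp; &#65;');

print "$expanded\n";
```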

to_sax ( %HANDLERS )
     E.g.

          $node->to_sax (DocumentHandler => $my_handler,
                         Handler => $handler2 );

     %HANDLERS may contain the following handlers:

        * DocumentHandler

        * DTDHandler

        * EntityResolver

        * Handler

          Default handler when one of the above is not specified

     Each XML::DOM::Node generates the appropriate SAX callbacks (for the
     appropriate SAX handler.) Different SAX handlers can be plugged in to
     accomplish different things, e.g. *Note XML/Checker: XML/Checker,
     would check the node (currently only Document and Element nodes are
     supported), *Note XML/Handler/BuildDOM: XML/Handler/BuildDOM, would
     create a new DOM subtree (thereby, in essence, copying the Node) and
     in the near future, XML::Writer could print the node.  All Perl SAX
     related work is still in flux, so this interface may change a little.

     See PerlSAX for the description of the SAX interface.

check ( [$checker] )
     See descriptions for check() in *Note XML/DOM/Document:
     XML/DOM/Document, and *Note XML/DOM/Element: XML/DOM/Element,.

xql ( @XQL_OPTIONS )
     To use the xql method, you must first use *Note XML/XQL: XML/XQL, and
     *Note XML/XQL/DOM: XML/XQL/DOM,.  This method is basically a shortcut
     for:

          $query = new XML::XQL::Query ( @XQL_OPTIONS );
          return $query->solve ($node);

     If the first parameter in @XQL_OPTIONS is the XQL expression, you can
     leave off the 'Expr' keyword, so:

          $node->xql ("doc//elem1[@attr]", @other_options);

     is identical to:

          $node->xql (Expr => "doc//elem1[@attr]", @other_options);

     See *Note XML/XQL/Query: XML/XQL/Query, for other available
     XQL_OPTIONS.  See *Note XML/XQL: XML/XQL, and *Note XML/XQL/Tutorial:
     XML/XQL/Tutorial, for more info.

isHidden ()
     Whether the node is hidden.  See "Hidden Nodes" in *Note XML/DOM:
     XML/DOM, for details.


File: pm.info,  Node: XML/DOM/NodeList,  Next: XML/DOM/Notation,  Prev: XML/DOM/Node,  Up: Module List

A node list as used by XML::DOM
*******************************

NAME
====

   XML::DOM::NodeList - A node list as used by XML::DOM

DESCRIPTION
===========

   The NodeList interface provides the abstraction of an ordered
collection of nodes, without defining or constraining how this collection
is implemented.

   The items in the NodeList are accessible via an integral index,
starting from 0.

   Although the DOM spec states that all NodeLists are "live" in that they
always reflect changes to the DOM tree, the NodeList returned by
getElementsByTagName is not live in this implementation. See CAVEATS in
*Note XML/DOM: XML/DOM, for details.

METHODS
-------

item (index)
     Returns the indexth item in the collection. If index is greater than
     or equal to the number of nodes in the list, this returns undef.

getLength
     The number of nodes in the list. The range of valid child node
     indices is 0 to length-1 inclusive.
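
     Index-based access to a NodeList might look like the following
     sketch, which also shows that the list returned by getChildNodes
     in scalar context is live. It assumes XML::DOM is installed; the
     document is illustrative.

```perl
use strict;
use warnings;
use XML::DOM;

my $doc  = XML::DOM::Parser->new->parse('<list><a/><b/></list>');
my $root = $doc->getDocumentElement;

my $kids = $root->getChildNodes;    # scalar context: a NodeList object

# Valid indices run from 0 to getLength - 1.
my @names;
for my $i (0 .. $kids->getLength - 1) {
    push @names, $kids->item($i)->getNodeName;
}

# An index past the end yields undef.
my $past_end = $kids->item($kids->getLength);

# Adding a child is reflected in the NodeList obtained earlier.
$root->appendChild($doc->createElement('c'));
my $live_length = $kids->getLength;

print "@names, then $live_length children\n";
```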

Additional methods not in the DOM Spec
--------------------------------------

dispose
     Removes all circular references in this NodeList and its descendants
     so the objects can be claimed for garbage collection. The objects
     should not be used afterwards.


File: pm.info,  Node: XML/DOM/Notation,  Next: XML/DOM/Parser,  Prev: XML/DOM/NodeList,  Up: Module List

An XML NOTATION in XML::DOM
***************************

NAME
====

   XML::DOM::Notation - An XML NOTATION in XML::DOM

DESCRIPTION
===========

   XML::DOM::Notation extends *Note XML/DOM/Node: XML/DOM/Node,.

   This node represents a Notation, e.g.

     <!NOTATION gs SYSTEM "GhostScript">

     <!NOTATION name PUBLIC "pubId">

     <!NOTATION name PUBLIC "pubId" "sysId">

     <!NOTATION name SYSTEM "sysId">

METHODS
-------

getName and setName (name)
     Returns (or sets) the Notation name, which is the first token after
     the NOTATION keyword.

getSysId and setSysId (sysId)
     Returns (or sets) the system ID, which is the token after the optional
     SYSTEM keyword.

getPubId and setPubId (pubId)
     Returns (or sets) the public ID, which is the token after the optional
     PUBLIC keyword.

getBase
     Returns the base that XML::Parser passes to the Notation handler,
     i.e. the base to use when resolving a relative system ID.

getNodeName
     Returns the same as getName.


File: pm.info,  Node: XML/DOM/Parser,  Next: XML/DOM/PerlSAX,  Prev: XML/DOM/Notation,  Up: Module List

An XML::Parser that builds XML::DOM document structures
*******************************************************

NAME
====

   XML::DOM::Parser - An XML::Parser that builds XML::DOM document
structures

SYNOPSIS
========

     use XML::DOM;

     my $parser = new XML::DOM::Parser;
     my $doc = $parser->parsefile ("file.xml");

DESCRIPTION
===========

   XML::DOM::Parser extends *Note XML/Parser: XML/Parser,.

   The XML::Parser module was written by Clark Cooper and is built on top
of XML::Parser::Expat, which is a lower level interface to James Clark's
expat library.

   XML::DOM::Parser parses XML strings or files and builds a data
structure that conforms to the API of the Document Object Model as
described at `http:' in this node.  See the *Note XML/Parser: XML/Parser,
manpage for other additional properties of the XML::DOM::Parser class.
Note that the 'Style' property should not be used (it is set internally.)

   The XML::Parser *NoExpand* option is more or less supported, in that it
will generate EntityReference objects whenever an entity reference is
encountered in character data. I'm not sure how useful this is. Any
comments are welcome.

   As described in the synopsis, when you create an XML::DOM::Parser
object, the parse and parsefile methods create an *Note XML/DOM/Document:
XML/DOM/Document, object from the specified input. This Document object
can then be examined, modified and written back out to a file or converted
to a string.
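
   The parse, modify and write-back cycle described above can be sketched
as follows. This assumes XML::DOM is installed; the input string and the
added attribute are purely illustrative.

```perl
use strict;
use warnings;
use XML::DOM;

my $parser = XML::DOM::Parser->new;

# parse() takes the XML as a string; parsefile() takes a file name.
my $doc = $parser->parse('<greeting>Hello</greeting>');

# Examine and modify the Document ...
my $root = $doc->getDocumentElement;
my $name = $root->getTagName;
$root->setAttribute('lang', 'en');

# ... then convert it back to a string.
my $xml = $doc->toString;
print $xml, "\n";

# Break circular references once the document is no longer needed.
$doc->dispose;
```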

   When using XML::DOM with XML::Parser version 2.19 and up, setting the
XML::DOM::Parser option *KeepCDATA* to 1 will store CDATASections in
CDATASection nodes, instead of converting them to Text nodes.  Subsequent
CDATASection nodes will be merged into one. Let me know if this is a
problem.

Using LWP to parse URLs
=======================

   The parsefile() method now also supports URLs, e.g.
*http://www.erols.com/enno/xsa.xml*.  It uses LWP to download the file and
then calls parse() on the resulting string.  By default it will use a
*Note LWP/UserAgent: LWP/UserAgent, that is created as follows:

     use LWP::UserAgent;
     $LWP_USER_AGENT = LWP::UserAgent->new;
     $LWP_USER_AGENT->env_proxy;

   Note that env_proxy reads proxy settings from environment variables,
which is what I need to do to get through our firewall. If you want to use
a different LWP::UserAgent, you can either set it globally with:

     XML::DOM::Parser::set_LWP_UserAgent ($my_agent);

   or, you can specify it for a specific XML::DOM::Parser by passing it to
the constructor:

     my $parser = new XML::DOM::Parser (LWP_UserAgent => $my_agent);

   Currently, LWP is used when the filename (passed to parsefile) starts
with one of the following URL schemes: http, https, ftp, wais, gopher, or
file (followed by a colon).  If I missed one, please let me know.

   The LWP modules are part of libwww-perl which is available at CPAN.


File: pm.info,  Node: XML/DOM/PerlSAX,  Next: XML/DOM/ProcessingInstruction,  Prev: XML/DOM/Parser,  Up: Module List

Old name of *Note XML/Handler/BuildDOM: XML/Handler/BuildDOM,
*************************************************************

NAME
====

   XML::DOM::PerlSAX - Old name of *Note XML/Handler/BuildDOM:
XML/Handler/BuildDOM,

SYNOPSIS
========

     See XML::Handler::BuildDOM

DESCRIPTION
===========

   XML::DOM::PerlSAX was renamed to *Note XML/Handler/BuildDOM:
XML/Handler/BuildDOM, to comply with naming conventions for PerlSAX
filters/handlers.

   For backward compatibility, this package will remain in existence (it
simply includes XML::Handler::BuildDOM), but it will print a warning when
running with *'perl -w'*.

AUTHOR
======

   Send bug reports, hints, tips, suggestions to Enno Derksen at
<`enno@att.com'>.

SEE ALSO
========

   *Note XML/Handler/BuildDOM: XML/Handler/BuildDOM,, *Note XML/DOM:
XML/DOM,


File: pm.info,  Node: XML/DOM/ProcessingInstruction,  Next: XML/DOM/Text,  Prev: XML/DOM/PerlSAX,  Up: Module List

An XML processing instruction in XML::DOM
*****************************************

NAME
====

   XML::DOM::ProcessingInstruction - An XML processing instruction in
XML::DOM

DESCRIPTION
===========

   XML::DOM::ProcessingInstruction extends *Note XML/DOM/Node:
XML/DOM/Node,.

   It represents a "processing instruction", used in XML as a way to keep
processor-specific information in the text of the document. An example:

     <?PI processing instruction?>

   Here, "PI" is the target and "processing instruction" is the data.

METHODS
-------

getTarget
     The target of this processing instruction. XML defines this as being
     the first token following the markup that begins the processing
     instruction.

getData and setData (data)
     The content of this processing instruction. This is from the first
     non white space character after the target to the character
     immediately preceding the ?>.
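
   As an illustration, the sketch below parses a small document containing
the processing instruction from the example above and reads its target and
data. The surrounding `<doc>' element is an assumption made only to obtain
a well-formed document.

```perl
use strict;
use warnings;
use XML::DOM;    # also exports node type constants such as
                 # PROCESSING_INSTRUCTION_NODE

my $parser = XML::DOM::Parser->new;
my $doc    = $parser->parse('<doc><?PI processing instruction?></doc>');

# The processing instruction is the only child of <doc>.
my $pi = $doc->getDocumentElement->getFirstChild;

my ($target, $data);
if ($pi->getNodeType == PROCESSING_INSTRUCTION_NODE) {
    $target = $pi->getTarget;
    $data   = $pi->getData;
    print "target=[$target] data=[$data]\n";

    # Replace the data; the target stays the same.
    $pi->setData('new data');
    print $pi->toString, "\n";
}

$doc->dispose;
```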


File: pm.info,  Node: XML/DOM/Text,  Next: XML/DOM/ValParser,  Prev: XML/DOM/ProcessingInstruction,  Up: Module List

A piece of XML text in XML::DOM
*******************************

NAME
====

   XML::DOM::Text - A piece of XML text in XML::DOM

DESCRIPTION
===========

   XML::DOM::Text extends *Note XML/DOM/CharacterData:
XML/DOM/CharacterData,, which extends *Note XML/DOM/Node: XML/DOM/Node,.

   The Text interface represents the textual content (termed character
data in XML) of an Element or Attr. If there is no markup inside an
element's content, the text is contained in a single object implementing
the Text interface that is the only child of the element.  If there is
markup, it is parsed into a list of elements and Text nodes that form the
list of children of the element.

   When a document is first made available via the DOM, there is only one
Text node for each block of text. Users may create adjacent Text nodes
that represent the contents of a given element without any intervening
markup, but should be aware that there is no way to represent the
separations between these nodes in XML or HTML, so they will not (in
general) persist between DOM editing sessions. The normalize() method on
Element merges any such adjacent Text objects into a single node for each
block of text; this is recommended before employing operations that depend
on a particular document structure, such as navigation with XPointers.

METHODS
-------

splitText (offset)
     Breaks this Text node into two Text nodes at the specified offset,
     keeping both in the tree as siblings. This node then contains only
     the content up to the offset point, and a new Text node, inserted as
     the next sibling of this node, contains all the content at and after
     the offset point.

     Parameters:  offset  The offset at which to split, starting from 0.

     Return Value: The new Text node.

     DOMExceptions:

        * INDEX_SIZE_ERR

          Raised if the specified offset is negative or greater than the
          number of characters in data.

        * NO_MODIFICATION_ALLOWED_ERR

          Raised if this node is readonly.
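
   The offset arithmetic can be illustrated on a plain string. The sketch
below does not call XML::DOM at all; `split_at' is a hypothetical helper
that only models which characters stay in this node and which go to the
new sibling, using the offset bounds stated above.

```perl
use strict;
use warnings;

# Hypothetical helper: model splitText(offset) on a plain string.
# Returns (content kept in this node, content of the new sibling node).
sub split_at {
    my ($data, $offset) = @_;
    die "INDEX_SIZE_ERR: bad offset [$offset]\n"
        if $offset < 0 || $offset > length $data;
    return (substr($data, 0, $offset), substr($data, $offset));
}

my ($head, $tail) = split_at('HelloWorld', 5);
print "this node: [$head], new sibling: [$tail]\n";
# prints: this node: [Hello], new sibling: [World]
```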


File: pm.info,  Node: XML/DOM/ValParser,  Next: XML/DOM/XMLDecl,  Prev: XML/DOM/Text,  Up: Module List

an XML::DOM::Parser that validates at parse time
************************************************

NAME
====

   XML::DOM::ValParser - an XML::DOM::Parser that validates at parse time

SYNOPSIS
========

     use XML::DOM::ValParser;

     my %expat_options = (KeepCDATA => 1,
                          Handlers => { Unparsed => \&my_Unparsed_handler });
     my $parser = new XML::DOM::ValParser (%expat_options);

     eval {
         local $XML::Checker::FAIL = \&my_fail;
         my $doc = $parser->parsefile ("fail.xml");
         ... XML::DOM::Document was created successfully ...
     };
     if ($@) {
         # Either XML::Parser (expat) threw an exception or my_fail() died.
         ... your error handling code here ...
         # Note that the XML::DOM::Document is automatically disposed of and
         # will be garbage collected
     }

     # Throws an exception (with die) when an error is encountered; this
     # will stop the parsing process.
     # Don't die if a warning or info message is encountered, just print a message.
     sub my_fail {
         my $code = shift;
         die XML::Checker::error_string ($code, @_) if $code < 200;
         XML::Checker::print_error ($code, @_);
     }

DESCRIPTION
===========

   Use XML::DOM::ValParser wherever you would use *Note XML/DOM/Parser:
XML/DOM/Parser, and your XML will be checked using *Note XML/Checker:
XML/Checker, at parse time.

   See *Note XML/DOM: XML/DOM, for details on XML::DOM::Parser options.
See *Note XML/Checker: XML/Checker, for details on setting the fail
handler (my_fail).

   The following handlers are currently supported, just like
XML::DOM::Parser: Init, Final, Char, Start, End, Default, Doctype,
CdataStart, CdataEnd, XMLDecl, Entity, Notation, Proc, Comment, Attlist,
Element, Unparsed.

XML::DOM::ValParser
===================

   XML::DOM::ValParser extends *Note XML/Checker/Parser:
XML/Checker/Parser,. It creates an *Note XML/Checker: XML/Checker, object
and routes all event handlers through the checker, before processing the
events to create the XML::DOM::Document.

   Just like *Note XML/Checker/Parser: XML/Checker/Parser,, the checker
object can be retrieved with the getChecker() method and can be reused
later on (provided that the DOCTYPE section of the XML::DOM::Document did
not change in the meantime).

   You can control which errors are fatal (and therefore should stop
creation of the XML::DOM::Document) by filtering the appropriate error
codes in the global $XML::Checker::FAIL handler (see `ERROR_HANDLING',
*Note XML/Checker: XML/Checker,) and calling die or croak appropriately.

   Just like XML::Checker::Parser, XML::DOM::ValParser supports the
SkipExternalDTD and SkipInsignifWS options. See *Note XML/Checker/Parser:
XML/Checker/Parser, for details.

AUTHOR
======

   Send bug reports, hints, tips, suggestions to Enno Derksen at
<`enno@att.com'>.

SEE ALSO
========

   *Note XML/DOM: XML/DOM,, *Note XML/Checker: XML/Checker, (`SEE_ALSO',
*Note XML/Checker: XML/Checker,)


File: pm.info,  Node: XML/DOM/XMLDecl,  Next: XML/DT,  Prev: XML/DOM/ValParser,  Up: Module List

XML declaration in XML::DOM
***************************

NAME
====

   XML::DOM::XMLDecl - XML declaration in XML::DOM

DESCRIPTION
===========

   XML::DOM::XMLDecl extends *Note XML/DOM/Node: XML/DOM/Node,, but is not
part of the DOM Level 1 specification.

   It contains the XML declaration, e.g.

     <?xml version="1.0" encoding="UTF-16" standalone="yes"?>

   See also XML::DOM::Document::getXMLDecl.

METHODS
-------

getVersion and setVersion (version)
     Gets and sets the XML version. At the time of this writing the
     version should always be "1.0".

getEncoding and setEncoding (encoding)
     undef may be specified for the encoding value.

getStandalone and setStandalone (standalone)
     undef may be specified for the standalone value.
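
   A minimal sketch of reading and updating the declaration follows. The
input document is made up for illustration, and the sketch assumes the
parser stored the declaration so that getXMLDecl returns an object.

```perl
use strict;
use warnings;
use XML::DOM;

my $parser = XML::DOM::Parser->new;
my $doc    = $parser->parse('<?xml version="1.0" standalone="yes"?><doc/>');

# getXMLDecl returns undef when the document has no XML declaration.
my $decl = $doc->getXMLDecl;
my $version;
if (defined $decl) {
    $version = $decl->getVersion;
    print "version: $version\n";

    my $encoding = $decl->getEncoding;
    print "encoding: ",
          (defined $encoding ? $encoding : "(not specified)"), "\n";

    # Update the declared encoding, then print the whole document.
    $decl->setEncoding('ISO-8859-1');
    print $doc->toString, "\n";
}

$doc->dispose;
```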


File: pm.info,  Node: XML/DT,  Next: XML/Doctype,  Prev: XML/DOM/XMLDecl,  Up: Module List

a package for down translation of XML to strings
************************************************

NAME
====

   XML::DT - a package for down translation of XML to strings

SYNOPSIS
========

     use XML::DT;

     %xml=( 'music'    => sub{"Music from: $c\n"},
            'lyrics'   => sub{"Lyrics from:$c\n (the value of attribute
                                IN is:$v{IN}\n)"},
            'title'    => sub{ uc($c) },
            '-default' => sub{"$q:$c"},
            '-outputenc' => 'ISO-8859-1');
     
     print dt($filename,%xml);

     print dtstring("<arq>
                     <title>Vejam Bem</title>
                     <music>Zeca Afonso</music>
                     </arq>",%xml);

     inctxt('music/lyrics')
     inctxt('music.*')

     ctxt(1)       # the father element

     mkdtskel($file)
     mkdtdskel($file)

DESCRIPTION
===========

   This module processes XML files with an approach similar to OMNIMARK.

   Down translation function `dt' receives a filename and a set of
expressions (functions) defining the processing and associated values for
each element.

   `dtstring' is similar but takes input from a string instead of a file.

`pathdt' function
-----------------

   The `pathdt' function uses a subset of XPath as keys in the handler
hash. Example:

     %handler = (
          "article/title" => sub{ toxml("h1",{},$c) },
          "section/title" => sub{ toxml("h2",{},$c) },
          "title"         => sub{ $c },
          "//image[@type='jpg']" => sub{ "JPEG: <img src=\"$c\">" },
          "//image[@type='bmp']" => sub{ "BMP: sorry, no bitmaps on the web" },
          ...
       )

     pathdt($filename,%handler);

   Here are some examples of valid XPath expressions under XML::DT:

     /aaa
     /aaa/bbb
     //ccc                           - ccc somewhere (same as "ccc")
     /*/aaa/*
     //*                             - same as "-default"
     /aaa[@id]                       - aaa with an attribute id
     /*[@*]                          - root with an attribute
     /aaa[not(@name)]                - aaa with no attribute "name"
     //bbb[@name='foo']              - ... attribute "name" = "foo"
     /ccc[normalize-space(@name)='bbb']
     //*[name()='bbb']               - complex way of saying "//bbb"
     //*[starts-with(name(),'aa')]   - an element named "aa.*"
     //*[contains(name(),'c')]       - an element       ".*c.*"
     //aaa[string-length(name())=4]  -                  "...."
     //aaa[string-length(name())<4]  -                  ".{1,3}"
     //aaa[string-length(name())>5]  -                  ".{6,}"

   For more information, visit www.w3c.org or try a tutorial under
www.zvon.org

`inctxt' function
-----------------

   `inctxt(pattern)' is true if the current element path matches the
provided pattern. This function is meant to be used in the element
functions in order to achieve context dependent processing.

User provided element processing functions
------------------------------------------

   The user must provide a hash with a function for each element that
computes the element's output. Functions can use the element name $q, the
element content $c and the attribute values hash `%v'.

   All those global variables are defined in `$CALLER::'.

   Each time an element is found, the associated function is called.

   Content is calculated by concatenating the element's content strings
and the return values of its interior elements.
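
   A small sketch of handlers that use all three variables is shown
below. The element and attribute names are made up for illustration, and
the `our' declaration is an assumption added so that the sketch compiles
under `use strict' whether the variables are exported by the module or
set in the calling package.

```perl
use strict;
use warnings;
use XML::DT;

# Refer to the package variables that XML::DT fills in for each element.
our ($c, $q, %v);

my %handler = (
    # $c is the content of the element, %v holds its attributes.
    link       => sub { "$c <$v{href}>" },
    # $q is the name of the element being processed.
    '-default' => sub { "[$q: $c]" },
);

my $out = dtstring(
    '<doc><link href="http://local/f.html">example</link></doc>',
    %handler);
print $out, "\n";
```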

`-default' function
-------------------

   When an element has no associated function, the function associated
with `-default' is called. If no `-default' function is defined, the
default function returns an XML-like string for the element.

   When you use `/-type' definitions, you often need to set the `-default'
function to return just the contents: `sub{$c}'.

`-outputenc' option
-------------------

   `-outputenc' defines the output encoding (default is Unicode UTF-8).

`-inputenc' option
------------------

   `-inputenc' forces an input encoding. Whenever possible, define the
input encoding in the XML file instead:

     <?xml version='1.0' encoding='ISO-8859-1'?>

`-pcdata' function
------------------

   The `-pcdata' function is used to define a transformation over the
contents. Typically this function should look at the context (see the
`inctxt' function).

   The default `-pcdata' function is the identity.

`-begin' function
-----------------

   Function to be executed before processing the XML file.

   Example of use: initialization of side-effect variables.

`-end' function
---------------

   Function to be executed after processing the XML file.  It can use the
$c content value.  The value returned by `-end' will be the `dt' return
value.

   Example of use: post-processing of returned contents.

`toxml' function
----------------

   This is the default "-default" function. It can be used to generate XML
based on the $c, $q and `%v' variables. Example: add a new attribute to
element `ele1' without otherwise changing it:

     %handler=( ...
                ele1 => sub { $v{at1} = "v1"; toxml(); },
              )

   `toxml' can also be used with 3 arguments: tag, attributes and contents:

     toxml("a",{href=> "http://local/f.html"}, "example")

   returns:

     <a href='http://local/f.html'>example</a>

Elements with values other than strings (`-type')
=================================================

   By default all elements return strings, and contents ($c) is the
concatenation of the strings returned by the sub-elements.

   In some situations the XML text contains values that are better
processed as a structured type.

   The following types (functors) are available:

     STR  -> concatenates all the subelements returned values (DEFAULT)
          all the subelement should return strings to be concatenated
     SEQ  -> makes an ARRAY with all the sub elements contents; attributes are
          ignored (they should be processed in the subelement). (returns a ref)
     SEQH -> makes an ARRAY of HASH with all the sub elements (returns a ref);
          for each subelement:
                 -q  => element name
                 -c  => contents
                 at1 => at value1    for each attribute
     MAP  -> makes an HASH with the sub elements; keys are the sub-element
          names, values are their contents. Attributes are ignored. (they should
          be processed in the subelement) (returns a ref)
     MULTIMAP -> makes an HASH of ARRAY; keys are the sub-element names;
         values are lists of contents; attributes are ignored (they should be
         processed in the subelement); (returns a ref)
     MMAPON(elementlist) -> makes an HASH with the subelements;
          keys are the sub-element names, values are their contents;
          attributes are ignored (they should be processed in the subelement);
          for all the elements contained in the elementlist, it is created
          an ARRAY with their contents. (returns a ref)
     ZERO -> don't process the subelements; return ""

   When you use `/-type' definitions, you often need to set the `-default'
function to return just the contents: `sub{$c}'.

An example:
-----------

     use XML::DT;
     %handler = ( contacts => sub{ [ split(";",$c)] },
                  -default => sub{$c},
                  -type    => { institution => 'MAP',
                                degrees     =>  MMAPON('name'),
                                tels        => 'SEQ' }
                );
     $a = dt ("f.xml", %handler);

   with the following f.xml

     <degrees>
      <institution>
       <id>U.M.</id>
       <name>University of Minho</name>
       <tels>
         <item>1111</item>
         <item>1112</item>
         <item>1113</item>
       </tels>
       <where>Portugal</where>
       <contacts>J.Joao; J.Rocha; J.Ramalho</contacts>
      </institution>
      <name>Computer science</name>
      <name>Informatica </name>
      <name> history </name>
     </degrees>

   would make $a

     { 'name' => [ 'Computer science',
                   'Informatica ',
                   ' history ' ],
       'institution' => { 'tels' => [ 1111,
                                      1112,
                                      1113 ],
                          'name' => 'University of Minho',
                          'where' => 'Portugal',
                          'id' => 'U.M.',
                          'contacts' => [ 'J.Joao',
                                          ' J.Rocha',
                                          ' J.Ramalho' ] } };

DT Skeleton generation
======================

   It is possible to build an initial processor program based on an
example XML file.

   To do this use the function `mkdtskel(filename)'.

   Example:

     perl -MXML::DT -e 'mkdtskel "f.xml"' > f.pl

DTD skeleton generation
=======================

   `mkdtdskel' makes a naive DTD based on one or more example files.

   To do this use the function `mkdtdskel(filename*)'.

   Example:

     perl -MXML::DT -e 'mkdtdskel "f.xml"' > f.dtd

BUGS
====

   This section is out of date...

Author
======

   Jose Joao, jj@di.uminho.pt

     http://natura.di.uminho.pt/~jj/perl/XML/

   Alberto Simoes <albie@alfarrabio.di.uminho.pt>

   thanks to

     Michel Rodriguez <mrodrigu@ieee.org>
     José Carlos Ramalho <jcr@di.uminho.pt>

module for unicode utf8 to latin1 translation
*********************************************

NAME
====

   `lat1.pm' - module for unicode utf8 to latin1 translation

SYNOPSIS
========

     $latin1string = lat1::utf8($utf8string)

Bugs
====

   Translating the latin1 subset of unicode utf8 is very simple and needs
no tables.
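
   To see why no tables are needed: every Latin-1 character above 0x7F is
encoded in UTF-8 as two bytes, a lead byte 0xC2 or 0xC3 followed by one
continuation byte, so the code point can be rebuilt with bit operations.
The sketch below illustrates this idea only; `utf8_to_latin1' is a
hypothetical name and this is not necessarily how `lat1::utf8' is
implemented. Sequences outside the Latin-1 range are left unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper: convert the Latin-1 subset of a UTF-8 byte string
# to Latin-1 bytes. ASCII bytes pass through; two-byte sequences with a
# lead byte of 0xC2 or 0xC3 are collapsed into a single byte.
sub utf8_to_latin1 {
    my ($bytes) = @_;
    $bytes =~ s/([\xC2\xC3])([\x80-\xBF])/
        chr( ((ord($1) & 0x03) << 6) | (ord($2) & 0x3F) )/gex;
    return $bytes;
}

# "Jos" followed by e-acute, which is 0xC3 0xA9 in UTF-8, 0xE9 in Latin-1.
my $latin1 = utf8_to_latin1("Jos\xC3\xA9");
printf "%vX\n", $latin1;    # prints: 4A.6F.73.E9
```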

   If you need more complex translation, see the perl modules about unicode
and the `recode' command.


