Return-Path: <env_-7806342981862380554@hermes.java.sun.com>
Received: from fort-point-station.mit.edu by po10.mit.edu (8.9.2/4.7) id UAA18659; Tue, 24 Apr 2001 20:24:49 -0400 (EDT)
Received: from hermes.java.sun.com (hermes.java.sun.com [204.160.241.85])
	by fort-point-station.mit.edu (8.9.2/8.9.2) with SMTP id UAA05578
	for <alexp@mit.edu>; Tue, 24 Apr 2001 20:25:33 -0400 (EDT)
Message-Id: <200104250025.UAA05578@fort-point-station.mit.edu>
Date: Tue, 24 Apr 2001 17:25:33 PDT
From: "JDC Tech Tips" <body_-7806342981862380554@hermes.java.sun.com>
To: alexp@mit.edu
Subject: JDC Tech Tips No.
Errors-To: bounced_mail@hermes.java.sun.com
Precedence: junk
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Beyond Email 2.2


 J  D  C    T  E  C  H    T  I  P  S

                      TIPS, TECHNIQUES, AND SAMPLE CODE


WELCOME to the Java Developer Connection(sm) (JDC) Tech Tips, 
April 24, 2001. This issue of the JDC Tech Tips covers 
the following topics about transforming XML documents
in the Java(tm) platform:
 
         * Using XPath Expressions in a Simple Transform
         * Using Rule-Based XSLT

These tips were developed using Java(tm) 2 SDK, Standard Edition, 
v 1.3.

This issue of the JDC Tech Tips is written by Stuart Halloway,
a Java specialist at DevelopMentor (http://www.develop.com/java).

You can view this issue of the Tech Tips on the Web at
http://java.sun.com/jdc/JDCTechTips/2001/tt0424.html

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
USING XPATH EXPRESSIONS IN A SIMPLE TRANSFORM

The June 27, 2000 edition of the JDC Tech Tips
(http://java.sun.com/developer/TechTips/2000/tt0627.html)
showed how to use the SAX and DOM APIs to transform an XML
document. It showed you how to preserve the relevant parts of the 
document content while creating a new document structure. Both 
SAX and DOM are low level approaches, requiring a lot of explicit 
Java code. With SAX you must implement a handler, and provide 
methods that will be called by the parser for each markup event.
With the DOM you must navigate the document tree in code.
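
As a reminder of what the low-level approach involves, here is
a minimal sketch of a SAX handler that collects tip titles. It is
not the code from the June 27, 2000 tip. The class name
TipTitleHandler and the helper titlesOf are made up for this
illustration, and the sketch assumes a current JDK, which bundles
a SAX parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler; the name is illustrative, not from the original tip.
final class TipTitleHandler extends DefaultHandler {
    final List<String> titles = new ArrayList<>();

    // The parser calls this method once for every start tag it reads.
    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if ("tip".equals(qName)) {
            titles.add(atts.getValue("title"));
        }
    }

    // Parses the XML text and returns the title of every tip element.
    static List<String> titlesOf(String xml) {
        try {
            TipTitleHandler handler = new TipTitleHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.titles;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the index", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<tips><tip title=\"Using the SAX API\"/>"
                   + "<tip title=\"Random Access for Files\"/></tips>";
        System.out.println(titlesOf(xml));
    }
}
```

Even this small task needs a class, a callback, and explicit state.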

By comparison, the eXtensible Stylesheet Language: 
Transformations (XSLT) and the XPath expression language provide 
a declarative approach that is far more powerful for certain 
applications than the SAX and DOM APIs. 

To demonstrate the expressive power of XSLT and XPath, this tip
shows how to use a stylesheet to produce the same visible
results as the XML example in the June 27, 2000 edition of the 
JDC Tech Tips. Recall that the example showed how to code the 
index of the JDC Tech Tips in XML. The XML format of the index 
looked like this:

<!-- File Index.xml -->
<tips>
<author id="stu" fullName="Stuart Halloway"/>
<author id="glen" fullName="Glen McCluskey"/>
<tip title="Using the SAX API"
     author="stu"
     htmlURL="http://developer.java.sun.com/developer/TechTips/2000/tt0627.html#tip2"
     textURL="http://developer.java.sun.com/developer/TechTips/txtarchive/June00_Stu.txt">
</tip>
<tip title="Random Access for Files"
     author="glen"
     htmlURL="http://developer.java.sun.com/developer/TechTips/2000/tt0509.html#tip1"
     textURL="http://developer.java.sun.com/developer/TechTips/txtarchive/May00_GlenM.txt">
</tip>     
</tips>

Here is a simple XSLT stylesheet that converts the Index.xml file
above into an HTML document. Notice that the stylesheet itself is
just another XML document.

<HTML xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xsl:version="1.0">
<!-- File Xform.xsl -->
<BODY><H1>JDC Tech Tips Archive</H1>
<xsl:for-each select="/tips/tip">
<br>
<A HREF="{@htmlURL}"> HTML </A> |
<A HREF="{@textURL}"> TEXT </A> |
<xsl:value-of select="@title"/>
</br>
</xsl:for-each>
</BODY>
</HTML>

To transform the Index.xml document using the XSLT stylesheet,
you need to run both documents through an XSLT processor of your 
choice. For example, you can use the open source Java Xalan 
processor, which you can download from the Apache XML project
download page for Xalan 
(http://xml.apache.org/xalan-j/index.html). To run the transform 
using Xalan, set your class path to point to the Xerces and Xalan 
JAR files and use the following command line:

  java org.apache.xalan.xslt.Process -in Index.xml 
    -xsl Xform.xsl -out Index.html

The transform takes an input document (Index.xml), processes it 
according to rules in the stylesheet (Xform.xsl), and produces 
an HTML index in the file Index.html.
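
If you would rather run the transform from Java code than from the
command line, the following sketch shows one way to do it with the
standard javax.xml.transform API (JAXP). It assumes a JDK that
bundles JAXP and an XSLT processor. The class name RunTransform,
the method name transform, and the shortened URL in main are
placeholders for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper; class and method names are illustrative only.
final class RunTransform {
    // Applies the stylesheet text to the document text and returns the output.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        // A one-tip index with a placeholder URL, to keep the example short.
        String xml = "<tips><tip title=\"Using the SAX API\""
            + " htmlURL=\"tt0627.html#tip2\"/></tips>";
        // A cut-down version of Xform.xsl in the simplified syntax.
        String xsl = "<HTML xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
            + " xsl:version=\"1.0\"><BODY>"
            + "<xsl:for-each select=\"/tips/tip\">"
            + "<A HREF=\"{@htmlURL}\"> HTML </A> | "
            + "<xsl:value-of select=\"@title\"/>"
            + "</xsl:for-each></BODY></HTML>";
        System.out.println(transform(xml, xsl));
    }
}
```

To work with files instead of strings, you could pass
new StreamSource(new File("Index.xml")) and 
new StreamResult(new File("Index.html")) to the same calls.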

Now let's look at what happens when this stylesheet is processed.
The XSLT processor sees the xsl:version attribute on the root HTML
element, and infers that it should use the simplified stylesheet
syntax. In this case, the processor simply copies the content of
the stylesheet to the output document until it sees an XSLT
element. The first XSLT element is <xsl:for-each
select="/tips/tip">. So, the processor first directs to the output
document everything before that element, that is, the <HTML> start
tag and the line beginning with <BODY>. 

The for-each element in <xsl:for-each select="/tips/tip"> is an
iterator similar to a "for" loop in the Java programming 
language. The value of the select attribute is an XPath
expression, and the contents of the for-each element will be
processed once for each element in the source document that
matches that expression. Simple XPath expressions such as this
one leverage the familiar "/"-delimited syntax of file systems.
In other words, "/tips/tip" matches all tip elements that are
children of a top-level "tips" element.
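
To experiment with path expressions outside a stylesheet, you can
evaluate them with the standard javax.xml.xpath API. The sketch
below counts the nodes that an expression selects. It assumes
a JDK that includes javax.xml.xpath (this package is newer than
the SDK used for this tip). The names SelectTips and countMatches
are invented for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; class and method names are illustrative only.
final class SelectTips {
    // Returns how many nodes the XPath expression selects in the XML text.
    static int countMatches(String xml, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression,
                          new InputSource(new StringReader(xml)),
                          XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("bad expression or document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<tips><author id=\"stu\"/>"
            + "<tip title=\"Using the SAX API\"/>"
            + "<tip title=\"Random Access for Files\"/></tips>";
        // Selects only tip elements that are children of the top-level tips.
        System.out.println(countMatches(xml, "/tips/tip"));
    }
}
```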

Let's follow the processor one time through the loop body. The
first thing the processor encounters in the loop body is the <br> 
element; the processor directs it to the output file. An <A> 
element follows the <br> element. Notice the unusual attribute 
HREF="{@htmlURL}". The curly braces delimit an attribute value 
template that needs to be expanded by the processor. The @htmlURL 
is another XPath expression. The @ sign is a shorthand notation 
that tells the processor to select the attribute whose name 
follows it. Here that attribute is htmlURL. The XPath 
expression @htmlURL does not begin with "/", so it does not give 
a full path from the root of the document tree. Because a
full path is not specified, the processor begins with the
context node, that is, the node that is currently being processed
(here, the current tip element). The rest of the body just repeats
things you have seen before. One exception is the xsl:value-of 
element, which simply produces the string value of the XPath 
expression in its select attribute. 
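
The role of the context node can also be seen outside XSLT. The
following sketch evaluates a relative expression such as @htmlURL
once per tip element, each time with that tip as the context node,
much as the body of xsl:for-each does. It uses the standard DOM
and javax.xml.xpath APIs of a current JDK. The names RelativePaths
and perTip, and the short URLs in main, are assumptions made for
this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; class and method names are illustrative only.
final class RelativePaths {
    // Evaluates the relative expression once per tip element, using that
    // tip as the context node, and returns the resulting string values.
    static List<String> perTip(String xml, String relativeExpression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList tips = (NodeList) xpath.evaluate(
                "/tips/tip", doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < tips.getLength(); i++) {
                values.add(xpath.evaluate(relativeExpression, tips.item(i)));
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<tips>"
            + "<tip title=\"Using the SAX API\" htmlURL=\"tt0627.html\"/>"
            + "<tip title=\"Random Access for Files\" htmlURL=\"tt0509.html\"/>"
            + "</tips>";
        System.out.println(perTip(xml, "@htmlURL"));
    }
}
```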

The result of the first loop iteration should look something like 
this:

<br>
<A
HREF="http://developer.java.sun.com/developer/TechTips/2000/tt0627.html#tip2"> 
HTML </A> |
<A
HREF="http://developer.java.sun.com/developer/TechTips/txtarchive/June00_Stu.txt"> 
TEXT </A> | Using the SAX API
</br>

This very simple transform is already an improvement over the DOM
and SAX examples from the June 27, 2000 tip. The code is shorter, 
and the structure of the output document is evident from the
stylesheet. This style of transform is often called "fill in the
blanks" because the stylesheet is basically poured directly
into the output document, with data from the input document 
filling in the "blanks" denoted by XPath expressions.

The fill-in-the-blanks approach looks like a verbose syntax for
JavaServer Pages(tm) (JSP(tm)). The simplified syntax is intended 
for exactly this simple case, where the entire stylesheet amounts 
to a single template for the root of the document. The next tip 
("Using Rule-Based XSLT") presents a more sophisticated use of 
XSLT that looks nothing like JSPs or procedural Java code.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
USING RULE-BASED XSLT

XSLT is particularly valuable when you write rule-based
stylesheets. A rule-based stylesheet decomposes the 
transformation into a set of template rules that describe each 
sub-transformation that needs to occur. However (and this is the 
critical difference) you do not specify any order or structure 
for the rules. You simply enumerate the template rules, and the 
XSLT processor determines when to apply them. For the tech tip
transform described in the tip "Using XPath Expressions in a 
Simple Transform," you might start by creating a plain-English 
list of rules like this:

1. List the title of each tip
2. Create a link to the HTML version of a tip, if one exists
3. Create a link to the text version of a tip, if one exists

Here are the corresponding XSLT template rules:

<!-- list the title of a tip -->
<xsl:template match="tip">
<br><xsl:apply-templates select="@*"/><xsl:value-of select="@title"/></br>
</xsl:template>

<!-- create a link to any htmlURL -->
<xsl:template match="@htmlURL">
<A HREF="{.}"> HTML </A> |
</xsl:template>

<!-- create a link to any textURL -->
<xsl:template match="@textURL">
<A HREF="{.}"> TEXT </A> |
</xsl:template>

Each rule begins with a match attribute, whose pattern (written 
in XPath syntax) determines when the rule should be used. The 
first rule runs whenever an element named "tip" is encountered. 
It produces the value of the title attribute for that tip, 
delimited by <br>...</br>. The call to 
<xsl:apply-templates select="@*"/> tells the processor to continue 
looking for rules that could match an attribute of the tip 
element. (If you do not explicitly continue processing, the 
processor will stop looking for matching rules.)

The second and third rules match attributes instead of elements.
That's why there's the '@' in the match expression. Each rule 
produces the text for an anchor tag. As in the tip, "Using XPath 
Expressions in a Simple Transform," the curly braces denote
an attribute value template. The dot is shorthand for the context
node, in this case the attribute that caused the template to
match; its value becomes the HREF.

To use these template rules, you simply need a rule to set up the
framework of an HTML document. You also need a top-level 
stylesheet element to identify the document. Putting it all 
together you get:

<!-- File Xform2.xsl -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<!-- create a link to any htmlURL -->
<xsl:template match="@htmlURL">
<A HREF="{.}"> HTML </A> |
</xsl:template>

<xsl:template match="/">
<HTML><BODY><H1>JDC Tech Tips Archive</H1>
<xsl:apply-templates/>
</BODY></HTML>
</xsl:template>

<!-- create a link to any textURL -->
<xsl:template match="@textURL">
<A HREF="{.}"> TEXT </A> |
</xsl:template>

<!-- ignore other attributes -->
<xsl:template match="@*"/>

<!-- list the title of a tip -->
<xsl:template match="tip">
<br><xsl:apply-templates select="@*"/><xsl:value-of select="@title"/></br>
</xsl:template>

</xsl:stylesheet>
<!-- end file Xform2.xsl -->

The templates are deliberately listed in an odd order to 
emphasize the fact that order is not important. 
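
One way to convince yourself that the order of these rules does
not matter is to run the same rules in two different orders and
compare the output. The sketch below does this with the JAXP API of
a current JDK. To keep the output easy to compare, it uses 
cut-down rules that write plain text instead of HTML. The names 
TemplateOrder, stylesheet, and transform are invented for this 
example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo; class and method names are illustrative only.
final class TemplateOrder {
    // Wraps the template rules, in the order given, in a stylesheet
    // that writes plain text.
    static String stylesheet(String... templates) {
        StringBuilder sb = new StringBuilder(
            "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
            + " version=\"1.0\"><xsl:output method=\"text\"/>");
        for (String template : templates) {
            sb.append(template);
        }
        return sb.append("</xsl:stylesheet>").toString();
    }

    // Applies the stylesheet text to the document text.
    static String transform(String xml, String xsl) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)))
                .transform(new StreamSource(new StringReader(xml)),
                           new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        String tip = "<xsl:template match=\"tip\">"
            + "<xsl:apply-templates select=\"@*\"/>"
            + "<xsl:value-of select=\"@title\"/>;</xsl:template>";
        String html = "<xsl:template match=\"@htmlURL\">"
            + "[<xsl:value-of select=\".\"/>]</xsl:template>";
        String other = "<xsl:template match=\"@*\"/>";
        String xml = "<tips><tip title=\"A\" htmlURL=\"u1\"/></tips>";

        String forward = transform(xml, stylesheet(tip, html, other));
        String reversed = transform(xml, stylesheet(other, html, tip));
        // Compares the results of the two orderings.
        System.out.println(forward.equals(reversed));
    }
}
```

The more specific pattern @htmlURL takes precedence over @* because
of its higher default priority, not because of its position in the
file.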

You can run this transform using Xalan as follows:

  java org.apache.xalan.xslt.Process -in Index.xml 
    -xsl Xform2.xsl -out Index.html
    
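The processor begins at the root node, which matches the template
with match="/". Each xsl:apply-templates element then makes the
processor search for templates that match the selected nodes.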

You should get an Index.html output file that is substantially the
same as the one you produced in the tip "Using XPath Expressions 
in a Simple Transform." (Note that the general process is more 
complex than what is shown here. There are built-in templates that 
process content if none of your templates match. There are also 
priorities and modes to control how the processor chooses between 
multiple templates that might match.)
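
The built-in templates are easy to observe: run a stylesheet that
contains no template rules at all. The built-in rules walk down
through the elements and copy text nodes to the output, but they do
not visit attributes. The sketch below assumes the JAXP API of
a current JDK. The class name BuiltInRules and the tiny sample 
document are made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo; class and method names are illustrative only.
final class BuiltInRules {
    // A stylesheet with no template rules, writing plain text.
    static final String EMPTY_STYLESHEET =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " version=\"1.0\"><xsl:output method=\"text\"/></xsl:stylesheet>";

    // Applies the empty stylesheet, so only built-in rules can fire.
    static String applyBuiltInsOnly(String xml) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                .newTransformer(
                    new StreamSource(new StringReader(EMPTY_STYLESHEET)))
                .transform(new StreamSource(new StringReader(xml)),
                           new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        // The text "hi" and "there" is copied; the attribute value is not.
        System.out.println(
            applyBuiltInsOnly("<a><b>hi</b><c x=\"skipped\">there</c></a>"));
    }
}
```

This is also why the tip index produces no stray text: all of its
data is stored in attributes, which the built-in rules never reach
on their own.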

When you first look at it, this new transform is harder to read
than the fill-in-the-blanks version. In fact, you might think it 
is harder to read than an equivalent DOM or SAX program. The 
benefit of the rule-based version is its power to cope with
variations in the input data. Imagine that the input file schema
changed so that tips were listed underneath an author element:

<!-- File Index2.xml -->
<tips>
<author id="stu" fullName="Stuart Halloway">
<tip title="Using the SAX API"
     htmlURL="http://developer.java.sun.com/developer/TechTips/2000/tt0627.html#tip2"
     textURL="http://developer.java.sun.com/developer/TechTips/txtarchive/June00_Stu.txt">
</tip>
</author>
</tips>

If you try to run the fill-in-the-blanks version of the transform
on this new file, the tips will be lost. That's because the
fill-in-the-blanks transform expects that all tip elements are
direct children of a top-level tips element. The rule-based
stylesheet can handle this change without difficulty because its
tip rule matches tip elements wherever they are found, and the
built-in templates carry processing down through the new author
element.
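
The difference can be reproduced with two very small stylesheets.
The sketch below is a cut-down comparison, not the Xform.xsl and
Xform2.xsl files themselves. Both stylesheets write plain text so
that the results are easy to inspect. It assumes the JAXP API of
a current JDK, and the names SchemaChange, FILL_IN, and RULE_BASED
are invented for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo; class, field, and method names are illustrative only.
final class SchemaChange {
    static final String HEAD =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " version=\"1.0\"><xsl:output method=\"text\"/>";

    // Fill-in-the-blanks style: one loop over an absolute path.
    static final String FILL_IN = HEAD
        + "<xsl:template match=\"/\"><xsl:for-each select=\"/tips/tip\">"
        + "<xsl:value-of select=\"@title\"/>;</xsl:for-each>"
        + "</xsl:template></xsl:stylesheet>";

    // Rule-based style: one rule that matches tip elements anywhere.
    static final String RULE_BASED = HEAD
        + "<xsl:template match=\"tip\">"
        + "<xsl:value-of select=\"@title\"/>;</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet text to the document text.
    static String transform(String xml, String xsl) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)))
                .transform(new StreamSource(new StringReader(xml)),
                           new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        // Same shape as Index2.xml: the tip is nested inside its author.
        String nested = "<tips><author id=\"stu\">"
            + "<tip title=\"Using the SAX API\"/></author></tips>";
        System.out.println("fill-in:    [" + transform(nested, FILL_IN) + "]");
        System.out.println("rule-based: [" + transform(nested, RULE_BASED) + "]");
    }
}
```

With the nested document, the first stylesheet should find no tips,
while the second should still list the tip title.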

The rule-based transform also adapts better to changing
requirements. Return to the original Index.xml document.
Imagine that you wanted to list all the authors at the top of 
the index, and then credit the author for each tip, 
individually. All you need to do is add a new rule to list the
authors, and modify the existing title rule to credit them.
Changes are localized to the element or attribute type they 
modify, regardless of their location in the document. 
Here's what the new and modified rules look like:

<!-- modifications to Xform2.xsl -->
<!-- list the authors -->
<xsl:template match="tips">
<br>Authors:
<xsl:for-each select="author">
<xsl:value-of select="@fullName"/>
<xsl:if test="position()!=last()">, </xsl:if>
</xsl:for-each>
</br>
<xsl:apply-templates/>
</xsl:template>

<!-- replace the rule to list the title of a tip with this rule -->
<xsl:template match="tip">
<br><xsl:apply-templates select="@*"/><xsl:value-of select="@title"/>
by <xsl:value-of select="//*[@id=current()/@author]/@fullName"/></br>
</xsl:template>

Add these rules to your Xform2.xsl stylesheet and re-run the
transform:

  java org.apache.xalan.xslt.Process -in Index.xml 
    -xsl Xform2.xsl -out Index2.html

This will produce an HTML document that begins with a list of
authors, and lists the author name after each tip title. If you
want to really appreciate the simplicity of this change, try to
implement this new transformation in the fill-in-the-blanks style,
or with the SAX or DOM APIs.

These rules demonstrate several other features of XSLT. The author
list rule uses an xsl:if element to generate a comma after each 
author except the last. The new title rule uses a more complex 
XPath expression to look up each author's full name; it does this 
by correlating the tip element's author attribute with the author 
element's id attribute.
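
To try the comma logic and the current() lookup in isolation, you
could run a text-only variant of the two rules from Java code. The
sketch below keeps the same select and test expressions as the
rules above, but it writes plain text with ";" as a separator
instead of HTML. It assumes the JAXP API of a current JDK, and the
names AuthorLookup, STYLESHEET, and transform are invented for the
example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo; class, field, and method names are illustrative only.
final class AuthorLookup {
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " version=\"1.0\"><xsl:output method=\"text\"/>"
        // List the authors, with a comma after all but the last one.
        + "<xsl:template match=\"tips\">Authors: "
        + "<xsl:for-each select=\"author\">"
        + "<xsl:value-of select=\"@fullName\"/>"
        + "<xsl:if test=\"position()!=last()\">, </xsl:if>"
        + "</xsl:for-each>;<xsl:apply-templates/></xsl:template>"
        // Credit each tip by matching its author attribute to an id.
        + "<xsl:template match=\"tip\"><xsl:value-of select=\"@title\"/> by "
        + "<xsl:value-of select=\"//*[@id=current()/@author]/@fullName\"/>;"
        + "</xsl:template></xsl:stylesheet>";

    // Applies STYLESHEET to the document text.
    static String transform(String xml) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)))
                .transform(new StreamSource(new StringReader(xml)),
                           new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<tips>"
            + "<author id=\"stu\" fullName=\"Stuart Halloway\"/>"
            + "<author id=\"glen\" fullName=\"Glen McCluskey\"/>"
            + "<tip title=\"Using the SAX API\" author=\"stu\"/>"
            + "<tip title=\"Random Access for Files\" author=\"glen\"/>"
            + "</tips>";
        System.out.println(transform(xml));
    }
}
```

Inside the predicate, current() refers to the tip being processed,
whereas a bare @author would refer to the element being tested by
the predicate.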

For more information about XSLT, see the book "XSLT: Programmer's 
Reference" by Michael Kay, published by Wrox Press, Inc.

.  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .

- ARCHIVES
You'll find the JDC Tech Tips archives at:

http://java.sun.com/jdc/TechTips/index.html


- COPYRIGHT
Copyright 2001 Sun Microsystems, Inc. All rights reserved.
901 San Antonio Road, Palo Alto, California 94303 USA.

This document is protected by copyright. For more information, see:

http://java.sun.com/jdc/copyright.html


Sun, Sun Microsystems, Java, Java Developer Connection, 
JavaServer Pages, and JSP are trademarks or registered trademarks 
of Sun Microsystems, Inc. in the United States and other 
countries.

