Return-Path: <Mailing@hermes.java.sun.com>
Received: from pacific-carrier-annex.mit.edu by po10.mit.edu (8.9.2/4.7) id PAA04337; Tue, 10 Apr 2001 15:37:36 -0400 (EDT)
Received: from hermes.java.sun.com (hermes.java.sun.com [204.160.241.85])
	by pacific-carrier-annex.mit.edu (8.9.2/8.9.2) with SMTP id PAA20734
	for <alexp@mit.edu>; Tue, 10 Apr 2001 15:38:16 -0400 (EDT)
Date: Tue, 10 Apr 2001 15:38:16 -0400 (EDT)
Message-Id: <200104101938.PAA20734@pacific-carrier-annex.mit.edu>
From: "JDC Tech Tips"<Mailing@hermes.java.sun.com>
To: alexp@mit.edu
Subject: JDC Tech Tips  April 10, 2001
Reply-To: JDCTechTips@hermes.java.sun.com
Errors-To: bounced_mail@hermes.java.sun.com
Precedence: junk
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Beyond Email 2.2


 J  D  C    T  E  C  H    T  I  P  S

                      TIPS, TECHNIQUES, AND SAMPLE CODE


WELCOME to the Java Developer Connection(sm) (JDC) Tech Tips, 
April 10, 2001. This issue covers:

         * Making Deep Copies of Objects
         * Using strictfp
         * Optimizing String Performance
         
These tips were developed using Java(tm) 2 SDK, Standard Edition, 
v 1.3.

You can view this issue of the Tech Tips on the Web at
http://java.sun.com/jdc/JDCTechTips/2001/tt0410.html

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
MAKING DEEP COPIES OF OBJECTS

The March 6, 2001 JDC Tech Tip "Cloning Objects"
(http://java.sun.com/jdc/JDCTechTips/2001/tt0306.html#cloning)
described how to make copies of objects, using a combination of 
Object.clone and your own clone methods. Object.clone, and 
sometimes user-defined clone methods, make what are called 
"shallow" copies, that is, they copy object fields and 
references, but they do not recursively copy the objects 
referenced by these fields.

Suppose you have an ArrayList object. If you call its clone 
method, it will create a new ArrayList object, and copy into it 
the various element references of the original list. If these 
elements are mutable, that is, they can be changed, then changing 
one of them in the original list will also change the element in 
the new list. In other words, two ArrayList objects have common 
elements through shared references. But this sharing might not
be what you want. You might want to make a list copy, but not 
share updates to elements in the two lists.

These type of totally distinct copies are called "deep copies." 
Let's look at a few deep copy techniques in the following 
program:

    import java.io.*;
    import java.util.*;
    
    class A implements java.io.Serializable {
        private int x;
    
        public A(int i) {
            setX(i);
        }
        public void setX(int i) {
            x = i;
        }
        public int getX() {
            return x;
        }
    }
    
    public class DeepDemo {
    
        // shallow copy
        static ArrayList copy1(ArrayList list) {
            return (ArrayList)list.clone();
        }
    
        // hand-coded deep copy
        static ArrayList copy2(ArrayList list) {
            ArrayList newlist = new ArrayList();
            A aobj = (A)list.get(0);
            newlist.add(new A(aobj.getX()));
            return newlist;
        }
    
        // deep copy via serialization
        static ArrayList copy3(ArrayList list) throws Exception {
    
            // serialize ArrayList into byte array
    
            ByteArrayOutputStream baos =
                new ByteArrayOutputStream(100);
            ObjectOutputStream oos = new ObjectOutputStream(baos);
            oos.writeObject(list);
            byte buf[] = baos.toByteArray();
            oos.close();
    
            // deserialize byte array into ArrayList
    
            ByteArrayInputStream bais =
                new ByteArrayInputStream(buf);
            ObjectInputStream ois = new ObjectInputStream(bais);
            ArrayList newlist = (ArrayList)ois.readObject();
            ois.close();
    
            return newlist;
        }
    
        static final int NUMITERS = 50000;
    
        public static void main(String args[]) throws Exception {
            ArrayList listold, listnew;
            long currtime, elapsedtime2, elapsedtime3;
    
            // shallow copy
    
            listold = new ArrayList();
            listold.add(new A(37));
            listnew = copy1(listold);
            ((A)listold.get(0)).setX(47);
            System.out.print("copy1 old/new = ");
            System.out.print(((A)listold.get(0)).getX() + " ");
            System.out.println(((A)listnew.get(0)).getX());
    
            // deep copy via hand-coded method
    
            listold = new ArrayList();
            listold.add(new A(37));
            currtime = System.currentTimeMillis();
            for (int i = 1; i <= NUMITERS; i++) {
                listnew = copy2(listold);
            }
            elapsedtime2 = System.currentTimeMillis() - currtime;
            ((A)listold.get(0)).setX(47);
            System.out.print("copy2 old/new = ");
            System.out.print(((A)listold.get(0)).getX() + " ");
            System.out.println(((A)listnew.get(0)).getX());
    
            // deep copy via serialization
    
            listold = new ArrayList();
            listold.add(new A(37));
            currtime = System.currentTimeMillis();
            for (int i = 1; i <= NUMITERS; i++) {
                listnew = copy3(listold);
            }
            elapsedtime3 = System.currentTimeMillis() - currtime;
            ((A)listold.get(0)).setX(47);
            System.out.print("copy3 old/new = ");
            System.out.print(((A)listold.get(0)).getX() + " ");
            System.out.println(((A)listnew.get(0)).getX());
    
            System.out.println();
            System.out.println("copy2 time = " + elapsedtime2);
            System.out.println("copy3 time = " + elapsedtime3);
        }
    }

The DeepDemo program uses an ArrayList object containing a single
element of type A. Objects of type A are mutable through the setX 
method. 

The program defines three copy methods. The first method, copy1, 
does a shallow copy using the ArrayList.clone method. This method 
creates a new list and copies object references from the old list. 
After the copy1 method is called, the program changes the value of 
the single element in the old list and checks whether the value of 
the element in the new list is also changed. Here, a change to the 
new list indicates a shallow copy.

The second approach to copying uses a hand-coded method, copy2, to 
make a deep copy. The method sets up a new ArrayList, gets the 
value of the element from the old list, and adds a new A object 
with this value to the new list.

The third approach, copy3, uses serialization. It uses writeObject 
to convert the ArrayList into a byte stream stored in a
ByteArrayOutputStream object, and then reverses the process,
converting the stream of bytes back into an ArrayList. This 
process forces the creation of new ArrayList and A objects.

When you run the program, the output should look something like 
this:

    copy1 old/new = 47 47
    copy2 old/new = 47 37
    copy3 old/new = 47 37

    copy2 time = 47
    copy3 time = 8500

Both hand-coded deep copying (copy2) and serialization (copy3)  
keep the copies distinct. But there's an apparent issue here -- 
performance. The serialization approach is much slower than the
hand-coded approach. However, this time discrepancy doesn't mean
that you should avoid serialization when making deep copies. The
DeepDemo results show that it took 8,500 milliseconds to make the
copies. If you examine DeepDemo, you'll see that it does the
copying 50,000 times. Doing 50,000 copies in 8,500 milliseconds
means that it didn't take very long to do a single copy. And time
might not be a critical factor in the part of your application 
that makes these copies.

There's another big issue here besides performance. The manual or 
hand-coded approach doesn't scale very well to complex data 
structures. For example, if you need to make a deep copy of a
graph or tree structure, doing it yourself can get pretty
complicated. So comparing the serialization approach to the 
manual approach, you can say that the serialization approach 
costs a lot more, measured in time, than the manual approach, but 
it does a lot more as well.

A final point about copying through serialization is that 
all the objects that the serialization process encounters must be
serializable. In the DeepDemo example, A got this property by
implementing the java.io.Serializable marker interface. ArrayList 
does likewise.

For more information about deep copying, see Section 3.9.3, 
Shallow versus Deep Cloning, and  Section 15.7, Object 
Serialization, in "The Java(tm) Programming Language Third 
Edition" by Arnold, Gosling, and Holmes 
(http://java.sun.com/docs/books/javaprog/thirdedition/).

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
USING STRICTFP

The keyword "strictfp" was a late addition to the Java 
programming language. It is used to control certain aspects of 
floating-point arithmetic. It can be a bit difficult to
describe and illustrate strictfp, so this tip will do it
in a stepwise way. The tip starts with a few examples of syntax,
then it presents an example that shows where strictfp might be 
important.

You can use strictfp as a modifier of class, interface, and 
method declarations, like this:

    // legal uses of strictfp

    strictfp interface A {}

    public strictfp class FpDemo1 {
        strictfp void f() {}
    }

You cannot use strictfp on constructors or methods within
interfaces:

    // illegal uses of strictfp

    interface A {
        strictfp void f();
    }

    public class FpDemo2 {
        strictfp FpDemo2() {}
    }

The strictfp keyword is used to designate an expression as
"FP-strict." If a class, interface, or method is declared using
strictfp, then the class, interface, or method  is FP-strict.
So too are all classes, interfaces, methods, constructors, 
instance initializers, variable initializers, and static 
initializers within the declaration.

An expression is FP-strict if it occurs anywhere within one of 
these FP-strict declarations, or if it is a compile-time 
constant expression. In practical terms, this means that 
if a class or method is declared with strictfp, any 
expression that occurs within the class or method is an
FP-strict expression.

Because constructors cannot be declared using strictfp, they are
FP-strict only if their defining class is FP-strict. Methods
within interfaces cannot be declared using strictfp because this 
is an implementation rather than an interface property.

So what does FP-strict actually mean? Consider the following 
example:

    public strictfp class FpDemo3 {
        public static void main(String[] args) {
            double d = 8e+307;
            System.out.println(4.0 * d * 0.5);
            System.out.println(2.0 * d);
        }
    }

The maximum value of a double (Double.MAX_VALUE) is approximately 
1.8e+308. In the FpDemo3 example, the first expression is
evaluated as:

    (4.0 * d) * 0.5

because the Java programming language guarantees a left-to-right 
order of evaluation. When the first part of the expression is 
evaluated, the result is 32e+307 or 3.2e+308, which is larger 
than Double.MAX_VALUE.

Because the expression is FP-strict, the implementation is 
required to evaluate the whole expression as resulting in 
positive infinity (which the program prints as "Infinity"). This 
is true even though the later multiplication by 0.5 produces a 
final value for the expression of 1.6e+308, which is less than 
Double.MAX_VALUE.

By contrast, if the expression is not FP-strict, an 
implementation is allowed to use an extended exponent range to 
represent intermediate results. In the FpDemo3 example, this 
could keep the expression from overflowing, and produce a final 
result that is within range. An implementation is not required to 
do this; it can, in effect, use FP-strict rules everywhere.

Note that multiplication in this example is not associative, in
that:

    (4.0 * d) * 0.5 != 4.0 * (d * 0.5)

strictfp is important because its use guarantees common behavior
across different Java implementations. In other words, you can 
know that the floating-point arithmetic in your application 
behaves the same when you move your application to a different 
Java implementation or hardware platform.

For more information about strictfp, see Section 6.6.3, Strict 
and non-Strict Floating-Point Arithmetic in "The Java(tm) 
Programming Language Third Edition" by Arnold, Gosling, and 
Holmes
(http://java.sun.com/docs/books/javaprog/thirdedition/); and
Section 15.4, FP-strict Expressions, in "The Java(tm) Language 
Specification Second Edition" by Gosling, Joy, Steele, and Bracha 
(http://java.sun.com/docs/books/jls/).

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
OPTIMIZING STRING PERFORMANCE

Suppose that you're working with a Java programming language
application where you have lists of Integer wrapper objects.
Suppose too that the application frequently converts these 
lists to strings. In other words, the application might have
code like this:

    List list = new ArrayList();
    list.add(new Integer(37));
    list.add(new Integer(47));
    ...
    String s = list.toString();

The toString method converts lists into a string, formatted like
this:

    [37, 47]

Is there a way you can optimize the conversions? To answer this
question, it's useful to look at what happens internally when 
a list is converted to a string. A StringBuffer object is first
set up. Then each list element is converted to a string and 
appended to the buffer. In the final step, the StringBuffer 
object itself is converted to a String object.

List elements are converted to strings by calling a series of
methods, the most important of which is Integer.toString(value,
radix). This method takes an integer value, such as 59, and 
produces a new string from it. The conversion is expensive, in 
part because the value must be picked apart to obtain individual 
digits in the string representation. It's also expensive because 
a new string has to be created.

Can you do better than the standard approach? The answer is yes, 
at least in some cases. Consider an example where the list 
elements are Integer objects with values in the range 0-99. In 
this case, it's possible to write a local version of toString 
that is faster than the standard method. The code looks like 
this:

    import java.util.*;
    
    public class PerfDemo {
        // length of ArrayList objects
        static final int MAXLISTLEN = 10000;
    
        // maximum value in Integer object
        static final int MAXNUM = 99;
    
        // cache of formatted strings for small integers
        static String cache[] = new String[MAXNUM + 1];
    
        // fill cache at program startup
        static {
            for (int i = 0; i <= MAXNUM; i++) {
                cache[i] = Integer.toString(i);
            }
        }
    
        // local version of toString, that uses the cache
        static String localtoString(List list) {
            StringBuffer sb = new StringBuffer();
            sb.append("[");
            int size = list.size();
            for (int i = 0; i < size; i++) {
                Integer iobj = (Integer)list.get(i);
                sb.append(cache[iobj.intValue()]);
                if (i + 1 < size) {
                    sb.append(", ");
                }
            }
            sb.append("]");
            return sb.toString();
        }
    
        public static void main(String args[]) {
    
            Random rn = new Random(0);
    
            List list = new ArrayList();
    
            // fill list with random Integer values 0-99
    
            for (int i = 1; i <= MAXLISTLEN; i++) {
                int r = rn.nextInt(MAXNUM + 1);
                list.add(new Integer(r));
            }
    
            String s1 = null;
            String s2 = null;
    
            // do timing of standard approach
    
            long currtime = System.currentTimeMillis();
            for (int i = 1; i <= 100; i++) {
                s1 = list.toString();
            }
            long elapsed = System.currentTimeMillis() - currtime;
            System.out.println("standard toString time = " +
                elapsed);
    
            // do timing of custom approach
    
            currtime = System.currentTimeMillis();
            for (int i = 1; i <= 100; i++) {
                s2 = localtoString(list);
            }
            elapsed = System.currentTimeMillis() - currtime;
            System.out.println("local toString time = " + elapsed);
    
            // check to make sure same strings are produced
    
            if (!s1.equals(s2)) {
                System.out.println("error");
            }
        }
    }

The PerfDemo program sets up a string cache for integer values 
0-99. In other words, the program precomputes the string values 
that are the result of converting the integers 0-99 to strings. 
Using this cache, it's a simple matter within localtoString to 
call the intValue method on each Integer object, and then use the 
value as an index into the cache table.

If you run PerfDemo, the result should look something like this:

    standard toString time = 1719

    local toString time = 781

These results show that the local method runs more than twice as 
fast as the standard one. This is because the local method avoids 
integer-to-string conversions and string creation.

The optimization is not general, in that you cannot expand the 
cache technique to handle any range of integers. You can term
this performance technique "trading space for speed" or 
"precomputing results." Unfortunately, it becomes unreasonable if 
the cache table is too large. But many optimizations are of this 
type, that is, they achieve performance gains by exploiting 
knowledge about the particular data and operations in an 
application.

Another angle on replacing standard library methods with your 
own, concerns reliability. The library methods have been 
exercised heavily, much more so that any method you are likely to 
write. Although it sometimes makes sense to use your own methods 
in preference to library methods, it's worth examining the 
tradeoffs. Looking at the PerfDemo example, you might ask whether 
you really care about the performance of the toString method, and 
whether speeding it up will actually improve the overall 
performance of your application.

For more information about optimizing performance, see the book
"Programming Pearls" by Jon Bentley. You can browse selections 
from the book at http://www.programmingpearls.com/. See 
especially Appendix 4 Rules for Code Tuning 
(http://www.programmingpearls.com/apprules.html).
.  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .

- NOTE

Sun respects your online time and privacy. The Java Developer 
Connection mailing lists are used for internal Sun Microsystems(tm) 
purposes only. You have received this email because you elected 
to subscribe. To unsubscribe, go to the Subscriptions page 
(http://developer.java.sun.com/subscription/), uncheck the 
appropriate checkbox, and click the Update button.


- SUBSCRIBE

To subscribe to a JDC newsletter mailing list, go to the 
Subscriptions page (http://developer.java.sun.com/subscription/), 
choose the newsletters you want to subscribe to, and click Update.


- FEEDBACK
Comments? Send your feedback on the JDC Tech Tips to:

jdc-webmaster@sun.com


- ARCHIVES
You'll find the JDC Tech Tips archives at:

http://java.sun.com/jdc/TechTips/index.html


- COPYRIGHT
Copyright 2001 Sun Microsystems, Inc. All rights reserved.
901 San Antonio Road, Palo Alto, California 94303 USA.

This document is protected by copyright. For more information, see:

http://java.sun.com/jdc/copyright.html


- LINKS TO NON-SUN SITES
The JDC Tech Tips may provide, or third parties may provide, 
links to other Internet sites or resources. Because Sun has no 
control over such sites and resources, You acknowledge and agree 
that Sun is not responsible for the availability of such external 
sites or resources, and does not endorse and is not responsible 
or liable for any Content, advertising, products, or other 
materials on or available from such sites or resources. Sun will 
not be responsible or liable, directly or indirectly, for any 
damage or loss caused or alleged to be caused by or in connection 
with use of or reliance on any such Content, goods or services 
available on or through any such site or resource.


This issue of the JDC Tech Tips is written by Glen McCluskey.

JDC Tech Tips 
April 10, 2001

Sun, Sun Microsystems, Java, and Java Developer Connection are 
trademarks or registered trademarks of Sun Microsystems, Inc. 
in the United States and other countries.








