

 J  D  C    T  E  C  H    T  I  P  S

                      TIPS, TECHNIQUES, AND SAMPLE CODE


WELCOME to the Java Developer Connection(sm) (JDC) Tech Tips, 
April 25, 2000. This issue covers:

     * Improving Serialization Performance with Externalizable
     * Handling Those Pesky InterruptedExceptions
                 
These tips were developed using Java(tm) 2 SDK, Standard Edition, 
v 1.2.2.

You can view this issue of the Tech Tips on the Web at 
http://developer.java.sun.com/developer/TechTips/2000/tt0425.html.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

IMPROVING SERIALIZATION PERFORMANCE WITH EXTERNALIZABLE

The February 29, 2000 Tech Tip on serialization 
(http://developer.java.sun.com/developer/TechTips/2000/tt0229.html)
explored the flexibility of Java's serialization mechanism. 
With serialization, you can customize how an object's fields are 
mapped to a stream, and even recover when you encounter a stream 
that has different fields from the ones you expect. This 
flexibility is a benefit of the serialization format: the format
includes not just your object's field values, but also metadata 
about the version of your class and its field names and types.

However, flexibility comes at the price of lower performance. 
This is certainly true for serialization. This tip shows you how 
to improve the performance of serialization by turning off the 
standard serialization format. You do this by making your objects 
externalizable. Let's start the tip with a programming example 
that uses serializable objects: 

import java.io.*;

class Employee implements Serializable {
    String lastName;
    String firstName;
    String ssn;
    int salary;
    int level;

    public Employee(String lastName, String firstName, String ssn,
                    int salary, int level) 
    {
        this.lastName = lastName;
        this.firstName = firstName;
        this.ssn = ssn;
        this.salary = salary;
        this.level = level;
    }
}

public class TestSerialization {
    public static final int tests=5;
    public static final int count=5000;

    public static void appMain(String[] args) throws Exception {
        Employee[] emps = new Employee[count];
        for (int n=0; n<count; n++) {
            emps[n] = new Employee("LastName" + n, "FirstName" + n,
                                    "222-33-" + n, 34000 + n,
                                    n % 10);
        }
        for (int outer=0; outer<tests; outer++) {
            ObjectOutputStream oos = null;
            FileOutputStream fos = null;
            BufferedOutputStream bos = null;
            long start = System.currentTimeMillis();
            try {
                fos = new FileOutputStream("TestSerialization");
                bos = new BufferedOutputStream(fos);
                oos = new ObjectOutputStream(bos);
                for (int n=0; n<count; n++) {
                    oos.writeObject(emps[n]);
                }
                long end = System.currentTimeMillis();
                System.out.println("Serialization of " + count + 
                               " objects took " + (end-start) + " ms.");
            }
            finally {
                if (oos != null) oos.close();
                if (bos != null) bos.close();
                if (fos != null) fos.close();
            }
            new File("TestSerialization").delete();
        }
    }

    public static void main(String[] args) 
    {
        try {
            appMain(args);
        }   
        catch (Exception e) {
            e.printStackTrace();
        }
    }
}

The TestSerialization class is a simple benchmark that measures 
how long it takes to write Employees into an OutputStream. It 
creates 5000 fictitious employees, then writes them all into 
a file. The test runs five times. If you run TestSerialization, 
you should see output that looks something like this (your times 
might differ substantially depending on environmental factors 
such as your processor speed and other applications running in 
your system): 

Serialization of 5000 objects took 438 ms.
Serialization of 5000 objects took 203 ms.
Serialization of 5000 objects took 234 ms.
Serialization of 5000 objects took 188 ms.
Serialization of 5000 objects took 219 ms.

These results indicate that it was a good idea to run the test 
more than once because the first run was so different from the 
others. Ignoring the first run, which probably incurred some
one-time startup overhead, the results range from approximately 
190 to 235 ms to write 5000 objects to a file.

The Employee class takes advantage of the simplest flavor of 
serialization by implementing the marker interface Serializable;
this indicates to the Java(tm) virtual machine* that you want to 
use the default serialization mechanism. Implementing the 
Serializable interface allows you to serialize the Employee 
objects by passing them to the writeObject() method of 
ObjectOutputStream. ObjectOutputStream automates the process of 
writing the Employee class metadata and instance fields to the 
stream. In other words, it does all the serialization work for you. 

Though the work is automated, you might want faster results. How 
do you improve the results? The answer is that you need to write 
some custom code. Begin by declaring that the Employee class 
implements Externalizable instead of Serializable. You also need to 
declare a public no-argument constructor for the Employee class, 
because deserialization creates the object with that constructor 
before it calls readExternal().

When you declare that an object is Externalizable, you assume full 
responsibility for writing the object's state to the stream.  
ObjectOutputStream still records which class the object belongs to, 
but it no longer writes your field names, types, and values for you. 
Instead, you manipulate the stream directly using the methods 
readExternal and writeExternal. Here is the code you need to add to 
the Employee class:

    public void readExternal(java.io.ObjectInput s)
                    throws ClassNotFoundException, IOException 
    {
        lastName = s.readUTF();
        firstName = s.readUTF();
        ssn = s.readUTF();
        salary = s.readInt(); 
        level = s.readInt();
    }

    public void writeExternal(java.io.ObjectOutput s)
                    throws IOException 
    {
        s.writeUTF(lastName);
        s.writeUTF(firstName);
        s.writeUTF(ssn);
        s.writeInt(salary);
        s.writeInt(level);
    }

The ObjectInput and ObjectOutput interfaces extend the DataInput 
and DataOutput interfaces, respectively. This gives you the 
methods you need to use the stream. Through methods inherited from 
DataInput and DataOutput, you can read and write primitive types 
using methods such as readInt() and writeInt(), and read and write 
strings using methods such as readUTF() and writeUTF(). (Java uses 
a UTF-8 variant to encode Unicode strings; see RFC 2279 and the 
Java Virtual Machine Specification for details.) 
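
To check that readExternal() and writeExternal() agree with each 
other, it helps to read an object back as well as write it. The 
following is a minimal sketch, not part of the original example: 
the class names ExternalizableEmployee and ExternalizableRoundTrip 
are made up for illustration, an in-memory byte array stands in for 
the file, and the try-with-resources statements assume Java 7 or 
later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical name: the tip's Employee with the Externalizable changes applied.
class ExternalizableEmployee implements Externalizable {
    String lastName;
    String firstName;
    String ssn;
    int salary;
    int level;

    // Deserialization calls this public no-argument constructor,
    // then fills in the fields through readExternal().
    public ExternalizableEmployee() {
    }

    public ExternalizableEmployee(String lastName, String firstName,
                                  String ssn, int salary, int level) {
        this.lastName = lastName;
        this.firstName = firstName;
        this.ssn = ssn;
        this.salary = salary;
        this.level = level;
    }

    // Fields must be read in exactly the order they are written below.
    public void readExternal(ObjectInput s)
            throws IOException, ClassNotFoundException {
        lastName = s.readUTF();
        firstName = s.readUTF();
        ssn = s.readUTF();
        salary = s.readInt();
        level = s.readInt();
    }

    public void writeExternal(ObjectOutput s) throws IOException {
        s.writeUTF(lastName);
        s.writeUTF(firstName);
        s.writeUTF(ssn);
        s.writeInt(salary);
        s.writeInt(level);
    }
}

// Hypothetical helper: writes one object to memory and reads it back.
class ExternalizableRoundTrip {
    static ExternalizableEmployee roundTrip(ExternalizableEmployee original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return (ExternalizableEmployee) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ExternalizableEmployee copy = roundTrip(
                new ExternalizableEmployee("Smith", "Pat", "222-33-4444",
                                           34000, 3));
        System.out.println(copy.lastName + ", " + copy.firstName
                + " earns " + copy.salary + " at level " + copy.level);
    }
}
```

If the reads in readExternal() were reordered, the copy would come 
back with its fields scrambled or the read would fail, which is the 
kind of bug the default mechanism protects you from.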

Try running the example again with the Externalizable version of 
Employee. You should see better performance, for example:

Serialization of 5000 objects took 266 ms.
Serialization of 5000 objects took 125 ms.
Serialization of 5000 objects took 110 ms.
Serialization of 5000 objects took 156 ms.
Serialization of 5000 objects took 109 ms.

Again ignoring the first run, this gives a range of roughly 
110-156 ms, which is about 35-40% less elapsed time than the 
serializable version.
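
The comparison can also be made on averages instead of ranges. The 
sketch below is only arithmetic on the two sets of sample timings 
shown in this tip; the class and method names are hypothetical, and 
your own timings will differ.

```java
// Hypothetical helper for summarizing the sample timings from this tip.
class TimingSummary {
    // Mean of every run except the first, which is treated as warm-up.
    static double meanExcludingFirst(int[] runsMs) {
        long total = 0;
        for (int i = 1; i < runsMs.length; i++) {
            total += runsMs[i];
        }
        return (double) total / (runsMs.length - 1);
    }

    // Percentage by which elapsed time dropped from baseline to improved.
    static double percentReduction(double baselineMs, double improvedMs) {
        return 100.0 * (baselineMs - improvedMs) / baselineMs;
    }

    public static void main(String[] args) {
        int[] serializableMs = {438, 203, 234, 188, 219};
        int[] externalizableMs = {266, 125, 110, 156, 109};
        double before = meanExcludingFirst(serializableMs);
        double after = meanExcludingFirst(externalizableMs);
        System.out.println("Serializable mean:   " + before + " ms");
        System.out.println("Externalizable mean: " + after + " ms");
        System.out.println("Reduction: "
                + String.format("%.1f", percentReduction(before, after)) + "%");
    }
}
```

With the sample figures, the means are 211 ms and 125 ms, a 
reduction of about 41%, which is in line with the rough 35-40% 
estimate taken from the ranges.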

Does this kind of performance advantage imply that you should  
make all of your classes externalizable? Absolutely not. As you 
can see, making a class externalizable requires writing more code. 
And more code means more possible bugs. If you forget to write 
a field, or read fields in a different order than you wrote them, 
externalization will break. With the Serializable interface, these 
problems are handled by the ObjectOutputStream. Probably the worst 
disadvantage of externalizable objects is that you must have the 
class in order to interpret the stream. This is because the stream 
format is opaque binary data. With normal serializable classes the 
stream format includes field names and types. So it is possible to 
reconstruct the state of an object even without the object's class 
file. Unfortunately, the Java serialization mechanism doesn't 
include any code to do the reconstruction, so you will have to 
write your own code to do that. (See the ObjectStreamWalker class 
at http://staff.develop.com/halloway/JavaTools.html for sample code 
to get you started.) 

However, if performance is your primary concern, it's a good idea 
to use externalizable objects. If your code manages a large number 
of events in a Local Area Network and you need near real-time 
performance, you will probably want to model the events as 
externalizable objects.
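
The difference between the two stream formats can be checked 
directly. The following sketch is an illustration under stated 
assumptions, not part of the original example: PlainRecord, 
ExternalRecord, and StreamContents are hypothetical names, and the 
search simply treats the stream bytes as single-byte characters, 
which is adequate for the ASCII names used here. It serializes one 
object each way and looks for the field name in the raw bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

// Hypothetical two-field class using default serialization.
class PlainRecord implements Serializable {
    String lastName;
    int salary;

    PlainRecord(String lastName, int salary) {
        this.lastName = lastName;
        this.salary = salary;
    }
}

// The same two fields, written by hand through Externalizable.
class ExternalRecord implements Externalizable {
    String lastName;
    int salary;

    public ExternalRecord() {
    }

    ExternalRecord(String lastName, int salary) {
        this.lastName = lastName;
        this.salary = salary;
    }

    public void writeExternal(ObjectOutput s) throws IOException {
        s.writeUTF(lastName);
        s.writeInt(salary);
    }

    public void readExternal(ObjectInput s) throws IOException {
        lastName = s.readUTF();
        salary = s.readInt();
    }
}

class StreamContents {
    // Serializes a single object into a byte array.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // True if the ASCII text appears somewhere in the raw stream bytes.
    static boolean containsAscii(byte[] data, String text) {
        return new String(data, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = serialize(new PlainRecord("Smith", 34000));
        byte[] external = serialize(new ExternalRecord("Smith", 34000));
        System.out.println("Serializable stream names the field:   "
                + containsAscii(plain, "lastName"));
        System.out.println("Externalizable stream names the field: "
                + containsAscii(external, "lastName"));
        System.out.println("Sizes in bytes: " + plain.length
                + " vs " + external.length);
    }
}
```

The serializable stream contains the field name, while the 
externalizable stream contains only the class name and the values 
that writeExternal() wrote, so it is also the smaller of the two.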

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HANDLING THOSE PESKY INTERRUPTEDEXCEPTIONS

If you have done any thread-related programming in the Java
programming language, you have probably been forced to deal with 
InterruptedExceptions. These exceptions appear in the throws clauses 
of Thread.sleep(), Thread.join(), and Object.wait(). An 
InterruptedException allows code on another thread to interrupt 
your thread if, for example, your thread takes too long to process. 
Many programmers rarely use interruption, and find these exceptions 
annoying. But even if your code never interrupts other threads, 
there are two reasons you should care about interruption.  
 
o  InterruptedException is a checked exception, which means that
   your code must catch or propagate the exception, even if you
   never expect it to happen.
     
o  In the Java environment, you cannot typically rely on 
   controlling the entire process in which your code runs. This 
   is good, because it allows for the use of things like mobile 
   agents, container architectures, applets, and RMI. However, it 
   also means that even if you never call Thread.interrupt() on 
   one of your threads, somebody else probably will.

This tip compares three different strategies for handling 
InterruptedExceptions: propagate them, ignore them, or defer them.  

The first strategy is to propagate the exception back to whoever 
calls your code. Here's an example: 

//throughout this example error checking omitted for brevity
interface Task {
    public void run() throws InterruptedException;
}

class PropagatingTask implements Task {
    public void run() throws InterruptedException {
        Thread.sleep(1000);
        System.out.println("PropagatingTask completed");
    }
}

public class TaskClient implements Runnable {
    public static final int taskCount = 1000;
    Task[] tasks;

    public void doMain(Task[] tasks) {
        this.tasks = tasks;
        Thread worker = new Thread(this);
        worker.start();
        try {
            System.in.read();
        }
        catch (java.io.IOException ioe) {
            System.out.println("I/O exception on input");
        }
        System.out.println("==========================Shutting down");
        worker.interrupt();
        try {
            worker.join();
        }
        catch (InterruptedException ie) {
            System.out.println("Unexpected interruption on main thread");
        }
    }

    public void run() {
        try {
            for (int n=0; n<taskCount; n++) {
                tasks[n].run();
                if (Thread.interrupted()) {
                    System.out.println("Interrupted state detected by client");
                    throw new InterruptedException();
                }
            }
        }
        catch (InterruptedException ie) {
            System.out.println("Interruption caught by client");
        }
    }

    public static void main(String[] args) {
        try {
            Class cls = Class.forName(args[0]);
            Task[] tasks = new Task[taskCount];
            for (int n=0; n<taskCount; n++) {
                tasks[n] = (Task) cls.newInstance();
            }
            new TaskClient().doMain(tasks);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Try running TaskClient by entering the following on the command 
line:

        java TaskClient PropagatingTask

TaskClient expects a single command line argument which names an 
implementation of the Task interface, in this case, 
PropagatingTask. TaskClient then creates 1000 tasks, and runs them 
on a background thread. Each PropagatingTask sleeps for one second 
and prints "PropagatingTask completed." 
	
You will probably get bored and want to interrupt the thread 
before all 1000 tasks complete. The main thread allows this by 
reading from System.in. Try this by pressing the Enter key. When 
you do this, the main thread calls interrupt() on the worker thread. 
It then calls join(); this allows the worker thread to complete 
before the main thread exits. Your output should end like this: 

PropagatingTask completed
==========================Shutting down
Interruption caught by client

Notice that no tasks complete after the interruption. This means 
that all the tasks not yet started never get the chance to start. 
It also means that the one task in progress is rudely interrupted 
in the middle of processing. Both of these behaviors are a 
consequence of the PropagatingTask allowing the 
InterruptedException to propagate all the way back to the caller.  
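
To see why being stopped in the middle of processing can matter, 
consider a task that makes two related updates with a blocking call 
between them. The sketch below is hypothetical: TransferTask and 
PropagationHazard are illustrative names, not part of the example 
above. It sets the interrupt flag before the task starts so that 
sleep() throws immediately, which makes the outcome repeatable.

```java
// Hypothetical task: moves 10 units between two counters in two steps.
class TransferTask {
    int from = 100;
    int to = 0;

    void run() throws InterruptedException {
        from -= 10;
        Thread.sleep(50);   // an interruption here skips the next line
        to += 10;
    }
}

class PropagationHazard {
    // Runs the task with an interrupt pending and returns from + to.
    static int totalAfterInterruptedRun() {
        TransferTask task = new TransferTask();
        Thread.currentThread().interrupt();   // sleep() will throw at once
        try {
            task.run();
        } catch (InterruptedException ie) {
            // The exception propagated out of run(), as with PropagatingTask.
        }
        return task.from + task.to;
    }

    public static void main(String[] args) {
        // An uninterrupted run would leave the total at 100.
        System.out.println("Total after interrupted run: "
                + totalAfterInterruptedRun());
    }
}
```

The total comes out as 90 instead of 100, because the first update 
happened and the second never did. A task that propagates 
InterruptedException has to be written so that such a half-finished 
state is either harmless or cleaned up.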

The PropagatingTask implementation is the simplest way to deal with 
InterruptedExceptions, and it has the advantage of allowing you to 
interrupt the thread so that no new tasks begin. However, this
approach has two disadvantages: (1) the caller is forced to handle 
InterruptedExceptions, and (2) the task that was in progress is 
forcibly stopped; this might be unacceptable if the task left data 
in some invalid state. Here is another approach, one that addresses
the first problem: 

class IgnoringTask implements Task {
    public void run() {
        long now = System.currentTimeMillis();
        long end = now + 1000;
        while (now < end) {
            try {
                Thread.sleep(end-now);
            }
            catch (InterruptedException ie) {
                System.out.println("IgnoringTask ignoring interruption");
            }
            now = System.currentTimeMillis();
        }
        System.out.println("IgnoringTask completed");
    }
}

IgnoringTask uses System.currentTimeMillis() to keep track of 
elapsed time, and if an InterruptedException is thrown, it catches 
the exception and goes back to finish its work. Because the 
InterruptedException never escapes the run() method, run() does not 
declare it in a throws clause, and clients do not have to handle 
it. Try running IgnoringTask by entering the following on the 
command line: 

        java TaskClient IgnoringTask
	
If you press Enter to interrupt the thread, you will see this 
output:

==========================Shutting down
IgnoringTask ignoring interruption
IgnoringTask completed
IgnoringTask completed
etc.

Notice that "IgnoringTask completed" continues to be printed. As 
you can see, an IgnoringTask cannot be interrupted midstream. This 
is appropriate in most situations. Unfortunately, an IgnoringTask
also prevents the thread from being interrupted at all. Even after
you try to interrupt the thread, new tasks will continue to run.
You have made your thread permanently uninterruptible, and other 
programmers who use your code are not likely to be happy.  

What you need is some way to guarantee that tasks already in 
progress will finish, but still provide some way to interrupt the 
thread. The DeferringTask class provides a solution: 

class DeferringTask implements Task {
    public void run() {
        long now = System.currentTimeMillis();
        long end = now + 1000;
        boolean wasInterrupted = false;
        while (now < end) {
            try {
                Thread.sleep(end-now);
            }
            catch (InterruptedException ie) {
                System.out.println("DeferringTask deferring interruption");
                wasInterrupted = true;
            }
            now = System.currentTimeMillis();
        }
        System.out.println("DeferringTask completed");
        if (wasInterrupted) {
            Thread.currentThread().interrupt();
        }
    }
}

DeferringTask is almost exactly the same as IgnoringTask, with 
one crucial difference. DeferringTask remembers that it was 
interrupted by setting the boolean flag wasInterrupted. When the 
task completes, DeferringTask calls interrupt() to set the thread's 
interrupt flag again (the flag was cleared when sleep() threw the 
InterruptedException). Because interrupt() sets a flag instead of 
throwing an InterruptedException, the client does not have to 
catch an InterruptedException. Instead, it can check the interrupt 
flag by calling Thread.interrupted(), which also clears the flag; 
this is what TaskClient.run() does. Try running DeferringTask as 
follows: 

        java TaskClient DeferringTask
	
When you trigger the interrupt() by pressing Enter, you should see 
output that ends like this:

==========================Shutting down
DeferringTask deferring interruption
DeferringTask completed
Interrupted state detected by client
Interruption caught by client

Notice that a single DeferringTask completes after interruption.  
This was the one task in progress. However, no new tasks begin, 
because DeferringTask sets the interrupt flag again and the client 
stops the thread when it detects the flag.
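
The deferring strategy depends on how the interrupt flag behaves, 
so it is worth seeing the three relevant calls side by side. This 
is a small sketch that runs on a single thread; the class name 
InterruptFlagDemo is hypothetical.

```java
// Hypothetical demo of how the thread interrupt flag is set, read, and cleared.
class InterruptFlagDemo {
    // Returns what isInterrupted() and two successive interrupted() calls see.
    static boolean[] observe() {
        Thread current = Thread.currentThread();

        // interrupt() only sets the flag; nothing is thrown here.
        current.interrupt();

        // isInterrupted() reports the flag and leaves it set.
        boolean viaIsInterrupted = current.isInterrupted();

        // Thread.interrupted() reports the flag and clears it...
        boolean firstCheck = Thread.interrupted();

        // ...so a second check finds the flag already cleared.
        boolean secondCheck = Thread.interrupted();

        return new boolean[] {viaIsInterrupted, firstCheck, secondCheck};
    }

    public static void main(String[] args) {
        boolean[] seen = observe();
        System.out.println("isInterrupted():       " + seen[0]);
        System.out.println("first interrupted():   " + seen[1]);
        System.out.println("second interrupted():  " + seen[2]);
    }
}
```

The three values are true, true, and false. Because 
Thread.interrupted() clears the flag, a caller that checks it is 
expected to act on the result, as TaskClient.run() does by throwing 
an InterruptedException of its own.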

No single interruption strategy is appropriate for all situations.  
Here is a summary of the three strategies in this tip:

 ----------------------------------------------------------------
| Strategy  | Client must  | Tasks forcibly | No new tasks begin |
|           | catch IE?    | stopped?       | after interrupt?   |
|----------------------------------------------------------------|
| propagate | yes          | yes            | yes                |
| ignore    | no           | no             | no                 |
| defer     | no*          | no             | yes*               |
|----------------------------------------------------------------|
| * for defer to work correctly, caller must check for           |
|   interruption                                                 |
 ----------------------------------------------------------------
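
The last two columns of the table can be reproduced without a 
second thread or keyboard input. The sketch below is an assumption-
laden miniature of the examples in this tip, not a replacement for 
them: QuickTask and StrategyCheck are hypothetical names, the tasks 
sleep for 20 ms instead of one second, the lambdas assume Java 8 or 
later, and the interrupt is simulated by setting the flag just 
before each task runs.

```java
// Hypothetical stand-in for the tip's Task interface.
interface QuickTask {
    void run() throws InterruptedException;
}

class StrategyCheck {
    static final long WORK_MS = 20;

    // Sleeps for the full duration; reports whether it was interrupted.
    static boolean sleepFully(long millis) {
        boolean interrupted = false;
        long end = System.currentTimeMillis() + millis;
        long now;
        while ((now = System.currentTimeMillis()) < end) {
            try {
                Thread.sleep(end - now);
            } catch (InterruptedException ie) {
                interrupted = true;
            }
        }
        return interrupted;
    }

    // The three strategies, reduced to their interruption handling.
    static final QuickTask PROPAGATE = () -> Thread.sleep(WORK_MS);
    static final QuickTask IGNORE = () -> {
        sleepFully(WORK_MS);
    };
    static final QuickTask DEFER = () -> {
        if (sleepFully(WORK_MS)) {
            Thread.currentThread().interrupt();
        }
    };

    // Runs one task with an interrupt pending and describes the outcome.
    static String check(QuickTask task) {
        Thread.currentThread().interrupt();
        boolean forciblyStopped = false;
        try {
            task.run();
        } catch (InterruptedException ie) {
            forciblyStopped = true;
        }
        // The caller's check; this also clears the flag for the next run.
        boolean flagSet = Thread.interrupted();
        boolean noNewTasks = forciblyStopped || flagSet;
        return "forciblyStopped=" + forciblyStopped
                + " noNewTasks=" + noNewTasks;
    }

    public static void main(String[] args) {
        System.out.println("propagate: " + check(PROPAGATE));
        System.out.println("ignore:    " + check(IGNORE));
        System.out.println("defer:     " + check(DEFER));
    }
}
```

The defer row is true in the last column only because check() looks 
at the interrupt flag after the task returns, which is the caller's 
obligation noted in the table's footnote.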
																  
For a more in-depth look at interruption, refer to Chapter 9 of 
Multithreaded Programming with Java Technology, by Bil Lewis and 
Daniel J. Berg (for information about this book, see
http://www.sun.com/books/catalog/lewis3/).																

.  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .

- NOTE
The names on the JDC mailing list are used for internal Sun
Microsystems(tm) purposes only. To remove your name from the list,
see Subscribe/Unsubscribe below.


- FEEDBACK
Comments? Send your feedback on the JDC Tech Tips to:

jdc-webmaster@sun.com


- SUBSCRIBE/UNSUBSCRIBE
The JDC Tech Tips are sent to you because you elected to subscribe
when you registered as a JDC member. To unsubscribe from JDC email,
go to the following address and enter the email address you wish to
remove from the mailing list:

http://developer.java.sun.com/unsubscribe.html


To become a JDC member and subscribe to this newsletter go to:

http://java.sun.com/jdc/


- ARCHIVES
You'll find the JDC Tech Tips archives at:

http://developer.java.sun.com/developer/TechTips/index.html


- COPYRIGHT
Copyright 2000 Sun Microsystems, Inc. All rights reserved.
901 San Antonio Road, Palo Alto, California 94303 USA.

This document is protected by copyright. For more information, see:

http://developer.java.sun.com/developer/copyright.html


This issue of the JDC Tech Tips is written by Stuart Halloway,
a Java specialist at DevelopMentor (http://www.develop.com/java).


JDC Tech Tips
April 25, 2000

* As used in this document, the terms "Java virtual machine" or "JVM" 
  mean a virtual machine for the Java platform. 







