Date: Tue, 11 Apr 2000 21:08:01 GMT
From: JDCTechTips@sun.com
Subject: JDC Tech Tips  April, 11, 2000
To: JDCMember@sun.com
Reply-To: JDCTechTips@sun.com


 J  D  C    T  E  C  H    T  I  P  S

                      TIPS, TECHNIQUES, AND SAMPLE CODE


WELCOME to the Java Developer Connection(sm) (JDC) Tech Tips, 
April 11, 2000. This issue covers:

         * Formatting Decimal Numbers
         * Using Checksums
         
These tips were developed using Java(tm) 2 SDK, Standard Edition, 
v 1.2.2.

You can view this issue of the Tech Tips on the Web at 
http://developer.java.sun.com/developer/TechTips/2000/tt0411.html
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

FORMATTING DECIMAL NUMBERS

Suppose that you're developing a Java(tm) application that uses 
decimal numbers, and you'd like to control the formatting of 
these numbers for output purposes. How do you do this using the 
Java library?

Or perhaps you don't care about formatting, but you do care about
making your application work in an international context. For
example, a simple statement like:

    System.out.println(1234.56);

always prints "1234.56", whatever the locale; "." is used as a
decimal point in the United States, but not necessarily everywhere
else. How do you deal with this concern?

A couple of classes in the java.text package deal with these kinds
of issues. Here's a simple example that uses these classes to
tackle the problem mentioned in the previous paragraph:

    import java.text.NumberFormat;
    import java.util.Locale;

    public class DecimalFormat1 {
        public static void main(String args[]) {

            // get format for default locale

            NumberFormat nf1 = NumberFormat.getInstance();
            System.out.println(nf1.format(1234.56));

            // get format for German locale

            NumberFormat nf2 =
                NumberFormat.getInstance(Locale.GERMAN);
            System.out.println(nf2.format(1234.56));
        }
    }

If you live in the United States and run this program, the output
is:

    1,234.56
    1.234,56

In other words, different locales, in this case locales for the
United States and for Germany, use different conventions for
representing numbers.

NumberFormat.getInstance returns an instance of NumberFormat
(actually a concrete subclass of NumberFormat such as
DecimalFormat) that is suited for formatting numbers according
to the default locale. You can also specify a non-default locale,
such as Locale.GERMAN. The format method is then called to
format a number according to the rules of that locale.

Note that the program could have done the formatting using 
a single expression:

    NumberFormat.getInstance().format(1234.56)

but it's more efficient to save a format and then reuse it.

Internationalization is a big issue when formatting numbers. 
Another is the ability to exercise fine control over formatting,
for example, by specifying the number of decimal places. Here's
another example that illustrates this idea:

    import java.text.DecimalFormat;
    import java.util.Locale;

    public class DecimalFormat2 {
        public static void main(String args[]) {

            // get format for default locale

            DecimalFormat df1 = new DecimalFormat("####.000");
            System.out.println(df1.format(1234.56));

            // get format for German locale

            Locale.setDefault(Locale.GERMAN);
            DecimalFormat df2 = new DecimalFormat("####.000");
            System.out.println(df2.format(1234.56));
        }
    }

In this example, a specific number format is set, using a pattern
like "####.000". In a pattern, "#" stands for a digit that is shown
only if needed, and "0" for a digit that is always shown. So this
pattern means "no leading zeros before the decimal point, and three
places after the decimal point, which are 0 if not filled". The
output of this program is:

    1234.560
    1234,560
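To get a feel for what "#" and "0" do, here is a small sketch that
formats a few values with two patterns. It is only an illustration:
the class name PatternDemo and the helper fmt are made up for this
example, and the symbols are pinned to Locale.US (by way of
DecimalFormatSymbols) so that the output does not depend on the
default locale of the machine.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class PatternDemo {

    // hypothetical helper: format a value with a pattern,
    // using U.S. symbols regardless of the default locale

    static String fmt(String pattern, double value) {
        DecimalFormatSymbols us = new DecimalFormatSymbols(Locale.US);
        return new DecimalFormat(pattern, us).format(value);
    }

    public static void main(String args[]) {
        System.out.println(fmt("####.000", 1234.56));    // 1234.560
        System.out.println(fmt("####.000", 0.5));        // .500
        System.out.println(fmt("####.000", 123456.75));  // 123456.750
        System.out.println(fmt("0000.000", 12.5));       // 0012.500
    }
}
```

Note that the "#" positions before the decimal point do not limit
the number of integer digits: 123456.75 is printed in full.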

In a similar way, it's possible to control exponent formatting, for
example:

    import java.text.DecimalFormat;

    public class DecimalFormat3 {
        public static void main(String args[]) {
            DecimalFormat df = new DecimalFormat("0.000E0000");
            System.out.println(df.format(1234.56));
        }
    }

The output here is:

    1.235E0003
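The number of integer digits in an exponent pattern also matters.
The following sketch compares a few exponent patterns; the second
and third are taken from the DecimalFormat documentation. The class
name ExponentDemo and the helper fmt are made up for this example,
and U.S. symbols are assumed so that the output is the same
everywhere.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class ExponentDemo {

    // hypothetical helper: format with a pattern and U.S. symbols

    static String fmt(String pattern, double value) {
        DecimalFormatSymbols us = new DecimalFormatSymbols(Locale.US);
        return new DecimalFormat(pattern, us).format(value);
    }

    public static void main(String args[]) {

        // one integer digit, exponent padded to four digits

        System.out.println(fmt("0.000E0000", 1234.56));  // 1.235E0003

        // up to three integer digits: exponent is a multiple of 3

        System.out.println(fmt("##0.#####E0", 12345));   // 12.345E3

        // exactly two integer digits

        System.out.println(fmt("00.###E0", 0.00123));    // 12.3E-4
    }
}
```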

You can also work with percentages:

    import java.text.NumberFormat;

    public class DecimalFormat4 {
        public static void main(String args[]) {
            NumberFormat nf = NumberFormat.getPercentInstance();
            System.out.println(nf.format(0.47));
        }
    }

The output from this program is:

    47%
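By default a percent format shows no fraction digits. If you want
more precision, one option is setMaximumFractionDigits, as in this
sketch. The class name PercentDemo and the helper percent are made
up for this example, and Locale.US is assumed so that the output is
predictable.

```java
import java.text.NumberFormat;
import java.util.Locale;

class PercentDemo {

    // hypothetical helper: format a fraction as a percentage
    // with at most the given number of fraction digits

    static String percent(double fraction, int maxFractionDigits) {
        NumberFormat nf = NumberFormat.getPercentInstance(Locale.US);
        nf.setMaximumFractionDigits(maxFractionDigits);
        return nf.format(fraction);
    }

    public static void main(String args[]) {
        System.out.println(percent(0.47, 0));   // 47%
        System.out.println(percent(0.476, 0));  // 48%
        System.out.println(percent(0.476, 1));  // 47.6%
    }
}
```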

So far, you've seen various techniques for formatting numbers.
What about going the other direction, that is, reading and parsing 
strings that contain formatted numbers? Parsing support is included 
in NumberFormat. For example, you can say:

    import java.util.Locale;
    import java.text.NumberFormat;
    import java.text.ParseException;

    public class DecimalFormat5 {
        public static void main(String args[]) {

            // get format for default locale

            NumberFormat nf1 = NumberFormat.getInstance();
            Object obj1 = null;

            // parse number based on format

            try {
                obj1 = nf1.parse("1234,56");
            }
            catch (ParseException e1) {
                System.err.println(e1);
            }
            System.out.println(obj1);

            // get format for German locale

            NumberFormat nf2 =
                NumberFormat.getInstance(Locale.GERMAN);
            Object obj2 = null;

            // parse number based on format

            try {
                obj2 = nf2.parse("1234,56");
            }
            catch (ParseException e2) {
                System.err.println(e2);
            }
            System.out.println(obj2);
        }
    }

This example has two parts, both of them concerned with parsing 
an identical string: "1234,56". The first part uses the default 
locale, the second the German locale. When this program is run 
in the United States, the result is:

    123456
    1234.56

In other words, the string "1234,56" is interpreted as a large
integer "123456" in the United States, but as a decimal number
"1234.56" in the German locale.

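One thing to watch for when parsing: parse(String) only requires
that the beginning of the string be a number, and quietly ignores
anything after it. If you need to reject trailing characters, one
approach is the parse method that takes a ParsePosition. The sketch
below assumes this stricter behavior is what you want; the class
name StrictParse and the helper parseAll are made up for this
example.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.text.ParsePosition;
import java.util.Locale;

class StrictParse {

    // hypothetical helper: parse text for a locale, and fail
    // unless the whole string was consumed

    static Number parseAll(String text, Locale locale)
        throws ParseException {
        NumberFormat nf = NumberFormat.getInstance(locale);
        ParsePosition pos = new ParsePosition(0);
        Number n = nf.parse(text, pos);
        if (n == null || pos.getIndex() != text.length()) {

            // report where parsing stopped

            int where =
                (n == null) ? pos.getErrorIndex() : pos.getIndex();
            throw new ParseException(
                "unparseable number: " + text, where);
        }
        return n;
    }

    public static void main(String args[]) throws ParseException {
        System.out.println(parseAll("1234,56", Locale.GERMAN));
        System.out.println(parseAll("1234,56", Locale.US));

        // plain parse stops at the first character it can't use

        NumberFormat nf = NumberFormat.getInstance(Locale.US);
        System.out.println(nf.parse("12abc"));

        // the strict helper rejects the same string

        try {
            parseAll("12abc", Locale.US);
        }
        catch (ParseException e) {
            System.err.println(e);
        }
    }
}
```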
There's one final point to be covered in this discussion of
formatting. In the examples above, DecimalFormat and NumberFormat
are both used. DecimalFormat is used to gain fine control over
formatting, while NumberFormat is used to specify a locale other 
than the default. How do you combine these two classes?

The answer centers around the fact that DecimalFormat is a
subclass of NumberFormat, a subclass whose instances are specific 
to a particular locale. So you can use NumberFormat.getInstance to
specify a locale, and then cast the resulting instance to a
DecimalFormat object. The documentation says that this technique
will work in the vast majority of cases, but that you need to
surround the cast with a try/catch block just in case it does not
(presumably in a very obscure case with an exotic locale). Such an
approach looks like this:

    import java.text.DecimalFormat;
    import java.text.NumberFormat;
    import java.util.Locale;

    public class DecimalFormat6 {
        public static void main(String args[]) {
            DecimalFormat df = null;

            // get a NumberFormat object and cast it to
            // a DecimalFormat object

            try {
                df = (DecimalFormat)
                    NumberFormat.getInstance(Locale.GERMAN);
            }
            catch (ClassCastException e) {
                System.err.println(e);
                return;    // df is still null, so stop here
            }

            // set a format pattern

            df.applyPattern("####.00000");

            // format a number

            System.out.println(df.format(1234.56));
        }
    }

The getInstance method obtains the format, then applyPattern is
called to set a particular formatting pattern. The output of this
program is:

    1234,56000

If you don't care about internationalization, it makes sense to use
DecimalFormat directly.
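If you'd rather not rely on catching an exception, a variation on
the same idea is to test the returned object with instanceof before
casting. This is a sketch rather than the documented approach: the
class name LocalizedPattern and the helper format are made up for
this example, and falling back to the locale's own format when the
cast is not possible is an assumption about what you'd want.

```java
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.util.Locale;

class LocalizedPattern {

    // hypothetical helper: apply a pattern if the locale's format
    // is a DecimalFormat, otherwise keep the locale's own format

    static String format(Locale locale, String pattern,
        double value) {
        NumberFormat nf = NumberFormat.getInstance(locale);
        if (nf instanceof DecimalFormat) {
            ((DecimalFormat)nf).applyPattern(pattern);
        }
        return nf.format(value);
    }

    public static void main(String args[]) {

        // prints 1234,56000

        System.out.println(
            format(Locale.GERMAN, "####.00000", 1234.56));

        // prints 1234.56000

        System.out.println(
            format(Locale.US, "####.00000", 1234.56));
    }
}
```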

USING CHECKSUMS

In the computer software field, a "checksum" is a value computed
from a stream of bytes. The checksum acts as a signature for the
bytes: an algorithm combines them into a single number. What's
important is that changes or corruption in the byte stream can be
detected with a high degree of probability.

An example of checksum use is found in data transmission. An
application might transmit 100 bytes of information to another
application across a network. The application appends to the bytes 
a 32-bit checksum that is computed from the values of the bytes. 
On the receiving end of the transmission, the checksum is computed 
again based on the 100 bytes that were received. If the checksum
computed at the receiving end differs from the one computed at the
transmitting end, then the data has been corrupted in some way.

A checksum is typically much smaller than the data it's calculated
on. So it relies on a probabilistic model to catch most, but not
all, errors in the data. Checksums closely resemble hash codes, in
that an algorithm is applied in each case to compute a number from 
a sequence of bytes.
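To make the data transmission scenario concrete, here is a minimal
sketch that uses the java.util.zip.CRC32 class. The sender appends
a 32-bit checksum to the data, and the receiver recomputes it. The
class name ChecksumFrame, the helper names, and the choice to store
the checksum as four big-endian bytes at the end are assumptions
made for this example, not part of any particular protocol.

```java
import java.util.zip.CRC32;

class ChecksumFrame {

    // compute the CRC-32 of the first len bytes of buf

    static long crc(byte buf[], int len) {
        CRC32 c = new CRC32();
        c.update(buf, 0, len);
        return c.getValue();
    }

    // sender: return the data followed by its 32-bit checksum

    static byte[] frame(byte data[]) {
        byte out[] = new byte[data.length + 4];
        System.arraycopy(data, 0, out, 0, data.length);
        long cs = crc(data, data.length);
        for (int i = 0; i < 4; i++) {
            out[data.length + i] = (byte)(cs >>> (24 - 8 * i));
        }
        return out;
    }

    // receiver: recompute the checksum and compare it with
    // the one that was sent

    static boolean verify(byte frame[]) {
        if (frame.length < 4) {
            return false;
        }
        int n = frame.length - 4;
        long sent = 0;
        for (int i = 0; i < 4; i++) {
            sent = (sent << 8) | (frame[n + i] & 0xFF);
        }
        return sent == crc(frame, n);
    }

    public static void main(String args[]) {
        byte data[] = new byte[100];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte)i;
        }

        byte sent[] = frame(data);
        System.out.println(verify(sent));    // true

        // simulate corruption of a single bit in transit

        sent[10] ^= 0x01;
        System.out.println(verify(sent));    // false
    }
}
```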

The class java.util.zip.CRC32 implements one of the standard
checksum algorithms: CRC-32. To see how you might use
checksums, consider the following application: you're writing some 
strings to a text file, and you'd like to know whether the string 
list has been modified after writing. For example, you'd like to 
find out if someone used a text editor to edit the file. Here are 
two programs that comprise the application. The first program 
writes a set of strings to a file, and computes a running checksum 
from the bytes of the string characters:

    import java.io.*;
    import java.util.zip.CRC32;

    public class Checksum1 {

        // list of names to write to a file

        static final String namelist[] = {
            "Jane Jones",
            "Tom Garcia",
            "Sally Smith",
            "Richard Robinson",
            "Jennifer Williams"
        };

        public static void main(String args[]) throws IOException {
            FileWriter fw = new FileWriter("out.txt");
            BufferedWriter bw = new BufferedWriter(fw);
            CRC32 checksum = new CRC32();

            // write the length of the list

            bw.write(Integer.toString(namelist.length));
            bw.newLine();

            // write each name and update the checksum

            for (int i = 0; i < namelist.length; i++) {
                String name = namelist[i];
                bw.write(name);
                bw.newLine();
                checksum.update(name.getBytes());
            }

            // write the checksum

            bw.write(Long.toString(checksum.getValue()));
            bw.newLine();

            bw.close();
        }
    }

The output of running this program is in a file "out.txt", with
contents:

    5
    Jane Jones
    Tom Garcia
    Sally Smith
    Richard Robinson
    Jennifer Williams
    4113203990

The number on the last line is a checksum computed by combining all
the bytes found in the string characters.

The second program reads the file:

    import java.io.*;
    import java.util.zip.CRC32;

    public class Checksum2 {
        public static void main(String args[]) throws IOException {
            FileReader fr = new FileReader("out.txt");
            BufferedReader br = new BufferedReader(fr);
            CRC32 checksum = new CRC32();

            // read the number of names from the file

            int len = Integer.parseInt(br.readLine());

            // read each name from the file and update the checksum

            String namelist[] = new String[len];
            for (int i = 0; i < len; i++) {
                namelist[i] = br.readLine();
                checksum.update(namelist[i].getBytes());
            }

            // read the checksum

            long cs = Long.parseLong(br.readLine());

            br.close();

            // if checksum doesn't match, give error,
            // else display the list of names

            if (cs != checksum.getValue()) {
                System.err.println("*** bad checksum ***");
            }
            else {
                for (int i = 0; i < len; i++) {
                    System.out.println(namelist[i]);
                }
            }
        }
    }

This program reads the list of names from the file and displays the
names. If you edit "out.txt" with a text editor, and change one of
the names, for example changing "Tom" to "Thomas", the program will
compute a different checksum, and display a checksum error message.

Now, you might think that a person could maliciously change the
text file, compute a new checksum, and change that as well. This
is possible: CRC-32 is a published algorithm, so anyone with the
right tool can recompute it. But the algorithm is not obvious to
a casual user, so a casual edit is unlikely to come with a matching
checksum. A checksum protects against accidental changes, not
against a determined attacker.

Another way of using checksums is through the CheckedInputStream and 
CheckedOutputStream classes in java.util.zip. These classes support 
computation of a running checksum on an I/O stream.
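Here is a minimal sketch of the checked stream classes. It writes
some bytes through a CheckedOutputStream, reads them back through
a CheckedInputStream, and compares the two checksums. The class
name CheckedStreams and the helper names are made up for this
example, and in-memory byte array streams stand in for files.

```java
import java.io.*;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
import java.util.zip.CheckedOutputStream;

class CheckedStreams {

    // write the bytes through a CheckedOutputStream and
    // return the checksum of what was written

    static long write(byte data[], OutputStream sink)
        throws IOException {
        CheckedOutputStream cos =
            new CheckedOutputStream(sink, new CRC32());
        cos.write(data);
        cos.flush();
        return cos.getChecksum().getValue();
    }

    // read everything through a CheckedInputStream and
    // return the checksum of what was read

    static long read(InputStream source) throws IOException {
        CheckedInputStream cis =
            new CheckedInputStream(source, new CRC32());
        byte buf[] = new byte[256];
        while (cis.read(buf) != -1) {
            // the checksum is updated as a side effect of reading
        }
        return cis.getChecksum().getValue();
    }

    public static void main(String args[]) throws IOException {
        byte data[] = "Jane Jones\nTom Garcia\n".getBytes();

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long written = write(data, sink);

        ByteArrayInputStream source =
            new ByteArrayInputStream(sink.toByteArray());
        long readBack = read(source);

        System.out.println(written == readBack);    // true
    }
}
```

The advantage over calling update yourself is that the checksum
covers every byte that passes through the stream, including line
separators.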

.  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .

- COPYRIGHT
Copyright 2000 Sun Microsystems, Inc. All rights reserved.
901 San Antonio Road, Palo Alto, California 94303 USA.

This document is protected by copyright. For more information, see:

http://developer.java.sun.com/developer/copyright.html


This issue of the JDC Tech Tips is written by Glen McCluskey.
