Tuesday, 28 March 2017

Java Bean & JAR Files - Java Tutorials

Java Bean

What Is a Java Bean?

A Java Bean is a software component that has been designed to be reusable in a variety of different environments. There is no restriction on the capability of a Bean. It may perform a simple function, such as checking the spelling of a document, or a complex function, such as forecasting the performance of a stock portfolio. A Bean may be visible to an end user. One example of this is a button on a graphical user interface. A Bean may also be invisible to a user. Software to decode a stream of multimedia information in real time is an example of this type of building block. Finally, a Bean may be designed to work autonomously on a user’s workstation or to work in cooperation with a set of other distributed components. Software to generate a pie chart from a set of data points is an example of a Bean that can execute locally. However, a Bean that provides real-time price information from a stock or commodities exchange would need to work in cooperation with other distributed software to obtain its data.

You will see shortly what specific changes a software developer must make to a class so that it is usable as a Java Bean. However, one of the goals of the Java designers was to make it easy to use this technology. Therefore, the code changes are minimal.
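
As a rough preview of what those minimal changes tend to look like, the sketch below follows the usual Bean conventions: a public no-argument constructor, get/set accessor pairs, and Serializable. The names BeanPreview, Thermostat, and targetTemp are hypothetical and chosen only for illustration. java.beans.Introspector is the standard JDK class that discovers properties from accessor names.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class BeanPreview {
  // Hypothetical Bean: the class and its property are made up for this sketch.
  public static class Thermostat implements Serializable {
    private static final long serialVersionUID = 1L;
    private double targetTemp = 20.0;

    public Thermostat() { }  // public no-argument constructor

    public double getTargetTemp() { return targetTemp; }
    public void setTargetTemp(double t) { targetTemp = t; }
  }

  // Returns the property names that introspection discovers on a class.
  static List<String> propertyNames(Class<?> beanClass) {
    try {
      // Stop at Object so the inherited "class" property is left out.
      BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
      List<String> names = new ArrayList<>();
      for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
        names.add(pd.getName());
      }
      return names;
    } catch (IntrospectionException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(propertyNames(Thermostat.class)); // prints [targetTemp]
  }
}
```

The getTargetTemp( )/setTargetTemp( ) pair is all that is needed for the property to be recognized; no interface or base class is involved.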


Advantages of Java Beans

A software component architecture provides standard mechanisms to deal with software building blocks. The following list enumerates some of the specific benefits that Java technology provides for a component developer:
  • A Bean obtains all the benefits of Java’s “write-once, run-anywhere” paradigm.
  • The properties, events, and methods of a Bean that are exposed to an application builder tool can be controlled.
  • A Bean may be designed to operate correctly in different locales, which makes it useful in global markets.
  • Auxiliary software can be provided to help a person configure a Bean. This software is only needed when the design-time parameters for that component are being set. It does not need to be included in the run-time environment.
  • The configuration settings of a Bean can be saved in persistent storage and restored at a later time.
  • A Bean may register to receive events from other objects and can generate events that are sent to other objects.
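
Two of the benefits above, saved configuration and events, can be sketched in plain Java without a builder tool. The following is a minimal illustration under stated assumptions: LabelBean and its label property are hypothetical, persistence is done with ordinary Java serialization to a byte array, and events are delivered with java.beans.PropertyChangeSupport. A real builder tool may store Beans differently.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class BeanStateSketch {
  // Hypothetical Bean with one bound property named "label".
  public static class LabelBean implements Serializable {
    private static final long serialVersionUID = 1L;
    private String label = "untitled";
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);

    public LabelBean() { }

    public String getLabel() { return label; }

    public void setLabel(String newLabel) {
      String old = label;
      label = newLabel;
      changes.firePropertyChange("label", old, newLabel); // notify listeners
    }

    public void addPropertyChangeListener(PropertyChangeListener l) {
      changes.addPropertyChangeListener(l);
    }

    public void removePropertyChangeListener(PropertyChangeListener l) {
      changes.removePropertyChangeListener(l);
    }
  }

  // Saves the Bean to a byte array and restores a copy from those bytes.
  static LabelBean saveAndRestore(LabelBean bean) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(bean);
      }
      try (ObjectInputStream in = new ObjectInputStream(
               new ByteArrayInputStream(bytes.toByteArray()))) {
        return (LabelBean) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    LabelBean bean = new LabelBean();
    bean.addPropertyChangeListener(evt -> System.out.println(
        evt.getPropertyName() + ": " + evt.getOldValue() + " -> " + evt.getNewValue()));
    bean.setLabel("Rotate X");                           // listener prints the change
    System.out.println(saveAndRestore(bean).getLabel()); // prints Rotate X
  }
}
```

The lambda listener is not serializable, so it is not written out with the Bean; only the configured label survives the round trip.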


Application Builder Tools

When working with Java Beans, most developers use an application builder tool, a utility that enables you to configure a set of Beans, connect them together, and produce a working application. In general, Bean builder tools have the following capabilities.
  • A palette is provided that lists all of the available Beans. As additional Beans are developed or purchased, they can be added to the palette.
  • A worksheet is displayed that allows the designer to lay out Beans in a graphical user interface. A designer may drag and drop a Bean from the palette to this worksheet.
  • Special editors and customizers allow a Bean to be configured. This is the mechanism by which the behavior of a Bean may be adapted for a particular environment.
  • Commands allow a designer to inquire about the state and behavior of a Bean. This information automatically becomes available when a Bean is added to the palette.
  • Capabilities exist to interconnect Beans. This means that events generated by one component are mapped to method invocations on other components.
  • When a collection of Beans has been configured and connected, it is possible to save all of this information in a persistent storage area. At a later time, this information can then be used to restore the state of the application.

Sun provides two Bean application builder tools. The first is the BeanBox, which is part of the Bean Developer Kit (BDK). The BDK is the original builder tool provided by Sun. The second is the new Bean Builder. Because Bean Builder is designed to supplant the BeanBox, Sun has stopped development of the BDK and all new Bean applications will be created using Bean Builder.

Although Bean Builder is the future of Bean development, it is not the sole focus of this chapter. Instead, both BeanBox and Bean Builder are discussed. The reason for this is that Bean Builder requires Java 2, version 1.4. It is incompatible with earlier versions of Java 2. This means that readers of this book using Java 2, version 1.2 or version 1.3 will not be able to use Bean Builder. Instead, they must continue to use the BDK. Further, readers using version 1.4 cannot use the BDK because it is not compatible with Java 2, version 1.4. So, if you are using version 1.4, then you must use Bean Builder. If you are using a version of Java prior to 1.4, you must use the BDK. Thus, both approaches are described here, beginning with the BDK. Keep in mind that the information about Beans, Bean architecture, JAR files, and so on, applies to either Bean development tool.

One other point: At the time of this writing, Java 2, version 1.4 is a released product, but Bean Builder is currently in beta testing. This means that the only way for a 1.4 user to create a Bean application is to do so using the latest Bean Builder beta. For this reason, we will not examine its features in depth at this time. However, at the end of this chapter, a general overview is presented and a sample application is created.


Using the Bean Developer Kit (BDK)

The Bean Developer Kit (BDK), available from the JavaSoft site, is a simple example of a tool that enables you to create, configure, and connect a set of Beans. There is also a set of sample Beans with their source code. This section provides step-by-step instructions for installing and using this tool. Remember, the BDK is for use with versions of Java 2 prior to 1.4. For Java 2, v1.4 you must use the Bean Builder Tool described at the end of this chapter.

In this chapter, instructions are provided for a Windows environment. The procedures for a UNIX platform are similar, but some of the commands are different.

Installing the BDK
The Java 2 SDK must be installed on your machine for the BDK to work. Confirm that the SDK tools are accessible from your environment. The BDK can then be downloaded from the JavaSoft site (http://java.sun.com). It is packaged as one file that is a self-extracting archive. Follow the instructions to install it on your machine. The discussion that follows assumes that the BDK is installed in a directory called bdk. If this is not the case with your system, substitute the proper directory.

Starting the BDK
To start the BDK, follow these steps:
  1. Change to the directory c:\bdk\beanbox.
  2. Execute the batch file called run.bat. This causes the BDK to display three windows: ToolBox, BeanBox, and Properties. ToolBox lists all of the different Beans that have been included with the BDK. BeanBox provides an area to lay out and connect the Beans selected from the ToolBox. Properties provides the ability to configure a selected Bean. You may also see a window called Method Tracer, but we won’t be using it.


Using the BDK
This section describes how to create an application by using some of the Beans provided with the BDK. First, the Molecule Bean displays a three-dimensional view of a molecule. It may be configured to present one of the following molecules: hyaluronic acid, benzene, buckminsterfullerine, cyclohexane, ethane, or water. This component also has methods that allow the molecule to be rotated in space along its X or Y axis. Second, the OurButton Bean provides a push-button functionality. We will have one button labeled “Rotate X” to rotate the molecule along its X axis and another button labeled “Rotate Y” to rotate the molecule along its Y axis.

Create and Configure an Instance of the Molecule Bean
Follow these steps to create and configure an instance of the Molecule Bean:
  1. Position the cursor on the ToolBox entry labeled Molecule and click the left mouse button. You should see the cursor change to a cross.
  2. Move the cursor to the BeanBox display area and click the left mouse button in approximately the area where you wish the Bean to be displayed. You should see a rectangular region appear that contains a 3-D display of a molecule. This area is surrounded by a hatched border, indicating that it is currently selected.
  3. You can reposition the Molecule Bean by positioning the cursor over one of the hatched borders and dragging the Bean.
  4. You can change the molecule that is displayed by changing the selection in the Properties window. Notice that the Bean display changes immediately when you change the selected molecule.


Create and Configure an Instance of the OurButton Bean
Follow these steps to create and configure an instance of the OurButton Bean and connect it to the Molecule Bean:
  1. Position the cursor on the ToolBox entry labeled OurButton and click the left mouse button. You should see the cursor change to a cross.
  2. Move the cursor to the BeanBox display area and click the left mouse button in approximately the area where you wish the Bean to be displayed. You should see a rectangular region appear that contains a button. This area is surrounded by a hatched border indicating that it is currently selected.
  3. You may reposition the OurButton Bean by positioning the cursor over one of the hatched borders and dragging the Bean.
  4. Go to the Properties window and change the label of the Bean to “Rotate X”. The button appearance changes immediately when this property is changed.
  5. Go to the menu bar of the BeanBox and select Edit | Events | action | actionPerformed. You should now see a line extending from the button to the cursor. Notice that one end of the line moves as the cursor moves. However, the other end of the line remains fixed at the button.
  6. Move the cursor so that it is inside the Molecule Bean display area, and click the left mouse button. You should see the Event Target Dialog dialog box.
  7. The dialog box allows you to choose a method that should be invoked when this button is clicked. Select the entry labeled “rotateOnX” and click the OK button. You should see a message box appear very briefly, stating that the tool is “Generating and compiling adaptor class.”

Test the application. Each time you press the button, the molecule should move a few degrees around one of its axes. Now create another instance of the OurButton Bean. Label it “Rotate Y” and map its action event to the “rotateOnY” method of the Molecule Bean. The steps to do this are very similar to those just described for the button labeled “Rotate X”. Test the application by clicking these buttons and observing how the molecule moves.
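
The adaptor class that the tool generates is not shown in this chapter, but the idea behind it can be approximated by hand: an object that listens for the button's action event and forwards it as a method call on the target Bean. The sketch below is only an illustration. MoleculeStandIn is a hypothetical stand-in for the Molecule Bean, the 10-degree step is arbitrary, and RotateXAdaptor is an assumed name rather than the class the BeanBox actually produces.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;

class HookupSketch {
  // Hypothetical stand-in for the Molecule Bean; it only tracks an angle.
  public static class MoleculeStandIn {
    private int xAngle;
    public void rotateOnX() { xAngle = (xAngle + 10) % 360; } // arbitrary step
    public int getXAngle() { return xAngle; }
  }

  // Adaptor: turns an action event into a call to the target's rotateOnX().
  public static class RotateXAdaptor implements ActionListener {
    private final MoleculeStandIn target;

    public RotateXAdaptor(MoleculeStandIn target) { this.target = target; }

    @Override
    public void actionPerformed(ActionEvent e) { target.rotateOnX(); }
  }

  public static void main(String[] args) {
    MoleculeStandIn molecule = new MoleculeStandIn();
    ActionListener adaptor = new RotateXAdaptor(molecule);

    // Simulate two button presses without building a GUI.
    Object source = "Rotate X button";
    adaptor.actionPerformed(new ActionEvent(source, ActionEvent.ACTION_PERFORMED, "click"));
    adaptor.actionPerformed(new ActionEvent(source, ActionEvent.ACTION_PERFORMED, "click"));

    System.out.println(molecule.getXAngle()); // prints 20
  }
}
```

In a real application the adaptor would be registered on the button with addActionListener( ) instead of being called directly.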




JAR Files

Before developing your own Bean, it is necessary for you to understand JAR (Java Archive) files, because tools such as the BDK expect Beans to be packaged within JAR files. A JAR file allows you to efficiently deploy a set of classes and their associated resources. For example, a developer may build a multimedia application that uses various sound and image files. A set of Beans can control how and when this information is presented. All of these pieces can be placed into one JAR file.

JAR technology makes it much easier to deliver and install software. Also, the elements in a JAR file are compressed, which makes downloading a JAR file much faster than separately downloading several uncompressed files. Digital signatures may also be associated with the individual elements in a JAR file. This allows a consumer to be sure that these elements were produced by a specific organization or individual.

The package java.util.jar contains classes that read and write JAR files. It builds on the ZIP classes in java.util.zip.
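
To give a feel for reading and writing JAR data from code, here is a small sketch that uses the JDK's JarOutputStream and JarInputStream classes from java.util.jar. It works on an in-memory byte array so that nothing is written to disk. The class name JarRoundTrip and the entry names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

class JarRoundTrip {
  // Builds a JAR in memory from entry-name/content pairs (no manifest added).
  static byte[] writeJar(Map<String, byte[]> files) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (JarOutputStream jar = new JarOutputStream(bytes)) {
        for (Map.Entry<String, byte[]> file : files.entrySet()) {
          jar.putNextEntry(new JarEntry(file.getKey()));
          jar.write(file.getValue());
          jar.closeEntry();
        }
      }
      return bytes.toByteArray();
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }

  // Returns the entry names in the order they are stored in the archive.
  static List<String> listEntries(byte[] jarBytes) {
    try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
      List<String> names = new ArrayList<>();
      for (JarEntry entry = jar.getNextJarEntry(); entry != null;
           entry = jar.getNextJarEntry()) {
        names.add(entry.getName());
      }
      return names;
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }

  public static void main(String[] args) {
    Map<String, byte[]> files = new LinkedHashMap<>();
    files.put("a.txt", "first".getBytes());
    files.put("b.txt", "second".getBytes());
    System.out.println(listEntries(writeJar(files))); // prints [a.txt, b.txt]
  }
}
```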


Manifest Files

A developer must provide a manifest file to indicate which of the components in a JAR file are Java Beans. An example of a manifest file is provided in the following listing. It defines a JAR file that contains four .gif files and one .class file. The last entry is a Bean.

  Name: sunw/demo/slides/slide0.gif

  Name: sunw/demo/slides/slide1.gif

  Name: sunw/demo/slides/slide2.gif

  Name: sunw/demo/slides/slide3.gif

  Name: sunw/demo/slides/Slides.class
  Java-Bean: True

A manifest file may reference several .class files. If a .class file is a Java Bean, its entry must be immediately followed by the line “Java-Bean: True”.
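
The same per-entry attribute can also be produced and checked from code with the JDK's java.util.jar.Manifest class. The sketch below is one possible way to do it, not a required step. The class name ManifestSketch and its helper methods are hypothetical. The Manifest-Version main attribute is added because the Manifest class expects it when writing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class ManifestSketch {
  // Builds manifest text that marks one entry as a Java Bean.
  static String beanManifest(String classEntry) {
    Manifest mf = new Manifest();
    mf.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");

    Attributes attrs = new Attributes();
    attrs.putValue("Java-Bean", "True");
    mf.getEntries().put(classEntry, attrs); // becomes a "Name:" section

    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      mf.write(out);
      return new String(out.toByteArray(), StandardCharsets.UTF_8);
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }

  // Returns true if the manifest text marks the given entry as a Bean.
  static boolean isBean(String manifestText, String entryName) {
    try {
      Manifest mf = new Manifest(
          new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
      Attributes attrs = mf.getAttributes(entryName);
      return attrs != null && "True".equalsIgnoreCase(attrs.getValue("Java-Bean"));
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }

  public static void main(String[] args) {
    String text = beanManifest("sunw/demo/slides/Slides.class");
    System.out.print(text);
    System.out.println(isBean(text, "sunw/demo/slides/Slides.class")); // prints true
  }
}
```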

The JAR Utility
A utility is used to generate a JAR file. Its syntax is shown here:

      jar options files

The following examples show how to use this utility.

Creating a JAR File
The following command creates a JAR file named Xyz.jar that contains all of the .class and .gif files in the current directory:

  jar cf Xyz.jar *.class *.gif

If a manifest file such as Yxz.mf is available, it can be used with the following command:

  jar cfm Xyz.jar Yxz.mf *.class *.gif


JAR Command Options

  Option   Description

  c        A new archive is to be created.
  C        Change directories during command execution.
  f        The first element in the file list is the name of the archive that is to be created or accessed.
  i        Index information should be provided.
  m        The second element in the file list is the name of the external manifest file.
  M        Manifest file not created.
  t        The archive contents should be tabulated.
  u        Update existing JAR file.
  v        Verbose output should be provided by the utility as it executes.
  x        Files are to be extracted from the archive. (If there is only one file, that is the name of the archive, and all files in it are extracted. Otherwise, the first element in the file list is the name of the archive, and the remaining elements in the list are the files that should be extracted from the archive.)
  0        Do not use compression.


Tabulating the Contents of a JAR File
The following command lists the contents of Xyz.jar:

  jar tf Xyz.jar

Extracting Files from a JAR File
The following command extracts the contents of Xyz.jar and places those files in the current directory:

  jar xf Xyz.jar

Updating an Existing JAR File
The following command adds the file file1.class to Xyz.jar:

  jar -uf Xyz.jar file1.class

Recursing Directories
The following command adds all files below directoryX to Xyz.jar:

  jar -uf Xyz.jar -C directoryX .

Reflection & Remote Method Invocation (RMI) - Java Tutorials

Reflection

Reflection is the ability of software to analyze itself. This is provided by the java.lang.reflect package and elements in Class. Reflection is an important capability, needed when using components called Java Beans. It allows you to analyze a software component and describe its capabilities dynamically, at run time rather than at compile time. For example, by using reflection, you can determine what methods, constructors, and fields a class supports.

The package java.lang.reflect has an interface, called Member, which defines methods that allow you to get information about a field, constructor, or method of a class. There are also eight classes in this package. 

The following application illustrates a simple use of the Java reflection capabilities. It prints the constructors, fields, and methods of the class java.awt.Dimension. The program begins by using the forName( ) method of Class to get a class object for java.awt.Dimension. Once this is obtained, getConstructors( ), getFields( ), and getMethods( ) are used to analyze this class object. They return arrays of Constructor, Field, and Method objects that provide the information about the object. The Constructor, Field, and Method classes define several methods that can be used to obtain information about an object. You will want to explore these on your own. However, each supports the toString( ) method. Therefore, using Constructor, Field, and Method objects as arguments to the println( ) method is straightforward, as shown in the program.


Classes Defined in java.lang.reflect

AccessibleObject:  Allows you to bypass the default access control checks. (Added by Java 2)

Array:  Allows you to dynamically create and manipulate arrays.

Constructor:  Provides information about a constructor.

Field:  Provides information about a field.

Method:  Provides information about a method.

Modifier:  Provides information about class and member access modifiers.

Proxy:  Supports dynamic proxy classes. (Added by Java 2, v1.3)

ReflectPermission:  Allows reflection of private or protected members of a class. (Added by Java 2)


  // Demonstrate reflection.
  import java.lang.reflect.*;
  public class ReflectionDemo1 {
    public static void main(String args[]) {
      try {
        Class c = Class.forName("java.awt.Dimension");
        System.out.println("Constructors:");
        Constructor constructors[] = c.getConstructors();
        for(int i = 0; i < constructors.length; i++) {
          System.out.println(" " + constructors[i]);
        }

        System.out.println("Fields:");
        Field fields[] = c.getFields();
        for(int i = 0; i < fields.length; i++) {
          System.out.println(" " + fields[i]);
        }

        System.out.println("Methods:");
        Method methods[] = c.getMethods();
        for(int i = 0; i < methods.length; i++) {
          System.out.println(" " + methods[i]);
        }
      }
      catch(Exception e) {
        System.out.println("Exception: " + e);
      }
    }
  }

Here is the output from this program:

  Constructors:
   public java.awt.Dimension(java.awt.Dimension)
   public java.awt.Dimension(int,int)
   public java.awt.Dimension()
  Fields:
   public int java.awt.Dimension.width
   public int java.awt.Dimension.height
  Methods:
   public int java.awt.Dimension.hashCode()
   public boolean java.awt.Dimension.equals(java.lang.Object)
   public java.lang.String java.awt.Dimension.toString()
   public void java.awt.Dimension.setSize(java.awt.Dimension)
   public void java.awt.Dimension.setSize(int,int)
   public void java.awt.Dimension.setSize(double,double)
   public java.awt.Dimension java.awt.Dimension.getSize()
   public double java.awt.Dimension.getWidth()
   public double java.awt.Dimension.getHeight()
   public java.lang.Object java.awt.geom.Dimension2D.clone()
   public void java.awt.geom.Dimension2D.
     setSize(java.awt.geom.Dimension2D)
   public final native java.lang.Class java.lang.Object.getClass()
   public final void java.lang.Object.wait(long,int) throws
     java.lang.InterruptedException
   public final void java.lang.Object.wait()
     throws java.lang.InterruptedException
   public final native void java.lang.Object.wait(long) throws
     java.lang.InterruptedException
   public final native void java.lang.Object.notify()
   public final native void java.lang.Object.notifyAll()
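
Two of the other classes listed earlier, Array and AccessibleObject, can be illustrated with a short sketch. It is only an example under stated assumptions: ReflectExtras and SecretCounter are hypothetical names, and setAccessible( ) is called on a Field, which inherits that method from AccessibleObject.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;

// Hypothetical class with a private field to inspect.
class SecretCounter {
  private int count = 41;
}

class ReflectExtras {
  // Creates an int array reflectively and fills it with squares.
  static int[] squares(int n) {
    Object arr = Array.newInstance(int.class, n);
    for (int i = 0; i < n; i++) {
      Array.setInt(arr, i, i * i);
    }
    return (int[]) arr;
  }

  // Reads a private int field by name, bypassing the default access check.
  static int readPrivateInt(Object target, String fieldName) {
    try {
      Field f = target.getClass().getDeclaredField(fieldName);
      f.setAccessible(true); // inherited from AccessibleObject
      return f.getInt(target);
    } catch (ReflectiveOperationException ex) {
      throw new IllegalStateException(ex);
    }
  }

  public static void main(String[] args) {
    System.out.println(squares(4).length);                           // prints 4
    System.out.println(readPrivateInt(new SecretCounter(), "count")); // prints 41
  }
}
```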


The next example uses Java’s reflection capabilities to obtain the public methods of a class. The program begins by instantiating class A. The getClass( ) method is applied to this object reference and it returns the Class object for class A. The getDeclaredMethods( ) method returns an array of Method objects that describe only the methods declared by this class. Methods inherited from superclasses such as Object are not included.

Each element of the methods array is then processed. The getModifiers( ) method returns an int containing flags that describe which access modifiers apply for this element. The Modifier class provides a set of methods, shown below, that can be used to examine this value. For example, the static method isPublic( ) returns true if its argument includes the public access modifier. Otherwise, it returns false. In the following program, if the method supports public access, its name is obtained by the getName( ) method and is then printed.


Methods Defined by Modifier That Determine Access Modifiers

static boolean isAbstract(int val):  Returns true if val has the abstract flag set and false otherwise.

static boolean isFinal(int val):  Returns true if val has the final flag set and false otherwise.

static boolean isInterface(int val):  Returns true if val has the interface flag set and false otherwise.

static boolean isNative(int val):  Returns true if val has the native flag set and false otherwise.

static boolean isPrivate(int val):  Returns true if val has the private flag set and false otherwise.

static boolean isProtected(int val):  Returns true if val has the protected flag set and false otherwise.

static boolean isPublic(int val):  Returns true if val has the public flag set and false otherwise.

static boolean isStatic(int val):  Returns true if val has the static flag set and false otherwise.

static boolean isStrict(int val):  Returns true if val has the strict flag set and false otherwise.

static boolean isSynchronized(int val):  Returns true if val has the synchronized flag set and false otherwise.

static boolean isTransient(int val):  Returns true if val has the transient flag set and false otherwise.

static boolean isVolatile(int val):  Returns true if val has the volatile flag set and false otherwise.


  // Show public methods.
  import java.lang.reflect.*;
  public class ReflectionDemo2 {
    public static void main(String args[]) {
      try {
        A a = new A();
        Class c = a.getClass();
        System.out.println("Public Methods:");
        Method methods[] = c.getDeclaredMethods();
        for(int i = 0; i < methods.length; i++) {
          int modifiers = methods[i].getModifiers();
          if(Modifier.isPublic(modifiers)) {
            System.out.println(" " + methods[i].getName());
          }
        }
      }
      catch(Exception e) {
        System.out.println("Exception: " + e);
      }
    }
  }
  
  class A {
    public void a1() {
    }
    public void a2() {
    }
    protected void a3() {
    }
    private void a4() {
    }
  }

Here is the output from this program:

  Public Methods:
   a1
   a2
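
The int returned by getModifiers( ) can also be turned into readable text with Modifier.toString( ), which is sometimes handier than testing each flag. The sketch below is a small variation on the program above. ModifierSketch and Sample are hypothetical names, and a sorted map is used because getDeclaredMethods( ) does not return methods in any particular order.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

class ModifierSketch {
  // Hypothetical class with a mix of modifiers to inspect.
  public abstract static class Sample {
    public static void s1() { }
    protected final void f1() { }
    abstract void a1();
    private synchronized void p1() { }
  }

  // Maps each declared method name to the text form of its modifiers.
  static Map<String, String> describe(Class<?> c) {
    Map<String, String> result = new TreeMap<>();
    for (Method m : c.getDeclaredMethods()) {
      if (m.isSynthetic()) {
        continue; // skip compiler-generated methods, if any
      }
      result.put(m.getName(), Modifier.toString(m.getModifiers()));
    }
    return result;
  }

  public static void main(String[] args) {
    for (Map.Entry<String, String> e : describe(Sample.class).entrySet()) {
      System.out.println(e.getKey() + ": " + e.getValue());
    }
  }
}
```

For the Sample class this prints a1: abstract, f1: protected final, p1: private synchronized, and s1: public static, one per line.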




Remote Method Invocation (RMI)

Remote Method Invocation (RMI) allows a Java object that executes on one machine to invoke a method of a Java object that executes on another machine. This is an important feature, because it allows you to build distributed applications. While a complete discussion of RMI is outside the scope of this book, the following example describes the basic principles involved.


A Simple Client/Server Application Using RMI

This section provides step-by-step directions for building a simple client/server application by using RMI. The server receives a request from a client, processes it, and returns a result. In this example, the request specifies two numbers. The server adds these together and returns the sum.

Step One: Enter and Compile the Source Code
This application uses four source files. The first file, AddServerIntf.java, defines the remote interface that is provided by the server. It contains one method that accepts two double arguments and returns their sum. All remote interfaces must extend the Remote interface, which is part of java.rmi. Remote defines no members. Its purpose is simply to indicate that an interface uses remote methods. All remote methods can throw a RemoteException.

  import java.rmi.*;
  public interface AddServerIntf extends Remote {
    double add(double d1, double d2) throws RemoteException;
  }

The second source file, AddServerImpl.java, implements the remote interface. The implementation of the add( ) method is straightforward. Remote objects typically extend UnicastRemoteObject, as this one does, which provides functionality that is needed to make objects available from remote machines.

  import java.rmi.*;
  import java.rmi.server.*;
  public class AddServerImpl extends UnicastRemoteObject
    implements AddServerIntf {

    public AddServerImpl() throws RemoteException {
    }
    public double add(double d1, double d2) throws RemoteException {
      return d1 + d2;
    }
  }

The third source file, AddServer.java, contains the main program for the server machine. Its primary function is to update the RMI registry on that machine. This is done by using the rebind( ) method of the Naming class (found in java.rmi). That method associates a name with an object reference. The first argument to the rebind( ) method is a string that names the server as “AddServer”. Its second argument is a reference to an instance of AddServerImpl.

  import java.net.*;
  import java.rmi.*;
  public class AddServer {
    public static void main(String args[]) {
      try {
        AddServerImpl addServerImpl = new AddServerImpl();
        Naming.rebind("AddServer", addServerImpl);
      }
      catch(Exception e) {
        System.out.println("Exception: " + e);
      }
    }
  }

The fourth source file, AddClient.java, implements the client side of this distributed application. AddClient.java requires three command line arguments. The first is the IP address or name of the server machine. The second and third arguments are the two numbers that are to be summed.

The application begins by forming a string that follows the URL syntax. This URL uses the rmi protocol. The string includes the IP address or name of the server and the string “AddServer”. The program then invokes the lookup( ) method of the Naming class. This method accepts one argument, the rmi URL, and returns a reference to an object of type AddServerIntf. All remote method invocations can then be directed to this object.

The program continues by displaying its arguments and then invokes the remote add( ) method. The sum is returned from this method and is then printed.

  import java.rmi.*;
  public class AddClient {
    public static void main(String args[]) {
      try {
        String addServerURL = "rmi://" + args[0] + "/AddServer";
        AddServerIntf addServerIntf =
                     (AddServerIntf)Naming.lookup(addServerURL);
        System.out.println("The first number is: " + args[1]);
        double d1 = Double.valueOf(args[1]).doubleValue();
        System.out.println("The second number is: " + args[2]);

        double d2 = Double.valueOf(args[2]).doubleValue();
        System.out.println("The sum is: " + addServerIntf.add
                          (d1, d2));
      }
      catch(Exception e) {
        System.out.println("Exception: " + e);
      }
    }
  }

After you enter all the code, use javac to compile the four source files that you created.

Step Two: Generate Stubs and Skeletons
Before you can use the client and server, you must generate the necessary stub. You may also need to generate a skeleton. In the context of RMI, a stub is a Java object that resides on the client machine. Its function is to present the same interfaces as the remote server. Remote method calls initiated by the client are actually directed to the stub. The stub works with the other parts of the RMI system to formulate a request that is sent to the remote machine.

A remote method may accept arguments that are simple types or objects. In the latter case, the object may have references to other objects. All of this information must be sent to the remote machine. That is, an object passed as an argument to a remote method call must be serialized and sent to the remote machine. Note that the serialization facilities also recursively process all referenced objects.
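
The recursive behavior can be seen without RMI at all by serializing an object that refers to another object. The sketch below uses plain Java serialization on a byte array. Customer, Address, and ArgumentGraphSketch are hypothetical names, and this is only an approximation of what happens to an argument on its way to a remote machine.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ArgumentGraphSketch {
  public static class Address implements Serializable {
    private static final long serialVersionUID = 1L;
    public final String city;
    public Address(String city) { this.city = city; }
  }

  // A Customer refers to an Address, so both must be serialized together.
  public static class Customer implements Serializable {
    private static final long serialVersionUID = 1L;
    public final String name;
    public final Address address;
    public Customer(String name, Address address) {
      this.name = name;
      this.address = address;
    }
  }

  // Serializes an object and everything it references, then reads a copy back.
  static Object roundTrip(Object arg) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(arg);
      }
      try (ObjectInputStream in = new ObjectInputStream(
               new ByteArrayInputStream(bytes.toByteArray()))) {
        return in.readObject();
      }
    } catch (IOException | ClassNotFoundException ex) {
      throw new IllegalStateException(ex);
    }
  }

  public static void main(String[] args) {
    Customer original = new Customer("Ann", new Address("Oslo"));
    Customer copy = (Customer) roundTrip(original);
    System.out.println(copy.address.city);                   // prints Oslo
    System.out.println(copy.address == original.address);    // prints false
  }
}
```

The copy carries its own Address object, which shows that the referenced object traveled along with the one that was written.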

Skeletons are not required by Java 2. However, they are required for the Java 1.1 RMI model. Because of this, skeletons are still required for compatibility between Java 1.1 and Java 2. A skeleton is a Java object that resides on the server machine. It works with the other parts of the 1.1 RMI system to receive requests, perform deserialization, and invoke the appropriate code on the server. Again, the skeleton mechanism is not required for Java 2 code that does not require compatibility with 1.1. Because many readers will want to generate the skeleton, one is used by this example.

If a response must be returned to the client, the process works in reverse. Note that the serialization and deserialization facilities are also used if objects are returned to a client. To generate stubs and skeletons, you use a tool called the RMI compiler, which is invoked from the command line, as shown here:

      rmic AddServerImpl

This command generates two new files: AddServerImpl_Skel.class (skeleton) and AddServerImpl_Stub.class (stub). When using rmic, be sure that CLASSPATH is set to include the current directory. As you can see, by default, rmic generates both a stub and a skeleton file. If you do not need the skeleton, you have the option to suppress it.

Step Three: Install Files on the Client and Server Machines
Copy AddClient.class, AddServerImpl_Stub.class, and AddServerIntf.class to a directory on the client machine. Copy AddServerIntf.class, AddServerImpl.class, AddServerImpl_Skel.class, AddServerImpl_Stub.class, and AddServer.class to a directory on the server machine.

RMI has techniques for dynamic class loading, but they are not used by the example at hand. Instead, all of the files that are used by the client and server applications must be installed manually on those machines.

Step Four: Start the RMI Registry on the Server Machine
The Java 2 SDK provides a program called rmiregistry, which executes on the server machine. It maps names to object references. First, check that the CLASSPATH environment variable includes the directory in which your files are located. Then, start the RMI Registry from the command line, as shown here:

      start rmiregistry

When this command returns, you should see that a new window has been created. You need to leave this window open until you are done experimenting with the RMI example.

Step Five: Start the Server
The server code is started from the command line, as shown here:

      java AddServer

Recall that the AddServer code instantiates AddServerImpl and registers that object with the name “AddServer”.

Step Six: Start the Client
The AddClient software requires three arguments: the name or IP address of the server machine and the two numbers that are to be summed together. You may invoke it from the command line by using one of the two formats shown here:

      java AddClient server1 8 9
      java AddClient 11.12.13.14 8 9

In the first line, the name of the server is provided. The second line uses its IP address (11.12.13.14). You can try this example without actually having a remote server. To do so, simply install all of the programs on the same machine, start rmiregistry, start AddServer, and then execute AddClient using this command line:

      java AddClient 127.0.0.1 8 9

Here, the address 127.0.0.1 is the “loop back” address for the local machine. Using this address allows you to exercise the entire RMI mechanism without actually having to install the server on a remote computer. In either case, sample output from this program is shown here:

  The first number is: 8
  The second number is: 9
  The sum is: 17.0
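
For quick experiments, the whole exchange can also be approximated inside a single JVM. The sketch below is not the procedure described in the steps above; it swaps in several assumptions. The registry is created in-process with LocateRegistry.createRegistry( ) instead of the rmiregistry program, the object is exported with UnicastRemoteObject.exportObject( ) instead of by subclassing, and the stub is generated dynamically (supported since Java 5), so rmic is not run. Adder, AdderImpl, and LocalRmiSketch are hypothetical names. The registry itself is called locally here; only the add( ) call goes through the stub and the loopback network.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class LocalRmiSketch {
  // Same shape as AddServerIntf; redeclared so the sketch is self-contained.
  public interface Adder extends Remote {
    double add(double d1, double d2) throws RemoteException;
  }

  public static class AdderImpl implements Adder {
    public double add(double d1, double d2) { return d1 + d2; }
  }

  // Exports an AdderImpl, binds its stub, looks it up, and calls add().
  static double addViaRmi(double d1, double d2) {
    // Assumption: advertise the loopback address so the stub connects locally.
    System.setProperty("java.rmi.server.hostname", "127.0.0.1");
    try {
      Registry registry = LocateRegistry.createRegistry(0); // 0 = any free port
      AdderImpl impl = new AdderImpl();
      Adder stub = (Adder) UnicastRemoteObject.exportObject(impl, 0);
      try {
        registry.rebind("AddServer", stub);
        Adder client = (Adder) registry.lookup("AddServer");
        return client.add(d1, d2); // remote call through the stub
      } finally {
        // Unexport both objects so the JVM can exit.
        UnicastRemoteObject.unexportObject(impl, true);
        UnicastRemoteObject.unexportObject(registry, true);
      }
    } catch (Exception ex) {
      throw new IllegalStateException(ex);
    }
  }

  public static void main(String[] args) {
    System.out.println("The sum is: " + addViaRmi(8, 9)); // prints The sum is: 17.0
  }
}
```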

Regular Expression Processing - Java Tutorials

Another exciting package added by Java 2, version 1.4 is java.util.regex, which supports regular expression processing. As the term is used here, a regular expression is a string of characters that describes a character sequence. This general description, called a pattern, can then be used to find matches in other character sequences. Regular expressions can specify wildcard characters, sets of characters, and various quantifiers. Thus, you can specify a regular expression that represents a general form that can match several different specific character sequences.

There are two classes that support regular expression processing: Pattern and Matcher. These classes work together. Use Pattern to define a regular expression. Match the pattern against another sequence using Matcher.


Pattern

The Pattern class defines no constructors. Instead, a pattern is created by calling the compile( ) factory method. One of its forms is shown here:

      static Pattern compile(String pattern)

Here, pattern is the regular expression that you want to use. The compile( ) method transforms the string in pattern into a pattern that can be used for pattern matching by the Matcher class. It returns a Pattern object that contains the pattern. Once you have created a Pattern object, you will use it to create a Matcher. This is done by calling the matcher( ) factory method defined by Pattern. It is shown here:

      Matcher matcher(CharSequence str)

Here str is the character sequence that the pattern will be matched against. This is called the input sequence. CharSequence is an interface that was added by Java 2, version 1.4 that defines a read-only set of characters. It is implemented by the String class, among others. Thus, you can pass a string to matcher( ).
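
As a minimal sketch of the two factory methods just described, the following program compiles one pattern and then creates a separate matcher for each of two input sequences. The class name CompileSketch and the helper matchesAll( ) are illustrative only, and the sketch relies on the matches( ) method, which is described in the next section.

```java
// Compile a pattern once, then create matchers for two input sequences.
import java.util.regex.*;

class CompileSketch {
  // Returns true if the entire input sequence matches the pattern.
  static boolean matchesAll(Pattern pat, CharSequence input) {
    Matcher mat = pat.matcher(input); // one Matcher per input sequence
    return mat.matches();
  }

  public static void main(String args[]) {
    Pattern pat = Pattern.compile("Java");

    // String and StringBuilder both implement CharSequence.
    System.out.println(matchesAll(pat, "Java"));
    System.out.println(matchesAll(pat, new StringBuilder("Java 2")));
  }
}
```

The first call displays true and the second displays false. Notice that the same Pattern object is reused; only the Matcher is created anew for each input sequence.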


Matcher

The Matcher class has no constructors. Instead, you create a Matcher by calling the matcher( ) factory method defined by Pattern, as just explained. Once you have created a Matcher, you will use its methods to perform various pattern matching operations. The simplest pattern matching method is matches( ), which simply determines whether the character sequence matches the pattern. It is shown here:

      boolean matches( )

It returns true if the sequence and the pattern match, and false otherwise. Understand that the entire sequence must match the pattern, not just a subsequence of it. To determine if a subsequence of the input sequence matches the pattern, use find( ). One version is shown here:

      boolean find( )

It returns true if there is a matching subsequence and false otherwise. This method can be called repeatedly, allowing it to find all matching subsequences. Each call to find( ) begins where the previous one left off. You can obtain a string containing the last matching sequence by calling group( ). One of its forms is shown here:

      String group( )

The matching string is returned. If no match exists, then an IllegalStateException is thrown. You can obtain the index within the input sequence of the current match by calling start( ). The index one past the end of the current match is obtained by calling end( ). These methods are shown here:

      int start( )
      int end( )

You can replace all occurrences of a matching sequence with another sequence by calling replaceAll( ), shown here:

      String replaceAll(String newStr)

Here, newStr specifies the new character sequence that will replace the ones that match the pattern. The updated input sequence is returned as a string.
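
The following sketch pulls these methods together. It is one possible arrangement, not the only one: the class name MatcherSketch, the helper describe( ), and the sample strings are all assumptions made for illustration.

```java
// Report each match with its position, then replace all matches.
import java.util.regex.*;

class MatcherSketch {
  // Lists every match as "text [start,end)" on its own line.
  static String describe(Pattern pat, String input) {
    Matcher mat = pat.matcher(input);
    StringBuilder sb = new StringBuilder();
    while(mat.find()) {
      sb.append(mat.group()).append(" [")
        .append(mat.start()).append(",")
        .append(mat.end()).append(")\n");
    }
    return sb.toString();
  }

  public static void main(String args[]) {
    Pattern pat = Pattern.compile("cat");
    String input = "cat hat cat";

    System.out.print(describe(pat, input));
    System.out.println(pat.matcher(input).replaceAll("dog"));

    // group() before any successful match throws IllegalStateException.
    try {
      pat.matcher(input).group();
    } catch(IllegalStateException exc) {
      System.out.println("No match available yet");
    }
  }
}
```

Here, the two matches are reported as cat [0,3) and cat [8,11), because end( ) returns the index one past the last matched character. The call to replaceAll( ) returns dog hat dog, and the original input string is not modified.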


Regular Expression Syntax

Before demonstrating Pattern and Matcher, it is necessary to explain how to construct a regular expression. The syntax and rules that define a regular expression are similar to those used by Perl 5. Although no rule is complicated by itself, there are a large number of them, and a complete discussion is beyond the scope of this chapter. However, a few of the more commonly used constructs are described here.

In general, a regular expression is composed of normal characters, character classes (sets of characters), wildcard characters, and quantifiers. A normal character is matched as-is. Thus, if a pattern consists of “xy”, then the only input sequence that will match it is “xy”. Characters such as newline and tab are specified using the standard escape sequences, which begin with a \. For example, a newline is specified by \n. In the language of regular expressions, a normal character is also called a literal.

A character class is a set of characters. A character class is specified by putting the characters in the class between brackets. For example, the class [wxyz] matches w, x, y, or z. To specify an inverted set, precede the characters with a ^. For example, [^wxyz] matches any character except w, x, y, or z. You can specify a range of characters using a hyphen. For example, to specify a character class that will match the digits 1 through 9, use [1-9].

The wildcard character is the . (dot), and it matches any character (by default, any character except a line terminator). Thus, a pattern that consists of “.” will match these (and other) input sequences: “A”, “a”, “x”, and so on. A quantifier determines how many times an expression is matched. The quantifiers are shown here:

      +     -  Match one or more.
      *     -  Match zero or more.
      ?     -  Match zero or one.

For example, the pattern “x+” will match “x”, “xx”, and “xxx”, among others.
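
To make the constructs described above concrete, here is a small sketch that tests several of them against short input sequences. The class name SyntaxSketch and the helper full( ) are hypothetical; the helper simply compiles the expression and requires the whole input to match.

```java
// Try a few of the constructs described above.
import java.util.regex.*;

class SyntaxSketch {
  // Returns true if the entire input matches the regular expression.
  static boolean full(String regex, String input) {
    return Pattern.compile(regex).matcher(input).matches();
  }

  public static void main(String args[]) {
    System.out.println(full("[wxyz]", "x"));   // character class
    System.out.println(full("[^wxyz]", "x"));  // inverted set rejects x
    System.out.println(full("[1-9]", "0"));    // range does not include 0
    System.out.println(full(".", "A"));        // wildcard
    System.out.println(full("x+", ""));        // + needs at least one x
    System.out.println(full("x*", ""));        // * accepts zero
    System.out.println(full("x?", "xx"));      // ? accepts at most one

    // In Java source, the backslash of a regex escape is itself escaped.
    System.out.println(full("a\\nb", "a\nb"));
  }
}
```

In order, the program displays true, false, false, true, false, true, false, and true. The last line shows a detail that is easy to overlook: because the backslash is also the escape character for Java string literals, the regular expression \n is written as "\\n" in source code.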


Demonstrating Pattern Matching

The best way to understand how regular expression pattern matching operates is to work through some examples. The first, shown here, looks for a match with a literal pattern.

  // A simple pattern matching demo.
  import java.util.regex.*;

  class RegExpr {
    public static void main(String args[]) {
      Pattern pat;
      Matcher mat;
      boolean found;

      pat = Pattern.compile("Java");
      mat = pat.matcher("Java");

      found = mat.matches(); // check for a match

      System.out.println("Testing Java against Java.");
      if(found) System.out.println("Matches");
      else System.out.println("No Match");

      System.out.println();

      System.out.println("Testing Java against Java 2.");
      mat = pat.matcher("Java 2"); // create a new matcher

      found = mat.matches(); // check for a match

      if(found) System.out.println("Matches");
      else System.out.println("No Match");
    }
  }

The output from the program is shown here:

  Testing Java against Java.
  Matches

  Testing Java against Java 2.
  No Match

Let’s look closely at this program. The program begins by creating the pattern that contains the sequence “Java”. Next, a Matcher is created for that pattern that has the input sequence “Java”. Then, the matches( ) method is called to determine if the input sequence matches the pattern. Because the sequence and the pattern are the same, matches( ) returns true. Next, a new Matcher is created with the input sequence “Java 2” and matches( ) is called again. In this case, the pattern and the input sequence differ, and no match is found. Remember, the matches( ) method returns true only when the input sequence precisely matches the pattern. It will not return true just because a subsequence matches.

You can use find( ) to determine if the input sequence contains a subsequence that matches the pattern. Consider the following program.

  // Use find() to find a subsequence.
  import java.util.regex.*;

  class RegExpr2 {
    public static void main(String args[]) {
      Pattern pat = Pattern.compile("Java");
      Matcher mat = pat.matcher("Java 2");

      System.out.println("Looking for Java in Java 2.");

      if(mat.find()) System.out.println("subsequence found");
      else System.out.println("No Match");
    }
  }

The output is shown here:

  Looking for Java in Java 2.
  subsequence found

In this case, find( ) finds the subsequence “Java”. The find( ) method can be used to search the input sequence for repeated occurrences of the pattern because each call to find( ) picks up where the previous one left off. For example, the following program finds two occurrences of the pattern “test”.

  // Use find() to find multiple subsequences.
  import java.util.regex.*;

  class RegExpr3 {
    public static void main(String args[]) {
      Pattern pat = Pattern.compile("test");
      Matcher mat = pat.matcher("test 1 2 3 test");

      while(mat.find()) {
        System.out.println("test found at index " +
                           mat.start());
      }
    }
  }

The output is shown here:

  test found at index 0
  test found at index 11

As the output shows, two matches were found. The program uses the start( ) method to obtain the index of each match.


Using Wildcards and Quantifiers

Although the preceding programs show the general technique for using Pattern and Matcher, they don’t show their power. The real benefit of regular expression processing is not seen until wildcards and quantifiers are used. To begin, consider the following example that uses the + quantifier to match any arbitrarily long sequence of Ws.

  // Use a quantifier.
  import java.util.regex.*;

  class RegExpr4 {
    public static void main(String args[]) {
      Pattern pat = Pattern.compile("W+");
      Matcher mat = pat.matcher("W WW WWW");

      while(mat.find())
        System.out.println("Match: " + mat.group());
    }
  }

The output from the program is shown here:

  Match: W
  Match: WW
  Match: WWW

As the output shows, the regular expression pattern “W+” matches any arbitrarily long sequence of Ws. The next program uses a wildcard to create a pattern that will match any sequence that begins with e and ends with d. To do this, it uses the dot wildcard character along with the + quantifier.

  // Use wildcard and quantifier.
  import java.util.regex.*;

  class RegExpr5 {
    public static void main(String args[]) {
      Pattern pat = Pattern.compile("e.+d");
      Matcher mat = pat.matcher("extend cup end table");

      while(mat.find())
        System.out.println("Match: " + mat.group());
    }
  }

You might be surprised by the output produced by the program, which is shown here:

  Match: extend cup end

Only one match is found, and it is the longest sequence that begins with e and ends with d. You might have expected two matches: extend and end. The longer sequence is found because, by default, a quantifier matches as much of the input as it can while still allowing the overall pattern to match. This is called greedy behavior. You can specify reluctant behavior by adding a ? after the quantifier, as shown in this version of the program. It causes the shortest matching sequence to be obtained.

  // Use the ? quantifier.
  import java.util.regex.*;

  class RegExpr6 {
    public static void main(String args[]) {
      // Use reluctant matching behavior.
      Pattern pat = Pattern.compile("e.+?d");
      Matcher mat = pat.matcher("extend cup end table");

      while(mat.find())
        System.out.println("Match: " + mat.group());
    }
  }

The output from the program is shown here:

  Match: extend
  Match: end

As the output shows, the pattern “e.+?d” will match the shortest sequence that begins with e and ends with d. Thus, two matches are found.


Working with Classes of Characters

Sometimes you will want to match any sequence that contains one or more characters, in any order, that are part of a set of characters. For example, to match whole words, you want to match any sequence of the letters of the alphabet. One of the easiest ways to do this is to use a character class, which defines a set of characters. Recall that a character class is created by putting the characters you want to match between brackets. For example, to match the lowercase characters a through z, use [a-z]. The following program demonstrates this technique.

  // Use a character class.
  import java.util.regex.*;

  class RegExpr7 {
    public static void main(String args[]) {
      // Match lowercase words.
      Pattern pat = Pattern.compile("[a-z]+");
      Matcher mat = pat.matcher("this is a test.");

      while(mat.find())
        System.out.println("Match: " + mat.group());
    }
  }

The output is shown here:

  Match: this
  Match: is
  Match: a
  Match: test


Using replaceAll( )

The replaceAll( ) method supplied by Matcher lets you perform powerful search and replace operations that use regular expressions. For example, the following program replaces all occurrences of sequences that begin with “Jon” with “Eric”.

  // Use replaceAll().
  import java.util.regex.*;

  class RegExpr8 {
    public static void main(String args[]) {
      String str = "Jon Jonathan Frank Ken Todd";

      Pattern pat = Pattern.compile("Jon.*? ");
      Matcher mat = pat.matcher(str);

      System.out.println("Original sequence: " + str);

      str = mat.replaceAll("Eric ");

      System.out.println("Modified sequence: " + str);
    }
  }

The output is shown here:

  Original sequence: Jon Jonathan Frank Ken Todd
  Modified sequence: Eric Eric Frank Ken Todd

Because the regular expression “Jon.*? ” matches any string that begins with Jon followed by zero or more characters, ending in a space, it can be used to match and replace both Jon and Jonathan with the name Eric. Such a substitution would be difficult to perform without pattern matching capabilities.
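
One consequence of ending the pattern with a space is that a name at the very end of the input is not replaced, because no space follows it. The sketch below illustrates this and shows one possible alternative that uses a character class instead. The class name ReplaceSketch, the helper replace( ), and the modified input string are assumptions made for this illustration.

```java
// Compare two patterns for replacing names that begin with Jon.
import java.util.regex.*;

class ReplaceSketch {
  // Replaces every match of regex in input with newStr.
  static String replace(String regex, String input, String newStr) {
    return Pattern.compile(regex).matcher(input).replaceAll(newStr);
  }

  public static void main(String args[]) {
    String str = "Jon Jonathan Frank Ken Jon";

    // Requires a trailing space, so the final Jon is left alone.
    System.out.println(replace("Jon.*? ", str, "Eric "));

    // Matches Jon plus any lowercase letters; no trailing space needed.
    System.out.println(replace("Jon[a-z]*", str, "Eric"));
  }
}
```

The first line of output is Eric Eric Frank Ken Jon, and the second is Eric Eric Frank Ken Eric. Which pattern is appropriate depends on what can follow a name in your input.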


Using split( )

You can reduce an input sequence into its individual tokens by using the split( ) method defined by Pattern. The split( ) method is shown here:

      String[ ] split(CharSequence str)

It processes the input sequence passed in str, reducing it into tokens based on the delimiters specified by the pattern. For example, the following program finds tokens that are separated by spaces, commas, periods, and exclamation points.

  // Use split().
  import java.util.regex.*;

  class RegExpr9 {
    public static void main(String args[]) {

      // Match any one delimiter: space, comma, period, or exclamation point.
      Pattern pat = Pattern.compile("[ ,.!]");

      String strs[] = pat.split("one two,alpha9 12!done.");

      for(int i=0; i < strs.length; i++)
        System.out.println("Next token: " + strs[i]);
    }
  }

The output is shown here:

  Next token: one
  Next token: two
  Next token: alpha9
  Next token: 12
  Next token: done

As the output shows, the input sequence is reduced to its individual tokens. Notice that the delimiters are not included.
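
In the preceding program, each delimiter character separates tokens on its own. When two delimiters are adjacent, such as a comma followed by a space, an empty token results. The following sketch, which uses the hypothetical class SplitSketch and helper tokens( ), compares that behavior with a pattern that applies the + quantifier so that a run of delimiters is treated as a single separator.

```java
// Compare single-character and run-of-delimiters splitting.
import java.util.Arrays;
import java.util.regex.*;

class SplitSketch {
  // Splits input around matches of regex.
  static String[] tokens(String regex, String input) {
    return Pattern.compile(regex).split(input);
  }

  public static void main(String args[]) {
    String input = "one, two!";

    // Each delimiter splits separately, leaving an empty token
    // between the comma and the space.
    System.out.println(Arrays.toString(tokens("[ ,.!]", input)));

    // A run of delimiters is treated as one separator.
    System.out.println(Arrays.toString(tokens("[ ,.!]+", input)));
  }
}
```

The first call yields three tokens (one, an empty string, and two), and the second yields two (one and two). In both cases, the empty string that would follow the final exclamation point is discarded, because this form of split( ) removes trailing empty strings.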


Two Pattern-Matching Options

Although the pattern-matching techniques described in the foregoing offer the greatest flexibility and power, there are two alternatives which you might find useful in some circumstances. If you only need to perform a one-time pattern match, you can use the matches( ) method defined by Pattern. It is shown here:

      static boolean matches(String pattern, CharSequence str)

It returns true if pattern matches str and false otherwise. This method automatically compiles pattern and then looks for a match. If you will be using the same pattern repeatedly, then using matches( ) is less efficient than compiling the pattern and using the pattern-matching methods defined by Matcher, as described previously. You can also perform a pattern match by using the matches( ) method implemented by String. It is shown here:

      boolean matches(String pattern)

If the invoking string matches the regular expression in pattern, then matches( ) returns true. Otherwise, it returns false.
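
The following sketch shows both shortcut methods next to the compiled approach. The class name OptionsSketch, the helper countMatching( ), and the sample strings are illustrative assumptions.

```java
// Two one-time matching shortcuts and the compiled alternative.
import java.util.regex.*;

class OptionsSketch {
  // Counts the inputs that match a precompiled pattern in full.
  static int countMatching(Pattern pat, String inputs[]) {
    int count = 0;
    for(int i=0; i < inputs.length; i++)
      if(pat.matcher(inputs[i]).matches()) count++;
    return count;
  }

  public static void main(String args[]) {
    String regex = "[a-z]+";

    // One-time match through Pattern's static method.
    System.out.println(Pattern.matches(regex, "test"));

    // One-time match through String; the period prevents a full match.
    System.out.println("test.".matches(regex));

    // Compile once when the same pattern is applied repeatedly.
    Pattern pat = Pattern.compile(regex);
    String inputs[] = { "alpha", "beta9", "gamma" };
    System.out.println(countMatching(pat, inputs));
  }
}
```

The program displays true, false, and 2. Like the matches( ) method defined by Matcher, both shortcuts require the entire sequence to match, which is why "test." is rejected.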


Exploring Regular Expressions

The overview of regular expressions presented in this section only hints at their power. Since text parsing, manipulation, and tokenization are a large part of programming, you will likely find Java’s regular expression subsystem a powerful tool that you can use to your advantage. It is, therefore, wise to explore the capabilities of regular expressions. Experiment with several different types of patterns and input sequences. Once you understand how regular expression pattern matching works, you will find it useful in many of your programming endeavors.