3 Migrating Servlets

This chapter discusses key servlet features and APIs, JBoss support for servlet APIs and its extensions to standards, and OC4J support for servlet APIs.

3.1 Overview of the Java Servlet API

A servlet is an instance of a Java class that runs in a web container (also called a servlet engine). Servlets are used for generating dynamic web pages. They receive and respond to requests from web clients, usually over HTTP.

Servlets have several advantages over traditional CGI programming:

A servlet does not run in a separate process. This removes the overhead of creating a new process for each request.
A servlet stays in memory between requests. A CGI program (and probably also an extensive runtime system or interpreter) needs to be loaded and started for each CGI request.
There is only a single instance which answers all requests concurrently. This saves memory and allows a servlet to easily manage persistent data.
A servlet can be run by a servlet engine in a restrictive sandbox (similar to how an applet runs in a web browser's sandbox), which allows for secure use of servlets.
Servlets are scalable, providing support for a multi-application server configuration. Servlets also enable data caching, database access, and data sharing with other servlets, JSP files and (in some environments) Enterprise JavaBeans.

The servlet API is specified in two Java extension packages: javax.servlet and javax.servlet.http. Every servlet implements the javax.servlet.Servlet interface. Most servlets, however, extend one of the standard implementations of that interface, namely javax.servlet.GenericServlet and javax.servlet.http.HttpServlet. The classes and interfaces in javax.servlet are protocol independent, while javax.servlet.http contains classes specific to HTTP.

The servlet API provides support in four categories:

Servlet life cycle management
Access to servlet context
Utility classes
HTTP-specific support classes

Table 3-1 identifies the servlet API classes according to the purpose they serve.

Table 3-1 Servlet API Classes

Purpose	Class or Interface
Servlet implementation	`javax.servlet.Servlet` `javax.servlet.SingleThreadModel` `javax.servlet.GenericServlet` `javax.servlet.http.HttpServlet`
Servlet configuration	`javax.servlet.ServletConfig`
Servlet exceptions	`javax.servlet.ServletException` `javax.servlet.UnavailableException`
Request/response	`javax.servlet.ServletRequest` `javax.servlet.ServletResponse` `javax.servlet.ServletInputStream` `javax.servlet.ServletOutputStream` `javax.servlet.http.HttpServletRequest` `javax.servlet.http.HttpServletResponse`
Session tracking	`javax.servlet.http.HttpSession` `javax.servlet.http.HttpSessionBindingListener` `javax.servlet.http.HttpSessionBindingEvent` `javax.servlet.http.Cookie`
Servlet context	`javax.servlet.ServletContext`
Servlet collaboration	`javax.servlet.RequestDispatcher`

3.1.1 Servlet Lifecycle

Servlets run on the web server platform as part of the same process as the web server itself. The web server is responsible for initializing, invoking, and destroying each servlet instance. A web server communicates with a servlet through a simple interface, javax.servlet.Servlet.

This interface consists of three main methods:

init()
service()
destroy()

and two ancillary methods:

getServletConfig()
getServletInfo()

3.1.1.1 The `init()` Method

When a servlet is first loaded, its init() method is invoked, and begins initial processing such as opening files or establishing connections to servers. If a servlet has been permanently installed in a server, it is loaded when the server starts.

Otherwise, the server activates a servlet when it receives the first client request for the services provided by the servlet. The init() method is guaranteed to finish before any other calls are made to the servlet, such as a call to the service() method. The init() method is called only once; it is not called again unless the servlet is reloaded by the server.

The init() method takes one argument, a reference to a ServletConfig object, which provides initialization arguments for the servlet. This object has a method getServletContext() that returns a ServletContext object, which contains information about the servlet's environment.
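The load-on-first-request and call-once behavior described above can be modeled without a container. The following is a minimal sketch, not the real javax.servlet API: `MiniServlet`, `HelloServlet`, and `MiniContainer` are hypothetical names, and a plain `Map` stands in for the ServletConfig initialization arguments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for javax.servlet.Servlet.
interface MiniServlet {
    void init(Map<String, String> initParams);
    String service(String request);
    void destroy();
}

class HelloServlet implements MiniServlet {
    final List<String> events = new ArrayList<>(); // records lifecycle calls
    private String greeting;

    public void init(Map<String, String> initParams) {
        // One-time setup: read an initialization argument.
        greeting = initParams.getOrDefault("greeting", "Hello");
        events.add("init");
    }

    public String service(String request) {
        events.add("service");
        return greeting + ", " + request;
    }

    public void destroy() {
        events.add("destroy");
    }
}

// Hypothetical container: activates the servlet on the first request
// and guarantees init() completes before any service() call.
class MiniContainer {
    private final MiniServlet servlet;
    private final Map<String, String> initParams;
    private boolean initialized = false;

    MiniContainer(MiniServlet servlet, Map<String, String> initParams) {
        this.servlet = servlet;
        this.initParams = initParams;
    }

    synchronized String handle(String request) {
        if (!initialized) {
            servlet.init(initParams);
            initialized = true;
        }
        return servlet.service(request);
    }

    synchronized void shutdown() {
        if (initialized) {
            servlet.destroy();
            initialized = false;
        }
    }
}
```

A real container does not serialize every request the way the synchronized `handle()` method does here; the sketch only illustrates the ordering of the lifecycle calls.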

3.1.1.2 The `service()` Method

The service() method is the heart of the servlet. Each request from a client results in a single call to the servlet's service() method. The service() method reads the request and produces the response from its two parameters:

A ServletRequest object with data from the client. The data consists of name/value pairs of parameters and an InputStream. Several methods are provided that return the client's parameter information. The InputStream from the client can be obtained via the getInputStream() method. This method returns a ServletInputStream, which can be used to get additional data from the client. If you are interested in processing character-level data instead of byte-level data, you can get a BufferedReader instead with getReader().
A ServletResponse represents the servlet's reply back to the client. When preparing a response, the method setContentType() is called first to set the MIME type of the reply. Next, the method getOutputStream() or getWriter() can be used to obtain a ServletOutputStream or PrintWriter, respectively, to send data back to the client.

There are two ways for a client to send information to a servlet. The first is to send parameter values and the second is to send information via the InputStream (or Reader). Parameter values can be embedded into a URL. The service() method's job is simple--it creates a response for each client request sent to it from the host server. However, note that there can be multiple service requests being processed simultaneously. If a service method requires any outside resources, such as files, databases, or some external data, resource access must be thread-safe.
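Because several requests may be inside the service method at once, any state shared between them needs thread-safe access. The sketch below is an illustration in plain Java rather than servlet code: `HitCounter` is a hypothetical class whose `service()` method stands in for a servlet's service() method, and a thread pool plays the role of the container's request threads.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class HitCounter {
    // AtomicInteger makes the increment safe without explicit locking.
    private final AtomicInteger hits = new AtomicInteger();

    // Stands in for service(): may be called by many threads at once.
    int service() {
        return hits.incrementAndGet();
    }

    int total() {
        return hits.get();
    }

    // Simulates concurrent requests and returns the final count.
    static int simulate(int threads, int requestsPerThread) {
        HitCounter servlet = new HitCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < requestsPerThread; i++) {
                    servlet.service();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("requests did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return servlet.total();
    }
}
```

With a plain `int` field and `hits++` in place of the AtomicInteger, concurrent increments could be lost and the final count could fall short of the number of requests.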

3.1.1.3 The `destroy()` Method

The destroy() method is called to allow the servlet to clean up any resources (such as open files or database connections) before the servlet is unloaded. If no clean-up operations are required, this can be an empty method.

The server waits to call the destroy() method until either all service calls are complete, or a certain amount of time has passed. This means that the destroy() method can be called while some longer-running service() methods are still running. It is important that you write your destroy() method to avoid closing any necessary resources until all service() calls have completed.
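One common way to meet this requirement is to count the service calls in progress and have the cleanup code wait for that count to reach zero. The following sketch assumes that approach. `GracefulServlet` is a hypothetical plain-Java class, its `destroy(long)` takes a timeout argument that the real destroy() method does not have, and a boolean flag stands in for an open file or database connection.

```java
class GracefulServlet {
    private int activeCalls = 0;           // guarded by this
    private boolean shuttingDown = false;  // guarded by this
    private boolean resourceOpen = true;   // stands in for a file or connection

    String service(String request) {
        synchronized (this) {
            if (shuttingDown) {
                return "unavailable";      // refuse new work during shutdown
            }
            activeCalls++;
        }
        try {
            return "handled " + request;   // the resource would be used here
        } finally {
            synchronized (this) {
                activeCalls--;
                notifyAll();               // wake a waiting destroy()
            }
        }
    }

    // Waits (up to the timeout) for in-flight calls, then releases the resource.
    synchronized void destroy(long timeoutMillis) {
        shuttingDown = true;
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            while (activeCalls > 0) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    break;                 // give up waiting after the timeout
                }
                wait(remaining);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        resourceOpen = false;
    }

    synchronized boolean isResourceOpen() {
        return resourceOpen;
    }
}
```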

3.1.2 Session Tracking

HTTP is a stateless protocol, which means that every time a client requests a resource, the protocol opens a separate connection to the server, and the server doesn't preserve the context from one connection to another; each transaction is isolated. However, most web applications aren't stateless. Robust Web applications need to interact with the user, remember the nature of the user's requests, and make data collected about the user in one request available to the next request from the same user. A classic example is the shopping cart application in internet commerce. The Servlet API provides techniques for identifying a session and associating data with it, even over multiple connections. These techniques include the following:

Cookies
URL rewriting
Hidden form fields

To eliminate the need for manually managing the session information within application code (regardless of the technique used), you use the HttpSession interface of the Java Servlet API. The HttpSession interface allows servlets to:

View and manage information about a session
Preserve information across multiple user connections, including multiple page requests

3.1.2.1 Cookies

Cookies are probably the most common approach for session tracking. A cookie stores information about a session as a small piece of human-readable text on the client's machine. Subsequent requests can read the cookie to extract information. The server associates a session ID from the cookie with the data from that session. This becomes more complicated when there are multiple cookies involved, when a decision must be made about when to expire the cookie, and when many unique session identifiers are needed. Also, a cookie is typically limited to 4 KB, and a browser may store as few as 20 cookies per domain. Cookies pose some privacy concerns for users. Some people don't like that a program can store and retrieve information from their local disk, so they disable cookies or delete them altogether. Therefore, cookies are not dependable as a sole mechanism for session tracking.
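To make the session ID lookup concrete, the sketch below extracts an identifier from a cookie header using the JDK's java.net.HttpCookie class rather than javax.servlet.http.Cookie, so it runs outside a container. `CookieSketch` and the sample header values are assumptions made for illustration; JSESSIONID is the cookie name servlet containers conventionally use for session tracking.

```java
import java.net.HttpCookie;
import java.util.List;

class CookieSketch {
    // Returns the value of the named cookie in a Set-Cookie header line,
    // or null if the header does not carry that cookie.
    static String sessionIdFrom(String setCookieHeader, String cookieName) {
        List<HttpCookie> cookies = HttpCookie.parse(setCookieHeader);
        for (HttpCookie cookie : cookies) {
            if (cookie.getName().equals(cookieName)) {
                return cookie.getValue();
            }
        }
        return null;
    }
}
```

For the header `Set-Cookie: JSESSIONID=abc123; Path=/demo`, the call `CookieSketch.sessionIdFrom(header, "JSESSIONID")` yields `abc123`. Inside a servlet, the cookies sent by the client are available from HttpServletRequest.getCookies().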

3.1.2.2 URL Rewriting

The URL rewriting technique works by appending data to the end of each URL that identifies a session. The server associates the identifier with data it has stored about the session. The URL is constructed using an HTTP GET, and may include a query string containing pairs of parameters and values. For example:

http://www.server.com/getPreferences?uid=username&bgcolor=red&fgcolor=blue
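The name/value pairs in such a URL can be recovered by splitting the query string. The following is a minimal sketch using only JDK classes; `QueryStringSketch` is a hypothetical helper, and it keeps only the last value when a parameter name repeats. Inside a servlet, the container does this parsing and exposes the result through ServletRequest.getParameter().

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryStringSketch {
    // Splits the query string of a URL into decoded name/value pairs,
    // preserving the order in which they appear.
    static Map<String, String> parameters(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(url).getRawQuery();
        if (query == null) {
            return params;                 // URL has no query string
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name), decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }
}
```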

3.1.2.3 Hidden Form Fields in HTML

Hidden form fields are another way to store information about the session. The hidden data can be retrieved later by using the HttpServletRequest object. When a form is submitted, the data is included in the GET or POST. A note of caution, though: hidden form fields can be used only on dynamically generated pages, so their use is limited. They are also a security risk, because anyone can view the HTML source to see the stored data.
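As an illustration of why the page must be generated dynamically, the sketch below builds the markup for a hidden field that carries a session value. `HiddenFieldSketch` and the field name `sessionId` are hypothetical; the escaping step is included because the value ends up inside an HTML attribute.

```java
class HiddenFieldSketch {
    // Escapes the characters that are significant inside HTML attributes.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char ch : s.toCharArray()) {
            switch (ch) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(ch);
            }
        }
        return out.toString();
    }

    // Builds an <input type="hidden"> element for the given name and value.
    static String hiddenField(String name, String value) {
        return "<input type=\"hidden\" name=\"" + escape(name)
                + "\" value=\"" + escape(value) + "\">";
    }
}
```

When the form is submitted, a servlet could read the value back with a call such as request.getParameter("sessionId").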

3.1.3 The `HttpSession` Object

No matter the technique(s) used to collect session data, it must be stored somewhere. The HttpSession object can be used to store the session data from a servlet and associate it with a user.

The basic steps for using the HttpSession object are:

Obtain a session object
Read or write to it
Terminate the session by expiring it, or allowing it to expire on its own

A session persists for a certain time period, up to forever, depending on the value set in the servlet. A unique session ID is used to track multiple requests from the same client to the server. Persistence is valid within the context of the Web application, which may encompass multiple servlets. A servlet can access an object stored by another servlet; the object is distinguished by name and is considered bound to the session. These objects (called attributes when set and get methods are performed on them) are available to other servlets within the scope of a request, a session, or an application.
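The three steps above (obtain, read or write, terminate) can be modeled in plain Java. This is a simplified sketch, not the HttpSession implementation: `MiniSession` and `MiniSessionManager` are hypothetical classes, and session timeouts are left out. In a real servlet the corresponding calls are HttpServletRequest.getSession(), HttpSession.setAttribute() and getAttribute(), and HttpSession.invalidate().

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for HttpSession: a unique ID plus named attributes.
class MiniSession {
    final String id = UUID.randomUUID().toString();
    private final Map<String, Object> attributes = new ConcurrentHashMap<>();

    void setAttribute(String name, Object value) {
        attributes.put(name, value);
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }
}

// Hypothetical stand-in for the container's session bookkeeping.
class MiniSessionManager {
    private final Map<String, MiniSession> sessions = new ConcurrentHashMap<>();

    // Returns the session with the given ID, or creates a new session
    // when the ID is null or no longer known.
    MiniSession getSession(String id) {
        MiniSession session = (id == null) ? null : sessions.get(id);
        if (session == null) {
            session = new MiniSession();
            sessions.put(session.id, session);
        }
        return session;
    }

    // Terminates the session; its attributes become unreachable.
    void invalidate(String id) {
        sessions.remove(id);
    }
}
```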

Servlets can also maintain state between requests, which is cumbersome to implement in traditional CGI and many CGI alternatives. Only a single instance of the servlet is created, and each request simply results in a new thread calling the servlet's service method (which calls doGet or doPost). Shared data can therefore be placed in a regular instance variable (field) of the servlet. For example, a servlet that performs long-running calculations can access the appropriate ongoing calculation when the browser reloads the page, and can keep a list of the N most recently requested results, returning them immediately if a new request specifies the same parameters as a recent one. Of course, the normal rules that require authors to synchronize multithreaded access to shared data still apply to servlets.
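A list of the N most recent results held in an instance variable could look like the following sketch. `RecentResults` is a hypothetical helper class, and the choice of an access-ordered LinkedHashMap with synchronized methods is an assumption, one of several ways to satisfy the synchronization rule mentioned above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RecentResults {
    private final Map<String, String> cache;

    RecentResults(final int maxEntries) {
        // Access-ordered map that evicts the least recently used entry
        // once it holds more than maxEntries results.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // All access is synchronized because request threads share the instance.
    synchronized String get(String parameters) {
        return cache.get(parameters);
    }

    synchronized void put(String parameters, String result) {
        cache.put(parameters, result);
    }

    synchronized int size() {
        return cache.size();
    }
}
```

A servlet would hold one `RecentResults` instance in a field, check `get()` at the start of each request, and call `put()` after computing a new result.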

Servlets can also store persistent data in the ServletContext object, available through the getServletContext method. ServletContext has setAttribute and getAttribute methods that enable storage of arbitrary data associated with specified keys. The difference between storing data in instance variables and storing it in the ServletContext is that the ServletContext is shared by all servlets in the servlet engine or in the Web application.

3.1.4 J2EE Web Applications

A Web application, as defined in the servlet specification, is a collection of servlets, JavaServer Pages (JSPs), Java utility classes and libraries, static documents such as HTML pages and images, client-side applets, beans, and classes, and other Web resources that are set up in such a way as to be portably deployed across any servlet-enabled Web server. A Web application can be contained in its entirety within a single archive file and deployed by placing the file into a specific directory.

3.1.4.1 Web Application Archive (WAR)

Web application archive files have the extension .war. WAR files are .jar files (created using the jar utility) saved with an alternate extension. The JAR format allows JAR files to be stored in compressed form and have their contents digitally signed. The .war file extension was chosen over .jar to distinguish them for certain operations. An example of a WAR file listing is shown below:

index.html 
howto.jsp 
feedback.jsp 
images/banner.gif 
images/jumping.gif 
WEB-INF/web.xml 
WEB-INF/lib/jspbean.jar 
WEB-INF/classes/MyServlet.class 
WEB-INF/classes/com/mycorp/frontend/CorpServlet.class 
WEB-INF/classes/com/mycorp/frontend/SupportClass.class
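Because a WAR file is a JAR file with a different extension, the JDK's java.util.jar classes can create and read one. The sketch below builds a small archive in memory and lists its entries. `WarSketch` is a hypothetical helper, and the entries are empty placeholders rather than real page or class content.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

class WarSketch {
    // Builds an in-memory archive containing an empty entry for each path.
    static byte[] build(String... paths) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes)) {
            for (String path : paths) {
                jar.putNextEntry(new JarEntry(path));
                jar.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Lists the entry names of an archive in the order they were stored.
    static List<String> list(byte[] war) {
        List<String> names = new ArrayList<>();
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(war))) {
            JarEntry entry;
            while ((entry = jar.getNextJarEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```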

On install, a WAR file can be mapped to any URI prefix path on the server. The WAR file then handles all requests beginning with that prefix. For example, if the WAR file above were installed under the prefix /demo, the server would use it to handle all requests beginning with /demo. A request for /demo/index.html would serve the index.html file from the WAR file. A request for /demo/howto.jsp or /demo/images/banner.gif would also serve content from the WAR file.

3.1.4.2 About the `WEB-INF` Directory

The WEB-INF directory is special. The files in it are not served directly to the client; instead, they contain Java classes and configuration information for the Web application. The directory behaves like a JAR file's META-INF directory; it contains meta-information about the archive contents. The WEB-INF/classes directory contains the class files for the Web application's servlets and supporting classes. WEB-INF/lib contains classes stored in JAR files. For convenience, web server class loaders automatically look to WEB-INF/classes and WEB-INF/lib for their classes—no extra install steps are necessary.

The servlets under WEB-INF in the example Web application listing can be invoked using URIs like /demo/servlet/MyServlet and /demo/servlet/com.mycorp.frontend.CorpServlet.

Note that every request for this application begins with /demo, even requests for servlets.
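The prefix mapping and the rule that files under WEB-INF are never served directly can be sketched as a small path-resolution function. `WarPathSketch` is a hypothetical helper; a real container also normalizes the path (for example, resolving `..` segments) and applies servlet mappings, which this sketch leaves out.

```java
import java.util.Locale;

class WarPathSketch {
    // Returns the path inside the WAR for a request URI, or null when the
    // request falls outside the prefix or targets the WEB-INF directory.
    static String resolve(String prefix, String requestUri) {
        boolean underPrefix = requestUri.equals(prefix)
                || requestUri.startsWith(prefix + "/");
        if (!underPrefix) {
            return null;                       // handled by another application
        }
        String path = requestUri.substring(prefix.length());
        if (path.startsWith("/")) {
            path = path.substring(1);
        }
        String upper = path.toUpperCase(Locale.ROOT);
        if (upper.equals("WEB-INF") || upper.startsWith("WEB-INF/")) {
            return null;                       // never served to clients
        }
        return path;
    }
}
```

With the example listing installed under /demo, `resolve("/demo", "/demo/images/banner.gif")` yields `images/banner.gif`, while a request for /demo/WEB-INF/web.xml yields null.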

The web.xml file in the WEB-INF directory is the deployment descriptor for a Web application. This file contains configuration information about the Web application in which it resides and is used to register your servlets, define servlet initialization parameters, register JSP tag libraries, define security constraints, and set other Web application parameters.

3.1.5 Differences between Servlet 2.0, 2.1 and 2.2

The Servlet API in the J2EE specification is continuously evolving. In a span of two years, Servlet API 2.0, 2.1, and 2.2 have been published; the most recent version as of this writing is Servlet API 2.3. The fundamental architecture of servlets has not changed much, so most of the API is still relevant. However, there are enhancements and some new functionality, and some APIs have been deprecated.

This section covers the major differences between the Servlet API 2.0, 2.1, and 2.2 specifications and the 2.3 draft specification.

3.1.5.1 Highlights of the Java Servlet API 2.1

The Servlet 2.1 API highlights include:

A request dispatcher wrapper for each resource (servlet)

A request dispatcher is a wrapper for resources that can process HTTP requests (such as servlets and JSPs) and files related to those resources (such as static HTML and GIFs). The servlet engine generates a single request dispatcher for each servlet or JSP when it is instantiated. The request dispatcher receives client requests and dispatches the request to the resource.
A servlet context for each application

In Servlet API 2.0, the servlet engine generated a single servlet context that was shared by all servlets. The Servlet API 2.1 provides a single servlet context per application, which facilitates partitioning applications. As explained in the description of the application programming model, applications on the same virtual host can access each other's servlet context.
Deprecated HTTP session context

The Servlet API 2.0 HttpSessionContext interface grouped all of the sessions for a Web server into a single session context. Using the session context interface methods, a servlet could get a list of the session IDs for the session context and get the session associated with an ID. As a security safeguard, this interface has been deprecated in the Servlet API 2.1. The interface methods have been redefined to return null.

3.1.5.2 New Features in the Java Servlet API 2.2

The Servlet API 2.2 specification changed the term 'servlet engine', replacing it with 'servlet container'. This change reflects the fact that the Java Servlet API is now a required API of the Java 2 Platform, Enterprise Edition (J2EE) specification and that, throughout J2EE's terminology, container is preferred over engine. Servlet API 2.2 introduced the following new features:

Web Applications (as discussed above)
References to external data sources, such as JNDI. Enables adding resources into the JNDI lookup table, such as database connections. Allows the resources to be located by servlets using a simple name lookup.
Parameter information for the application (initialization parameters for the application).
Registered servlet names. Provides a place to register servlets and give them names. Previously, each server had a different process for registering servlets, making deployment difficult.
Servlet initialization parameters. Enables passing parameters to servlets at initialization time. This is a new, standard way to accomplish what used to be a server-dependent process.
Servlet load order. Specifies which servlets are preloaded, and in what order.
Security constraints. Dictates which pages must be protected, and by what mechanism. Includes built-in form-based authentication.

3.1.5.3 Servlet API 2.3

The Servlet API 2.3 leaves the core of servlets relatively untouched. Additions and changes include:

JDK 1.2 or later is required
A filter mechanism has been created
Application lifecycle events have been added
Additional internationalization support has been added
The technique to express inter-JAR dependencies has been formalized
Rules for class loading have been clarified
Error and security attributes have been added
The HttpUtils class has been deprecated
Several DTD behaviors have been expanded and clarified

3.1.5.4 Filters and Servlet Chaining

Filtering support is provided as a part of the Servlet 2.3 API. JBoss 3.2.6 achieves similar filtering functionality with a JBoss-specific package. OC4J supports the Java servlet 2.3 filtering specification.

Filtering is a method of loading and invoking servlets in a web server. Both local and remote servlets can be part of a servlet chain (defined below). There are restrictions, however, on chaining the local internal servlets, and these restrictions are specific to the J2EE container used. For example, in JBoss, if an internal servlet is used in a chain, it must be the first servlet in the chain. Internal servlets include: file servlet, pageCompile servlet, ssInclude servlet, and template servlet.

3.1.5.5 Servlet Chains

For some requests, a chain of ordered servlets can be invoked rather than just one servlet. The input from the browser is sent to the first servlet in the chain and the output from the last servlet in the chain is the response sent back to the browser. Each servlet in the chain receives inputs from, and transmits outputs to, the servlet before and after it, respectively. A chain of servlets can be triggered for an incoming request by using:

Servlet aliasing to indicate a chain of servlets for a request
MIME types to trigger the next servlet in the chain
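The data flow through a chain, where each servlet consumes the output of the one before it, can be modeled as an ordered list of transformations. The following is a conceptual sketch only: `ChainSketch` is a hypothetical class, and each stage is a plain function on strings rather than a servlet.

```java
import java.util.List;
import java.util.function.UnaryOperator;

class ChainSketch {
    // Passes the input through each stage in order; the output of the
    // last stage is the response returned to the caller.
    static String run(String input, List<UnaryOperator<String>> chain) {
        String data = input;
        for (UnaryOperator<String> stage : chain) {
            data = stage.apply(data);
        }
        return data;
    }
}
```

For example, a chain of three stages that trims whitespace, converts the text to upper case, and wraps it in a paragraph tag would turn the input `"  hello "` into `<p>HELLO</p>`.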

3.1.6 JBoss Servlet API Support

JBoss 3.2.6 supports the Java Servlet API 2.2 specification through the Tomcat servlet container. Third-party servlet containers can be integrated with JBoss by mapping web-app.xml JNDI information into the JBoss JNDI namespace using an optional JBoss-app.xml descriptor and using the JBoss security layer for security.

3.1.7 Oracle Application Server Servlet API Support

Oracle Application Server OC4J is a fully compliant implementation of the Java Servlets 2.2 and 2.3 specifications. As such, standard Java Servlets 2.2 code will work correctly.