Java Enterprise Best Practices, by The O'Reilly Java Authors, edited by Robert Eckstein. December 2002, 0-596-00384-6, 288 pages
Chapter 3
Servlet Best Practices
By Jason Hunter
Since their introduction in 1996, servlets have dominated the server-side Java landscape and have become the standard way to interface Java to the Web. They are the foundation technology on which Java developers build web applications and, increasingly, web services. This chapter discusses best practices for servlet-based development and deployment.
Working Effectively with Servlets
We start with a look at servlet frameworks. Frameworks (e.g., Apache Struts) are becoming increasingly popular because they increase programmer efficiency by providing a skeleton on which applications can be built. In the first section, we examine what servlet frameworks offer, and I give a quick overview of the most popular frameworks. After that, we jump from the high level to the low level with a discussion on how using pre-encoded characters can optimize your servlet's performance. Next, we tackle the thorny issue of loading configuration files and provide some code to make the task easier, and after that I give some tips on when you should (or should not) use the HttpSession and SingleThreadModel features. As we near the end of the chapter, I explain how to reliably control caching to improve the user's experience. Then I address the frequently asked question: "How do I download a file to the client so that the client sees a 'Save As' pop-up?" As you'll see, the answer lies in setting the right HTTP headers.
Choose the Right Servlet Framework
When writing web applications, it's good to remember that servlets are an enabling technology. This is easy to forget because in the early days, the Servlet API was all we had for server-side Java web programming. If the Servlet API didn't include something, we had to build it ourselves. It was a little like the Old West, where times were tough and real programmers wrote servlets by hand. Specs weren't written yet. Heck, we felt lucky just to have out.println( ).
These days, times have changed. The crowds have come, and with them we see a multitude of servlet-based technologies designed to make web application development easier and more effective. The first area of innovation has been happening at the presentation layer. Technologies such as JavaServer Pages (JSP), WebMacro, and Velocity give us more productive alternatives to the vast fields of out.println( ) that came before. These technologies make it easier than ever before to quickly develop, deploy, and maintain dynamic web content. You can find a full discussion of these and other templating technologies in my book Java Servlet Programming, Second Edition (O'Reilly).
Today, we're seeing a new area of innovation happening below the presentation layer, at the framework level (see Figure 3-1). These new frameworks provide a solid scaffolding against which new web applications can be built, moving from building pages quickly to building full applications quickly. Frameworks take the best designs of the experts and make them available to you for reuse. Good frameworks help improve your application's modularization and maintainability. Frameworks also bring together disparate technologies into a single bundled package and provide components that build on these technologies to solve common tasks. If you choose the right servlet framework, you can greatly enhance your productivity and leverage the work of the crowds. Consequently, I advise you to consider using a framework and provide some helpful tips in this section on selecting the right framework.
Figure 3-1. Servlets, template technologies, and frameworks
Tips for selecting a framework
When choosing a servlet framework, it's important that you consider its feature list. Here are some of the features that frameworks provide. Not all frameworks support all these features, nor should this short list be considered exhaustive.[1]
- Integration with a template language
- Some frameworks integrate with a specific template language. Others have a pluggable model to support many templates, although they're often optimized for one template language. If you prefer a particular template language, make sure the framework supports it well.
- Support (ideally, enforcement) of designer/developer separation
- One of the common goals in web application development is to effectively separate the developer's duties from the designer's. Choosing the right template language helps here, but the framework choice can have even more impact. Some enable the separation of concerns; some enforce it.
- Security integration
- The default servlet access control and security model works for simple tasks but isn't extensible for more advanced needs. Some frameworks provide alternative security models, and many support pluggable security models. If you've ever wanted more advanced security control, the right framework can help.
- Form validation
- Frameworks commonly provide tools to validate form data, allowing the framework to sanity-check parameters before the servlet even sees the data, for example. Some allow for easy development of "form wizards" with Next/Previous buttons and maintained state.
- Error handling
- Some frameworks include advanced or custom error handling, such as sending alert emails, logging errors to a special data store, or autoformatting errors to the user and/or administrator.
- Persistence/database integration
- One of the most powerful features of frameworks can be their close and elegant integration with back-end data stores such as databases. These frameworks let the user think in objects rather than in SQL.
- Internationalization
- Internationalization (i18n) is always a challenge, but some frameworks have features and idioms that simplify the process.
- IDE integration
- Some frameworks provide integrated development environments (IDEs) for development and/or have features that plug in to third-party IDEs.
- Mechanisms to support web services
- With the growing interest in web services, it's common to see new frameworks centered around web services, or existing frameworks touting new web services features.
Beyond features, a second important criterion when examining a framework is its license. My advice is to stick with open source projects or standards implemented by multiple vendors. This protects your investment. Both open source and common standards avoid the single-vendor problem and ensure that no one entity can terminate support for the framework on which your application depends.
A third consideration is for whom the framework is targeted (e.g., news sites, portal sites, commerce sites, etc.). Different sites have different needs, and frameworks tend to be optimized toward a certain market segment. You might find it useful to investigate which frameworks are used by others implementing similar applications.
High-profile frameworks
While it would be wonderful to do a full comparison between frameworks here, that's not what this book is about. What we can do instead is briefly discuss the four most popular servlet frameworks available today: Java 2 Enterprise Edition (J2EE) BluePrints, Apache Struts, JavaServer Faces, and Apache Turbine.
You might be thinking to yourself, "Just skip the summary and tell me which is best!" Unfortunately, there's no all-encompassing answer; it depends entirely on your application and personal taste. This is one place where working with server-side Java follows Perl's slogan: "There's more than one way to do it."
- J2EE BluePrints
- J2EE BluePrints (http://java.sun.com/blueprints/enterprise) is more accurately described as a guidebook than a framework. Authored by Sun engineers, the book provides guidelines, patterns, and code samples showing how best to use J2EE and the constituent technologies. For example, the book shows how to implement a Model-View-Controller (MVC) framework that encapsulates back-end web operations into three parts: a model representing the central data, the view handling the display of the data, and the controller handling the alteration of the data. To support this MVC model, BluePrints suggests using an Action class in the style of the "Command" pattern:
The sample application defines an abstract class Action, which represents a single application model operation. A controller can look up concrete Action subclasses by name and delegate requests to them.
- The book gives code samples for how to implement an Action but doesn't provide any production-quality support code. For production code, the J2EE BluePrints book points readers to Apache Struts.
- Apache Struts
- Apache Struts (http://jakarta.apache.org/struts) might very well be the most popular servlet framework. It follows very closely the MVC pattern discussed in BluePrints (from what I can tell, the ideas have flowed in both directions):
Struts is highly configurable, and has a large (and growing) feature list, including a Front Controller, action classes and mappings, utility classes for XML, automatic population of server-side JavaBeans, Web forms with validation, and some internationalization support. It also includes a set of custom tags for accessing server-side state, creating HTML, performing presentation logic, and templating. Some vendors have begun to adopt and evangelize Struts. Struts has a great deal of mindshare, and can be considered an industrial-strength framework suitable for large applications.
- In Struts, requests are routed through a controller servlet. Action objects control request handling, and these actions use components such as JavaBeans to perform business logic. Struts elegantly creates a full dispatch mechanism on top of servlets with an external configuration, eliminating the artificial tie between URLs and online activities. Nearly all requests come in through the same servlet; each client request indicates the action it would like to take (e.g., login, add to cart, checkout), and the Struts controller dispatches the request to an Action for processing. JSP is used as the presentation layer, although Struts also works with Apache Velocity and other technologies. Struts is an open source project and was developed under the Apache model of open, collaborative development.
- JavaServer Faces
- JavaServer Faces (JSF) is a Sun-led Java Community Process effort (JSR-127) still in the early development stage. It's just reaching the first stage of Community Review as of this writing, but already it's gaining terrific mindshare. The JSF proposal document contains plans to define a standard web application framework, but the delivery appears to focus on the more limited goal of defining a request-processing lifecycle for requests that include a number of phases (e.g., a form wizard). It's a JSF goal to integrate well with Struts.
- Apache Turbine
- Apache Turbine might be one of the oldest servlet frameworks, having been around since 1999. It has services to handle parameter parsing and validation, connection pooling, job scheduling, caching, database abstractions, and even XML-RPC. Many of its components can be used on their own, such as the Torque tool for database abstraction. Turbine bundles them together, providing a solid platform for building web applications the same way J2EE works for enterprise applications.
- Turbine, like the other frameworks, is based on the MVC model and action event abstraction. However, unlike the rest, Turbine provides extra support at the View layer and has dubbed itself "Model 2+1" as a play on being better than the standard "Model 2" MVC. Turbine Views support many template engines, although Apache Velocity is preferred.
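The Action-style dispatch described above for BluePrints and Struts can be illustrated with a minimal sketch in plain Java. The names Action, ActionController, and the view strings are hypothetical and chosen only for illustration; they do not mirror the actual Struts or BluePrints classes, and a real framework would read the name-to-action mappings from external configuration rather than from register( ) calls.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of Command-pattern dispatch; not the Struts API.
interface Action {
    // Performs one application operation and returns the name of the view to render.
    String perform(Map<String, String> params);
}

final class ActionController {
    // Maps an action name (as sent by the client) to the object that handles it.
    private final Map<String, Action> actions = new HashMap<>();

    void register(String name, Action action) {
        actions.put(name, action);
    }

    // Looks up the action named by the request and delegates to it.
    String dispatch(String actionName, Map<String, String> params) {
        Action action = actions.get(actionName);
        if (action == null) {
            return "error/unknown-action"; // assumed fallback view name
        }
        return action.perform(params);
    }
}
```

Under these assumptions, a single controller servlet would extract the action name from each request and call dispatch( ), so URLs no longer need to map one-to-one onto servlets.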
We could discuss many more frameworks if only we had the space. If you're interested in learning more, Google away on these keywords: TeaServlet, Apache Cocoon, Enhydra Barracuda, JCorporate Expresso, and Japple.
Use Pre-Encoded Characters
One of the first things you learn when programming servlets is to use a PrintWriter for writing characters and an OutputStream for writing bytes. And while that's stylistically good advice, it's also a bit simplistic. Here's the full truth: just because you're outputting characters doesn't mean you should always use a PrintWriter!
A PrintWriter has a downside: specifically, it has to encode every character from a char to a byte sequence internally. When you have content that's already encoded--such as content in a file, URL, or database, or even in a String held in memory--it's often better to stick with streams. That way you can enable a straight byte-to-byte transfer. Except for those rare times when there's a charset mismatch between the stored encoding and the required encoding, there's no need to first decode the content into a String and then encode it again to bytes on the way to the client. Use the pre-encoded characters and you can save a lot of overhead.
To demonstrate, the servlet in Example 3-1 uses a reader to read from a text file and a writer to output text to the client. Although this follows the mantra of using Reader/Writer classes for text, it involves a wasteful, needless conversion.
Example 3-1: Chars in, chars out

    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;

    public class WastedConversions extends HttpServlet {

      // Random file, for demo purposes only
      String name = "content.txt";

      public void doGet(HttpServletRequest req, HttpServletResponse res)
                                   throws ServletException, IOException {
        String file = getServletContext().getRealPath(name);

        res.setContentType("text/plain");
        PrintWriter out = res.getWriter();

        returnFile(file, out);
      }

      // Decodes the file's bytes to chars, which the Writer then re-encodes.
      public static void returnFile(String filename, Writer out)
                                 throws FileNotFoundException, IOException {
        Reader in = null;
        try {
          in = new BufferedReader(new FileReader(filename));
          char[] buf = new char[4 * 1024];  // 4K char buffer
          int charsRead;
          while ((charsRead = in.read(buf)) != -1) {
            out.write(buf, 0, charsRead);
          }
        }
        finally {
          if (in != null) in.close();
        }
      }
    }

The servlet in Example 3-2 is more appropriate for returning a text file. This servlet recognizes that file content starts as bytes and can be sent directly as bytes, as long as the encoding matches what's expected by the client.
Example 3-2: Bytes in, bytes out

    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;

    public class NoConversions extends HttpServlet {

      String name = "content.txt";  // Demo file to send

      public void doGet(HttpServletRequest req, HttpServletResponse res)
                                   throws ServletException, IOException {
        String file = getServletContext().getRealPath(name);

        res.setContentType("text/plain");
        OutputStream out = res.getOutputStream();

        returnFile(file, out);
      }

      // Copies the file's bytes straight to the response with no decoding.
      public static void returnFile(String filename, OutputStream out)
                                 throws FileNotFoundException, IOException {
        InputStream in = null;
        try {
          in = new BufferedInputStream(new FileInputStream(filename));
          byte[] buf = new byte[4 * 1024];  // 4K buffer
          int bytesRead;
          while ((bytesRead = in.read(buf)) != -1) {
            out.write(buf, 0, bytesRead);
          }
        }
        finally {
          if (in != null) in.close();
        }
      }
    }

How much performance improvement you get by using pre-encoded characters depends on the server. Testing these two servlets against a 2 MB file accessed locally shows a 20% improvement under Tomcat 3.x. Tomcat 4.x shows a whopping 50% improvement. Although those numbers sound impressive, they of course assume that the application does nothing except transfer text files. Real-world numbers depend on the servlet's business logic. This technique (illustrated in Figure 3-2) is most helpful for applications that are bandwidth- or server CPU-bound.
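The charset-mismatch exception noted earlier in this section can be handled in one place. The following is a minimal sketch, using only JDK classes, that passes bytes through untouched when the stored and required encodings agree and falls back to decode-then-encode only when they differ. The class name EncodedCopy and its method signature are assumptions made for illustration; they do not come from the chapter or from the Servlet API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical helper: copy pre-encoded content, transcoding only when needed.
final class EncodedCopy {

    static void copy(InputStream in, Charset stored,
                     OutputStream out, Charset required) throws IOException {
        if (stored.equals(required)) {
            // Encodings match: straight byte-to-byte transfer, no conversion.
            byte[] buf = new byte[4 * 1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return;
        }
        // Encodings differ: decode with the stored charset, re-encode with the required one.
        Reader reader = new InputStreamReader(in, stored);
        Writer writer = new OutputStreamWriter(out, required);
        char[] buf = new char[4 * 1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            writer.write(buf, 0, n);
        }
        writer.flush(); // push any buffered encoded bytes through to out
    }
}
```

The caller is still responsible for declaring the required charset in the Content-Type header and for closing both streams.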
Figure 3-2. Taking advantage of pre-encoded characters
The principle "Use Pre-encoded Characters" applies whenever a large majority of your source content is pre-encoded, such as with content from files, URLs, and even databases. For example, using the ResultSet getAsciiStream( ) method instead of getCharacterStream( ) can avoid conversion overhead for ASCII strings--both when reading from the database and writing to the client. There's also the potential for cutting the bandwidth in half between the server and database because ASCII streams can be half the size of UCS-2 streams. How much benefit you actually see depends, of course, on the database and how it internally stores and transfers data.
In fact, some servlet developers preencode their static String contents with String.getBytes( ) so that they're encoded only once. Whether the performance gain justifies going to that extreme is a matter of taste. I advise it only when performance is a demonstrated problem without a simpler solution.
To mix bytes and characters on output is actually easier than it probably should be. Example 3-3 demonstrates how to mix output types using the ServletOutputStream and its combination write(byte[ ]) and println(String) methods.
Example 3-3: AsciiResult.java

    import java.io.*;
    import java.sql.*;
    import java.util.Date;
    import javax.servlet.*;
    import javax.servlet.http.*;

    public class AsciiResult extends HttpServlet {

      public void doGet(HttpServletRequest req, HttpServletResponse res)
                                   throws ServletException, IOException {
        res.setContentType("text/html");
        ServletOutputStream out = res.getOutputStream();

        // ServletOutputStream has println() methods for writing strings.
        // The println() call works only for single-byte character encodings.
        // If you need multibyte, make sure to set the charset in the Content-Type
        // and use, for example, out.write(str.getBytes("Shift_JIS")) for Japanese.
        out.println("Content current as of");
        out.println(new Date().toString());

        // Retrieve a database ResultSet here (elided in this listing; the
        // variable resultSet below is assumed to be set at this point).

        try {
          InputStream ascii = resultSet.getAsciiStream(1);
          returnStream(ascii, out);
        }
        catch (SQLException e) {
          throw new ServletException(e);
        }
      }

      // Copies the stream's bytes to the output without any char conversion.
      public static void returnStream(InputStream in, OutputStream out)
                                 throws FileNotFoundException, IOException {
        byte[] buf = new byte[4 * 1024];  // 4K buffer
        int bytesRead;
        while ((bytesRead = in.read(buf)) != -1) {
          out.write(buf, 0, bytesRead);
        }
      }
    }

Although mixing bytes with characters can provide a performance boost because the bytes are transferred directly, I recommend you use this technique sparingly because it can be confusing to readers and can be error-prone if you're not entirely familiar with how charsets work. If your character needs extend beyond ASCII, be sure you know what you're doing. Writing non-ASCII characters to an output stream should not be attempted by a novice.
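The practice of pre-encoding static String contents mentioned in this section might look like the following minimal sketch. The class name PreEncodedPage and the markup are hypothetical, and the sketch assumes the response is served as ISO-8859-1; with a different response charset the constants would have to be encoded differently.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration: static markup is encoded once, not on every request.
final class PreEncodedPage {

    // Encoded a single time when the class is loaded.
    // Assumption: the response charset is ISO-8859-1.
    private static final byte[] HEADER =
        "<html><body><p>".getBytes(StandardCharsets.ISO_8859_1);
    private static final byte[] FOOTER =
        "</p></body></html>".getBytes(StandardCharsets.ISO_8859_1);

    // Writes the pre-encoded static parts around the per-request text.
    static void write(OutputStream out, String dynamicText) throws IOException {
        out.write(HEADER);
        // Only the dynamic portion pays the encoding cost on each call.
        out.write(dynamicText.getBytes(StandardCharsets.ISO_8859_1));
        out.write(FOOTER);
    }
}
```

In a servlet, the OutputStream would be the one returned by res.getOutputStream( ). As the text above cautions, this is worth the extra care only when encoding cost has been measured as a real bottleneck.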
Load Configuration Files from the Classpath
From Servlet API 1.0 through Servlet API 2.3, servlets have distinctly lacked a standard mechanism to retrieve external configuration files. Although many server-side libraries require configuration files, servlets have no commonly accepted way to locate them. When a servlet runs under J2EE, it receives support for JNDI, which can provide a certain amount of configuration information. But the common web server configuration file problem remains.
The best solution (or perhaps I should call it the "lesser evil" solution) is to locate files with a search of the classpath and/or the resource path. This lets server admins place server-wide configuration files in the web server's classpath, or place per-application configuration files in WEB-INF/classes found in the resource path. It also works equally well for locating configuration files placed within WAR files and/or deployed across multiple back-end servlet containers. In fact, using files for configuration has several advantages, even when JNDI is available. The component provider can include a set of "sample" or "default" configuration files. One configuration file can be made to work across the entire server. And finally, configuration files are trivially easy to understand for both the developer and deployer.
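Before the fuller locator class, the core idea of a classpath search can be sketched in a few lines. This sketch is an assumption-laden simplification: it relies on the thread context class loader (which servlet containers typically set to the web application's loader, though that is not guaranteed everywhere), and the class name ClasspathConfig is hypothetical. It also gives no access to the file's directory or last modified time.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical helper: load a properties file from wherever the class loader finds it.
final class ClasspathConfig {

    static Properties load(String name) throws IOException {
        // Assumption: the context class loader can see WEB-INF/classes.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathConfig.class.getClassLoader();
        }
        InputStream in = (loader != null)
            ? loader.getResourceAsStream(name)
            : ClassLoader.getSystemResourceAsStream(name);
        if (in == null) {
            throw new IOException("Resource '" + name + "' not found on the classpath");
        }
        try {
            Properties props = new Properties();
            props.load(in);
            return props;
        } finally {
            in.close();
        }
    }
}
```

A call such as ClasspathConfig.load("library.properties") would then find the file whether it sits in a server-wide classpath directory, in WEB-INF/classes, or inside a WAR.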
Example 3-4 demonstrates the search technique with a class called Resource. Given a resource name, the Resource constructor searches the class path and resource path attempting to locate the resource. When the resource is found, it makes available the resource contents as well as its directory location and last modified time (if those are available). The last modified time helps an application know, for example, when to reload the configuration data. The class uses special code to convert file: URL resources to File objects. This proves handy because URLs, even file: URLs, often don't expose special features such as a modified time. By searching both the class path and the resource path this class can find server-wide resources and per-application resources. The source code for this class can also be downloaded from http://www.servlets.com.
Example 3-4: A standard Resource locator

    import java.io.*;
    import java.net.*;
    import java.util.*;

    /**
     * A class to locate resources, retrieve their contents, and determine their
     * last modified time. To find the resource the class searches the CLASSPATH
     * first, then Resource.class.getResource("/" + name). If the Resource finds
     * a "file:" URL, the file path will be treated as a file. Otherwise, the
     * path is treated as a URL and has limited last modified info.
     */
    public class Resource implements Serializable {

      private String name;
      private File file;
      private URL url;

      public Resource(String name) throws IOException {
        this.name = name;
        SecurityException exception = null;

        try {
          // Search using the CLASSPATH. If found, "file" is set and the call
          // returns true. A SecurityException might bubble up.
          if (tryClasspath(name)) {
            return;
          }
        }
        catch (SecurityException e) {
          exception = e;  // Save for later.
        }

        try {
          // Search using the classloader getResource(). If found as a file,
          // "file" is set; if found as a URL, "url" is set.
          if (tryLoader(name)) {
            return;
          }
        }
        catch (SecurityException e) {
          exception = e;  // Save for later.
        }

        // If you get here, something went wrong. Report the exception.
        String msg = "";
        if (exception != null) {
          msg = ": " + exception;
        }

        throw new IOException("Resource '" + name + "' could not be found in " +
          "the CLASSPATH (" + System.getProperty("java.class.path") +
          "), nor could it be located by the classloader responsible for the " +
          "web application (WEB-INF/classes)" + msg);
      }

      /**
       * Returns the resource name, as passed to the constructor
       */
      public String getName() {
        return name;
      }

      /**
       * Returns an input stream to read the resource contents
       */
      public InputStream getInputStream() throws IOException {
        if (file != null) {
          return new BufferedInputStream(new FileInputStream(file));
        }
        else if (url != null) {
          return new BufferedInputStream(url.openStream());
        }
        return null;
      }

      /**
       * Returns when the resource was last modified. If the resource was found
       * using a URL, this method will work only if the URL connection supports
       * last modified information. If there's no support, Long.MAX_VALUE is
       * returned. Perhaps this should return -1, but we return MAX_VALUE on
       * the assumption that if you can't determine the time, it's
       * maximally new.
       */
      public long lastModified() {
        if (file != null) {
          return file.lastModified();
        }
        else if (url != null) {
          try {
            return url.openConnection().getLastModified();  // Hail Mary
          }
          catch (IOException e) { return Long.MAX_VALUE; }
        }
        return 0;  // can't happen
      }

      /**
       * Returns the directory containing the resource, or null if the resource
       * isn't directly available on the filesystem. This value can be used to
       * locate the configuration file on disk, or to write files in the same
       * directory.
       */
      public String getDirectory() {
        if (file != null) {
          return file.getParent();
        }
        else if (url != null) {
          return null;
        }
        return null;
      }

      // Returns true if found
      private boolean tryClasspath(String filename) {
        String classpath = System.getProperty("java.class.path");
        String[] paths = split(classpath, File.pathSeparator);
        file = searchDirectories(paths, filename);
        return (file != null);
      }

      private static File searchDirectories(String[] paths, String filename) {
        SecurityException exception = null;
        for (int i = 0; i < paths.length; i++) {
          try {
            File file = new File(paths[i], filename);
            if (file.exists() && !file.isDirectory()) {
              return file;
            }
          }
          catch (SecurityException e) {
            // Security exceptions can usually be ignored, but if all attempts
            // to find the file fail, report the (last) security exception.
            exception = e;
          }
        }
        // Couldn't find any match
        if (exception != null) {
          throw exception;
        }
        else {
          return null;
        }
      }

      // Splits a String into pieces according to a delimiter.
      // Uses JDK 1.1 classes for backward compatibility.
      // JDK 1.4 actually has a split() method now.
      private static String[] split(String str, String delim) {
        // Use a Vector to hold the split strings.
        Vector v = new Vector();

        // Use a StringTokenizer to do the splitting.
        StringTokenizer tokenizer = new StringTokenizer(str, delim);
        while (tokenizer.hasMoreTokens()) {
          v.addElement(tokenizer.nextToken());
        }

        String[] ret = new String[v.size()];
        v.copyInto(ret);
        return ret;
      }

      // Returns true if found
      private boolean tryLoader(String name) {
        name = "/" + name;
        URL res = Resource.class.getResource(name);
        if (res == null) {
          return false;
        }

        // Try converting from a URL to a File.
        File resFile = urlToFile(res);
        if (resFile != null) {
          file = resFile;
        }
        else {
          url = res;
        }
        return true;
      }

      private static File urlToFile(URL res) {
        String externalForm = res.toExternalForm();
        if (externalForm.startsWith("file:")) {
          return new File(externalForm.substring(5));
        }
        return null;
      }

      public String toString() {
        return "[Resource: File: " + file + " URL: " + url + "]";
      }
    }

Example 3-5 shows a fairly realistic example of how the class can be used. Assume your servlet library component needs to load some chunk of raw data from the filesystem. This file can be named anything, but the name must be entered in a library.properties main configuration file. Because the data, in some situations, takes a while to process in its raw form, the library keeps a serialized version of the data around in a second file named library.ser to speed up load times. The cache file, if any, resides in the same directory as the main configuration file. Example 3-5 gives the code implementing this logic, building on the Resource class.
Example 3-5: Loading configuration information from a Resource

    import java.io.*;
    import java.util.*;

    public class LibraryLoader {

      static final String CONFIG_FILE = "library.properties";
      static final String CACHE_FILE = "library.ser";

      public ConfigData load() throws IOException {
        // Find the configuration file and fetch its contents as Properties.
        Resource config = new Resource(CONFIG_FILE);
        Properties props = new Properties();
        InputStream in = null;
        try {
          in = config.getInputStream();
          props.load(in);
        }
        finally {
          if (in != null) in.close();  // IOException propagates up.
        }

        // Determine the source directory of the configuration file and look
        // for a cache file next to it containing a full representation of
        // your program state. If you find a cache file and it is current,
        // load and return that data.
        if (config.getDirectory() != null) {
          File cache = new File(config.getDirectory(), CACHE_FILE);
          if (cache.exists() &&
              cache.lastModified() >= config.lastModified()) {
            try {
              return loadCache(new FileInputStream(cache));
            }
            catch (IOException ignored) { }
          }
        }

        // You get here if there's no cache file or it's stale and you need to
        // do a full reload. Locate the name of the raw datafile from the
        // configuration file and return its contents using Resource.
        Resource data = new Resource(props.getProperty("data.file"));
        return loadData(data.getInputStream());
      }

      private ConfigData loadCache(InputStream in) {
        // Read the file, perhaps as a serialized object.
        return null;
      }

      private ConfigData loadData(InputStream in) {
        // Read the file, perhaps as XML.
        return null;
      }

      class ConfigData {
        // An example class that would hold configuration data
      }
    }

The loading code doesn't need to concern itself with where the resource might be located. The Resource class searches the class path and resource path and pulls from the WAR if necessary.
Think of Sessions as a Local Cache
Servlet sessions as implemented by HttpSession provide a simple and convenient mechanism to store information about a user. While sessions are a useful tool, it's important to know their limitations. They are not a good choice for acting as the back-end storage in real-world applications, no matter how tempting it might be to try it. Rather, sessions are best thought of as a handy local cache--a place to store information which, if lost, can be recovered or safely ignored.
To understand why this is, we should quickly review how sessions work. Sessions generally use cookies to identify users. During a client's first request to a server, the server sets a special cookie on the client that holds a server-generated unique ID. On a later request the server can use the cookie to recognize the request as coming from the same client. The server holds a server-side hashtable that associates the cookie ID keys with HttpSession object values. When a servlet calls request.getSession( ), the server gets the cookie ID, looks up the appropriate HttpSession, and returns it. To keep memory in check, after some period of inactivity (typically 30 minutes) or on programmer request, the session expires and the stored data is garbage-collected.
Session data is inherently transient and fragile. Session data will be lost when a session expires, when a client shuts down the browser,[2] when a client changes browsers, when a client changes machines, or when a servlet invalidates the session to log out the user. Consequently, sessions are best used for storing temporary information that can be forgotten--because either it's nonpermanent or is backed by a real store.
When information needs to be persistent, I recommend using a database, an EJB backed by a database, or another formal back-end data store. These are much safer, more portable, and more reliable, and they work better for backups. If the data must have a long-term association with a user, even when he moves between machines, use a true login mechanism that allows the user to relogin and reassociate. Servlet sessions can help in each case, but their role should be limited to a local cache, as we'll see in the next section.
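The session mechanics reviewed earlier in this section (a server-side table keyed by cookie ID, with expiry after inactivity) can be modeled in a short sketch. SessionTable is a hypothetical class written for illustration; it is not how any particular servlet container implements HttpSession, and the current time is passed in explicitly so that the expiry behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical model of a container's session table; not the Servlet API.
final class SessionTable {

    private static final class Entry {
        final Map<String, Object> attributes = new HashMap<>();
        long lastAccess;
    }

    private final Map<String, Entry> sessions = new HashMap<>();
    private final long timeoutMillis;

    SessionTable(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Creates a session and returns the ID that would be sent to the client in a cookie.
    synchronized String create(long now) {
        String id = UUID.randomUUID().toString();
        Entry entry = new Entry();
        entry.lastAccess = now;
        sessions.put(id, entry);
        return id;
    }

    // Returns the session's attributes, or null if the ID is unknown or has expired.
    synchronized Map<String, Object> lookup(String id, long now) {
        Entry entry = sessions.get(id);
        if (entry == null) {
            return null;
        }
        if (now - entry.lastAccess > timeoutMillis) {
            sessions.remove(id); // expired: the stored data becomes garbage
            return null;
        }
        entry.lastAccess = now; // each access resets the inactivity clock
        return entry.attributes;
    }
}
```

Everything in the table lives only in server memory and only as long as the client keeps presenting the same ID, which is why the data stored there should be recoverable or disposable.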
Architecture of a shopping cart
Let's look at how you can architect session tracking for a shopping-cart application (think Amazon.com). Here are some requirements for your shopping cart:
- Logged-in users get a customized experience.
- Logins last between browser shutdowns.
- Users can log out and not lose cart contents.
- Items added to a cart persist for one month.
- Guests are allowed to place items in the cart (although contents are not necessarily available to the guest for the long term or from a different browser).
- Purchasing items in the cart requires a password for safety.
Servlet sessions alone don't adequately satisfy these requirements. With the right server you might be able to get sessions to persist for a month, but you lose the information when a user changes machines. Trying to use sessions as storage, you also need to take pains to expire individual items (but not the whole session) after a month, while at the same time making sure to put nothing into the session that shouldn't be kept indefinitely, and you need a way to log out a user without invalidating his cart contents. There's no way to do this in API 2.3!
Here's one possible architecture for this application that takes advantage of sessions as a local cache: if the user has not logged in, he is a guest, and the session stores his cart contents. The items persist there as long as the session lasts, which you have deemed sufficient for a guest. However, if the user has logged in, the cart contents are more safely recorded and pushed through to a back-end database for semi-permanent storage. The database will be regularly swept to remove any items added more than a month earlier. For performance, the user's session should be used to store cart contents even if the user is logged in, but the session should act as a local cache of the database--allowing later requests to display cart information without going across the wire to the database on each request.
The user logins can be tracked with a manually set cookie with a long expiration time. After a form-based login, the cookie stores a hash of the user's ID; the hash corresponds to the database records. On later visits, the user can be automatically recognized and his cart contents loaded into the session. For safety, on checkout the server logic asks for password verification before proceeding. Even though the server knows the client's identity, the billing activity should be protected because the login is automatic. The marker stating that the password was verified would, of course, be stored in the session, with a 30-minute timeout being fairly appropriate! A user-requested logout would require only the removal of the cookie. The full architecture is shown in Figure 3-3.
Figure 3-3. Shopping-cart architecture
This example proposes custom login management. The default servlet form-based login could be used--however, it's designed for single-session login to restrict access to secure content. It is not designed for multisession login to identify users for shopping-cart applications.
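The session-as-local-cache arrangement described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the chapter's implementation: CartDatabase and CartService are hypothetical names, a Map stands in for the HttpSession attributes, and an in-memory map stands in for the back-end database. A null user ID represents a guest.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the back-end cart table, keyed by user ID. */
class CartDatabase {
    private final Map<String, List<String>> rows = new HashMap<String, List<String>>();
    int reads = 0;  // counts round trips so the effect of the cache is visible

    synchronized List<String> load(String userId) {
        reads++;
        List<String> stored = rows.get(userId);
        return stored == null ? new ArrayList<String>() : new ArrayList<String>(stored);
    }

    synchronized void save(String userId, List<String> items) {
        rows.put(userId, new ArrayList<String>(items));
    }
}

/** Cart logic; the Map argument plays the role of the HttpSession attributes. */
class CartService {
    static final String CART_KEY = "cart";
    private final CartDatabase db;

    CartService(CartDatabase db) {
        this.db = db;
    }

    /** Returns the cart, going to the database only on a session cache miss. */
    @SuppressWarnings("unchecked")
    List<String> getCart(Map<String, Object> session, String userId) {
        List<String> cart = (List<String>) session.get(CART_KEY);
        if (cart == null) {
            if (userId == null) {
                cart = new ArrayList<String>();   // guest: the session is the only store
            } else {
                cart = db.load(userId);           // logged in: fill the cache from the database
            }
            session.put(CART_KEY, cart);
        }
        return cart;
    }

    /** Adds an item; for logged-in users the change is also pushed to the database. */
    void addItem(Map<String, Object> session, String userId, String item) {
        List<String> cart = getCart(session, userId);
        cart.add(item);
        if (userId != null) {
            db.save(userId, cart);
        }
    }

    public static void main(String[] args) {
        CartService carts = new CartService(new CartDatabase());
        Map<String, Object> session = new HashMap<String, Object>();
        carts.addItem(session, "user-42", "Java Servlet Programming");
        System.out.println(carts.getCart(session, "user-42"));
    }
}
```

Writes go through to the database immediately, so a crashed or expired session loses nothing for a logged-in user, while reads after the first are served from the session. The monthly sweep and the merging of a guest cart at login are left out of the sketch.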
When to use sessions
As shown in the shopping-cart example, sessions are useful but aren't a panacea. Sessions make the best sense in the following situations:
- Storing login status: The timeout is useful, and changes between browsers or machines should naturally require a new login.
- Storing user data pulled from a database: The local cache avoids an across-the-wire database request.
- Storing user data that's temporary: Temporary data includes search results, form state, or a guest's shopping-cart contents that don't need to be preserved over the long term.
Don't Use SingleThreadModel
Now, onto a servlet feature for which there's never a good use: SingleThreadModel. Here's my advice: don't use it. Ever.
This interface was intended to make life easier for programmers concerned about thread safety, but the simple fact is that SingleThreadModel does not help. It's an admitted mistake in the Servlet API, and it's about as useful as a dud firecracker on the Fourth of July.
Here's how the interface works: any servlet implementing SingleThreadModel receives a special lifecycle within the server. Instead of allocating one servlet instance to handle multiple requests (with multiple threads operating on the servlet simultaneously), the server allocates a pool of instances (with at most one thread operating on any servlet at a time). From the outside this looks good, but there's actually no benefit.
Imagine a servlet needing access to a unique resource, such as a transactional database connection. That servlet needs to synchronize against the resource regardless of the servlet's own thread model. There's no difference if two threads are on the same servlet instance or on different servlet instances; the problem is that two threads are trying to use the connection, and that's solved only with careful synchronization.
Imagine instead that multiple copies of the resources are available, but access to any particular one needs to be synchronized. It's the same situation. The best approach is not to use SingleThreadModel, but to manage the resources with a pool that all servlets share. For example, with database connections it's common to have connection pools. You could instead use SingleThreadModel and arrange for each servlet instance to hold its own copy of the resource, but that's a poor use of resources. A server with hundreds of servlets might require thousands of resource instances.
As a book author, I've kept my eye out for a compelling use for SingleThreadModel. (After all, I need to write book examples showing how best to use this feature.) The most justifiable use I found was given to me by a development manager who said the programmers he hired were used to C global variables. By implementing SingleThreadModel they could pass data between servlet method calls using instance variables rather than parameters. While SingleThreadModel accomplishes that, it's poor form and inadvisable unless you're hiring Java newbies. When that's the best use case, you know there's no good use case. The bottom line: don't use SingleThreadModel.
Caching with Servlets
Here are some tips to consider that will help things move quickly with your servlets.
Pregenerate Content Offline and Cache Like Mad
Pregeneration and caching of content can be key to providing your site visitors with a quality experience. With the right pregeneration and caching, web pages pop up rather than drag, and loads are reduced--sometimes dramatically--on the client, server, and network. In this section I'll provide advice for how best to pregenerate content and cache at the client, at the proxy, and at the server. By the end of this section you'll feel compelled to generate new content during request handling only in worst-case scenarios.
There's no need to dynamically regenerate content that doesn't change between requests. Yet such regeneration happens all the time because servlets and JSPs provide an easy way to template a site by pulling in headers, footers, and other content at runtime. Now this might sound like strange guidance in a chapter on servlets, but in many of these situations servlets aren't the best choice. It's better to "build" the content offline and serve it as static content. When the content changes, you can build the content again. Pull the content together once, offline, rather than during every request.
Take, for example, an online magazine, newspaper, or weblog ('blog). How do the pros handle templatization without burdening the server? By pregenerating the content. Articles added to a site are written and submitted in a standard format (often XML-based) which, when run through a build process, produces a comprehensive update to the web site. The build reformats the article into HTML, creates links to the article from other pages, adds the content into the search engine (before the HTML reformatting), and ultimately prepares the site to handle heavy loads without extreme resources. You can see this in action with 'blog tools such as MovableType. It's a Perl application, but it generates content statically, so Perl doesn't even need to run on the production server.
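To make the build step concrete, here is a minimal sketch of an offline build in plain Java. It assumes the articles have already been converted to HTML fragments and only wraps each one in a shared header and footer before writing a static file. SiteBuilder and its parameters are hypothetical names, and a real build would also generate links and feed the search index as described above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical offline build: template the pages once, then serve them as files. */
class SiteBuilder {

    /** Writes one NAME.html file per article into outDir. */
    static void build(Map<String, String> articles, String header, String footer,
                      Path outDir) throws IOException {
        Files.createDirectories(outDir);
        for (Map.Entry<String, String> article : articles.entrySet()) {
            // The templating cost is paid here, at build time, not on each request.
            String page = header + article.getValue() + footer;
            Path target = outDir.resolve(article.getKey() + ".html");
            Files.write(target, page.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> articles = new LinkedHashMap<String, String>();
        articles.put("servlet-tips", "<h1>Servlet Tips</h1><p>Cache like mad.</p>");
        build(articles, "<html><body>", "</body></html>", Paths.get("site-output"));
    }
}
```

Run from a scheduled job or from a build tool whenever content changes, a step like this leaves the web server with nothing to do at request time but send a file.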
As another example, think of an e-tailer with millions of catalog items and thousands of visitors. Clearly, the content should be database-backed and regularly updated, yet because much of the content will be identical for all visitors between updates, the site can effectively use pregeneration. The millions of catalog item description pages can be built offline, so the server load will be greatly reduced. Regular site builds keep the content fresh.
The challenge comes where the static and dynamic content meet. For example, the e-tailer might need servlets to handle the dynamic aspects of its site, such as an item review system or checkout counter. In this example, the item review can invoke a servlet to update the database, but the servlet doesn't necessarily need to immediately update the page. In fact, you see just such a delay with Amazon.com updates--a human review and subsequent site build must take place before you see new comments. The checkout pages in our example, however, can be implemented as a fully servlet-based environment. Just make sure special coordination is implemented to ensure that the template look and feel used by the servlets matches the template look and feel implemented during the offline build.
Pregeneration tools
Unfortunately, few professional-quality, reasonably priced standard tools are available to handle the offline build process. Most companies and webmasters either purchase high-end content management systems or develop custom tools that satisfy their own needs. Perhaps that's why more people don't practice offline site building until their site load requires it.
For those looking for a tool, the Apache Jakarta project manages its own content using something called Anakia. Built on Apache Velocity, Anakia runs XML content through an XSL stylesheet to produce static HTML offline. Apache Ant, the famous Java build system, manages the site build. Others have had success with Macromedia Dreamweaver templates. Dreamweaver has the advantage of viewing JSP, WebMacro, Velocity, and Tea files as simple template files whose HTML contents are autoupdated when a template changes, providing a helpful bridge between the static and the dynamic.
There's a need here for a good common tool. If you think you have the right tool, please share it or evangelize it. Maybe it's out there and we just haven't heard of it yet.
Cache on the client
Pregeneration and caching go hand in hand because caching is nothing more than holding what you previously generated. Browsers (a.k.a. clients) all have caches, and it behooves a servlet developer to make use of them. The Last-Modified HTTP header provides the key to effective caching. Attached to a response, this header tells the browser when the content last changed. This is useful because if the browser requests the same content again, it can attach an If-Modified-Since header with the previous Last-Modified time, telling the server it needs to issue a full response only if the content has changed since that time. If the content hasn't changed, the server can issue a short status code 304 response, and the client can pull the content from its cache, avoiding the doGet( ) or doPost( ) methods entirely and saving server resources and bandwidth.
A servlet takes advantage of Last-Modified by implementing the getLastModified( ) method on itself. This method returns the time as a long at which the content was last changed, as shown in Example 3-6. HTTP dates have one-second resolution, so the example rounds the time down to a whole second to keep it comparable with the If-Modified-Since value. That's all a servlet has to do. The server handles issuing the HTTP header and intercepting If-Modified-Since requests.
Example 3-6: The getLastModified( ) method
    public long getLastModified(HttpServletRequest req) {
      // Round down to the nearest second; HTTP dates carry no milliseconds.
      return dataModified.getTime() / 1000 * 1000;
    }
The getLastModified( ) method is easy to implement and should be implemented for any content that has a lifespan of more than a minute. For details on getLastModified( ), see my book, Java Servlet Programming, Second Edition (O'Reilly).
Cache at the proxy
While implementing getLastModified( ) to make use of client caches is a good idea, there are bigger caches to consider. Oftentimes, especially at companies or large ISPs, browsers use a proxy to connect to the Web. These proxies commonly implement a shared cache of the content they're fetching so that if another user (or the same user) requests it again, the proxy can return the response without needing to access the original web server. Proxies help reduce latency and improve bandwidth utilization. The content might come from across the world, but it's served as if it's on the local LAN--because it is.
The Last-Modified header helps web caches as it does client caches, but web caches can be helped more if a servlet hints to the cache when the content is going to change, giving the cache a timeframe during which it can serve the content without even connecting to the server. The easiest way to do this is to set the Expires header, indicating the time when the content should be considered stale. For example:
    // This content will expire in 24 hours.
    response.setDateHeader("Expires",
                           System.currentTimeMillis() + 24*60*60*1000);
If you take my earlier advice and build some parts of your site on a daily basis, you can set the Expires header on those pages accordingly and watch as the distributed proxy caches take the load off your server. Some clients can also use the Expires header to avoid refetching content they already have.
A servlet can set other headers as well. The Cache-Control header provides many advanced dials and knobs for interacting with a cache. For example, in a request, setting the header value to only-if-cached asks for the content only if it's cached. For more information on Cache-Control, see http://www.servlets.com/rfcs/rfc2616-sec14.html#sec14.9. A great overview of caching strategies is also available at http://www.mnot.net/cache_docs. This site also includes tips for how to count page accesses even while caching.
Cache on the server
Caching at the client and proxy levels helps if requests come from the same person or organization, but what about the multitude of requests from all the different browsers? This is why the last level of caching needs to happen on the server side. Any content that takes significant time or resources to generate but doesn't generally change between requests is a good candidate for server-side caching. In addition, server-side caching works for both full pages and (unlike other caching technologies) parts of pages.
Take, for example, an RSS text feed. In case you're not familiar, RSS stands for Rich Site Summary and is an XML-based file format with which publishers advertise new content. Affiliates can pull the RSS files and display links between sites. Servlets.com (http://www.servlets.com), for example, pulls and links to servlet-related articles, using RSS with O'Reilly's Meerkat (http://meerkat.oreillynet.com) as the RSS newsfeed hub.
Suppose you want to display an RSS feed for an affiliate site, updated every 30 minutes. This data should absolutely be cached on the server side. Not only would it be slow to pull an RSS feed for every request, but it's also terribly poor form. The cache can be implemented using an internal timer or, more easily, a simple date comparator. The code to pull the stories can check on each access if it's time to refetch and reformat the display. Because the data is small, the formatted content can be held in memory. Example 3-7 demonstrates sample code that would manage the story cache. On top of this cache might be another cache holding the actual String (or bytes) to be displayed.
Example 3-7: Caching RSS feeds
    public Story[] getStories(String url) {
      Story[] stories = (Story[]) storyCache.get(url);
      Long lastUpdate = (Long) timeCache.get(url);
      long halfHourAgo = System.currentTimeMillis() - 30*60*1000;

      if (stories == null || stories.length == 0 ||
          lastUpdate == null || (lastUpdate.longValue() < halfHourAgo)) {
        refetch();
        // Pick up the refreshed entry (assuming refetch() repopulates storyCache).
        stories = (Story[]) storyCache.get(url);
      }
      return stories;
    }
As a second example, take a stock chart diagram as found on any financial site. This presents a more significant challenge because such a site can support thousands of stock symbols, each offering charts of different sizes with different time spans and sometimes even allowing graphical comparisons between stocks. In this application caching will be absolutely necessary because generating a chart takes significant time and resources, and dynamically generating every chart would balloon the server requirements.
A good solution would be multifaceted. Some charts can be statically built offline (as discussed earlier in this section). These charts would be served as files. This technique works for most commonly accessed charts that don't change more than once a day. Other charts, perhaps ones that are accessed heavily but change frequently, such as a day's activity charts for popular stocks, would benefit from being cached in memory and served directly. They might be stored using a SoftReference. Soft references free the memory if the Java Virtual Machine (JVM) runs low on memory. Still other charts, perhaps ones that are less popular or actively changing, would benefit from being cached by servlets to the filesystem, stored in a semirandom temporary file whose contents can be pulled by a servlet instead of generated by the servlet. The File.createTempFile( ) method can help manage such files.
Many potential solutions exist, and this shouldn't be taken as gospel. The main point is that memory caches, temporary file caches, and prebuilt static files are good components of any design for caching on the server.
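As one illustration of the in-memory option, the following sketch combines a SoftReference with the same date comparison used for the RSS stories. It is an assumption-laden example rather than a recommended design: SoftTimedCache is a hypothetical class, the generator function stands in for whatever code renders a chart, and the current time is passed in so that the expiry logic is easy to exercise. The temporary-file variant is not shown.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical memory cache whose values the JVM may reclaim under memory pressure. */
class SoftTimedCache<K, V> {

    private static final class Entry<T> {
        final SoftReference<T> ref;
        final long createdAt;

        Entry(T value, long createdAt) {
            this.ref = new SoftReference<T>(value);
            this.createdAt = createdAt;
        }
    }

    private final ConcurrentMap<K, Entry<V>> entries = new ConcurrentHashMap<K, Entry<V>>();
    private final long maxAgeMillis;

    SoftTimedCache(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Returns the cached value, regenerating it if missing, reclaimed, or too old. */
    V get(K key, long now, Function<K, V> generator) {
        Entry<V> entry = entries.get(key);
        V value = (entry == null) ? null : entry.ref.get();
        if (value == null || now - entry.createdAt > maxAgeMillis) {
            // Two threads may regenerate the same key at once; the last write wins.
            value = generator.apply(key);
            entries.put(key, new Entry<V>(value, now));
        }
        return value;
    }

    public static void main(String[] args) {
        SoftTimedCache<String, String> charts =
                new SoftTimedCache<String, String>(30 * 60 * 1000L);
        String chart = charts.get("ORCL", System.currentTimeMillis(),
                symbol -> "chart data for " + symbol);
        System.out.println(chart);
    }
}
```

Because the value is held only softly, a busy server sheds cached charts before it runs out of memory, at the cost of regenerating them later.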
Beyond the server cache, it's important to remember the client and proxy caches. The chart pages should implement getLastModified( ) and set the Expires and/or Cache-Control headers. This will reduce the load on the server and increase responsiveness even more.
. . . Or don't cache at all
Even though caching makes sense most of the time and should be enabled whenever possible, some types of content are unsuitable for caching and must always be refreshed. Take, for example, a current status indicator or a "Please wait . . . " page that uses a Refresh tag to periodically access the server during the wait. But because of the sheer number of locations where content might be cached--at the client, proxy, and server levels--and because of several notorious browser bugs (see http://www.web-caching.com/browserbugs.html), it can be difficult to effectively turn off caching across the board.
After spending hours attempting to disable caching, programmers can feel like magicians searching in vain for the right magic spell. Well, Harry Potter, Example 3-8 provides that magic spell, gathered from personal experience and the recommendations of the Net Wizards.
Example 3-8: The magic spell to disable caching
    // Set to expire far in the past.
    res.setHeader("Expires", "Sat, 6 May 1995 12:00:00 GMT");

    // Set standard HTTP/1.1 no-cache headers.
    res.setHeader("Cache-Control", "no-store, no-cache, must-revalidate");

    // Set IE extended HTTP/1.1 no-cache headers (use addHeader).
    res.addHeader("Cache-Control", "post-check=0, pre-check=0");

    // Set standard HTTP/1.0 no-cache header.
    res.setHeader("Pragma", "no-cache");
The Expires header indicates that the page expired long ago, thus making the page a poor cache candidate. The first Cache-Control header sets three directives that each disable caching. One tells caches not to store this content, another not to use the content to satisfy another request, and the last to always revalidate the content on a later request if it's expired (which, conveniently, it always is). One directive might be fine, but in magic spells and on the Web, it's always good to play things safe.
The second Cache-Control header sets two caching "extensions" supported by Microsoft Internet Explorer. Without getting into the details on nonstandard directives, suffice to say that setting pre-check and post-check to 0 indicates that the content should always be refetched. Because it's adding another value to the Cache-Control header, we use addHeader( ), introduced in Servlet API 2.2. For servlet containers supporting earlier versions, you can combine the two calls onto one line.
The last header, Pragma, is defined in HTTP/1.0 and supported by some caches that don't understand Cache-Control. Put these headers together, and you have a potent mix of directives to disable caching. Some programmers also add a getLastModified( ) method that returns a time in the past.
Other Servlet Tips
Here are some other things to keep in mind when working with servlets.
Use Content-Disposition to Send a File
While we're on the subject of magic header incantations, servlet developers often struggle with finding the right header combination to send a browser a file that's intended for saving rather than viewing and thus triggers a "Save As" dialog. For the solution to this problem, I have some good news and some bad news.
The bad news is that although the HTTP specification provides a mechanism for file downloads (see HTTP/1.1, Section 19.5.1), many browsers second-guess the server's directives and do what they think is best rather than what they're told. These browsers--including Microsoft Internet Explorer and Opera--look at the file extension and "sniff" the incoming content. If they see HTML or image content, they inline-display the file contents instead of offering a Save As dialog.[3] Turns out there's no 100% reliable way to download a file across all browsers. Perhaps, with this effort, programmers are more like alchemists than magicians, trying in vain to turn lead into gold.
The good news is that the right combination of headers will download files well enough to be practical. With these special headers set, a compliant browser will open a Save As dialog, while a noncompliant browser will open the dialog for all content except HTML or image files. For these types it will display the content inline, where a user can use the menu to save the content. Example 3-9 shows the best technique for sending files.
Example 3-9: Sending a file for download
    // Set the headers.
    res.setContentType("application/x-download");
    res.setHeader("Content-Disposition", "attachment; filename=" + filename);

    // Send the file.
    OutputStream out = res.getOutputStream();
    returnFile(filename, out);  // Shown earlier in the chapter
First, set the Content-Type header to a nonstandard value such as application/x-download. It's very important that this header is something unrecognized by browsers because browsers often try to do something special when they recognize the content type.[4] Then set the Content-Disposition header to the value attachment; filename=foo, in which foo is substituted with the filename to be used by default in the Save As dialog. Finally, send the file content as bytes. The bytes can come from the filesystem or be dynamically generated.
Using these headers, the file content in the response will be saved by most browsers or, in worst cases, displayed inline where the user can save the file. There's no standard way to download multiple files in one response.
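One practical wrinkle with Example 3-9 is that the filename is concatenated into the header as is, so a name containing spaces, quotes, or directory parts may be misread by the browser. The sketch below shows one way to build a safer header value by quoting the name. DownloadHeaders is a hypothetical helper, not part of the Servlet API, and it does not attempt to handle non-ASCII filenames.

```java
/** Hypothetical helper that builds a Content-Disposition value for a download. */
class DownloadHeaders {

    /** Returns an attachment disposition with a quoted, sanitized filename. */
    static String contentDisposition(String filename) {
        // Keep only the last path segment so directory names never reach the client.
        String name = filename.replace('\\', '/');
        name = name.substring(name.lastIndexOf('/') + 1);

        StringBuilder safe = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c < 0x20 || c == 0x7f) {
                continue;               // drop control characters
            }
            if (c == '"') {
                safe.append('\\');      // escape quotes inside the quoted string
            }
            safe.append(c);
        }
        if (safe.length() == 0) {
            safe.append("download");    // fallback when nothing usable remains
        }
        return "attachment; filename=\"" + safe + "\"";
    }

    public static void main(String[] args) {
        System.out.println(contentDisposition("reports/Q3 inventory.pdf"));
    }
}
```

In a servlet, the returned string would simply be passed to res.setHeader("Content-Disposition", ...) in place of the concatenation.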
Finally, it can be useful to include the download file's name as extra path information to the servlet. The servlet can use the filename to learn which file to download, or it can ignore the extra path info. Either way, it's useful because the name appears to the browser as the name of the resource being retrieved, and browsers often use that name in the Save As dialog prompt. For example, instead of serving content from /servlet/FileDownload?fileid=5, serve it from /servlet/FileDownload/inventory.pdf?fileid=5.
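If the servlet does want to use the name carried in the extra path information, the parsing is short. The following sketch assumes the string comes from HttpServletRequest.getPathInfo( ), which returns something like /inventory.pdf for the URL above. PathInfoNames is a hypothetical helper name.

```java
/** Hypothetical helper for reading a download name from extra path info. */
class PathInfoNames {

    /** Returns the last path segment, or null if there is no usable name. */
    static String filenameFromPathInfo(String pathInfo) {
        if (pathInfo == null) {
            return null;                         // no extra path info on this request
        }
        String name = pathInfo.substring(pathInfo.lastIndexOf('/') + 1);
        return name.isEmpty() ? null : name;
    }

    public static void main(String[] args) {
        System.out.println(filenameFromPathInfo("/inventory.pdf"));
    }
}
```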
Hire a UI Designer
My last piece of personal advice comes not from fellow servlet programmers, but from the users we all serve: "Please, please, please hire a user interface designer."
Here are the facts: only a handful of people can be good designers, and only a handful of people can be good Java programmers. The rare odds of one person being in both handfuls can barely be measured with an IEEE double. Therefore, the odds are you're not a good UI designer. Please hire someone who is, and spend your time hacking back-end code with the rest of us who can't tell kerning from a kernel. We back-end folks are more fun anyway, in a laugh-through-the-nose kind of way.
1. There's actually a research project underway with the goal of tracking servlet framework features and implementing the same demonstration web application across every framework. See http://www.waferproject.org for more information.
2. The session data isn't lost immediately when a browser is shut down, of course, because no notification is sent to the server. However, the session data will be lost because the browser will lose its session cookies, and after a timeout the server will garbage-collect the abandoned session data.
3. Microsoft documents Internet Explorer's deviant behavior at http://msdn.microsoft.com/workshop/networking/moniker/overview/appendix_a.asp, although I've found that reality doesn't exactly match the documentation.
4. The HTTP specification recommends setting the Content-Type to application/octet-stream. Unfortunately, this causes problems with Opera 6 on Windows (which will display the raw bytes for any file whose extension it doesn't recognize) and on Internet Explorer 5.1 on the Mac (which will display inline content that would be downloaded if sent with an unrecognized type).