83 Understanding White Space and Compression

When WebCenter Sites streams a text page, the page may contain a significant amount of white space (spaces, carriage returns, and tabs) that has no effect on the data consumed by the client. The white space is visible when the consumer views the page source. More importantly, excessive white space needlessly increases the size of the response, which in turn increases bandwidth use. It is therefore good practice to eliminate white space whenever possible.

83.1 White Space and JSP

The JSP specification requires that white space in a page be preserved.

Thus, a page that looks like this:

<%@ page import="my class name"%> 
<%@ page import="my class 2"%> 
<cs:ftcs> 
	<p>Hello world!</p>
</cs:ftcs>

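has three carriage returns and a tab preceding the <p> in its output. The directives and the opening <cs:ftcs> tag produce no output of their own, but the line break that follows each of them is preserved when the JSP is interpreted, as is the indentation of the <p> line. With more complicated pages, the problem is compounded.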

83.2 White Space and XML

The WebCenter Sites XML processing language, being a proprietary set of XML-compliant tags, does not adhere to the white space preservation rules of JSP.

As such, a WebCenter Sites XML page like this:

<?xml version="1.0"?>
<FTCS> 
<p>Hello World!</p> 
</FTCS>

emits <p> as the first characters of output, because the WebCenter Sites XML parser strips all of the white space. (The exception is when XML debugging is enabled, in which case all white space is preserved.)

83.3 Compression

White space is an artifact of writing well-formatted code: a side effect of programming practices that benefit the developer. Its impact on the consumer and the customer is minimal, except for bandwidth. To address bandwidth, you can compress the output of all text-based pages. Compression is performed on the server side, and the consumer's user agent (browser) performs the decompression.

The compression/decompression is completely transparent to the end user. This sort of compression can yield up to an 80% reduction in bandwidth use.
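The effect of compression on white-space-heavy markup can be checked with a small experiment. The following is a minimal sketch, not part of WebCenter Sites: the class name GzipSizeDemo, the samplePage helper, and the generated markup are hypothetical, and the saving you measure depends entirely on the content of your own pages.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; it only illustrates how gzip treats repetitive white space.
class GzipSizeDemo {

    /** Builds an illustrative page with heavy indentation and blank lines. */
    static String samplePage(int rows) {
        StringBuilder sb = new StringBuilder("<html>\n  <body>\n    <table>\n");
        for (int i = 0; i < rows; i++) {
            sb.append("\n\n\t\t\t<tr>\n\t\t\t\t<td>Row ").append(i)
              .append("</td>\n\t\t\t</tr>\n");
        }
        return sb.append("    </table>\n  </body>\n</html>\n").toString();
    }

    /** Returns the gzip-compressed form of the given bytes. */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the stream writes the gzip trailer, so close before reading the buffer.
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = samplePage(200).getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        double saving = 100.0 * (raw.length - packed.length) / raw.length;
        System.out.printf("raw=%d bytes, gzipped=%d bytes (%.0f%% smaller)%n",
                raw.length, packed.length, saving);
    }
}
```

Running the same measurement against a saved copy of a real page gives a more meaningful estimate than synthetic markup.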

One commonly used compression mechanism is the mod-gzip extension to the Apache web server. This module automatically gzips all output sent to the user agent, provided that the user agent can decompress it. Configuration is minimal and its effectiveness is quite high. It can be obtained from SourceForge (http://sourceforge.net/projects/mod-gzip/). Similar tools are available for other common web servers, such as IIS.

Another possibility is to perform the compression at the application server layer and leave the web server alone. This is best done by attaching a standard servlet filter to Satellite Server (or to WebCenter Sites if Satellite Server is not being used). The servlet filter is invoked in a prescribed order before or after the invocation of the specified servlet (or both). During invocation, it can compress the output before sending it to compatible user agents, in the same way that mod-gzip does. One such compression filter can be found at SourceForge (http://sourceforge.net/projects/pjl-comp-filter/).
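The decision that such a filter makes for each request can be sketched in plain Java. This is an illustrative sketch only, not the code of mod-gzip or of the filter mentioned above: the class GzipNegotiation and its methods are hypothetical, and the header parsing is deliberately simplified (it ignores the "*" wildcard, for example). A real servlet filter would read the Accept-Encoding request header, wrap the response, and set the Content-Encoding: gzip response header whenever it compresses the body.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; a servlet filter would call logic like this per request.
class GzipNegotiation {

    /** True if the Accept-Encoding value lists gzip and does not disable it with q=0. */
    static boolean acceptsGzip(String acceptEncoding) {
        if (acceptEncoding == null) {
            return false;
        }
        for (String part : acceptEncoding.toLowerCase(Locale.ROOT).split(",")) {
            String[] tokens = part.trim().split(";");
            if (tokens.length == 0 || !tokens[0].trim().equals("gzip")) {
                continue;
            }
            // Inspect optional parameters such as "q=0", which means "not acceptable".
            for (int i = 1; i < tokens.length; i++) {
                String param = tokens[i].replace(" ", "");
                if (param.startsWith("q=")) {
                    try {
                        if (Double.parseDouble(param.substring(2)) == 0.0) {
                            return false;
                        }
                    } catch (NumberFormatException e) {
                        return false; // malformed weight: fall back to no compression
                    }
                }
            }
            return true;
        }
        return false;
    }

    /** Returns the body gzipped for compatible user agents, or as plain UTF-8 bytes otherwise. */
    static byte[] encodeBody(String body, String acceptEncoding) throws IOException {
        byte[] raw = body.getBytes(StandardCharsets.UTF_8);
        if (!acceptsGzip(acceptEncoding)) {
            return raw;
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String page = "<p>Hello world!</p>";
        System.out.println(acceptsGzip("gzip, deflate"));  // prints true
        System.out.println(acceptsGzip("identity"));       // prints false
        System.out.println(encodeBody(page, "identity").length == page.length()); // prints true
    }
}
```

Keeping the negotiation separate from the compression makes it straightforward to pass responses through untouched for user agents that do not advertise gzip support.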

If you need assistance with compression, contact Oracle Consulting Services.

83.4 JSP Design

If compression is not an option, consider restructuring your JSP pages so that white space does not occur in the output.

You can do this by changing the JSP example shown in White Space and JSP to the following:

<%@ page import="my class name" 
%><%@ page import="my class 2" 
%><cs:ftcs><p>Hello world!</p></cs:ftcs>

While this is not as elegant (or readable), it results in page output without any white space whatsoever before the <p> tag. An intermediate solution may be something like the following:

<%@ page import="my class name" 
%><%@ page import="my class 2" 
%><cs:ftcs> 
<p>Hello world!</p>
</cs:ftcs>
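The difference between these layouts can be illustrated with a simplified model of what reaches the output. The sketch below is not a JSP container: the class JspWhiteSpaceModel and its render method are hypothetical, and the model assumes that page directives and the <cs:ftcs> wrapper tags emit nothing themselves while all other text, including line breaks and tabs, passes through unchanged.

```java
import java.util.regex.Pattern;

// Hypothetical, simplified model of JSP white space handling; not a real JSP engine.
class JspWhiteSpaceModel {

    // Assumption: directives (<%@ ... %>) and the cs:ftcs wrapper tags produce no output.
    // DOTALL lets a directive span a line break, as in the compact layout.
    private static final Pattern NON_EMITTING =
            Pattern.compile("<%@.*?%>|</?cs:ftcs>", Pattern.DOTALL);

    /** Removes the non-emitting constructs and leaves all other text verbatim. */
    static String render(String jsp) {
        return NON_EMITTING.matcher(jsp).replaceAll("");
    }

    /** Number of characters that precede the <p> tag in the modeled output. */
    static int leadingCharsBeforeParagraph(String jsp) {
        return render(jsp).indexOf("<p>");
    }

    public static void main(String[] args) {
        String formatted = "<%@ page import=\"my class name\"%>\n"
                + "<%@ page import=\"my class 2\"%>\n"
                + "<cs:ftcs>\n"
                + "\t<p>Hello world!</p>\n"
                + "</cs:ftcs>";
        String compact = "<%@ page import=\"my class name\"\n"
                + "%><%@ page import=\"my class 2\"\n"
                + "%><cs:ftcs><p>Hello world!</p></cs:ftcs>";
        String intermediate = "<%@ page import=\"my class name\"\n"
                + "%><%@ page import=\"my class 2\"\n"
                + "%><cs:ftcs>\n"
                + "<p>Hello world!</p>\n"
                + "</cs:ftcs>";
        System.out.println(leadingCharsBeforeParagraph(formatted));    // prints 4
        System.out.println(leadingCharsBeforeParagraph(compact));      // prints 0
        System.out.println(leadingCharsBeforeParagraph(intermediate)); // prints 1
    }
}
```

In this model the formatted page carries three line breaks and a tab ahead of the <p>, the compact page carries none, and the intermediate page carries a single line break.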

For extensive examples of how to address white space issues in JSP, refer to the WebServices elements in the ElementCatalog, which are included with WebCenter Sites.