46 White Space and Compression

When WebCenter Sites streams a text page, the page may contain a significant amount of white space (spaces, carriage returns, and tabs) that has no effect on the data consumed by the client. This white space is visible to anyone who views the page source, which is undesirable. It also needlessly increases the size of the response and therefore the bandwidth used. Consequently, it is beneficial to eliminate white space whenever possible.

46.1 White Space and JSP

The JSP specification requires that all white space be preserved. Thus, a page that looks like this:

<%@ page import="my class name"%>
<%@ page import="my class 2"%>
<cs:ftcs>
	<p>Hello world!</p>
</cs:ftcs>

will have three carriage returns and a tab preceding the <p>, because each of the three lines above it ends in a line break that JSP preserves, so the text lands on the fourth line of the output once the JSP has been interpreted. With more complicated pages, the problem is compounded.

46.2 White Space and XML

The WebCenter Sites XML processing language, being a proprietary set of XML-compliant tags, does not follow the white space preservation rules of JSP. As such, a WebCenter Sites XML page like this:

<?xml version="1.0" ?>
<FTCS>
<p>Hello World!</p>
</FTCS>

will output <p> as its first characters, because the XML parser strips all of the white space (unless XML debug is enabled, in which case all of the white space is preserved).

46.3 Compression

Because white space is an artifact of writing well-formatted code, its presence is a side effect of programming practices that benefit the developer. Its impact on the consumer and the customer is minimal except for bandwidth. To address bandwidth, the output of all text-based pages can be compressed. Compression is done on the server side, and decompression is done by the consumer's user agent (browser), so the process is completely transparent to the end user. This sort of compression can yield up to an 80% reduction in bandwidth use.

One commonly used compression mechanism is the mod_gzip extension to the Apache web server. This module automatically gzips all output sent to a user agent, provided that the user agent can decompress it. Configuration is minimal and its effectiveness is quite high. It can be obtained from SourceForge (http://sourceforge.net/projects/mod-gzip/). Similar tools are available for other common web servers such as IIS.
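To get a feel for why compression largely neutralizes the cost of white space, the following minimal Java sketch gzips a synthetic, indentation-heavy page and reports the sizes before and after. It uses only the standard java.util.zip classes. The class name WhitespaceGzipDemo, the samplePage helper, and the page content are hypothetical, and the savings on a real site will differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; not part of WebCenter Sites.
final class WhitespaceGzipDemo {

    // Server side: gzip the response body before it is sent.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Client side: what a gzip-capable user agent does on receipt.
    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Synthetic page: one paragraph repeated with blank lines and tab indentation.
    static String samplePage() {
        StringBuilder page = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            page.append("\n\n\t\t<p>Hello world!</p>\n");
        }
        return page.toString();
    }

    public static void main(String[] args) {
        byte[] raw = samplePage().getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("uncompressed bytes: " + raw.length);
        System.out.println("gzipped bytes:      " + packed.length);
    }
}
```

Repetitive runs of spaces, tabs, and line breaks are exactly the kind of input gzip handles well, which is why compressed output is far less sensitive to source formatting than uncompressed output.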

Another possibility is to compress at the application server layer and leave the web server alone. This is best done by connecting a standard servlet filter to Satellite Server (or to WebCenter Sites if Satellite Server is not being used). The servlet filter is invoked in a prescribed order before and/or after the specified servlet, and during invocation it can compress the output before sending it to compatible user agents, in the same way that mod_gzip does. One such compression filter can be found at SourceForge (http://sourceforge.net/projects/pjl-comp-filter/).
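The core decision such a filter makes can be sketched without the servlet API: inspect the request's Accept-Encoding header and gzip the body only when the user agent advertises support. The following is a simplified illustration, not the code of the filter linked above. The class and method names (GzipNegotiation, acceptsGzip, encodeBody) are hypothetical, and the header parsing deliberately ignores the * wildcard.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; a real filter would wrap the servlet response instead.
final class GzipNegotiation {

    // True if the Accept-Encoding value lists gzip with a non-zero quality.
    static boolean acceptsGzip(String acceptEncoding) {
        if (acceptEncoding == null) {
            return false;
        }
        for (String entry : acceptEncoding.split(",")) {
            String[] parts = entry.trim().split(";");
            if (parts.length == 0) {
                continue; // malformed entry such as a lone ";"
            }
            String coding = parts[0].trim().toLowerCase(Locale.ROOT);
            if (!coding.equals("gzip") && !coding.equals("x-gzip")) {
                continue;
            }
            for (int i = 1; i < parts.length; i++) {
                String param = parts[i].trim().toLowerCase(Locale.ROOT);
                if (param.startsWith("q=")) {
                    try {
                        // "q=0" means the coding is explicitly refused.
                        return Double.parseDouble(param.substring(2)) > 0.0;
                    } catch (NumberFormatException e) {
                        return false;
                    }
                }
            }
            return true; // listed without a q-value
        }
        return false;
    }

    // Returns the bytes to send: gzipped for compatible user agents,
    // unchanged otherwise. The caller would also set the
    // "Content-Encoding: gzip" response header when compression is applied.
    static byte[] encodeBody(String acceptEncoding, byte[] body) {
        if (!acceptsGzip(acceptEncoding)) {
            return body;
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(acceptsGzip("gzip, deflate"));
        System.out.println(acceptsGzip("identity"));
    }
}
```

Keeping the negotiation separate from the compression step means user agents that cannot decompress gzip still receive the original, uncompressed page.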

If you are interested in compression but need assistance, contact Oracle Consulting Services.

46.4 JSP Design

If compression is not an option, consider restructuring your JSP pages so that they do not emit the white space in the first place. For example, the JSP page shown in Section 46.1 can be changed to this:

<%@ page import="my class name"
%><%@ page import="my class 2"
%><cs:ftcs><p>Hello world!</p></cs:ftcs>

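While this is not as elegant (or readable), it produces page output with no white space at all before the <p> tag, because every remaining line break falls inside a directive, where it is not part of the output. An intermediate solution, which keeps the body readable at the cost of the line break after the opening <cs:ftcs> tag, may be something like this: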

<%@ page import="my class name"
%><%@ page import="my class 2"
%><cs:ftcs>
<p>Hello world!</p>
</cs:ftcs>

For extensive examples of how to address white space issues in JSP, refer to the WebServices elements in the ElementCatalog, which are included with WebCenter Sites.