19 HTTP Client Access

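This chapter describes the Apache Commons HttpClient library and how WebCenter Sites integrates with it.

This chapter contains the following sections:

  • Section 19.1, "Apache Commons HttpClient"

  • Section 19.2, "Integration with WebCenter Sites"

  • Section 19.3, "HttpClient Parameters and WebCenter Sites Properties"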

19.1 Apache Commons HttpClient

WebCenter Sites uses Apache Commons HttpClient as the underlying library for all HTTP access. As of version 3, HttpClient supports the parameters that are posted at: http://jakarta.apache.org/commons/httpclient/preference-api.html

The parameters function as follows:

  • HttpClient parameters change the runtime behavior of HttpClient components. For example, if you want a Post operation to use a timeout that differs from the default, you can call getParams().setIntParameter("http.socket.timeout", 1000) on the PostMethod instance before executing it.

  • HttpClient parameters can be hierarchically linked. In top-down order, the levels of the hierarchy are: global, client, host, and method. Values that are set for parameters at higher levels are overridden by the values of equivalent parameters at lower levels.
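The override rule described above can be illustrated with a minimal sketch. It deliberately uses only java.util.Properties from the JDK, whose chained defaults behave in a similar top-down way, rather than HttpClient's own parameter classes; the class name ParamHierarchyDemo and the sample values are assumptions made for illustration.

```java
import java.util.Properties;

// Plain-JDK model of a four-level parameter hierarchy (global, client, host, method).
// This is an illustration only; it does not use HttpClient classes.
class ParamHierarchyDemo {

    // Creates a lower level whose lookups fall back to the given higher level.
    static Properties childOf(Properties parent) {
        return new Properties(parent);
    }

    public static void main(String[] args) {
        Properties global = new Properties();
        global.setProperty("http.socket.timeout", "5000");
        global.setProperty("http.protocol.content-charset", "ISO-8859-1");

        Properties client = childOf(global);

        Properties host = childOf(client);
        host.setProperty("http.socket.timeout", "2000");

        Properties method = childOf(host);
        method.setProperty("http.socket.timeout", "1000");

        // The lowest level wins for a value it sets itself.
        System.out.println(method.getProperty("http.socket.timeout"));
        // A value that is not set is resolved from the nearest higher level that defines it.
        System.out.println(method.getProperty("http.protocol.content-charset"));
        // Levels above the override are unaffected.
        System.out.println(client.getProperty("http.socket.timeout"));
    }
}
```

In this sketch, a value set at the method level hides the host and global values for that method only, while lookups made at the client level still see the global value.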

Despite its flexibility, HttpClient has a limitation: parameters can be set only programmatically. The library provides no configuration file in which users can specify parameter values or from which the library retrieves them automatically. The WebCenter Sites integration overcomes this limitation, as explained in Section 19.2, "Integration with WebCenter Sites."

Note:

WebCenter Sites uses the parameters that are posted at http://jakarta.apache.org/commons/httpclient/preference-api.html. The parameters are listed in Table 19-1, along with descriptions (duplicated from the site named above). Changes to parameters and their functionality as defined by HttpClient are not automatically supported.

19.2 Integration with WebCenter Sites

WebCenter Sites abstracts HttpClient functionality by allowing WebCenter Sites users to create user-configurable property files. After creating the files, users populate them with the required HttpClient parameters (that is, parameters whose values differ from the default values), and place the property files into the classpath. WebCenter Sites loads the property files from the classpath and parses the parameters according to a predefined syntax (shown in Table 19-1). The HttpAccess API retrieves the parameters and applies them at runtime.

WebCenter Sites supports a parameter hierarchy whose levels correspond directly to the levels that are defined in the HttpAccess Java API (provided in WebCenter Sites Java Docs). For each level, one or more property files can be created, depending on the implementation, and populated with any combination of HttpClient parameters. The levels and property file naming conventions are given below:

Note:

The property files must be created as text files, outside of the WebCenter Sites Property Editor. Property file names are case sensitive and must be in lower case throughout.

  • HttpAccess (level 1)

    Property File: httpaccess.properties. The user specifies parameters and their values in the httpaccess.properties file. This file is applied to all HttpAccess instances that are created.

    Overrides: Parameter values at the HttpAccess level are overridden by the values of equivalent parameters at levels 2, 3, and 4 (described below).

  • HostConfig (level 2)

    Property File: <protocol>-<hostname>-<port number>.properties. The user specifies host-specific parameters in each property file. For example, for a host named targetserver accessible at port 7001, the property file would be named http-targetserver-7001.properties and would contain HttpClient parameters specific to that host.

    Overrides: Parameter values at the HostConfig level override the values of equivalent parameters at the HttpAccess level.

  • Request (level 3)

    Property File: <request type>.properties where <request type> takes one of the following values: post, get, or login. The user specifies parameters specific to a Request. For example, post.properties specifies HttpClient parameters applicable to instances of post.

    Overrides: Parameter values at the Request level override the values of equivalent parameters at the HttpAccess and HostConfig levels.

  • Per host, per Request (level 4)

    Property File: <request type>-<protocol>-<host name>-<port number>.properties where <request type> takes one of the following values: post, get, or login. Parameters in this property file function as Request level parameters. However, they apply to a specific host.

    Overrides: Parameter values specified at the "Per host, per Request" level override the values of equivalent parameters at the HttpAccess, HostConfig, and Request levels for that particular host.

    The following example illustrates how an override takes effect from the "Per host, per Request" level. A user defines a property file named login-http-m2-7002.properties and specifies an http.connection.timeout of 100 seconds in it. The timeout applies only to login requests sent to the host named m2 on port 7002, and it overrides any timeout values specified for m2 at higher levels. Timeout values for all other hosts remain unaffected.
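The naming conventions for the four levels can be summarized in a short sketch. The helper below is hypothetical and is not part of the HttpAccess API; it only assembles the file names described above, listed from lowest precedence (level 1) to highest (level 4), and assumes that names are lowercased as the note on case sensitivity requires.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; WebCenter Sites does not expose this class.
class PropertyFileNames {

    // Returns the candidate property file names, lowest precedence first.
    static List<String> inPrecedenceOrder(String requestType, String protocol,
                                          String host, int port) {
        String hostPart = (protocol + "-" + host + "-" + port).toLowerCase(Locale.ROOT);
        String request = requestType.toLowerCase(Locale.ROOT);

        List<String> names = new ArrayList<>();
        names.add("httpaccess.properties");                   // level 1: HttpAccess
        names.add(hostPart + ".properties");                  // level 2: HostConfig
        names.add(request + ".properties");                   // level 3: Request
        names.add(request + "-" + hostPart + ".properties");  // level 4: per host, per Request
        return names;
    }

    public static void main(String[] args) {
        // Names that could apply to a login request sent to host m2 on port 7002.
        for (String name : inPrecedenceOrder("login", "http", "m2", 7002)) {
            System.out.println(name);
        }
    }
}
```

For the example host, the last name produced is login-http-m2-7002.properties, the file whose values take precedence over the other three.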

WebCenter Sites supports all parameters defined by HttpClient. Furthermore, WebCenter Sites extends HttpClient functionality by enabling users to configure parameters externally, in property files, and by allowing parameters to be specified at a fourth level (per host, per Request).

In addition to all the parameters supported by HttpClient, the WebCenter Sites HttpAccess API defines a configuration property, cs.SecureProtocolSocketFactory, in httpaccess.properties. This property specifies the protocol socket factory to be used for SSL (Secure Sockets Layer) connections. Three implementations are available at http://jakarta.apache.org/commons/httpclient/sslguide.html. Note that if you want to use SSL to connect to a host that uses self-signed certificates, you must configure the following:

cs.SecureProtocolSocketFactory=org.apache.commons.httpclient.contrib.ssl.EasySSLProtocolSocketFactory

WebCenter Sites does not provide the EasySSLProtocolSocketFactory class. You can obtain this class at http://jakarta.apache.org/commons/httpclient/sslguide.html. Make sure to build it differently for Sun and IBM JDKs, as the Apache implementation (at the link directly above) is Sun-specific. Alternatively, you can write your own socket factory implementation based on the HttpClient documentation.

Note that the HttpClient hierarchy has two levels, connection manager and connection, for which parameters cannot be set explicitly because the HttpAccess API does not directly support them. However, users can still configure those parameters by specifying them at a corresponding lower or higher level in the HttpAccess API.

19.2.1 Implementation

To configure WebCenter Sites for HTTP access, the user creates property files with the appropriate names and places them in the classpath, where the infrastructure retrieves and uses them. Given the number of parameters, this may seem like a good deal of work. By default, however, no properties or property files need to be created. When none are present, HttpClient uses its own "best guess" default values, which usually suit the given system. In about 95% of cases, these values are sufficient and users need not create any property files.

In the rare cases when one needs parameter values other than defaults, the WebCenter Sites infrastructure makes it possible to implement them by allowing the user to specify configuration in property files. This gives the user the full range of configuration capabilities that HttpClient itself is built upon.
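The "no file, so defaults apply" behavior can be sketched with standard JDK calls. The loader below is an assumption about how such a lookup might be written, not the actual WebCenter Sites implementation; the class name OptionalPropertiesLoader is hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical loader for illustration; not the WebCenter Sites implementation.
class OptionalPropertiesLoader {

    // Loads a property file from the classpath. A missing file yields an empty
    // Properties object, so callers fall back to their default values.
    static Properties loadIfPresent(String fileName) throws IOException {
        Properties props = new Properties();
        ClassLoader loader = OptionalPropertiesLoader.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(fileName)) {
            if (in != null) {
                props.load(in);
            }
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        Properties props = loadIfPresent("httpaccess.properties");
        // The second argument is returned when the parameter is not configured.
        String timeout = props.getProperty("http.connection.timeout", "<HttpClient default>");
        System.out.println("http.connection.timeout = " + timeout);
    }
}
```

Because getResourceAsStream returns null for a resource that is not on the classpath, an installation without any property files simply sees no overrides.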

19.3 HttpClient Parameters and WebCenter Sites Properties

The table in this section describes parameters that are supported by Apache Commons HttpClient. Descriptions in the table are duplicated from the following site:

http://jakarta.apache.org/commons/httpclient/preference-api.html

Syntax and default values are defined by Oracle, as they are specific to WebCenter Sites. Where syntax is straightforward, the "Syntax" field in the table below is left blank.

Note that changes to the parameters and their functionality are not automatically supported. Information in the table below is valid until Oracle issues an update.

In addition to supporting HttpClient parameters, WebCenter Sites defines the following property:

Property: cs.SecureProtocolSocketFactory

Usage: applicable only to the httpaccess.properties file

Description: defines the class used for opening SSL socket connections

Default: empty. The system will use the JSSE-based default implementation of HttpClient. Details are available at: http://jakarta.apache.org/commons/httpclient/sslguide.html

Table 19-1 HttpClient Parameters

Name Description

http.authentication.preemptive

Defines whether authentication should be attempted preemptively.

Type: Boolean

Default value: <undefined>

http.connection.stalecheck

Determines whether stale connection check is to be used. Disabling stale connection check may result in slight performance improvement at the risk of getting an I/O error when executing a request over a connection that has been closed at the server side.

Type: Boolean

Default value: <undefined>

http.connection.timeout

The timeout until a connection is established. A value of zero means the timeout is not used.

Type: Integer

Default value: <undefined>

http.connection-manager.class

The default HTTP connection manager class.

Type: Class

Syntax: Fully qualified classname

Default value: SimpleHttpConnectionManager class

http.connection-manager.max-per-host

Defines the maximum number of connections allowed per host configuration. These values only apply to the number of connections from a particular instance of HttpConnectionManager. This parameter expects a value of type Map. The value should map instances of HostConfiguration to Integers. The default value can be specified using ANY_HOST_CONFIGURATION.

Type: Map

Syntax: Specify ${<host>;<port>;<protocol>; <max connections>}

Default value: <undefined>

http.connection-manager.max-total

Defines the maximum number of connections allowed overall. This value only applies to the number of connections from a particular instance of HttpConnectionManager.

Type: Integer

Default value: <undefined>

http.connection-manager.timeout

The timeout in milliseconds used when retrieving an HTTP connection from the HTTP connection manager.

Type: Long

Default value: <undefined>

http.dateparser.patterns

Date patterns used for parsing. The patterns are stored in a Collection and must be compatible with SimpleDateFormat.

Type: Collection

Syntax: Specify the collection with each element enclosed in ${<element>}; for example, ${EEE, dd-MMM-yyyy HH-mm-ss z}${EEE, dd MMM yy HH:mm:ss z}

Default value:

EEE, dd MMM yyyy HH:mm:ss zzz
EEEE, dd-MMM-yy HH:mm:ss zzz
EEE MMM d HH:mm:ss yyyy
EEE, dd-MMM-yyyy HH:mm:ss z
EEE, dd-MMM-yyyy HH-mm-ss z
EEE, dd MMM yy HH:mm:ss z
EEE dd-MMM-yyyy H:mm:ss z
EEE dd MMM yyyy HH:mm:ss z
EEE dd-MMM-yyyy HH-mm-ss z
EEE dd-MMM-yy HH:mm:ss z
EEE dd MMM yy HH:mm:ss z
EEE,dd-MMM-yy HH:mm:ss z
EEE,dd-MMM-yyyy HH:mm:ss z
EEE, dd-MM-yyyy HH:mm:ss z

http.default-headers

The request headers to be sent by default with each request. This parameter expects a value of type Collection. The collection is expected to contain HTTP headers.

Type: Collection

Syntax: Specify each header in ${name=<header name>; value=<header value>}

Default value: <undefined>

http.method.multipart.boundary

The multipart boundary string to use in conjunction with the MultipartRequestEntity. When this property is not set, a random value will be generated for each request.

Type: String

Syntax:

Default value: <undefined>

http.method.response.buffer.warnlimit

The maximum buffered response size (in bytes) that triggers no warning. Buffered responses exceeding this size will trigger a warning in the log. If not set, the limit is 1 MB.

Type: Integer

Default value: 1

http.method.retry-handler

The method retry handler used for retrying failed methods. For details see the Exception handling guide.

Type: HttpMethodRetryHandler

Syntax: Fully qualified classname

Default value: default implementation

http.protocol.allow-circular-redirects

Defines whether circular redirects (redirects to the same location) should be allowed. The HTTP specification is not sufficiently clear about whether circular redirects are permitted; therefore, they can optionally be enabled.

Type: Boolean

Default value: <undefined>

http.protocol.content-charset

The charset to be used for encoding content body.

Type: String

Default value: ISO-8859-1

http.protocol.cookie-policy

The cookie policy to be used for cookie management.

Type: String

Default value: CookiePolicy.RFC_2109

http.protocol.credential-charset

The charset to be used when encoding credentials. If not defined, the value of http.protocol.element-charset is used.

Type: String

Default value: <undefined>

http.protocol.element-charset

The charset to be used for encoding/decoding HTTP protocol elements (status line and headers).

Type: String

Default value: US-ASCII

http.protocol.expect-continue

Activates "Expect: 100-Continue" handshake for the entity enclosing methods. The "Expect: 100-Continue" handshake allows a client that is sending a request message with a request body to determine if the origin server is willing to accept the request (based on the request headers) before the client sends the request body.

The use of the "Expect: 100-continue" handshake can result in noticeable performance improvement for entity enclosing requests (such as POST and PUT) that require the target server's authentication. "Expect: 100-continue" handshake should be used with caution, as it may cause problems with HTTP servers and proxies that do not support HTTP/1.1 protocol.

Type: Boolean

Default value: <undefined>

http.protocol.head-body-timeout

Sets the period of time in milliseconds to wait for a content body sent in response to a HEAD request from a non-compliant server. If the parameter is not set or is set to -1, the non-compliant response body check is disabled.

Type: Integer

Default value: <undefined>

http.protocol.max-redirects

Defines the maximum number of redirects to be followed. The limit on number of redirects is intended to prevent infinite loops.

Type: Integer

Default value: <undefined>

http.protocol.reject-head-body

Defines whether the content body sent in response to HEAD request should be rejected.

Type: Boolean

Default value: <undefined>

http.protocol.reject-relative-redirect

Defines whether relative redirects should be rejected.

Type: Boolean

Default value: <undefined>

http.protocol.single-cookie-header

Defines whether cookies should be put on a single request header.

Type: Boolean

Default value: <undefined>

http.protocol.status-line-garbage-limit

Defines the maximum number of ignorable lines allowed before the status line of an HTTP response is expected.

With HTTP/1.1 persistent connections, broken scripts can return a wrong Content-Length (more bytes are sent than specified). Unfortunately, in some cases this cannot be detected after the bad response, but only before the next one, so HttpClient must be able to skip the surplus lines. Set this parameter to 0 to disallow any garbage or empty lines before the status line. To specify no limit, use Integer.MAX_VALUE.

Type: Integer

Default value: <undefined>

http.protocol.strict-transfer-encoding

Defines whether responses with an invalid Transfer-Encoding header should be rejected.

Type: Boolean

Default value: <undefined>

http.protocol.unambiguous-statusline

Defines whether HTTP methods should reject an ambiguous HTTP status line.

Type: Boolean

Default value: <undefined>

http.protocol.version

The HTTP protocol version used by default by the HTTP methods.

Type: HttpVersion

Syntax: <(int)major>.<(int)minor>; e.g., 1.1

Default value: HttpVersion.HTTP_1_1

http.protocol.warn-extra-input

Defines HttpClient's behavior when a response provides more bytes than expected (specified with Content-Length header, for example). Such surplus data makes the HTTP connection unreliable for keep-alive requests, as malicious response data (faked headers etc.) can lead to undesired results on the next request using that connection. If this parameter is set to true, any detection of extra input data will generate a warning in the log.

Type: Boolean

Default value: <undefined>

http.socket.linger

The linger time (SO_LINGER) in seconds. This option disables/enables immediate return from a close() of a TCP Socket. Enabling this option with a non-zero Integer timeout means that a close() will block pending the transmission and acknowledgement of all data written to the peer, at which point the socket is closed gracefully. Value 0 implies that the option is disabled. Value -1 implies that the JRE default is used.

Type: Integer

Default value: <undefined>

http.socket.receivebuffer

The value to set on Socket.setReceiveBufferSize(int). This value is a suggestion to the kernel from the application about the size of buffers to use for the data to be received over the socket.

Type: Integer

Default value: <undefined>

http.socket.sendbuffer

The value to set on Socket.setSendBufferSize(int). This value is a suggestion to the kernel from the application about the size of buffers to use for the data to be sent over the socket.

Type: Integer

Default value: <undefined>

http.socket.timeout

Sets the socket timeout (SO_TIMEOUT) in milliseconds to be used when executing the method. A timeout value of zero is interpreted as an infinite timeout.

Type: Integer

Default value: <undefined>

http.socket.timeout

The default socket timeout (SO_TIMEOUT) in milliseconds which is the timeout for waiting for data. A timeout value of zero is interpreted as an infinite timeout. This value is used when no socket timeout is set in the HTTP method parameters.

Type: Integer

Default value: <undefined>

http.tcp.nodelay

Determines whether Nagle's algorithm is to be used. Nagle's algorithm tries to conserve bandwidth by minimizing the number of segments that are sent. When applications need to decrease network latency and increase performance, they can disable Nagle's algorithm (by enabling TCP_NODELAY). Data will be sent earlier, at the cost of an increase in bandwidth consumption.

Type: Boolean

Default value: <undefined>

http.useragent

The content of the User-Agent header used by the HTTP methods.

Type: String

Default value: <Official release name>; for example, Jakarta Commons-HttpClient/3.0
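The ${<element>} collection syntax that Table 19-1 gives for parameters such as http.dateparser.patterns can be illustrated with a small parser. This is a sketch under the assumption that elements never contain a closing brace; the class name CollectionSyntaxParser is hypothetical, and the parser that WebCenter Sites actually uses is not shown in this chapter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for illustration; assumes elements contain no '}' character.
class CollectionSyntaxParser {

    // Matches one ${...} element and captures the text between the braces.
    private static final Pattern ELEMENT = Pattern.compile("\\$\\{([^}]*)\\}");

    // Splits a property value such as "${a}${b}" into its elements, in order.
    static List<String> parseElements(String value) {
        List<String> elements = new ArrayList<>();
        Matcher matcher = ELEMENT.matcher(value);
        while (matcher.find()) {
            elements.add(matcher.group(1).trim());
        }
        return elements;
    }

    public static void main(String[] args) {
        String value = "${EEE, dd-MMM-yyyy HH-mm-ss z}${EEE, dd MMM yy HH:mm:ss z}";
        for (String pattern : parseElements(value)) {
            System.out.println(pattern);
        }
    }
}
```

Enclosing each element in ${...} lets an element contain commas and spaces, as date patterns do, without being split at those characters.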