Oracle iPlanet Web Proxy Server 4.0.14 NSAPI Developer's Guide

Chapter 8 Hypertext Transfer Protocol

The Hypertext Transfer Protocol (HTTP) is a protocol that enables a client such as a web browser and a web proxy server to communicate with each other.

HTTP is based on a request-response model. The browser opens a connection to the server and sends a request. The server processes the request and generates a response, which it sends to the browser. Unless the connection is kept alive for further requests, the server then closes the connection.

This chapter provides a short introduction to a few HTTP basics. For more information on HTTP, see the IETF home page at: http://www.ietf.org/home.html.

This chapter contains the following sections:

HTTP Compliance
HTTP Requests
Server Responses
Buffered Streams

HTTP Compliance

Proxy Server 4 supports HTTP/1.1. Previous versions of the server supported HTTP/1.0. The server is conditionally compliant with the HTTP/1.1 proposed standard, as approved by the Internet Engineering Steering Group (IESG) and the Internet Engineering Task Force (IETF) HTTP working group.

For more information on the criteria for being conditionally compliant, see the Hypertext Transfer Protocol -- HTTP/1.1 specification (RFC 2068) at: http://www.ietf.org/rfc/rfc2068.txt?number=2068

HTTP Requests

A request from a browser to a server includes the following information:

Request Method, URI, and Protocol Version

A browser can request information using a number of methods. The commonly used methods are:

GET — Requests the specified resource such as a document or image
HEAD — Requests only the header information for the document
POST — Requests that the server accept some data from the browser, such as form input for a CGI program
PUT — Replaces the contents of a server’s document with data from the browser

Request Headers

The browser can send headers to the server. Most headers are optional.

The following table lists some of the commonly used request headers.

Table 8–1 Common Request Headers


Request Header	Description
`Accept`	File types the browser can accept.
`Authorization`	Used if the browser wants to authenticate itself with a server. Information such as the user name and password is included.
`User-Agent`	Name and version of the browser software.
`Referer`	URL of the document where the user clicked the link.
`Host`	Internet host and port number of the resource being requested.

Request Data

If the browser has made a POST or PUT request, it sends data after the blank line following the request headers. If the browser sends a GET or HEAD request, there is no data to send.
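The layout described above (request line, headers, blank line, optional data) can be illustrated with a small parser. This is a minimal sketch, not the server's own parsing code: the function name parse_request, the sample URI, and the sample host are assumptions made for illustration.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP request into request line, headers, and data."""
    # The first blank line (CRLF CRLF) separates the headers from the data.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line: method, URI, and protocol version.
    method, uri, version = lines[0].split(" ", 2)

    # Header names are case-insensitive, so store them in lowercase.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, uri, version, headers, body


# Hypothetical POST request carrying form input after the blank line.
raw = (
    b"POST /cgi-bin/form HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 9\r\n"
    b"\r\n"
    b"name=test"
)
method, uri, version, headers, body = parse_request(raw)
```

A GET or HEAD request parsed the same way yields an empty body, because nothing follows the blank line.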

Server Responses

The server’s response includes the following information:

HTTP Protocol Version, Status Code, and Reason Phrase

In response to a request, the server sends back a status code, which is a three-digit number. The five categories of status codes are:

100-199 — A provisional response
200-299 — A successful transaction
300-399 — The requested resource should be retrieved from a different location
400-499 — An error was caused by the browser
500-599 — A serious error occurred in the server
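Because the category is determined by the first digit of the code, the classification above can be sketched in a few lines. The function name status_category and the short category labels are illustrative assumptions, not part of NSAPI.

```python
def status_category(code: int) -> str:
    """Map a three-digit HTTP status code to one of the five categories."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    categories = {
        1: "provisional",
        2: "success",
        3: "redirection",
        4: "browser error",
        5: "server error",
    }
    # Integer division by 100 keeps only the leading digit.
    return categories[code // 100]
```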

The following table lists some common status codes.

Table 8–2 Common HTTP Status Codes


Status Code	Meaning
`200`	Request has succeeded for the method used (`GET`, `POST`, `HEAD`).
`201`	The request has resulted in the creation of a new resource referenced by the returned URI.
`206`	Partial content. The server has sent a response to a byte-range request.
`302`	Found. Redirection to a new URL. The original URL has moved. This code is not an error because most browsers will get the new page.
`304`	Use a local copy. If a browser already has a page in its cache, and the page is requested again, some browsers, such as Netscape Navigator, relay to the web server the “last-modified” timestamp on the browser’s cached copy. If the copy on the server is not newer than the browser’s copy, the server returns a 304 code instead of returning the page, reducing unnecessary network traffic. This code is not an error.
`400`	Sent if the request is not a valid HTTP/1.0 or HTTP/1.1 request. For example, HTTP/1.1 requires a host to be specified either in the `Host` header or as part of the URI on the request line.
`401`	Unauthorized. The user requested a document but didn’t provide a valid user name or password.
`403`	Forbidden. Access to this URL is forbidden.
`404`	Not found. The document requested isn’t on the server. This code can also be sent if the server has been instructed to send this response to unauthorized users.
`408`	If the client starts a request but does not complete it within the keep-alive timeout configured in the server, then this response will be sent and the connection closed. The request can be repeated with another open connection.
`411`	The client submitted a `POST` request with chunked encoding, which is of variable length, but the resource or application on the server requires a `Content-Length` header giving a fixed length. This code tells the client to resubmit its request with a `Content-Length` header.
`413`	Some applications, for example, certain NSAPI plug-ins, cannot handle very large amounts of data, so these applications will return this code.
`414`	The URI is longer than the maximum the web server is willing to serve.
`416`	Data was requested outside the range of a file.
`500`	Server error. A server-related error occurred. The server administrator should check the server’s error log.
`503`	Sent if the quality of service mechanism is enabled and the bandwidth or connection limits have been reached. The server then answers further requests with this code. See the "quality of service" section.
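The 304 entry in the table above describes a comparison between the timestamp on the browser's cached copy and the document's modification time on the server. The following is a minimal sketch of that decision, assuming the browser relays its timestamp in the standard HTTP If-Modified-Since request header (the table does not name the header). The function name conditional_status is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(if_modified_since, last_modified: datetime) -> int:
    """Return 304 if the cached copy is still current, otherwise 200."""
    if if_modified_since is None:
        return 200  # No cached copy was mentioned: send the page.
    try:
        cached = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # Unparseable date: ignore it and send the page.
    if cached.tzinfo is None:
        cached = cached.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop the microseconds.
    if last_modified.replace(microsecond=0) <= cached:
        return 304  # Server copy is not newer: use the local copy.
    return 200


doc_time = datetime(2022, 1, 1, tzinfo=timezone.utc)
header = format_datetime(doc_time, usegmt=True)
status = conditional_status(header, doc_time)
```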

Response Headers

The response headers contain information about the server and the response data.

The following table lists some common response headers.

Table 8–3 Common Response Headers


Response Header	Description
`Server`	Name and version of the web server
`Date`	Current date (in Greenwich Mean Time)
`Last-Modified`	Date when the document was last modified
`Expires`	Date when the document expires
`Content-Length`	Length of the data that follows (in bytes)
`Content-Type`	MIME type of the following data
`WWW-Authenticate`	Used during authentication. Includes information that tells the browser software what is necessary for authentication, such as a user name and password.

Response Data

The server sends a blank line after the last header. The server then sends the response data such as an image or an HTML page.
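To make the response layout concrete (status line, response headers, blank line, response data), the following sketch assembles a complete response as bytes. It is an illustration only: build_response is a hypothetical helper, and a real server would also add headers such as Server and Date.

```python
from http import HTTPStatus


def build_response(status: int, body: bytes, content_type: str) -> bytes:
    """Assemble a minimal HTTP/1.1 response message."""
    # Status line: protocol version, status code, and reason phrase.
    status_line = f"HTTP/1.1 {status} {HTTPStatus(status).phrase}"
    headers = [
        ("Content-Type", content_type),
        ("Content-Length", str(len(body))),
    ]
    head = "\r\n".join([status_line] + [f"{n}: {v}" for n, v in headers])
    # A blank line follows the last header, then the response data.
    return head.encode("iso-8859-1") + b"\r\n\r\n" + body


response = build_response(200, b"<p>hello</p>", "text/html")
```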

Buffered Streams

Buffered streams improve the efficiency of network I/O, for example, the exchange of HTTP requests and responses, especially for dynamic content generation. Buffered streams are implemented as transparent NSPR I/O layers, so existing NSAPI modules can use them without any change.

The buffered streams layer adds the following features to Proxy Server:

Enhanced keep-alive support. When the response is smaller than the buffer size, the buffering layer generates the Content-Length header so that the client can detect the end of the response and reuse the connection for subsequent requests.
Response length determination. If the buffering layer cannot determine the length of the response, it uses HTTP/1.1 chunked encoding instead of the Content-Length header to convey the delineation information. If the client only understands HTTP/1.0, the server must close the connection to indicate the end of the response.
Deferred header writing. Response headers are written out as late as possible to give the servlets a chance to generate their own headers, for example, the session management header Set-Cookie.
Ability to understand request entity bodies with chunked encoding. Though popular clients do not use chunked encoding for sending POST request data, this feature is mandatory for HTTP/1.1 compliance.
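The choice between the Content-Length header, chunked encoding, and closing the connection, as described in the list above, can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not the buffering layer's actual implementation: choose_framing, encode_chunked, and decode_chunked are hypothetical names, and the decoder ignores trailers.

```python
def choose_framing(response_len: int, buffer_size: int, client_version: str) -> str:
    """Pick how the end of the response is signaled to the client."""
    if response_len < buffer_size:
        return "content-length"  # Whole response buffered: length is known.
    if client_version == "HTTP/1.1":
        return "chunked"  # Length unknown: delimit with chunked encoding.
    return "close"  # HTTP/1.0 client: close the connection at the end.


def encode_chunked(pieces) -> bytes:
    """Encode pieces of a body using HTTP/1.1 chunked encoding."""
    out = b""
    for piece in pieces:
        if piece:  # A zero-length chunk would end the body early.
            out += f"{len(piece):x}".encode("ascii") + b"\r\n" + piece + b"\r\n"
    return out + b"0\r\n\r\n"  # Last chunk has size zero.


def decode_chunked(data: bytes) -> bytes:
    """Decode a chunked body, such as chunked POST request data."""
    body, pos = b"", 0
    while True:
        eol = data.index(b"\r\n", pos)
        # The chunk size is hexadecimal; drop any chunk extension after ';'.
        size = int(data[pos:eol].split(b";")[0].decode("ascii"), 16)
        if size == 0:
            return body
        start = eol + 2
        body += data[start:start + size]
        pos = start + size + 2  # Skip the CRLF that ends the chunk.


encoded = encode_chunked([b"Hello, ", b"world"])
```

With chunked encoding the connection can stay open for subsequent requests even though the total length was never known in advance, which is why it is preferred over closing the connection for HTTP/1.1 clients.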

The improved connection handling and response length header generation provided by buffered streams also address the HTTP/1.1 protocol compliance issues, where absence of the response length headers is regarded as a category 1 failure. In previous Enterprise Server versions, the dynamic content generation programs were expected to send the length headers. If a CGI script did not generate the Content-Length header, the server had to close the connection to indicate the end of the response, breaking the keep-alive mechanism. However, keeping track of response length in CGI scripts or servlets is often very inconvenient, and as an application platform provider, the web server is expected to handle such low-level protocol issues.

Output buffering has been built into the functions that transmit data, such as net_write. You can specify the following Service SAF parameters that affect stream buffering. They are described in detail in Chapter 3, Syntax and Use of the magnus.conf File, in Oracle iPlanet Web Proxy Server 4.0.14 Configuration File Reference.

UseOutputStreamSize
ChunkedRequestBufferSize
ChunkedRequestTimeout

The UseOutputStreamSize, ChunkedRequestBufferSize, and ChunkedRequestTimeout parameters also have equivalent magnus.conf directives, as described in Chapter 3, Syntax and Use of the magnus.conf File, in Oracle iPlanet Web Proxy Server 4.0.14 Configuration File Reference. The obj.conf parameters override the magnus.conf directives.

The UseOutputStreamSize parameter can be set to zero (0) in the obj.conf file to disable output stream buffering. For the magnus.conf file, setting UseOutputStreamSize to zero has no effect.

To override the default behavior when invoking an SAF that uses one of the functions net_read or netbuf_grab, you can specify the value of the parameter in obj.conf, for example:

Service fn="my-service-saf" type=perf UseOutputStreamSize=8192