The following record properties describe the source of files that are fetched during a Web crawl.

Endeca Property Name

Property Value

Endeca.SourceType

Indicates the source type of the crawl. For records produced by the Web Crawler, the value is Web.

Endeca.Id

Provides a unique identifier for a record. Endeca.Id has the same value as Endeca.Web.URL.

Endeca.Web.Accept-Ranges

The value of the Accept-Ranges header field, which allows the server to indicate its acceptance of range requests for a resource.

Endeca.Web.Connection

The value of the Connection general-header field as returned from the server.

Endeca.Web.Content-Type

The value of the Content-Type header field, which indicates the media type of the entity-body. Examples of media types are text/html and image/gif.

Endeca.Web.ETag

The value of the ETag header field, which provides the current value of the entity tag for the requested variant.

Endeca.Web.Host

The Internet host and port number where the document resides. The absence of port information implies the default port for the service requested.

Endeca.Web.HTTP.Content-Length

The value of the Content-Length header field, which indicates the size of the entity-body.

Endeca.Web.HTTP.Status

The HTTP response status code, which indicates the outcome of the request (for example, 200 indicates a successful request).

Endeca.Web.Last-Modified

The value of the Last-Modified entity-header field, which indicates the date and time at which the origin server believes the file was last modified. Typically, the value is the file system last-modified time.

Endeca.Web.HTMLMetaTag.name

The value of an HTML meta tag, where name is the name of the meta tag. For example, Endeca.Web.HTMLMetaTag.keywords contains the keywords defined in the keywords meta tag.

Endeca.Web.SeedUrl

The URL of the seed from which this document's URL was discovered.

Endeca.Web.LinkedFromUrl

The URL of the page that contained the outlink to this page.

Endeca.Web.LinkedFromUrl.LinkText

The link text that the LinkedFromUrl page used to link to this page.

Endeca.Web.Server

The value of the Server response-header field, which contains information about the software used by the origin server to handle the request (for example, Apache-Coyote/1.1).

Endeca.Web.URL

The URL of the document.

Endeca.Web.URL.Protocol

The protocol of the source document (for example, http or https).
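
The properties above can be pictured as a flat set of name/value pairs on each record. The following Python sketch is illustrative only: the dictionary representation, the helper names meta_tags and is_consistent, and every property value are assumptions made for this example and are not part of the Web Crawler. It shows how the Endeca.Web.HTMLMetaTag.name properties can be collected, and how the stated relationship between Endeca.Id and Endeca.Web.URL can be checked.

```python
from urllib.parse import urlsplit

# Hypothetical record. The property names come from the table above;
# every value (and the use of plain strings) is invented for illustration.
record = {
    "Endeca.SourceType": "Web",
    "Endeca.Id": "https://www.example.com/docs/index.html",
    "Endeca.Web.URL": "https://www.example.com/docs/index.html",
    "Endeca.Web.URL.Protocol": "https",
    "Endeca.Web.HTTP.Status": "200",
    "Endeca.Web.Content-Type": "text/html",
    "Endeca.Web.HTMLMetaTag.keywords": "crawler, records",
}

META_PREFIX = "Endeca.Web.HTMLMetaTag."


def meta_tags(rec):
    """Return {meta tag name: value} for each Endeca.Web.HTMLMetaTag.* property."""
    return {
        key[len(META_PREFIX):]: value
        for key, value in rec.items()
        if key.startswith(META_PREFIX)
    }


def is_consistent(rec):
    """Check relationships between properties of one record.

    Endeca.Id == Endeca.Web.URL is stated in the table. Comparing
    Endeca.Web.URL.Protocol with the URL scheme is an assumption
    made for this sketch.
    """
    scheme = urlsplit(rec["Endeca.Web.URL"]).scheme
    return (
        rec["Endeca.Id"] == rec["Endeca.Web.URL"]
        and rec["Endeca.Web.URL.Protocol"] == scheme
    )


print(meta_tags(record))
print(is_consistent(record))
```

Stripping the shared prefix leaves the original meta tag name as the key, which is convenient when a page defines several meta tags.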

