The following record properties describe the source of files that are fetched from a Web crawl.
| Endeca Property Name | Property Value |
|---|---|
| | Indicates the source type of the crawl. The Web Crawler produces values with |
| | Provides a unique identifier for a record. |
| | The value of the Accept-Ranges header field, which allows the server to indicate its acceptance of range requests for a resource. |
| | The value of the Connection general-header field as returned from the server. |
| | The value of the Content-Type header field, which indicates the media type of the entity-body. Examples of media types are |
| | The value of the ETag header field, which provides the current value of the entity tag for the requested variant. |
| | The Internet host and port number where the document resides. The absence of port information implies the default port for the service requested. |
| | The value of the Content-Length header field, which indicates the size of the entity-body. |
| | The HTTP response status code, which determines the outcome of the request (for example, |
| | The value of the Last-Modified entity-header field, which indicates the date and time at which the origin server believes the file was last modified. Typically, the value is the file system last-modified time. |
| | The value of an HTML meta tag, where |
| | The URL of the seed from which this URL came. |
| | The URL of the page that contained the outlink to this page. |
| | The text that was used on the |
| | The value of the Server response-header field, which contains information about the software used by the origin server to handle the request (for example, |
| | The URL of the document. |
| | The protocol of the source document (for example, |
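
As a rough illustration of how the header-derived values in the table relate to a single HTTP fetch, the following Python sketch gathers them into a plain dictionary. It is not the Web Crawler's implementation: the function name `describe_fetch`, the dictionary keys, and the default-port table are all assumptions made for this example, and the keys are not actual Endeca property names.

```python
from urllib.parse import urlsplit

# Assumed default ports, used when a URL carries no explicit port.
DEFAULT_PORTS = {"http": 80, "https": 443}

# Response headers described in the table that are copied through unchanged.
COPIED_HEADERS = (
    "Accept-Ranges", "Connection", "Content-Type", "ETag",
    "Content-Length", "Last-Modified", "Server",
)


def describe_fetch(url, status, headers):
    """Build a dict of fetch metadata (key names are hypothetical)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    record = {
        "url": url,
        "protocol": scheme,
        # Host and optional port, exactly as written in the URL.
        "host": parts.netloc,
        # With no explicit port, fall back to the scheme's default.
        "effective_port": parts.port or DEFAULT_PORTS.get(scheme),
        "status": status,
    }
    for name in COPIED_HEADERS:
        value = lowered.get(name.lower())
        if value is not None:
            record[name] = value
    return record


record = describe_fetch(
    "http://www.example.com/docs/index.html",
    200,
    {"content-type": "text/html", "Content-Length": "1024"},
)
print(record["host"], record["effective_port"], record["Content-Type"])
```

Headers that the server did not send are simply left out of the dictionary, which mirrors the idea that a property exists only when the corresponding header field was returned.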