The CAS Server generates certain properties whether you crawl a file system, CMS, or custom data source extension.

The CAS Server generates record properties and assigns each property a qualified name, with a period (.) to separate qualifier terms. The CAS Server constructs the qualified name as follows:

The CAS Server may generate the following properties for all records:

Property Name

Property Value

Endeca.Action

The action that was taken with the document. Values are UPSERT (the file or folder has been added or modified) or DELETE (the file or folder has been deleted since the last crawl).

Endeca.SourceType

Indicates the source type of the crawl. Values are FILESYSTEM (for file system data sources), WEB (for Web servers), CMS (for Content Management System data sources), or EXTENSION (for data source extensions).

Endeca.Id

Provides a unique identifier for each record.

For file system crawls, Endeca.Id is the same as Endeca.FileSystem.Path: the full path to the file, including the file name. For archive files, this is a string pointing to a file within a container. This property also includes the PathWithinSourceArchive (if present).

For Web crawls, Endeca.Id is the same as Endeca.Web.Url.

For CMS crawls, Endeca.Id is the concatenation of the Endeca.CMS.RepositoryId and Endeca.CMS.ItemId properties, and the Endeca.CMS.ContentPieceId (if present). This property also includes the PathWithinSourceArchive (if present).

For data source extensions, a plug-in developer must add Endeca.Id to each record and assign it a value appropriate for the data source.

Endeca.SourceId

Indicates the name of the data source. This is the same as the id value of crawlId in a crawl configuration.

Endeca.File.IsArchive

A boolean that, if set to a value of true, indicates that the document is an archive file, such as a ZIP file. If the file is not identified as an archive, the property is absent. Note that archives are identified by their file extension or MIME type.

It is possible for a document to have both Endeca.File.IsArchive and Endeca.File.IsInArchive properties set, as archive files may contain other archive files nested within.

Endeca.File.IsInArchive

A boolean that, if set to a value of true, indicates that the document is extracted from an archive file. If the file is not an archived document, the property is absent.

Endeca.File.Size

The size of the document in bytes, as reported by the native file system, CMS, or an archive entry.

Endeca.File.SourceArchiveId

This property is added to all records that have the Endeca.File.IsInArchive property. It is intended to provide a reference to the original archive that was encountered in the file system or CMS. The value is the original archive's Endeca.FileSystem.Path or Endeca.Id property. In the case of nested archives, it is the top-level archive, because that is the original source in the file system or CMS being crawled.

Endeca.File.PathWithinSourceArchive

This property is added to all records that have the Endeca.File.SourceArchiveId property. The value of this property is the path to the current record within the source archive file. In the case of nested archive entries, it includes the path to the nested archive, appended with the path to the current record within the nested archive.
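As an illustration of how a downstream consumer might interpret these properties, the following is a minimal sketch and not part of the CAS API. It assumes each record has already been read into a plain Python dict that maps property names to single string values. The function name summarize, the sample paths, and the exact form of the archive entry's Endeca.Id are hypothetical. The sketch separates records by Endeca.Action and groups archive entries under their top-level source archive. Because Endeca.File.IsInArchive is absent rather than false for documents that are not archived, the code checks for it with a lookup that tolerates a missing key.

```python
from collections import defaultdict


def summarize(records):
    """Split records by Endeca.Action and group archive entries by source archive.

    `records` is assumed to be an iterable of dicts (property name -> string value).
    This representation is an assumption for illustration only.
    """
    upserts = []
    deletes = []
    entries_by_archive = defaultdict(list)

    for rec in records:
        rec_id = rec["Endeca.Id"]

        # DELETE means the item was removed since the last crawl.
        if rec.get("Endeca.Action") == "DELETE":
            deletes.append(rec_id)
            continue

        upserts.append(rec_id)

        # The property is absent (not "false") when the document is not
        # extracted from an archive, so use .get() instead of indexing.
        if rec.get("Endeca.File.IsInArchive") == "true":
            archive_id = rec["Endeca.File.SourceArchiveId"]
            inner_path = rec["Endeca.File.PathWithinSourceArchive"]
            entries_by_archive[archive_id].append(inner_path)

    return upserts, deletes, dict(entries_by_archive)


# Hypothetical sample records; paths and Id formats are made up for the example.
records = [
    {
        "Endeca.Action": "UPSERT",
        "Endeca.SourceType": "FILESYSTEM",
        "Endeca.Id": "/data/docs/report.txt",
        "Endeca.File.Size": "2048",
    },
    {
        "Endeca.Action": "UPSERT",
        "Endeca.SourceType": "FILESYSTEM",
        "Endeca.Id": "/data/docs/bundle.zip/inner/notes.txt",  # illustrative format
        "Endeca.File.IsInArchive": "true",
        "Endeca.File.SourceArchiveId": "/data/docs/bundle.zip",
        "Endeca.File.PathWithinSourceArchive": "inner/notes.txt",
    },
    {
        "Endeca.Action": "DELETE",
        "Endeca.SourceType": "FILESYSTEM",
        "Endeca.Id": "/data/docs/old.txt",
    },
]

upserts, deletes, entries_by_archive = summarize(records)
print(upserts)
print(deletes)
print(entries_by_archive)
```

Grouping on Endeca.File.SourceArchiveId rather than on an immediate parent archive follows the description above: for nested archives the value refers to the top-level archive, so all nested entries end up under a single key.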

