About collections

In the Endeca Server, collections are a data modeling feature: source records are partitioned into named collections according to their unique key assignments, and are then loaded into the Endeca Server.

Collections allow you to divide the data in a given data domain into multiple organized groupings (known in Studio as data sets). You can therefore build an Endeca Server application with multiple collections of records, all stored in the index of a single data domain.
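The partitioning idea can be illustrated with a minimal sketch. This is not the Endeca Server API: the collection names, key attribute names (SaleId, ReviewId), and the partition_records helper are all hypothetical, chosen only to show records being grouped into named collections by the unique key attribute they carry.

```python
from collections import defaultdict

# Hypothetical mapping: collection name -> attribute that acts as the
# unique key for records in that collection (illustrative names only).
COLLECTION_KEYS = {"Sales": "SaleId", "Reviews": "ReviewId"}


def partition_records(records):
    """Group source records into named collections by their key attribute."""
    collections = defaultdict(dict)
    for record in records:
        for name, key_attr in COLLECTION_KEYS.items():
            if key_attr in record:
                # Index the record by its unique key within its collection.
                collections[name][record[key_attr]] = record
                break
        # In this sketch, records carrying no known key attribute are
        # simply left out of every collection.
    return dict(collections)


source = [
    {"SaleId": 1, "Buyer": "John Doe", "Description": "blue kettle"},
    {"ReviewId": 10, "Reviewer": "John Doe", "Rating": 5},
    {"SaleId": 2, "Buyer": "Jane Roe", "Description": "red toaster"},
]

# One "data domain" holding two collections of records.
domain = partition_records(source)
```

Here domain holds both collections side by side, which mirrors the idea of several collections living in one data domain.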

Collections address a practical need. It is very common to have several different kinds of data that users want to search through: products for sale alongside how-to articles; vehicles alongside warranty claims; structured HR records of employee changes alongside satisfaction surveys; and so on. These different kinds of records are typically related and relevant to each other, but are often not useful to see mixed together. It would not be very useful, for example, to see a results list in a UI where some of the rows represented products for sale and others represented data sheets. Within each data domain, collections allow you to load your data into its logical groupings and keep it organized that way.

Keep in mind that collections are optional in your Endeca Server application. Endeca records are not required to be members of a collection, but you can use collections if this approach suits your data model.

Note: At least one collection (which is called a data set in Studio) must be present in an Endeca data domain before a Studio application can be configured to connect to that data domain. For this reason, if you work in a Studio environment, always ingest your source data into one or more collections.

Sample use case

At a high level, the Conversation Web Service has knowledge of multiple collections of data in a single Endeca Server data domain. This means that the query state is tracked per collection, and most content elements (such as NavigationMenu and RecordList) can operate on a single focal collection.

For example, consider an application for exploring product sales and online reviews. Imagine that Sales and Reviews are two collections of records. You might have two different tabs in your application: one for exploring Reviews, and the other for exploring Sales. Further, you might apply different filters on the two tabs, so that Review records are narrowed to only five-star reviews, while Sales records are limited by a text search on product description. Because the single State holds the selections for each kind of record separately, neither set of selections is lost as you switch between tabs.

On each tab, you see only data relevant to the collection in question: on the Reviews tab, you see a menu of available refinements whose values are useful for narrowing your Reviews (but not, notably, attributes that only appear on Sales records). You also see a record list with a page of just Review records, with no Sales records mixed in.
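The per-collection tracking described in this use case can be modeled with a short sketch. It is an illustration only and does not use the Conversation Web Service: the State class, its methods, and the sample records are hypothetical stand-ins for a single state that keeps a separate list of selections for each collection.

```python
class State:
    """Single query state holding a separate selection list per collection."""

    def __init__(self):
        # collection name -> list of (breadcrumb label, predicate) pairs
        self.selections = {}

    def select(self, collection, label, predicate):
        """Add a selection that applies to one collection only."""
        self.selections.setdefault(collection, []).append((label, predicate))

    def results(self, collection, records):
        """Return the records of one collection that pass its own selections."""
        filters = self.selections.get(collection, [])
        return [r for r in records if all(pred(r) for _, pred in filters)]

    def breadcrumbs(self, collection):
        """List the selection labels currently applied to a collection."""
        return [label for label, _ in self.selections.get(collection, [])]


reviews = [
    {"ReviewId": 10, "Reviewer": "John Doe", "Rating": 5},
    {"ReviewId": 11, "Reviewer": "Jane Roe", "Rating": 3},
]
sales = [
    {"SaleId": 1, "Buyer": "John Doe", "Description": "blue kettle"},
    {"SaleId": 2, "Buyer": "Jane Roe", "Description": "red toaster"},
]

state = State()
# Reviews tab: keep only five-star reviews.
state.select("Reviews", "Rating=5", lambda r: r["Rating"] == 5)
# Sales tab: simple text search on the product description.
state.select("Sales", "search: kettle", lambda r: "kettle" in r["Description"])

# Switching tabs just means asking the same state about another collection.
reviews_tab = state.results("Reviews", reviews)
sales_tab = state.results("Sales", sales)
```

Because both sets of selections live in one state object, moving from one tab to the other discards nothing, and each record list contains records from its own collection only.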

Cross filtering

When switching between navigating over the Reviews records and navigating over the Sales records, it can be useful to carry some filters between the two. A selection of only 5-star Reviews should not filter Sales, but other attributes may be shared between the two collections. For example, if user Jane has filtered her Reviews to only reviews written by John Doe, she may also want to automatically narrow her Sales to only purchases made by John Doe. Endeca Server supports these cross filtering scenarios with filter rules. For more information, see Filter Rules.

Filter rules allow you to tie the Reviewer attribute on your Reviews records to the Buyer attribute on your Sales records. Any filter that makes a selection on one of those attributes implicitly makes an additional selection on the other attribute. This means that if you narrow to only Reviews written by John Doe on the Reviews tab, and then switch to your Sales view, the selection "Buyer=John Doe" will have been added automatically to your Sales breadcrumbs as well. Cross filtering on multiple collections thus enables a key benefit of Endeca applications: discovery across multiple data sets.
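The propagation behavior of a filter rule can be sketched as follows. This is a simplified model, not the Endeca Server filter rule configuration or syntax: the FILTER_RULES structure and the apply_selection helper are hypothetical, and the sketch assumes a rule simply pairs one attribute in each of two collections.

```python
# Hypothetical filter rule: ties an attribute in one collection to an
# attribute in another. Each endpoint is a (collection, attribute) pair.
FILTER_RULES = [(("Reviews", "Reviewer"), ("Sales", "Buyer"))]


def apply_selection(selections, collection, attribute, value):
    """Record a selection and mirror it across any matching filter rule."""
    selections.setdefault(collection, {})[attribute] = value
    for left, right in FILTER_RULES:
        # A rule applies in both directions, so check each endpoint.
        for source, target in ((left, right), (right, left)):
            if source == (collection, attribute):
                target_collection, target_attribute = target
                selections.setdefault(target_collection, {})[target_attribute] = value
    return selections


selections = {}
# Rating is not tied by any rule, so it stays on Reviews only.
apply_selection(selections, "Reviews", "Rating", 5)
# Reviewer is tied to Buyer, so Sales implicitly gains "Buyer=John Doe".
apply_selection(selections, "Reviews", "Reviewer", "John Doe")

sales_breadcrumbs = [f"{attr}={val}" for attr, val in selections["Sales"].items()]
```

In this model the five-star selection never reaches Sales, while the Reviewer selection appears in the Sales breadcrumbs as a Buyer selection.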