Procedure for creating collections in the data domain

This topic provides a high-level overview of the steps necessary to create a collection.

The procedure assumes that you are creating two collections: Products (with the ProductID standard attribute as its unique property key) and Sales (with the SalesID standard attribute as its unique property key). It also assumes that each source record, when ingested, will have an assignment of either the ProductID attribute (in which case it belongs to the Products collection) or the SalesID attribute (in which case it belongs to the Sales collection).

To create a collection:

  1. Create an empty Endeca data domain.

    For example, you can use the Endeca Server create-dd command.

  2. Load the attribute schema (that is, the Property Description Records, or PDRs, for the standard attributes) into the data domain. In particular, make sure you create the ProductID and SalesID standard attributes as single-assign, unique attributes.
  3. Create the collections, as in this example that uses the putCollections operation:
    <soapenv:Envelope 
       xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" 
       xmlns:ns="http://www.endeca.com/endeca-server/sconfig/3/0">
       <soapenv:Header/>
       <soapenv:Body>
          <ns:putCollections>
             <ns:language>en</ns:language>
             <ns:collection collectionKey="Products" displayName="Product data" uniquePropertyKey="ProductID">
                <ns:description>product records for the region</ns:description>
                <ns:property key="Region">New England</ns:property>
             </ns:collection>
             <ns:collection collectionKey="Sales" displayName="Sales data" uniquePropertyKey="SalesID">
                <ns:description>sales information</ns:description>
                <ns:property key="Currency">$</ns:property>
             </ns:collection>
          </ns:putCollections>
       </soapenv:Body>
    </soapenv:Envelope>
  4. Load the source records into the data domain.

    As mentioned above, each source record should have an assignment of either the ProductID or SalesID attribute, which will also serve as the record spec (primary key) for the records.
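
The request in step 3 can also be generated programmatically. The following is a minimal Python sketch, not part of the Endeca tooling, that builds the same putCollections envelope with the standard library. The function name build_put_collections and the dictionary layout used to describe each collection are assumptions made for this illustration. Sending the envelope to the server is left out because the endpoint depends on your installation.

```python
import xml.etree.ElementTree as ET

SOAPENV = "http://schemas.xmlsoap.org/soap/envelope/"
SCONFIG = "http://www.endeca.com/endeca-server/sconfig/3/0"


def build_put_collections(collections, language="en"):
    """Return a putCollections SOAP envelope as a string.

    `collections` is a list of dicts with the keys collectionKey,
    displayName, uniquePropertyKey, description, and (optionally)
    properties. This layout is hypothetical, chosen for the sketch.
    """
    # Use the same prefixes as the hand-written request in step 3.
    ET.register_namespace("soapenv", SOAPENV)
    ET.register_namespace("ns", SCONFIG)

    envelope = ET.Element(f"{{{SOAPENV}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAPENV}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAPENV}}}Body")
    put = ET.SubElement(body, f"{{{SCONFIG}}}putCollections")
    ET.SubElement(put, f"{{{SCONFIG}}}language").text = language

    for spec in collections:
        coll = ET.SubElement(put, f"{{{SCONFIG}}}collection", {
            "collectionKey": spec["collectionKey"],
            "displayName": spec["displayName"],
            "uniquePropertyKey": spec["uniquePropertyKey"],
        })
        ET.SubElement(coll, f"{{{SCONFIG}}}description").text = spec["description"]
        # Each custom property becomes <ns:property key="...">value</ns:property>.
        for key, value in spec.get("properties", {}).items():
            ET.SubElement(coll, f"{{{SCONFIG}}}property", {"key": key}).text = value

    return ET.tostring(envelope, encoding="unicode")


# The two collections from step 3.
envelope_xml = build_put_collections([
    {"collectionKey": "Products", "displayName": "Product data",
     "uniquePropertyKey": "ProductID",
     "description": "product records for the region",
     "properties": {"Region": "New England"}},
    {"collectionKey": "Sales", "displayName": "Sales data",
     "uniquePropertyKey": "SalesID",
     "description": "sales information",
     "properties": {"Currency": "$"}},
])
print(envelope_xml)
```

Building the request from data rather than by hand keeps the collectionKey and uniquePropertyKey values in one place, which helps when the same definitions are reused to check the listCollections output.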

When the ingest operation has finished, a listCollections operation returns a list like the following (abbreviated for ease of reading):
<listCollectionsResponse xmlns="http://www.endeca.com/endeca-server/sconfig/3/0">
   <collectionRecord collectionKey="Products" displayName="Product data" uniquePropertyKey="ProductID">
      <description>product records for the region</description>
      <property key="Region">New England</property>
      <collectionAttributes>
         <collectionAttribute propertyKey="Color"/>
         <collectionAttribute propertyKey="DealerPrice"/>
         ...
         <collectionAttribute propertyKey="Style"/>
         <collectionAttribute propertyKey="Weight"/>
      </collectionAttributes>
   </collectionRecord>
   <collectionRecord collectionKey="Sales" displayName="Sales data" uniquePropertyKey="SalesID">
      <description>sales information</description>
      <property key="Currency">$</property>
      <collectionAttributes>
         <collectionAttribute propertyKey="FactSales_CurrencyKey"/>
         <collectionAttribute propertyKey="FactSales_CustomerPONumber"/>
         ...
         <collectionAttribute propertyKey="FactSales_TotalProductCost"/>
         <collectionAttribute propertyKey="FactSales_UnitPrice"/>
      </collectionAttributes>
   </collectionRecord>
</listCollectionsResponse>

The listCollectionsResponse shows the two collections, Products and Sales, that were created in step 3. It also shows that the Dgraph has populated each Collection Definition Record (CDR) with the attributes found on the records that have an assignment for that collection's unique property key.
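
To check a response like this one without reading it by eye, you can parse it. The following is a minimal Python sketch, assuming the response body has the shape shown above. The function name summarize_collections is hypothetical, and the SAMPLE text is a trimmed copy of the listing above rather than real server output.

```python
import xml.etree.ElementTree as ET

SCONFIG = "http://www.endeca.com/endeca-server/sconfig/3/0"


def summarize_collections(response_xml):
    """Map each collectionKey to its unique property key and attribute keys."""
    root = ET.fromstring(response_xml)
    summary = {}
    # iter() also finds the records when the response is wrapped in a SOAP envelope.
    for record in root.iter(f"{{{SCONFIG}}}collectionRecord"):
        attributes = [
            attr.get("propertyKey")
            for attr in record.iter(f"{{{SCONFIG}}}collectionAttribute")
        ]
        summary[record.get("collectionKey")] = {
            "uniquePropertyKey": record.get("uniquePropertyKey"),
            "attributes": attributes,
        }
    return summary


# Trimmed copy of the response shown above (illustrative only).
SAMPLE = """<listCollectionsResponse xmlns="http://www.endeca.com/endeca-server/sconfig/3/0">
   <collectionRecord collectionKey="Products" displayName="Product data" uniquePropertyKey="ProductID">
      <collectionAttributes>
         <collectionAttribute propertyKey="Color"/>
         <collectionAttribute propertyKey="DealerPrice"/>
      </collectionAttributes>
   </collectionRecord>
   <collectionRecord collectionKey="Sales" displayName="Sales data" uniquePropertyKey="SalesID">
      <collectionAttributes>
         <collectionAttribute propertyKey="FactSales_UnitPrice"/>
      </collectionAttributes>
   </collectionRecord>
</listCollectionsResponse>"""

summary = summarize_collections(SAMPLE)
for key, info in summary.items():
    print(key, info["uniquePropertyKey"], len(info["attributes"]))
```

A collection whose attribute list comes back empty would suggest that no ingested records carried an assignment for its unique property key.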