You can purge older orders from the index using sharding, which horizontally partitions orders into separate logical partitions. Each of these partitions is represented by a single shard, which defines an indexed date range of orders by order creation date.
For example, if you have more than a full year of orders, you could divide the orders into five separate shards that each contain 90 days of orders. The five shards are assigned date ranges based on the date that the index was created. The date range for each shard is fixed, which keeps each order in a known logical partition and allows orders to be updated quickly.
Continuing with the example, you would have the following shards and logical partitions configured:
Shard/Logical Partition | Date Ending | Date Starting |
---|---|---|
LP5 | 4/5/2011 | 1/5/2011 |
LP4 | 1/5/2011 | 10/7/2010 |
LP3 | 10/7/2010 | 7/9/2010 |
LP2 | 7/9/2010 | 4/10/2010 |
LP1 | 4/10/2010 | 1/10/2010 |
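The date ranges in the table above are consecutive 90-day windows, which can be reproduced with a short calculation. The following Python sketch is illustrative only and is not part of the product; the function name `shard_ranges` and the assumption that each shard begins on the day the previous one ends are taken from the table, not from any ATG API.

```python
from datetime import date, timedelta


def shard_ranges(first_start, days_per_shard, shard_count):
    """Return (name, start, end) tuples for consecutive fixed-length shards, oldest first."""
    ranges = []
    start = first_start
    for i in range(1, shard_count + 1):
        end = start + timedelta(days=days_per_shard)
        ranges.append((f"LP{i}", start, end))
        start = end  # assumption: the next shard begins where this one ends
    return ranges


def fmt(d):
    """Format a date as month/day/year without zero padding, as in the table."""
    return f"{d.month}/{d.day}/{d.year}"


ranges = shard_ranges(date(2010, 1, 10), 90, 5)
for name, start, end in reversed(ranges):  # newest shard first, as in the table
    print(f"{name} | {fmt(end)} | {fmt(start)} |")
```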
Every 90 days, the oldest logical partition will be deleted, and a new logical partition will be created to index new items:
Shard/Logical Partition | Date Ending | Date Starting |
---|---|---|
LP6 | 7/6/2011 | 4/5/2011 |
LP5 | 4/5/2011 | 1/5/2011 |
LP4 | 1/5/2011 | 10/7/2010 |
LP3 | 10/7/2010 | 7/9/2010 |
LP2 | 7/9/2010 | 4/10/2010 |
LP1 (deleted) | 4/10/2010 | 1/10/2010 |
Note that a sixth shard is required to index future orders, as all existing shards already hold data.
The Sharding Process
By default, sharding is disabled, but can be enabled using the shardingEnabled property in /atg/commerce/search/OrderOutputConfig. Note that shards are created based on the creation date of the order, not the completion date of the order. The /atg/commerce/search/OrderSharder component implements the sharding of orders, and the /atg/commerce/search/OrderShardRotationService component's daysBeforeExpiration property defines how many days prior to the shard's expiration date the shard should be rotated.
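As a rough illustration of how a "days before expiration" setting translates into a rotation date, the Python sketch below subtracts the configured number of days from a shard's end date. This is one plausible reading of the property description, not the actual logic of the rotation service; the function names and the value of 10 days are hypothetical.

```python
from datetime import date, timedelta


def rotation_date(expiration, days_before_expiration):
    """Date on which rotation would become due under this reading of the setting."""
    return expiration - timedelta(days=days_before_expiration)


def should_rotate(today, expiration, days_before_expiration):
    """True once today has reached the computed rotation date."""
    return today >= rotation_date(expiration, days_before_expiration)


# Hypothetical values: a shard that ends on 4/5/2011, rotated 10 days early.
due = rotation_date(date(2011, 4, 5), 10)
print(due.isoformat())
print(should_rotate(date(2011, 3, 20), date(2011, 4, 5), 10))
```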
The /atg/search/routing/repository/SearchConfigurationRepository contains the following shard items:
shardConfig – Defines the shard configuration. The bulk and live search environments each have their own shardConfig.
shard – Defines the date range for the shard and the name of the shard's logical partition.
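To make the role of a shard item concrete, the Python sketch below models a shard as a date range plus a logical partition name and looks up the shard for an order by its creation date. The class and field names are hypothetical and do not reflect the repository's actual property names. Treating the end date as exclusive is also an assumption, made because adjacent shards in the earlier tables share a boundary date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Shard:
    logical_partition: str  # hypothetical field name
    start: date
    end: date

    def contains(self, created):
        # Assumption: start is inclusive and end is exclusive.
        return self.start <= created < self.end


def shard_for(created, shards):
    """Return the shard whose date range covers the order creation date, or None."""
    for shard in shards:
        if shard.contains(created):
            return shard
    return None


shards = [
    Shard("LP1", date(2010, 1, 10), date(2010, 4, 10)),
    Shard("LP2", date(2010, 4, 10), date(2010, 7, 9)),
]

match = shard_for(date(2010, 4, 10), shards)
print(match.logical_partition if match else "no shard covers this date")
```

Because each range is fixed, the lookup depends only on the order's creation date, which is what lets an existing order be found in a known partition when it is updated.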
After a bulkLoad method has been run using /atg/commerce/search/OrderOutputConfig, the shard information appears when you select the ATGOrder search environment.
Configuring Shards for Production Systems
Because each shard represents a logical partition, you must set the AEConfig.xml settings to accommodate the period of greatest activity. The AEConfig.xml file configures the search engine's MemoryReserveSize setting, which defines how much memory is reserved for an index. This memory may not be allocated until the index grows.
For example, if each shard contains three months of orders, but one shard holds holiday orders, that shard may hold twice as many orders as each of the other shards. You must set the MemoryReserveSize to support the maximum number of orders within any shard date range. In this example, you would set the MemoryReserveSize to hold the largest number of orders that might occur in a single shard during the year, which is the number placed during the holiday season. This ensures that shards created during peak order times have enough memory reserved.
For additional information on setting the MemoryReserveSize, refer to the Adjusting Physical Partition Size section in the ATG Search Installation and Configuration Guide.
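One way to estimate the peak that the memory reserve must cover is to total historical daily order counts per shard window and take the largest total. The Python sketch below does only that. It assumes the first entry in the list is the first day of the first shard, uses made-up order volumes, and does not attempt to convert an order count into a MemoryReserveSize value.

```python
def peak_orders_per_shard(daily_orders, days_per_shard):
    """Largest number of orders that falls into any one fixed-length shard window."""
    totals = [
        sum(daily_orders[i:i + days_per_shard])
        for i in range(0, len(daily_orders), days_per_shard)
    ]
    return max(totals, default=0)


# Made-up volumes: three ordinary 90-day periods, then a holiday period at double the rate.
daily_orders = [100] * 270 + [200] * 90
peak = peak_orders_per_shard(daily_orders, 90)
print(peak)
```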
The amount of memory needed is equal to the amount of memory required to index the content, plus the MemoryThreshold and heap size per process. Allocate a minimum of two cores per engine (that is, per physical partition). Use the following formula to determine the number of required cores:
2 x P_partitions x ((N_days / M_range) + 1)
P_partitions – The number of physical partitions per shard. If all items in a shard fit in a single physical partition, then P_partitions = 1
N_days – The total number of days to index
M_range – The number of days per shard
For example, for 360 days of orders, you might divide them into four shards of 90 days each. Including the additional shard for future orders, and assuming that all items in a shard fit into a single physical partition, the formula would look like this:
2 x 1 x ((360/90) + 1) = 10 cores
Note that this example accounts only for the order live indexing processes. It does not count other processes, such as the Java application server, profile live indexing, or other indexes such as catalog search and solutions.
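The core formula above can be expressed as a small helper for trying out different shard sizes. This Python sketch is illustrative only: the function and parameter names are hypothetical, and rounding the shard count up when the total days do not divide evenly is an assumption that the original formula does not state.

```python
import math


def required_cores(physical_partitions_per_shard, total_days, days_per_shard,
                   cores_per_engine=2):
    """Cores for order live indexing: 2 x P_partitions x ((N_days / M_range) + 1)."""
    # Assumption: a partial shard still needs a full engine, so round up.
    shard_count = math.ceil(total_days / days_per_shard) + 1  # +1 for future orders
    return cores_per_engine * physical_partitions_per_shard * shard_count


cores = required_cores(physical_partitions_per_shard=1, total_days=360, days_per_shard=90)
print(cores)
```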