This topic explains how to delete stale generations of records by setting the
generationRetentionTime property appropriately.
As a general rule, the value of generationRetentionTime should be greater than
the sum of the following:
The time between the start of two write operations to a Record Store instance.
The time between the start of two delta read operations from a Record Store instance.
A margin of safety (for example, time to revert to an earlier generation, fix any issues in the data, and re-crawl the data).
For example, suppose a crawl that writes to a Record Store takes a few hours to
run and runs once a day, so the time between the start of two write operations
is 24 hours. Next, suppose you run Forge once a day, so the time between reads
of the Record Store is also 24 hours. Finally, suppose you want to be able to
revert to data up to three days old, which requires a margin of safety of 72
hours. The sum is 24 + 24 + 72 = 120 hours, so the value of
generationRetentionTime should be at least 120. In this scenario, a value of 120
ensures there are two generations in a Record Store instance.
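The arithmetic in this example can be sketched as a small Python helper. This is only an illustration of the sizing rule described above, not part of the product: the function name minimum_retention_hours and its parameter names are hypothetical, and all values are assumed to be expressed in hours.

```python
def minimum_retention_hours(write_interval_hours, read_interval_hours,
                            safety_margin_hours):
    """Return the sum that generationRetentionTime should meet or exceed.

    Hypothetical helper: the three inputs correspond to the time between
    writes, the time between delta reads, and the margin of safety.
    """
    inputs = {
        "write_interval_hours": write_interval_hours,
        "read_interval_hours": read_interval_hours,
        "safety_margin_hours": safety_margin_hours,
    }
    for name, value in inputs.items():
        if value < 0:
            raise ValueError(f"{name} must not be negative, got {value}")
    return write_interval_hours + read_interval_hours + safety_margin_hours


# Daily crawl, daily Forge run, three-day safety margin.
print(minimum_retention_hours(24, 24, 72))  # prints 120
```

Treat the result as a lower bound. If crawl or read schedules vary, it is probably safer to use the longest interval you expect rather than the typical one.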
Note
The Record Store applies a read-lock to the generation being read. If the age
of a read-locked generation exceeds the generationRetentionTime value, the
generation is not deleted until the read is complete and the read-lock is
released.
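The rule in this note can be modeled as a simple predicate. The following Python sketch is a simplified illustration under stated assumptions, not the Record Store's actual implementation: the Generation class, its fields, and the can_delete function are hypothetical names, and ages are assumed to be in the same unit as the retention value.

```python
from dataclasses import dataclass


@dataclass
class Generation:
    """Hypothetical model of a Record Store generation."""
    age_hours: float
    read_locked: bool = False


def can_delete(generation, retention_hours):
    """Return True if the generation is stale and no read holds a lock on it."""
    is_stale = generation.age_hours > retention_hours
    return is_stale and not generation.read_locked


# A stale generation that is still being read is kept until the lock is released.
print(can_delete(Generation(age_hours=130, read_locked=True), 120))  # prints False
print(can_delete(Generation(age_hours=130), 120))                    # prints True
```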