Description: Clustering works by examining the term distribution across a sample of the result set. This parameter governs how many records are sampled from the navigation state. Clustering processing time and memory consumption both scale roughly linearly with this number, so lowering the value reduces memory consumption and shortens turnaround. However, a small sample is more prone to statistical error, because it may not represent the term distribution of the full result set. Raising the value reduces such errors, particularly for data sets where few terms are tagged onto each record.
Range: Integer, 50-2000 (default: 500)
Recommended value: 500
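
The trade-off described above can be illustrated with a minimal sketch. It is not the product's implementation: the function names `resolve_sample_size` and `sample_term_frequencies` are hypothetical, and the assumption that each record is a list of term strings is made only for illustration. The first function applies the documented range and default; the second estimates how often each term occurs from a random sample, which is the kind of estimate that becomes noisier as the sample shrinks.

```python
import random
from collections import Counter

# Documented bounds and default for the sample-size parameter.
MIN_SAMPLE, MAX_SAMPLE, DEFAULT_SAMPLE = 50, 2000, 500


def resolve_sample_size(requested=None):
    """Return a sample size inside the documented range (hypothetical helper)."""
    if requested is None:
        return DEFAULT_SAMPLE
    if isinstance(requested, bool) or not isinstance(requested, int):
        raise TypeError("sample size must be an integer")
    if not MIN_SAMPLE <= requested <= MAX_SAMPLE:
        raise ValueError(
            f"sample size must be between {MIN_SAMPLE} and {MAX_SAMPLE}"
        )
    return requested


def sample_term_frequencies(records, sample_size, seed=0):
    """Estimate the fraction of records carrying each term from a random sample.

    Assumes `records` is a list in which each record is a list of term strings.
    Work and memory grow with the number of sampled records.
    """
    rng = random.Random(seed)
    k = min(sample_size, len(records))  # cannot sample more than exists
    sample = rng.sample(records, k)
    # Count each term at most once per record.
    counts = Counter(term for record in sample for term in set(record))
    return {term: n / k for term, n in counts.items()}
```

Calling `sample_term_frequencies(records, resolve_sample_size(50))` and then again with `resolve_sample_size(2000)` on the same sparsely tagged data gives a rough sense of how much the estimates move with sample size.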