public class TextProfileSignature extends Signature
An implementation of a page signature. It calculates an MD5 hash
of a plain text "profile" of a page. In case there is no text, it
calculates a hash using the MD5Signature.
The algorithm to calculate a page "profile" takes the plain text version of a page and performs a series of steps, including quantizing the token frequencies: counts are rounded down to the nearest multiple of QUANT (QUANT = QUANT_RATE * maxFreq, where QUANT_RATE is 0.01f by default, and maxFreq is the maximum token frequency). If maxFreq is higher than 1, then QUANT is always at least 2, which means that tokens with frequency 1 are always discarded.

| Constructor and Description |
|---|
| TextProfileSignature() |
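
The quantization rule described above can be illustrated with a minimal sketch. This is not the class's actual source: the names `QuantSketch`, `quant`, and `quantize` are hypothetical, and the use of `Math.round` and the fallback to 2 (or 1 when maxFreq is 1 or less) are assumptions chosen to match the documented behavior that tokens with frequency 1 are discarded whenever maxFreq is higher than 1.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class QuantSketch {
    // Default rate taken from the class description.
    static final float QUANT_RATE = 0.01f;

    /**
     * QUANT = QUANT_RATE * maxFreq, rounded to an int (assumed rounding).
     * Assumption: values below 2 are raised to 2 when maxFreq > 1, else set to 1.
     */
    static int quant(int maxFreq, float quantRate) {
        int q = Math.round(maxFreq * quantRate);
        if (q < 2) {
            q = (maxFreq > 1) ? 2 : 1;
        }
        return q;
    }

    /**
     * Rounds each count down to a multiple of QUANT and drops every token
     * whose rounded count falls below QUANT. Insertion order is preserved.
     */
    static Map<String, Integer> quantize(Map<String, Integer> counts, float quantRate) {
        int maxFreq = 0;
        for (int c : counts.values()) {
            maxFreq = Math.max(maxFreq, c);
        }
        int q = quant(maxFreq, quantRate);
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            int rounded = (e.getValue() / q) * q; // integer division rounds down
            if (rounded >= q) {
                out.put(e.getKey(), rounded);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("nutch", 7);
        counts.put("crawler", 3);
        counts.put("signature", 1);
        // maxFreq = 7, so QUANT = 2; the token with frequency 1 is dropped.
        System.out.println(quantize(counts, QUANT_RATE)); // prints {nutch=6, crawler=2}
    }
}
```

In this sketch, small frequency differences between two near-duplicate pages (7 versus 6 occurrences of a token, for example) collapse to the same quantized value, which is why two such pages can end up with the same profile.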
Copyright © 2007, 2014, Oracle and/or its affiliates. All rights reserved.