Oracle® Globalization Development Kit Java API Reference 10g Release 1 (10.1) B10971-01
java.lang.Object
  |
  +--oracle.i18n.lcsd.LCSDetector
The LCSDetector class contains methods to automatically detect the language, the encoding, or both, of text input.
To use LCSDetector, create an instance of the class. Use the LCSDetector(String name) constructor to specify a profile, or the no-argument LCSDetector() constructor to use the standard profile. Depending on the content of the text you plan to sample, certain profiles may yield more accurate results. For example, if you are sampling medical journals, you may want to use a profile that is built mainly from medical journals. If you are sampling computer-related white papers, a profile built from similar documents will help improve the accuracy of the detection. Currently, only one standard profile is provided, intended for general-purpose detection.
Begin the detection process by calling the detect(byte[]) method. Statistics are accumulated every time a detect method is called. When you are ready for the result, call getResult() to retrieve an LCSDResultSet instance. To begin a new detection using the same LCSDetector instance, call reset() to clear the accumulated statistics.
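The accumulate, query, and reset cycle described above can be sketched with a small stand-in class. ToyDetector below is hypothetical and is not the GDK implementation: it only counts ASCII and non-ASCII bytes, which is enough to show how repeated detect(byte[]) calls build up statistics until reset() clears them. Its getResult() returns a plain String as an assumption for illustration, whereas the real method returns an LCSDResultSet.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration only; NOT oracle.i18n.lcsd.LCSDetector.
class ToyDetector {
    private long asciiBytes;     // bytes in the range 0x00-0x7F seen so far
    private long nonAsciiBytes;  // bytes in the range 0x80-0xFF seen so far

    // Accumulates statistics; may be called many times before getResult().
    void detect(byte[] input) {
        for (byte b : input) {
            if (b >= 0) {        // Java bytes are signed, so 0x80-0xFF are negative
                asciiBytes++;
            } else {
                nonAsciiBytes++;
            }
        }
    }

    // Summarizes what has been accumulated since the last reset().
    String getResult() {
        if (asciiBytes + nonAsciiBytes == 0) {
            return "no data";
        }
        return nonAsciiBytes == 0 ? "pure ASCII" : "contains non-ASCII bytes";
    }

    // Clears the accumulated statistics so the instance can be reused.
    void reset() {
        asciiBytes = 0;
        nonAsciiBytes = 0;
    }
}

public class LifecycleSketch {
    public static void main(String[] args) {
        ToyDetector detector = new ToyDetector();

        // Two calls contribute to the same set of statistics.
        detector.detect("hello ".getBytes(StandardCharsets.UTF_8));
        detector.detect("w\u00f6rld".getBytes(StandardCharsets.UTF_8));
        System.out.println(detector.getResult());

        // After reset(), earlier input no longer influences the result.
        detector.reset();
        System.out.println(detector.getResult());
    }
}
```

The point of the sketch is the call order: any number of detect calls, then getResult(), then reset() before the instance is reused for unrelated text.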
See Also: LCSDResultSet
Constructor Summary
LCSDetector() | Constructor.
LCSDetector(String name) | Constructor that takes a profile name, allowing the user to choose a profile other than the default.
Method Summary
void | detect(byte[] input) | Statistical data is accumulated in an internal structure when the detect() methods are called.
int | detect(byte[] input, int offset, int length) | Statistical data is accumulated in an internal structure when the detect() methods are called.
void | detect(char[] input) | Statistical data is accumulated in an internal structure when the detect() methods are called.
int | detect(char[] input, int offset, int length) | Statistical data is accumulated in an internal structure when the detect() methods are called.
void | detect(InputStream input) | Statistical data is accumulated in an internal structure when the detect() methods are called.
int | detect(InputStream input, int length) | Statistical data is accumulated in an internal structure when the detect() methods are called.
void | detect(String input) | Statistical data is accumulated in an internal structure when the detect() methods are called.
int | detect(String input, int length) | Statistical data is accumulated in an internal structure when the detect() methods are called.
oracle.i18n.lcsd.LCSDResultSet | getResult() | Determines the language/character set pairs with the highest hit counts from the accumulated statistical data.
void | reset() | Resets the statistical data for all pairs to 0.
int | setCharacterSetFilter(String charset) | Sets a character set filter if the user knows the character set of the input data.
int | setLanguageFilter(String language) | Sets a language filter if the user knows the language of the input data.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
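One way to picture the setCharacterSetFilter and setLanguageFilter methods summarized above is as a restriction on which language/character set pairs remain candidates. The sketch below is an assumption about the general idea, not the GDK implementation. Candidate, FilterSketch, and applyFilters are hypothetical names, the language and character set strings are illustrative values only, and a null filter stands for "not set".

```java
import java.util.ArrayList;
import java.util.List;

public class FilterSketch {
    // Hypothetical language/character set pair with an accumulated hit count.
    static final class Candidate {
        final String language;
        final String charset;
        final int hits;

        Candidate(String language, String charset, int hits) {
            this.language = language;
            this.charset = charset;
            this.hits = hits;
        }
    }

    // Keeps only the candidates that match both filters.
    // A null filter means "not set" and matches every candidate.
    static List<Candidate> applyFilters(List<Candidate> all, String language, String charset) {
        List<Candidate> kept = new ArrayList<>();
        for (Candidate c : all) {
            boolean languageOk = language == null || c.language.equalsIgnoreCase(language);
            boolean charsetOk = charset == null || c.charset.equalsIgnoreCase(charset);
            if (languageOk && charsetOk) {
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative values only; the names the real library accepts may differ.
        List<Candidate> all = new ArrayList<>();
        all.add(new Candidate("JAPANESE", "Shift_JIS", 40));
        all.add(new Candidate("JAPANESE", "EUC-JP", 35));
        all.add(new Candidate("KOREAN", "EUC-KR", 12));

        // Knowing the character set narrows the candidates to matching pairs.
        for (Candidate c : applyFilters(all, null, "EUC-JP")) {
            System.out.println(c.language + "/" + c.charset + " hits=" + c.hits);
        }
    }
}
```

A filter of this kind is useful when one half of the pair is already known, for example when the character set is declared by a protocol header and only the language remains to be detected.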
Constructor Detail

public LCSDetector()
    Constructor.

public LCSDetector(String name)
    Constructor that takes a profile name, allowing the user to choose a profile other than the default.
    Parameters:
        name - name of the profile to use

Method Detail

public int setCharacterSetFilter(String charset)
    Sets a character set filter if the user knows the character set of the input data.
    Parameters:
        charset - ISO character set name

public int setLanguageFilter(String language)
    Sets a language filter if the user knows the language of the input data.

public void detect(byte[] input)
    Statistical data is accumulated in an internal structure when the detect() methods are called. Use reset() to clear the accumulated statistics.
    Parameters:
        input - the bytes to be sampled by detect

public int detect(byte[] input, int offset, int length)
    Statistical data is accumulated in an internal structure when the detect() methods are called. Use reset() to clear the accumulated statistics. Only the specified length of bytes will be sampled.
    Parameters:
        input - the bytes to be sampled by detect
        offset - the index of the first byte to sample
        length - the number of bytes to sample

public void detect(InputStream input) throws IOException
    Statistical data is accumulated in an internal structure when the detect() methods are called. Use reset() to clear the accumulated statistics. The entire stream will be sampled by detect.
    Parameters:
        input - InputStream to be sampled by detect
    Throws:
        IOException - if an error occurs while operating on the stream

public int detect(InputStream input, int length) throws IOException
    Statistical data is accumulated in an internal structure when the detect() methods are called. Use reset() to clear the accumulated statistics. Only the specified length of bytes will be sampled.
    Parameters:
        input - InputStream to be sampled by detect
        length - the number of bytes to sample
    Throws:
        IOException - if an error occurs while operating on the stream

public void detect(String input)
    Statistical data is accumulated in an internal structure when the detect() methods are called. Use reset() to clear the accumulated statistics. The entire String will be sampled by detect.

public int detect(String input, int length)
    Statistical data is accumulated in an internal structure when the detect() methods are called. Use reset() to clear the accumulated statistics. Only the specified length of chars will be sampled.
    Parameters:
        input - a string to be sampled by detect
        length - the number of chars to sample

public void detect(char[] input)
    Statistical data is accumulated in an internal structure when the detect() methods are called. Use reset() to clear the accumulated statistics. The entire array will be sampled by detect.
    Parameters:
        input - the chars to be sampled by detect

public int detect(char[] input, int offset, int length)
    Statistical data is accumulated in an internal structure when the detect() methods are called. Use reset() to clear the accumulated statistics. Only the specified length of chars will be sampled.
    Parameters:
        input - the char array to be sampled by detect
        offset - the index of the first char to sample
        length - the number of chars to sample

public oracle.i18n.lcsd.LCSDResultSet getResult()
    Determines the language/character set pairs with the highest hit counts from the accumulated statistical data.
    Returns:
        LCSDResultSet object which contains the result

public void reset()
    Resets the statistical data for all pairs to 0.
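
The detect(InputStream input, int length) overload samples only the specified number of bytes from a stream. The same effect can be approximated on the caller's side with plain JDK calls, as in the sketch below. The helper readAtMost and the class StreamSampleSketch are hypothetical names introduced for illustration. Whether the library reads the stream this way internally is not stated in this reference.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamSampleSketch {
    // Reads up to 'length' bytes from the stream, or fewer if the stream ends first.
    // The returned array could then be handed to a byte[]-based detect call.
    static byte[] readAtMost(InputStream in, int length) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int remaining = length;
        while (remaining > 0) {
            // Never request more than what is still needed for the sample.
            int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n < 0) {
                break; // end of stream reached before 'length' bytes were read
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10000];

        // Sample only the first 4000 bytes of a longer stream.
        byte[] sample = readAtMost(new ByteArrayInputStream(data), 4000);
        System.out.println("sampled bytes: " + sample.length);

        // Asking for more than the stream holds yields everything available.
        byte[] all = readAtMost(new ByteArrayInputStream(data), 20000);
        System.out.println("sampled bytes: " + all.length);
    }
}
```

Limiting the sample keeps detection cost bounded for large inputs, at the price of basing the result on the beginning of the stream only.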