public class LCSDetectionHTMLInputStream extends LCSDetectionInputStream
The LCSDetectionHTMLInputStream class extends the LCSDetectionInputStream class to support language/encoding detection for input in HTML format.
The detection sampling length specifies how many bytes of plain text the detection feature examines. The default sampling length is 1K. LCSD normally handles language/encoding detection without tuning, so you do not need to set this value; change it only when you want to control how much text is sampled.
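To make the idea of a plain-text sampling length concrete, here is a minimal sketch that collects at most len bytes lying outside HTML markup. It is not the LCSD implementation: the class HtmlSampleSketch, its methods, and the constant ASSUMED_DEFAULT_SAMPLE are hypothetical names, the 1K default is assumed to mean 1024 bytes, and the tag skipping is deliberately naive (it does not handle comments, scripts, or attribute values that contain angle brackets).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of LCSD.
final class HtmlSampleSketch {
    // Assumption: the documented "1K" default is taken to mean 1024 bytes.
    static final int ASSUMED_DEFAULT_SAMPLE = 1024;

    /** Reads from in until len bytes outside <...> markup are collected or the stream ends. */
    static byte[] samplePlainText(InputStream in, int len) throws IOException {
        ByteArrayOutputStream sample = new ByteArrayOutputStream();
        boolean insideTag = false;
        int b;
        while (sample.size() < len && (b = in.read()) != -1) {
            if (insideTag) {
                if (b == '>') {
                    insideTag = false;   // tag closed, plain text resumes
                }
            } else if (b == '<') {
                insideTag = true;        // tag opened, skip its bytes
            } else {
                sample.write(b);         // plain-text byte counts toward the sample
            }
        }
        return sample.toByteArray();
    }

    /** Convenience wrapper for strings; the cut is by bytes, so it may split a multi-byte character. */
    static String sampleOf(String html, int len) {
        byte[] bytes = html.getBytes(StandardCharsets.UTF_8);
        try {
            byte[] sample = samplePlainText(new ByteArrayInputStream(bytes), len);
            return new String(sample, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A smaller sampling length means fewer bytes are inspected, which is faster but gives a detector less evidence to work with; a larger one has the opposite trade-off.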
You can get the detection results from the LCSDResultSet class if needed.
Any read method throws UTFDataFormatException if the source is UTF-8 data and an invalid UTF-8 sequence is found.
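The following sketch shows one way strict UTF-8 checking can surface as a UTFDataFormatException, using only the standard java.nio.charset decoder. It is an illustration of the failure mode, not the LCSD code path; StrictUtf8Sketch, decodeStrict, and isValidUtf8 are hypothetical names.

```java
import java.io.UTFDataFormatException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of LCSD.
final class StrictUtf8Sketch {
    /** Decodes data as UTF-8 and rejects malformed sequences instead of replacing them. */
    static String decodeStrict(byte[] data) throws UTFDataFormatException {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data))
                    .toString();
        } catch (CharacterCodingException e) {
            // Translate the NIO decoding failure into the java.io exception type.
            throw new UTFDataFormatException("invalid UTF-8 sequence: " + e.getMessage());
        }
    }

    /** Returns true if data is well-formed UTF-8. */
    static boolean isValidUtf8(byte[] data) {
        try {
            decodeStrict(data);
            return true;
        } catch (UTFDataFormatException e) {
            return false;
        }
    }
}
```

Because UTFDataFormatException is a subclass of IOException, callers that want to distinguish malformed data from other I/O failures should catch it before a general IOException handler.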
DEFAULT_SAMPLING_SIZE
in
Constructor and Description |
---|
LCSDetectionHTMLInputStream(InputStream in): Creates an LCSDetectionHTMLInputStream object. |
LCSDetectionHTMLInputStream(InputStream in, int len): Creates an LCSDetectionHTMLInputStream object with the specified sampling length. |
LCSDetectionHTMLInputStream(String name, InputStream in): Creates an LCSDetectionHTMLInputStream object with the specified profile for detection. |
LCSDetectionHTMLInputStream(String name, InputStream in, int len): Creates an LCSDetectionHTMLInputStream object with the specified sampling length and the specified profile for detection. |
getResult, read, read, read
available, close, mark, markSupported, reset, skip
public LCSDetectionHTMLInputStream(InputStream in) throws IOException, UTFDataFormatException
Creates an LCSDetectionHTMLInputStream object. Uses the default sampling length and the default profile for detection.
in - the input stream on which to perform detection
IOException - if any I/O error occurs
UTFDataFormatException - if an invalid UTF-8 sequence is detected. This occurs only if the source is UTF-8 data.
public LCSDetectionHTMLInputStream(String name, InputStream in) throws IOException, UTFDataFormatException
Creates an LCSDetectionHTMLInputStream object with the specified profile for detection. Uses the default sampling length.
name - the profile name
in - the input stream on which to perform detection
IOException - if any I/O error occurs
UTFDataFormatException - if an invalid UTF-8 sequence is detected. This occurs only if the source is UTF-8 data.
public LCSDetectionHTMLInputStream(InputStream in, int len) throws IOException, UTFDataFormatException
Creates an LCSDetectionHTMLInputStream object with the specified sampling length. Uses the default profile for detection.
in - the input stream on which to perform detection
len - the sampling length
IOException - if any I/O error occurs
UTFDataFormatException - if an invalid UTF-8 sequence is detected. This occurs only if the source is UTF-8 data.
public LCSDetectionHTMLInputStream(String name, InputStream in, int len) throws IOException, UTFDataFormatException
Creates an LCSDetectionHTMLInputStream object with the specified sampling length and the specified profile for detection.
name - the profile name
in - the input stream on which to perform detection
len - the sampling length
IOException - if any I/O error occurs
UTFDataFormatException - if an invalid UTF-8 sequence is detected. This occurs only if the source is UTF-8 data.
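The four constructors differ only in whether a profile name and a sampling length are supplied. The sketch below mirrors that shape with a stand-in class so the relationship between the overloads is easy to see. DetectionStreamSketch is hypothetical and performs no detection; the value 1024 is an assumed reading of the 1K default, and the "default" profile string is a placeholder, since the real default profile name is not given here. Unlike the documented constructors, the stand-in constructors declare no checked exceptions.

```java
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical stand-in that only mirrors the four documented constructor shapes.
class DetectionStreamSketch extends FilterInputStream {
    static final int DEFAULT_SAMPLING_SIZE = 1024;    // assumption: "1K" read as 1024 bytes
    static final String DEFAULT_PROFILE = "default";  // placeholder, not a documented value

    final String profile;
    final int samplingLength;

    // Default profile, default sampling length.
    DetectionStreamSketch(InputStream in) {
        this(DEFAULT_PROFILE, in, DEFAULT_SAMPLING_SIZE);
    }

    // Default profile, caller-chosen sampling length.
    DetectionStreamSketch(InputStream in, int len) {
        this(DEFAULT_PROFILE, in, len);
    }

    // Caller-chosen profile, default sampling length.
    DetectionStreamSketch(String name, InputStream in) {
        this(name, in, DEFAULT_SAMPLING_SIZE);
    }

    // Caller-chosen profile and sampling length; the other overloads delegate here.
    DetectionStreamSketch(String name, InputStream in, int len) {
        super(in);
        this.profile = name;
        this.samplingLength = len;
    }
}
```

With the real class, each constructor call would additionally need to handle IOException and UTFDataFormatException, as listed in the constructor details above.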