Oracle Secure Enterprise Search Java API Reference 11g Release 1 (11.1.2.0.0) E14433-02
public interface CrawlingThreadService
CrawlingThreadService is an interface used by a crawler plug-in to perform crawl-related tasks. Its execution is context-specific to the crawling thread that invokes the plug-in crawl() method.
Field Summary | |
---|---|
static int | DOC_EXCLUDED_BY_MIMETYPE |
static int | DOC_EXCLUDED_BY_SIZE |
static int | DOC_EXCLUDED_BY_URL_BOUNDARY |
static int | DOC_INCLUDED |
Method Summary | |
---|---|
int | checkDocumentExcluded(DocumentMetadata meta) Checks if the document should be crawled. |
String | inferMimeType(String url) Infers the MIME type based on the URL suffix. |
void | markStatusNotChanged(DocumentMetadata meta) Marks a URL entry as not requiring any changes or updates. |
void | submitForProcessing(DocumentContainer target) Submits the document for processing. |
Field Detail |
---|
static final int DOC_INCLUDED
static final int DOC_EXCLUDED_BY_URL_BOUNDARY
static final int DOC_EXCLUDED_BY_MIMETYPE
static final int DOC_EXCLUDED_BY_SIZE
Method Detail |
---|
void submitForProcessing(DocumentContainer target) throws ProcessingException
Submits the document for processing. [...] DocumentContainer.STATUS_OK_FOR_INDEX. After the processing is done, this document is automatically removed from the queue. The DocumentMetadata in the submitted target is cleared automatically if the operation succeeds.
Parameters:
target - the document container containing the content and metadata.
Throws:
ProcessingException
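As an illustration of how a plug-in might call submitForProcessing and handle its declared exception, here is a minimal sketch. The nested DocumentContainer, ProcessingException, and CrawlingThreadService types are empty stand-ins so the example compiles without the SES SDK; that ProcessingException is a checked exception is an assumption, and submitAll is a hypothetical helper that is not part of the API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SubmitSketch {
    // Stand-in for the SDK's DocumentContainer; left empty on purpose.
    interface DocumentContainer {}

    // Stand-in for the SDK's ProcessingException (assumed here to be checked).
    static class ProcessingException extends Exception {
        ProcessingException(String message) { super(message); }
    }

    // Stand-in for CrawlingThreadService, reduced to the one method used here.
    interface CrawlingThreadService {
        void submitForProcessing(DocumentContainer target) throws ProcessingException;
    }

    // Hypothetical helper: submits every container and returns the ones that
    // failed, so the caller can log or retry them.
    static List<DocumentContainer> submitAll(CrawlingThreadService cts,
                                             List<DocumentContainer> targets) {
        List<DocumentContainer> failed = new ArrayList<>();
        for (DocumentContainer target : targets) {
            try {
                cts.submitForProcessing(target);
            } catch (ProcessingException e) {
                failed.add(target);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        DocumentContainer good = new DocumentContainer() {};
        DocumentContainer bad = new DocumentContainer() {};
        // Fake service that rejects one specific container.
        CrawlingThreadService fake = target -> {
            if (target == bad) {
                throw new ProcessingException("rejected");
            }
        };
        List<DocumentContainer> failed = submitAll(fake, Arrays.asList(good, bad));
        System.out.println("failed submissions: " + failed.size());
    }
}
```

Collecting failures instead of aborting on the first exception is one possible design; a real plug-in may prefer to stop the crawl or rethrow.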
void markStatusNotChanged(DocumentMetadata meta) throws ProcessingException
Marks a URL entry as not requiring any changes or updates.
Parameters:
meta - the metadata object corresponding to the URL entry
Throws:
ProcessingException
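One way a plug-in might choose between markStatusNotChanged and submitForProcessing during a recrawl is sketched below. The timestamp comparison is an assumption about how a plug-in could detect unchanged documents; the reference does not prescribe it. All nested types are stand-ins for the SDK types, and handle and RecordingService are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.List;

class IncrementalCrawlSketch {
    // Empty stand-ins for the SDK types, so the sketch compiles on its own.
    interface DocumentMetadata {}
    interface DocumentContainer {}

    static class ProcessingException extends Exception {
        ProcessingException(String message) { super(message); }
    }

    // Stand-in for CrawlingThreadService, reduced to the two methods used here.
    interface CrawlingThreadService {
        void submitForProcessing(DocumentContainer target) throws ProcessingException;
        void markStatusNotChanged(DocumentMetadata meta) throws ProcessingException;
    }

    // Hypothetical decision: an entry whose source timestamp is not newer than
    // the last crawl is marked unchanged; anything else is submitted.
    static String handle(CrawlingThreadService cts, DocumentMetadata meta,
                         DocumentContainer target, long sourceModified, long lastCrawled) {
        try {
            if (sourceModified <= lastCrawled) {
                cts.markStatusNotChanged(meta);
                return "unchanged";
            }
            cts.submitForProcessing(target);
            return "submitted";
        } catch (ProcessingException e) {
            return "failed";
        }
    }

    // Fake service that only records which method was called.
    static class RecordingService implements CrawlingThreadService {
        final List<String> calls = new ArrayList<>();

        public void submitForProcessing(DocumentContainer target) {
            calls.add("submitForProcessing");
        }

        public void markStatusNotChanged(DocumentMetadata meta) {
            calls.add("markStatusNotChanged");
        }
    }

    public static void main(String[] args) {
        RecordingService service = new RecordingService();
        DocumentMetadata meta = new DocumentMetadata() {};
        DocumentContainer doc = new DocumentContainer() {};
        System.out.println(handle(service, meta, doc, 100L, 200L));
        System.out.println(handle(service, meta, doc, 300L, 200L));
        System.out.println(service.calls);
    }
}
```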
int checkDocumentExcluded(DocumentMetadata meta)
Checks if the document should be crawled. The internal exclusion check always runs when documents are submitted.
Parameters:
meta - the document metadata
Returns:
CrawlingThreadService.DOC_INCLUDED, CrawlingThreadService.DOC_EXCLUDED_BY_URL_BOUNDARY, CrawlingThreadService.DOC_EXCLUDED_BY_MIMETYPE, or CrawlingThreadService.DOC_EXCLUDED_BY_SIZE
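Because the same check runs again at submission time, calling checkDocumentExcluded up front is mainly useful for skipping an expensive content fetch. The sketch below shows how the four result codes might be handled. The nested types are stand-ins for the SDK types, the numeric constant values are placeholders (this reference does not list the real values), and describe and shouldFetch are hypothetical helpers.

```java
class ExclusionCheckSketch {
    // Stand-in for the SDK's DocumentMetadata; left empty on purpose.
    interface DocumentMetadata {}

    // Stand-in for CrawlingThreadService, reduced to the members used here.
    // The numeric values are placeholders, not the SDK's actual values.
    interface CrawlingThreadService {
        int DOC_INCLUDED = 0;
        int DOC_EXCLUDED_BY_URL_BOUNDARY = 1;
        int DOC_EXCLUDED_BY_MIMETYPE = 2;
        int DOC_EXCLUDED_BY_SIZE = 3;

        int checkDocumentExcluded(DocumentMetadata meta);
    }

    // Translates a result code into a log-friendly label.
    static String describe(int code) {
        switch (code) {
            case CrawlingThreadService.DOC_INCLUDED:
                return "included";
            case CrawlingThreadService.DOC_EXCLUDED_BY_URL_BOUNDARY:
                return "excluded by URL boundary";
            case CrawlingThreadService.DOC_EXCLUDED_BY_MIMETYPE:
                return "excluded by MIME type";
            case CrawlingThreadService.DOC_EXCLUDED_BY_SIZE:
                return "excluded by size";
            default:
                return "unknown code " + code;
        }
    }

    // Returns true when the plug-in should go on to fetch the content.
    static boolean shouldFetch(CrawlingThreadService cts, DocumentMetadata meta) {
        return cts.checkDocumentExcluded(meta) == CrawlingThreadService.DOC_INCLUDED;
    }

    public static void main(String[] args) {
        DocumentMetadata small = new DocumentMetadata() {};
        DocumentMetadata huge = new DocumentMetadata() {};
        // Fake service that excludes one specific document by size.
        CrawlingThreadService fake = meta -> meta == huge
                ? CrawlingThreadService.DOC_EXCLUDED_BY_SIZE
                : CrawlingThreadService.DOC_INCLUDED;
        System.out.println(shouldFetch(fake, small));
        System.out.println(describe(fake.checkDocumentExcluded(huge)));
    }
}
```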
String inferMimeType(String url)
Infers the MIME type based on the URL suffix.
Parameters:
url - the document URL
Returns:
[...] null.
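Since the return description mentions null, callers probably need a fallback. The sketch below assumes that inferMimeType returns null when no type can be inferred from the suffix; that reading is an assumption, because the return text here is incomplete. The nested CrawlingThreadService is a stand-in for the SDK interface, and mimeTypeOrDefault and the fallback value are hypothetical.

```java
class MimeFallbackSketch {
    // Stand-in for CrawlingThreadService, reduced to the one method used here.
    interface CrawlingThreadService {
        String inferMimeType(String url);
    }

    // Hypothetical helper: uses the inferred type when there is one and a
    // caller-supplied default otherwise (assumes null means "not inferred").
    static String mimeTypeOrDefault(CrawlingThreadService cts, String url, String fallback) {
        String inferred = cts.inferMimeType(url);
        return inferred != null ? inferred : fallback;
    }

    public static void main(String[] args) {
        // Fake service that recognizes only the .pdf suffix.
        CrawlingThreadService fake = url -> url.endsWith(".pdf") ? "application/pdf" : null;
        System.out.println(mimeTypeOrDefault(fake, "http://example.com/a.pdf", "text/html"));
        System.out.println(mimeTypeOrDefault(fake, "http://example.com/page", "text/html"));
    }
}
```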