Oracle Secure Enterprise Search Java API Reference 10g Release 1 (10.1.8.1) B32515-01
CrawlingThreadService is an interface that a crawler plugin uses to perform crawl-related tasks. Its execution context is specific to the crawling thread that invokes the plugin's crawl() method.
| Field Summary | |
static int DOC_EXCLUDED_BY_MIMETYPE - Document excluded by MIME type.
static int DOC_EXCLUDED_BY_SIZE - Document excluded by document size.
static int DOC_EXCLUDED_BY_URL_BOUNDARY - Document excluded by URL boundary.
static int DOC_INCLUDED - Document should be included.
| Method Summary | |
int checkDocumentExcluded(DocumentMetadata meta) - Checks whether the document should be crawled.
String inferMimeType(String url) - Infers the MIME type from the URL suffix.
void markStatusNotChanged(DocumentMetadata meta) - Marks a URL entry as not requiring any changes or updates.
void submitForProcessing(DocumentContainer target) - Submits the document for processing.
| Field Detail |
public static final int DOC_INCLUDED
public static final int DOC_EXCLUDED_BY_URL_BOUNDARY
public static final int DOC_EXCLUDED_BY_MIMETYPE
public static final int DOC_EXCLUDED_BY_SIZE
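The four status constants lend themselves to a small lookup helper for log messages. The following is only a sketch: the numeric values are placeholders chosen for illustration (this reference does not list the real values), and the class name ExclusionCodeSketch and the method describe are hypothetical. In a real plugin the constants would be read from CrawlingThreadService rather than redefined.

```java
public class ExclusionCodeSketch {
    // Placeholder values for illustration only. The real constants are defined on
    // CrawlingThreadService, and their numeric values are not given in this reference.
    static final int DOC_INCLUDED = 0;
    static final int DOC_EXCLUDED_BY_URL_BOUNDARY = 1;
    static final int DOC_EXCLUDED_BY_MIMETYPE = 2;
    static final int DOC_EXCLUDED_BY_SIZE = 3;

    // Translates a status code into a short, log-friendly description.
    static String describe(int code) {
        switch (code) {
            case DOC_INCLUDED:
                return "included";
            case DOC_EXCLUDED_BY_URL_BOUNDARY:
                return "excluded by URL boundary";
            case DOC_EXCLUDED_BY_MIMETYPE:
                return "excluded by MIME type";
            case DOC_EXCLUDED_BY_SIZE:
                return "excluded by document size";
            default:
                return "unknown status " + code;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(DOC_EXCLUDED_BY_SIZE));
    }
}
```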
| Method Detail |
public void submitForProcessing(DocumentContainer target)
throws ProcessingException
Submits the document for processing. ... DocumentContainer.STATUS_OK_FOR_INDEX. After processing is done, the document is automatically removed from the queue. Note that the DocumentMetadata in the submitted target is cleared automatically if the operation succeeds.
Parameters:
target - the document container holding the content and metadata.
Throws:
ProcessingException
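Because the metadata in the submitted container is cleared on success, a plugin that needs a metadata value after the call (for logging, say) would have to copy it first. The sketch below illustrates that ordering with hypothetical stand-in types (StubMetadata, StubContainer, StubCrawlingThreadService, StubProcessingException, ClearingService). None of these are SDK classes, and the attribute map is an assumption made so that the example runs on its own. Only the method name submitForProcessing mirrors this reference.

```java
import java.util.HashMap;
import java.util.Map;

public class SubmitSketch {
    // Hypothetical stand-ins for the SDK types, reduced to what the sketch needs.
    static class StubProcessingException extends Exception {
        StubProcessingException(String message) { super(message); }
    }

    static class StubMetadata {
        final Map<String, String> attributes = new HashMap<String, String>();
    }

    static class StubContainer {
        final StubMetadata metadata = new StubMetadata();
    }

    interface StubCrawlingThreadService {
        void submitForProcessing(StubContainer target) throws StubProcessingException;
    }

    // Fake service that imitates the documented side effect:
    // the container's metadata is cleared when the submission succeeds.
    static class ClearingService implements StubCrawlingThreadService {
        int submitted = 0;

        public void submitForProcessing(StubContainer target) {
            target.metadata.attributes.clear();
            submitted++;
        }
    }

    // Copies the URL out of the metadata before submitting, because the
    // metadata is expected to be empty once the call has succeeded.
    static String submitAndReturnUrl(StubCrawlingThreadService service, StubContainer target)
            throws StubProcessingException {
        String url = target.metadata.attributes.get("url");
        service.submitForProcessing(target);
        return url;
    }

    public static void main(String[] args) throws StubProcessingException {
        StubContainer container = new StubContainer();
        container.metadata.attributes.put("url", "http://example.com/a.html");
        String url = submitAndReturnUrl(new ClearingService(), container);
        System.out.println("submitted " + url + ", metadata empty: "
                + container.metadata.attributes.isEmpty());
    }
}
```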
public void markStatusNotChanged(DocumentMetadata meta)
throws ProcessingException
Marks a URL entry as not requiring any changes or updates.
Parameters:
meta - the metadata object corresponding to the URL entry
Throws:
ProcessingException
public int checkDocumentExcluded(DocumentMetadata meta)
Checks whether the document should be crawled. Internal exclusion checking always occurs when documents are submitted.
Parameters:
meta - the document metadata
Returns:
CrawlingThreadService.DOC_INCLUDED, CrawlingThreadService.DOC_EXCLUDED_BY_URL_BOUNDARY, CrawlingThreadService.DOC_EXCLUDED_BY_MIMETYPE, or CrawlingThreadService.DOC_EXCLUDED_BY_SIZE
public String inferMimeType(String url)
Infers the MIME type from the URL suffix.
Parameters:
url - the document URL
Returns:
... null.
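Since exclusion checking always runs at submission time, an explicit call to checkDocumentExcluded is presumably most useful as an early filter, before the plugin spends effort fetching content. The sketch below combines it with inferMimeType under several assumptions: the stand-in types (StubMetadata, StubCrawlingThreadService, ToyService) are hypothetical, the status values are placeholders, the toy exclusion rule (reject PDF only) is invented for the example, and a null result from inferMimeType is taken to mean that the suffix was not recognized.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class PrecheckSketch {
    // Placeholder status codes; the real values live on CrawlingThreadService.
    static final int DOC_INCLUDED = 0;
    static final int DOC_EXCLUDED_BY_MIMETYPE = 2;

    // Hypothetical stand-in for DocumentMetadata, reduced to two fields.
    static class StubMetadata {
        final String url;
        String contentType;

        StubMetadata(String url) { this.url = url; }
    }

    // Hypothetical stand-in exposing the two method names from this reference.
    interface StubCrawlingThreadService {
        String inferMimeType(String url);
        int checkDocumentExcluded(StubMetadata meta);
    }

    // Toy implementation: recognizes two suffixes and rejects PDF documents.
    static class ToyService implements StubCrawlingThreadService {
        public String inferMimeType(String url) {
            String lower = url.toLowerCase(Locale.ROOT);
            if (lower.endsWith(".html")) return "text/html";
            if (lower.endsWith(".pdf")) return "application/pdf";
            return null; // assumption: null means the suffix is not recognized
        }

        public int checkDocumentExcluded(StubMetadata meta) {
            return "application/pdf".equals(meta.contentType)
                    ? DOC_EXCLUDED_BY_MIMETYPE
                    : DOC_INCLUDED;
        }
    }

    // Returns the URLs that pass the early check and are worth fetching.
    static List<String> selectForFetch(StubCrawlingThreadService service, List<String> urls) {
        List<String> accepted = new ArrayList<String>();
        for (String url : urls) {
            StubMetadata meta = new StubMetadata(url);
            String inferred = service.inferMimeType(url);
            if (inferred != null) {
                meta.contentType = inferred; // only set a type when one was inferred
            }
            if (service.checkDocumentExcluded(meta) == DOC_INCLUDED) {
                accepted.add(url);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
                "http://example.com/a.html",
                "http://example.com/b.pdf",
                "http://example.com/c");
        System.out.println(selectForFetch(new ToyService(), urls));
    }
}
```

A URL with an unrecognized suffix is left without a content type here and passes the early filter, so the final decision falls to the check performed at submission.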