Oracle Secure Enterprise Search Java API Reference 10g Release 1 (10.1.8.2) E10465-01
CrawlingThreadService is an interface that a crawler plugin uses to perform crawl-related tasks. Its execution context is specific to the crawling thread that invokes the plugin's crawl() method.
Field Summary | |
static int | DOC_EXCLUDED_BY_MIMETYPE | Document excluded by MIME type.
static int | DOC_EXCLUDED_BY_SIZE | Document excluded by document size.
static int | DOC_EXCLUDED_BY_URL_BOUNDARY | Document excluded by URL boundary.
static int | DOC_INCLUDED | Document should be included.
Method Summary | |
int | checkDocumentExcluded(DocumentMetadata meta) | Checks if the document should be crawled or not.
String | inferMimeType(String url) | Checks the MIME type based on the URL suffix.
void | markStatusNotChanged(DocumentMetadata meta) | Marks a URL entry as not requiring any changes or updates.
void | submitForProcessing(DocumentContainer target) | Submits the document for processing.
Field Detail |
public static final int DOC_INCLUDED
public static final int DOC_EXCLUDED_BY_URL_BOUNDARY
public static final int DOC_EXCLUDED_BY_MIMETYPE
public static final int DOC_EXCLUDED_BY_SIZE
Method Detail |
public void submitForProcessing(DocumentContainer target) throws ProcessingException
Submits the document for processing. ... DocumentContainer.STATUS_OK_FOR_INDEX. After the processing is done, this document is automatically removed from the queue. Note that the DocumentMetadata in the submitted target is cleared automatically if the operation succeeds.
Parameters:
target - the document container containing the content and metadata.
Throws:
ProcessingException
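The following is a minimal sketch of calling submitForProcessing() from a plugin, not the SDK's own code. The types are hypothetical stand-ins declared locally so that the example compiles without the Oracle SES libraries, and the label() accessor and trySubmit() helper are illustrative names that do not appear in this reference. It shows one way to handle ProcessingException and to read whatever is needed for logging before the call, since the metadata in the target is cleared on success.

```java
// Hypothetical stand-ins so the sketch compiles on its own; the real types
// ship with the Oracle SES crawler plugin SDK.
class ProcessingException extends Exception {
    ProcessingException(String message) { super(message); }
}

interface DocumentContainer {
    String label(); // hypothetical accessor, not part of the documented API
}

interface CrawlingThreadService {
    void submitForProcessing(DocumentContainer target) throws ProcessingException;
}

final class SubmitSketch {
    /** Submits one container, appends the outcome to log, and returns true on success. */
    static boolean trySubmit(CrawlingThreadService cts, DocumentContainer target, StringBuilder log) {
        // Read the label first: the reference says the metadata in the
        // target is cleared after a successful submit.
        String label = target.label();
        try {
            cts.submitForProcessing(target);
            log.append("submitted ").append(label).append('\n');
            return true;
        } catch (ProcessingException e) {
            log.append("failed ").append(label).append(": ").append(e.getMessage()).append('\n');
            return false;
        }
    }
}
```

Returning a boolean instead of rethrowing is a design choice for this sketch only; a real plugin may prefer to propagate the exception to its caller.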
public void markStatusNotChanged(DocumentMetadata meta) throws ProcessingException
Marks a URL entry as not requiring any changes or updates.
Parameters:
meta - the metadata object corresponding to the URL entry
Throws:
ProcessingException
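As an illustration of where markStatusNotChanged() might fit, the sketch below chooses between marking an entry unchanged and resubmitting it. It rests on assumptions: the types are local stand-ins for the SDK types, and the timestamp comparison, the handle() helper, and its long parameters are hypothetical, since this reference does not say how a plugin detects changes.

```java
// Hypothetical stand-ins so the sketch compiles on its own; the real types
// ship with the Oracle SES crawler plugin SDK.
class ProcessingException extends Exception {
    ProcessingException(String message) { super(message); }
}

interface DocumentMetadata {}

interface DocumentContainer {}

interface CrawlingThreadService {
    void markStatusNotChanged(DocumentMetadata meta) throws ProcessingException;
    void submitForProcessing(DocumentContainer target) throws ProcessingException;
}

final class IncrementalCrawlSketch {
    /**
     * Marks the entry unchanged when the source has not been modified since
     * the last crawl, otherwise submits it. Both timestamps are epoch
     * milliseconds supplied by the plugin (an assumption of this sketch).
     */
    static String handle(CrawlingThreadService cts, DocumentMetadata meta,
                         DocumentContainer target, long lastModified, long lastCrawled)
            throws ProcessingException {
        if (lastModified <= lastCrawled) {
            cts.markStatusNotChanged(meta);
            return "unchanged";
        }
        cts.submitForProcessing(target);
        return "submitted";
    }
}
```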
public int checkDocumentExcluded(DocumentMetadata meta)
Checks if the document should be crawled or not. The internal exclusion checking always occurs when documents are submitted.
Parameters:
meta - the document metadata
Returns:
CrawlingThreadService.DOC_INCLUDED, CrawlingThreadService.DOC_EXCLUDED_BY_URL_BOUNDARY, CrawlingThreadService.DOC_EXCLUDED_BY_MIMETYPE, or CrawlingThreadService.DOC_EXCLUDED_BY_SIZE
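Because the same exclusion check runs again at submit time, one plausible use of checkDocumentExcluded() is to skip fetching content that would be rejected anyway. The sketch below illustrates that idea under stated assumptions: the interfaces are local stand-ins, the constant values are placeholders (this reference does not list the real values), and describe() and shouldFetch() are hypothetical helpers.

```java
// Hypothetical stand-ins so the sketch compiles on its own; the real types
// ship with the Oracle SES crawler plugin SDK.
interface DocumentMetadata {}

interface CrawlingThreadService {
    // Placeholder values: the reference does not list the real ones.
    int DOC_INCLUDED = 0;
    int DOC_EXCLUDED_BY_URL_BOUNDARY = 1;
    int DOC_EXCLUDED_BY_MIMETYPE = 2;
    int DOC_EXCLUDED_BY_SIZE = 3;

    int checkDocumentExcluded(DocumentMetadata meta);
}

final class ExclusionCheckSketch {
    /** Maps a checkDocumentExcluded() result to a log-friendly reason. */
    static String describe(int code) {
        switch (code) {
            case CrawlingThreadService.DOC_INCLUDED:
                return "included";
            case CrawlingThreadService.DOC_EXCLUDED_BY_URL_BOUNDARY:
                return "excluded by URL boundary";
            case CrawlingThreadService.DOC_EXCLUDED_BY_MIMETYPE:
                return "excluded by MIME type";
            case CrawlingThreadService.DOC_EXCLUDED_BY_SIZE:
                return "excluded by size";
            default:
                return "unknown code " + code;
        }
    }

    /** True when the document passes the exclusion check and is worth fetching. */
    static boolean shouldFetch(CrawlingThreadService cts, DocumentMetadata meta) {
        return cts.checkDocumentExcluded(meta) == CrawlingThreadService.DOC_INCLUDED;
    }
}
```

Comparing against the named constants instead of literal integers keeps the code independent of whatever values the real interface defines.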
public String inferMimeType(String url)
Checks the MIME type based on the URL suffix.
Parameters:
url - the document URL
Returns:
... null.
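Since the return description mentions null, callers may want to guard against it. The sketch below assumes that null means the type could not be inferred from the URL suffix, which this reference does not state explicitly. The interface is a local stand-in, and FALLBACK and mimeTypeOrFallback() are hypothetical names chosen for illustration.

```java
// Hypothetical stand-in so the sketch compiles on its own; the real
// interface ships with the Oracle SES crawler plugin SDK.
interface CrawlingThreadService {
    String inferMimeType(String url);
}

final class MimeTypeSketch {
    // Generic binary type used when nothing better is known (a choice made
    // for this sketch, not a documented SES default).
    static final String FALLBACK = "application/octet-stream";

    /** Returns the inferred MIME type, or FALLBACK when the service returns null. */
    static String mimeTypeOrFallback(CrawlingThreadService cts, String url) {
        String inferred = cts.inferMimeType(url);
        return inferred != null ? inferred : FALLBACK;
    }
}
```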