Oracle Secure Enterprise Search Java API Reference 11g Release 1 (11.1.2.2.0) E21607-01
public interface CrawlingThreadService
CrawlingThreadService is an interface used by a crawler plug-in to perform crawl-related tasks. Its execution is context-specific to the crawling thread that invokes the plug-in crawl() method.
| Field Summary | |
|---|---|
| static int | DOC_EXCLUDED_BY_MIMETYPE |
| static int | DOC_EXCLUDED_BY_SIZE |
| static int | DOC_EXCLUDED_BY_URL_BOUNDARY |
| static int | DOC_INCLUDED |
| Method Summary | |
|---|---|
| int | checkDocumentExcluded(DocumentMetadata meta) - Checks if the document should be crawled. |
| String | inferMimeType(String url) - Checks the MIME type based on the URL suffix. |
| void | markStatusNotChanged(DocumentMetadata meta) - Marks a URL entry as not requiring any changes or updates. |
| void | submitForProcessing(DocumentContainer target) - Submits the document for processing. |
| Field Detail |
|---|
static final int DOC_INCLUDED
static final int DOC_EXCLUDED_BY_URL_BOUNDARY
static final int DOC_EXCLUDED_BY_MIMETYPE
static final int DOC_EXCLUDED_BY_SIZE
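The four constants are the possible results of checkDocumentExcluded. A plug-in that logs why a document was skipped might translate them into readable reasons, roughly as in the sketch below. The constants are redeclared locally with placeholder values so the example compiles without the SDK (the reference does not list the real values), and the reason strings are only inferred from the constant names.

```java
// Sketch only: these constants are local stand-ins with placeholder values,
// not the values defined by the real CrawlingThreadService interface.
class ExclusionReasonSketch {
    static final int DOC_INCLUDED = 0;
    static final int DOC_EXCLUDED_BY_URL_BOUNDARY = 1;
    static final int DOC_EXCLUDED_BY_MIMETYPE = 2;
    static final int DOC_EXCLUDED_BY_SIZE = 3;

    /** Maps a checkDocumentExcluded-style result code to a log-friendly reason. */
    static String describe(int code) {
        switch (code) {
            case DOC_INCLUDED:
                return "included";
            case DOC_EXCLUDED_BY_URL_BOUNDARY:
                return "excluded by URL boundary rule";
            case DOC_EXCLUDED_BY_MIMETYPE:
                return "excluded by MIME type rule";
            case DOC_EXCLUDED_BY_SIZE:
                return "excluded by size rule";
            default:
                // Defensive branch for codes this sketch does not know about.
                return "unknown code " + code;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(DOC_EXCLUDED_BY_SIZE));
    }
}
```

In real plug-in code the comparison would be made against the CrawlingThreadService constants themselves rather than literal numbers.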
| Method Detail |
|---|
void submitForProcessing(DocumentContainer target)
throws ProcessingException
Submits the document for processing. […] DocumentContainer.STATUS_OK_FOR_INDEX. After processing is done, the document is automatically removed from the queue. The DocumentMetadata in the submitted target is cleared automatically if the operation succeeds.
Parameters: target - the document container containing the content and metadata.
Throws: ProcessingException
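Because the metadata in the submitted container is cleared on success, a caller that wants to log the URL afterwards may need to read it before submitting. The following is a minimal sketch of that pattern, not the SDK itself: ProcessingException, DocumentMetadata, DocumentContainer, and CrawlingThreadService are local stand-ins, the url and content fields are hypothetical, and FakeService models "cleared" as setting the field to null, which is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: every type below is a local stand-in so the example compiles
// without the Oracle SES SDK. The url and content fields are hypothetical.
class SubmitSketch {
    static class ProcessingException extends Exception {
        ProcessingException(String message) { super(message); }
    }

    static class DocumentMetadata {
        String url;
    }

    static class DocumentContainer {
        final DocumentMetadata metadata = new DocumentMetadata();
        String content;
    }

    interface CrawlingThreadService {
        void submitForProcessing(DocumentContainer target) throws ProcessingException;
    }

    /** In-memory fake: records the URL, then clears the metadata on success. */
    static class FakeService implements CrawlingThreadService {
        final List<String> submitted = new ArrayList<>();

        @Override
        public void submitForProcessing(DocumentContainer target) throws ProcessingException {
            if (target.content == null) {
                throw new ProcessingException("no content for " + target.metadata.url);
            }
            submitted.add(target.metadata.url);
            target.metadata.url = null; // assumption: "cleared" modeled as null
        }
    }

    /** Submits one document and reports whether the submission succeeded. */
    static boolean submit(CrawlingThreadService service, DocumentContainer doc) {
        String url = doc.metadata.url; // read before the metadata is cleared
        try {
            service.submitForProcessing(doc);
            System.out.println("Submitted " + url);
            return true;
        } catch (ProcessingException e) {
            System.err.println("Failed to submit " + url + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        FakeService service = new FakeService();
        DocumentContainer doc = new DocumentContainer();
        doc.metadata.url = "http://example.com/a.html";
        doc.content = "<html>hello</html>";
        submit(service, doc);
    }
}
```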
void markStatusNotChanged(DocumentMetadata meta)
throws ProcessingException
Marks a URL entry as not requiring any changes or updates.
Parameters: meta - the metadata object corresponding to the URL entry
Throws: ProcessingException
int checkDocumentExcluded(DocumentMetadata meta)
Checks if the document should be crawled. The internal exclusion check always runs when a document is submitted.
Parameters: meta - the document metadata
Returns: CrawlingThreadService.DOC_INCLUDED, CrawlingThreadService.DOC_EXCLUDED_BY_URL_BOUNDARY, CrawlingThreadService.DOC_EXCLUDED_BY_MIMETYPE, or CrawlingThreadService.DOC_EXCLUDED_BY_SIZE
String inferMimeType(String url)
Checks the MIME type based on the URL suffix.
Parameters: url - the document URL
Returns: […] null.
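To illustrate how these methods might fit together for a single URL, the sketch below pairs a suffix-based MIME lookup with a small decision function. It rests on several assumptions: the suffix table is hypothetical (the reference does not list the real one), returning null for an unknown suffix is a guess, DOC_INCLUDED has a placeholder value, and the ordering of the checks is one possible design rather than documented behavior.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Sketch only: nothing here calls the Oracle SES SDK.
class CrawlDecisionSketch {
    static final int DOC_INCLUDED = 0; // placeholder value

    /** Hypothetical suffix table; the real lookup is not listed in this reference. */
    static final Map<String, String> SUFFIX_TO_MIME = new HashMap<>();
    static {
        SUFFIX_TO_MIME.put("html", "text/html");
        SUFFIX_TO_MIME.put("pdf", "application/pdf");
        SUFFIX_TO_MIME.put("txt", "text/plain");
    }

    /** Suffix-based inference; returns null for a missing or unknown suffix (assumption). */
    static String inferMimeType(String url) {
        int query = url.indexOf('?');
        String path = query >= 0 ? url.substring(0, query) : url;
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        // No dot, a dot that belongs to an earlier path segment or the host,
        // or a trailing dot all mean there is no usable suffix.
        if (dot < 0 || dot < slash || dot == path.length() - 1) {
            return null;
        }
        return SUFFIX_TO_MIME.get(path.substring(dot + 1).toLowerCase(Locale.ROOT));
    }

    enum Action { MARK_UNCHANGED, SKIP, SUBMIT }

    /** Picks the next step for one URL entry. */
    static Action decide(boolean changedSinceLastCrawl, int exclusionCode) {
        if (!changedSinceLastCrawl) {
            return Action.MARK_UNCHANGED; // a plug-in would call markStatusNotChanged(meta)
        }
        if (exclusionCode != DOC_INCLUDED) {
            return Action.SKIP; // avoids fetching a document that would be excluded
        }
        return Action.SUBMIT; // a plug-in would call submitForProcessing(container)
    }

    public static void main(String[] args) {
        System.out.println(inferMimeType("http://example.com/report.PDF?rev=2"));
        System.out.println(decide(true, DOC_INCLUDED));
    }
}
```

Since the exclusion check runs again internally at submission time, the explicit check in decide only saves the cost of fetching content that would be rejected anyway.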