Oracle Ultra Search Java API Reference
Release 10g (10.1)

B12028-02

oracle.ultrasearch.crawler
Interface DocumentService


public interface DocumentService

DocumentService is an interface used by a document service agent to submit document attributes, document contents, or both to the crawler.


Field Summary
static int FOLLOW_AND_INDEX
           
static int FOLLOW_AND_NO_INDEX
           
static int FOLLOW_UP
           
static int NO_CHANGE
          status codes for doService
static int NO_FOLLOW_AND_INDEX
           
static int NO_FOLLOW_AND_NO_INDEX
           
static int USE_CURRENT
          status codes for getRobotControl

 

Method Summary
 void close()
          Shut down the agent
 int doService(java.lang.String documentUrl, int urlId, java.io.Reader docReader)
          Ask the agent to prepare to fetch the document at the given URL, and its attributes, from the data source.
 UrlData getAttribute(int urlId)
          Get the attributes of the document identified by urlId
 java.io.Reader getContents(int urlId)
          Get the new contents of the document
 int getRobotControl(int urlId)
          Get the robot control code for the document identified by urlId
 void open(DataSourceParams params, java.io.PrintWriter log)
          Initialize the document service agent
 void received(int urlId)
          Called after the crawler has finished working with the target document, to allow the agent to do any cleanup work associated with its service
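
The method descriptions above suggest a per-document lifecycle: open once, then doService followed by the getters and received for each document, then close. The sketch below traces one such pass. It is only an illustration under stated assumptions: AgentException, DataSourceParams and UrlData are empty stand-ins so the example compiles on its own, the constant values are placeholders, the local DocumentService interface merely mirrors the signatures on this page, and the call order (including the idea that the getters are consulted only after FOLLOW_UP) is inferred rather than documented.

```java
import java.io.PrintWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for oracle.ultrasearch.crawler types (not the real classes).
class AgentException extends Exception {
    AgentException(String message) { super(message); }
}
class DataSourceParams { }
class UrlData { }

// Local mirror of the signatures on this page; constant values are placeholders.
interface DocumentService {
    int NO_CHANGE = 0;
    int FOLLOW_UP = 1;
    int USE_CURRENT = 2;
    void open(DataSourceParams params, PrintWriter log) throws AgentException;
    int doService(String documentUrl, int urlId, Reader docReader) throws AgentException;
    UrlData getAttribute(int urlId);
    int getRobotControl(int urlId);
    Reader getContents(int urlId);
    void received(int urlId) throws AgentException;
    void close() throws AgentException;
}

// Agent that only records which methods were called, in order.
class TracingAgent implements DocumentService {
    final List<String> calls = new ArrayList<>();
    public void open(DataSourceParams params, PrintWriter log) { calls.add("open"); }
    public int doService(String documentUrl, int urlId, Reader docReader) {
        calls.add("doService");
        return FOLLOW_UP;
    }
    public UrlData getAttribute(int urlId) { calls.add("getAttribute"); return null; }
    public int getRobotControl(int urlId) { calls.add("getRobotControl"); return USE_CURRENT; }
    public Reader getContents(int urlId) { calls.add("getContents"); return new StringReader(""); }
    public void received(int urlId) { calls.add("received"); }
    public void close() { calls.add("close"); }
}

class LifecycleSketch {
    // Assumed crawler-side call order for one document; the real crawler may differ.
    static List<String> run(TracingAgent agent) {
        DocumentService service = agent;
        try {
            service.open(new DataSourceParams(), new PrintWriter(new StringWriter()));
            int urlId = 1;
            int status = service.doService("http://example.com/doc", urlId, new StringReader("body"));
            if (status == DocumentService.FOLLOW_UP) {
                service.getAttribute(urlId);
                service.getRobotControl(urlId);
                service.getContents(urlId);
            }
            service.received(urlId);
            service.close();
        } catch (AgentException e) {
            throw new IllegalStateException("agent failed", e);
        }
        return agent.calls;
    }

    public static void main(String[] args) {
        System.out.println(run(new TracingAgent()));
    }
}
```

Recording the calls in a list keeps the sketch free of any real data source while still showing where each method would sit in a crawl.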

 

Field Detail

NO_CHANGE

public static final int NO_CHANGE
status codes for doService
See Also:
Constant Field Values

FOLLOW_UP

public static final int FOLLOW_UP
See Also:
Constant Field Values

USE_CURRENT

public static final int USE_CURRENT
status codes for getRobotControl
See Also:
Constant Field Values

FOLLOW_AND_INDEX

public static final int FOLLOW_AND_INDEX
See Also:
Constant Field Values

FOLLOW_AND_NO_INDEX

public static final int FOLLOW_AND_NO_INDEX
See Also:
Constant Field Values

NO_FOLLOW_AND_INDEX

public static final int NO_FOLLOW_AND_INDEX
See Also:
Constant Field Values

NO_FOLLOW_AND_NO_INDEX

public static final int NO_FOLLOW_AND_NO_INDEX
See Also:
Constant Field Values

Method Detail

open

public void open(DataSourceParams params,
                 java.io.PrintWriter log)
          throws AgentException
Initialize the document service agent
Parameters:
params - the agent parameters
log - the crawler log file
Throws:
AgentException - if unable to initialize the agent

doService

public int doService(java.lang.String documentUrl,
                     int urlId,
                     java.io.Reader docReader)
              throws AgentException
Ask the agent to prepare to fetch the document at the given URL, and its attributes, from the data source.
Parameters:
documentUrl - the URL string of the document
urlId - the URL id associated with this document
docReader - Reader access to the document
Returns:
a status code (int) indicating the action required from the crawler.
Throws:
AgentException - if unable to process the document. Throw a fatal agent exception to stop the crawler, or a warning agent exception to report a non-fatal problem.
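
One way an agent might choose its doService status code is to compare the text available through docReader with what it saw last time for the same urlId. The sketch below does that in memory. Several things here are assumptions, not part of this reference: the AgentException stand-in and its constructor, the placeholder constant values, and the reading that NO_CHANGE means "nothing more to do" while FOLLOW_UP asks the crawler to come back for attributes or contents.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in; the real AgentException constructors are not shown on this page.
class AgentException extends Exception {
    AgentException(String message, Throwable cause) { super(message, cause); }
}

class ChangeDetectingAgent {
    // Placeholder values; the real ones are listed under "Constant Field Values".
    static final int NO_CHANGE = 0;
    static final int FOLLOW_UP = 1;

    // Last text seen for each URL id (kept in memory for the sketch).
    private final Map<Integer, String> lastSeen = new HashMap<>();

    // Same signature as DocumentService.doService.
    public int doService(String documentUrl, int urlId, Reader docReader) throws AgentException {
        String text;
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = docReader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            text = sb.toString();
        } catch (IOException e) {
            // Surface read failures to the crawler instead of swallowing them.
            throw new AgentException("cannot read " + documentUrl, e);
        }
        String previous = lastSeen.put(urlId, text);
        return text.equals(previous) ? NO_CHANGE : FOLLOW_UP;
    }

    public static void main(String[] args) throws AgentException {
        ChangeDetectingAgent agent = new ChangeDetectingAgent();
        System.out.println(agent.doService("http://example.com/a", 7, new StringReader("v1")));
        System.out.println(agent.doService("http://example.com/a", 7, new StringReader("v1")));
    }
}
```

A production agent would more likely persist a digest than the full text; the full string is kept here only to make the comparison obvious.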

getAttribute

public UrlData getAttribute(int urlId)
Get the attributes of the document identified by urlId
Parameters:
urlId - URL id
Returns:
the document attributes object, or null if the document has no attributes

getRobotControl

public int getRobotControl(int urlId)
Get the robot control code for the document identified by urlId
Parameters:
urlId - URL id
Returns:
robot control code: USE_CURRENT, FOLLOW_AND_INDEX, FOLLOW_AND_NO_INDEX, NO_FOLLOW_AND_INDEX, NO_FOLLOW_AND_NO_INDEX
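
The four FOLLOW/INDEX codes read as combinations of two independent decisions: whether to follow links from the document and whether to index it. The sketch below encodes and decodes them on that reading, and treats USE_CURRENT (listed in the field summary as a getRobotControl status code) as "keep whatever the crawler already decided". The constant values are placeholders, and the helper names encode, followsLinks and indexes are hypothetical; none of this is taken from the Ultra Search implementation.

```java
class RobotControlSketch {
    // Placeholder values; the real ones are listed under "Constant Field Values".
    static final int USE_CURRENT = 0;
    static final int FOLLOW_AND_INDEX = 1;
    static final int FOLLOW_AND_NO_INDEX = 2;
    static final int NO_FOLLOW_AND_INDEX = 3;
    static final int NO_FOLLOW_AND_NO_INDEX = 4;

    // Agent side: pick a code from two independent decisions.
    static int encode(boolean follow, boolean index) {
        if (follow) {
            return index ? FOLLOW_AND_INDEX : FOLLOW_AND_NO_INDEX;
        }
        return index ? NO_FOLLOW_AND_INDEX : NO_FOLLOW_AND_NO_INDEX;
    }

    // Crawler side (assumed): should links in the document be followed?
    static boolean followsLinks(int code, boolean currentFollow) {
        switch (code) {
            case FOLLOW_AND_INDEX:
            case FOLLOW_AND_NO_INDEX:
                return true;
            case NO_FOLLOW_AND_INDEX:
            case NO_FOLLOW_AND_NO_INDEX:
                return false;
            default:
                return currentFollow; // USE_CURRENT or unknown: keep the current setting
        }
    }

    // Crawler side (assumed): should the document be indexed?
    static boolean indexes(int code, boolean currentIndex) {
        switch (code) {
            case FOLLOW_AND_INDEX:
            case NO_FOLLOW_AND_INDEX:
                return true;
            case FOLLOW_AND_NO_INDEX:
            case NO_FOLLOW_AND_NO_INDEX:
                return false;
            default:
                return currentIndex; // USE_CURRENT or unknown: keep the current setting
        }
    }

    public static void main(String[] args) {
        int code = encode(true, false);
        System.out.println(followsLinks(code, false) + " " + indexes(code, true));
    }
}
```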

getContents

public java.io.Reader getContents(int urlId)
Get the new contents of the document
Parameters:
urlId - URL id
Returns:
Reader access to the new document contents
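
As a minimal illustration of handing replacement text back through a Reader, the sketch below stores new contents per urlId and wraps them in a StringReader on request. The setNewContents and readAll helpers are hypothetical, and returning null for an unknown id is a choice made for this sketch; this page does not say what the crawler expects in that case.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

class ContentsSketchAgent {
    // Replacement text prepared by the agent, keyed by URL id.
    private final Map<Integer, String> newContents = new HashMap<>();

    // Hypothetical helper: records the new text for a document.
    void setNewContents(int urlId, String text) {
        newContents.put(urlId, text);
    }

    // Same signature as DocumentService.getContents.
    public Reader getContents(int urlId) {
        String text = newContents.get(urlId);
        return text == null ? null : new StringReader(text); // null is this sketch's choice
    }

    // Hypothetical helper: drains a Reader the way a consumer might.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        ContentsSketchAgent agent = new ContentsSketchAgent();
        agent.setNewContents(3, "cleaned text");
        System.out.println(readAll(agent.getContents(3)));
    }
}
```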

received

public void received(int urlId)
              throws AgentException
Called after the crawler has finished working with the target document, to allow the agent to do any cleanup work associated with its service
Throws:
AgentException - for any error the crawler should be aware of
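
An agent that keeps per-document state can use received as the point where that state is released. The sketch below assumes such state lives in a map keyed by urlId. The remember and pendingCount helpers are hypothetical, AgentException is a stand-in so the example compiles alone, and raising an exception for an unknown id is one possible policy, not documented behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in; the real AgentException constructors are not shown on this page.
class AgentException extends Exception {
    AgentException(String message) { super(message); }
}

class CleanupSketchAgent {
    // Per-document working state, keyed by URL id.
    private final Map<Integer, StringBuilder> pending = new HashMap<>();

    // Hypothetical helper standing in for whatever doService caches per document.
    void remember(int urlId, String text) {
        pending.put(urlId, new StringBuilder(text));
    }

    // Same signature as DocumentService.received.
    public void received(int urlId) throws AgentException {
        if (pending.remove(urlId) == null) {
            // Policy chosen for this sketch: tell the crawler about unexpected ids.
            throw new AgentException("no pending state for URL id " + urlId);
        }
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) throws AgentException {
        CleanupSketchAgent agent = new CleanupSketchAgent();
        agent.remember(5, "buffered text");
        agent.received(5);
        System.out.println(agent.pendingCount());
    }
}
```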

close

public void close()
           throws AgentException
Shut down the agent
Throws:
AgentException - if unable to close the agent

Copyright © 2004 Oracle Corporation. All Rights Reserved.