Oracle Secure Enterprise Search Java API Reference
10g Release 1 (10.1.8)

B32260-01


oracle.search.sdk.crawler
Interface DocumentMetadata


public interface DocumentMetadata

DocumentMetadata is an interface used by a crawler plugin to submit URL-related data to the crawler.


Field Summary
static int ATTR_TYPE_DATE
          Date attribute data type
static int ATTR_TYPE_NUMBER
          Number attribute data type
static int ATTR_TYPE_STRING
          String attribute data type

 

Method Summary
 void addAttribute(java.lang.String name, java.math.BigDecimal value)
          Add an attribute value whose type is number
 void addAttribute(java.lang.String name, java.util.Date value)
          Add an attribute value whose type is java.util.Date. Note that java.util.Date includes a time component.
 void addAttribute(java.lang.String name, java.lang.String value)
          Add an attribute value whose type is string
 void clearData()
          Clear all metadata of this document
 void deleteAttribute(java.lang.String name)
          Delete the specified attribute
 java.lang.String getAccessURL()
          Get the value of an access URL document property
 DocumentAcl getACLInfo()
          Get the ACL associated with the document
 java.lang.String getAttributeName(int attrIndex)
          Get the name of the specified attribute
 int getAttributeType(int attrIndex)
          Get the data type of the specified attribute
 java.util.Enumeration getAttributeValues(int attrIndex)
          Get the list of values for a specified attribute
 java.util.Enumeration getAttributeValues(java.lang.String attrName)
          Get the list of values for a specified attribute
 int getContentLength()
          Return the actual document content size in bytes. It may differ from the value set by the plugin.
 java.lang.String getContentType()
          Get the document content type
 int getCrawlDepth()
          Get the crawling depth of the document.
 java.lang.String getDisplayURL()
          Get the value of the display URL property.
 java.lang.String getLanguage()
          Get the ISO 639-1 language code of the document.
 int getLastDocumentStatus()
          Get the document status of the previous crawl
 java.util.Date getLastModifiedDate()
          Get the last modification date of the document
 int getNumAttributes()
          Get number of attributes
 int getNumAttrValues()
          Get number of attribute values
 java.lang.String getOwnerGuid()
          Get the owner id of the document in the form of global user id (GUID)
 java.lang.String[] getSourceHierarchy()
          Get the source hierarchy of the document
 void setAccessURL(java.lang.String value)
          Set the access URL property
 void setACLInfo(DocumentAcl acl)
          Set the document ACL.
 void setAffinity(java.lang.String value)
          Set the document affinity value for duplicate detection
 void setAttributes(java.lang.String name, java.math.BigDecimal[] values)
          Set (replace) a list of attribute values whose type is number
 void setAttributes(java.lang.String name, java.util.Date[] values)
          Set (replace) a list of attribute values whose type is java.util.Date. Note that java.util.Date includes a time component.
 void setAttributes(java.lang.String name, java.lang.String[] values)
          Set (replace) a list of attribute values whose type is String
 void setContentLength(int size)
          Set the size of the document. The size may be overwritten by the crawler when it fetches the content.
 void setContentType(java.lang.String mimeType)
          Set the content type of the document
 void setCrawlDepth(int depth)
          Set the crawling depth of the document.
 void setDisplayURL(java.lang.String value)
          Set the display URL property
 void setLanguage(java.lang.String value)
          Set the language of the document using ISO 639-1 language code; for example, 'en' for English, 'ja' for Japanese, and 'fr' for French
 void setLastModifiedDate(java.util.Date timeStamp)
          Set the last modification date of the document
 void setOwnerGuid(java.lang.String ownerGuid)
          Set the document owner in the form of Oracle OID global user id (orclguid)
 void setSourceHierarchy(java.lang.String[] hierarchyList)
          Set the path of the document in terms of information source organization.

 

Field Detail

ATTR_TYPE_STRING

public static final int ATTR_TYPE_STRING
String attribute data type
See Also:
Constant Field Values

ATTR_TYPE_NUMBER

public static final int ATTR_TYPE_NUMBER
Number attribute data type
See Also:
Constant Field Values

ATTR_TYPE_DATE

public static final int ATTR_TYPE_DATE
Date attribute data type
See Also:
Constant Field Values

Method Detail

setDisplayURL

public void setDisplayURL(java.lang.String value)
Set the display URL property
Parameters:
value - the display URL property value

getDisplayURL

public java.lang.String getDisplayURL()
Get the value of the display URL property. If multiple display URLs are set, the first URL is returned.
Returns:
the property value, which can be null if there is no such property

setAccessURL

public void setAccessURL(java.lang.String value)
Set the access URL property
Parameters:
value - the access URL property value

getAccessURL

public java.lang.String getAccessURL()
Get the value of an access URL document property
Returns:
the property value which can be null

setAffinity

public void setAffinity(java.lang.String value)
Set the document affinity value for duplicate detection
Parameters:
value - the affinity string

setContentType

public void setContentType(java.lang.String mimeType)
Set the content type of the document
Parameters:
mimeType - the MIME content type of the document

getContentType

public java.lang.String getContentType()
Get the document content type
Returns:
the content type

setCrawlDepth

public void setCrawlDepth(int depth)
Set the crawling depth of the document. The value of the depth can be generalized to any integer value that suits the need of the crawl.
Parameters:
depth - the crawling depth of the document

getCrawlDepth

public int getCrawlDepth()
Get the crawling depth of the document.
Returns:
the crawling depth of the document

setLanguage

public void setLanguage(java.lang.String value)
Set the language of the document using ISO 639-1 language code; for example, 'en' for English, 'ja' for Japanese, and 'fr' for French
Parameters:
value - the ISO 639-1 language code
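
As a minimal sketch of where such a code might come from, the helper below derives the language part from a BCP 47 tag using the standard java.util.Locale class. The class and method names (LanguageCodes, iso639Of) are hypothetical and not part of this API; the resulting string is what a plugin would pass to setLanguage.

```java
import java.util.Locale;

// Hypothetical helper, not part of oracle.search.sdk.crawler.
class LanguageCodes {
    // Returns the language subtag of a BCP 47 tag such as "fr-CA".
    // For languages that have an ISO 639-1 code this is the two-letter code;
    // a malformed tag yields an empty string.
    static String iso639Of(String languageTag) {
        return Locale.forLanguageTag(languageTag).getLanguage();
    }
}
```

For example, iso639Of("ja-JP") yields "ja", which matches the form the parameter description asks for.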

getLanguage

public java.lang.String getLanguage()
Get the ISO 639-1 language code of the document.
Returns:
the ISO 639-1 language code of the document

setSourceHierarchy

public void setSourceHierarchy(java.lang.String[] hierarchyList)
Set the path of the document in terms of information source organization; for example, [hardware][power tools][sanders] for a URL path /hardware/power%20tools/sanders.
Parameters:
hierarchyList - the hierarchy list from top to bottom
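
The mapping from a URL path to a hierarchy list in the example above could be sketched as follows. This is an illustration only: the class and method names (SourceHierarchy, fromUrlPath) are hypothetical, and nothing here says the crawler requires the hierarchy to be derived from the URL.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of oracle.search.sdk.crawler.
class SourceHierarchy {
    // Splits a URL path into decoded segments, ordered from top to bottom.
    static String[] fromUrlPath(String path) {
        List<String> levels = new ArrayList<>();
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                continue; // skip the empty piece before a leading slash
            }
            levels.add(decode(segment));
        }
        return levels.toArray(new String[0]);
    }

    // URLDecoder applies form decoding, so it also turns '+' into a space.
    private static String decode(String segment) {
        try {
            return URLDecoder.decode(segment, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

The returned array is in the top-to-bottom order that hierarchyList expects.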

getSourceHierarchy

public java.lang.String[] getSourceHierarchy()
Get the source hierarchy of the document
Returns:
the hierarchy list from top to bottom

setLastModifiedDate

public void setLastModifiedDate(java.util.Date timeStamp)
Set the last modification date of the document
Parameters:
timeStamp - the last modification date

getLastModifiedDate

public java.util.Date getLastModifiedDate()
Get the last modification date of the document
Returns:
the last modification date

setACLInfo

public void setACLInfo(DocumentAcl acl)
Set the document ACL. ACL information is provided through the use of DocumentAcl object
Parameters:
acl - the document protecting ACL

getACLInfo

public DocumentAcl getACLInfo()
Get the ACL associated with the document
Returns:
the ACL value which can be null if there is no ACL associated with the document

setOwnerGuid

public void setOwnerGuid(java.lang.String ownerGuid)
Set the document owner in the form of Oracle OID global user id (orclguid)
Parameters:
ownerGuid - the owner global user id

getOwnerGuid

public java.lang.String getOwnerGuid()
Get the owner id of the document in the form of global user id (GUID)
Returns:
the owner global user id

getLastDocumentStatus

public int getLastDocumentStatus()
Get the document status of the previous crawl
Returns:
the document status code; 0 if this is a first-time crawl

addAttribute

public void addAttribute(java.lang.String name,
                         java.math.BigDecimal value)
Add an attribute value whose type is number
Parameters:
name - the name of the attribute
value - the value of the attribute

addAttribute

public void addAttribute(java.lang.String name,
                         java.util.Date value)
Add an attribute value whose type is java.util.Date. Note that java.util.Date includes a time component.
Parameters:
name - the name of the attribute
value - the value of the attribute
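
Because java.util.Date carries a time of day, a plugin that only cares about the calendar day may want to normalize the value before adding it. The sketch below is one way to do that, assuming UTC day boundaries are acceptable; the class and method names (DateAttributes, truncateToUtcDay) are hypothetical and the API itself does not require this step.

```java
import java.time.temporal.ChronoUnit;
import java.util.Date;

// Hypothetical helper, not part of oracle.search.sdk.crawler.
class DateAttributes {
    // Drops the time component by truncating to the start of the UTC day.
    static Date truncateToUtcDay(Date value) {
        return Date.from(value.toInstant().truncatedTo(ChronoUnit.DAYS));
    }
}
```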

addAttribute

public void addAttribute(java.lang.String name,
                         java.lang.String value)
Add an attribute value whose type is string
Parameters:
name - the name of the attribute
value - the value of the attribute

setAttributes

public void setAttributes(java.lang.String name,
                          java.math.BigDecimal[] values)
Set (replace) a list of attribute values whose type is number
Parameters:
name - the name of the attribute
values - array of attribute values, at least one value should exist

setAttributes

public void setAttributes(java.lang.String name,
                          java.util.Date[] values)
Set (replace) a list of attribute values whose type is java.util.Date. Note that java.util.Date includes a time component.
Parameters:
name - the name of the attribute
values - array of attribute values, at least one value should exist

setAttributes

public void setAttributes(java.lang.String name,
                          java.lang.String[] values)
Set (replace) a list of attribute values whose type is String
Parameters:
name - the name of the attribute
values - array of attribute values, at least one value should exist
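
The difference between adding a value and setting (replacing) the value list can be illustrated with a small in-memory model. This is a hypothetical stand-in written for this example (AttributeModel is not Oracle's implementation), restricted to String values; rejecting an empty array is an assumption drawn from the "at least one value" note above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of multi-valued attributes, NOT the crawler's implementation.
class AttributeModel {
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    // Appends one value, creating the attribute if it does not exist yet.
    void addAttribute(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // Replaces whatever values the attribute previously held.
    void setAttributes(String name, String[] newValues) {
        if (newValues == null || newValues.length == 0) {
            // Assumption: the real API's behavior for an empty array is not documented.
            throw new IllegalArgumentException("at least one value is required");
        }
        values.put(name, new ArrayList<>(Arrays.asList(newValues)));
    }

    // Returns null for an unknown name, mirroring the documented by-name lookup.
    Enumeration<String> getAttributeValues(String name) {
        List<String> list = values.get(name);
        return list == null ? null : Collections.enumeration(list);
    }

    int getNumAttributes() {
        return values.size();
    }
}
```

In this model, two addAttribute calls for "author" leave two values, while a later setAttributes call for "author" leaves only the values in the supplied array.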

deleteAttribute

public void deleteAttribute(java.lang.String name)
Delete the specified attribute
Parameters:
name - the name of the attribute

getNumAttributes

public int getNumAttributes()
Get number of attributes
Returns:
number of attributes for this document

getNumAttrValues

public int getNumAttrValues()
Get number of attribute values
Returns:
number of attribute values for this document

getAttributeName

public java.lang.String getAttributeName(int attrIndex)
Get the name of the specified attribute
Parameters:
attrIndex - 0 based index indicating which attribute
Returns:
the name of the specified attribute

getAttributeType

public int getAttributeType(int attrIndex)
Get the data type of the specified attribute
Parameters:
attrIndex - 0 based index indicating which attribute
Returns:
the data type of the specified attribute: ATTR_TYPE_NUMBER, ATTR_TYPE_STRING, or ATTR_TYPE_DATE.

getAttributeValues

public java.util.Enumeration getAttributeValues(int attrIndex)
Get the list of values for a specified attribute
Parameters:
attrIndex - 0 based index indicating which attribute
Returns:
an enumeration of String/BigDecimal/Date objects containing attribute values

getAttributeValues

public java.util.Enumeration getAttributeValues(java.lang.String attrName)
Get the list of values for a specified attribute
Parameters:
attrName - the name of the attribute
Returns:
an enumeration of String/BigDecimal/Date objects containing attribute values, or null if there is no such attribute name
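
Since the by-name lookup can return null, callers may find it convenient to normalize the result before iterating. A minimal sketch, using only java.util classes; the class and method names (AttributeValues, toList) are hypothetical:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of oracle.search.sdk.crawler.
class AttributeValues {
    // Copies an enumeration of attribute values into a list.
    // A null enumeration (unknown attribute name) becomes an empty list.
    static List<Object> toList(Enumeration<?> values) {
        if (values == null) {
            return Collections.emptyList();
        }
        return new ArrayList<Object>(Collections.list(values));
    }
}
```

The elements remain String, BigDecimal, or Date objects, so a caller would still check each element's type before using it.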

clearData

public void clearData()
Clear all metadata of this document

setContentLength

public void setContentLength(int size)
Set the size of the document. The size may be overwritten by the crawler when it fetches the content.
Parameters:
size - the size of the document in bytes
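
The size is counted in bytes, not characters, so the two can differ for non-ASCII text. The sketch below shows one way to compute a byte count for in-memory text, assuming the content is encoded as UTF-8; the class and method names (ContentSize, utf8ByteLength) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of oracle.search.sdk.crawler.
class ContentSize {
    // Returns the number of bytes the text occupies when encoded as UTF-8.
    static int utf8ByteLength(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }
}
```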

getContentLength

public int getContentLength()
Return the actual document content size in bytes. It may differ from the value set by the plugin.

Copyright © 2006, Oracle. All rights reserved.