Oracle® Secure Enterprise Search Administrator's Guide 11g Release 1 (11.1.2.0.0) Part Number E14130-04
Documentum data is stored in DocBases, which can contain cabinets and folders. A Documentum Content Server instance can have one or more DocBases crawled with an EMC Documentum Content Server source. The Documentum Content Server source navigates through the DocBases and the inline cabinets to crawl all the documents in Documentum Content Server. Oracle SES creates an index, stores the metadata, and accesses information in Oracle SES to provide search capabilities according to the end user permissions.
Oracle SES supports incremental crawling; that is, it crawls and indexes only those documents that have changed after the most recent crawl was scheduled. A document is re-crawled if its content, metadata, or direct security access information has changed. A document is also re-crawled if it is moved within Documentum Content Server, so that the end user must access the same document through a different URL. Documents deleted from a DocBase are removed from the index during incremental crawling.
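The re-crawl rules above can be pictured with a small sketch. It does not show how Oracle SES implements incremental crawling internally; the DocSnapshot type, its hash fields, and the plan_incremental_crawl function are hypothetical names introduced only for illustration. The sketch also assumes that documents not seen before are crawled along with changed ones.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DocSnapshot:
    """Hypothetical per-document fingerprint kept between crawls."""
    url: str            # display URL; changes when the document is moved
    content_hash: str   # fingerprint of the document content
    metadata_hash: str  # fingerprint of the document metadata
    acl_hash: str       # fingerprint of the direct security access information


def plan_incremental_crawl(
    indexed: Dict[str, DocSnapshot],
    current: Dict[str, DocSnapshot],
) -> Tuple[List[str], List[str]]:
    """Return (document ids to re-crawl, document ids to remove from the index)."""
    # A document is re-crawled when any field differs from the indexed
    # snapshot (content, metadata, ACL, or URL) or when it is not indexed yet.
    recrawl = [
        doc_id
        for doc_id, snapshot in current.items()
        if indexed.get(doc_id) != snapshot
    ]
    # Documents that no longer exist in the DocBase are dropped from the index.
    removed = [doc_id for doc_id in indexed if doc_id not in current]
    return sorted(recrawl), sorted(removed)
```

Comparing whole snapshots keeps the rule in one place: a change to any tracked field, including the URL after a move, is enough to schedule the document again.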
The Documentum source in Oracle SES must use the administrator account of a DocBase for crawling and indexing documents of that DocBase.
Documentum Content Server DA (Documentum Administrator) or Documentum Content Server WebTop application must be installed and configured.
Documentum Foundation Classes (DFC) must be installed on the server running Oracle SES.
The currently supported Documentum version is 6.5.
Because EMC Documentum Content Server software is not included with Oracle SES, certain files must be copied manually into Oracle SES.
The DFC installation asks for a destination directory and a user directory. For Windows, the default destination directory is C:\Program Files\Documentum and the default user directory is C:\Documentum.
For UNIX, you must create a DFC program root and a DFC user root. For example, DFC program root might be user_home/documentum_shared and DFC user root might be user_home/documentum.
Copy the dfc.properties and DFC JAR files from the following locations into ORACLE_HOME/search/lib/plugins/dcs.
dctm.jar
Windows: DFC_destination_directory\
Linux: DFC_destination_directory/
dfc.jar
Windows: DFC_destination_directory\shared\
UNIX: DFC_destination_directory/dfc
dfcbase.jar
Windows: DFC_destination_directory\shared\
UNIX: DFC_destination_directory/dfc
dfc.properties
Windows: DFC_destination_directory\config\
UNIX: DFC_destination_directory/config/
Create a new directory under ORACLE_HOME/product/version/SES Instance Name/search/lib/plugin/dcs/. For example, dcsothers.
Copy dfc.properties to the folder created in the previous step (dcsothers), as well as to the main folder (dcs).
Copy dfc.jar, dfcbase.jar, and dctm.jar to the dcs folder in ORACLE_HOME/product/version/SES Instance Name/search/lib/plugin/dcs.
Add the following to DMCL.ini:
max_session_count = 20
max_connection_per_session = 20
In Windows, DMCL.ini is located in the WINNT folder. In Linux, DMCL.ini is available in the Documentum folder (DFC user root).
On Windows Server 2003, copy dmcl40.dll from DFC_destination_directory/shared/ to ORACLE_HOME/product/version/SES Instance Name/BIN. For UNIX platforms, copy the file according to Table 7-1.
The environment variables $DOCUMENTUM_SHARED (DFC program root) and $DOCUMENTUM (DFC user directory) must be created before installing DFC on Linux. These variables must be exported again, and Oracle SES must be restarted, whenever the machine restarts. The variables can also be exported permanently in Linux.
Use the following commands to export the environment variables in Linux:
For DOCUMENTUM:
export DOCUMENTUM=/home/sesuser/DOCUMENTUM
For DOCUMENTUM_SHARED:
export DOCUMENTUM_SHARED=/home/sesuser/DOCUMENTUM_SHARED
Restart the middle tier:
searchctl restart
On Windows, restart the machine after installing DFC.
Table 7-1 DFC Files to Copy for UNIX Platforms
Platform | Copy File | From | To |
---|---|---|---|
Linux x86 | libdmcl40.so | DFC_destination_directory/dfc | $ORACLE_HOME/lib |
Linux x86-64 | libdmcl40.so | DFC_destination_directory/dfc | $ORACLE_HOME/lib32 |
Solaris SPARC (64-bit) | libdmcl40.so | DFC_destination_directory/dfc | $ORACLE_HOME/lib32 |
HP-UX PA-RISC (64-bit) | libdmcl40.sl | DFC_destination_directory/dfc | $ORACLE_HOME/lib32 |
AIX 5L Based Systems (64-bit) | libdmcl40.so | DFC_destination_directory/dfc | $ORACLE_HOME/lib32 |
HP-UX Itanium | libdmcl40.so | DFC_destination_directory/dfc | $ORACLE_HOME/lib32 |
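As a convenience, the rows of Table 7-1 can be written as a lookup that returns the copy command for a platform. This is only a sketch: the DmclCopy type and the copy_command function are hypothetical helpers, the directory values are the placeholders used in the table, and the command is built as text rather than executed.

```python
from typing import Dict, NamedTuple


class DmclCopy(NamedTuple):
    """One row of Table 7-1 (hypothetical helper type)."""
    file_name: str
    source_dir: str
    target_dir: str


DFC_DIR = "DFC_destination_directory/dfc"  # placeholder path, as in the table

# Rows of Table 7-1, keyed by the platform name printed in the table.
DMCL_COPY_TABLE: Dict[str, DmclCopy] = {
    "Linux x86": DmclCopy("libdmcl40.so", DFC_DIR, "$ORACLE_HOME/lib"),
    "Linux x86-64": DmclCopy("libdmcl40.so", DFC_DIR, "$ORACLE_HOME/lib32"),
    "Solaris SPARC (64-bit)": DmclCopy("libdmcl40.so", DFC_DIR, "$ORACLE_HOME/lib32"),
    "HP-UX PA-RISC (64-bit)": DmclCopy("libdmcl40.sl", DFC_DIR, "$ORACLE_HOME/lib32"),
    "AIX 5L Based Systems (64-bit)": DmclCopy("libdmcl40.so", DFC_DIR, "$ORACLE_HOME/lib32"),
    "HP-UX Itanium": DmclCopy("libdmcl40.so", DFC_DIR, "$ORACLE_HOME/lib32"),
}


def copy_command(platform: str) -> str:
    """Build (but do not run) the cp command for the given platform."""
    row = DMCL_COPY_TABLE[platform]  # raises KeyError for unlisted platforms
    return f"cp {row.source_dir}/{row.file_name} {row.target_dir}/"
```

For example, copy_command("HP-UX PA-RISC (64-bit)") yields a command for libdmcl40.sl, the only row whose file name differs from the others.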
In this release, search results cannot be viewed in Documentum desktop. The documents and folders can be viewed only using Documentum Administrator (DA) or Webtop applications.
For the Container name parameter, a value of repository name alone might not work. Enter a value of the form RepositoryName/CabinetName. For example, DocBaseName/CabinetName/FolderName/SubFolderName.
For Windows, the JAR files can be taken from the application server directory where DA is deployed. For DFC installation on Linux, you must first create a DFC program root and a DFC user root. For example, the DFC program root can be USER HOME/DOCUMENTUM_SHARED and the DFC user root can be USER HOME/DOCUMENTUM. Table 7-2 lists the location of the JAR files in Windows and Linux.
Table 7-2 Location of the JAR Files
JAR File Name | Windows Location | Linux Location |
---|---|---|
To configure the crawler plug-in:
Create a new directory under ORACLE_HOME/product/version/SES Instance Name/search/lib/plugin/dcs/. For example, dcsothers.
Copy dfc.properties to the folder created in the previous step (dcsothers), as well as to the main folder (dcs).
Copy dfc.jar, aspectjrt.jar, certjFIPS.jar, jsafeFIPS.jar, and configservice-api.jar to the dcs folder in the following path: ORACLE_HOME/product/version/SES Instance Name/search/lib/plugin/dcs.
The environment variables $DOCUMENTUM_SHARED (DFC Program root) and $DOCUMENTUM (DFC user directory) must be created before installing DFC on Linux. Also note that the environment variables $DOCUMENTUM_SHARED, $DOCUMENTUM, and $CLASSPATH must be exported again, and Oracle SES must be restarted when the machine restarts. These variables can also be exported permanently in Linux.
Use the following commands to export the environment variables in Linux:
For DOCUMENTUM:
export DOCUMENTUM=/home/sesuser/DOCUMENTUM
For DOCUMENTUM_SHARED:
export DOCUMENTUM_SHARED=/home/sesuser/DOCUMENTUM_SHARED
For CLASSPATH:
export CLASSPATH=$DOCUMENTUM_SHARED/dctm.jar:$DOCUMENTUM_SHARED/config
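Because these variables must be exported again after every machine restart, a quick check before restarting Oracle SES can catch a missing export. The following is a minimal sketch, not an Oracle-supplied tool; the function names are hypothetical, and it assumes that both CLASSPATH entries are meant to sit under the directory named by DOCUMENTUM_SHARED.

```python
from typing import List, Mapping

REQUIRED_VARIABLES = ("DOCUMENTUM", "DOCUMENTUM_SHARED", "CLASSPATH")


def expected_classpath(documentum_shared: str) -> str:
    """CLASSPATH entries for a given DFC program root (assumed layout)."""
    return f"{documentum_shared}/dctm.jar:{documentum_shared}/config"


def missing_variables(env: Mapping[str, str]) -> List[str]:
    """Names of required variables that are unset or empty in env."""
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


def classpath_has_dfc_entries(env: Mapping[str, str]) -> bool:
    """True if CLASSPATH contains both entries derived from DOCUMENTUM_SHARED."""
    shared = env.get("DOCUMENTUM_SHARED", "")
    if not shared:
        return False
    entries = env.get("CLASSPATH", "").split(":")
    return all(wanted in entries for wanted in expected_classpath(shared).split(":"))
```

Passing os.environ as env checks the current shell session; passing a plain dictionary makes the functions easy to try without changing the environment.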
Setting up identity management requires administration steps in both Oracle SES and EMC Documentum. It includes the following steps:
To activate the Documentum identity plug-in, perform the following steps:
Select Documentum Identity Plug-in.
Click Activate.
Enter a valid DocBase name.
Enter a valid user name and password.
Ensure that the environment variables DOCUMENTUM and DOCUMENTUM_SHARED are set correctly.
Click Finish.
Before activating the OID Identity plug-in for validating the users in OID, Documentum Content Server should be synchronized with OID as an LDAP server. To do this, you must import the users and groups from OID to Documentum. Perform the following tasks for this:
Create an LDAP Configuration Object in Documentum Administrator (DA). To do this:
Log in to DA.
Navigate to Administration, User Management, LDAP.
In the File Menu, select File, New, LDAP Configuration Object.
In the Name field, enter a name for LDAP Configuration Object.
Select dm_user as the user subtype.
Under Communication Mode, select Regular.
Under Import, select Users and Groups.
Select Default Configuration Object to use this configuration object in the server field.
Click Next.
In the Directory Type field, select Oracle Internet Directory Server.
In the Bind Type field, select Bind by Searching for Distinguished Name.
In the Binding Name field, provide the admin user name of OID. This is usually cn=orcladmin.
In the Binding Password field, provide the admin user password.
In the Host Name field, provide the OID host name.
Retain the default port number of OID (389).
In the Person Object Class field, provide the Base Person Object. Typically the value is inetOrgPerson.
In the Person Search Base field, provide the person search base defined in OID. For example, cn=Users,dc=us,dc=oracle,dc=com.
In the Person Search Filter field, specify cn=*.
In the Group Object Class field, provide the Group Object. Typically the value is groupOfUniqueNames.
In the Group Search Filter field, specify cn=*.
Click Next.
The Attribute Map information is displayed. Click Finish.
Run the LDAP_Synchronization job. To do this:
Log in to DA.
Navigate to Administration, Job Management, Jobs.
Open the job dm_LDAPsynchronization.
In the state field, select Active.
Select Deactivate On Failure.
In Designated Server, select the host name of Documentum Server.
Select Run After Update.
Click the Schedule tab.
In the Start Date And Time field, set the current date and time.
Select Repeat time from the Repeat list.
Set the Frequency field to any numeric value.
Select End Date And Time and specify how long the Synchronization job should run.
Click the Method tab.
Select Pass Standard Argument.
Click the SysObject info tab.
Click OK.
After synchronizing the Documentum Content Server with OID, you must activate the OID identity plug-in in Oracle SES. Perform the following steps:
Log in to Oracle SES as the admin user.
Click Global Settings.
Select System, Identity Management Setup.
Select Oracle Internet Directory identity plug-in manager and click Activate.
Select nickname from the Authentication Attribute list.
Provide the following values:
Host name: The host name of the machine where OID is running.
Port: The default LDAP port number, 389.
Use SSL: true or false based on your preference.
Realm: The OID realm, for example, dc=us,dc=oracle,dc=com.
User name: The OID admin user name, for example, cn=orcladmin.
Password: The password of the OID admin user.
Before activating AD Identity plug-in for validating the users in AD, Documentum Content Server must be synchronized with AD as an LDAP server. To do this, you must import users and groups from AD to Documentum. For this, perform the following steps:
Create an LDAP Configuration Object in DA. To do this:
Log in to DA.
Navigate to Administration, User Management, LDAP.
Select File, New, LDAP Configuration Object.
Enter a name for the LDAP configuration object.
Select dm_user as User Subtype.
In the Communication Mode field, select Regular.
In the Import field, select Users and Groups.
Select Default Configuration Object in the server field, and click Next.
Provide the following values:
Directory Type: Select Active Directory Server.
Bind Type: Select Bind by Searching for Distinguished Name
Binding Name: Provide the admin user name of AD. It is normally domainName/Administrator.
Binding Password: The password of the AD admin user.
Host Name: AD host name.
Port: Default port number of AD, 389.
Person Object Class: The Base Person Object. Typically the value is user.
Person Search Base: The person search base defined in AD, for example cn=Users,dc=us,dc=oracle,dc=com.
Person Search Filter: Enter cn=*.
Group Object Class: The group object. Typically the value is group.
Group Search Base: The group search base defined in AD. For example, dc=us,dc=oracle,dc=com.
Group Search Filter: Enter cn=*.
Click Next.
The Attribute Map information is displayed. Click Finish.
Run the LDAP_Synchronization job. To do this:
Log in to DA.
Navigate to Administration, Job Management, Jobs.
Open the job dm_LDAPsynchronization.
In the state field, select Active.
Select Deactivate On Failure.
In Designated Server, select the host name of Documentum Server.
Select Run After Update.
Click the Schedule tab.
In the Start Date And Time field, set the current date and time.
Select Repeat time from the Repeat list.
Set the Frequency field to any numeric value.
Select End Date And Time and specify how long the Synchronization job should run.
Click the Method tab.
Select Pass Standard Argument.
Click the SysObject info tab.
Click OK.
After the Documentum Content Server is synchronized with AD, you must activate the AD identity plug-in. To do this:
Log in to Oracle SES as the admin user.
Click Global Settings, and then select System, Identity Management Setup.
Select Active Directory Identity Plug-in Manager, and click Activate.
Provide the following values:
Authentication Attribute: Select USER_NAME.
Directory URL: Provide the host name and the port number. For example, ldap://ldapserverhost:port.
Directory account name: Provide the AD user name, for example Administrator.
Directory account password: AD user password.
Directory subscriber: Provide the directory subscriber (LDAP base). For example, dc=us,dc=oracle,dc=com.
Directory security protocol: Specify either none or portnumber.
Click Finish.
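The Directory URL value combines the host name and port in the form ldap://ldapserverhost:port. The sketch below shows one way to build and sanity-check such a value before entering it. The function names are hypothetical, and accepting the ldaps scheme with its conventional port 636 is an assumption made for illustration; this guide itself only shows the ldap scheme and the default port 389.

```python
from typing import Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"ldap": 389, "ldaps": 636}  # ldaps support is an assumption


def directory_url(host: str, port: int = 389, use_ssl: bool = False) -> str:
    """Build a Directory URL of the form scheme://host:port."""
    scheme = "ldaps" if use_ssl else "ldap"
    return f"{scheme}://{host}:{port}"


def parse_directory_url(url: str) -> Tuple[str, int]:
    """Return (host, port), falling back to the scheme's default port."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS or not parts.hostname:
        raise ValueError(f"not an LDAP directory URL: {url!r}")
    # parts.port raises ValueError itself when the port is not numeric.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port
```

A value that still contains the literal placeholder port is rejected, which is a common copy-and-paste mistake with documentation examples.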
Before activating the SunOne identity plug-in for validating the users in SunOne, you must synchronize Documentum Content Server with SunOne as an LDAP server. To do this, you must import the users and groups from SunOne to Documentum. Perform the following steps:
Create an LDAP Configuration Object in DA. To do this:
Log in to DA.
Navigate to Administration, User Management, LDAP.
Select File, New, LDAP Configuration Object.
Enter a name for the LDAP configuration object.
Select dm_user as User Subtype.
In the Communication Mode field, select Regular.
In the Import field, select Users and Groups.
Select Default Configuration Object in the server field, and click Next.
Provide the following values:
Directory Type: Select Netscape/iPlanet Directory Server
Bind Type: Select Bind by Searching for Distinguished Name
Binding Name: Provide the admin user name of SunOne. It is normally cn=Administrator.
Binding Password: The password of the SunOne admin user.
Host Name: SunOne host name.
Port: Enter the port number used for SunOne. The default port number of SunOne is 389.
Person Object Class: The Base Person Object. Typically the value is person.
Person Search Base: The person search base defined in SunOne, for example cn=Users,dc=us,dc=oracle,dc=com.
Person Search Filter: Enter cn=*.
Group Object Class: The group object. Typically the value is groupOfUniqueNames.
Group Search Base: The group search base defined in SunOne. For example, dc=us,dc=oracle,dc=com.
Group Search Filter: Enter cn=*.
Click Next.
The Attribute Map information is displayed. Click Finish.
Run the LDAP_Synchronization job. To do this:
Log in to DA.
Navigate to Administration, Job Management, Jobs.
Open the job dm_LDAPsynchronization.
In the state field, select Active.
Select Deactivate On Failure.
In Designated Server, select the host name of Documentum Server.
Select Run After Update.
Click the Schedule tab.
In the Start Date And Time field, set the current date and time.
Select Repeat time from the Repeat list.
Set the Frequency field to any numeric value.
Select End Date And Time and specify how long the Synchronization job should run.
Click the Method tab.
Select Pass Standard Argument.
Click the SysObject info tab.
Click OK.
After the Documentum Content Server is synchronized with SunOne, you must activate the SunOne identity plug-in. To do this:
Log in to Oracle SES as admin user.
Click Global Settings, and then select System, Identity Management Setup.
Select Sun Java System Directory Server Manager, and click Activate.
Provide the following values:
Authentication Attribute: Select USER_NAME.
Directory URL: Provide the host name and the port number. For example, ldap://ldapserverhost:port.
Directory account name: Provide the Directory Server user name, for example Administrator.
Directory account password: Directory Server user password.
Directory subscriber: Provide the directory subscriber (LDAP base). For example, dc=us,dc=oracle,dc=com.
Directory security protocol: Specify either none or portnumber.
Click Finish.
Create an EMC Documentum Content Server source on the Home - Sources page. Select EMC Documentum Content Server from the Source Type list, and click Create. Enter values for the following parameters:
Container name: The names of the containers to be crawled by Oracle SES. You can crawl an entire Documentum DocBase or a specific repository/cabinet/folder. The format is DocBaseName/CabinetName/FolderName/SubFolderName. Multiple comma-delimited container names can be entered. This parameter is case-sensitive, so enter the cabinet name exactly as it appears in the Documentum repository. Required.
These are examples of container names:
DocBase1: The entire DocBase1 is crawled.
DocBase2/Cabinet21: Cabinet21 and its sub-folders within DocBase2 are crawled.
DocBase2/Cabinet21/Folder11: Folder11 and its sub-folders are crawled.
DocBase1, DocBase2/Cabinet21/Folder11: The entire DocBase1 and Folder11 in DocBase2/Cabinet21 are crawled.
Attribute list: The comma-delimited list of Documentum attributes, along with their data types, to be made searchable. The format is AttributeName:AttributeType, AttributeName:AttributeType. Valid values are String, Number, and Date. See Table 7-3, "Documentum Data Type Mapping".
While crawling a DocBase, an attribute is indexed only if both name and type match the configured name and type; otherwise, it is ignored. This is an optional parameter.
For example, assume that you have the following Documentum attributes with the indicated data types:
account name: String
account ID: Integer
creation date: Date
To make these attributes searchable, enter this value for Attribute list:
Account Name:String, Account ID:Number, Creation Date:Date
The default searchable attributes for Documentum Content Server are Modified Date, Title, and Author.
Multiple attributes with the same name are not allowed, such as Emp_ID:String and Emp_ID:Number.
User name: Enter the user name of a valid Documentum Content Server user. The user should be an administrator user or a user who has access to all cabinets, folders, and documents of the DocBases configured in the Container name parameter. The user should be able to retrieve content, metadata, and ACL from cabinets, folders, documents and other custom sub classes of all DocBases configured in Container name parameter. Required.
Password: Password of the Documentum user. Required.
Crawl versions: Indicates whether multiple versions of documents should be crawled, either true or false. The default value is false. Any other value is interpreted as false, in which case only the latest version of each document is crawled. Optional.
Crawl folder attributes: Indicates whether folder attributes must be crawled, either true or false. This is an optional parameter. The default value is false. Any other value is interpreted as false.
URL for viewing the documents: A valid URL for the Documentum WebTop or DA application used for viewing the Oracle SES search results. For example:
http://IP_address:port/da
or
http://IP_address:port/webtop
Authentication Attribute: This parameter is used to set ACLs. This parameter lets you set multiple LDAP servers. If Oracle SES and Documentum Content Server are synchronized with Active Directory, then enter the value USER_NAME. If Oracle Internet Directory is used, then enter nickname.
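The Container name and Attribute list formats described above are easy to mistype, so it can help to check the values before saving the source. The following sketch mirrors the documented rules (comma-delimited entries, DocBaseName/CabinetName/FolderName paths, the three valid types, and no repeated attribute names). It is not the parser used by the crawler plug-in, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

VALID_TYPES = {"String", "Number", "Date"}


def parse_container_names(value: str) -> List[Tuple[str, List[str]]]:
    """Split a Container name value into (DocBase, [cabinet, folder, ...]) pairs."""
    containers = []
    for item in value.split(","):
        parts = item.strip().split("/")
        if any(not part for part in parts):
            raise ValueError(f"malformed container name: {item!r}")
        # Names are kept exactly as typed because the parameter is case-sensitive.
        containers.append((parts[0], parts[1:]))
    return containers


def parse_attribute_list(value: str) -> Dict[str, str]:
    """Split an Attribute list value into a {name: type} mapping."""
    attributes: Dict[str, str] = {}
    for item in value.split(","):
        name, sep, type_name = item.strip().rpartition(":")
        name, type_name = name.strip(), type_name.strip()
        if not sep or not name or type_name not in VALID_TYPES:
            raise ValueError(f"malformed attribute entry: {item!r}")
        if name in attributes:
            raise ValueError(f"duplicate attribute name: {name!r}")
        attributes[name] = type_name
    return attributes
```

With the examples from this section, the value DocBase1, DocBase2/Cabinet21/Folder11 parses into two containers, and Emp_ID:String, Emp_ID:Number is rejected as a duplicate.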
FileNet Content Engine data is stored in object stores, which can be further contained inside folders on a server. A FileNet Content Engine instance can have one or more object stores that can be crawled by specifying the Object Store details in the Container name parameter in Oracle SES. The Content Engine source navigates the object store to crawl all the documents in the configured Content Engine Object Store. It stores the metadata and accesses information in Oracle SES to provide search according to the end user permissions.
Any user with administrative privileges can be used by the FileNet Content Engine crawler plug-in for crawling and indexing documents.
Because FileNet Content Engine software is not included with Oracle SES, you must copy these files manually into Oracle SES:
javaapi.jar, soap.jar, xercesImpl.jar, and xml-apis.jar from FileNetInstalledFolder/Workplace/WEB-INF/lib to ORACLE_HOME/search/lib/plugins/fnetce
WCMConfig.properties from FileNetInstalledFolder/Workplace/WEB-INF to ORACLE_HOME/search/lib/plugins/fnetce
If any of the parameters are updated after initial crawl, then you must update the crawler re-crawl policy to Process All Documents on the Home - Schedules - Edit Schedules page, and re-crawl the source.
If additional document types are configured after first time crawl, then these document types are not indexed on subsequent re-crawls. This is also the case if the Document Size parameter is changed after the first crawl. For example, if the Document Size was 10 MB at the time of the first crawl and it is changed to 20 MB before re-crawl, then documents greater than 10 MB are rejected. As a workaround, create the source again and then make the changes.
If a FileNet Content Engine source is used, Oracle recommends that Active Directory be used as identity management system for the Oracle SES instance. The Active Directory instance must be the same one that FileNet Content Engine is using to authenticate users on the file system.
Create a FileNet Content Engine source on the Home - Sources page. Select FileNet Content Engine from the Source Type list, and click Create. Enter values for the following parameters:
Container name: The names of the containers to be crawled by Oracle SES. You can crawl a complete objectstore or a specific Folder. The format for specifying container is ObjectStore/FolderName/SubFolderName. Multiple comma-delimited containers can be specified. Required.
The following are examples of container names:
ObjectStore1: The entire ObjectStore1 is crawled.
ObjectStore1/Folder1/Folder12: The documents inside Folder12 and its sub-folders are crawled.
ObjectStore1, ObjectStore2/Folder1/Folder12: The entire ObjectStore1 and contents of Folder12 in ObjectStore2 are crawled.
User name: A valid FileNet Content Engine user. The user should be an Administrator user or a user who has access to all Folders and Documents present in the configured container. The user should be able to retrieve content, metadata, and ACL from folders, documents of all containers configured in Container name. Required.
Password: Password of the Content Engine user. Required.
Attribute list: The comma-delimited list of Content Engine attributes, along with their data types, that the administrator wants to be searchable. The format is attributeName:attributeType, attributeName:attributeType. The valid values are String, Number, and Date. Table 7-4 identifies equivalent FileNet and Oracle SES data types.
In an object store, the crawler indexes an attribute only if both its name and data type match the configured name and type. Otherwise, the attribute is ignored. This parameter is optional.
For example, to make the following Content Engine attributes searchable:
Attribute name: DocumentTitle Attribute type: String
Attribute name: ID Attribute type: Number
Attribute name: DateCreated Attribute type: Date
The value of Attribute List should be: DocumentTitle:String, ID:Number, DateCreated:Date
The default searchable attributes for FileNet Content Engine are Title, Author, and LastModifiedDate. Multiple attributes with same name are not allowed. For example: Emp_ID: String, Emp_ID: Number is not allowed.
Crawl versions: Controls whether multiple versions of documents are crawled. Valid values are true and false. The default value is false, and only the latest version is crawled. Any other values are interpreted as false.
Crawl folder attributes: Controls whether folder metadata is indexed. Valid values are true and false. The default value is false. Any other values are interpreted as false.
URL for viewing the documents: The URL for the FileNet Workplace application used for viewing the search results. Workplace is a part of FileNet P8 AE. For example: http://IP_address:port/Workplace
Remove deleted documents from index: Controls whether documents deleted from CE object stores are removed from the index. Valid values are true and false. The default value is false, because true has a performance impact. Any other values are interpreted as false.
Authentication attribute: The authentication attribute used to set ACL. For Active Directory, the value is USER_NAME.
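The three true/false parameters above share one rule: the default is false, and anything other than true is treated as false. The sketch below illustrates that rule. The effective_flags function is hypothetical, and two details are assumptions rather than documented behavior: surrounding whitespace is ignored, and only the lowercase literal true enables a flag.

```python
from typing import Dict, Mapping

# Parameter names as they appear in the source configuration above.
CE_FLAGS = (
    "Crawl versions",
    "Crawl folder attributes",
    "Remove deleted documents from index",
)


def effective_flags(params: Mapping[str, str]) -> Dict[str, bool]:
    """Resolve each flag to a boolean; missing or unrecognized values are False."""
    return {
        # Assumption: only the exact lowercase literal "true" enables a flag.
        name: params.get(name, "false").strip() == "true"
        for name in CE_FLAGS
    }
```

A value such as yes therefore resolves to False without any error, so a typing mistake silently leaves the feature off.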
Documents in FileNet Image Services are organized into Folders. A FileNet Image Services source navigates through the folder hierarchy to crawl all documents in FileNet Image Services (IS). Oracle SES creates the index and stores the metadata of the documents retrieved from FileNet Image Services to provide search according to the end users' permissions.
A FileNet Image Services instance can have one or more Libraries. A Library is the document repository and contains documents within Folders and sub-Folders. A FileNet Image Services source can crawl multiple Libraries.
Images stored in Image Services can have annotations. Some annotations contain text, and these annotations are crawled. The annotations crawled are:
Stamp
Transparent Text
Sticky note
You can search on the content of these annotations after the IS library has been crawled.
A user belonging to IS SysAdmin group must be used to crawl documents and metadata in IS.
FileNet Image Services Server version 4.0 or 3.6 SP2
Image Services Resources Adapter version 3.2.1
Because FileNet Image Services software is not included with Oracle SES, you must perform these tasks manually to integrate with Oracle SES:
Deploy the ISCrawlerWeb.war file in the same application server on which ISRA has been deployed.
For application servers that require context root to be specified while deploying a WAR file, specify Context Root as ISCrawlerWeb.
If the application server is WebSphere Application Server, then activate URL rewriting: Click Servers - Application Servers - server_name- Web Container - Session Management - Enable URL Rewriting.
If additional document types are configured after the first crawl, then these document types are not indexed on subsequent re-crawls. The same applies if the Document Size parameter is changed after the first crawl. For example, if Document Size was 10 MB at the time of the first crawl and is changed to 20 MB before the re-crawl, then documents larger than 10 MB are rejected. As a workaround, update the crawler re-crawl policy to Process All Documents on the Home - Schedules - Edit Schedules page, and re-crawl the source.
XML documents are crawled by default, even if the source is not configured for XML documents. Oracle SES provides an option for configuring the document types to be crawled, including XML. Currently, XML documents are crawled even if the XML document type is not configured.
Activate an identity plug-in on the Global Settings - Identity Management Setup page.
To configure the identity plug-in for Image Services:
On the Global Settings - Identity Management Setup page, select FileNet Image Services identity plug-in, and click Activate.
Set the following parameters:
Authentication Attribute: Select NATIVE.
Web Component URL: Enter the host name and port number of the Web component URL; for example, http://webserverhost:port/ISCrawlerWeb.
Administrator user name: Enter the Image Services user name.
Administrator password: Enter the password of the Image Services user.
Library name of IS Server: Enter the name of the Image Services library, such as ISCF. The library name is the ISRA connection factory name that is created when ISRA is deployed.
Click Finish.
See the ISRA documentation for information about these tasks:
The FileNet Image Services Resource Adapter (ISRA) must be deployed on a supported application server. See the ISRA documentation for supported application servers.
A connection factory must be created for ISRA. The connection factory should be configured for the target IS libraries. See the ISRA documentation for deployment instructions.
ISRA comes with a viewer application for viewing images and annotations. The FNImageViewer.ear application should be deployed on the same application server as ISRA. This viewer is invoked to display images (for example JPEG, TIFF, BMP, and GIF) and annotations. See the ISRA documentation for deployment instructions.
To support secure search, the Image Services server must be synchronized with the Active Directory server. See the section titled LDAP configuration in ISRA deployment guides for importing Microsoft Active Directory users and groups to Image Services.
After Active Directory users and groups have been imported into Image Services, ISRA must be configured to authenticate with Active Directory. See the section titled LDAP Configuration in the ISRA deployment guide for details.
Create a FileNet Image Services source on the Home - Sources page. Select FileNet Image Services from the Source Type list, and click Create. Enter values for the following parameters:
Container names: The names of the containers to be crawled by Oracle SES. You can crawl an entire FileNet Image Services Library or a specific Folder. The format is LibraryName/FolderName/SubFolderName(cache_name). Library name is the ISRA connection factory name created when ISRA is deployed. Cache name is where the document content can be found. Multiple comma-delimited container names can be entered. Required.
For example:
Container name: LibraryName1(cache name): The entire LibraryName1 is crawled.
Container name: LibraryName2/Folder1/(cache name): Folder1 and its sub-folders are crawled.
Container name: LibraryName1, LibraryName2/Folder1(cache name): The entire LibraryName1 and Folder1 in LibraryName2 are crawled.
Cache name: The format is cache name: DomainName:Organization. This is an optional parameter. If the cache name is not provided, then the plug-in tries to retrieve document content from the default page cache. However, the plug-in throws an error if an invalid page cache or empty brackets () are specified. Ask the Image Services administrator for cache details.
User name: Enter the user name of a valid FileNet Image Services user. The user should be a SysAdmin user or a user who has access to all Folders and Documents of the Libraries configured in the Container name parameter. The user should be able to retrieve content, metadata and ACL from folders, documents and other custom sub classes. The user should be defined in the configured LDAP server and should be imported into IS. Required.
Password: The FileNet Image Services user password. Required.
Web component URL: The URL of the J2EE application server where the crawler plug-in Web component module is deployed. The format of the URL is http://host:port. Required.
The Web component is also used to view the search results. On clicking an Oracle SES search result, the user is prompted to log in. After the user successfully logs in, the document is displayed.
To display images and annotations, you must deploy the FileNet Image viewer FNImageViewer.ear. FNImageViewer.ear is part of the ISRA CD. If the viewer is not deployed, the images are displayed in the native viewer or the user is prompted to download the document.
Attribute Names: The comma-delimited list of Image Services attributes along with their data types to search. The format is attributeName:attributeType, attributeName:attributeType. Valid values are String, Number, and Date. Table 7-5 identifies equivalent FileNet and Oracle SES data types.
In a Library, the crawler indexes an attribute only if both name and type of the attribute in the library match the configured name and type; otherwise, it is ignored. Optional.
For example, to make the following FileNet Image Services attributes searchable:
Attribute name: account name attribute type: String
Attribute name: account ID attribute type: Integer
Attribute name: creation date attribute type: Date
The value of Attribute Names is:
Account Name: String, Account Id: Number, Creation Date: Date
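The Attribute Names value can also be assembled programmatically. The following is a minimal sketch, not part of the product: the function name build_attribute_names is hypothetical, and the only rules it applies are the ones stated above (name:type pairs, comma-delimited, with String, Number, and Date as the valid types).

```python
# Hypothetical helper: builds the comma-delimited "name:type" list
# described for the Attribute Names parameter.
VALID_TYPES = {"String", "Number", "Date"}


def build_attribute_names(attributes):
    """Join (name, type) pairs into 'name:type, name:type'."""
    parts = []
    for name, attr_type in attributes:
        # Reject types outside the three values the parameter accepts.
        if attr_type not in VALID_TYPES:
            raise ValueError(f"unsupported attribute type: {attr_type!r}")
        parts.append(f"{name}:{attr_type}")
    return ", ".join(parts)


print(build_attribute_names([
    ("Account Name", "String"),
    ("Account Id", "Number"),   # a FileNet Integer maps to Number
    ("Creation Date", "Date"),
]))
```

Validating the type up front surfaces a mistake such as Integer before the crawl runs, because an attribute whose configured type does not match is otherwise silently ignored.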
Set source hierarchy: Indicates whether the source should set the source hierarchy of the document, either true or false. The default value is false. Any other value is interpreted as false.
A document in Image Services can be filed in multiple folders. A user may have READ permissions on a document but not on all the folders in which the document is filed. If Set Source Hierarchy is true, then a user could view a source hierarchy on which he or she does not have permissions in Image Services. However, the user cannot view the documents on which he or she does not have READ permissions.
Set Public Access: Indicates whether the source sets the public access of the documents whose ACL is Anyone. Set this parameter to true or false. The default value is false. Any other value is interpreted as false.
Authentication Attribute: This parameter is used to get the LDAP authentication attribute. The appropriate value varies based on the identity plug-in used for authentication. For Microsoft Active Directory, set it to USER_NAME. For the FileNet Image Services identity plug-in, set it to NATIVE.
The Hummingbird DM Server plug-in extends the searching capabilities of Oracle SES and enables it to search Hummingbird DM Server repositories. Oracle SES can crawl documents and metadata in the Hummingbird repositories and provide secure, full-text search. It also provides metadata search and browse functionality, which allows search to be done against a specific subfolder in the hierarchy.
Hummingbird data is stored in libraries, which can contain folders, files, and workspaces. A Hummingbird DM Server instance can have one or more libraries that can be crawled with the Hummingbird DM Server plug-in by configuring parameters in Oracle SES. The Hummingbird DM Server plug-in navigates through the libraries to crawl all documents in Hummingbird DM Server. It creates an index, stores the metadata, and accesses information in Oracle SES to provide search according to the end user permissions.
Oracle SES supports incremental crawling; that is, it crawls and indexes only those documents that have changed since the most recent crawl. A document is re-crawled if the content, metadata, or the direct security access information of the document has changed. Documents deleted from a library are removed from the index during incremental crawling.
The Hummingbird plug-in includes two components: a plug-in jar file and a Web services component. The jar file is deployed in Oracle SES. The Web services component must be deployed on the computer on which Hummingbird Web Server (Webtop) is deployed.
The Hummingbird DM Server identity plug-in is used to authenticate the native users of Hummingbird DM Server.
The Hummingbird crawler plug-in should use the administrator account for the Container for crawling and indexing documents.
The Hummingbird DM Server version must be 2004 or 2005.
Hummingbird DM Server must be installed and configured. The following versions of Hummingbird DM are supported: 2004, 2005.
Hummingbird Web Server (WebTop): Hummingbird Web Server is required to see the files and folders stored in Hummingbird DM Server.
Windows .NET Framework 1.1 must be on the same computer where Hummingbird Web Server (WebTop) is running.
Import User/Groups from Active Directory Server to Hummingbird.
Log in to Hummingbird WebTop as a user with administrator privileges.
Select DM ADMIN from the list at the top of page.
Go to Users and Groups - User Synchronization.
Select the Network Resource and click Load Network.
Select the name of the domain with the users to import and click Load Network.
The Network resource list shows the names of users. Select the users to import and click Import User.
Click Save.
In Library User, you can see the list of users that are imported in Hummingbird Web server.
If you update the Attribute list parameter, then a force re-crawl should be performed to delete the indexes of the old attribute list and create indexes for the new attribute list. That is, change the re-crawl policy to Process All Documents on the Home - Schedules - Edit Schedule page.
Choose an identity plug-in on the Global Settings - Identity Management Setup page.
Activate the Hummingbird identity plug-in with the following parameters.
Library name: The name of library to be crawled.
URL: This parameter is used to send the request to the Web service to retrieve the data. For example:
[http | https]://computername:port/VirtualDirectoryName/HBDMIdentityWebservice.asmx
The virtual directory name is given during installation of Web services for Hummingbird.
User name: User name of Hummingbird DM Server. The user must be an administrator user and a native user of Hummingbird. Required.
Password: Password for User name.
Authentication Attribute: NATIVE.
Create a source for the newly created user-defined source type on the Home - Sources page. Enter a source name. Provide values for the configuration parameters in the following table.
Container name: The names of the containers to be crawled by Oracle SES. You can crawl an entire Hummingbird library or a specific folder. The format is LibraryName/LibraryName/FolderName/SubFolderName. This parameter is case-sensitive.
To crawl all documents in the library, the format is LibraryName/LibraryName. You can enter multiple comma-delimited container names. Required.
For example:
Container name: LibraryName/LibraryName
The entire LibraryName is crawled.
Container name: LibraryName/LibraryName/Folder21
Folder21 and its sub-folders within LibraryName are crawled.
Container name: LibraryName/LibraryName/PublicFolders/Folder1
Folder1 and its sub-folders within PublicFolders are crawled.
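The container formats above can be generated with a short helper. This is an illustrative sketch only: the function names are hypothetical, and it assumes that the two leading segments are the same library name, as the LibraryName/LibraryName format suggests.

```python
# Hypothetical helpers for composing the Container name parameter.
def hummingbird_container(library, *folders):
    """Return LibraryName/LibraryName[/Folder[/SubFolder...]]."""
    # The documented format repeats the library name before any folder path.
    return "/".join([library, library, *folders])


def container_name_parameter(containers):
    """Join several container names into one comma-delimited value."""
    return ",".join(containers)


whole_library = hummingbird_container("LibraryName")
one_folder = hummingbird_container("LibraryName", "PublicFolders", "Folder1")
print(container_name_parameter([whole_library, one_folder]))
```

Because the parameter is case-sensitive, the helper passes names through unchanged rather than normalizing their case.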
Attribute list: The comma-delimited list of attributes to be searchable. The format is AttributeName,AttributeName
. Optional.
Hummingbird stores all attributes as String data type so the data type of attributes in Hummingbird are the String data type in Oracle SES. Only LastModifiedDate is the Date data type in Oracle SES. The default attributes are Title, LastModifiedDate, and Author.
While crawling a library or folder, an attribute is indexed only with a match; otherwise, it is ignored. For example, to make the following Hummingbird attributes searchable:
Attribute name: account name
Attribute name: account ID
Attribute name: creation date
The value of Attribute List is: account name, account ID, creation date.
Multiple attributes with same name are not allowed. For example: Emp_ID, Emp_ID.
If custom fields have been created, then include the name of the table and column separated by a dot (.). For example: tablename.columnname,tablename.columnname
User name: User name of a valid Hummingbird DM Server user. The user must be an administrator user or a user who has access to all folders and documents configured in Container name. The user should be able to retrieve content, attributes, and documents. Required.
Password: Password of the Hummingbird user in User name. Required.
Crawl versions: Controls whether multiple versions of documents are crawled. Valid values are true and false. The default value is false. Any other value is interpreted as false, and only the latest version of a document is crawled. Optional.
Crawl folder attributes: Controls whether folder attributes are crawled. Valid values are true and false. The default value is false. Any other value is interpreted as false. Optional.
View Documents: The IP address or computer name where the Hummingbird Webtop (Hummingbird Web Server) application is installed. It is the URL for viewing search results. For example: http://computername.
If SSL is enabled on Hummingbird DM Web Server, the URL is https://computername. If Hummingbird is running on a port other than the default port (80), then append the port number using this format: http://computername:port.
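The View Documents rules (scheme chosen by SSL, port appended only when it is not the default) can be sketched as follows. The function name is hypothetical. Treating 443 as the default port when SSL is enabled is an assumption based on the standard HTTPS port; the text above names only port 80.

```python
# Hypothetical helper that composes the View Documents URL.
def view_documents_url(host, ssl=False, port=None):
    """Return http(s)://host, appending :port only for non-default ports."""
    scheme = "https" if ssl else "http"
    # Assumption: 443 is the default when SSL is enabled; the guide
    # itself mentions only the default port 80.
    default_port = 443 if ssl else 80
    if port is None or port == default_port:
        return f"{scheme}://{host}"
    return f"{scheme}://{host}:{port}"


print(view_documents_url("computername"))
print(view_documents_url("computername", ssl=True))
print(view_documents_url("computername", port=8080))
```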
Crawl Attachments: Controls whether attachments to the documents are crawled. Valid values are true and false. The default value is false. Any other value is interpreted as false. Optional.
Search form: The profile name used in Hummingbird. The default value is DEF_QBE. If custom attributes have been added in profile and you want to search for these attributes, then enter the name of the custom profile.
URL for Webservice: The URL of Web services that are consumed by the plug-in. For example:
[http | https]://computername/virtual_folder/HBDMWebService.asmx
where virtual_folder is the name of the virtual folder created by the Web service installer.
If the Web service is running on a port other than the default port (80), then include the port number. For example:
[http | https]://computername:port/virtual_folder/HBDMWebService.asmx
Authentication Attribute: The name of the authentication attribute that is used to set ACL. The Oracle Internet Directory value is nick_name. The Active Directory value is USER_NAME. The Hummingbird identity plug-in value is NATIVE.
Hummingbird DM version: The version of Hummingbird DM to be crawled. Valid values are 5 and 6.
Date Format: Specifies the date format used in the DM Server. For example, for the date 10/23/2009 10:10:10, specify the format MM/dd/yyyy HH:mm:ss on the crawler source configuration page. If no date format is specified, or an invalid date format is specified, then the default locale settings are used to parse the date. This is an optional parameter.
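To check that a pattern matches the dates stored in the DM Server before configuring it, the example above can be reproduced outside Oracle SES. The sketch below is an analogy, not the plug-in's parser: the pattern letters in MM/dd/yyyy HH:mm:ss resemble the java.text.SimpleDateFormat style (an assumption), and the Python format string is a hand-written equivalent of that one pattern. The function name is hypothetical.

```python
from datetime import datetime

# Hand-written Python equivalent of the pattern MM/dd/yyyy HH:mm:ss
# (month/day/4-digit year, 24-hour clock). This mapping is an assumption.
PY_FORMAT = "%m/%d/%Y %H:%M:%S"


def parse_dm_date(text):
    """Parse a DM Server date string such as '10/23/2009 10:10:10'."""
    return datetime.strptime(text, PY_FORMAT)


parsed = parse_dm_date("10/23/2009 10:10:10")
print(parsed.isoformat())  # 2009-10-23T10:10:10
```

A value that does not fit the pattern raises ValueError here, which is a quick way to spot a day/month mix-up before the crawler falls back to the default locale settings.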
Activity Log Based Crawl: Indicates whether incremental crawl should be based on Activity Log records, which is the optimized form of incremental crawl. Set to True for an Activity Log based (optimized) incremental crawl, or to False to process all documents to find the modified ones.
The Web service is located in ORACLE_HOME/search/lib/plugins/hbdm. The Web service must be installed on the same server as Hummingbird DM.
The Web service component is provided as an installable setup file. This component must be installed on the same server on which Hummingbird Web Server and Windows .NET Framework 1.1 are installed.
Separate Web service installers are provided for Hummingbird DM 5 (Hummingbird_DM5_Web_Service_Installer.zip) and Hummingbird DM 6 (Hummingbird_DM6_Web_Service_Installer.zip). Ensure that the correct Web service component is installed based on the Hummingbird DM version.
To install the Web service:
Double-click setup.exe to install the Web service.
The installer prompts for the name of the virtual directory. (The virtual directory name can be changed.) The installer creates a virtual directory on Microsoft Internet Information Server (IIS) with same name. If you have multiple Web sites in IIS running on different ports, and you want to install this Web service in a Web site other than the default Web site, then include the port number.
Provide the user name and password of Hummingbird DM Server. Enter the user name in the form: domainname\username.
The IBM DB2 Content Manager (ICM) plug-in extends the searching capabilities of Oracle SES to search ICM repositories, which consist of item types and their instances in the form of folders and documents. Oracle SES can crawl documents and metadata in the ICM Library Server and provide secure, full-text search. Starting from each specified folder, the plug-in extends the crawl, and thus the search, into the folder's complete child tree. If an item type is specified for crawling, then the plug-in crawls all instances of the item type and their complete child trees.
In ICM, the library server manages the content metadata and access control to all content in a database (such as DB2), interfacing to one or more resource managers. The primary job of the Library Server is to service client requests for content. The ICM plug-in navigates through the library server to crawl documents and folders in the specified item types. It stores the metadata and accesses information in Oracle SES to provide search according to the credentials of the end users.
While the crawler connects to the library server through the APIs, the library server internally connects with the resource manager through CM-managed secure tokens. Whenever a reference is made to a document object, the object is fetched from the resource manager using these tokens. With the crawler plug-in, metadata corresponding to a document is retrieved from the library server, while the display URL points to the document object on the resource manager using the token.
Oracle SES supports incremental crawling; that is, it crawls and indexes only those documents that have changed since the most recent crawl. A document is re-crawled if the content, metadata, display URL, or direct security access information of the document has changed. Documents deleted from a database are removed from the index during incremental crawling.
The user account used to crawl the specified item types must be an Administrator account that has access to all instances (documents and folders) of the specified item types and can retrieve and crawl all folders and documents. The administration user specified for crawling must belong to the ICMPUBLIC group and the AllPrivs privilege set.
The version of DB2 Content Manager used to set up the repositories for crawling must be 8.3.
This section lists required software (in order of installation) for the installation of DB2 Content Manager 8.3:
Server Software Requirements (Computer with ICM Server):
Windows Server 2003 Enterprise Edition
IBM WebSphere Application Server 5.1 plus FixPak 1
IBM DB2 Universal Database Enterprise Server Edition (32-bit): 8.1 plus FixPak 7A special or version 8.2 plus FixPak 7A special
DB2 Content Manager Enterprise Edition 8.3 plus FixPak1
DB2 Information Integrator for Content 8.3 with Fix Pack 3
DB2 Content Manager eClient 8.3
Client Software Requirements (Computer with Oracle SES):
IBM DB2 Run-Time Client: 8.1 plus FixPak 7A special or version 8.2 plus FixPak 7A special
DB2 Information Integrator for Content 8.3 with Fix Pack 3
DB2 Content Manager Client for Windows 8.3 (optional for Windows)
The following tasks must be performed on the computer with ICM server.
To install and configure the system with ICM server:
Install DB2 Content Manager 8.3 with the required fix-packs.
Enable LDAP on DB2:
Open the System Administration Client.
Select Tools - LDAP Configuration to display the LDAP Configuration window.
Select Enable LDAP User import and authentication
On the Server tab, select server type Active Directory.
Provide the LDAP server information on the Server page.
Click OK.
Import users and groups from the Active Directory to ICM:
In the system administration client, click Authentication and then right-click either Users or User-Groups.
Click the LDAP button and then enter the user to be imported into ICM. To view a list of all valid user names, click Show All.
Select one or more users and click OK.
From the Assign to Groups tab, assign the users to the required groups.
From the Set Defaults tab, specify the default resource manager, collection and item access control list for the users, user-groups, or both.
Click OK or Apply.
The selected users and user-groups are imported into the DB2 CM environment.
To verify the import, select Users or User-Groups. The imported users or user-groups appear in the list on the right.
Catalog the DB2 run-time client with DB2 Content Manager Library database.
To install and configure the system with Oracle SES:
Locate the services file in \WINDOWS\system32\drivers\etc or similar directory on Windows and the /etc directory on Linux.
Open the services file in a text editor and add these lines:
[Service Name] [Port #]/tcp #DB2 connection service port
Example:
db2c_DB2 50000/tcp #DB2 connection service port
Enter the following commands from the command line processor, where node_name is any name of your choosing:
catalog tcpip node node_name remote [IP_address | host] server service_name
In this example, node_name is CMDB, host is my_computer, and service_name is db2c_DB2:
catalog tcpip node CMDB remote my_computer server db2c_DB2
Enter the following command, where database_alias is a name of your choosing and node_name was specified in the previous step:
catalog db database_name as database_alias at node node_name
In this example, the alias is the same as the database name (ICMNLSDB) and the node name is CMDB.
catalog db ICMNLSDB as ICMNLSDB at node CMDB
To check the connection, issue the following command:
connect to database_alias user database_user using password
In this example, the ICMADMIN user connects to ICMNLSDB.
connect to ICMNLSDB user ICMADMIN using password
Enter select tabname from syscat.tables. All table names in the database are listed.
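When several Oracle SES hosts must be cataloged against the same Library Server, the services entry and the catalog commands shown in the preceding steps can be generated from a few values. The sketch below only formats the command text for pasting into the DB2 command line processor; it does not run DB2. The function names are hypothetical.

```python
# Hypothetical helpers that format the text used in the steps above.
def services_entry(service_name, port):
    """Return the line to add to the services file."""
    return f"{service_name} {port}/tcp #DB2 connection service port"


def db2_catalog_commands(node_name, host, service_name,
                         database_name, database_alias):
    """Return the two catalog commands in the order they are entered."""
    return [
        f"catalog tcpip node {node_name} remote {host} server {service_name}",
        f"catalog db {database_name} as {database_alias} at node {node_name}",
    ]


print(services_entry("db2c_DB2", 50000))
for command in db2_catalog_commands(
        "CMDB", "my_computer", "db2c_DB2", "ICMNLSDB", "ICMNLSDB"):
    print(command)
```

The connect command is left out deliberately so that no password ends up in a generated script.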
Oracle SES does not crawl folders that have all blank attributes.
The ICM plug-in does not support CLOB attributes because of a limitation when using these attributes with XPath queries.
To use the ICM eClient application to view search results, Oracle recommends that users log in to eClient first and then open the Oracle SES search screen in the same window. If a user opens the Oracle SES search results directly, then ICM eClient may prompt the user to log in. Then the user must manually refresh the Oracle SES page to view the selected document.
Change of the item type ACL does not update the items or documents (and their last modified date) of that item type. Whenever an ACL of an item type is changed from the System Administration client, the effective change on the items/documents of that item type can be reflected only through a force re-crawl. Change the re-crawl policy to Process All Documents on the Home - Schedules - Edit Schedule page.
When crawling an item type hierarchy of multiple levels, the crawler might signal this error:
com.ibm.mm.sdk.common.DKUsageError: DGL7146A: The query string is too long or too complex
The CM query has a length restriction of 64K. DB2 UDB does not have such a restriction, and the problem can be fixed by removing the 64K limitation check from the API and allowing the Library Server database to determine the limit.
Activate the ICM identity plug-in on the Global Settings - Identity Management Setup page with the following parameters:
Library Server name: The name of the alias of the Library Server of DB2 Content Manager that must be connected to retrieve all the item types required for crawling.
User name: User name of a valid ICM Server user. Required.
Password: Password of the ICM user. Required.
ICM Servers File: Specifies the absolute path of the cmbicmsrvs.ini file. This INI file stores the source information for the data store.
ICM Environment File: Specifies the absolute path of the cmbicmenv.ini file. This INI file stores the database connect information.
The required ICM Server (cmbicmsrvs.ini) and ICM Environment (cmbicmenv.ini) files can be found on the client side (computer with Oracle SES) at ICM_InstallationFolder/cmgmt/connectors/cmbicmsrvs.ini and ICM_InstallationFolder/cmgmt/connectors/cmbicmenv.ini.
Create a source for the newly-created user-defined source type on the Home - Sources page. Enter a source name. Provide values for these configuration parameters:
Container name: The item types to be crawled. This can be a specific item type whose instances need to be crawled, or a folder or sub-folder if all item types inside that folder or sub-folder must be crawled. Container name can be a combination of multiple item types delimited by a slash (/). Note that a backslash (\) is an unacceptable delimiter.
Container names must be in the format:
parent_item_type_name[@parent_attribute_name=attribute_value]/child_item_type_name[@child_attribute_name=child_attribute_value]
or
child_item_type_name[@parent_attribute_name=attribute_value,@child_attribute_name=child_attribute_value]
For example, you might have a root-component item type named Level-1 with attribute Attribute-1 whose value is Value-1. You have another item type Level-2 that is a child of Level-1, with attributes Attribute-1 (linked with Level-1) and Attribute-2 with value Value-2. You have another item type Level-3 that is a child of Level-2 and has attributes Attribute-1, Attribute-2 (linked attributes) and Attribute-3 with value Value-3.
If the user wants to crawl all items formed with item type Level-3 then the container name is:
Level-1[@Attribute-1="Value-1"]/Level-2[@Attribute-2="Value-2"]/Level-3
or
Level-3[@Attribute-1="Value-1" AND @Attribute-2="Value-2"]
The values for String and Date attributes are enclosed in double quotes while the values for Number attributes are not.
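The quoting rule and the two container name forms from the example can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, Number values are represented as Python int or float, and conditions on one item type are joined with AND as in the second example above.

```python
# Hypothetical helpers for composing ICM container names.
def format_condition(name, value):
    """Return @name=value, quoting String and Date values only."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if is_number:
        return f"@{name}={value}"        # Number values are not quoted
    return f'@{name}="{value}"'          # String and Date values are quoted


def item_type_segment(item_type, conditions=None):
    """Return item_type or item_type[cond AND cond ...]."""
    if not conditions:
        return item_type
    joined = " AND ".join(
        format_condition(name, value) for name, value in conditions.items())
    return f"{item_type}[{joined}]"


def container_name(segments):
    """Join (item_type, conditions) pairs with the slash delimiter."""
    return "/".join(item_type_segment(t, c) for t, c in segments)


print(container_name([
    ("Level-1", {"Attribute-1": "Value-1"}),
    ("Level-2", {"Attribute-2": "Value-2"}),
    ("Level-3", None),
]))
print(item_type_segment(
    "Level-3", {"Attribute-1": "Value-1", "Attribute-2": "Value-2"}))
```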
Attribute list: The comma-delimited list of ICM attributes along with their data types to be searchable. The format is:
AttributeName:AttributeType, AttributeName:AttributeType
Valid values are String, Number, and Date.
A database crawl indexes an attribute only if both name and type match the configured name and type; otherwise, the attribute is ignored. Optional.
The default searchable attributes for ICM are Modified Date, Title, and Author. This parameter is case-sensitive, and multiple attributes with the same name are not allowed.
User name: The ICM user name used for crawling. It must be a user with at least read privileges on the configured item types. This setting is used to make a session with ICM to get ACL, Document List, metadata, and content.
Password: The password of the ICM user in User Name.
Crawl versions: Controls whether all versions of a document are crawled or only the latest version. Valid values are true and false. The default value is false. Any other value is interpreted as false.
Crawl folder attributes: Controls whether folder metadata is indexed. Valid values are true and false. The default value is false.
Library server name: The name of the alias of the Library Server of DB2 Content Manager that must be connected to retrieve all item types required for crawling.
Remove URL not in queue: Controls whether documents deleted from ICM are also removed from the index. Valid values are true and false. The default value is false.
Authentication attribute: The authentication attribute used to validate the ACL. The value for the Active Directory identity plug-in is USER_NAME, and for the ICM identity plug-in is NATIVE. Required.
WebClient path: The path of an optional Web application used to render the search results. ICM allows the rendering of search results in ICM eClient and a custom Web application, which must be deployed separately on the ICM application server.
Title field: A case-sensitive, comma-delimited list of attributes that can be used as the titles in the ICM containers specified for crawling. Required.
Time Zone: The time zone of the ICM library server. Because the library-server of ICM could be in a different time zone than the Oracle SES server, this attribute enables the Oracle SES time zone to be converted to the ICM time zone for time-based queries. If an invalid time zone is entered, then GMT is used by default.
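The fallback behavior described for Time Zone can be illustrated with the Python standard library. This sketch is an analogy rather than the plug-in's implementation: the function names are hypothetical, time zone names are assumed to be IANA identifiers such as America/New_York, and UTC stands in for GMT.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def resolve_time_zone(name):
    """Return the named zone, or UTC (standing in for GMT) if it is invalid."""
    try:
        return ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError, OSError):
        # Mirrors the documented rule: an invalid time zone falls back to GMT.
        return timezone.utc


def to_library_server_time(moment, zone_name):
    """Convert an aware datetime to the library server's time zone."""
    return moment.astimezone(resolve_time_zone(zone_name))


last_crawl = datetime(2009, 10, 23, 10, 10, 10, tzinfo=timezone.utc)
print(to_library_server_time(last_crawl, "Not/AZone").isoformat())
```

The conversion changes only how the instant is expressed, so a time-based query built from the converted value still refers to the same moment as the Oracle SES timestamp.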
ICM Servers File: The absolute path of the cmbicmsrvs.ini file. This INI file stores the source information for the data store.
ICM Environment File: The absolute path of the cmbicmenv.ini file. This INI file stores the database connect information.
Use ICM eClient to view search results: Controls whether ICM eClient is used to view search results or some other Web application. Enter true for ICM eClient; false otherwise.
The SharePoint Crawler connector enables Oracle SES to provide secure search over SharePoint Portal Server and Microsoft Office SharePoint Server 2007 (MOSS). The connector extends the searching capabilities of Oracle SES and enables it to search into an external repository. Oracle SES can crawl through the documents, items, and related metadata in SharePoint repositories and provide secure, full-text search. The connector also provides metadata search and browse functionality, which allows a search to be done against a specific subfolder in the hierarchy.
In SharePoint, data is stored in different libraries such as the Document Library, Picture Library, Lists, Discussion Boards, and so on. A SharePoint instance can have one or more sites and sub-sites that the SharePoint Crawler connector can crawl after you set up the appropriate configuration parameters in the Oracle SES Administration GUI. The SharePoint Crawler connector navigates through the Libraries and Lists to crawl all the documents and items from a SharePoint repository. It creates an index, stores the metadata, and accesses information in Oracle SES to provide search capabilities according to the end user permissions.
The SharePoint Crawler connector supports incremental crawling, which means that it crawls and indexes only those documents that have changed after the most recent crawl. A document is re-crawled if the content, metadata, or direct security access information of the document has changed since the previous crawl. Documents deleted from a Library are removed from the index during incremental crawling.
When the Crawl Security Settings parameter is set to either NORMAL or STRICT, the SharePoint Crawler for the Container must use the SharePoint administrator account for crawling and indexing documents.
When the Crawl Security Settings parameter is set to RELAX, any user that has at least Visitor (Read) permissions can be identified in the SharePoint source for crawling and indexing documents.
The supported versions of SharePoint Server are:
2003 or 2.0 for SharePoint Portal Server
2007 or 3.0 for MOSS 2007
SharePoint Container names in Oracle SES should not contain any special characters. Enter a backslash (\) before a slash or a comma. Otherwise, the crawler does not recognize the Container.
Passwords entered through the Oracle SES Administration GUI are case insensitive.
Storing more than 200 files in a single folder may result in degraded performance and increased crawling time.
If the Crawl Security Settings parameter is set to RELAX, then the user ID specified in the User Name parameter does not require administrative privileges. Visitor (Read) permissions on the site are sufficient. However, Read must have Browse Directories permissions to access any sub-sites. Otherwise, the sub-sites are not crawled.
To add Browse Directories permissions for SharePoint 2007:
Open People and Groups - Site Permissions.
Under Settings - Permission Levels, select READ.
Under Site Permissions, select Browse Directories.
Click Submit.
To add Browse Directories permissions for SharePoint 2003:
Open the Created subarea and select Manage Security.
Select the user and edit permissions.
Select READ.
Click Advanced Permissions.
Under Advanced Permissions, select Browse Directories.
Click OK.
SharePoint does not allow users without administrative privileges to browse user profiles.
If the user ID specified in the User Name parameter does not have administrative privileges, then this user needs permission to manage profiles.
To grant permission to manage profiles:
Open SharePoint Central Administration 3.02.
Click Shared Services Administration - SharedServices1.
Under User Profile and My Sites, select Personalization Service Permissions.
Add user user1 and select permissions Manage user profiles.
Save and submit the user.
User profiles are crawled if the user has specified the root site in the Site/Sub-Site URL parameter of the source configuration.
Versions of list items whose object type is folder are not crawled or indexed.
Site Collection Administrator users are not able to see documents if they are not listed among the document permission users.
The Unable to type cast null message is not an error. This information is provided when the crawler tries to crawl attachments that are not supported for a particular entity.
The Principal user_name cannot be validated error is returned when the crawler obtains a user name from the SharePoint repository that is not present in the Active Directory.
Performance of the SharePoint connector can be impacted when the Crawl Versions attribute is set to true.
The following platforms are supported by the SharePoint Crawler connector:
Red Hat Linux 4
Windows 2003 Server Standard Edition and above with the latest Service Pack
Create a source for the newly-created user-defined source type on the Home - Sources page. Enter a source name. Provide values for the configuration parameters described in the following list. Also see Table 7-6, "Supported Values for SharePoint Source Parameters".
SharePoint Version: Version of the SharePoint server (SharePoint Portal Server/MOSS 2007) to crawl. (Required)
Container name: Contains the names of the containers to be crawled by Oracle SES. You can specify multiple container names as a comma-delimited list. (Required)
You can crawl an entire area or site or a specific folder. The format for specifying a container folder is AreaName/LibraryName/FolderName/SubFolderName.
To crawl all documents in the Area or Library, the format is AreaName or AreaName/LibraryName.
To index the entire SharePoint portal, enter a slash (/).
To crawl all sites, enter sites.
Examples for SharePoint Portal Server:
Container name: AreaName
The entire Area is crawled.
Container name: AreaName/LibraryName/Folder21
Folder21 and its subfolders within LibraryName are crawled.
Container name: LibraryName
All documents inside the Library and its subfolders are crawled.
Examples for MOSS 2007:
Container name: LibraryName/Folder21
Folder21 and its sub-folders within LibraryName are crawled.
Container name: LibraryName
All documents inside the Library and its subfolders are crawled.
The path for the container cannot contain any special characters. Enter a backslash (\) before a slash or a comma.
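The escaping rule for slashes and commas inside a container name can be sketched as follows. The function names are hypothetical, and the sketch assumes that only a slash or comma that is part of an area, library, or folder name needs the backslash, while the slashes that separate path segments stay unescaped.

```python
# Hypothetical helpers for escaping SharePoint container names.
def escape_container_segment(name):
    """Put a backslash before each slash or comma inside one name."""
    return name.replace("/", "\\/").replace(",", "\\,")


def container_path(*segments):
    """Join escaped names with the unescaped slash path delimiter."""
    return "/".join(escape_container_segment(s) for s in segments)


# A folder literally named "Budget, 2009" inside the Finance library.
print(container_path("Finance", "Budget, 2009"))
```

Escaping each name before joining keeps the delimiter slashes and the comma that separates multiple containers unambiguous.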
Attribute list: A comma-delimited list of attributes, as described in Table 7-7. The format for an attribute list is AttributeName, AttributeName. Multiple attributes with the same name are not allowed, such as Emp_ID, Emp_ID.
In MOSS 2007, all attributes viewable from the UI are indexed by default. List all custom attributes to index, using the names displayed in the user interface.
In SPPS (SP 2003), the Title, LastModifiedDate, and Author attributes are indexed by default. List any other attributes to index, using the names displayed in the UI.
If you update the attribute list from the administrator parameters, then perform a forced recrawl to delete the indexes of the old attribute list and to create indexes for the new attribute list.
Domain name: The domain name of the user that is used to crawl the SharePoint site. For example, if you intend to use the OracleDomain\Administrator user for crawling, then enter OracleDomain for this parameter. Do not include .com or .in or any other suffix in the name. (Required)
User name: Specifies the user name of a valid SharePoint Portal Server/MOSS 2007 user. Do not include the domain name for this user. For example, for OracleDomain\Administrator, enter Administrator. (Required)
Password: Specifies the password of the SharePoint user specified in User name. (Required)
Authentication attribute: Format of the user and group identity stored in the ACL of SharePoint objects. This format must be an authentication attribute of the Oracle SES active identity plug-in, such as USER_NAME for an Active Directory identity plug-in. Otherwise, the ACL validation fails during indexing. (Required and case sensitive)
For example, this value is USER_NAME for the Microsoft Active Directory identity plug-in.
SPS Site/Sub-Site URL: The URL of the Site or Sub-site of the SharePoint Portal, which is used for viewing the search results. (Required)
This URL has the form http://HostName:PortNumber or http://HostName:PortNumber/SubSiteName.
Crawl Security Settings: Sets security on documents for indexing. (Required)
This setting can be one of the following:
NORMAL: The regular crawl uses site-level access control lists (ACLs) but not document-level ACLs.
RELAX: When the SharePoint Site Administrator user information is not available and the SharePoint user has visitor (or read) permissions on the site, this user is not able to crawl subsites under the main site. This mode is intended for exposing public documents temporarily and quickly to search. The SES administrator must be careful not to expose documents to other users inadvertently. See the work-around for this in "Known Limitations of the SharePoint 2007 Connector".
STRICT: Captures even document-level security. This mode requires that an additional Web Service agent, Oracle MOSS Web Service, be installed on the SharePoint 2007 server. See "Deploying the Web Service on MOSS 2007".
Simple Include: Only include URLs having at least one word mentioned in this parameter. Separate the words with commas.
Simple Exclude: Exclude all URLs having one or more word(s) mentioned in this parameter. Separate the words with commas.
Regular Expression Include: Include all URLs that match the expression provided in this parameter.
Regular Expression Exclude: Exclude all URLs that match the expression provided in this parameter.
Crawl versions: Controls whether multiple versions of documents are crawled. Valid values are true and false. Any other value is interpreted as false. The default value is false, so only the latest version is crawled. (Optional)
Crawl folder attributes: Controls whether folder attributes are crawled. The default value is false. Valid values are true or false, and any other value is interpreted as false. (Optional)
Crawl attachments: This parameter indicates whether attachments should be crawled. The default value is false. Valid values are true or false, and any other value is interpreted as false. (Optional)
LDAP URL: URL of the LDAP server, such as ldap://IP:port, where the default port number is 389.
LDAP Search Base: The LDAP search base, such as DC=abc, DC=com. When the value of Authentication Attribute is DN, specify the LDAP URL and the LDAP search base of the LDAP server configured in the identity plug-in. Otherwise, leave these parameters blank.
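The Domain name, User name, and LDAP parameter rules above can be checked before a source is created. The following Python sketch is illustrative only and is not part of Oracle SES; the function names split_account and ldap_settings_consistent are hypothetical, and the sketch assumes that a blank parameter is an empty or whitespace-only string.

```python
def split_account(account: str) -> tuple[str, str]:
    # Split a Windows-style account such as OracleDomain\Administrator
    # into the values for the Domain name and User name parameters.
    domain, sep, user = account.partition("\\")
    if not sep or not domain or not user:
        raise ValueError(f"expected Domain\\User, got {account!r}")
    return domain, user


def ldap_settings_consistent(auth_attribute: str, ldap_url: str,
                             ldap_search_base: str) -> bool:
    # When the authentication attribute is DN, both LDAP parameters are
    # expected; otherwise both are expected to be blank.
    url_set = bool(ldap_url.strip())
    base_set = bool(ldap_search_base.strip())
    if auth_attribute == "DN":
        return url_set and base_set
    return not url_set and not base_set


if __name__ == "__main__":
    print(split_account("OracleDomain\\Administrator"))
    print(ldap_settings_consistent("DN", "ldap://10.0.0.1:389", "DC=abc, DC=com"))
```

The host address 10.0.0.1 in the usage lines is a placeholder, not a value from this guide.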
Table 7-6 summarizes the supported values for the configuration parameters of the SharePoint Crawler connector.
Table 7-6 Supported Values for SharePoint Source Parameters
Parameter Name | SharePoint Portal Server | MOSS 2007 |
---|---|---|
SharePoint Version |
2003, 2.0 |
2007, 3.0 |
Container name |
(/) for full site, Library Name, List Name, Area Name |
(/) for full site, Library Name, List Name |
Attribute list |
|
|
Domain Name |
Domain name of the user |
Domain name of the user |
User name |
Valid administrator user for SharePoint Portal server |
Valid administrator user for MOSS 2007 |
Password |
Password for the user |
Password for the user |
Authentication attribute |
|
|
SPS Site/Sub-Site URL |
IP address or host name with port on which SharePoint Portal Server is installed |
IP address or host name with port on which MOSS 2007 is installed |
Crawl Security Settings |
|
|
Simple Include |
Part of URL |
Part of URL |
Simple Exclude |
Part of URL |
Part of URL |
Regular Expression Include |
All URLs that match the expression |
All URLs that match the expression |
Regular Expression Exclude |
All URLs that match the expression |
All URLs that match the expression |
Crawl versions |
|
|
Crawl folder attributes |
|
|
Crawl attachments |
|
|
LDAP URL |
URL of the LDAP server |
URL of the LDAP server |
LDAP Search Base |
LDAP Search Base |
LDAP Search Base |
Table 7-7 Attributes for List Items and Versions Crawled for SharePoint 2007
List Item Type | Attributes |
---|---|
Document Library |
Title, Author, Created, Modified |
Picture Library |
Title, ImageSize, ImageCreateDate, Description, Keywords |
Form Library |
Title, Author, Created, Modified |
Translation Library |
Title, Name, Language, Base Document Version, Translation Status, Created |
Data Connection Library |
Connection Type, Description, Keywords, Title, UDC Purpose, Created |
Slide Library |
Name, Presentation, Description, Created |
Report Library |
Name, Title, Author, Created, Report Category, Report Status |
Dash Board |
Name, Title, Author, Created |
Wiki Page Library |
Title, Author, Created, Modified |
Announcements |
Title, Body, Editor, Modified, Author, Created |
Contacts |
Company, WorkCity, Created, Email, Comments, Title, Editor, HomePhone, JobTitle, Modified, WorkZip, WorkPhone, WorkState, FirstName, Author, FullName, WorkCountry, CellPhone, WorkFax, WorkAddress |
Links |
Comments, Editor, Modified, Author, URL, Created |
Discussion Reply |
Body, Created, DiscussionTitle, Editor, Modified, Author |
Calendar |
EventType, Title, EventDate, Duration, Editor, WorkspaceLink, Modified, EndDate, Description, fRecurrence, Author, fAllDayEvent, Created |
Task |
Title, StartDate, Body, Status, Editor, Priority, AssignedTo, DueDate, Modified, Author, PercentComplete, Created |
Project Task |
Title, StartDate, Body, Status, Editor, Priority, AssignedTo, DueDate, Modified, Author, PercentComplete, Created |
Issue Tracking |
Category, LinkIssueIDNoMenu, RelatedIssues, IssueID, Priority, DueData, Comment, V3Comments, IsCurrent, Created, Title, Status, Editor, AssignedTo, Modified, Author |
Custom List |
Title, Editor, Modified, Author, Created |
Languages and Translators |
Language_x0020_From, Language_x0020_To, Modified, Author, Translator, Created, Editor |
KPI List |
Title, PercentExpression, Editor, ViewGuid, Modified, Value, AutoUpdate, KpiComments, Author, Goal, ValueExpression, Warning, KpiDescription, DataSource, LowerValuesAreBetter, Created |
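The Attribute list parameter takes a comma-delimited list of names such as those in Table 7-7 and does not allow the same name twice. The following Python sketch shows one way to check a value before entering it. It is an illustration, not the connector's own parser; the function name parse_attribute_list is hypothetical, and the sketch assumes that duplicate detection is an exact, case-sensitive comparison.

```python
def parse_attribute_list(value: str) -> list[str]:
    # Split on commas and trim surrounding spaces. Names that contain
    # spaces, such as "Base Document Version", are kept intact.
    names = [part.strip() for part in value.split(",")]
    names = [name for name in names if name]

    seen = set()
    for name in names:
        if name in seen:
            # Multiple attributes with the same name are not allowed.
            raise ValueError(f"duplicate attribute: {name}")
        seen.add(name)
    return names


if __name__ == "__main__":
    print(parse_attribute_list("Title, Author, Created, Modified"))
```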
For MOSS 2007, if the Crawl Security Settings parameter is set to STRICT, then you must install an extra web service, Oracle MOSS Web Service. The following installation and deinstallation files are provided by the OracleMOSSService installer at ORACLE_HOME/search/lib/plugins/sps/WebService.zip:
OracleMossService.wsp
install.cmd
de-install.cmd
To install or deinstall the Oracle MOSS Web Service:
Click install.cmd to install, or click de-install.cmd to deinstall.
Verify that the STSADM.exe file is in the following location: Drive:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN.
If STSADM.exe is not in that folder, specify the correct path when the installer prompts for it.
Press any key to continue.
Livelink data is stored in Workspaces, which in turn can contain folders, files, projects, and task lists. A Livelink Enterprise Server instance can have one or more Workspaces that can be crawled. Oracle SES navigates through the Workspaces to crawl all the objects in Livelink Enterprise Server. It creates an index, stores the metadata, and accesses information in Oracle SES to provide search according to the end user permissions.
The Livelink crawler plug-in must use the administrator account for the container to crawl and index documents.
The Livelink Enterprise Server version must be 9.2, 9.5.0, or 9.5.5.
Because Open Text Livelink software is not included with Oracle SES, certain files must be copied manually into Oracle SES. Copy the lapi.jar file from the LAPI installation folder into ORACLE_HOME/search/lib/plugins/llcs.
The Directory Services module of Livelink should be installed with Livelink if users and groups are imported from an LDAP server and you want to use the Active Directory identity plug-in.
To import users and groups of Active Directory into Livelink Server:
Create an LDAP user that has permission in Active Directory to administer users and groups. This user synchronizes the Active Directory with Livelink.
To extend the schema of Active Directory, install the Active Directory Schema snap-in:
Select Run from the Windows Start menu.
Type mmc /a in the Open field and click OK.
On the Console menu, choose Add/Remove Snap-in and click Add.
Under Snap-in, double-click Active Directory Schema. Click Close, then OK. Save the console (for example, as "Active Directory Schema.msc"). If the new snap-in does not appear under Snap-in, then you may have to re-install the Windows 2003 Administrative Tools and start again at step 2.
Open the following file in a text editor:
livelink_home/module/directory_2_3_0/ot-livelink-schema.conf
Open the Active Directory Schema console from the Windows All Programs menu. The console has a name such as Active Directory Schema.msc.
Right-click Active Directory Schema and select Operations Master.
Right-click the Attributes folder and select Create Attribute.
Create the attribute llserverinfo using the information from ot-livelink-schema.conf, as shown in Table 7-8.
Create the attribute llquery using the information from ot-livelink-schema.conf as follows:
Browse through the Directory Services Administration section of the Livelink Administration page to enable the following configuration.
To enable the Synchronization Features:
Click the Choose Directory Services link.
Select LDAP Synchronization (Read-Only LDAP) from the Synchronization list.
For Livelink CGI Hosts, specify 127.0.0.1,
Livelink_Server_IP
Click Save Changes.
To configure LDAP Read-Only Parameters, set the parameters described in Table 7-10.
Click Save Changes.
Click Synchronize LDAP Read-only.
Click Synchronize.
Table 7-10 LDAP Read-Only Parameters
Parameter | Value |
---|---|
New User Password Policy |
Hidden |
User name Case Sensitivity |
Preserve case |
Livelink Server Name |
Computer name on which Livelink Server is running |
LDAP Server |
Computer name or IP Address on which LDAP server is running |
LDAP Server Port |
389 |
Search Root |
cn=Users,dc=otdomain,dc=com |
LDAP User name |
cn=<LDAP_User_Name>,cn=Users, dc=otdomain,dc=com |
LDAP Password |
<LDAP_User_Password> |
Log-in Name |
sAMAccountName or cn |
First Name |
givenname |
Last Name |
sn |
Title |
title |
|
|
Contact |
telephonenumber |
Department Mapping |
disable |
Group Name |
cn |
Group Leader |
managedBy |
Group Member |
Member |
Group Member Query |
llquery |
Privileges |
Select Log-in enabled, Public Access |
Group Search Filter |
objectclass=group |
Synchronize Group |
checked |
The Livelink Enterprise Server identity plug-in authenticates native users of Livelink Enterprise Server. The identity plug-in communicates with the directory to authenticate a user's credentials, validate a user or group and return the associated canonical form, and return the groups associated with a given user.
Activate the identity plug-in on the Global Settings - Identity Management Setup page, as described in "Activating the Active Directory Identity Plug-in".
Create an Open Text source on the Home - Sources page. Select Open Text from the Source Type list, and click Create. Enter values for the following parameters:
User name: Name of a valid Livelink Enterprise Server user. The user must be an Administrator user or a user who has access to all folders and documents of the workspaces configured in the Container name parameter. The user should be able to retrieve content, metadata, and ACLs from folders, documents, and other custom subclasses of all workspaces configured in the Container name parameter. Required.
Password: Password of the Livelink user. Required.
Server Name and Port Number for Livelink: The computer name or IP address and the port number on which the Livelink server is running. The format is ServerName:port.
Container name: The names of the containers to be crawled by Oracle SES. You can crawl an entire Livelink Workspace or a specific folder. The format is WorkspaceName/FolderName/SubFolderName. You can enter multiple comma-delimited container names. Required.
For example:
Container name: Workspace1: The entire Workspace1 is crawled.
Container name: Workspace2/Folder21: Folder21 and its sub-folders within Workspace2 are crawled.
Attribute list: The comma-delimited list of Livelink attributes, along with their data types, to be made searchable. The format of an attribute list is AttributeName:AttributeType, AttributeName:AttributeType. Valid types are String, Number, and Date. Optional.
Table 7-11 shows equivalent Open Text and Oracle SES data types. The crawler indexes an attribute only if both its name and type match the configured name and type; otherwise, the attribute is ignored. Multiple attributes with the same name are not allowed, for example, Emp_ID:String, Emp_ID:Number.
The default searchable attributes for Livelink Enterprise Server are Modified Date, Title, and Author.
For example, consider the following Livelink attributes:
Attribute name: account name, attribute type: String
Attribute name: account ID, attribute type: Integer
Attribute name: creation date, attribute type: Date
For these attributes to be searchable, the value of Attribute List must be:
Account Name:String, Account ID:Number, Creation Date:Date
Crawl versions: Controls whether multiple versions of documents are crawled. Valid values are true and false. The default value is false. Any other value is interpreted as false, and only the latest versions of the documents are crawled. Optional.
Crawl folder attributes: Controls whether folder attributes are crawled. Valid values are true and false. The default value is false. Any other value is interpreted as false, and folder attributes are not crawled. Optional.
Authentication attribute: The attribute used to set ACL. With Active Directory, the value is USER_NAME. With the Livelink identity plug-in, the value is NATIVE. Required and case sensitive.
Crawl objects with public access: Controls whether objects with public access are crawled without an ACL. Valid values are true and false. When false, all objects with this ACL are ignored.
Livelink URL: The Livelink URL for viewing objects from the Livelink Server. For example, for Windows, the URL must be http | https://host/livelink_service/livelink.exe.
For other application servers like WebLogic, Tomcat, and WebSphere, the URL must be http | https://host:port/livelink_service/livelink.
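The Attribute list format for an Open Text source (AttributeName:AttributeType, with the types String, Number, and Date and no repeated names) can be validated before the source is created. The following Python sketch is an illustration under those stated rules and is not the crawler's own parser; the function name parse_livelink_attributes is hypothetical, and the sketch assumes that spaces around names and types are not significant.

```python
VALID_TYPES = {"String", "Number", "Date"}


def parse_livelink_attributes(value: str) -> dict[str, str]:
    # Map each attribute name to its Oracle SES type.
    result = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last colon so that the name may contain spaces.
        name, sep, type_name = entry.rpartition(":")
        name, type_name = name.strip(), type_name.strip()
        if not sep or not name:
            raise ValueError(f"expected AttributeName:AttributeType, got {entry!r}")
        if type_name not in VALID_TYPES:
            raise ValueError(f"unsupported type {type_name!r} for {name!r}")
        if name in result:
            raise ValueError(f"duplicate attribute: {name}")
        result[name] = type_name
    return result


if __name__ == "__main__":
    print(parse_livelink_attributes(
        "Account Name:String, Account ID:Number, Creation Date:Date"))
```

A Livelink Integer attribute is entered as Number, so a value such as Account ID:Integer is rejected by this sketch.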
Documents in Oracle Content Database are organized into folders. Oracle SES navigates the folder hierarchy to crawl all documents in Oracle Content Database. It creates an index, stores the metadata, and accesses information in Oracle SES to provide search according to the end users' permissions.
The metadata crawled includes folder_url (URL of the folder containing the document) and folder_path (path of the folder containing the document). These let you show the direct folder path and direct folder URL for each document hit.
Oracle SES supports incremental crawling; that is, it only crawls and indexes documents that have changed since the last crawl. A document is re-crawled if either the content or the direct security access information of the document changes. A document is also re-crawled if it is moved within Oracle Content Database and the end user has to access the same document with a different URL. Deleted documents are removed from the index during incremental crawling.
This book uses the product name Oracle Content Database to mean both Oracle Content Database and Oracle Content Services. Oracle Content Database sources are certified with Oracle Content Database release 10.2 and release 10.1.3 and Oracle Content Services release 10.1.2.3.
Known Issues:
The administrator account used by the Oracle Content Database source must have the ContentAdministrator role on the site that is being crawled and indexed. Also, end users searching documents in Oracle Content Database must have the GetContent and GetMetadata permissions.
By default, Oracle Content Database has a limit of three concurrent requests (simultaneous operations) for each user. However, Oracle SES has a default of five concurrent crawler threads. When crawling Oracle Content Database, only three of the five threads can successfully crawl, which causes the crawl to fail.
Workaround: For an Oracle Content Database source, change the Number of Crawler Threads on the Home - Sources - Crawling Parameters page to a value of 3 or fewer.
Or, modify the Oracle Collaboration Suite configuration in Oracle Enterprise Manager to allow more than three concurrent requests. For example:
Access the Enterprise Manager page for the Collaboration Suite Midtier. For example: http://example.domain:1156/.
Click the Oracle Collaboration Suite midtier standalone instance name. For example: ocsapps.example.domain.
In the System Components table, click Content.
From Administration, click Node Configurations.
In the Node Configurations table, click HTTP_Node. For example: ocsapps.computer.domain_HTTP_Node.
On Properties, change the value for Maximum Concurrent Requests Per User. Enter a value larger than or equal to the number of crawling threads used by Oracle SES. This value is listed on the Global Settings - Crawler Configuration page.
The Oracle SES instance and the Oracle Content Database instance must be connected to the same or mirrored Oracle Internet Directory system or other LDAP server.
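The thread-count issue described above reduces to keeping the number of crawler threads at or below the Maximum Concurrent Requests Per User value. The following Python sketch only illustrates that arithmetic with the defaults stated in this section (five crawler threads, three concurrent requests); the function name safe_thread_count is hypothetical and nothing here reads or changes an Oracle SES setting.

```python
def safe_thread_count(crawler_threads: int, max_requests_per_user: int) -> int:
    # Return the largest thread count that stays within the per-user
    # concurrent request limit of Oracle Content Database.
    if crawler_threads < 1 or max_requests_per_user < 1:
        raise ValueError("both values must be at least 1")
    return min(crawler_threads, max_requests_per_user)


if __name__ == "__main__":
    # Defaults from this section: 5 crawler threads, limit of 3 requests.
    print(safe_thread_count(5, 3))
```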
To set up a secure Oracle Content Database source:
Read "Known Issues:" and confirm that the number of crawler threads does not exceed the available concurrent connection settings for each user in Oracle Content Database.
Activate the Oracle Internet Directory identity plug-in for the Oracle Content Database instance on the Global Settings - Identity Management Setup page in Oracle SES.
For Oracle Content Database 10.1.2.3 and 10.2.0.4, use the following LDIF file to create an application entity for the plug-in. (An application entity is a data structure within LDAP used to represent and keep track of software applications accessing the directory with an LDAP client.)
ORACLE_HOME/bin/ldapmodify -h oidHost -p OIDPortNumber -D "cn=orcladmin" -w password -f ORACLE_HOME/search/config/ldif/csPlugin.ldif
This defines the entity that is used for the connector: orclApplicationCommonName=ocsCsPlugin,cn=ifs,cn=products,cn=oraclecontext. The entity has the password welcome1.
The Content Database JDBC connector is an alternative to the Content Database connector provided in Oracle SES Release 10.1. The JDBC connector greatly improves the performance of incremental crawls. If the elapsed time of an incremental crawl is an important consideration in your deployment of Oracle SES, then use the JDBC connector.
Oracle SES crawler supports crawling from Oracle Content Database 10.1.2.0.4 or later. See the readme file for Oracle Content Database 10.2.1.0.4 patchset for details on configuring high volume full and incremental crawls in Oracle Content Database.
Note that it may be necessary to grant the SES user access to one of the Oracle Content Database objects. To do this, use the command:
GRANT SELECT ON ODMC_ALERT_SEQ TO sesuser
where sesuser is the SES user.
For example,
GRANT SELECT ON ODMC_ALERT_SEQ TO eqsys
Note:
The JDBC connector requires installation of a patch to Oracle Content Database. If the patch is not available for your version of Content Database, then use the older connector as described in "Creating an Oracle Content Database Source".
To create an Oracle Content Database JDBC source:
Open the Oracle SES Administration GUI to the Home page.
Select the Sources secondary tab.
For Source Type, select Oracle Content Database (JDBC), then click Create to display Step 1 Parameters.
Enter a source name and the values for the parameters described in Table 7-12.
Click Next to display Step 2 Authorization.
Enter the settings described in Table 7-13.
Click Create or Create and Customize to create the source.
Table 7-12 Oracle Content Database JDBC Source Parameters (Step 1)
Parameter | Value |
---|---|
Database Connection String |
JDBC connection string to Oracle Content Database in the form |
Content DB System User |
SYSTEM user for Content Database. |
Alert Table Name |
Name of the Alert table for Content Database, which typically has the form |
Database User ID for Crawl |
Valid user ID for the Content DB database. |
Database Password for Crawl |
Password associated with the user ID for crawling. |
Document Count |
Maximum number of documents to be crawled. |
URL Prefix |
URL to Oracle Content Database in the form |
Document Access (DAV) User ID |
Valid Content Database user ID for using WebDAV to access documents. |
Document Access (DAV) Password |
Password associated with the DAV user ID. |
Starting Path for Crawl |
Full path where the crawl starts. Enter |
Table 7-13 Oracle Content Database JDBC Authorization Parameters (Step 2)
Parameter | Value |
---|---|
Authorization Database JDBC Connection String |
JDBC connection string to Oracle Content Database in the form |
Content DB System User |
System user for Content Database, such as |
Database User ID |
User ID to connect to the database. |
Database Password |
Password associated with the database user ID. |
Use the Run-Time Result Filter |
Controls use of a final security check:
|
Authorization User ID Format |
Format of user ID in the authorization query. Enter a supported authentication attribute of the active identity plug-in, such as |
If Oracle Content Database release 10.2 or Oracle Content Services release 10.1.2 is used, then the Entity name and Entity password parameters are required, the last six parameters (those related to the keystore) are not required, and the crawler plug-in uses service to service (S2S) authentication to connect to Oracle Content Database.
If Oracle Content Database release 10.1.3 is used, then the last six parameters in the following table are required, the Entity name and Entity password are not required, and Oracle SES uses Web services authentication to connect to Oracle Content Database. See "Required Tasks for Oracle Content Database Release 10.1.3".
Create an Oracle Content Database source on the Home - Sources page. Select Oracle Content Database from the Source Type list, and click Create.
Enter values for the parameters listed in Table 7-14.
Table 7-14 Oracle Content Database Source Parameters
Parameter | Value |
---|---|
Oracle Content Database URL |
|
Starting paths |
/ |
Depth |
-1 |
Oracle Content Database admin user |
|
Entity name |
|
Entity password |
welcome1 |
Crawl only |
|
Use e-mail for authorization |
|
Oracle Content Database Version |
For example, 10.1.3.2.0 |
SES keystore location |
For example, /scratch/ocs/cdb/cdb-ses/keystore/sesClientKeystore.jks |
SES keystore type |
jks |
SES keystore password |
******* |
SES private key alias |
client |
SES private key password |
******* |
CDB Server public key alias |
server |
Table 7-15 Oracle Content Database Authorization Manager Plug-in Parameters
Parameter | Value |
---|---|
Oracle Content Database URL |
http://host name:port/content |
Oracle Content Database admin user |
orcladmin |
Entity name |
|
Entity password |
welcome1 |
Use e-mail for authorization |
|
You can use a real-time result filter (query-time authorization) to ensure that the user has access to each result document. Set this parameter to |
|
Oracle Content Database Version |
For example, 10.1.3.2.0 |
SES keystore location |
For example, /scratch/ocs/cdb/cdb-ses/keystore/sesClientKeystore.jks |
SES keystore type |
jks |
SES keystore password |
******** |
SES private key alias |
client |
SES private key password |
******* |
CDB Server public key alias |
server |
This section describes the required steps for Web services authentication when using Oracle Content Database release 10.1.3. This procedure uses the JDK keytool to create the keys.
See Also:
"Setting Up a Server Keystore for WS-Security" in the Oracle Fusion Middleware Administrator's Guide for Oracle Universal Online Archive at http://download.oracle.com/docs/cd/B32110_01/content.1013/b32191/security.htm#CHDGCJEH
Configure a server keystore at the Oracle Content Database middle tier if the keystore is not set up yet.
The file ORACLE_HOME/j2ee/OC4J_Content/config/oc4j.properties defines the keystore type and the keystore properties file location. If you use a different file name for the keystore, then edit the following entry in that file:
oracle.ifs.security.KeyStoreLocation=/home/oracle/product/10.1.3.2.0/OracleAS_1/content/settings/server-keystore.jks
Change to the settings directory:
cd Oracle_home/content/settings
Create the Oracle Content Database server keystore with the following keytool command, substituting a secure password for password.
Oracle_home/jdk/bin/keytool -genkey -keyalg RSA -validity 5000 -alias server -keystore server-keystore.jks -dname "cn=server" -keypass password -storepass password
To list the keys in the store:
Oracle_home/jdk/bin/keytool -list -keystore server-keystore.jks -keypass password -storepass password
Sign the key before using it:
Oracle_home/jdk/bin/keytool -selfcert -validity 5000 -alias server -keystore server-keystore.jks -keypass password -storepass password
Export the server public key from the server keystore to a file:
Oracle_home/jdk/bin/keytool -export -alias server -keystore server-keystore.jks -file cdbServer.pubkey -keypass password -storepass password
Store both the keystore password and the private server key password in a secure location so Oracle Content Database can access the keystore and the private key.
Oracle_home/content/bin/changepassword -k
When prompted for the old password, press [Enter] if this is the first time the password is being set; otherwise, enter the previous password. Then, enter and confirm the keystore password (-storepass password) that you provided in step 1.b.
See ORACLE_HOME/content/log/changepassword.log.
Configure a client keystore at the Oracle SES installation.
Create the SES client keystore with the following keytool command, substituting a secure password for password:
Oracle_home/jdk/bin/keytool -genkey -keyalg RSA -validity 5000 -alias client -keystore sesClientKeystore.jks -dname "cn=client" -keypass password -storepass password
To list the keys in the store:
Oracle_home/jdk/bin/keytool -list -keystore sesClientKeystore.jks -keypass password -storepass password
Sign the key before using it:
Oracle_home/jdk/bin/keytool -selfcert -validity 5000 -alias client -keystore sesClientKeystore.jks -keypass password -storepass password
Restart the WebCenter middle tier from the Oracle Enterprise Manager console.
Export the client public key from the client keystore to a file:
Oracle_home/jdk/bin/keytool -export -alias client -keystore sesClientKeystore.jks -file sesClient.pubkey -keypass password -storepass password
Import Oracle SES client public keys into the Oracle Content Database server keystore (sesClient.pubkey must be copied to Oracle Content Database):
cd Oracle_home/content/settings
Oracle_home/jdk/bin/keytool -import -alias client -file sesClient.pubkey -keystore server-keystore.jks -keypass password -storepass password
Import Oracle Content Database server public keys into the Oracle SES keystore (cdbServer.pubkey must be copied to Oracle SES):
Oracle_home/jdk/bin/keytool -import -alias server -file cdbServer.pubkey -keystore sesClientKeystore.jks -keypass password -storepass password
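The generate, sign, and export commands above are the same for the server and client keystores except for the alias and file names. The following Python sketch builds those three keytool command lines as argument lists so that they can be reviewed or scripted. It uses only the keytool options shown in this procedure; the function name keytool_sequence is hypothetical, the options are emitted in a different order than in the examples above, and the sketch assumes, as in those examples, that the distinguished name is cn= followed by the alias and that one password is used for both the key and the store.

```python
def keytool_sequence(keytool: str, alias: str, keystore: str,
                     pubkey_file: str, password: str,
                     validity_days: int = 5000) -> list[list[str]]:
    # Options shared by all three commands in the procedure.
    common = ["-alias", alias, "-keystore", keystore,
              "-keypass", password, "-storepass", password]
    validity = ["-validity", str(validity_days)]
    return [
        # 1. Create the key pair.
        [keytool, "-genkey", "-keyalg", "RSA", *validity,
         "-dname", f"cn={alias}", *common],
        # 2. Self-sign the key before using it.
        [keytool, "-selfcert", *validity, *common],
        # 3. Export the public key to a file.
        [keytool, "-export", "-file", pubkey_file, *common],
    ]


if __name__ == "__main__":
    for command in keytool_sequence("Oracle_home/jdk/bin/keytool", "client",
                                    "sesClientKeystore.jks",
                                    "sesClient.pubkey", "password"):
        print(" ".join(command))
```

Each argument list can be passed to subprocess.run(command, check=True) on a host where keytool is installed. Replace the literal password with a secure value, as the procedure instructs.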
Note:
Check the server logs at ORACLE_HOME/content/logs for keystore issues with the crawler plug-in.
Oracle SES crawls the following attributes for Oracle Content Database sources:
AUTHOR
CREATE_DATE
DESCRIPTION
FILE_NAME
LASTMODIFIEDDATE
LAST_MODIFIED_BY
TITLE
MIMETYPE
ACL_CHECKSUM: The check sum calculated over the ACL submitted for the document.
DOCUMENT_LANGUAGE: Oracle SES language code taken from Oracle Content Database language string. For example, if Oracle Content Database uses "American", then Oracle SES submits it as "en-us".
DOCUMENT_CHARACTER_SET: The character set for the Oracle Content Database document.
Oracle SES also can search categories or customized attributes created by the user in Oracle Content Database.
You can apply categories to files and links, and divide categories into subcategories having one or more attributes. When a document in Oracle Content Database is attached to a category, you can search on the attribute of category. (The attributes appear in the list of search attributes.)
For example, suppose you create a category named testCategory with the attributes testAttr1 and testAttr2. Document X is created and assigned to testCategory. You must assign values to the testCategory attributes. After crawling, testAttr1 and testAttr2 appear in the search attribute list.
Customized attribute values can be the following types: String, Integer, Long, Double, Boolean, Date, User, Enumerated String, Enumerated Integer, and Enumerated Long:
Long, Double, Integer, Enumerated Integer, and Enumerated Long customized attributes are indexed as type Number attributes in Oracle SES. The display name has an _N suffix.
Date customized attributes are indexed as type Date attributes in Oracle SES. The display name has a _D suffix.
String, Enumerated String, and User customized attributes are indexed as type String attributes in Oracle SES.
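The type mapping in the preceding list can be written as a small lookup. The following Python sketch is illustrative only; the function name ses_attribute is hypothetical. It assumes that the suffix is appended directly to the attribute name and that String attributes keep their name unchanged. The list does not state how Boolean attributes are indexed, so the sketch rejects that type rather than guess.

```python
NUMBER_TYPES = {"Long", "Double", "Integer",
                "Enumerated Integer", "Enumerated Long"}
STRING_TYPES = {"String", "Enumerated String", "User"}


def ses_attribute(name: str, cdb_type: str) -> tuple[str, str]:
    # Return (display name, Oracle SES type) for a customized attribute.
    if cdb_type in NUMBER_TYPES:
        return name + "_N", "Number"
    if cdb_type == "Date":
        return name + "_D", "Date"
    if cdb_type in STRING_TYPES:
        return name, "String"
    # Boolean and unknown types: mapping not stated in this section.
    raise ValueError(f"no documented mapping for type {cdb_type!r}")


if __name__ == "__main__":
    print(ses_attribute("testAttr1", "Long"))
    print(ses_attribute("testAttr2", "Date"))
```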
Limitations on Custom Attributes for Oracle Content Database
The Oracle Content Database SDK has more features than the Oracle Content Database Web GUI. The Web GUI does not support String arrays, but the SDK does. If you use the SDK to build customized administration and user GUIs that support the String array type, then a customized attribute can have multiple values.
If a document in Oracle Content Database is attached to a category and the attributes in that category are left blank, then the attribute is not available in the attribute list for an Advanced Search. The crawler skips attributes with null values. However, if another document has the same attribute with a real value, then the attribute is indexed.
The Oracle Content Server connector enables Oracle SES to search Oracle Content Server (formerly Stellent Server), which is the foundation of the Oracle Universal Content Management solution. Users throughout the organization can contribute content from native desktop applications, manage content through rich library services, publish content to Web sites or business applications, and access the content with a browser.
The Content Server connector supports Oracle Content Server 7.5.2 or 10gR3 with XMLCrawlerExport (the Oracle Content Server RSS component).
Oracle Content Server includes an RSS feed generator component (XMLCrawlerExport) on top of the content server. This component generates RSS feeds as XML files from its internal indexer, based on indexer activity. It has access to the original content (for example, a Microsoft Word document), the Web viewable rendition, and all the metadata associated with each document. The component also has a template that contains an Idoc script that applies the metadata values from the indexer to generate the XML document. (Idoc is an Oracle Content Server proprietary scripting language.) Oracle Content Server generates feeds for all documents for the initial crawl, and feeds for updated and deleted documents for the incremental crawl. Each document can be an item in the feed, with the operation on the item (such as insert, delete, update), its metadata (such as author, summary), URL links, and so on.
The Oracle Content Server connector reads the feeds provided by Oracle Content Server according to a crawling schedule. Oracle SES parses and extracts the metadata information, and fetches the document content, using its generic RSS crawler framework.
Oracle SES supports the control feed method, in which individual feeds can be located anywhere and a control feed file is generated containing the links to other feeds. This control file is input to the connector through the configuration file. A control feed must be used when two computers are on different domains or on different platforms, or if they use a remote access protocol, such as HTTP or FTP, for communication between the two servers.
See Also:
Oracle Content Database page at http://www.oracle.com/technology/products/contentdb/index.html
The Oracle Content Server security model is based on the concept of permissions, which defines the privileges a user has on a document. The following table shows the set of permissions supported by Oracle Content Server. Each permission is a superset of the previous ones. For example, Write permission includes Read permission. Admin permission is a superset of all the permissions.
Table 7-16 Oracle Content Server Permissions
Permission | Description |
---|---|
Read | View documents |
Write | View, Check In, Check Out, and Get Copy of documents |
Delete | View, Check In, Check Out, Get Copy, and Delete documents |
Admin | View, Check In, Check Out, Get Copy, and Delete documents. An Administration user with Workflow rights can start or edit a workflow for the document. An Administration user can also check in documents with another user specified as the Author. |
Oracle Content Server provides multiple security models, including an out-of-the-box security system and integration with centralized security models such as LDAP and Active Directory.
Oracle Universal Content Management security can work in these modes:
Universal Content Management native identity plugin where Universal Content Management is not connected to a directory
Oracle Internet Directory
Active Directory only where Universal Content Management is connected to Active Directory using LDAP. A connection to Active Directory using Microsoft Security is not supported.
The Oracle SES Oracle Content Server connector supports the two most popular security models among current Oracle Content Server customers: Roles and Groups, and Accounts.
A security group is a set of files grouped under a unique name. Every file in the library belongs to a security group. Access to security groups is controlled by the permissions, which are assigned to roles, which are assigned to users. For example, the EngAdmin role has Read, Write, Delete, and Admin permission to all content in the EngDocs security group. User Joe is assigned to role EngAdmin; therefore, Joe has all permissions to the documents in EngDocs group.
Accounts provide greater flexibility and granularity than groups. An account is a group of content. It introduces another metadata field that is filled out upon content check-in. When accounts are enabled, content items also can be assigned to an account in addition to the security group. A user must have access to the account to read, write, delete or administer content in that account. When accounts are used, the account becomes the primary permission to satisfy before security group permissions are applied.
A user's access to a document is effectively the intersection of their account permissions and their security group permissions. For example, a user is assigned the EngAdmin role, which has all permissions to the documents in the EngDocs security group. At the same time, the user is also assigned Read and Write permission to the EngProjA account. Therefore, the user has only Read and Write permission to a content item that is in both the EngDocs security group and the EngProjA account.
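The intersection rule can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not Oracle Content Server or Oracle SES code: the names `Permission`, `effective_permission`, and `allows` are hypothetical. Because each permission is a superset of the previous ones, the sketch assumes the intersection reduces to the lower of the two permission levels.

```python
from enum import IntEnum


class Permission(IntEnum):
    """Hypothetical ordering of the permissions in Table 7-16.

    Each level is a superset of the levels below it.
    """
    NONE = 0
    READ = 1
    WRITE = 2
    DELETE = 3
    ADMIN = 4


def effective_permission(group_perm: Permission, account_perm: Permission) -> Permission:
    """Return the lower of the security group and account permissions."""
    return min(group_perm, account_perm)


def allows(effective: Permission, required: Permission) -> bool:
    """Return True if the effective permission covers the required one."""
    return effective >= required


# EngAdmin role: Admin on the EngDocs security group.
# The same user: Read and Write (that is, Write level) on the EngProjA account.
effective = effective_permission(Permission.ADMIN, Permission.WRITE)
print(effective.name)  # WRITE
```

With these assumptions, `allows(effective, Permission.READ)` is true and `allows(effective, Permission.DELETE)` is false, which matches the EngDocs and EngProjA example.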
Accounts can also be set up in a hierarchical structure. A user has permission to the entire subtree starting from the account node. For instance, a user assigned to the Eng account has access to Eng/AbcProj and Eng/XyzProj, or any accounts beginning with Eng. In other words, users that have permission to a particular account prefix also have access to all accounts with that prefix.
Note:
Oracle Content Server uses a prefix test for account filtering, so a slash (/) has no special meaning. A user granted permission to account A has access to any documents in account A*, such as A, AB, or A/B. The hierarchical structure takes advantage of the prefix semantics, but it is enforced with the account model. Hence, there is no special character as the level divider when testing for account permissions.
See Also:
Oracle Universal Content Management documentation at http://www.oracle.com/technology/products/content-management/ucm/index.html
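The prefix test described in the note can be illustrated with a short Python sketch. The function name `has_account_access` and its arguments are hypothetical and chosen only for illustration. The sketch assumes a plain string prefix comparison, as the note describes, and does not reflect the actual Oracle Content Server implementation.

```python
def has_account_access(granted_accounts, document_account):
    """Return True if any granted account is a prefix of the document's account.

    The comparison is a plain string prefix test, so a slash is treated
    like any other character.
    """
    return any(document_account.startswith(granted) for granted in granted_accounts)


# A user granted account "A" can reach accounts A, AB, and A/B.
print(has_account_access(["A"], "A/B"))          # True
# A user granted "Eng" can reach Eng/AbcProj.
print(has_account_access(["Eng"], "Eng/AbcProj"))  # True
# An account that does not start with the granted prefix is not reachable.
print(has_account_access(["Eng"], "Sales/Eng"))   # False
```

Under this reading, a user granted `Eng` would also match an account named `Engineering`, because no level divider is applied.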
To activate the Oracle Content Server identity plug-in:
On the Global Settings page, select Identity Management Setup under the System heading.
The Global Settings - Identity Management Setup page is displayed.
Select Oracle Content Server and click Activate.
Enter values for the parameters described in Table 7-17, then click Finish.
Table 7-17 Oracle Content Server Connector Setup Parameters
Parameter | Value |
---|---|
HTTP endpoint for authentication | HTTP endpoint for Oracle Content Server authentication. For example, |
Admin User | Administrative user who accesses the Oracle Content Server Identity Service API |
Password | Administrative user password |
To create an Oracle Content Server source using the Oracle SES Administration GUI:
On the Home page, click the Sources secondary tab to display the Sources page.
Select Oracle Content Server from the Source Type list, then click Create to display Step 1 Parameters.
Enter values for the parameters described in Table 7-18.
Click Next to display Step 2 Authorization, then set values for the parameters described in Table 7-19.
Scroll down to Security Attributes to verify that ACCOUNT and DOCSECURITYGROUP are listed. If they are not, then the source was not created correctly. Verify that the Configuration URL in Step 1 is correct.
Click Create to create the Oracle Content Server source.
After processing each data feed, a status feed is uploaded to the location specified in the configuration file. This status feed is named one of the following:
data_feed_file_name.suc indicates the data feed was processed successfully.
data_feed_file_name.err indicates that an error was encountered while processing the feed. The errors are listed in this status feed.
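A monitoring script might use this naming convention to check the outcome of each feed. The following Python sketch is illustrative only: the function names and the sample file name `feed_001.xml` are hypothetical. It assumes that the `.suc` or `.err` suffix is appended to the complete data feed file name. Confirm the exact naming in your environment before relying on it.

```python
def status_feed_names(data_feed_file_name):
    """Return the (success, error) status feed names for a data feed file.

    Assumption: the suffix is appended to the full data feed file name.
    """
    return data_feed_file_name + ".suc", data_feed_file_name + ".err"


def classify_status_feed(status_feed_name):
    """Map a status feed name to 'success', 'error', or None if unrecognized."""
    if status_feed_name.endswith(".suc"):
        return "success"
    if status_feed_name.endswith(".err"):
        return "error"
    return None


# "feed_001.xml" is a made-up file name used only for illustration.
success_name, error_name = status_feed_names("feed_001.xml")
print(success_name)                      # feed_001.xml.suc
print(classify_status_feed(error_name))  # error
```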
Tip:
To index multibyte character sets, set the default character set of the crawler to UTF-8 regardless of the character set of Oracle Content Server. See "Modifying the Crawler Parameters".
Table 7-18 Oracle Content Server Source Parameters (Step 1)
Parameter | Value |
---|---|
Configuration URL | URL of the XML configuration file providing details of the source, such as the data feed type, location, security attributes, and so on. Obtain the location of the file from the Oracle Content Server administrator. Use the following format to enter the configuration URL: |
Authentication Type | Java authentication type. Set this parameter when the data feeds are accessed over HTTP. Enter one of the following values: |
User ID | User ID to access the data feeds. The access details of the data feed are specified in the configuration file. Obtain a user ID from the Oracle Content Server administrator. |
Password | Password for User ID. Obtain the password from the Oracle Content Server administrator. |
Realm | Realm of the Oracle Content Server instance. |
Oracle SSO Login URL | URL that protects all OracleAS Single Sign-on applications. Set this parameter when the Authentication Type is ORASSO. |
Oracle SSO Action URL | URL that authenticates OracleAS Single Sign-on user credentials. The login form is submitted to this URL. Set this parameter when Authentication Type is ORASSO. |
Scratch Directory | Directory where Oracle SES can write temporary status logs. The directory must be on the same system where Oracle SES is installed. Optional. |
Maximum number of connection attempts | Maximum number of attempts to connect to the target server for access to the data feed. |
Table 7-19 Oracle Content Server Connector Authorization Parameters (Step 2)
Parameter | Value |
---|---|
HTTP Endpoint for Authorization | HTTP endpoint for Oracle Content Server authorization, such as |
Display URL Prefix | HTTP host information to prefix the partial URL specified in the access URL of the documents in RSS feeds to form the complete URL. This complete URL is displayed as the URL when a user clicks the document link in the Oracle SES search results page. For example, you might display |
Administrator User | Administrative user to access the Authorization Service API of Oracle Content Server. |
Administrator Password | Administrative user password. |
Display Crawled Version | Controls access to the crawled documents: |
Authorization User ID Format | Format of the user ID used by the Oracle Content Server authorization API, such as |
Use Cached User and Role Information to Authorize Results | Controls user authorization: |
User Role Data Source to Cache the Filter | The name of the Oracle Content Server Users source that has crawled the user's SecurityGroup and Account information. |
Authentication Type | Java authentication type. Enter |
Realm | Realm of the Oracle Content Server instance. |
Oracle SSO Login URL | URL that protects all OracleAS Single Sign-on applications. Set this parameter when the Authentication Type is ORASSO. |
Oracle SSO Action URL | URL that authenticates OracleAS Single Sign-on user credentials. The login form is submitted to this URL. Set this parameter when Authentication Type is ORASSO. |
Note:
In previous releases, the base path of Oracle SES was referred to as ORACLE_HOME. In Oracle SES release 11g, the base path is referred to as ORACLE_BASE. This represents the Software Location that you specify at the time of installing Oracle SES.
ORACLE_HOME now refers to the path ORACLE_BASE/seshome.
For more information about ORACLE_BASE, see "Conventions".