Oracle® Ultra Search Administrator's Guide
10g Release 2 (10.2)
This chapter contains the following topics:
Note: Some information in this chapter is generic to all types of Oracle Ultra Search installations, and other information is specific to installing and configuring Oracle Ultra Search with an Oracle Database release.
If you are installing Oracle Ultra Search with the Oracle Application Server release, then refer to Chapter 3, "Using Oracle Ultra Search with Oracle Application Server".
Oracle Ultra Search hardware requirements are based on the amount of data that you plan to process using Oracle Ultra Search. Oracle Ultra Search uses Oracle Text as its indexing engine and the Oracle Database as its repository.
Sufficient RAM: In addition to the resource requirements of the database and the Oracle Text indexing engine, consider the memory requirements of the Oracle Ultra Search crawler. The crawler is a pure Java program. When the crawler is launched, the Java Virtual Machine (JVM) is configured to start with 25MB of memory and grow to 256MB. When crawling very large amounts of data, you might need to adjust these values.
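As a rough illustration of the heap sizes mentioned above, the following Python sketch builds the standard JVM heap options (-Xms for the initial size, -Xmx for the maximum). This guide does not state where the crawler's JVM settings are stored, so the function name and the way the options would be passed to the crawler are assumptions made only for this example.

```python
from typing import List


def jvm_heap_flags(initial_mb: int = 25, max_mb: int = 256) -> List[str]:
    """Return standard JVM heap options for the given sizes in megabytes.

    The defaults mirror the 25MB/256MB values quoted in the text; how the
    crawler actually receives these options is not covered here.
    """
    if initial_mb <= 0 or max_mb < initial_mb:
        raise ValueError("need 0 < initial_mb <= max_mb")
    # -Xms sets the initial heap size, -Xmx sets the maximum heap size.
    return ["-Xms{}m".format(initial_mb), "-Xmx{}m".format(max_mb)]


if __name__ == "__main__":
    print(" ".join(jvm_heap_flags()))
```

For a larger crawl you might call, for example, jvm_heap_flags(64, 512), and then apply the resulting values wherever your installation configures the crawler JVM.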
The Oracle Ultra Search administration tool is a J2EE 1.2 standard Web application. You can install and run it either on the same host as the Oracle Ultra Search backend or on a separate host. You must allocate enough memory for the J2EE engine. Oracle recommends using the Oracle HTTP Server with the Oracle Application Server Containers for J2EE (OC4J). Allocate enough memory for the HTTP Server as well as for the Java Development Kit (JDK) that runs the J2EE engine.
Sufficient Disk Space: As customer requirements vary widely, Oracle cannot recommend a specific amount of disk space. As a general guideline, the minimum requirements are as follows:
3GB of disk space is required for the Oracle Application Server Infrastructure or database and the Oracle Ultra Search backend.
15MB of disk space for the Oracle Ultra Search middle tier on top of the Web server's disk requirements.
For each remote crawler host, 3GB of disk space is required.
Disk space for a large TEMPORARY tablespace depends upon the amount of RAM on the host.
Disk space for the Oracle Ultra Search instance user's tablespace.
The Oracle Ultra Search instance user is a database user that you must create. All data that is collected and processed as part of crawling and indexing is stored in this user's schema.
You should create the tablespace as large as the total amount of data that you want to index. For example, if you estimate that the total amount of data to be crawled and indexed is 10GB, then create a tablespace that is at least 10GB for the Oracle Ultra Search instance user. Make sure to assign that tablespace as the default tablespace of the Oracle Ultra Search instance user.
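The sizing rule above can be sketched as a small Python helper that renders the two SQL statements involved: one that creates a tablespace at least as large as the estimated data volume, and one that makes it the instance user's default tablespace. The tablespace name, datafile path, and function name are hypothetical; only the 10GB example and the default-tablespace requirement come from the text.

```python
import math
import re
from typing import List

# Accept only simple, unquoted SQL identifiers in this sketch.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def tablespace_statements(user: str, tablespace: str,
                          datafile: str, data_gb: float) -> List[str]:
    """Render SQL that sizes a tablespace to the estimated data volume."""
    for name in (user, tablespace):
        if not _IDENTIFIER.match(name):
            raise ValueError("not a simple SQL identifier: {!r}".format(name))
    if "'" in datafile:
        raise ValueError("datafile path must not contain a single quote")
    # Round up to whole gigabytes so the tablespace is never undersized.
    size_gb = max(1, math.ceil(data_gb))
    return [
        "CREATE TABLESPACE {} DATAFILE '{}' SIZE {}G".format(
            tablespace, datafile, size_gb),
        "ALTER USER {} DEFAULT TABLESPACE {}".format(user, tablespace),
    ]


if __name__ == "__main__":
    # Hypothetical names and path, for illustration only.
    for stmt in tablespace_statements(
            "WK_TEST", "US_DATA", "/u01/oradata/us_data01.dbf", 10):
        print(stmt + ";")
```

The helper only generates text; a DBA would still review the statements and run them in SQL*Plus.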
The Oracle Ultra Search backend consists of the following components:
Oracle Ultra Search database schema: Data dictionary and PL/SQL packages.
Oracle Ultra Search crawler: Java program plus supporting files, libraries, and so on.
Oracle Ultra Search remote crawler: Crawler residing on a remote Oracle home.
The Oracle Ultra Search backend is installed as part of the Oracle Database Server installation.
See Also: Oracle Universal Installer Concepts Guide.
The Oracle Ultra Search middle tier includes the following:
Oracle Ultra Search administration tool
Oracle Ultra Search Java query API
Oracle Ultra Search query applications
The Oracle Ultra Search middle tier is installed as part of the Oracle Database Server installation.
This section describes Oracle Ultra Search postinstallation tasks. There are five steps to the postinstallation:
Use the following command to start the Oracle Ultra Search middle tier. You must run this command manually to start the Oracle Ultra Search middle tier after installation.
Use the Oracle Process Manager and Notification Server (OPMN) to start the OC4J_Portal instance. For example:
$ORACLE_HOME/opmn/bin/opmnctl startproc instancename=OC4J_Portal
The Oracle Ultra Search installer creates a default Oracle Ultra Search instance based on the Oracle Ultra Search test user. You can test the Oracle Ultra Search functionality based on the default instance after installation.
The default instance name is WK_INST. It is created based on the database user WK_TEST. The default user password is
For security purposes, WK_TEST is locked after installation. The administrator should log on to the database as DBA, unlock the WK_TEST user account, and set the password to be WK_TEST. The password expires after the installation. If the password is changed to anything other than WK_TEST, then you must also update the cached schema password using the administration tool in the Edit Instance page after you change the password in the database.
The default instance is also used by the Oracle Ultra Search query application. Make sure to update the
Caution:Storing clear text passwords in
The data-sources.xml file is located in the $ORACLE_HOME/oc4j/j2ee/OC4J_SEARCH/config directory. Under the <data-sources> tag, add the following:
<data-source
  class="oracle.jdbc.pool.OracleConnectionCacheImpl"
  name="UltraSearchDS"
  location="jdbc/UltraSearchPooledDS"
  username="username"
  password="password"
  url="jdbc:oracle:thin:@database_host:oracle_port:oracle_sid"
/>
In the preceding syntax, the following variables were used:
username and password are the Oracle Ultra Search instance owner's database user name and password.
database_host is the host name of the back end database computer.
oracle_port is the port to the user's Oracle Database.
oracle_sid is the SID of the user's Oracle Database.
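To make the URL layout concrete, the following Python sketch assembles a thin-driver JDBC URL from the three variables described above. The function name is an assumption made for this example; the URL pattern itself is the one shown in the data-source entry.

```python
def thin_jdbc_url(database_host: str, oracle_port: int, oracle_sid: str) -> str:
    """Build a JDBC thin URL of the form jdbc:oracle:thin:@host:port:sid."""
    port = int(oracle_port)
    if not 0 < port < 65536:
        raise ValueError("port out of range: {}".format(oracle_port))
    if not database_host or not oracle_sid:
        raise ValueError("host and SID must be non-empty")
    return "jdbc:oracle:thin:@{}:{}:{}".format(database_host, port, oracle_sid)


if __name__ == "__main__":
    print(thin_jdbc_url("localhost", 1521, "isearch"))
```

The resulting string is what goes into the url attribute of the data-source element.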
In addition to user name, password, and JDBC URL, data-sources.xml enables configuration of the connection cache size and the cache scheme.
The following tag specifies the minimum and maximum limits of the cache size, the inactivity time out interval, and the cache scheme.
<data-source
  class="oracle.jdbc.pool.OracleConnectionCacheImpl"
  name="UltraSearchDS"
  location="jdbc/UltraSearchPooledDS"
  username="wk_test"
  password="wk_test"
  url="jdbc:oracle:thin:@localhost:1521:isearch"
  min-connections="3"
  max-connections="30"
  inactivity-timeout="30">
  <property name="cacheScheme" value="1"/>
</data-source>
If you are adding the data source to the default Oracle Ultra Search instance user
WK_TEST, then make sure to unlock
Note:The URL of the JDBC data source can be provided in the form of
See Also: "Unlock WK_TEST"
There are three values for the caching schemes:
1 = DYNAMIC_SCHEME
2 = FIXED_WAIT_SCHEME
3 = FIXED_RETURN_NULL_SCHEME
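As a sanity check before restarting the middle tier, a data-source entry can be parsed and its cache settings decoded. The Python sketch below uses only the standard library; the function name and the returned dictionary keys are assumptions for this example, while the attribute names and the scheme numbering come from the text above.

```python
import xml.etree.ElementTree as ET

# Numeric cacheScheme values and their names, as listed in this section.
CACHE_SCHEMES = {
    1: "DYNAMIC_SCHEME",
    2: "FIXED_WAIT_SCHEME",
    3: "FIXED_RETURN_NULL_SCHEME",
}

SAMPLE = """<data-source class="oracle.jdbc.pool.OracleConnectionCacheImpl"
  name="UltraSearchDS" location="jdbc/UltraSearchPooledDS"
  username="wk_test" password="wk_test"
  url="jdbc:oracle:thin:@localhost:1521:isearch"
  min-connections="3" max-connections="30" inactivity-timeout="30">
  <property name="cacheScheme" value="1"/>
</data-source>"""


def describe_data_source(xml_text: str) -> dict:
    """Extract the cache limits, timeout, and scheme name from one entry."""
    root = ET.fromstring(xml_text)
    lowest = int(root.get("min-connections"))
    highest = int(root.get("max-connections"))
    if lowest > highest:
        raise ValueError("min-connections exceeds max-connections")
    scheme = None
    for prop in root.findall("property"):
        if prop.get("name") == "cacheScheme":
            # Raises KeyError for a value outside 1..3.
            scheme = CACHE_SCHEMES[int(prop.get("value"))]
    return {
        "min": lowest,
        "max": highest,
        "timeout": int(root.get("inactivity-timeout")),
        "scheme": scheme,
    }


if __name__ == "__main__":
    print(describe_data_source(SAMPLE))
```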
For the Database release, restart the Oracle Ultra Search middle tier. For example:
For the Oracle Application Server release, use Oracle Process Manager and Notification Server (OPMN) to start the OC4J_Portal instance. For example:
$ORACLE_HOME/opmn/bin/opmnctl startproc instancename=OC4J_Portal
If the database character set is changed after installation, you must reconfigure the Oracle Ultra Search backend to adapt to the new character set.
wk0prefcheck.sql is run by the WKSYS user to reconfigure the default cache character set and index preferences.
wk0idxcheck.sql is needed for reconfiguring instances created before the database character set change (for example, the default instance). This script must be run by the instance owner, and wk0prefcheck.sql must be run first, because wk0idxcheck.sql depends on the reconfigured default settings that wk0prefcheck.sql generates.
wk0idxcheck.sql also drops and recreates the Oracle Text index used by Oracle Ultra Search. If data sources have already been indexed, then you must force a recrawl of all of the data sources.
wk0idxcheck.sql must be run once for each instance. For example, if there are two instances, inst1 and inst2, owned by owner1 and owner2, respectively, then wk0idxcheck.sql should be run twice, once by owner1 and once by owner2.
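The ordering rule (wk0prefcheck.sql once as WKSYS, then wk0idxcheck.sql once for each instance as its owner) can be sketched as a simple plan generator. The Python below is illustrative only: the function name and tuple layout are assumptions, and it does not connect to a database or run any script.

```python
from typing import Dict, List, Optional, Tuple

Step = Tuple[str, str, Optional[str]]  # (run as user, script, instance name)


def reconfiguration_plan(instances: Dict[str, str]) -> List[Step]:
    """Return the ordered steps for a character set change.

    `instances` maps each Oracle Ultra Search instance name to its owner.
    """
    # wk0prefcheck.sql must come first and is run by WKSYS.
    plan: List[Step] = [("WKSYS", "wk0prefcheck.sql", None)]
    # wk0idxcheck.sql then runs once per instance, as the instance owner.
    for instance_name, owner in instances.items():
        plan.append((owner, "wk0idxcheck.sql", instance_name))
    return plan


if __name__ == "__main__":
    for user, script, instance in reconfiguration_plan(
            {"inst1": "owner1", "inst2": "owner2"}):
        target = " for " + instance if instance else ""
        print("as {}: run {}{}".format(user, script, target))
```

After the plan completes, each affected instance still needs a forced recrawl, because wk0idxcheck.sql drops and recreates the index.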
Note: Oracle Ultra Search only supports database character sets supported by Oracle Text. For example, the AL32UTF8 character set is not supported. For Unicode support, use UTF8. For the complete list of supported database character sets, refer to the Oracle Text Reference for lexer types.
To configure the middle-tier and infrastructure to work with OracleAS Metadata Repository after its character set has been changed, do the following:
Modify the character set of all Database Access Descriptors (DADs) accessing the metadata repository to the new database character set.
Using the Application Server Control Console, navigate to the middle-tier instance home page.
In the System Components section, click HTTP_Server.
On the HTTP_Server home page, click Administration.
On the HTTP_Server Administration page, select PL/SQL Properties. This opens the mod_plsql Services page.
Scroll to the DADs section and click the name of the DAD that you want to configure. This opens the Edit DAD page.
In the NLS Language field, type an NLS_LANG value whose character set is the same as the new character set for OracleAS Metadata Repository.
Repeat steps e to g for all DADs accessing OracleAS Metadata Repository.
Reconfigure the Oracle Ultra Search index as follows:
Connect to OracleAS Metadata Repository as WKSYS and invoke the following SQL script to reconfigure the default cache character set and index preference:
Connect to OracleAS Metadata Repository as the default user (WK_TEST) and invoke the following SQL script:
The script prompts you for the instance name (WK_INST). Enter "y" when prompted to go ahead with the change. This script reconfigures the instance (in this case, the default instance). It also truncates the Oracle Text index used by Oracle Ultra Search, so you must force a recrawl to rebuild the index.
Repeat step b for all Oracle Ultra Search instances that were created before you changed the database character set. Invoke the script as the instance owner, and then force a recrawl of all data sources, if necessary.
This section describes how to check whether your installation was successful.
If you log on to the Oracle Ultra Search administration tool successfully, then you have completed the Oracle Ultra Search administration tool configuration process. Do the following to check the Oracle Ultra Search Administration Tool:
Check whether the Web Server is running.
Attempt to log on to the administration tool:
Visit the following URL
In the preceding URL, hostname.domainname is the full name of the host where you have installed the Oracle Ultra Search middle tier, and port is the default Web server port.
Log on to the Oracle Ultra Search administration tool by entering the Oracle Ultra Search instance owner's database user name and password. During the installation of the Oracle Ultra Search backend, a new Oracle Ultra Search instance owner, WK_TEST, is created.
The first time any JSP page is accessed, it takes a few seconds to compile. Subsequent accesses are much faster.
After you verify that the Oracle Ultra Search administration tool is working, you should be able to run the Oracle Ultra Search query applications.
To test the Oracle Ultra Search query applications, do one of the following:
Visit the following URL:
Follow the links in the Oracle Ultra Search welcome page:
Locations for query applications are listed in the following section. Access the query source code by going to the directories list. You can also see a working demonstration of each query JSP page with the URL root, and you can append the correct JSP file name at the end of the URL root.
The query application is shipped as
Portlet is shipped as
This section describes how to troubleshoot Oracle Ultra Search.
Query finds no results
Error when processing binary files
The Oracle Ultra Search crawler uses the Oracle Text INSO filter,
ctxhx, for processing of binary files. These are non-text, non-HTML files such as PDF files, Microsoft Word files, and so on. For Oracle Ultra Search to use the INSO filter, the shared library path environment variable must contain the
During installation, the Oracle Universal Installer automatically sets the variable to include $ORACLE_HOME/ctx/lib. If you restart the database after the installation, then you must manually set your shared library path environment variable to include $ORACLE_HOME/ctx/lib before starting the Oracle process. You must restart the database to pick up the new value for filtering to work.
On UNIX set the
$LD_LIBRARY_PATH environment variable to include
On Windows set the
$PATH environment variable to include
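Before restarting the database, it can help to confirm that the shared library path really contains $ORACLE_HOME/ctx/lib. The following Python sketch performs that check against an environment mapping; the function name is an assumption, and the default variable name LD_LIBRARY_PATH applies to the UNIX case only.

```python
import os
from typing import Mapping


def ctx_lib_on_path(env: Mapping[str, str],
                    path_var: str = "LD_LIBRARY_PATH") -> bool:
    """Return True if $ORACLE_HOME/ctx/lib appears in the given path variable."""
    oracle_home = env.get("ORACLE_HOME")
    if not oracle_home:
        return False
    wanted = os.path.normpath(os.path.join(oracle_home, "ctx", "lib"))
    # Split on the platform's path separator and ignore empty entries.
    entries = [e for e in env.get(path_var, "").split(os.pathsep) if e]
    return any(os.path.normpath(entry) == wanted for entry in entries)


if __name__ == "__main__":
    if ctx_lib_on_path(os.environ):
        print("ctx/lib is on the shared library path")
    else:
        print("ctx/lib is missing; set it before starting the database")
```

The check must run in the same shell environment that later starts the database, because the Oracle process inherits its variables from that shell.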
Error when crawling a file data source
If the globalization setting of the environment that starts the Oracle Database is not compatible with the locale of the target files, then a file-not-found error occurs for files or directories whose names contain CJK characters. This error occurs in a multibyte language environment such as Chinese, Japanese, or Korean, because the crawler relies on the correct locale setting to read operating system files.
To correct this, set the correct locale, restart the Oracle Database, and force Oracle Ultra Search to re-crawl the data source. For example:
Shut down the Oracle Database instance:
SQL> shutdown immediate
Set the locale to 'ja' with the following:
> setenv LANG ja
> setenv LC_ALL ja
Restart the Oracle Database instance:
Restart the Oracle Ultra Search schedule with a forced re-crawl.
Cannot log on to the Oracle Ultra Search administration tool
The ultrasearch.properties file contains configuration information used by the Oracle Ultra Search middle tier. The file is automatically configured by the Oracle Universal Installer, so there is normally no need to edit this file.
With a software or an advanced database installation, however, you must manually configure the Oracle Ultra Search administration tool by editing this file. You must replace %THIN_JDBC_CONN_STR% with a JDBC connection string to the database, and replace %DOMAIN% with the domain name.
Here is an example of the
connection.driver=oracle.jdbc.driver.OracleDriver
#If set, The JDBC connection URL specified here will override the dynamically
#acquired one from Oracle Internet Directory.
#This setting is also used by the query sample (gsearch.jsp)
#Example: connection.url=jdbc:oracle:thin:@<host>:<port>:<sid>
connection.url=%JDBC_CONN_STR%
oracle.net.encryption_client=REQUESTED
oracle.net.encryption_types_client=(RC4_56,DES56C,RC4_40,DES40C)
oracle.net.crypto_checksum_client=REQUESTED
oracle.net.crypto_checksum_types_client=(MD5)
oid.app_entity_cn=m16bi.sgtcnsun03.cn.oracle.com
domain=us.oracle.com
In the preceding example, the following variables were used:
connection.driver specifies the JDBC driver you are using.
connection.url specifies the database to which the middle tier connects. Oracle Ultra Search supports the following formats:
host:port:SID (where host is the full host name of the Oracle Database instance running Oracle Ultra Search, port is the listener port number for the Oracle Database instance, and SID is the Oracle Database instance ID)
HA-aware string (for example, TNS keyword-value syntax)
oracle.net.encryption_client, oracle.net.encryption_types_client, oracle.net.crypto_checksum_client, and oracle.net.crypto_checksum_types_client control the properties of the secure JDBC connection made to the database. Refer to Oracle Database JDBC Developer's Guide and Reference for more information.
oid.app_entity_cn specifies the Oracle Ultra Search middle tier application entity name.
domain specifies the common domain for the Identity Management computer and the Oracle Ultra Search middle tier computer. This enables the delegated administrative service (DAS) list of values to work with Internet Explorer. For example, if the Oracle Ultra Search middle tier is in us.oracle.com and the Identity Management computer is in uk.oracle.com, then the common domain is oracle.com.
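The common domain in the example above is the longest shared suffix of the two domain names, compared label by label from the right. A minimal Python sketch of that calculation follows; the function name is an assumption for this example.

```python
def common_domain(domain_a: str, domain_b: str) -> str:
    """Return the longest shared trailing part of two DNS names.

    Labels are compared from right to left; an empty string means the
    two names share no suffix at all.
    """
    labels_a = domain_a.lower().strip(".").split(".")
    labels_b = domain_b.lower().strip(".").split(".")
    shared = []
    for left, right in zip(reversed(labels_a), reversed(labels_b)):
        if left != right:
            break
        shared.append(left)
    return ".".join(reversed(shared))


if __name__ == "__main__":
    print(common_domain("us.oracle.com", "uk.oracle.com"))
```

The value computed this way is what you would assign to the domain property.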
Note: You do not need to configure the JDBC connect string in the
The Oracle Ultra Search remote crawler enables multiple crawlers to run in parallel on different hosts. However, all remote crawler hosts must share common resources, such as common directories and a common Oracle Ultra Search database.
The Oracle Ultra Search remote crawler is part of the Oracle Ultra Search backend. The crawler installation procedure is similar to the Oracle Ultra Search backend installation.
On each remote crawler host, the Oracle Ultra Search backend is installed under a common directory known as
ORACLE_HOME. The remote
ORACLE_HOME directory is referred to as
If you have not installed the Oracle HTTP Server during the Oracle Application Server installation, then you must perform the following steps manually for remote crawling:
Locate the file that defines the environment.
Replace %ORACLE_HOME% with the value of the REMOTE_ORACLE_HOME environment variable.
Replace %s_jreLocation% with the directory path of a Java runtime environment (JRE) version 1.2.2 or higher. You should specify the root directory of the JRE.
The mechanisms of communication are RMI and JDBC. Configuration of the remote crawler differs depending on which mechanism you use. The JDBC-based mechanism requires you to provide a database user (or role) during the registration process.
See Also: "Using the Remote Crawler" for more information on the RMI and JDBC mechanisms
The registration process is done by running a SQL script on the Oracle Ultra Search remote crawler host. The SQL script connects over SQL*Plus to the Oracle backend database and registers the remote crawler host.
Locate the correct
The Oracle Ultra Search middle tier is installed under a common directory known as ORACLE_HOME. If you have installed other Oracle products prior to the Oracle Ultra Search middle tier, then you may have multiple ORACLE_HOME directories on your host. The registration script requires that you enter the ORACLE_HOME directory in which the Oracle Ultra Search middle tier is installed.
You must run the registration script as the WKSYS super-user or as a database user who has been granted super-user privileges.
Be sure to run the correct version of SQL*Plus, because multiple versions can reside on the same host if you have installed some other Oracle products. On UNIX platforms, make sure that the correct values for
TNS_ADMIN variables are set. On Windows, choose the correct menu item from the Start menu.
After you have identified how to run the correct SQL*Plus client, you must log on to the Oracle Ultra Search database. To do this, you might need to configure an Oracle Net service setting for the Oracle Ultra Search database.
After SQL*Plus is running, log on to the database using the schema and password that you located in Step 2.
Run the registration script.
Start up SQL*Plus as the WKSYS super-user and enter the following:
The registration script for RMI-based remote crawling is the following:
For example, if the value for $REMOTE_ORACLE_HOME on a UNIX host is /home/oracle10g, then enter the following at the SQL*Plus prompt to register an RMI-based remote crawler:
The RMI-based registration script prompts you for three variables:
RMI_HOSTNAME: The remote hostname. This is where the RMI registry/daemon will run.
RMI_REGISTRY_PORT: The port that the RMI registry is listening on.
ORACLE_HOME: The Oracle home located in Step 1.
For example, /u01/oracle10g on a UNIX host or d:/u01/oracle10g on a Windows host. Remember to use forward slashes for Windows hosts.
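Because the prompt expects forward slashes even on Windows, a path copied from Windows Explorer needs converting first. The short Python sketch below shows one way to do this; the function name is an assumption, and the example paths are the ones used in this section.

```python
def oracle_home_for_registration(path: str) -> str:
    """Convert a Windows-style path to the forward-slash form.

    UNIX paths contain no backslashes and are returned unchanged.
    """
    return path.replace("\\", "/")


if __name__ == "__main__":
    # Raw strings keep the backslashes literal in Python source.
    print(oracle_home_for_registration(r"d:\u01\oracle10g"))
    print(oracle_home_for_registration("/u01/oracle10g"))
```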
The registration script for JDBC-based remote crawling is the following:
For example, if you are running SQL*Plus on Windows, and $REMOTE_ORACLE_HOME is in d:\Oracle\Oracle10g, then enter the following at the SQL*Plus prompt to register a JDBC-based remote crawler:
The JDBC-based registration script prompts for three variables:
LAUNCHER_NAME: An arbitrary string used to identify a JDBC-based remote crawler launcher, which is needed when you start up the JDBC-based remote crawler launcher.
CONNECTUSER: The database user (or role) that the JDBC-based remote crawler launcher will use to establish a database connection and listen for launch events.
ORACLE_HOME: The Oracle home located in Step 1.
The registration script invokes the
wk_crw.register_remote_crawler PL/SQL API. The
ORACLE_HOME variables are used to compose arguments for the
wk_crw.register_remote_crawler API. You may optionally choose to call this API directly, especially if you need to register multiple remote crawlers programmatically.
Verify and complete the remote crawler profile configuration. Be sure to enter the correct values for both variables. To verify that the registration has completed correctly, do the following:
Log on to the Oracle Ultra Search administration tool.
Click the Remote Crawler Profiles tab in the Crawler tab. You should see the remote crawler launcher you registered in the remote crawler profile list.
For RMI-based remote crawlers, you will see the host:port combination that uniquely identifies the RMI-subsystem.
For JDBC-based remote crawlers, you will see the Launcher name.
Click Edit to complete the configuration process for the remote crawler profile.
See Also: Oracle Net Services Administrator's Guide for information on how to configure a service setting
If you enter wrong values for the
sql script, then you need to unregister the remote crawler using the
sql script. Run the unregister script the same way you ran the registration script. The
sql script calls the
unregister_remote_crawler PL/SQL API. After you have successfully unregistered the remote crawler, you can rerun the
Before you upgrade, log on to the Oracle Ultra Search administration tool. Stop and disable all crawler synchronization schedules in every Oracle Ultra Search instance. You can enable all crawler synchronization schedules after the upgrade.
See Also: "Schedules Page" for details on how to stop and disable the synchronization schedule
To upgrade Oracle Ultra Search shipped with the Oracle Database release, do the following:
Run the Oracle Ultra Search backend upgrade. This includes upgrading the Oracle Ultra Search database schemas and server files. Install the new Oracle software, and run Oracle Database Upgrade Assistant to upgrade the database and Oracle Ultra Search component to the new release. See the Oracle Database Upgrade Guide for details.
Follow the steps in "Installing the Oracle Ultra Search Middle Tier" to install the new Oracle Ultra Search middle tier.
After upgrading to the current release, follow these post-upgrade configuration steps:
ORACLE_SID environment variables to Oracle Database 10g.
Change directories to
Run the following statement:
sqlplus "sys/password as sysdba"
Run the following statement:
@wk0config.sql WKSYSPW JDBC_CONNSTR LAUNCH_ANYWHERE NET_SERVICE_NAME
In the preceding statement, the following parameters were used:
WKSYSPW is the password for the
JDBC_CONNSTR is the JDBC connection string. Use the format hostname:port:sid, such as machine1:1521:iasdb, if the database is not in an Oracle Real Application Clusters environment.
If the database is in an Oracle Real Application Clusters environment, then use the TNS keyword-value format instead, because it enables connection to any node of the system:
(DESCRIPTION=(LOAD_BALANCE=yes) (ADDRESS_LIST= (ADDRESS=(PROTOCOL=TCP)(HOST=cls02a)(PORT=3001)) (ADDRESS=(PROTOCOL=TCP)(HOST=cls02b)(PORT=3001))) (CONNECT_DATA=(SERVICE_NAME=sales.us.acme.com)))
In the preceding syntax, the following parameters were used:
LAUNCH_ANYWHERE is the mode of the database. Setting it to TRUE indicates that the database is in Oracle Real Application Clusters mode; setting it to FALSE indicates that it is not.
NET_SERVICE_NAME is the network service name used by wk0config.sql to establish the database connection. Setting it to "" (empty string) while running wk0config.sql from the database host eliminates the need to specify the network service name.
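The four positional parameters described above can be assembled mechanically. The following Python sketch builds the SQL*Plus line from them; the function name and the rule for when to quote the connection string are assumptions made for this example, and the TRUE/FALSE choice follows the LAUNCH_ANYWHERE description above.

```python
def wk0config_command(wksys_password: str, jdbc_connstr: str,
                      rac: bool, net_service_name: str = "") -> str:
    """Compose the @wk0config.sql line for SQL*Plus.

    rac=True selects LAUNCH_ANYWHERE=TRUE (Oracle Real Application
    Clusters); an empty net_service_name is passed as "".
    """
    launch_anywhere = "TRUE" if rac else "FALSE"
    # A TNS keyword-value descriptor contains spaces or parentheses, so it
    # is wrapped in double quotes to reach the script as one argument.
    if any(ch in jdbc_connstr for ch in " ()"):
        jdbc_connstr = '"{}"'.format(jdbc_connstr)
    service = net_service_name if net_service_name else '""'
    return "@wk0config.sql {} {} {} {}".format(
        wksys_password, jdbc_connstr, launch_anywhere, service)


if __name__ == "__main__":
    print(wk0config_command("welcome1", "machine1:1521:iasdb", rac=False))
```

Passing the password as a script argument exposes it in shell history and process listings, so treat the generated line as something to paste into an interactive session rather than store in a file.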
The following is an example of the post-upgrade script for a non-Oracle Real Application Cluster environment:
@wk0config.sql welcome1 machine:1521:iasdb FALSE ""
The following is an example of the post-upgrade script for an Oracle Real Application Clusters environment:
@wk0config.sql welcome1 "(DESCRIPTION=(LOAD_BALANCE=yes) (ADDRESS_LIST= (ADDRESS=(PROTOCOL=TCP)(HOST=cls02a)(PORT=3001)) (ADDRESS=(PROTOCOL=TCP)(HOST=cls02b)(PORT=3001))) (CONNECT_DATA=(SERVICE_NAME=sales.us.acme.com)))" TRUE ""