Oracle® Secure Enterprise Search Administrator's Guide
11g Release 2 (11.2.2)
Part Number E23427-01
Oracle SES contains features that you can tune to optimize search performance. This section contains suggestions on how to improve performance (such as response time and throughput) and scalability of Oracle SES. It identifies the most common ways to improve search quality.
Suggested links enable you to direct users to a designated Web site for particular query keywords. For example, when users search for "Oracle Secure Enterprise Search documentation" or "Enterprise Search documentation" or "Search documentation", you could suggest
Suggested link keywords are rules that determine which suggested links are returned (as suggestions) for a query. A rule can include query terms and logical operators, for example "secure AND search". With this rule, the corresponding suggested link is returned for the query "secure enterprise search", but it is not returned for the query "secure database".
The rule language used for the indexed queries supports the following operators:
Table 10-1 Suggested Link Keyword Operators
|Operator||Example|
|AND||dog and cat|
|NEAR||dog ; cat|
|OR||dog or cat|
Note:Do not use special characters, such as #, $, =, and &, in keywords.
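The way a keyword rule selects queries can be illustrated with a small sketch. It is not the Oracle SES rule engine: it handles only the AND and OR operators, assumes AND binds more tightly than OR, and treats each operand as a single word. The function name `rule_matches` is hypothetical.

```python
import re


def rule_matches(rule: str, query: str) -> bool:
    """Return True if the query satisfies a rule built from AND / OR only.

    Illustrative assumption: AND binds tighter than OR, and every operand
    is a single word. NEAR (;) and other operators are not handled.
    """
    query_terms = set(re.findall(r"\w+", query.lower()))
    # The rule matches if any OR-branch has all of its AND-terms in the query.
    for branch in re.split(r"\s+or\s+", rule.strip(), flags=re.IGNORECASE):
        required = [
            term.strip().lower()
            for term in re.split(r"\s+and\s+", branch.strip(), flags=re.IGNORECASE)
        ]
        if all(term in query_terms for term in required):
            return True
    return False


if __name__ == "__main__":
    print(rule_matches("secure AND search", "secure enterprise search"))
    print(rule_matches("secure AND search", "secure database"))
```

With the rule "secure AND search", the first query contains both terms and matches, while "secure database" lacks "search" and does not.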
Suggested links appear at the top of the search result list. Oracle SES can display up to two suggested links for each query.
This feature is especially useful for providing links to important Web pages that are not crawled by Oracle Secure Enterprise Search. Add or edit suggested links on the Search - Suggested Links page in the Oracle SES Administration GUI.
By tuning the security filter settings, you can prevent timeouts.
To change the configuration of the security filter:
Log in to the Oracle SES Administration GUI.
Click the Global Settings tab, then Query Configuration.
Scroll down to Security Filter Configuration and change these settings using the guidelines provided in the Help.
Security Filter Lifespan
Minimum Number of Threads
Maximum Number of Threads
You can further tune the security filter by using the Administration API to set the <search:preserveStaleSecurityFilterOnError> and <search:securityFilterRefreshWaitTimeout> parameters in the queryConfig object. For example, these settings allow an expired security filter to be used immediately when a fresh security filter is unavailable:
<search:preserveStaleSecurityFilterOnError>true</search:preserveStaleSecurityFilterOnError>
<search:securityFilterRefreshWaitTimeout>0</search:securityFilterRefreshWaitTimeout>
The settings listed previously are also parameters of the queryConfig object and can be modified using the API. See the Oracle Secure Enterprise Search Administration API Guide.
Parallel querying significantly improves search performance and facilitates searches of very large data sources. The query architecture is based on Oracle Database partitioning and enhancements in Oracle Text.
To make the best use of this feature, Oracle recommends that you run Oracle SES on a server with a 4-core CPU, with at least 8GB of RAM and multiple fast disk drives.
Parallel querying is automatically implemented on Oracle SES when the partitioning option is enabled. You can specify partitioning only during installation.
To enable partitioning:
Acquire a license for the Oracle Partitioning option.
During installation, answer Yes when the Repository Creation Utility (RCU) asks if you have a partitioning license. Then Oracle Database is installed with partitioning, and Oracle SES automatically supports parallel query.
Database tablespaces are registered as storage areas in Oracle SES. To make optimum use of the parallel querying feature, you should distribute partitioned tablespaces across all physical disks and register them with Oracle SES.
Use the Administration API to manage
Configuring a partition includes listing the storage areas, identifying the partitioning attribute, and updating the partitioning rules. Use the Administration API to manage the
See Also:Oracle Secure Enterprise Search Administration API Guide for information about managing storage areas and partitions
Index fragmentation management allows the search engine index to be updated while Oracle SES is executing searches. This is achieved by temporarily saving index changes to an in-memory index and periodically merging them with the larger disk-based search engine index. This reduces fragmentation and leads to faster response times. Index fragmentation management is implemented automatically on Oracle SES, but it can be tuned by configuring Oracle Text, where you can turn index fragmentation management on and off, and specify the frequency of index merges.
Optimizing the index also reduces fragmentation, and it can significantly increase the speed of searches. Schedule index optimization on a regular basis. Also, optimize the index after the crawler has made substantial updates or if fragmentation is more than 50%. Verify that index optimization is scheduled during off-peak hours. Optimization of a very large index could take several hours.
You can see the fragmentation level and run index optimization on the Global Settings - Index Optimization page in the Oracle SES Administration GUI. Index optimization has these options:
Specify a maximum duration for the index optimization process. The actual time taken for optimization does not exceed this limit, but it can be shorter. A longer optimization time results in a more optimized index. In this mode, the optimization process does not require a large amount of free disk space.
Specifies that the optimization continues until it is finished. Allowing the optimization to complete creates a more compact index and supports better performance than a partial optimization.
In this mode, Oracle SES creates a temporary copy of the index. The required disk space almost equals the current index size. If sufficient free disk space is not available, then the optimization fails. Use the appropriate SQL query shown here to estimate the minimum disk requirement:
Oracle SES Without Partitioning
SELECT SUM(bytes)/1048576 AS "MBytes"
  FROM dba_segments
 WHERE segment_name IN ('DR$EQ$DOC_PATH_IDX$I','DR$EQ$DOC_PATH_IDX$X');
Oracle SES With Partitioning
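-- UNION ALL keeps both rows even when the two maximum sizes are equal;
-- a plain UNION would merge them and halve the estimate.
SELECT SUM(sz) AS "MBytes"
  FROM (
    SELECT MAX(bytes)/1048576 sz
      FROM dba_segments
     WHERE segment_name LIKE 'DR#EQ$DOC_PATH_IDX$%I'
    UNION ALL
    SELECT MAX(bytes)/1048576 sz
      FROM dba_segments
     WHERE segment_name LIKE 'DR#EQ$DOC_PATH_IDX$%X'
  );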
These queries return an estimate of the minimum disk space needed for optimization. Oracle SES may require more disk space than this estimate.
After the optimization is complete, Oracle SES releases the disk space consumed during the optimization. The space can be used by future crawls or any activity that consumes disk space.
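Because the queries return only a minimum, it can help to compare the estimate against the available space with some margin before starting a full optimization. The sketch below shows that comparison. The function name, the `headroom` factor of 1.2, and the way you obtain the free space figure for the storage behind the index are all assumptions, not Oracle SES settings.

```python
BYTES_PER_MB = 1048576  # same divisor the estimate queries use


def has_room_for_full_optimization(
    estimate_mb: float, free_bytes: int, headroom: float = 1.2
) -> bool:
    """Compare the "MBytes" estimate against available free space.

    estimate_mb: value returned by the estimate query.
    free_bytes:  free space on the storage that holds the index
                 (how this is measured is left to the caller).
    headroom:    hypothetical safety factor, since the estimate is a minimum.
    """
    required_bytes = estimate_mb * headroom * BYTES_PER_MB
    return free_bytes >= required_bytes


if __name__ == "__main__":
    # Example: a 1000 MB estimate checked against 2000 MB of free space.
    print(has_room_for_full_optimization(1000, 2000 * BYTES_PER_MB))
```

If the check fails, a time-limited optimization is the safer choice because it does not need a temporary copy of the index.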
To improve indexing performance, adjust the following parameters on the Global Settings - Set Indexing Parameters page of the Oracle SES Administration GUI:
When the crawled data in the cache directory reaches the Indexing Batch Size, Oracle SES starts indexing. The bigger the batch size, the longer it takes to start indexing each batch. Only indexed data can be searched; data in the cache cannot. The default size is 250M.
Document fetching and indexing run concurrently. While indexing is running, the Oracle SES crawler continues to fetch documents and store them in the cache directory.
A large amount of indexing memory improves indexing performance because it reduces I/O. It also improves query performance because the index is less fragmented from the start, although a fragmented index can still be optimized later. Set this parameter as high as possible without causing memory paging.
A smaller amount of memory might be useful when indexing progress should be tracked or when run-time memory is scarce. The default size is 275M. In general, increasing the Indexing Memory Size parameter can reduce fragmentation.
See the Home - Statistics page in the Oracle SES Administration GUI for lists of the most popular queries, failed queries, and ineffective queries. This information can lead to the following actions:
Refer users to a particular Web site for failed queries on the Search - Suggested Links page.
Fix common errors that users make in searching on the Search - Alternate Words page.
Make important documents easier to find on the Search - Relevancy Boosting page.
Once daily, Oracle SES automatically summarizes logged queries. If there are a large number of logged queries, the summarizing task can consume significant server resources, which may impact query performance. This issue is most visible in stress tests where several queries are executed every second. In such cases, the recommended solution is to disable the query statistics option.
To disable the query statistics option:
From the Administration GUI Home page, select the Global Settings tab, then click Query Configuration.
Under Query Statistics, select No for the Enable Query Statistics option.
Two methods can help you locate URLs for relevancy boosting: locate by search and manual URL entry.
When Oracle SES is deployed in an Oracle Real Application Clusters (Oracle RAC) environment, the usage profile is typically one of the following:
Small index with a large query load
Large index with a small-to-large query load
A third option, a small index and a small query load, typically operates on a single computer.
The load balancing solutions provided by Oracle RAC and the WebLogic Server are sufficient for this type of Oracle SES deployment. Most or all of the index can reside in memory or the buffer cache. You only need to set up the listeners appropriately for Oracle SES.
To set up the listeners:
Provide a local listener on each Oracle RAC instance.
Do not configure remote listeners.
Oracle recommends dedicated processes over shared processes.
Oracle SES is installed in a WebLogic domain as described in "Secure Search in Oracle Fusion Applications". The default settings for stuck threads can result in slow query performance even under a moderate load.
To change the search server configuration:
Log in to the WebLogic console, as described in "Accessing the Oracle WebLogic Server Administration Console".
In the left panel under Change Center, click Lock & Edit.
In the left panel under Domain Structure, expand Environment and click Servers. The Summary of Servers page is displayed in the main panel.
In the Name column, click search_server1. The Settings for search_server1 page is displayed.
Select the Configuration tab.
Configure these settings:
Stuck Thread Max Time: 3600
Stuck Thread Timer Interval: 1800
Repeat these steps for any other search server instances, such as search_server2.
In the left panel under Change Center, click Activate Changes.
To support a large number of simultaneous users, you may need to increase the values of these database initialization parameters:
In Fusion Applications, the Oracle SES middle tier uses connection pooling to communicate with the backend database. The database connection uses dedicated server mode, so that when 10 users run concurrent searches, the database requires 10 user processes.
The crawler also uses several threads, and each thread uses several database connections. You can alter the number of crawler threads on the Home - Sources - Crawling Parameters page of the Oracle SES Administration GUI.
Use the combined estimate of concurrent user processes and crawler threads for the value of PROCESSES. Then modify SESSIONS to a compatible value, typically calculated as 1.1 * PROCESSES.
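The sizing arithmetic can be sketched as follows. The function names are hypothetical, and `connections_per_thread` is an assumed input that you would have to measure, because the text says only that each crawler thread uses several database connections. The 1.1 multiplier is the rule of thumb stated above, rounded up to a whole number.

```python
def estimate_processes(
    concurrent_users: int, crawler_threads: int, connections_per_thread: int
) -> int:
    """Combine search users and crawler connections into one PROCESSES estimate.

    Each concurrent search needs one dedicated server process; each crawler
    thread is assumed to hold connections_per_thread database connections.
    """
    return concurrent_users + crawler_threads * connections_per_thread


def compatible_sessions(processes: int) -> int:
    """Return 1.1 * PROCESSES rounded up, using integer arithmetic."""
    return (11 * processes + 9) // 10


if __name__ == "__main__":
    processes = estimate_processes(
        concurrent_users=10, crawler_threads=5, connections_per_thread=3
    )
    print("PROCESSES:", processes)
    print("SESSIONS:", compatible_sessions(processes))
```

Integer arithmetic is used for the rounding so that a value such as 800 gives exactly 880 without floating-point error. The estimate covers only Oracle SES activity, so leave room for background processes and any other database clients.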
You can monitor the number of open cursors using the statistics stored in the V$SESSTAT dynamic performance view. If the number of open cursors for user sessions frequently approaches the maximum, then you can increase that number.
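One way to judge "frequently approaches the maximum" is to flag sessions whose open cursor count is within some fraction of the limit. The sketch below assumes you have already fetched a per-session count (for example, the 'opened cursors current' statistic from V$SESSTAT) into a dictionary. The function name and the 0.9 threshold are illustrative assumptions.

```python
def sessions_near_cursor_limit(
    open_cursors_by_sid: dict, limit: int, threshold: float = 0.9
) -> list:
    """Return the session IDs whose open cursor count is close to the limit.

    open_cursors_by_sid: mapping of session ID to current open cursor count.
    limit:               the configured maximum number of open cursors.
    threshold:           hypothetical fraction of the limit that counts as "near".
    """
    cutoff = limit * threshold
    return sorted(
        sid for sid, count in open_cursors_by_sid.items() if count >= cutoff
    )


if __name__ == "__main__":
    sample = {101: 290, 102: 40, 103: 300}  # made-up sample values
    print(sessions_near_cursor_limit(sample, limit=300))
```

If the same sessions appear in this list across repeated samples, that suggests the limit is too low for the workload.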
Open SQL*Plus and log in to Oracle Database as a privileged user, such as
For a list of all initialization parameters and their current settings, issue this SQL*Plus command:
Issue ALTER SYSTEM commands, using values appropriate for your system, to change the value of the parameters. For example, this command sets PROCESSES to 800:
ALTER SYSTEM SET processes=800 SCOPE=spfile;
Restart Oracle Database for the new settings to take effect.
Heavy query load should not coincide with heavy crawl activity, especially when there are large-scale changes on the target site. If it does, such as when a crawl is scheduled around the clock, then increase the size of the Oracle UNDO tablespace with the
An Oracle SES search operation looks up the Oracle Text index and some internal tables to generate a hit list. To maintain the best search performance, reduce disk I/O as much as possible by keeping these objects in the buffer cache. If you have plenty of physical memory, you can enlarge the buffer cache so it can retain these objects.
The search operation accesses these database objects the most frequently:
|Object Name||Partitioned Object Name||Object Type|
$X and $R are the most important and are typically smaller than $I. If the database has a large KEEP pool or can support one, consider putting the $X and $R tables in it to maintain good performance when accessing them. While the $I table is also important for search, it can become too large to cache in its entirety.
Check the cache hit ratio for these objects regularly in Enterprise Manager or an Automatic Workload Repository (AWR) report. Crawling and optimization can change the size of these objects.
To put a table in the KEEP pool:
Open SQL*Plus or another SQL interface and connect as a privileged user.
Issue an ALTER INDEX command using this syntax, where table_name is the $R or $X table.
ALTER INDEX table_name STORAGE(BUFFER_POOL KEEP)
Verify the new location of the table:
SELECT buffer_pool FROM dba_indexes WHERE index_name = table_name;
Example 10-1 shows the SQL commands that put the $X table in the KEEP pool.
Example 10-1 Putting DR$EQ$DOC_PATH_IDX$X in the KEEP Pool
SQL> SELECT buffer_pool FROM dba_indexes WHERE index_name='DR$EQ$DOC_PATH_IDX$X';

BUFFER_POOL
---------------------
DEFAULT

SQL> ALTER INDEX dr$eq$doc_path_idx$x STORAGE (BUFFER_POOL KEEP);

Index altered.

SQL> SELECT buffer_pool FROM dba_indexes WHERE index_name='DR$EQ$DOC_PATH_IDX$X';

BUFFER_POOL
---------------------
KEEP
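When several objects need the same treatment, the statement pair from Example 10-1 can be generated rather than typed by hand. The sketch below only builds the SQL text and does not connect to a database. The function name and the identifier check are assumptions added for illustration.

```python
import re

# Conservative check for an unquoted Oracle identifier (letters, digits, _ $ #).
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def keep_pool_statements(index_names):
    """Build the ALTER INDEX and verification statements for each index name."""
    statements = []
    for name in index_names:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple identifier: {name!r}")
        statements.append(f"ALTER INDEX {name} STORAGE (BUFFER_POOL KEEP);")
        # dba_indexes stores unquoted names in uppercase.
        statements.append(
            "SELECT buffer_pool FROM dba_indexes "
            f"WHERE index_name = '{name.upper()}';"
        )
    return statements


if __name__ == "__main__":
    for statement in keep_pool_statements(["dr$eq$doc_path_idx$x"]):
        print(statement)
```

Review the generated statements and run them as a privileged user, as in the manual steps.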