Oracle® Secure Enterprise Search Administrator's Guide 10g Release 1 (10.1.8.2) Part Number E10418-03
This chapter describes new features of Oracle Secure Enterprise Search (SES) 10g Release 1 (10.1.8.2), Release 1 (10.1.8.1), and Release 1 (10.1.8). It also provides pointers to additional information.
This release fixes many bugs from 10.1.8.1, and it also includes the following new features:
New Features for End Users:
Oracle SES adds results clustering, which is the ability to automatically group search results. Oracle SES can cluster by topic or by metadata attribute (for example, author or creation date of documents).
Topic clusters are groups of dynamically-formed subcategories based on the results of each query. Oracle SES displays a tree structure alongside search results to help users refine their queries. For example, searches for "support" on the company network return a set of categories with groups of topics such as "customer support" or "support contacts" to help guide the search.
A new AJAX-based interactive search user interface presents these clusters. It lets users view and manipulate results clusters, browse, and work with a customized results list.
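As a rough illustration of clustering by metadata attribute, the following Python sketch groups a result list by one attribute value. It is not Oracle SES code: the result dictionaries, the `cluster_by_attribute` function, and the sample authors are all assumptions made for the example.

```python
from collections import defaultdict

def cluster_by_attribute(results, attribute):
    """Group result titles by the value of one metadata attribute (hypothetical helper)."""
    clusters = defaultdict(list)
    for doc in results:
        # Documents that lack the attribute fall into a catch-all cluster.
        key = doc.get(attribute, "(unknown)")
        clusters[key].append(doc["title"])
    return dict(clusters)

# Made-up results for a search on "support".
results = [
    {"title": "Support contacts", "author": "jsmith"},
    {"title": "Customer support FAQ", "author": "mlee"},
    {"title": "Support escalation policy", "author": "jsmith"},
]
clusters = cluster_by_attribute(results, "author")
```

Topic clustering is a different problem, because the categories are derived from the result text rather than read from an existing attribute, so this sketch covers only the metadata case.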
Note:
The new 10.1.8.2 query application is certified with Internet Explorer versions 6 and 7 and Firefox versions 1.5 and 2.x. Existing 10.1.8.1 functionality is certified on all Oracle SES-supported browsers through the classic user interface: http://<host>:<port>/search/query/search-classic.jsp
Release 10.1.8.2 also provides enhanced query syntax.
You can search on attributes. For example, the query [DocVersion:>1] returns documents that have the number attribute DocVersion with a value greater than 1.
Several new operators allow for thesaurus operations (for example, synonym terms, narrower terms, broader terms), fuzzy spellings, and wildcard matching. Some of the new operators work through Oracle Text. For example, a thesaurus must be loaded with the Oracle Text ctxload utility before it can be used through the new synonym term operator.
Other operators allow for more complex syntax (for example, expression-like grouping, AND, OR) or allow NEAR term search.
New Features for Developers:
The new Document Service API lets developers implement their own document processing on the content found by the Oracle SES crawler. It provides a hook to plug your own code into the processing pipeline. Perform operations, such as generating and inserting your own metatags or extracting entities like addresses or phone numbers from the content for compliance and auditing. Essentially, you can use this service to build your own customized search engine, while taking advantage of the existing Oracle SES crawler, application, and infrastructure.
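The idea of plugging custom processing into a pipeline can be sketched generically in Python. This is not the Document Service API itself, whose interfaces are described in the referenced chapter. The document dictionary, the `add_phone_metatag` step, the `run_pipeline` driver, and the phone number pattern are all hypothetical.

```python
import re

# Assumed format for the example: NNN-NNN-NNNN.
PHONE_RE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

def add_phone_metatag(document):
    """Hypothetical pipeline step: copy phone numbers found in the content into an attribute."""
    phones = PHONE_RE.findall(document["content"])
    if phones:
        document["attributes"]["phone_numbers"] = sorted(set(phones))
    return document

def run_pipeline(document, steps):
    """Pass the crawled document through each custom step in order."""
    for step in steps:
        document = step(document)
    return document

doc = {"content": "Call 650-555-0100 or 650-555-0199 for help.", "attributes": {}}
processed = run_pipeline(doc, [add_phone_metatag])
```

Keeping each step as a function that takes and returns a document makes it easy to chain several operations, such as entity extraction followed by metatag generation.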
See Also:
"Document Service API"
Developers can now influence relevancy ranking in their Oracle SES instances by changing how document attributes like title and keywords influence rankings.
Oracle SES provides an XML connector framework to crawl any repository that provides an XML interface to its contents.
See Also:
"Overview of XML Connector Framework"
Federated search includes several enhancements. For example, you can selectively route user queries to the Oracle SES endpoint instances based on a user-defined rule. This improves performance and scalability for secure sources where the data can be accessed only by a particular user or group. For example, in an e-mail search deployment, the crawled e-mail data for a particular user is likely located on only one of the several Oracle SES machines. The administrator can define a rule to route user queries to the endpoint instances based on user names. This avoids unnecessary load on the other instances in the federation.
See Also:
"Customizing Federated Sources"
Other New Features:
Release 10.1.8.2 provides the following new or improved connectors:
Oracle SES includes a new connector to Oracle Content Server (formerly known as Stellent Content Server).
Oracle SES includes a new connector to Oracle Mail. One thread finds all the e-mail users in the system from a corporate directory. The crawl threads then work in parallel, each handling the e-mail of one user at a time.
The Microsoft Exchange connector has been enhanced to provide simplified set-up. Most importantly, it eliminates the need for installing an Oracle SES-specific agent on the Exchange server side by using WebDAV as the underlying protocol.
See Also:
"Setting Up Microsoft Exchange Sources"
OracleAS Portal: Oracle SES provides an option in the crawler.dat file to turn on smart incremental crawling. This makes re-crawls more efficient by getting a list of changed pages and items directly from OracleAS Portal.
IBM Lotus Notes: The Lotus Notes connector now lets you enable or disable multiple attachment support.
See Also:
"Setting Up Lotus Notes Sources"
WSRP Portlet configuration has been simplified. Digital certificates for safeguarding communication with Portal instances are now optional.
In 10.1.8.2, the crawler can detect the character set of plain text and XML documents when the repository does not specify it. The crawler needs the correct character set to index a document properly.
This release fixes many bugs from 10.1.8, and it also includes the following new features:
Oracle SES includes a new Database connector built on JDBC, so you can crawl any JDBC-enabled database. This source type provides additional security on the row level.
Oracle SES includes a new Oracle E-Business Suite 12 connector based on application data available as XML feeds.
Oracle SES now provides identity plug-ins for OpenLDAP release 2.2 and 2.3 and Sun Java System Directory Server release 5.1 and 5.2.
See Also:
"Secure Search Options"
OracleAS Portal users can register the Oracle SES WSRP portlet (or, secure portlet) from their Portal pages. This requires OracleAS Portal 10.1.4.
The Oracle Content Database plug-in now supports Web services authentication when using Oracle Content Database release 10.1.3.
The new automatic character set detection feature enables the crawler to automatically detect character set information for HTML, plain text, and XML files. Character set detection allows the crawler to properly cache files during crawls, index text, and display files for queries. This is important when crawling multibyte files (such as files in Japanese or Chinese).
See Also:
"Character Set Detection"
Oracle SES provides a new parameter for the crawler configuration file (crawler.dat) that lets you include any multimedia file type you want to crawl. The file name is indexed as the title.
See Also:
"Default Exclusion Rules"
Oracle SES is now certified on Internet Explorer 7.0.
Note:
For release 10.1.8.1, Release Notes are posted only on Oracle Technology Network (OTN). They are not included in the documentation library on the DVD. You must register online before using OTN; registration is free and can be done at
http://www.oracle.com/technology/membership/
If you already have a user name and password for OTN, then you can go directly to the documentation section of OTN at
Out-of-the-box, with no additional coding required, Oracle SES 10.1.8 provides more access than any other enterprise search engine. It can find and verify information in the following:
Files in Microsoft NT File systems (NTFS)
EMC Documentum Content Server DocBases
IBM Lotus Notes databases
FileNet Content Engine object stores
FileNet Image Services libraries
Open Text Livelink
Microsoft Exchange
Oracle SES ships with plug-ins for all these applications. (A plug-in is a software module that adds features to Oracle SES. Some of the new plug-ins require additional licensing.) Oracle SES controls access to private documents and restricts access to specific workgroups based on access control information that it obtains during indexing and stores in its search engine index.
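The effect of storing access control information with each indexed document can be illustrated with a small Python sketch that filters a result list at query time. The data model is invented for the example: the `acl` field, the `visible_results` function, and the convention that a missing ACL means a public document are assumptions, not the Oracle SES index format.

```python
def visible_results(results, user, user_groups):
    """Keep only the documents that the user, or one of the user's groups, may see."""
    principals = {user} | set(user_groups)
    visible = []
    for doc in results:
        acl = doc.get("acl")
        # Assumption: a document without an ACL is public.
        if acl is None or principals & set(acl):
            visible.append(doc["title"])
    return visible

# Made-up index entries with the ACL captured at indexing time.
indexed = [
    {"title": "Cafeteria menu"},
    {"title": "Payroll summary", "acl": ["finance"]},
    {"title": "Review notes for jdoe", "acl": ["jdoe", "hr"]},
]
jdoe_sees = visible_results(indexed, "jdoe", ["engineering"])
```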
Oracle SES also searches across a number of Oracle sources: OracleAS Portal, Oracle Collaboration Suite Content Services and Calendar, Oracle Content Database, selected modules of Oracle E-Business Suite, and Oracle Siebel.
Oracle SES is now directly integrated with access control and identity management solutions. No synchronization with Oracle Internet Directory is necessary for Oracle SES to ensure access control. Oracle SES can directly access Active Directory (no extra coding required) through new identity plug-in and authorization APIs. Oracle SES ships plug-ins for Oracle Internet Directory and Microsoft Active Directory, among others.
The new suggested content feature lets you index and display real-time content along with the search results. A style sheet can be applied to the content before it is displayed in the search result list.
In addition to the existing Query Web Service API, Oracle SES now includes an Admin Web Service API. This API lets you perform a subset of administrative actions, such as starting and stopping a crawler schedule or getting the index fragmentation level. The Admin Web service is located at the following URL: http://host:port/search/ws/admin/SearchAdmin
See Also:
Oracle Secure Enterprise Search Java API Reference
The "Web Services Interface" section in the Oracle SES administration tutorial:
http://st-curriculum.oracle.com/tutorial/SESAdminTutorial/index.htm
Other improvements include a simplified method for configuring secure search with OracleAS Single Sign-On, a title fallback feature that can later override default document titles picked up during crawling with a more meaningful title, simpler configuration of federated sources, and case-insensitive relevancy boosting (documents with "Oracle" are boosted when you enter "oracle").
Upgrade from Oracle SES Release 1 (10.1.6) is supported.