Oracle® Secure Enterprise Search Administrator's Guide 10g Release 1 (10.1.8.2) Part Number E10418-03
This chapter contains the following topics:
Oracle Secure Enterprise Search (SES) is a complete stacked application. As part of the Oracle SES installation, Oracle Database 10g Release 1 (10.1.0.5) Enterprise Edition (EE) is installed. Use of the Oracle Database EE is restricted to storing and managing the search index, metadata, cache, and Oracle SES configuration information. Oracle Application Server Containers for J2EE (OC4J) is also included with Oracle SES. This embedded version is provided solely to run the Oracle SES user interfaces and APIs.
Use of the software in the Oracle SES home is restricted to supporting the Oracle SES database repository. No other databases created using the Oracle SES executables are supported. Oracle SES connectors listed on the Oracle price list may be licensed separately for use in conjunction with the Oracle SES installation.
Oracle Secure Enterprise Search enables a secure, high quality, easy-to-use search across all enterprise information assets. Key features include:
The ability to search and locate public, private, and shared content across intranet Web servers, databases, files on local disk or on file servers, IMAP e-mail, document management systems, applications, and portals
Highly secure crawling, indexing, and searching
A simple, intuitive search interface leading to an excellent user experience
Excellent search quality, with the most relevant items for a query shown first, even when the query spans diverse public and private data sources
Analytics on search results and understanding of usage patterns
Sub-second query performance
Ease of administration and maintenance leveraging your existing IT expertise
See Also:
Oracle Secure Enterprise Search Installation Guide for requirements and tips and information on how to get started using Oracle SES
The Oracle SES home page at http://www.oracle.com/technology/products/oses/index.html for updated information on known issues, as well as code samples and best practices. The Oracle Secure Enterprise Search Release Notes on OTN have version information and known issues.
A collection of information is called a source. Each source has a type, such as Web sites or database tables. Oracle SES provides built-in source types and also provides a published plug-in (or connector) architecture for you to add new types. Multiple Oracle SES instances can share content through the federated source type.
Oracle SES includes the following built-in source types:
Web: A Web source represents the content on a specific Web site. Web sources facilitate maintenance crawling of specific Web sites.
Table: A table source represents content in an Oracle database table or view.
File: A file source is the set of documents that can be accessed through the file protocol.
E-mail: An e-mail source derives its content from e-mails sent to a specific e-mail address. When Oracle SES crawls an e-mail source, it collects e-mail from all folders set up in the e-mail account, including Drafts, Sent Items, and Trash e-mails.
Mailing list: A mailing list source derives its content from e-mails sent to a specific mailing list.
OracleAS Portal: An OracleAS Portal source lets you search across multiple OracleAS Portal repositories, such as Web pages, files on disk, and pages on other OracleAS Portal instances.
Additionally, out-of-the-box, with no additional coding, Oracle SES provides more access than any other enterprise search engine. It can find and verify information in the following repositories:
Files in Microsoft NT file systems (NTFS)
EMC Documentum Content Server
IBM Lotus Notes
FileNet Content Engine
FileNet Image Services
Open Text Livelink
Microsoft Exchange
Oracle E-Business Suite
Siebel
Oracle Content Server (formerly known as Stellent Content Server)
Oracle Content Database
Oracle Calendar
Oracle Mail
This book divides source information into content management source types, collaboration source types, and applications source types.
Note:
Some of the plug-ins shipped with Oracle SES require extra licensing fees. Contact Oracle sales for details.
Individual client libraries may need to be installed (and licensed from the vendor) for some content sources to work. For example, EMC Documentum requires a compatible version of Documentum Foundation Classes (DFC), a Java library, to be installed on the computer running Oracle SES. Oracle SES does not ship with DFC.
See Also:
Chapter 5, "Configuring Access to Content Management Sources"
Oracle Secure Enterprise Search Release Notes on OTN for a list of supported platforms
Oracle SES includes the following components:
Oracle SES uses a crawler to collect data from the sources. The Oracle SES crawler is a Java process activated by a schedule. When activated, the crawler spawns a configurable number of processor threads that fetch information from various sources and index the documents. This index is used for searching sources.
The crawler maps links and analyzes relationships. Whenever it encounters embedded non-HTML or non-textual documents during a crawl, it automatically detects the document type, then filters and indexes the document.
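The crawl-and-index flow described above can be illustrated with a minimal Python sketch. It is not the Oracle SES crawler, which is a Java process. The `DOCUMENTS` dictionary, the `fetch`, `crawl`, and `search` functions, and the example URLs are all hypothetical stand-ins; the sketch only shows a configurable number of worker threads fetching documents and adding them to a shared index that is later used for searching.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import re
import threading

# Hypothetical in-memory "sources"; a real crawler would fetch over
# HTTP, the file protocol, IMAP, JDBC, and so on.
DOCUMENTS = {
    "http://intranet.example.com/a": "Quarterly sales report",
    "http://intranet.example.com/b": "Sales team contact list",
    "file:///shared/policy.txt": "Travel policy and expense report rules",
}


def fetch(url):
    """Stand-in for retrieving one document from a source."""
    return url, DOCUMENTS[url]


def crawl(urls, num_threads=2):
    """Fetch every URL with a pool of threads and build an inverted index."""
    index = defaultdict(set)  # term -> set of URLs that contain the term
    lock = threading.Lock()   # protects the shared index

    def fetch_and_index(url):
        url, text = fetch(url)
        terms = set(re.findall(r"[a-z0-9]+", text.lower()))
        with lock:
            for term in terms:
                index[term].add(url)

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(fetch_and_index, urls))
    return index


def search(index, term):
    """Return the URLs whose text contains the term, in sorted order."""
    return sorted(index.get(term.lower(), set()))
```

For example, `search(crawl(list(DOCUMENTS), num_threads=3), "sales")` returns the two intranet URLs. The thread count is a parameter here because the text describes it as configurable. Document type detection and filtering are omitted.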
Use the Oracle Secure Enterprise Search administration tool to manage and monitor Oracle SES components. For example:
Define sources and crawling scope
Configure the search application
Monitor crawl progress and search quality
Customize search results
See Also:
Oracle SES administration tutorial for help understanding common administrator tasks:
http://st-curriculum.oracle.com/tutorial/SESAdminTutorial/index.htm
Oracle SES administration tool context-sensitive online help
Oracle Secure Enterprise Search provides several APIs. For example, with the Web Services API, you can integrate Oracle SES search capabilities into your search application. You can also customize the default Oracle SES ranking to create a more relevant search result list for your enterprise or configure clustering for customized applications.
The Crawler Plug-in API enables you to create a custom secure crawler plug-in (or connector) to meet your requirements. The Document Service API accepts input from documents and performs some operation on it. For example, you could create a document service for auditing or to show custom metatags.
Information in an enterprise can be spread across Web pages, databases, mail servers or other collaboration software, document repositories, file servers, and desktops. Oracle SES searches all your data through the same interface. Oracle SES is fully globalized and works with many languages including Chinese, Japanese, Korean, Arabic, and Hebrew.
This section introduces a few of the features in Oracle SES. It includes the following topics:
See Also:
Chapter 3, "Understanding Crawling and Searching" for more features relating to the crawler
Much of the information within an organization is publicly accessible. Anyone is allowed to view it. Therefore, it is relatively easy for a crawler to find and index that information.
However, there are other sources that are protected. These protected sources might be viewable only by certain users or groups of users. For example, while users can search in their own e-mail folders, they should not be able to search anyone else's e-mail.
For protected sources, the Oracle SES crawler indexes documents with the proper access control list. When end users perform a search, only documents that they have privileges to view will be returned.
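The idea of storing an access control list with each indexed document and filtering at query time can be sketched as follows. This is an illustrative Python model only, not the Oracle SES implementation. The `IndexedDocument` class, the `secure_search` function, the sample `INDEX`, and the convention that an empty ACL means "public" are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedDocument:
    url: str
    text: str
    # Users or groups allowed to view the document.
    # Assumption for this sketch: an empty ACL means the document is public.
    acl: frozenset = field(default_factory=frozenset)


# Hypothetical index contents: one public page and two protected documents.
INDEX = [
    IndexedDocument("http://intranet.example.com/handbook",
                    "Employee handbook: vacation policy"),
    IndexedDocument("imap://mail.example.com/alice/inbox/1",
                    "Vacation request from Alice", frozenset({"alice"})),
    IndexedDocument("file:///hr/salaries.xls",
                    "Vacation accrual and salary data", frozenset({"hr-group"})),
]


def secure_search(index, term, user, groups=()):
    """Return URLs of matching documents that the user is allowed to view."""
    principals = {user, *groups}
    hits = []
    for doc in index:
        if term.lower() not in doc.text.lower():
            continue  # not a match for the query
        if not doc.acl or doc.acl & principals:
            hits.append(doc.url)  # public, or ACL names the user or a group
    return hits
```

In this model, a search for "vacation" by `alice` returns the handbook and her own e-mail, while the same search by `bob` returns only the handbook.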
See Also:
"Enabling Secure Search"
Oracle SES can search multiple Oracle SES instances with their own document repositories and indexes. It provides a unified framework to search the different repositories that are crawled, indexed, and maintained separately. A federation broker calls the federation endpoint to collect content matching the search criteria for the sources managed at that endpoint.
Federated search allows a single query to be run across all Oracle SES instances. It aggregates the search results to show one unified result list to the user. User credentials are passed along with the query so that each federation endpoint can authenticate the user against its own document repository.
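The broker and endpoint roles can be modeled with a short Python sketch. It is a conceptual illustration under several assumptions, not the Oracle SES federation protocol: the `Endpoint` class and `federated_search` function are hypothetical, authentication is reduced to a membership check, and results are merged by sorting on a precomputed score that is assumed to be comparable across endpoints. The source text does not say how Oracle SES merges results.

```python
from concurrent.futures import ThreadPoolExecutor


class Endpoint:
    """Hypothetical stand-in for one Oracle SES instance with its own index."""

    def __init__(self, name, documents, users):
        self.name = name
        self.documents = documents  # list of (title, relevance score) pairs
        self.users = users          # users this endpoint can authenticate

    def search(self, query, user):
        # Each endpoint authenticates the user against its own repository.
        if user not in self.users:
            return []
        return [
            {"source": self.name, "title": title, "score": score}
            for title, score in self.documents
            if query.lower() in title.lower()
        ]


def federated_search(endpoints, query, user):
    """Broker: send one query to every endpoint and merge the results."""
    if not endpoints:
        return []
    with ThreadPoolExecutor(max_workers=len(endpoints)) as pool:
        result_lists = list(pool.map(lambda ep: ep.search(query, user), endpoints))
    merged = [hit for hits in result_lists for hit in hits]
    # Assumption: scores are comparable, so one sort yields a unified list.
    return sorted(merged, key=lambda hit: hit["score"], reverse=True)


# Sample data for the sketch.
hq = Endpoint("hq", [("Search tuning guide", 0.9), ("Holiday schedule", 0.4)],
              {"alice", "bob"})
emea = Endpoint("emea", [("Search appliance FAQ", 0.7)], {"alice"})
```

Here `federated_search([hq, emea], "search", "alice")` returns hits from both instances in one list, whereas `bob`, who is unknown to the `emea` endpoint, sees only results from `hq`.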
Create a federated source on the Home - Sources page of the Oracle SES administration tool.
The following diagram illustrates Oracle SES federation architecture.
Oracle SES provides an extensible crawler plug-in (or connector) framework that lets you crawl and index proprietary document repositories. The Crawler Plug-in API enables you to create a custom secure crawler plug-in (or connector) to meet your requirements. You can also create an identity plug-in and an authorization plug-in for the repository being crawled.
See Also:
The Oracle Secure Enterprise Search home page at http://www.oracle.com/technology/products/oses/index.html for updated information on known issues, as well as code samples and best practices