1 Introduction to Oracle Secure Enterprise Search

Overview of Oracle Secure Enterprise Search

Oracle Secure Enterprise Search (SES) provides uniform search capabilities over multiple repositories.

Oracle SES uses a crawler to collect data from these sources. The crawler supports a number of built-in source types, as well as a published plug-in (or connector) architecture for adding new types. Multiple Oracle SES instances can also share content through the federated source type.

Oracle SES supports numerous built-in source types:

  • Web: A Web source represents the content on a specific Web site. Web sources facilitate maintenance crawling of specific Web sites.

  • Table: A table source represents content in an Oracle database table or view.

  • File: A file source is the set of documents that can be accessed through the file protocol.

  • E-mail: An e-mail source derives its content from e-mails sent to a specific e-mail address. When Oracle SES crawls an e-mail source, it collects e-mail from all folders set up in the e-mail account, including Drafts, Sent Items, and Trash e-mails.

  • Mailing list: A mailing list source derives its content from e-mails sent to a specific mailing list.

  • OracleAS Portal: An OracleAS Portal source lets you search across multiple OracleAS Portal repositories, such as Web pages, files on disk, and pages on other OracleAS Portal instances.

  • Oracle Calendar: An Oracle Calendar source represents the content in an Oracle Calendar repository. Oracle SES can crawl content (meetings and events) and metadata in Oracle Calendar and provide secure full-text search over an Oracle Calendar repository. You can specify more than one thread to crawl. Deleted items are removed from the index during incremental crawling. You can search based on title, author, start or end date (year, month, day), event type, status, or location.

  • Oracle Content Database: An Oracle Content Database source represents the content in an Oracle Content Database repository.

    Note:

    Oracle Content Database and Oracle Content Services are the same product. This book uses the product name Oracle Content Database to mean Oracle Content Database and Oracle Content Services. Oracle Content Database sources are certified with Oracle Content Database release 10.2 and Oracle Content Services release 10.1.2.3.

  • Oracle Applications (Oracle E-Business Suite 11i and Siebel 8): Search Oracle Applications with an Oracle E-Business Suite 11i source or a Siebel 8 source.

  • Federated: A federated source lets you search secure content across distributed Oracle SES instances.

Additionally, out of the box and with no additional coding required, Oracle SES 10.1.8 provides more access than any other enterprise search engine. The following illustration shows where it can find and verify information:

[Illustration: benri006.gif]

Oracle Secure Enterprise Search Components

Oracle SES includes the following components:

Oracle Secure Enterprise Search Crawler

The Oracle SES crawler is a Java process activated by a set schedule. When activated, the crawler spawns a configurable number of processor threads that fetch information from various sources and index the documents. This index is used for searching sources.
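
The thread model described above can be pictured with a minimal sketch. This is not the Oracle SES crawler or any part of its API: the class `CrawlerSketch`, the method `crawlOnce`, and the placeholder "fetch" step are all hypothetical, and the schedule that triggers each crawl is left out (a `ScheduledExecutorService` could call `crawlOnce` periodically). It only shows a configurable number of worker threads filling a shared index.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Illustrative only: not the Oracle SES crawler or its API. */
class CrawlerSketch {

    /**
     * Runs one crawl: starts a pool of threadCount worker threads,
     * "fetches" each URL, and stores the result in a shared index.
     */
    static Map<String, String> crawlOnce(List<String> urls, int threadCount) {
        Map<String, String> index = new ConcurrentHashMap<>();
        ExecutorService workers = Executors.newFixedThreadPool(threadCount);
        for (String url : urls) {
            // Placeholder fetch: a real crawler would download and filter the document.
            workers.execute(() -> index.put(url, "content of " + url));
        }
        workers.shutdown(); // accept no new tasks, let queued ones finish
        try {
            if (!workers.awaitTermination(30, TimeUnit.SECONDS)) {
                workers.shutdownNow();
                throw new IllegalStateException("crawl did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("crawl interrupted", e);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
                "http://example.com/a", "http://example.com/b", "http://example.com/c");
        Map<String, String> index = crawlOnce(urls, 2); // 2 = hypothetical thread setting
        System.out.println("Indexed " + index.size() + " documents");
    }
}
```

The thread count is passed as a parameter to mirror the "configurable number" of threads; a concurrent map is used so that workers can write to the index without extra locking.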

The crawler maps links and analyzes relationships. Whenever it encounters embedded non-HTML or non-textual documents during crawling, it automatically detects the document type, then filters and indexes the document.

Oracle Secure Enterprise Search Administration Tool

Use the Oracle Secure Enterprise Search administration tool to manage and monitor Oracle SES components. For example:

  • Define sources and crawling scope

  • Configure the search application

  • Monitor crawl progress and search performance

Oracle Secure Enterprise Search APIs and Applications

Oracle Secure Enterprise Search provides several APIs. For example, the Crawler Plug-in API enables you to create a custom secure crawler plug-in (or connector) to meet your requirements. With the Web Services API, you can integrate Oracle SES search capabilities into your search application.

Oracle SES also provides an out-of-the-box search application.

Oracle Secure Enterprise Search Features

Information in an enterprise can be spread across Web pages, databases, mail servers or other collaboration software, document repositories, file servers, and desktops. Oracle SES searches all your data through the same interface. Oracle SES is fully globalized and works with 27 languages including Chinese, Japanese, Korean, Arabic, and Hebrew.

This section introduces a few of the features in Oracle SES.

See Also:

Chapter 3, "Understanding Crawling and Searching" for more features relating to the crawler

Secure Search

Much of the information within an organization is publicly accessible: anyone is allowed to view it, so it is relatively easy for a crawler to find and index.

Other sources, however, are protected and might be viewable only by certain users or groups of users. For example, while users can search in their own e-mail folders, they should not be able to search anyone else's e-mail.

For protected sources, the Oracle SES crawler indexes each document together with its access control list (ACL). When end users perform a search, only the documents that they have privileges to view are returned.
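
As a rough illustration of this idea, the following sketch stores an access control list with each indexed document and filters hits against the searching user's principals. It is not the Oracle SES security model or API: `SecureSearchSketch`, `IndexedDoc`, the title-only matching, and the convention that an empty list means "public" are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: not the Oracle SES security model or its API. */
class SecureSearchSketch {

    /** An indexed document plus the principals (users or groups) allowed to view it. */
    static final class IndexedDoc {
        final String title;
        final Set<String> acl; // assumption: an empty set marks a public document

        IndexedDoc(String title, String... allowedPrincipals) {
            this.title = title;
            this.acl = new HashSet<>(Arrays.asList(allowedPrincipals));
        }

        /** True if the document is public or shares a principal with the user. */
        boolean visibleTo(Set<String> principals) {
            if (acl.isEmpty()) {
                return true;
            }
            for (String p : principals) {
                if (acl.contains(p)) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Returns titles that match the term AND that the user may view. */
    static List<String> search(List<IndexedDoc> index, String term, Set<String> principals) {
        List<String> hits = new ArrayList<>();
        String needle = term.toLowerCase();
        for (IndexedDoc doc : index) {
            if (doc.title.toLowerCase().contains(needle) && doc.visibleTo(principals)) {
                hits.add(doc.title);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<IndexedDoc> index = Arrays.asList(
                new IndexedDoc("Budget overview"),               // public
                new IndexedDoc("Budget mail to alice", "alice"), // alice only
                new IndexedDoc("Budget mail to bob", "bob"));    // bob only
        Set<String> alice = new HashSet<>(Arrays.asList("alice", "staff"));
        System.out.println(search(index, "budget", alice));
    }
}
```

Here the ACL check runs inside the search itself, so a document the user cannot view never appears in the result list, not even as a title.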

Federated Search

Oracle Secure Enterprise Search provides the capability of searching multiple Oracle SES instances with their own document repositories and indexes. It provides a unified framework to search the different document repositories that are crawled, indexed, and maintained separately. A federation broker calls the federation endpoint to collect content matching the search criteria for the sources managed at that endpoint.

Federated search allows a single query to be run across all Oracle SES instances. It aggregates the search results to show one unified result list to the user. User credentials are passed along with the query so that each federation endpoint can authenticate the user against its own document repository.
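
The broker behavior described above can be sketched as follows. This is a simplified model, not the Oracle SES federation protocol: the `Endpoint` interface, the `Hit` class, and the assumption that relevance scores from different instances are directly comparable are all hypothetical. Each endpoint receives the same query and user identity, decides for itself what that user may see, and the broker merges the answers into one list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: not the Oracle SES federation protocol or its API. */
class FederationSketch {

    /** One search result, tagged with the instance that produced it. */
    static final class Hit {
        final String title;
        final double score;
        final String instance;

        Hit(String title, double score, String instance) {
            this.title = title;
            this.score = score;
            this.instance = instance;
        }
    }

    /** Hypothetical stand-in for a remote federation endpoint. */
    interface Endpoint {
        List<Hit> search(String query, String user);
    }

    /** Broker: forwards query and user to every endpoint, then merges by score. */
    static List<Hit> federatedSearch(List<Endpoint> endpoints, String query, String user) {
        List<Hit> merged = new ArrayList<>();
        for (Endpoint endpoint : endpoints) {
            merged.addAll(endpoint.search(query, user));
        }
        // Assumption: scores from different instances can be compared directly.
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return merged;
    }

    public static void main(String[] args) {
        // The "hr" endpoint only answers for alice; the "wiki" endpoint answers everyone.
        Endpoint hr = (q, user) -> "alice".equals(user)
                ? Arrays.asList(new Hit("HR policy", 0.9, "hr"))
                : Collections.<Hit>emptyList();
        Endpoint wiki = (q, user) -> Arrays.asList(new Hit("Policy wiki page", 0.7, "wiki"));

        for (Hit hit : federatedSearch(Arrays.asList(hr, wiki), "policy", "alice")) {
            System.out.println(hit.instance + ": " + hit.title);
        }
    }
}
```

Keeping authentication inside each endpoint, rather than in the broker, matches the point made above that every instance checks the user against its own repository.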

Create a federated source on the Home - Sources page of the Oracle SES administration tool.

The following diagram illustrates Oracle SES federation architecture.

[Illustration: benri005.gif, Oracle SES federation architecture]

Web Services API

Oracle SES offers a Web services API that lets you integrate Oracle SES search capabilities into your search application.

Extensible Crawler Plug-in Framework

Oracle SES provides an extensible crawler plug-in (or connector) framework that lets you crawl and index proprietary document repositories.
