The process of reading sources and creating the search engine index.


An Oracle Secure Enterprise Search program that reads sources to create the search engine index.


Distinguished Name. The unique name of a directory entry in Oracle Internet Directory. It includes all the individual names of the parent entries back to the root. The DN tells you exactly where the entry resides in the directory's hierarchy.
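
As a rough illustration of how a DN encodes an entry's position in the hierarchy, the following Python sketch lists an entry and each of its parents. The DN value and the function name `parent_chain` are hypothetical, and the sketch assumes no escaped commas appear inside attribute values.

```python
def parent_chain(dn: str) -> list[str]:
    """Return the DN itself, followed by the DN of each parent entry."""
    # Assumption: commas only separate name components (no escaped commas).
    parts = [part.strip() for part in dn.split(",") if part.strip()]
    # Dropping leading components one at a time walks up toward the root.
    return [",".join(parts[i:]) for i in range(len(parts))]


# Hypothetical DN used only for illustration.
chain = parent_chain("cn=jdoe,ou=people,dc=example,dc=com")
for entry in chain:
    print(entry)
```

Each printed line is one level closer to the root of the directory.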


Digital Imaging and Communications in Medicine (DICOM) is the predominant medical imaging standard for exchanging digital information between medical imaging equipment (such as radiological imaging) and other systems, ensuring interoperability. DICOM images contain rich metadata about the patient, medical equipment, and other medical information. DICOM is a major feature of the Oracle Multimedia 11g release.


Unit of indexing, returned as one entry in the hit list. For example, a document could be all the collected information about a person from a human resources system.

duplicate documents

Documents that are identical to each other; that is, they have exactly the same size, content, title, and so on.
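
One way to picture exact-duplicate detection is to fingerprint each document and group documents that share a fingerprint. The Python sketch below is an assumption-laden illustration, not a description of how Oracle SES detects duplicates; the function names and sample documents are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(title: str, content: bytes) -> str:
    """Hash the title and content together; identical content implies identical size."""
    digest = hashlib.sha256()
    digest.update(title.encode("utf-8"))
    digest.update(b"\x00")  # separator so title and content cannot run together
    digest.update(content)
    return digest.hexdigest()


def group_duplicates(docs):
    """Return lists of URLs whose documents share the same fingerprint."""
    groups = defaultdict(list)
    for url, title, content in docs:
        groups[fingerprint(title, content)].append(url)
    return [urls for urls in groups.values() if len(urls) > 1]


# Hypothetical sample documents: (url, title, content).
docs = [
    ("http://a.example.com/policy", "Travel Policy", b"Book early."),
    ("http://b.example.com/policy", "Travel Policy", b"Book early."),
    ("http://a.example.com/faq", "FAQ", b"Book early."),
]
dupes = group_duplicates(docs)
print(dupes)
```

The third document has the same content but a different title, so it is not grouped with the first two.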

federated search

Oracle SES provides the capability of searching multiple Oracle SES instances with their own document repositories and indexes. It provides a unified framework to search the different document repositories that are crawled, indexed, and maintained separately. A federation broker calls the federation endpoint to collect content matching the search criteria for the sources managed at that endpoint.

hit list

A list of results for a search.


Data sources that can be browsed based on the source or path of the documents. An infosource hierarchy has this structure: Source Group > host or folder > folder. An infosource hierarchy provides a count of documents at levels below the top (that is, the Source Group level). A user's access to the infosource structure relies on the user's access to documents stored in the structure.

Users can search documents under a particular path or level by selecting the corresponding node in the infosource browse hierarchy.
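
The document counts in such a hierarchy can be sketched as a tally over each document's path. The Python example below assumes each document is described by a Source Group name and a hypothetical `host/folder` path for its containing folder; the names and data are illustrative only.

```python
from collections import Counter


def infosource_counts(docs):
    """Count documents at every node below the Source Group level."""
    counts = Counter()
    for source_group, path in docs:
        segments = [segment for segment in path.split("/") if segment]
        # A document counts toward its folder and every ancestor of that
        # folder, but not toward the Source Group itself.
        for depth in range(1, len(segments) + 1):
            counts[(source_group, *segments[:depth])] += 1
    return counts


# Hypothetical documents: (Source Group, host/folder path).
docs = [
    ("Intranet", "hr.example.com/policies"),
    ("Intranet", "hr.example.com/policies"),
    ("Intranet", "hr.example.com/forms"),
]
counts = infosource_counts(docs)
for node, total in sorted(counts.items()):
    print(" > ".join(node), total)
```

Selecting a node in the browse hierarchy then corresponds to restricting a search to documents whose path starts with that node.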


An Oracle Secure Enterprise Search structure that is updated after a crawl. It is used to improve performance of searches.


The programming API that enables Java applications to access a database through the SQL language. JDBC drivers are written in Java for platform independence but are specific to each database.


Lightweight Directory Access Protocol. A standard for representing and accessing user and group profile information.


A program that breaks the source text into tokens, usually words, in accordance with a specified language.
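
A minimal sketch of the idea, assuming whitespace- and punctuation-delimited text: the Python function below splits text into lowercase word tokens. Real lexers apply language-specific rules (for example, for languages without spaces between words) that this illustration does not attempt; the function name `tokenize` is hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens (letters, digits, underscore)."""
    return [token.lower() for token in re.findall(r"\w+", text)]


tokens = tokenize("Oracle SES crawls Web sites.")
print(tokens)
```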


List of values.

near duplicate documents

Documents that are similar to each other. They may or may not be identical to each other.

Oracle Application Server (Oracle AS)

Oracle's integrated application server:

  • Is standards compliant (J2EE, Web Services, and XML)

  • Delivers a comprehensive set of capabilities, including portal, caching, wireless, integration, and personalization

  • Provides a single, unified platform for Java and J2EE, Web Services, XML, SQL, and PL/SQL

OracleAS Portal

A component of Oracle Application Server used for the development, deployment, administration, and configuration of enterprise class portals. OracleAS Portal incorporates a portal building framework with self-service publishing features to enable you to create and manage information accessed within your portal.

OracleAS Single Sign-On

A component of Oracle Application Server that enables users to log in to all features of the Oracle AS product suite and to other Web applications, using a single user name and password.

OracleAS Web Cache

A component of Oracle Application Server that improves the performance, scalability, and availability of frequently used Web sites. By storing frequently accessed URLs in memory, Oracle Application Server Web Cache eliminates the need to repeatedly process requests for those URLs on the Web server.

Oracle Content Database

A consolidated, database-centric content management application that provides a comprehensive, integrated solution for file and document life cycle management. Oracle Content Database also offers a comprehensive set of Web services that developers can use to build and enhance content management applications. This book uses the product name Oracle Content Database to mean both Oracle Content Database and Oracle Content Services.

Oracle Content Server

Formerly known as Stellent Content Server. Oracle Content Server enables users throughout the organization to contribute content from native desktop applications, manage content through rich library services, publish content to web sites or business applications, and access the content with a browser.

Oracle Content Services

See Oracle Content Database.

Oracle HTTP Server

The Web server component of Oracle Application Server, built on Apache Web server technology and used to service HTTP requests. Also referred to as OHS in this guide.

Oracle Internet Directory

A repository for storing user credentials and group memberships. By default, OracleAS Single Sign-On authenticates user credentials against the information that Oracle Internet Directory stores about dispersed users and network resources.

Oracle Secure Enterprise Search application

Application for searching the Oracle Secure Enterprise Search index.

Oracle WebLogic Server

Oracle WebLogic Server is a fast and reliable server that is used to build and run enterprise applications and services. It is the middle tier server on which Oracle SES operates.


The level of match of the search results to the search string.


The frequency with which each source is crawled.


The process of querying the search engine.


A utility for starting and stopping the search engine.

search metadata

Information about the sources, crawls, and schedules.

secure search

A type of search that only returns results that the user is allowed to view based on access privileges.
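
Conceptually, this amounts to filtering the hit list by the user's access privileges. The Python sketch below assumes each hit carries a set of groups allowed to view it; the field name `allowed_groups`, the function name, and the sample data are hypothetical and do not reflect the product's actual authorization model.

```python
def secure_filter(hits, user_groups):
    """Keep only hits whose allowed groups overlap the user's groups."""
    return [hit for hit in hits if hit["allowed_groups"] & user_groups]


# Hypothetical hit list with per-document access information.
hits = [
    {"url": "http://intranet.example.com/handbook", "allowed_groups": {"employees"}},
    {"url": "http://intranet.example.com/payroll", "allowed_groups": {"hr", "finance"}},
]
visible = secure_filter(hits, {"employees"})
print([hit["url"] for hit in visible])
```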

seed URL

The starting point for a crawl.


Simple Object Access Protocol. A lightweight, XML-based protocol for exchanging information in a decentralized, distributed environment. SOAP supports different styles of information exchange, including Remote Procedure Call (RPC) style and message-oriented exchange.
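
To make the message structure concrete, the following Python sketch builds a SOAP 1.1 envelope for an RPC-style call using only the standard library. The envelope namespace is the standard SOAP 1.1 one; the operation name `doSearch`, its `query` parameter, and the `urn:example:search` namespace are placeholders invented for illustration and are not taken from any real service.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:search"  # hypothetical application namespace


def build_request(query: str) -> str:
    """Build an RPC-style SOAP request whose Body wraps one operation element."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # Hypothetical operation and parameter names.
    operation = ET.SubElement(body, f"{{{APP_NS}}}doSearch")
    ET.SubElement(operation, f"{{{APP_NS}}}query").text = query
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_request("benefits")
print(request_xml)
```

In an RPC-style exchange the Body element names the operation and its parameters, whereas a message-oriented exchange would carry an arbitrary XML document in the Body.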


A source of data to be searched, such as Web sites, files, database tables, content management repositories, collaboration repositories, or applications.


Web Distributed Authoring and Versioning (WebDAV) is a standard protocol used to provide users with a file system interface to Oracle XML repository over the Internet. The most popular way of accessing a WebDAV server folder is through WebFolders on Microsoft Windows. WebDAV is an extension to the HTTP 1.1 protocol. It allows clients to perform remote web content authoring through a coherent set of methods, headers, request body formats, and response body formats. WebDAV provides operations to store and retrieve resources, to create and list contents of resource collections, to lock resources for concurrent access in a coordinated manner, and to set and to retrieve resource properties.


A general purpose XML language for describing the interface, protocol bindings, and deployment details of Web services.