Oracle Ultra Search User's Guide Release 9.0.3 Part Number B10043-01
This section describes Ultra Search new features and provides pointers to additional information.
Although Ultra Search in the Oracle9i Application Server (9iAS) is the same product as Ultra Search in the Oracle9i database, there are a couple of differences:
You can define, edit, or delete your own data sources and types in addition to the ones provided. You might implement your own crawler agent to crawl and index a proprietary document repository, such as Lotus Notes or Documentum, which contain their own databases and interfaces. The proprietary repository is called a user-defined data source. The module that enables the crawler to access the data source is called a crawler agent.
Ultra Search includes fully functional sample query applications to query and display search results. The sample query applications include a sample search portlet. The sample Ultra Search portlet demonstrates how to write a search portlet for use in Oracle 9iAS Portal. This same portlet is installed as a feature of the Oracle 9iAS Portal product.
Ultra Search provides a search portlet that can be embedded in Oracle Portal pages. It is implemented as a Java Server Page application.
The Ultra Search search portlet supports most of the functionality provided by the Query API Complete Sample application.
Oracle Ultra Search offers a flexible API for incorporating search functionality into your portal site. The new features in the query API include the following:
Attribute operators (AND, OR), with control over attribute operator evaluation order
The URL rewriter is a user-supplied Java module that implements the Ultra Search UrlRewriter interface. It is used by the crawler to filter or rewrite extracted URL links before they are put into the URL queue. URL filtering removes unwanted links, and URL rewriting transforms the URL link. This transformation is necessary when access URLs are used.
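The filter-or-rewrite step can be illustrated with a minimal, self-contained sketch. It does not implement the actual Ultra Search UrlRewriter interface, whose method signatures are not given here. The class name, host names, and rules below are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: NOT the Ultra Search UrlRewriter interface.
public class LinkRewriterSketch {
    // Assumed rule: links under this prefix are unwanted and are dropped.
    static final String BLOCKED_PREFIX = "http://www.example.com/private/";
    // Assumed rule: links on the public host are rewritten to an internal host.
    static final String PUBLIC_HOST = "http://www.example.com/";
    static final String INTERNAL_HOST = "http://internal.example.com/";

    /** Returns null to filter the link out, or the (possibly rewritten) link. */
    public static String rewrite(String link) {
        if (link.startsWith(BLOCKED_PREFIX)) {
            return null; // URL filtering: remove the unwanted link
        }
        if (link.startsWith(PUBLIC_HOST)) {
            // URL rewriting: swap the host prefix, keep the rest of the path
            return INTERNAL_HOST + link.substring(PUBLIC_HOST.length());
        }
        return link; // leave all other links unchanged
    }

    /** Applies rewrite() to every extracted link before it is queued. */
    public static List<String> process(List<String> extractedLinks) {
        List<String> queue = new ArrayList<>();
        for (String link : extractedLinks) {
            String result = rewrite(link);
            if (result != null) {
                queue.add(result);
            }
        }
        return queue;
    }
}
```

In this sketch, returning null stands for filtering and returning a different string stands for rewriting. A real rewriter would follow whatever contract the UrlRewriter interface defines.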
Robots exclusion lets you control which parts of your sites can be visited by robots. If robots exclusion is enabled (default), then the Web crawler traverses the pages based on the access policy specified in the Web server robots.txt file. For example, when a robot visits http://www.foobar.com/, it checks for http://www.foobar.com/robots.txt. If it finds it, the crawler analyzes its contents to see if it is allowed to retrieve the document. If you own the Web sites, then you can disable robots exclusions. However, when crawling other Web sites, you should always comply with robots.txt by enabling robots exclusion.
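The robots.txt check described above can be sketched roughly as follows. This is a simplified illustration, not the parser that the Ultra Search crawler uses. It only honors `Disallow` rules in groups addressed to `User-agent: *`, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a simplified robots.txt check.
public class RobotsCheckSketch {
    /** Collects Disallow path prefixes that apply to all robots ("User-agent: *"). */
    public static List<String> disallowedPrefixes(String robotsTxt) {
        List<String> prefixes = new ArrayList<>();
        boolean appliesToAll = false;
        for (String raw : robotsTxt.split("\\r?\\n")) {
            String line = raw;
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // strip trailing comment
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase();
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                appliesToAll = value.equals("*");
            } else if (field.equals("disallow") && appliesToAll && !value.isEmpty()) {
                prefixes.add(value);
            }
        }
        return prefixes;
    }

    /** A path may be retrieved if it matches no disallowed prefix. */
    public static boolean isAllowed(String path, List<String> prefixes) {
        for (String prefix : prefixes) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}
```

The sketch ignores `Allow` rules, wildcards, and groups that name a specific robot, all of which a production crawler would need to consider.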
When gathering information from a database-based Web application, Ultra Search lets you specify a URL to display the retrieved data in a browser. The URL points to a screen in the Web application corresponding to the data in the database. This is available for table data sources, file data sources, and user-defined data sources.
Document attributes, or metadata, describe the properties of a document. Each data source has its own set of document attributes. The value is retrieved during the crawling process and then mapped to one of the search attributes and stored and indexed in the database. This lets you query documents based on their attributes. Document attributes in different data sources can be mapped to the same search attribute. Therefore, you can query documents from multiple data sources based on the same search attribute.
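The idea that document attributes from different data sources can map to one search attribute can be pictured with a small lookup table. The sketch below is purely illustrative. The class, the data source names, and the attribute names are hypothetical and do not reflect how Ultra Search stores its mappings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a document-attribute to search-attribute mapping.
public class AttributeMappingSketch {
    // Key: "dataSource/documentAttribute"; value: search attribute name.
    private final Map<String, String> mapping = new HashMap<>();

    /** Registers that a data source's document attribute feeds a search attribute. */
    public void map(String dataSource, String documentAttribute, String searchAttribute) {
        mapping.put(dataSource + "/" + documentAttribute, searchAttribute);
    }

    /** Returns the mapped search attribute, or null if the attribute is unmapped. */
    public String searchAttributeFor(String dataSource, String documentAttribute) {
        return mapping.get(dataSource + "/" + documentAttribute);
    }
}
```

For example, mapping both a Web source's `dc.creator` and an e-mail source's `From` to a single `Author` search attribute would let one query on `Author` cover documents from both sources.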
The list of values (LOV) for a search attribute can help you specify a search query. If attribute LOV is available, then the crawler registers the LOV definition, which includes attribute value, attribute value display name, and its translation.
Ultra Search provides a command-line tool to load metadata into an Ultra Search database. If you have a large amount of data, this is probably faster than using the HTML-based administration tool. The loader tool supports the following types of metadata:
With document relevance boosting, you can override the search results and influence the order in which documents are ranked in the query result list. This can promote important documents to higher scores and make them easier to find.
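As a rough illustration of the effect of boosting, the sketch below replaces the computed score of selected documents with an administrator-chosen score before sorting. It is an assumption-laden model, not the Ultra Search ranking code. The class name, the score scale, and the tie-breaking rule are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of relevance boosting as a score override.
public class RelevanceBoostSketch {
    /**
     * Ranks the URLs in 'scores' from highest to lowest score.
     * A URL listed in 'boosts' uses its boosted score instead of its computed one.
     */
    public static List<String> rank(Map<String, Integer> scores, Map<String, Integer> boosts) {
        List<String> urls = new ArrayList<>(scores.keySet());
        urls.sort((a, b) -> {
            int scoreA = boosts.getOrDefault(a, scores.get(a));
            int scoreB = boosts.getOrDefault(b, scores.get(b));
            int byScore = Integer.compare(scoreB, scoreA); // descending by score
            return byScore != 0 ? byScore : a.compareTo(b); // ties: alphabetical
        });
        return urls;
    }
}
```

In this model a boost only reorders documents that already match the query. It never adds a document that is absent from the result set.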
For initial planning purposes, you might want the crawler to collect URLs without indexing. After crawling is done, you can examine document URLs and status, remove unwanted documents, and start indexing. You can update the crawling mode to the following:
You can create a read-only snapshot of a master Ultra Search instance. This is useful for query processing or for a backup. You can also make a snapshot instance updatable. This is useful when the master instance is corrupted and you want to use a snapshot as a new master instance.
The Ultra Search administration tool supports three modes of logging on, depending on the type of user. You can log on as:
IAS_ADMIN user
Note: Single sign-on (SSO) is available only with the Oracle9i Application Server (9iAS) release. It is not available with the Oracle9i database release.
In previous releases, the code for the default query syntax expansion implementation was contained in the WK_QUERYEXP PL/SQL package. Now, the Contains query lets you specify a query syntax similar to most internet search engines. The syntax boosts scores for documents that match the user's query in the 'title' StringAttribute.