Oracle® Secure Enterprise Search Administrator's Guide 11g Release 2 (11.2.2) Part Number E23427-01
An OracleAS Portal source enables users to search across multiple portal installations and repositories, such as Web pages, disk files, and pages on other OracleAS Portal instances. Oracle Secure Enterprise Search can securely crawl both public and private OracleAS Portal content.
To create an OracleAS Portal source:
On the Home page, select the Sources secondary tab to display the Sources page.
For Source Type, select OracleAS Portal.
Click Create to display the Create OracleAS Portal Source page.
Complete the following fields. Click Help for additional information.
Source Name: Name that you assign to this OracleAS Portal source.
URL Base: Base URL for OracleAS Portal.
Page Groups: List of page groups in OracleAS Portal retrieved when you click Retrieve Page Groups. Select the ones to crawl.
Click Create & Customize.
Select the Authentication tab.
Select Enable OracleAS Single Sign-On Authentication and enter your credentials.
Click Apply.
Follow the steps for crawling and indexing in "Getting Started Basics for the Administration GUI" for the OracleAS Portal source schedule.
The portal crawler can crawl a subtree under a specific folder or page instead of under an entire portal tree.
To set the boundary rule to crawl a specific folder or page:
On the Home page, click the Sources secondary tab to display the Sources page.
Select a source and click Edit to display the Edit User-Defined Source page.
Click the URL Boundary Rules subtab.
Under Inclusion Rules for the URL, select the starts with rule and enter the value of the PORTAL_PATH for the folder or page.
For example, to crawl only the P2 subtree of a portal tree, enter the path from the root to P2, such as /Proot/P1/P2.
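The effect of a starts with inclusion rule can be illustrated with a short Python sketch. This is not Oracle SES code: the function name `matches_starts_with` and the sample paths other than /Proot/P1/P2 are hypothetical, and the sketch assumes the rule is a literal string-prefix comparison against the PORTAL_PATH value.

```python
def matches_starts_with(portal_path: str, rule_value: str) -> bool:
    """Return True when portal_path begins with the rule value.

    Assumption: the rule is a literal prefix test. A literal prefix
    also matches a sibling such as /Proot/P1/P2x; whether Oracle SES
    behaves this way is not stated in this guide.
    """
    return portal_path.startswith(rule_value)


# Rule value taken from the example in the text.
RULE = "/Proot/P1/P2"

# Hypothetical PORTAL_PATH values for pages in one portal tree.
candidates = [
    "/Proot/P1/P2",
    "/Proot/P1/P2/P3",
    "/Proot/P1",
    "/Proot/P4",
]

# Only P2 and the pages beneath it satisfy the inclusion rule.
crawled = [p for p in candidates if matches_starts_with(p, RULE)]
print(crawled)
```

Under that assumption, only the P2 page and the pages below it are selected. The parent page /Proot/P1 and the unrelated branch /Proot/P4 are left out.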
The crawler picks up key attributes offered by OracleAS Portal, as described in Table 5-1.
Table 5-1 OracleAS Portal Source Attributes
Attribute | Description |
---|---|
createdate | Date the document was created |
creator | User name of the person who created the document |
author | User-editable field in which the user can specify a full name or any other value |
page_path | Hierarchy path of the portal page/item in the portal tree (contains page titles) |
portal_path | Hierarchy path of the portal page/item in the portal tree, used for browsing and boundary rules (contains page names). When searching OracleAS Portal 10.1.2, portal_path appears in uppercase in the browse hierarchy. When searching OracleAS Portal 10.1.4, portal_path appears in lowercase. |
title | Title of the document |
description | Brief description of the document |
keywords | Keywords of the document |
expiredate | Expiration date of the document |
host | Portal host |
infosource | Path of the Portal page in the browse hierarchy |
language | Language of the portal page or item |
lastmodifieddate | Last modified date of the document |
mimetype | Usually 'text/html' for portal content |
perspectives | User-created markers that can be applied to pages or items, such as 'INTERNAL ONLY', 'REVIEWED', or 'DESIGN SPEC'. For example, a Portal containing recipes could have items representing recipes with perspectives such as 'Breakfast', 'Tea', 'Contains Nuts', and 'Healthy', and one particular item could have several perspectives assigned to it. |
wwsbr_name_ | Internal name of the portal page or item |
wwsbr_charset_ | Character set of the portal page or item |
wwsbr_category_ | Category of the portal page or item |
wwsbr_updatedate_ | Date the portal page or item was last updated |
wwsbr_updator_ | Person who last updated the page or item |
wwsbr_subtype_ | Subtype of the portal page/item (for example, container) |
wwsbr_itemtype_ | Portal item type |
wwsbr_mime_type_ | MIME type of the portal page or item |
wwsbr_publishdate_ | Date the portal page or item was published |
wwsbr_version_number_ | Version number of the portal item |
URL boundary rules are not enforced for URL items. A URL item is the metadata that resides on the OracleAS Portal server. Oracle SES does not touch the display URL or the boundary rules for URL items.
The portal_path attribute is used to compare boundary rules. Portal pages and items are organized in a tree structure. When a page is included or excluded, its entire subtree starting with that node is included or excluded.
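The subtree behavior can be sketched in Python as follows. This is an illustration only, not the Oracle SES implementation: the helper names `in_subtree` and `apply_rules` and the sample paths are hypothetical, paths are compared segment by segment, and the sketch assumes that an exclusion rule takes precedence over an inclusion rule.

```python
def in_subtree(path: str, root: str) -> bool:
    """Return True if path is the root node or one of its descendants."""
    path_parts = [segment for segment in path.split("/") if segment]
    root_parts = [segment for segment in root.split("/") if segment]
    # A node is in the subtree when its leading segments equal the root's.
    return path_parts[:len(root_parts)] == root_parts


def apply_rules(paths, include_roots, exclude_roots):
    """Keep paths under an included root and not under an excluded root.

    Assumption: exclusion wins when both kinds of rule match a path.
    """
    kept = []
    for path in paths:
        included = any(in_subtree(path, root) for root in include_roots)
        excluded = any(in_subtree(path, root) for root in exclude_roots)
        if included and not excluded:
            kept.append(path)
    return kept


# Hypothetical portal_path values.
paths = [
    "/Proot/P1",
    "/Proot/P1/P2",
    "/Proot/P1/P2/P3",
    "/Proot/P1/P20",
]

# Include the P1 subtree, then exclude P2 together with everything below it.
result = apply_rules(paths, ["/Proot/P1"], ["/Proot/P1/P2"])
print(result)
```

Excluding /Proot/P1/P2 removes that page and its child P3 in one step, which mirrors the whole-subtree behavior described above. Because this sketch compares whole segments, the sibling /Proot/P1/P20 is not treated as part of the P2 subtree.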
If OracleAS Portal user privileges change, the content the crawler collects might not be properly authorized. For example, suppose that the user specified on the Home - Sources - Authentication page does not have privileges to see certain Portal pages. If those privileges are later granted to the user, subsequent incremental crawls still do not pick up that content. Similarly, if privileges are revoked from the user, the crawler might still pick up the content.
To be certain that Oracle SES has the correct set of documents, whenever a user's privileges change, update the crawler re-crawl policy to Process All Documents on the Home - Schedules - Edit Schedules page, and restart the crawl.