7 Managing Portal Content

This chapter explains how to make content available through the portal's Knowledge Directory and how to manage that content in the portal using portal tools such as filters and crawlers.

It includes the following sections:

About Portal Content
About the Portal Knowledge Directory
Working in the Portal Knowledge Directory
About Document and Object Properties
Working with Properties
About Filters
Working with Filters
About Content Types
Working with Content Types
About Importing Content
Working with Content Web Services
Content Crawlers
Creating or Editing a Snapshot Query

For additional information about Oracle WebCenter Console for Microsoft SharePoint, see

About Portal Content

The portal is designed to enable users to discover all of the enterprise content related to their employee role by browsing or searching portal areas.

Portal users should be able to assemble a My Page that provides access to all of the information they need. For example, to write user documentation, technical writers must be able to assemble a My Page that includes portlet- or community-based access to documentation standards and conventions, solution white papers, product data sheets, product demonstrations, design specifications, release milestones, test plans, and bug reports, as well as mail-thread discussions that are relevant to customer support and satisfaction. Technical writers do not, however, need access to the personnel records that an HR employee or line manager might require, or to the company financial data that the controller or executive staff might need. A properly designed enterprise portal therefore references all of these enterprise documents so that every employee can access the information their role requires, while ensuring that only employees in the appropriate role can discover that information.

Complete the following tasks to enable managed discovery of enterprise content through the portal:

For all file types you plan to support in your portal, configure document properties to store document metadata and to enable document filters used by the Knowledge Directory, content crawlers, the Smart Sort utility, and the Search Service. For details, see About Document and Object Properties.
Configure access to content sources that can be selected by users or content crawlers to add document records to the Knowledge Directory and search index. For details, see About Importing Content.
Configure content crawlers and crawl jobs to create links to back-end content sources, such as internet locations, file system locations, Documentum Content Servers, Exchange Servers, Lotus Notes Servers, or other IMAP-compliant servers. For details, see Content Crawlers.
Allow users at least Edit access to the folders in the Knowledge Directory to which you want them to be able to upload document records.
Configure portlets that users can add to their My Pages. For details, see Chapter 8, "Extending Portal Services with Portlets."
Create communities that users can add to their My Communities list. For details, see Chapter 9, "Providing Content and Services to Users through Communities."
Run a Search Update job to index these documents so that they can be discovered through search. For details, see About the Search Update Job.

Permissions Required for Accessing, Crawling, and Submitting Documents

A user must have several kinds of permissions to view, submit, or crawl documents.

Action	Permissions Needed
Access documents imported into the portal	Read access to the document link in the Knowledge Directory; Read access to the Knowledge Directory folder in which the link is stored; Read access to the content source used to import the document; if the document is not gatewayed, access to the document in the source repository
Crawl documents into the portal	Edit access to the Knowledge Directory folder into which they are crawling documents; Edit access to the administrative folder in which they are creating the content crawler; Select access to the content source; Access Administration activity right; Create Content Crawlers activity right; Select access to a job that can run the content crawler, or Create Jobs activity right plus Edit access to an administrative folder that is registered to an Automation Service
Submit a document into the portal	Edit access to the Knowledge Directory folder into which they are submitting a document; Select access to a content source that supports document submission; if the associated content Web service does not support browsing, knowledge of the path to the document
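
The first row of the table can be read as a conjunction of checks. The following is a minimal sketch in Python, not portal API code: the `AccessRequest` class and its field names are hypothetical, invented only to show how the conditions combine.

```python
from dataclasses import dataclass


@dataclass
class AccessRequest:
    """Hypothetical summary of what one user holds for one imported document."""
    read_on_document_link: bool
    read_on_directory_folder: bool
    read_on_content_source: bool
    gatewayed: bool
    access_in_source_repository: bool


def can_access_document(req: AccessRequest) -> bool:
    """Return True only when every condition in the 'Access documents' row holds."""
    portal_side = (
        req.read_on_document_link
        and req.read_on_directory_folder
        and req.read_on_content_source
    )
    # Access in the source repository matters only when the document is not gatewayed.
    repository_side = req.gatewayed or req.access_in_source_repository
    return portal_side and repository_side
```

In this model, a gatewayed document needs only the three portal-side Read permissions, while a document that is not gatewayed also requires access in the source repository.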

Note:

If you have content sources that access sensitive information, be aware that any user who has access to the content source and the additional permissions listed in the table can access anything that the account the content source impersonates can access. For this reason, you might want to create multiple content sources that access the same repository but use different authentication information, and grant different users access to each.

About the Portal Knowledge Directory

The Knowledge Directory is similar to a file system tree in that documents are organized in folders and subfolders. A folder can contain documents uploaded by users or imported by content crawlers, as well as links to people, portlets, and communities. If your administrator has given you permission, you might also be allowed to add documents to the Knowledge Directory, or submit yourself as an expert on a particular topic.

The default portal installation includes a Knowledge Directory root folder with one subfolder named Unclassified Documents. Before you create additional subfolders, define a taxonomy, as described in the Oracle Fusion Middleware Deployment Guide for Oracle WebCenter Interaction. For example, you probably want to organize the Knowledge Directory in a way that enables you to easily delegate administrative responsibility for the content and facilitate managed access with access control lists (ACLs).

After you have opened a Directory folder, you see additional features: documents, document display options, subfolders, and related objects.

Documents

On the left, you see the documents to which you have at least Read access. Each document includes an icon to signify what type of document it is (for example: Web page, PDF, MS Word document), the document name, the document description, when the document was last modified, a link to view additional document properties, and a link that displays the URL to this document (enabling you to e-mail a link to the document). At the bottom of the list of documents, you see page numbers indicating how many pages of documents exist in this folder.

Document Display Options

At the top of the list of documents, you see lists that let you change how documents are sorted, how many documents are displayed per page, and which types of documents are displayed.

Subfolders and Related Objects

On the right, you see the subfolders in this folder, and any objects that the folder administrator has specified as related to this folder.

Note:

You see only those folders and objects to which you have at least Read access.

Under Subfolders, you see the subfolders in this folder.
Under Related Communities, you see communities that have information related to the documents in this folder.
Under Related Folders, you see other Directory folders that have information related to the documents in this folder.
Under Related Portlets, you see portlets that have information or functionality related to the documents in this folder.
Under Related Experts, you see the users that are familiar with the documents in this folder (for example, an expert might have written one of the documents in the folder).
Under Related Content Managers, you see the users that manage the documents in this folder and the content sources and content crawlers associated with this folder.

The Unclassified Documents Folder

The Unclassified Documents folder stores documents that were crawled in but were not sorted into any of the folders selected in the content crawler. If a document cannot be placed in any target folder or subfolder, the content crawler might place the document in the Unclassified Documents folder. This behavior is determined by a setting on the Advanced Settings page of the Content Crawler Editor. If you have the correct permissions, you can view the Unclassified Documents folder when you are editing the Directory or by clicking Administration, then, in the Select Utility list, selecting Access Unclassified Documents.
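
The fallback behavior can be pictured as follows. This is a minimal sketch, not the crawler's implementation: the function name, the one-predicate-per-folder model, and the `use_unclassified` flag (standing in for the Advanced Settings option) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical model: each target folder has a predicate that decides
# whether a document's text belongs in that folder.
FolderFilter = Callable[[str], bool]

UNCLASSIFIED = "Unclassified Documents"


def sort_document(
    text: str,
    target_folders: Dict[str, FolderFilter],
    use_unclassified: bool,
) -> List[str]:
    """Return the names of the folders that receive the document."""
    matches = [name for name, accepts in target_folders.items() if accepts(text)]
    if matches:
        return matches
    # No target folder accepted the document: fall back only if the setting allows it.
    return [UNCLASSIFIED] if use_unclassified else []
```

With the flag off, a document that matches no folder is simply not placed anywhere in this model.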

Working in the Portal Knowledge Directory

This section describes the following main tasks:

Setting Knowledge Directory Preferences
Browsing the Directory
Creating a Directory Folder
Editing a Directory Folder
Deleting Folders and Documents
Submitting Content to the Directory
Sending a Link to a Document
Working with Tags
Moving Folders and Documents
Copying Folders and Documents
Modifying Security on Folders and Documents
Requesting Migration for Folders and Documents
Troubleshooting Security Changes to Folders and Documents

It also covers the following low-level tasks:

Specifying How Folder Contents Are Sorted
Specifying How Content Is Sorted Into a Folder
Adding Filters to a Folder
Adding Related Resources to a Folder
Specifying a Default Content Source for a Folder
Adding or Editing Properties for a Document
Specifying Expiration Settings for a Document
Specifying Refresh Settings for a Document

Setting Knowledge Directory Preferences

You can specify how the Knowledge Directory displays documents and folders, including whether to generate the display of contents from a Search Service search or a database query, by setting Knowledge Directory preferences.

To access the Knowledge Directory Preferences Utility you must be a member of the Administrators group.

Click Administration.
In the Select Utility list, click Knowledge Directory Preferences.

In the Subfolder Description type list, choose the type of subfolder description to display in the Knowledge Directory:

Option	Description
none	Displays no subfolder description
abbreviated	Displays only the first 100 characters of the subfolder description
full	Displays the full subfolder description

In the Maximum number of subfolders to display list, choose the number of subfolders to display under the current folder.
In the Number of subfolder columns list, choose a number of columns to display subfolders.
Note:
- Documents are always displayed in a single column.
- This setting does not apply in adaptive page layout mode.
In the Number of documents to show per page box, type a number.

In the Document Description type list, choose the type of document description to display in the Knowledge Directory:

Option	Description
none	Displays no document description
abbreviated	Displays only the first 100 characters of the document description
full	Displays the full document description

In the Related Resources placement list, choose the desired placement, relative to folders and documents: Left, Right, Top, or Bottom.

Note:

Related resources are specified on the Related Resources page of the Folder Editor.
In the Browsing Source list, choose the source of the folder information that displays when browsing the Knowledge Directory:

Option	Description
Search	Uses the portal Search Service to generate the list of folder contents
Database	Queries the portal database

Note:

If you have a large collection of documents, you can improve browsing performance by choosing Search.
In the Default Document Submission Content Type list, choose the default content type, which is used when you submit a document that is not mapped to any content type.

If you do not want to specify a default, choose None.
Under Browsing Column Properties, select the properties you want to display as custom columns when browsing documents in the Knowledge Directory:
- To add a property, click Add Property, then, in the list that appears, select the desired property.
  
  Note:
  
  Only numeric and date properties can be selected as custom column properties.
- To delete properties, select the properties you want to delete and click the Delete icon.
- To change the order in which properties display, use the icons to the right of the properties:
  - To move a property to the top of this list, click the Move to Top icon.
  - To move a property up one space in this list, click the Move Up icon.
  - To move a property down one space in this list, click the Move Down icon.
  - To move a property to the bottom of this list, click the Move to Bottom icon.
  The order in which properties appear on this page is the order in which the columns appear in the Knowledge Directory.
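
The three description types used for subfolders and documents can be sketched as a small function. This is an illustrative sketch only: `render_description` is a hypothetical name, and the portal's actual rendering code is not shown in this guide.

```python
def render_description(description: str, mode: str) -> str:
    """Apply a description type: 'none', 'abbreviated', or 'full'."""
    if mode == "none":
        return ""
    if mode == "abbreviated":
        # Keep only the first 100 characters, as the option tables describe.
        return description[:100]
    if mode == "full":
        return description
    raise ValueError(f"unknown description type: {mode}")
```

A description shorter than 100 characters is returned unchanged under the abbreviated setting.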

Browsing the Directory

When you open the Directory, you see the folders and subfolders to which you have at least Read access.

To open a folder or subfolder, click its name.

Note:

If the folder includes a description, it appears as a tooltip. To view the description, place your mouse over the folder name.

Note:

Beneath the banner, you see the parent hierarchy for the folder you are viewing (sometimes referred to as a breadcrumb trail). To move quickly to one of these folders, click the folder's name.

After you have opened a Directory folder, you see the documents to which you have at least Read access and the following additional features.

To change the sort order of documents between ascending and descending, in the Sort by drop-down list, select the desired option: Document Name Ascending or Document Name Descending.
To change the number of documents that are displayed per page, in the Items per page drop-down list, select the desired number. By default, 20 items are shown per page.
To filter the documents by document type (for example, MS Word documents or PDF documents), in the Show only item type drop-down list, select the desired document type.
To open a document, click its name.
To view the properties of a document, click the Properties link under the document description.
To find objects with a particular tag, under the document description, click the tag. For information about tags, see Tags.
To view a related community, under Related Communities, click the community name.

Note:

If you have at least Select access to the community, you can join the community.
To open a related folder, under Related Folders, click the folder name.
To preview a related portlet, under Related Portlets, click the portlet name.

Note:

If you have at least Select access to the portlet, from the portlet preview page, you can add the portlet to one of your My Pages.
To view the user profile for a related expert, under Related Experts, click the user's name.

Note:

If you have the Self-Selected Experts activity right, and are not already listed as an expert, click Add Me to add yourself as an expert on the folder's topic.
To view the user profile for a related content manager, under Related Content Managers, click the user's name.

At the bottom of the list of documents, you see page numbers indicating how many pages of documents exist in this folder.

To view another page of items, at the bottom of the list of documents, click a page number or click Next >>.
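
The relationship between the Items per page setting and the page numbers at the bottom of the list can be sketched as follows. The function names are hypothetical, and the default of 20 is taken from the description above; this is an illustration of the arithmetic, not the portal's code.

```python
import math
from typing import List, Sequence


def page_count(total_documents: int, items_per_page: int = 20) -> int:
    """Number of pages needed to list every document in a folder."""
    if items_per_page < 1:
        raise ValueError("items_per_page must be at least 1")
    return math.ceil(total_documents / items_per_page)


def page_slice(documents: Sequence[str], page: int, items_per_page: int = 20) -> List[str]:
    """Return the documents shown on a 1-based page number."""
    if page < 1 or items_per_page < 1:
        raise ValueError("page and items_per_page must be at least 1")
    start = (page - 1) * items_per_page
    return list(documents[start:start + items_per_page])
```

For example, a folder with 45 documents at the default setting spans three pages, and the third page holds the last five documents.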

There are other tasks you can perform in edit mode. To enter edit mode, click Edit Directory.

To modify the settings for a folder or document, click the edit icon to the far right of the folder or document.

If the folder you are viewing contains documents, you see the following additional features:

To approve documents, so that they display in the Directory, select the documents and click the approve icon.
To unapprove documents, so they do not display in the Directory, select the documents and click the unapprove icon.
To view and modify document settings, click the document settings icon.

Creating a Directory Folder

To create a folder in the Directory you must have the following rights and privileges:

Edit Knowledge Directory activity right
Create Knowledge Directory Folders activity right
At least Edit access to the parent folder (the folder in which you are creating the new folder)

To create a folder in the Directory:

Click Directory.
Click Edit Directory.
Navigate to the folder in which you want to create a new folder.
Click the create folder icon.
In the Create Document Folder dialog box, type a name and description for the folder, and click OK.

You can perform additional tasks when you edit the folder.

Editing a Directory Folder

To edit a Directory folder you must have the following rights and privileges:

Edit Knowledge Directory activity right
At least Edit access to the folder

To edit a Directory folder:

Click Directory.
Click Edit Directory.
Navigate to the folder you want to edit.
Click the edit icon to the right of the folder you want to edit.

To edit the current folder (the open folder), click the edit folder icon (in the folder title bar).

The Folder Editor opens.
On the Main Settings page, perform tasks as necessary:
- Edit the folder name and description.
- Specifying How Folder Contents Are Sorted
- Specifying How Content Is Sorted Into a Folder
- Adding Filters to a Folder
On the Related Resources page, perform tasks as necessary:
- Adding Related Resources to a Folder
On the Advanced Settings page, perform tasks as necessary:
- Specifying a Default Content Source for a Folder

Submitting Content to the Directory

This section describes the methods for submitting content to the Directory:

Using Simple Submission to Submit or Upload Documents to the Portal Knowledge Directory
Using Advanced Submission to Submit or Upload Documents to the Portal Knowledge Directory
Using Advanced Submission to Submit Web Documents to the Portal Knowledge Directory

You can also import content with content crawlers. For details, see Content Crawlers.

Using Simple Submission to Submit or Upload Documents to the Portal Knowledge Directory

With the proper permissions, you can submit documents to the Knowledge Directory.

Before you submit or upload a document to the Knowledge Directory:

Ensure that the language of the document you are submitting matches the language specified as your locale on the Edit Locale Settings page of My Account. For example, if your default locale is Japanese, the document you are submitting must also be in Japanese. If you are submitting a document in a language different from your default locale, either change your default locale to match the language of the document before you submit it, or, if you have permission to edit the Knowledge Directory, you can use Remote Document or Web Document submission to select a different language.

To submit or upload a document to the Knowledge Directory you must have the following rights and privileges:

At least Edit access to the parent folder (the folder that will store the document)
At least Select access to the content source that provides access to the location where the document is stored

Note:

If you want to use a content type other than the default content type associated with the content source, if you want to submit a document to more than one Knowledge Directory folder, or if you want to submit a document in a language different from your default locale, use Remote Document or Web Document submission, available when editing the Knowledge Directory.

To submit or upload a document to the Knowledge Directory:

Click Directory.
Open the folder in which you want to place the document.
Click Submit Documents.

The Submit a Document dialog box opens.

If you are editing the Knowledge Directory, you open the Submit a Document dialog box by selecting Simple Submit in the Submit Document list on the right.
In the Document source list, accept the default document source or select another.

The document source tells the portal how to find the document you are submitting.

Note:

If you are uploading a file, you must select Content Upload.
Specify a file by performing one of the following actions:
- If you are submitting a Web document, in the URL text box, type the document's URL.
- If you are submitting or uploading a file, specify a file by performing one of the following actions:
  - Type the UNC path to the document in the File path text box.
    
    If you are leaving the file in the remote location, you must type a network path (for example, \\myComputer\myFolder\myFile.txt). If you are uploading the file, the path can be a local path (for example, C:\myFolder\myFile.txt) or a network path.
  - Click Browse to navigate to the location of the file you want to submit.
    
    If you are leaving the file in the remote location, you must supply a network path to the file, and therefore, you cannot browse your local drives; you must browse the network to your computer and then to the location of the file. If you are uploading the file, you can browse to local drives or network drives.
    Note:
    - Depending on how the administrator configured the content source, the Browse button might not appear, in which case you cannot browse to the file. If you do not see a Browse button, type the path in the File path text box.
    - If the Browse button does display but you cannot browse to the folder where the file you want to submit is located, the content source you chose might not have the necessary privileges to access the file location. Click Cancel and resubmit the file using a different content source.
If desired, override the default name or description.
- To override the default name, select Use this name and, in the text box, type the name.
- To override the default description, select Use this description and, in the text box, type the description.

Once the folder administrator (one who has Admin access to the folder) approves your submission, links to the document you submitted or uploaded appear in the Knowledge Directory.
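
The path rules in the procedure above (a network path is required when the file stays in its remote location, while a local or network path is acceptable for an upload) can be sketched with Python's standard `pathlib` module. The function names are hypothetical, and the check looks only at the form of the path, not at whether the file exists or the content source can reach it.

```python
from pathlib import PureWindowsPath


def is_network_path(path: str) -> bool:
    """True when the path is a UNC (network) path rather than a local drive path."""
    # For a UNC path, PureWindowsPath reports the server and share as the drive,
    # and that drive string begins with two backslashes.
    return PureWindowsPath(path).drive.startswith("\\\\")


def path_is_acceptable(path: str, uploading: bool) -> bool:
    """Apply the submission rule for the chosen mode."""
    if uploading:
        # Uploads accept a local drive path or a network path.
        return bool(PureWindowsPath(path).drive)
    # Files left in the remote location must be given as a network path.
    return is_network_path(path)
```

Using the example paths from the procedure, `C:\myFolder\myFile.txt` passes only when uploading, while `\\myComputer\myFolder\myFile.txt` passes in both modes.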

Using Advanced Submission to Submit or Upload Documents to the Portal Knowledge Directory

With the proper permissions, you can use advanced submission to submit documents to the Knowledge Directory. Advanced submission enables you to select a content type other than the default content type associated with the content source, submit a document to more than one Knowledge Directory folder, or submit a document in a language different from your default locale. Depending on your portal configuration, you might also be able to upload a file to the Knowledge Directory. When you upload a file, it is copied from the remote repository into the portal's document repository and a pointer is created to that copied file.

To use advanced submission to submit or upload a document to the Knowledge Directory you must have the following rights and privileges:

Edit Knowledge Directory activity right
At least Edit access to the parent folder (the folder that will store the document)
At least Select access to the content source that provides access to the location where the document is stored

To use advanced submission to submit or upload a document to the Knowledge Directory:

Click Directory.
Click Edit Directory.
Open the folder in which you want to place the document.
In the Submit Document list on the right, select Remote Document.

The Choose a Content Source dialog box opens.
Select the content source that provides access to the content you want to submit or upload and click OK.

Note:

If you are uploading a file, you must select Content Upload.
Specify a file by performing one of the following actions:
- Type the UNC path to the document in the File path text box.
  
  If you are leaving the file in the remote location, you must type a network path (for example, \\myComputer\myFolder\myFile.txt). If you are uploading the file, the path can be a local path (for example, C:\myFolder\myFile.txt) or a network path.
- Click Browse to navigate to the location of the file you want to submit.
  
  If you are leaving the file in the remote location, you must supply a network path to the file, and therefore, you cannot browse your local drives; you must browse the network to your computer and then to the location of the file. If you are uploading the file, you can browse to local drives or network drives.
  Note:
  - Depending on how the administrator configured the content source, the Browse button might not appear, in which case you cannot browse to the file. If you do not see a Browse button, type the path in the File path text box.
  - If the Browse button does display but you cannot browse to the folder where the file you want to submit is located, the content source you chose might not have the necessary privileges to access the file location. Click Cancel and resubmit the file using a different content source.
If you want to override the default name or description, click the Name and Description page and edit the values.
- To override the default name, edit the value in the Name box.
- To override the default description, edit the value in the Description box.

Once the folder administrator (one who has Admin access to the folder) approves your submission, links to the document you submitted or uploaded appear in the Knowledge Directory.

Using Advanced Submission to Submit Web Documents to the Portal Knowledge Directory

To use advanced submission to submit a document to the Knowledge Directory you must have the following rights and privileges:

Edit Knowledge Directory activity right
At least Edit access to the parent folder (the folder that will store the document)
At least Select access to the content source that provides access to the location where the document is stored

To use advanced submission to submit a document to the Knowledge Directory:

Click Directory.
Click Edit Directory.
Open the folder in which you want to place the document.
In the Submit Document list on the right, select Web Document.

The Choose a Content Source dialog box opens.
Select the content source that provides access to the content you want to submit and click OK.

Note:

If you are submitting an unsecured Web document, you can select World Wide Web.
In the URL text box, type the document's URL.
Under Choose Content Type, select the content type to apply to this document.
- To use the folder's default content type, leave Default Content Type selected.
- To choose a different content type, select This content type, click Change, in the dialog box, select the content type you want to use, and click OK.
Under Choose Knowledge Directory Folders, specify into which folders you want to submit this document.
- To add a folder, click Add Folder.
- To remove folders, select the folders you want to remove and click the Remove icon.
- To change the order of the names in the list from ascending to descending alphabetical order (or vice versa), click Folder Names.
Under Document Content Language, choose the language used for the majority of the document's content.

The language you choose is the language by which the document is indexed. The search engine uses the language when searching.
If you want to override the default name or description, click the Name and Description page and edit the values.
- To override the default name, edit the value in the Name box.
- To override the default description, edit the value in the Description box.

Once the folder administrator (one who has Admin access to the folder) approves your submission, links to the document you submitted or uploaded appear in the Knowledge Directory.

Editing the Settings for a Document

To edit the settings for a document you must have the following rights and privileges:

Edit Knowledge Directory activity right
At least Edit access to the document

To edit the settings for a document:

Click Directory.
Click Edit Directory.
Navigate to the document you want to edit.
Click the edit icon to the right of the document you want to edit.

The Document Editor opens.
On the Main Settings page, perform tasks as necessary:
- Edit the name and description for the document.
- Adding or Editing Properties for a Document
On the Document Settings page, perform tasks as necessary:
- Specifying Expiration Settings for a Document
- Specifying Refresh Settings for a Document
On the Security page, perform tasks as necessary:
- Setting Security on an Object
The default security for this document is based on the security of the parent folder.
On the Migration History and Status page, perform tasks as necessary:
- Viewing Migration History and Status for an Object

Deleting Folders and Documents

To delete Directory content you must have the following rights and privileges:

Edit Knowledge Directory activity right
Admin access to the Directory content you want to delete

To delete Directory content:

Click Directory.
Click Edit Directory.
Navigate to the content you want to delete.
Select the folders and/or documents you want to delete and click the delete icon.

Sending a Link to a Document

To send a link to a document in the Directory:

Click Directory.
Navigate to the document.
Under the document description, click Send Document Link.
In the Document Link dialog box, copy the text, then click Close.
In your e-mail application, paste the text into an e-mail message and send it.

When other portal users click the URL in your e-mail, the document opens. If a user does not have permission to see the document, an error message is displayed.

Working with Tags

To add a tag to a document:
1. Under the document description, click Add Tag.
2. In the text box, type the tag you want to apply to the document. To add more than one tag, separate tags with commas (,).
3. To save your tag, click outside of the text box or press ENTER.
To rename a tag you added to a document:
1. Under the document description, place your cursor over the tag icon and click Rename Tag.
2. In the text box, edit the tag.
3. To save your changes, click outside of the text box or press ENTER.
Note:

You cannot rename a tag that you did not add. Tags you did not add have a read-only tag icon.
To delete a tag you added to a document, place your cursor over the tag icon, click Delete Tag, then click OK in the confirmation dialog box.

Note:

You cannot delete a tag that you did not add. Tags you did not add have a read-only tag icon. If you are an administrator, you can delete tags created by other users through the Tagging Engine Administration utility (accessed by clicking Administration, then, in the Select Utility menu, clicking Tagging Engine Administration).
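
Entering several tags at once relies on the comma as a separator. A minimal sketch of that splitting follows; `parse_tags` is a hypothetical helper, and the trimming of surrounding spaces and the dropping of empty or repeated entries are assumptions, since this guide does not describe how the portal normalizes tag input.

```python
from typing import List


def parse_tags(raw: str) -> List[str]:
    """Split comma-separated input into tags, keeping the order of first appearance."""
    tags: List[str] = []
    for piece in raw.split(","):
        tag = piece.strip()  # assumption: surrounding whitespace is not significant
        if tag and tag not in tags:  # assumption: empty and repeated tags are dropped
            tags.append(tag)
    return tags
```

Under these assumptions, typing `release, test plan, release,` yields two tags: `release` and `test plan`.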

Moving Folders and Documents

To move Directory content you must have the following rights and privileges:

Edit Knowledge Directory activity right
At least Edit access to the target folder (the folder to which you are moving the content)
At least Edit access to the content you want to move

To move Directory content:

Click Directory.
Click Edit Directory.
Navigate to the content you want to move.
Select the folders and/or documents you want to move and click the move icon.
In the Choose Target Folder dialog box, expand the folders as necessary, choose a folder, and click OK to move the content.

Copying Folders and Documents

To copy Directory content you must have the following rights and privileges:

Edit Knowledge Directory activity right
At least Edit access to the target folder (the folder to which you are copying the content)
At least Edit access to the content you want to copy

To copy Directory content:

Click Directory.
Click Edit Directory.
Navigate to the content you want to copy.
Select the folders and/or documents you want to copy and click the copy icon.
In the Choose Target Folder dialog box, expand the folders as necessary, choose a folder, and click OK to copy the content.

Modifying Security on Folders and Documents

To modify security on Directory content you must have the following rights and privileges:

Edit Knowledge Directory activity right
At least Edit access to the content for which you want to modify security

To modify security on Directory content:

Click Directory.
Click Edit Directory.
Navigate to the content for which you want to modify security.
Select the folders and/or documents and click the security icon.
In the Edit Security dialog box, perform the following actions:
- To allow more users or groups access to the folders or documents, click Add Users/Groups.
- To specify the type of access a user or group has, to the right of the user or group, select the proper access from the drop-down list under the name of the folder or document.
  
  For a description of the available privileges, see About Access Controls Lists and Access Privileges.
  
  Note:
  
  If a user is a member of more than one group included in the list, or is included both as an individual user and as part of a group, that user gets the highest access available to them for this folder or document. For example, if a user is part of the Everyone group (which has Read access) and the Administrators Group (which has Admin access), that user gets the higher privilege for the folder or document: Admin.
- To cancel your changes and reset the security to what it was, click Reset.
- To delete a user or group from the security of all selected folders or documents, select the user or group and click the remove icon.
- To see which users are included in a group, click the group name.
- To set identical access rights to a folder or document for every user or group, repeatedly click the column icon in the Set Column area at the bottom of the desired column until the desired access level displays in the lists.
- To set identical access rights to every folder or document for a user or group, repeatedly click the row icon in the Set Row area at the right of the desired user or group until the desired access level displays in the lists.
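The note above on overlapping user and group entries can be sketched in a few lines of Python. This is an illustration only, not portal code: the function name, the sample user, and the ordering of access levels (Read lowest, Admin highest) are assumptions made for the example.

```python
# Assumed ordering of access levels, lowest to highest. This ordering is an
# assumption for illustration; confirm it against the access privileges reference.
ACCESS_ORDER = ["Read", "Select", "Edit", "Admin"]


def effective_access(user, user_groups, acl):
    """Return the highest level granted to the user directly or through
    any of the user's groups, or None if no entry applies."""
    principals = {user, *user_groups}
    granted = [level for principal, level in acl.items() if principal in principals]
    if not granted:
        return None
    return max(granted, key=ACCESS_ORDER.index)


# Hypothetical security list for one folder.
folder_acl = {"Everyone": "Read", "Administrators Group": "Admin"}
print(effective_access("jsmith", ["Everyone", "Administrators Group"], folder_acl))
```

A user who appears only through the Everyone group would resolve to Read in this sketch, and a user with no matching entry resolves to no access.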

Requesting Migration for Folders and Documents

To request migration of Directory content you must have the following rights and privileges:

Edit Knowledge Directory activity right
At least Select access to the content for which you want to request migration

To request migration of Directory content:

Click Directory.
Click Edit Directory.
Navigate to the content for which you want to request migration.
Select the folders and/or documents and click the migration icon.
In the Script Prompt dialog box, type a comment about why you need this content migrated and click OK.

Troubleshooting Security Changes to Folders and Documents

If you get a time-out error when applying a security change to all child objects in a Knowledge Directory hierarchy, you can use the following workaround.

This issue is most likely caused by trying to apply changes to too many levels of nested subfolders with a large number of child objects (other folders or documents). In this situation, you can perform the following steps to work around the issue:

To find out which folders are updated with the security changes, select all first level subfolders then select the security icon.
Scroll through the list to see at which folder the security changes stopped being applied.
Work with the remaining folders and apply the needed security changes: open each first level folder separately, set the security, save, and select Yes to apply the security changes to all the child objects for that folder.
Repeat this process for all remaining first level subfolders that did not get the security applied to them successfully due to the error.

Specifying How Folder Contents Are Sorted

If the Folder Editor is not already open, open it now.
Under Sorting, specify how folder contents are sorted:
1. In the Order this way when browsing list, choose the property by which you want folder contents sorted in Browse Mode. Only numeric, date, and name properties are displayed in the list.
2. In the Order this way when editing list, choose the fixed property by which you want folder contents sorted in Edit Mode.

Specifying How Content Is Sorted Into a Folder

You can edit the way content is sorted into this folder by crawlers or the Smart-Sort Editors.

If the Folder Editor is not already open, open it now.
Under When sorting, allow into this folder, choose the criteria that links must meet to be imported into the folder:
- All links: All links can sort into this folder.
- No links: No links are sorted into this folder or its subfolders, but you can still add links manually.
- Links that pass: Choose whether links must pass all filters or at least one filter to sort into this folder. You can manually add links that do not pass the filters.
Under Default folder, choose the folder in which to put links that cannot be sorted into this folder or any of its subfolders.

If you choose No folder, content that cannot be sorted into this folder or any of its subfolders is not sorted into any folder in this branch. However, a crawler might still place this content into another branch or in the Unclassified Documents folder.
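The sorting options above can be read as a small decision procedure. The following Python sketch is a simplified model under stated assumptions: filters are represented as plain predicate functions, the folder names are invented, and the way the portal handles a folder with no filters is not modeled.

```python
def passes_folder(link, mode, filters=(), require_all=True):
    """Decide whether a link may sort into a folder.

    mode is "all", "none", or "pass", mirroring the All links, No links,
    and Links that pass options. filters are hypothetical predicates.
    """
    if mode == "all":
        return True
    if mode == "none":
        return False
    if mode != "pass":
        raise ValueError(f"unknown mode: {mode!r}")
    results = [check(link) for check in filters]
    return all(results) if require_all else any(results)


def sort_link(link, subfolders, default_folder=None):
    """Return the folders the link sorts into, falling back to the default
    folder. An empty list means the link is not sorted into this branch."""
    matched = [
        name
        for name, (mode, filters, require_all) in subfolders.items()
        if passes_folder(link, mode, filters, require_all)
    ]
    if matched:
        return matched
    return [default_folder] if default_folder else []


def is_pdf(link):
    return link["url"].endswith(".pdf")


def mentions_budget(link):
    return "budget" in link["title"].lower()


subfolders = {
    "Finance PDFs": ("pass", [is_pdf, mentions_budget], True),        # all filters
    "Any PDF or budget": ("pass", [is_pdf, mentions_budget], False),  # at least one
    "Manual only": ("none", [], True),
}
link = {"url": "http://example.com/plan.pdf", "title": "Release plan"}
print(sort_link(link, subfolders, default_folder="Unsorted"))
```

Passing `default_folder=None` models the No folder choice: a link that matches nothing is simply left out of this branch.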

Adding Filters to a Folder

Note:

Filters in this section are disregarded if you chose All links or No links under Filter Settings.

If the Folder Editor is not already open, open it now.
In Filters, specify the filters that links must pass to sort into this folder:
- To add a filter, click Add Filter, select filters, and click OK.
- To create a filter, click Create Filter.
- To remove filters, select the filters you want to remove and click the remove icon.
- To toggle the names in the list between ascending and descending alphabetical order, click Filter Names.

For information on filters, see About Filters.

Adding Related Resources to a Folder

Related resources enable you to associate communities, other folders, portlets, experts, and content managers with a folder. The folder's associated objects display in Browse Mode when enabled in the user's experience definition.

If the Folder Editor is not already open, open it now.
On the left, under General Settings, click Related Resources.
Add related resources as necessary:
- To add related communities, under Communities, click Add Community, select communities, and click OK.
- To add related folders, under Folders, click Add Folder, select folders, and click OK.
- To add related portlets, under Portlets, click Add Portlet, select portlets, and click OK.
- To add related experts (users who are knowledgeable about the contents of the folder), under Experts, click Add Expert, select users, and click OK.
- To add related content managers (users who are responsible for managing the content in the folder), under Content Managers, click Add Content Managers, select users, and click OK.

Specifying a Default Content Source for a Folder

You can specify the default content source for documents submitted to a folder. When a user submits a document, the content source you choose is selected by default. However, users can change the content source to any content source to which they have at least Select access.

Note:

If no default content source has been set for the folder to which a user wants to submit a document, the portal searches recursively up through the folder hierarchy until it locates a folder for which a default content source has been set, and uses that content source as the default.

If the Folder Editor is not already open, open it now.
On the left, under General Settings, click Advanced Settings.
Under Default Content Source, select the default content source for documents submitted to the folder:
- To use the same default content source as the folder's parent folder uses, click Parent folder's content source. The parent folder's default content source is displayed in parentheses next to this option.
  
  Note:
  
  Because the root folder does not have a parent folder, the Parent folder's content source option does not display if you are editing the root folder.
- To select another content source, click This content source, and select a content source from the list.
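The fallback described in the note above (searching up the folder hierarchy for the nearest default content source) can be sketched as follows. The `Folder` class and the content source name are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Folder:
    name: str
    parent: Optional["Folder"] = None
    default_content_source: Optional[str] = None


def resolve_default_content_source(folder):
    """Walk from the folder toward the root and return the first default
    content source found, or None if no ancestor defines one."""
    current = folder
    while current is not None:
        if current.default_content_source is not None:
            return current.default_content_source
        current = current.parent
    return None


# Hypothetical hierarchy: only the root defines a default content source.
root = Folder("Root", default_content_source="Shared Files")
standards = Folder("Standards", parent=root)
drafts = Folder("Drafts", parent=standards)
print(resolve_default_content_source(drafts))
```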

Adding or Editing Properties for a Document

Document properties are metadata about the document that the search engine uses to index the document, similar to a library card catalog.

If the Document Editor is not already open, open it now.
Under Customized Document Properties, modify properties for the selected document:
- To add a property, click Add Property. This adds a property to the bottom of the list. Select the property you want from the drop-down list and enter a value.
- To create a property, click Create Property. To learn about creating properties, see Chapter 7, "Creating or Editing a Property."
- To remove properties, select the properties you want to remove and click the remove icon.
- To change a property value, modify the property's Value column.
- To toggle the properties in the list between ascending and descending alphabetical order, click Property.

Specifying Expiration Settings for a Document

You might want to set a document to expire if the document will become irrelevant at some point.

These settings are referenced by the Document Refresh Agent when the Document Refresh job runs.

If the Document Editor is not already open, open it now.
Click the Document Settings page.
Under Document Expiration, choose whether the link to the document should expire:
- If you do not want the document link to expire, select Never expire.
  
  Note:
  
  If you set a document to be refreshed and to be deleted if the source document is not found, even a document set to never expire can be deleted.
- If you want the document link to expire, choose Delete on, and type a date in the box or click the date picker icon to choose a date.
  
  When the Document Refresh job runs, it will delete any document links that have reached the expiration date.
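As a rough model of the expiration check described above, the following sketch selects the links whose Delete on date has been reached. The data layout (a dictionary of link names to dates) is an assumption for illustration and does not reflect how the portal stores these settings.

```python
from datetime import date


def links_to_delete(links, today):
    """links maps a link name to its Delete on date, or to None for
    Never expire. Returns the links that have reached their date."""
    return [
        name
        for name, expires in links.items()
        if expires is not None and expires <= today
    ]


# Hypothetical document links.
links = {
    "Q1 price list": date(2024, 3, 31),
    "Style guide": None,                 # Never expire
    "Beta notice": date(2024, 6, 30),
}
print(links_to_delete(links, today=date(2024, 4, 15)))
```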

Specifying Refresh Settings for a Document

You can have the Document Refresh Agent periodically update the document and its properties. When a link is refreshed, the Document Refresh Agent verifies whether the source document still exists. If the document exists, the Document Refresh Agent updates the associated property values from the source document. If the document does not exist, the Document Refresh Agent applies the settings you specify for dealing with broken links.

These settings are referenced by the Document Refresh Agent when the Document Refresh job runs.

If the Document Editor is not already open, open it now.
Click the Document Settings page.
Under Link and Property Refresh, specify the refresh settings that should be used by the Document Refresh Agent:
- If you do not want to refresh the document, select Never.
- If you want to refresh the document, select Every, and type a number in the box and choose an interval from the list.
- To prevent the Document Refresh Agent from refreshing document properties, select Only confirm the validity of the links to these documents.
Under Broken Links, specify what happens to links to the document if the Document Refresh Agent finds that the source document does not exist:
- If you want to leave the broken link in the portal, select Left alone.
- If you want to remove the broken link from the portal immediately, select Deleted immediately.
- If you want to leave the broken link for a specified amount of time, select Deleted after, and type a number in the box and choose an interval from the list.
  
  You might want to leave a broken link in the portal for a short while in case the source document repository is temporarily inaccessible.
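The refresh and broken-link settings above combine into a simple decision. The sketch below is a model under stated assumptions: the function, its return strings, and the idea of tracking when a link was first found broken are invented for illustration and are not taken from the Document Refresh Agent itself.

```python
from datetime import datetime, timedelta


def refresh_action(source_exists, confirm_only, broken_policy,
                   broken_since=None, grace=None, now=None):
    """Return what a refresh pass would do with one document link."""
    if source_exists:
        # "Only confirm the validity of the links" skips the property update.
        return "confirm link" if confirm_only else "update properties"
    if broken_policy == "Left alone":
        return "keep link"
    if broken_policy == "Deleted immediately":
        return "delete link"
    if broken_policy == "Deleted after":
        now = now or datetime.now()
        # Assumption: the link is removed once it has been broken for
        # at least the configured interval.
        return "delete link" if now - broken_since >= grace else "keep link"
    raise ValueError(f"unknown broken-link policy: {broken_policy!r}")


first_seen_broken = datetime(2024, 5, 1, 8, 0)
print(refresh_action(False, False, "Deleted after",
                     broken_since=first_seen_broken,
                     grace=timedelta(days=7),
                     now=datetime(2024, 5, 3, 8, 0)))
```

With a seven-day interval, a link that has been broken for two days is kept, which leaves room for a repository that is only temporarily unreachable.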

About Document and Object Properties

Properties provide information about, as well as a way to search for, documents and objects in your portal. For example, you might want to create an Author property so users can find all the documents or objects created by a particular user.

When you add documents to the portal, the portal maps source document fields to portal properties according to mappings you specify in the Global Content Type Map, the particular content type definition, the Global Document Property Map, and any content crawler-specific content type mappings.

Global Document Property Map

The Global Document Property Map provides default mappings for properties common to the documents in your portal. When users import documents into the portal (either manually or through a content crawler), property values can be extracted from the source documents according to the property mappings you specify in the associated content types and the property mappings from the Global Document Property Map.

When a user imports a document into the portal, the portal performs the following actions:

The portal determines which content type to use, based on the Global Content Type Map or the content crawler's content type settings.
The portal populates property values based on the property mappings in the content type.
If there are additional mapped properties in the Global Document Property Map (not included in the content type's property mappings), the portal populates the property values, based on those property mappings.

Therefore, you can map common properties in the Global Document Property Map and specify only special mappings, default values, and override values in content types.
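The order of precedence described above can be sketched as follows: mappings in the content type are applied first, and the global mappings fill in only the properties that the content type does not map. The function name and the sample data are hypothetical, and each property maps to a single attribute here to keep the example short.

```python
def populate_properties(source_attributes, content_type_map, global_map):
    """Return portal property values for one imported document."""
    values = {}
    # Mappings defined in the content type are applied first.
    for prop, attribute in content_type_map.items():
        if attribute in source_attributes:
            values[prop] = source_attributes[attribute]
    # The global mappings contribute only properties that the content
    # type does not map itself.
    for prop, attribute in global_map.items():
        if prop not in content_type_map and attribute in source_attributes:
            values[prop] = source_attributes[attribute]
    return values


# Hypothetical source document attributes and mappings.
source = {"Title": "Q3 Plan", "creator": "jm", "Author": "someone else"}
content_type_map = {"Author": "creator"}             # special mapping
global_map = {"Author": "Author", "Title": "Title"}  # common mappings
print(populate_properties(source, content_type_map, global_map))
```

In this example the content type's Author mapping wins over the global one, while Title is populated from the global mapping.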

Note:

Some property mappings are set when the portal is installed so that the portal can produce some general metadata even if you do not create any specialized mappings.

Global Object Property Map

The Global Object Property Map displays all the types of portal objects with which you can associate properties. When users create a portal object, they can specify values for the associated properties on the Properties and Names page of the object's editor.

User Information - Property Map

The User Information - Property Map enables you to map user information to user properties in the portal. The information in these user properties can then be displayed in the user's profile, or it can be sent to content crawlers, remote portlets, or federated searches so that users do not have to enter this information on a separate preference page.

Working with Properties

This section describes how to perform the following tasks:

Creating or Editing a Property
Mapping Source Document Attributes to Portal Properties Using the Global Document Property Map
Associating Properties with Portal Objects Using the Global Object Property Map
Associating User Information with Properties Using the User Information — Property Map

Creating or Editing a Property

To create a property you must have the following rights and privileges:

Access Administration activity right
Create Properties activity right
At least Edit access to the parent folder (the folder that will store the property)

To edit a property you must have the following rights and privileges:

Access Administration activity right
At least Edit access to the property

To create or edit a property:

Click Administration.
Open the Property Editor.
- To create a property, open the folder in which you want to store the property. In the Create Object list, click Property.
- To edit a property, open the folder in which the property is stored and click the property name.
In the Property Type list, choose what kind of information this property stores.
- Textual stores text values.
- Simple Number stores whole numbers.
- Floating Point Number stores numbers that include decimal points.
- Date stores date values.
- Reference stores a reference to an administrative object in the portal.
  
  After choosing this option, in the second list, choose what type of administrative object this property references.
- Encrypted Text stores encrypted text values.
Note:

After you save this property, the property type cannot be changed.
If this property stores a Web address, select Treat this property like an URL.

If you choose this option, users can click through the property, so the values for this property must always be URLs.

Note:

This option is only available for textual properties.
If this property applies to documents imported into the portal, select This property is supported for use with documents.

Note:

This option is not available for reference properties.
If this property is generated automatically and you want to store the value in the database but not display it on the Properties and Names page of object editors, clear the This Property is visible in the UI check box.
If you do not want users to be able to edit the values for this property, select Read Only.
If you want users to be able to search for objects based on the values for this property, select Searchable.

For example, if you specify that the Author property is searchable, users can search for all the objects created by a particular person.
If you want to require that users specify a value for this property before they can save the associated object, select Make this property mandatory.

Note:

This option is not available if this property is set to Read Only.
If you want to allow users to specify more than one value for this property, select Multiple values can be selected for this Property.
In the Property Chooser Type list, specify what format should be used for value selection.
- None displays a text box in which users can type property values.
- Managed Dropdown displays a list of values you specify from which users can choose.
  - To create the values users can choose from, click Add Value and type a value in the text box.
  - To remove a value, select the value and click the Remove icon.
    
    To select or clear all value check boxes, select or clear the box to the left of Value Names.
- Unmanaged Dropdown displays a list populated with the values from a database table from which users can choose.
  1. In the Database Table Name box, type the name of the table from which you want to populate your list.
  2. In the Pick Column box, specify the column from which you want to populate the list.
  3. In the Sort Column box, type the name of the column upon which the values are sorted.
- Tree displays a hierarchical list populated with the values from a database table from which users can choose.
  1. In the Database Table Name box, type the name of the table from which you want to populate the list.
  2. In the Pick Column box, specify the column from which you want to populate the list.
  3. In the Sort Column box, type the name of the column upon which the values are sorted.
  - To add a column, click Add Value and enter the Pick Column and Sort Column values.
  - To remove a column, select it and click the Remove icon.
  - To select or clear all the column check boxes, select or clear the box to the left of Pick Column.
Note:

This option is not available for date or reference properties.
On the Names and Descriptions page, perform tasks as necessary:
- Naming and Describing an Object
  
  You can instead enter a name and description when you save this property.
On the Security page, perform tasks as necessary:
- Setting Security on an Object
The default security for this property is based on the security of the parent folder.
If you are editing a property, on the Migration History and Status page, perform tasks as necessary:
- Working with Snapshot Queries
Note:

The Migration History and Status page is not available when creating an object.

If you set this property to be searchable, you must rebuild the search index.
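To make the Unmanaged Dropdown settings described above more concrete, the sketch below shows how a table name, Pick Column, and Sort Column could translate into a query. It uses an in-memory SQLite database as a stand-in; the query the portal actually issues is not documented here, and the table and column names are invented.

```python
import sqlite3


def quote_identifier(name):
    """Quote a table or column name so it can be placed in SQL text."""
    return '"' + name.replace('"', '""') + '"'


def dropdown_values(conn, table, pick_column, sort_column):
    """Return the pick-column values ordered by the sort column."""
    sql = "SELECT {pick} FROM {table} ORDER BY {sort}".format(
        pick=quote_identifier(pick_column),
        table=quote_identifier(table),
        sort=quote_identifier(sort_column),
    )
    return [row[0] for row in conn.execute(sql)]


# Hypothetical lookup table for a Region property.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE regions (name TEXT, display_order INTEGER)")
conn.executemany(
    "INSERT INTO regions VALUES (?, ?)",
    [("EMEA", 2), ("Americas", 1), ("APAC", 3)],
)
print(dropdown_values(conn, "regions", "name", "display_order"))
```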

Mapping Source Document Attributes to Portal Properties Using the Global Document Property Map

To access the Global Document Property Map you must be a member of the Administrators Group.

To map source document attributes to portal properties:

Click Administration.
In the Select Utility list, choose Global Document Property Map.
Create mappings between portal properties and document attributes.

Note:

Some property mappings are set when the portal is installed so that content crawlers can produce some general metadata even if you do not create any specialized mappings.
- To add a property mapping, click Add Property; then, in the Add Property dialog box, select the properties you want to add and click OK.
- To associate an attribute, click the property name and, in the text box, type the associated attributes, separated by commas (,).
  
  The first attribute with a value populates the property.
- To delete a mapping, select the mapping and click the Delete icon.
  
  To select or clear all of the mapping check boxes, select or clear the box to the left of Property Name.
- To toggle the order in which the properties are sorted, click Property Name.
Note:

You can map any attribute from the source document. For information on source document attributes, refer to the documentation for the third-party software.
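The rule that the first attribute with a value populates the property can be sketched as follows. The function name is hypothetical, and trimming the spaces around each comma-separated name is an assumption made for the example.

```python
def first_attribute_value(mapping, source_attributes):
    """mapping is the comma-separated attribute list typed for a property.
    Returns the value of the first listed attribute that has a value."""
    for name in (part.strip() for part in mapping.split(",")):
        value = source_attributes.get(name)
        if value:  # skip attributes that are missing or empty
            return value
    return None


# Hypothetical source document attributes.
attributes = {"Author": "", "Writer": "jm", "creator": "md"}
print(first_attribute_value("Author, Writer, creator", attributes))
```

Here Author is present but empty, so the value comes from Writer, the next attribute in the list.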

HTML Metadata Handling

Generally, you will be able to determine what source document attributes can be mapped to portal properties, but it might not be as clear in HTML documents.

The following table shows the names of the attributes that are returned by the HTML Accessor. You can map the attribute names to portal properties.

Note:

The HTML Accessor handles all common character sets used on the web, including UTF-8.

HTML Metadata	Name of Attribute Returned by HTML Accessor	Default Mapping or Mapping Suggestion
<TITLE> Tag	Title	Title (default)
<META> Tag	The attribute name is the NAME value. Example: <META NAME="creation_date" CONTENT="18-Jan-2004">. The attribute extracted from the example would be named creation_date.	Using the example, you could map creation_date to the Created property.
Headline Tags	The attribute name is the name of the tag followed by an ordinal, one-based index in parentheses. The Accessor returns a value for each headline tag (<H1>, <H2>, <H3>, <H4>, <H5>, and <H6>) and each bold tag (<B>). Example: <H1>Value 1</H1> <H3>Value 2</H3> <H1>Value 3</H1> <B>Value 4</B>. The HTML Accessor returns the following source document attribute-value pairs: <h1>(1) = Value 1; <h3>(1) = Value 2; <h1>(2) = Value 3; <B>(1) = Value 4.	If, on a particular news site, the second <H2> tag contains the name of the article and the third <B> tag contains the name of the author, you could map the portal property Title to <H2>(2) and the portal property Author to <B>(3).
HTML Comments	It is common practice to store metadata in HTML comments using the following format: <!-- Writer: jm --> <!-- AP: md --> <!-- Copy editor: mr --> <!-- Web editor: ad -->. In other words, the format is the HTML comment delimiter followed by the name, a colon, the value, and a close comment delimiter. The HTML Accessor parses data in this format and returns the following source document attribute-value pairs: Writer = jm; AP = md; Copy editor = mr; Web editor = ad.	Using the example, you could map Writer to the portal property Author.
Parent URL	Documents imported through a Web content crawl return an attribute named Parent URL with the value of the URL of the parent page that contains a link to the document.	URL (default)
Anchors	The HTML Accessor provides special handling for internal anchors (<a name="target">) and URLs that reference them (http://server/page#target).	You might map anchors to portal attributes in the following ways. (1) Alternate sources for the portal Title attribute: When the document URL for an HTML document contains a fragment identifier (for example, #target in the example above) and the Accessor finds that anchor in the document, it discards all title and headline tags preceding the anchor and returns, as the suggested document title, the first subsequent headline tag. All subsequent tags are indexed relative to the anchor tag, so mapping a property to <H1>(2) means "use the second <H1> tag after the anchor tag named in the document URL." (2) Mapping Anchor Section to the document description or summary: The HTML Accessor returns an attribute named Anchor Section containing the text immediately following the named anchor tag (stripped of markup tags and HTML decoded). Mapping this property to the document description allows the portal to generate a relevant description for each section of a large document. The HTML Accessor also generates its own summary by returning the first summary-sized chunk of text in the document, stripped of HTML markup tags and correctly HTML decoded. It returns this summary as an attribute named Summary. The Accessor executes the DocumentSummary method, which returns the value of the Anchor Section attribute, if available. If this attribute is not available, its second choice is the value of the Description attribute from the <META NAME="description"> tag. If this is not available, its third and final choice is the Summary attribute.
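The naming rules in the table above for the title, META tags, headline and bold tags, and metadata comments can be approximated with Python's standard `html.parser` module. This is a sketch of the naming scheme only, not the HTML Accessor: the class and function names are invented, nested headline or bold tags are not handled, anchors are ignored, and writing tag names in uppercase is an assumption.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

COUNTED_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "b"}
# Matches comment text of the form "name: value".
COMMENT_PAIR = re.compile(r"^\s*([^:]+?)\s*:\s*(.*?)\s*$", re.S)


class AttributeSketch(HTMLParser):
    def __init__(self):
        super().__init__()
        self.attributes = {}
        self._counts = defaultdict(int)
        self._open = None   # attribute name currently collecting text
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._open, self._text = "Title", []
        elif tag in COUNTED_TAGS:
            # One-based ordinal per tag name, for example <H1>(2).
            self._counts[tag] += 1
            self._open = f"<{tag.upper()}>({self._counts[tag]})"
            self._text = []
        elif tag == "meta":
            meta = dict(attrs)
            if meta.get("name") and meta.get("content") is not None:
                self.attributes[meta["name"]] = meta["content"]

    def handle_data(self, data):
        if self._open:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open and (tag == "title" or tag in COUNTED_TAGS):
            self.attributes[self._open] = "".join(self._text).strip()
            self._open = None

    def handle_comment(self, data):
        match = COMMENT_PAIR.match(data)
        if match:
            self.attributes[match.group(1)] = match.group(2)


def extract_attributes(html):
    parser = AttributeSketch()
    parser.feed(html)
    parser.close()
    return parser.attributes


sample = (
    '<TITLE>Daily News</TITLE>'
    '<META NAME="creation_date" CONTENT="18-Jan-2004">'
    '<!-- Writer: jm --><!-- Copy editor: mr -->'
    '<H1>Value 1</H1> <H3>Value 2</H3> <H1>Value 3</H1> <B>Value 4</B>'
)
for name, value in extract_attributes(sample).items():
    print(name, "=", value)
```

The returned names are the strings you would type as source attributes when mapping, for example creation_date for the Created property or Writer for the Author property.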

Associating Properties with Portal Objects Using the Global Object Property Map

To access the Global Object Property Map you must be a member of the Administrators Group.

Click Administration.
In the Select Utility list, click Global Object Property Map.
To associate properties, click the Edit icon; then, in the Choose Property dialog box, select the properties you want to associate with the object, and click OK.

To toggle the order in which the objects are sorted, click Objects.

Associating User Information with Properties Using the User Information — Property Map

To map user information to portal properties you must have the following rights and privileges:

Access Administration activity right
Access Utilities activity right
At least Select access to the properties you want to map

Note:

The Full Name attribute is automatically mapped to the display name of the user unless you override it on this page.

Click Administration.
In the Select Utility list, click User Profile Manager.
Under Edit Object Settings, click User Information - Property Map.
Add a property. Click Add; then, in the Choose Property dialog box, select the property you want to add and click OK.
Map attributes to the property:
1. Click the Edit icon next to the property name.
2. In the text box, type the attribute.
  
  To map the property to multiple attributes, separate the attribute names with commas (,).
Repeat the previous two steps (adding a property and mapping attributes to it) to map additional properties.

To remove properties, select the property you want to remove and click the Remove icon.

After mapping user information to portal properties, you must import the user information through profile sources or have users manually enter the information by editing their user profiles.

About Filters

Filters control what content goes into which folder when crawling in documents or using Smart Sort to filter content into new folders. A filter sets conditions that document links must pass to be sorted into associated folders in the Knowledge Directory.

A filter is a combination of a basic fields search and statements. The basic fields search operates on the name, description, and content of documents. Statements can operate on the basic fields or any other additional document properties. Statements define what must or must not be true to allow the document to pass the filter.

The statements are collected together in groupings. The grouping defines whether the statements are evaluated with an AND operator (all statements are true) or an OR operator (any statement is true). If some statements should be evaluated with an AND operator and some should be evaluated with an OR operator, you can create separate groupings for the statements. You can also create subgroupings or nested groupings, where one grouping is contained within another grouping. The statements in the lowest-level grouping are evaluated first to define a set of results. Then the statements in the next highest grouping are applied to that set of results to further filter the results. The filtering continues up the levels of groupings until all the groupings of statements are evaluated.
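The way groupings combine statements can be modeled with a short recursive sketch. The `Statement` and `Grouping` classes and the sample document are hypothetical. The recursion evaluates each nested grouping before its parent combines the results, which gives the same pass or fail outcome as evaluating the lowest-level groupings first.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class Statement:
    prop: str                       # property the statement operates on
    test: Callable[[object], bool]  # condition applied to the property value

    def passes(self, doc):
        return self.test(doc.get(self.prop))


@dataclass
class Grouping:
    operator: str  # "AND" or "OR"
    members: List[Union["Statement", "Grouping"]] = field(default_factory=list)

    def passes(self, doc):
        # Nested groupings are evaluated recursively before being combined.
        results = [member.passes(doc) for member in self.members]
        return all(results) if self.operator == "AND" else any(results)


# Hypothetical filter: Author contains "jm" AND (Pages > 100 OR Pages < 20).
grouping_1 = Grouping("AND", [
    Statement("Author", lambda v: v is not None and "jm" in v),
    Grouping("OR", [
        Statement("Pages", lambda v: v is not None and v > 100),
        Statement("Pages", lambda v: v is not None and v < 20),
    ]),
])

print(grouping_1.passes({"Author": "jm", "Pages": 12}))
print(grouping_1.passes({"Author": "jm", "Pages": 50}))
```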

Working with Filters

This section describes the following tasks:

Creating or Editing a Filter
Deleting a Filter
Defining Filter Conditions
Testing Filters
Applying a Filter to a Folder

Creating or Editing a Filter

To create a filter you must have the following rights and privileges:

Access Administration activity right
Create Filters activity right
At least Edit access to the parent folder (the folder that will store the filter)

To edit a filter you must have the following rights and privileges:

Access Administration activity right
At least Edit access to the filter

To create or edit a filter:

Click Administration.
Open the Filter Editor.
- To create a filter, open the folder in which you want to store the filter. In the Create Object list, click Filter.
- To edit a filter, open the folder in which the filter is stored and click the filter name.
On the Main Settings page, perform tasks as necessary:
- Defining Filter Conditions
On the Properties and Names page, perform tasks as necessary:
- Naming and Describing an Object
  
  You can instead enter a name and description when you save this filter.
- Managing Object Properties

Add the filter to folders.

Deleting a Filter

To delete a filter you must have the following rights and privileges:

Access Administration activity right
Admin access to the filter

To delete a filter:

Click Administration.
Navigate to the filter.
Select the filter you want to delete and click the delete icon.

Defining Filter Conditions

A filter is a combination of a basic fields search and statements. The basic fields search operates on the name, description, and content of documents. Statements can operate on the basic fields or any other additional document properties. Statements define what must or must not be true to allow the document to pass the filter. The statements are collected together in groupings. The grouping defines whether the statements are evaluated with an AND operator (all statements are true) or an OR operator (any statement is true). If some statements should be evaluated with an AND operator and some should be evaluated with an OR operator, you can create separate groupings for the statements. You can also create subgroupings or nested groupings, where one grouping is contained within another grouping. The statements in the lowest-level grouping are evaluated first to define a set of results. Then the statements in the next highest grouping are applied to that set of results to further filter the results. The filtering continues up the levels of groupings until all the groupings of statements are evaluated.

A filter needs at least a basic fields search or a statement.

If the Filter Editor is not already open, open it now.
To search the name, description, and content values, type the text you want to search for in the Basic fields search text box.

You can use the text search rules described in Using Text Search Rules.
Select the operator for the grouping of statements you are about to create:
- If a document should pass the filter only when all statements in the grouping are true, select AND.
- If a document should pass the filter when any statement in the grouping is true, select OR.
Note:

The operator you select for a grouping applies to all its statements and subgroupings directly under it.
Define each statement in the grouping:
1. Click Add Statement.
2. In the first list, select the searchable property for which you want to filter the values.
3. In the second list, select the operator to apply to this condition.
  
  This list will vary depending on the property selected:
  - For any text property, you can search for a value that contains your search string, or you can search for properties that have never had a value (Contains No Value).
    
    Note:
    
    If the property contained a value at some point, but the value has been deleted, the property will not match the Contains No Value condition.
  - For any date property, you can search for a value that comes after, comes before, is, or is not the date and time you enter in the boxes. You can also search for a value within the last number of minutes, hours, days, or weeks that you enter in the box.
  - For any number property, you can search for a value that is greater than, is less than, is, is not, is greater than or equal to, or is less than or equal to the number you enter in the text box.
4. In the box (or boxes), specify the value the property must meet.
  
  Note:
  
  If you are searching for a text property, you can use the text search rules described in Using Text Search Rules.
To remove the last statement in a grouping, select the grouping, and click Remove Statement.
If necessary, add more statements by repeating Step 4.
If necessary, add more groupings:
- To add another grouping, select the grouping to which you want to add a subgrouping, click Add Grouping, then define the statements for that grouping (as described in Step 4).
  
  Note:
  
  You cannot add a grouping at the same level as Grouping 1.
- To remove a grouping, select the grouping, and click Remove Grouping.
  Note:
  - Any groupings and statements in that grouping will also be removed.
  - You cannot remove Grouping 1.
To verify that the filter works, click Test Filter.

The results of the test appear under Filter Test Report.

You might want to test your filters before you use them extensively. See Testing Filters.
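
Two of the statement operators described in this procedure have semantics that are easy to misread: Contains No Value matches only properties that have never had a value, and the "within the last" date operator is relative to the time of evaluation. The following is a rough sketch of both, under a hypothetical document model in which a missing key means the property was never set and an empty string means the value was deleted. The function names are illustrative only and are not portal APIs.

```python
from datetime import datetime, timedelta
from typing import Dict

# Hypothetical model: a missing key = property never had a value;
# a key holding "" = the value existed once and was deleted.
Document = Dict[str, object]


def contains_no_value(doc: Document, prop: str) -> bool:
    """True only if the property has never had a value."""
    return prop not in doc


def within_last(doc: Document, prop: str, amount: int, unit: str,
                now: datetime) -> bool:
    """True if the date property falls within the last `amount` units.

    `unit` is one of "minutes", "hours", "days", or "weeks".
    """
    value = doc.get(prop)
    if not isinstance(value, datetime):
        return False
    return now - timedelta(**{unit: amount}) <= value <= now


now = datetime(2024, 1, 10, 12, 0)
never_set = {"name": "Spec"}
cleared = {"name": "Spec", "author": ""}
recent = {"name": "Plan", "modified": datetime(2024, 1, 8, 9, 0)}

never_matches = contains_no_value(never_set, "author")
cleared_matches = contains_no_value(cleared, "author")
in_three_days = within_last(recent, "modified", 3, "days", now)
in_one_day = within_last(recent, "modified", 24, "hours", now)
```

Here the document whose author was cleared does not match Contains No Value, and the document modified two days earlier matches "within the last 3 days" but not "within the last 24 hours".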

Testing Filters

You might want to test your filters before you use them extensively.

To test filters, crawl content into a test folder; then perform one of the following tests:
- Run an advanced search on the folder using the same criteria you used for your filter.
  
  If your advanced search returns the content you expected, you can apply the filter to the appropriate folder, confident that the filter will allow the proper documents into the folder.
- Add the filters to subfolders and perform a smart sort to sort the content into the subfolders according to the filters.
  
  If the subfolders contain the content you expected, the filters work correctly.

Applying a Filter to a Folder

After you create a filter, you assign it to folders to control what content goes into the folder when crawling in documents or using Smart Sort to filter content into new folders.

Note:

When users submit documents, the filters do not apply.

Click Directory.
Click Edit Directory.
Open the Folder Editor for the folder to which you want to apply a filter.
- To edit the root folder (or folder you are in), in the action toolbar in the upper-right of the Edit Directory page, click the Edit Folder icon.
- To edit a subfolder, click the Edit icon to the right of the folder name.
Under Filter Settings, select Links that pass and choose whether documents must pass all filters or at least one filter to sort into this folder.
Under Filters, specify the filters that documents must pass to sort into this folder.
- To add a filter, click Add Filter, select filters, and click OK.
- To create a filter, click Create Filter.
- To remove filters, select the filters you want to remove and click the Remove icon.
- To toggle the names in the list between ascending and descending alphabetical order, click Filter Names.

About Content Types

Content types specify several options — the source content format (such as Microsoft Office, Web page, or Lotus Notes document), whether the text of the content should be indexed for searching, and how to populate values for document properties. You should create a separate content type for each unique combination of these options. For example, if departments use different Microsoft Word attributes for document descriptions, you might have to create one content type that pulls the description from the Subject attribute and one that pulls it from the Comments attribute.

Working with Content Types

This section describes the following tasks:

Creating or Editing a Content Type
Deleting a Content Type
Mapping Content Types to Imported Content Using the Global Content Type Map

Creating or Editing a Content Type

You should create a separate content type for each unique combination of source content format, search indexing setting, and property mappings. For example, if departments use different Microsoft Word attributes for document descriptions, you might have to create one content type that pulls the description from the Subject attribute and one that pulls it from the Comments attribute.

To create a content type you need the following rights and privileges:

Access Administration activity right
Create Content Types activity right
At least Edit access to the parent folder (the folder that will store the content type)

To edit a content type you need the following rights and privileges:

Access Administration activity right
At least Edit access to the content type

To create or edit a content type:

Click Administration.
Open the Content Type Editor.
- To create a content type, open the folder in which you want to store the content type. In the Create Object list, click Content Type.
- To edit a content type, open the folder in which the content type is stored and click the content type name.
In the Document Accessor list, choose the accessor associated with the type of document for which you are creating this content type.

The accessor determines how the portal extracts information from these documents. Use the File Accessor to extract general information (such as name and description) from any type of document. Use other accessors to extract more detailed information; for example, the HTML Accessor can extract title and author values.
If these documents include textual content and you want that content to be searchable, select Index documents of this type for Search.

For example, you would not want to index .zip files, but you probably would want to index .txt files.
Specify property mappings.
- To add a property mapping, click Add Property; then, in the Select Properties dialog box, select the properties you want to map and click OK.
- To change the source document attributes that are associated with a property or to add a default or override value, click the Edit icon to the far-right of the property and type the settings in the appropriate text boxes.
  - If you want the values for this property to be extracted from the source document attributes, in the Mapped Attributes box, type the associated attributes, separated by commas (,).
    
    Content crawlers search the attributes in the order in which you list them here; if there is no value for the first listed attribute, the content crawler looks for the second listed attribute, and so on.
  - If you want this property to have a value even when the source document does not have a value for any mapped attribute, in the Default Value box, type the value you want this property to be given.
  - If you want this property to have the same value for all documents, in the Override Value box, type the value you want this property to be given.
    
    Note:
    
    If you set an override value, the attribute mappings and default value for this property are ignored.
  - To save your settings, click the Save icon.
- To remove a mapping, select it and click the Remove icon.
  
  To select or clear all of the mapping check boxes, select or clear the box to the left of Properties.
- To toggle the order in which the properties are sorted, click Properties.
On the Properties and Names page, perform tasks as necessary:
- Naming and Describing an Object
  
  You can instead enter a name and description when you save this content type.
- Managing Object Properties
On the Security page, perform tasks as necessary:
- Setting Security on an Object
The default security for this content type is based on the security of the parent folder.
If you are editing a content type, on the Migration History and Status page, perform tasks as necessary:
- Viewing Migration History and Status for an Object
Note:

The Migration History and Status page is not available when creating an object.
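
The property mapping rules in this procedure (mapped attributes tried in the order listed, then the default value, with an override value taking precedence over both) can be sketched as follows. This is an illustration of the precedence only, not the portal's code; `PropertyMapping` and `resolve_property` are hypothetical names, and source document attributes are modeled as a plain dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PropertyMapping:
    """Settings for one property on a content type (hypothetical model)."""
    mapped_attributes: List[str] = field(default_factory=list)
    default_value: Optional[str] = None
    override_value: Optional[str] = None


def resolve_property(mapping: PropertyMapping,
                     attributes: Dict[str, str]) -> Optional[str]:
    # An override value wins outright: mappings and default are ignored.
    if mapping.override_value is not None:
        return mapping.override_value
    # Otherwise use the first mapped attribute that has a value,
    # checking the attributes in the order they were listed.
    for name in mapping.mapped_attributes:
        value = attributes.get(name)
        if value:
            return value
    # No mapped attribute had a value: fall back to the default, if any.
    return mapping.default_value


description = PropertyMapping(["Subject", "Comments"],
                              default_value="No description")

from_comments = resolve_property(description, {"Comments": "Quarterly results"})
from_default = resolve_property(description, {})

forced = PropertyMapping(["Subject"], default_value="n/a",
                         override_value="Internal")
from_override = resolve_property(forced, {"Subject": "Public notice"})
```

In the usage above, the first document has no Subject attribute, so the description is taken from Comments; the empty document receives the default value; and the property with an override value ignores the Subject attribute entirely.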

Deleting a Content Type

To delete a content type you must have the following rights and privileges:

Access Administration activity right
Admin access to the content type

To delete a content type:

Click Administration.
Navigate to the content type.
Select the content type you want to delete and click the delete icon.

Mapping Content Types to Imported Content Using the Global Content Type Map

The Global Content Type Map enables you to map file extensions (for example, .doc, .txt, .html) to content types, to define which content types are applied to imported content (whether imported by a content crawler or uploaded by a user).

To access the Global Content Type Map you must be a member of the Administrators group.

The initial content types and mappings enable you to import any type of content into the portal, but you will probably want to create custom content types and mappings to import metadata specific to your company's needs.

Note:

Users with the Advanced Document Submission activity right can use remote document submission or Web document submission and can override the default content type specified in the Global Content Type Map.
When users create content crawlers, the Content Type page displays the default mappings specified in the Global Content Type Map. They can then override these mappings to fit the needs of the individual content crawler.

Click Administration.
In the Select Utility list, choose Global Content Type Map.
Configure identifiers for content types.

You probably want to index full-text and import specific metadata from the majority of content you import into the portal. However, there are some file types that do not include much metadata and cannot be full-text indexed (for example, .exe or .zip files). You can map these file types to the Non Indexed Files content type. Initially, the Global Content Type Map specifies that any file extension that is not mapped uses the Non Indexed Files content type (the last identifier in the list is *, which includes all file extensions).

The * identifier mapping allows any type of file to be imported into the portal.
- If you want to limit the types of files that can be imported into the portal:
  - Remove the mappings for any file types you want to exclude from the portal.
  - Add mappings for any non-indexed files you want to include in the portal (for example, .zip files).
  - Remove the * mapping.
- If you do not want to limit the types of files that can be imported into the portal, keep the * mapping at the bottom of the list so that it is not applied to a file type that has a mapping.
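
The effect of the * mapping and its position can be sketched as an ordered lookup. The sketch assumes that the map is read from top to bottom and that the first matching identifier is used, which is consistent with the advice to keep * at the bottom of the list. Apart from Non Indexed Files, the content type names are hypothetical, and `content_type_for` is an illustrative function rather than a portal API.

```python
import os
from typing import List, Optional, Tuple

# Ordered list of (identifier, content type) pairs, top of the list first.
Mappings = List[Tuple[str, str]]


def content_type_for(filename: str, mappings: Mappings) -> Optional[str]:
    """Return the content type of the first matching identifier, if any."""
    extension = os.path.splitext(filename)[1].lower()
    for identifier, content_type in mappings:
        if identifier == "*" or identifier.lower() == extension:
            return content_type
    # No identifier matched: in this sketch the file is not imported.
    return None


default_map: Mappings = [
    (".doc", "Word Documents"),   # hypothetical content type name
    (".txt", "Text Files"),       # hypothetical content type name
    ("*", "Non Indexed Files"),   # catch-all, kept last
]

word_type = content_type_for("plan.DOC", default_map)
exe_type = content_type_for("setup.exe", default_map)

# Removing the * mapping limits imports to the listed file types.
restricted_map = default_map[:-1]
exe_restricted = content_type_for("setup.exe", restricted_map)

# Moving * to the top would shadow every other mapping.
misordered_map = [default_map[-1]] + default_map[:-1]
word_misordered = content_type_for("plan.doc", misordered_map)
```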

Prioritizing a List of Objects

You can change the order of objects in the list:

To move an object to the top of the list, click the Move to Top icon.
To move an object up one space in the list, click the Move Up icon.
To move an object down one space in the list, click the Move Down icon.
To move an object to the bottom of the list, click the Move to Bottom icon.

About Importing Content

You can provide access through your portal to existing content in external document repositories, such as secured Web sites or Windows NT file systems.

This section describes the components involved in importing content and how you can import content security:

Content Providers
Content Web Services
Content Sources
Example of Importing Content Security

Content Providers

A content provider is a piece of software that tells the portal how to use the information in the external content repository. There are three different types of content providers:

Crawl providers tell the portal how to navigate through the content hierarchy.
Document providers tell the portal how to get information from a particular type of document.
Upload providers tell the portal how to copy a document into the Document Repository.

Oracle provides content providers for the following types of content repositories as part of Oracle WebCenter Interaction:

Windows NT File System
Documentum
Microsoft Exchange
Lotus Notes
Oracle Universal Content Management

Note:

You must install the content provider before you can create the associated content Web service. For information on installing content providers, refer to the Oracle Fusion Middleware Installation Guide for Oracle WebCenter Interaction for Windows or the Oracle Fusion Middleware Installation Guide for Oracle WebCenter Interaction for Unix and Linux.

If your content resides in a custom system, such as a custom database, you can import it by writing your own content provider using the IDK. For details, see the Oracle Fusion Middleware Web Service Developer's Guide for Oracle WebCenter Interaction.

Content Web Services

Content Web services enable you to specify general settings for your external content repository once. The target and security settings are set in the associated remote content source and remote content crawler, so you can crawl multiple locations of the same content repository without having to repeatedly specify all the settings.

Content Sources

Content sources provide access to external content repositories, enabling users to submit documents and content managers to create content crawlers to import documents into the Knowledge Directory. Each content source is configured to access a particular document repository with specific authentication. For example, a content source for a secured Web site can be configured to fill out the Web form necessary to gain access to that site. Register a content source for each secured Web site or back-end repository from which content can be imported into your portal.

There are two types of content sources: Web content sources and remote content sources. Web content sources provide access to Web sites. Remote content sources provide access to external content repositories, such as a Windows NT file system, Documentum, Microsoft Exchange, or Lotus Notes.

Note:

If you delete a content source from which documents have been imported into the portal, the links to the documents will still exist, but users will no longer be able to access these documents.

Content Source Histories

Each content source keeps track of the content that has been imported, deleted, or rejected by the content crawlers that access it. It keeps a record of imported files so that content crawlers do not create duplicate links. To prevent multiple copies of the same link from being imported into your portal, set all content crawlers that access the same content source to import only content that has not already been imported from that content source.
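
The role of the history in preventing duplicates can be sketched as follows. The sketch models only the record of imported items (not deleted or rejected ones) as a set shared by every crawler that uses the same content source. The `crawl` function and its `only_new` parameter are hypothetical stand-ins for a content crawler and its "import only content that has not already been imported" setting.

```python
from typing import List, Set


def crawl(history: Set[str], locations: List[str],
          only_new: bool = True) -> List[str]:
    """Return the links this crawl creates, updating the shared history."""
    links = []
    for location in locations:
        # With only_new set, anything already recorded for this
        # content source is skipped instead of imported again.
        if only_new and location in history:
            continue
        history.add(location)
        links.append(location)
    return links


# One history per content source, shared by all of its crawlers.
source_history: Set[str] = set()

first = crawl(source_history, ["/docs/a.txt", "/docs/b.txt"])
second = crawl(source_history, ["/docs/b.txt", "/docs/c.txt"])

# A crawler that does not restrict itself to new content
# creates a duplicate link.
third = crawl(source_history, ["/docs/a.txt"], only_new=False)
```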

Content Sources and Security

Because a content source accesses secured documents, you must secure access to the content source itself. Content sources, like everything in the portal, have security settings that allow you to specify exactly which portal users and groups can see the content source. Users who do not have at least Select access to a content source cannot select it, or even see it, when submitting content or building a content crawler.

Using Content Sources and Security to Control Access

You can create multiple content sources that access the same repository of information. For example, you might have two Web content sources accessing the same Web site. One of these content sources could access the site as an executive user that can see all of the content on the site. The other content source would access the site as a managerial user that can see some secured content, but not everything. You could then grant executive users access to the content source that accesses the Web site as an executive user, and grant managerial users access to the content source that accesses the Web site as a managerial user.

Note:

If you crawled the same repository using both of these content sources, you would import duplicate links into your portal, as described previously in Content Source Histories.

Content Sources Available with the Portal

Some content sources (and their necessary content Web services and remote servers) are automatically created in the Portal Resources folder when you install the portal. There are also content sources that are available with the portal installation, but require additional steps to complete installation. For information on the additional installation steps, refer to the Oracle Fusion Middleware Installation Guide for Oracle WebCenter Interaction for Windows or the Oracle Fusion Middleware Installation Guide for Oracle WebCenter Interaction for Unix and Linux.

World Wide Web: This content source provides access to any unsecured Web site.
Content Upload: This content source lets users upload a document from an internal network. You should upload a document if it is not normally accessible by the users you want to see it. For example, if the document is located on your computer and your computer is not accessible by other users, you should upload the document. Additionally, if you run an extranet, where users may not typically have access to your internal network, you should upload documents you want to make accessible externally.

Content Crawlers

Content crawlers enable you to import content into the portal. Web content crawlers enable you to import content from Web sites. Remote content crawlers enable you to import content from external content repositories such as a Windows NT file system, Documentum, Microsoft Exchange, Lotus Notes, or Oracle Universal Content Management.

Metadata Imported by Content Crawlers

All content crawlers index the full document text. Some content crawlers can also import additional metadata, such as document or folder security, as shown in the following table.

Content Crawler	Import Links to Documents	Import Document Security	Import Folder Security
Web Content Crawler	Yes	No	No
Remote Windows Content Crawler	Yes	Yes (Windows)	Yes (Windows)
Remote Exchange Content Crawler (Windows)	Yes	No	No
Remote Lotus Notes Content Crawler (Windows)	Yes	Yes	No
Remote Documentum Content Crawler	Yes	Yes	Yes

Content Crawler Best Practices

To facilitate maintenance, we recommend you implement several instances of each content crawler type, configured for limited, specific purposes.
For file system content crawlers, you might want to implement a content crawler that mirrors an entire file system folder hierarchy by specifying a top-level starting point and its subfolders. Although the content in your folder structure is available on your network, replicating this structure in the portal offers several advantages:
- Users are able to search and access the content over the web.
- Interested users can receive regular updates on new content with snapshot queries.
- You can use default profiles to direct new users to important folders.
However, you might find it easier to maintain controlled access, document updates, or document expiration by creating several content crawlers that target specific folders.
If you plan to crawl Web locations, familiarize yourself with the pages you want to import. Often, you can find one or two pages that contain links to everything of interest. For example, most companies offer a list of links to their latest press releases, and most Web magazines offer a list of links to their latest articles. When you configure your content crawler for this source, you can target these pages and exclude others to improve the efficiency of your crawl jobs.
If you know that certain content will no longer be relevant after a date—for example, if the content is related to a fiscal year, a project complete date, or the like—you might want to create a content crawler specifically for the date-dependent content. When the content is no longer relevant, you can run a job that removes all content created by the specific content crawler.
For remote content crawlers, you might want to limit the target for mail content crawlers to specific user names; you might want to limit the target for document content crawlers to specific content types.

For additional considerations and best practices, see the Oracle Fusion Middleware Deployment Guide for Oracle WebCenter Interaction.

Example of Importing Content Security

Assume that you create an authentication source called myAuthSource that imports users and groups into the portal from a source domain called myDomain. This authentication source uses the category Employees. Therefore, the text “Employees\” is prepended to each user's name and each group's name to distinguish these users and groups from those imported through other authentication sources. For example, if you have a user myDomain\Mary in the source domain, the user is imported into the portal as Employees\Mary.

Every authentication source automatically creates a group that includes all the users imported through that authentication source. In this example, because the authentication source is called myAuthSource, the group that includes all imported users is called Everyone in myAuthSource.

Suppose that you want to import content from a Lotus Notes system called myNotes, which includes users and groups equivalent to those found in the myDomain domain. Because you have already imported these groups and users into the portal, your Notes content crawler can import Notes security information along with each Notes document. The groups in the Notes system do not have to have the same names as their corresponding groups in the myDomain domain or in the portal; the important thing is that there are Notes groups that have equivalent portal groups. If there are Notes groups that do not have equivalent groups in the portal, your Notes content crawler will ignore security settings referring to such groups.

When your Notes content crawler finds a document, it creates a list of the Notes groups that have access to it. This list is called an ACL (Access Control List). The ACLs created for Notes documents do not contain entries for specific Notes users, only for Notes groups. (Notes content crawlers only grant access to portal groups. Windows File content crawlers do grant access to portal users.) Each ACL entry is written as {Notes Server Name}\{Notes Group Name}. In this example, the content crawler creates an ACL with the single entry myNotes\Engineering, because this is the only Notes group that has access to that document.

The content crawler then refers to the Global ACL Sync Map to determine which portal group corresponds to this Notes group. This is a two-stage process:

Knowing that you would import documents and security through Notes content crawlers, on the Prefix: Domain Map page, you mapped the myAuthSource category Employees to the source domain myNotes. Guided by this entry, your content crawler modifies the ACL entry from myNotes\Engineering to Employees\Engineering.
Knowing that your Notes system uses a different group name than your myDomain domain, on the Portal: External Group Map page, you mapped the Notes group Engineering to the myDomain group Developers, which is now also a portal group. Guided by this entry, your content crawler modifies the ACL entry from Employees\Engineering to Employees\Developers.

As a result, all the users in the portal group Developers are automatically granted access to the document.
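
The two-stage translation in this example can be sketched as follows. This is an illustration only: `sync_acl_entry` is a hypothetical function, and the two dictionaries stand in for the Prefix: Domain Map and Portal: External Group Map pages, keyed by source domain and external group name for easy lookup. The sketch also assumes that a group with no External Group Map entry keeps its name, and that an entry from an unmapped domain is ignored; the example above does not state either behavior.

```python
from typing import Dict, Optional, Set


def sync_acl_entry(entry: str,
                   prefix_domain_map: Dict[str, str],
                   external_group_map: Dict[str, str],
                   portal_groups: Set[str]) -> Optional[str]:
    """Translate 'Server\\Group' into a portal group name, or None."""
    domain, separator, group = entry.partition("\\")
    if not separator:
        return None
    # Stage 1: replace the source domain with the mapped prefix (category).
    prefix = prefix_domain_map.get(domain)
    if prefix is None:
        return None  # assumption: unmapped domains are ignored
    # Stage 2: replace the external group name with the portal group name.
    # Assumption: unmapped group names carry over unchanged.
    group = external_group_map.get(group, group)
    portal_name = prefix + "\\" + group
    # Entries with no equivalent portal group are ignored.
    return portal_name if portal_name in portal_groups else None


prefix_domain_map = {"myNotes": "Employees"}
external_group_map = {"Engineering": "Developers"}
portal_groups = {"Employees\\Developers"}

granted = sync_acl_entry("myNotes\\Engineering", prefix_domain_map,
                         external_group_map, portal_groups)
ignored = sync_acl_entry("myNotes\\Contractors", prefix_domain_map,
                         external_group_map, portal_groups)
```

In this sketch the entry myNotes\Engineering resolves to Employees\Developers, while an entry for a Notes group with no equivalent portal group resolves to nothing and grants no access.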

Working with Content Web Services

This section describes the following main tasks:

Creating or Editing a Content Web Service
Deleting a Content Web Service

It also covers the following low-level tasks:

Sending General Settings from a Web Service to Associated Content Crawlers

Creating or Editing a Content Web Service

Before you create a content Web service, you must:

Install the content provider on the computer that hosts the portal or on another computer
Create a remote server pointing to the computer that hosts the content provider (optional, but recommended)

To create a content Web service you must have the following rights and privileges:

Access Administration activity right
Create Web Service Infrastructure activity right
At least Edit access to the parent folder (the folder that will store the content Web service)
At least Select access to the remote server that the content Web service will use

To edit a content Web service you must have the following rights and privileges:

Access Administration activity right
At least Edit access to the content Web service
If you plan to change the remote server association, at least Select access to the remote server that the content Web service will use

To create or edit a content Web service:

Click Administration.
Open the Content Web Service Editor.
- To create a content Web service, open the folder in which you want to store the content Web service. In the Create Object list, click Web Service — Content.
- To edit a content Web service, open the folder in which the content Web service is stored and click its name.
On the Main Settings page, complete the following tasks:
On the HTTP Configuration page, perform tasks as necessary:
- Specifying How Gatewayed Content is Handled
- Specifying What Content is Gatewayed for a Web Service
On the Advanced URL Settings page, perform tasks as necessary:
On the Advanced Settings page, perform tasks as necessary:
On the Authentication Settings page, perform tasks as necessary:
- Specifying Authentication Settings for a Web Service
On the Preferences page, perform tasks as necessary:
- Sending User Preferences from the Web Service to Associated Objects
On the User Information page, perform tasks as necessary:
- Sending User Information from a Web Service to Associated Objects
On the Debug Settings page, perform tasks as necessary:
- Enabling Error Tracing for a Web Service
On the Associated Objects page, perform tasks as necessary:
- Viewing Objects Associated with a Web Service
On the Properties and Names page, perform tasks as necessary:
- Naming and Describing an Object
  
  You can instead enter a name and description when you save this content Web service.
- Managing Object Properties
On the Security page, perform tasks as necessary:
- Setting Security on an Object
The default security for this content Web service is based on the security of the parent folder. Administrative users with at least Select access to this content Web service and the Create Content Source activity right can create content sources based on the Web service.
If you are editing a content Web service, on the Migration History and Status page, perform tasks as necessary:
- Viewing Migration History and Status for an Object
Note:

The Migration History and Status page is not available when creating an object.

Deleting a Content Web Service

To delete a content Web service you must have the following rights and privileges:

Access Administration activity right
Admin access to the content Web service

To delete a content Web service:

Click Administration.
Navigate to the content Web service.
Select the content Web service you want to delete and click the delete icon.

Note:

Deleting a content Web service will break any associated content sources.

Sending General Settings from a Web Service to Associated Content Crawlers

To specify what information the content Web service passes to associated content crawlers:

If the Content Web Service Editor is not already open, open it now.
Click the Advanced Settings page.
Under Settings, specify what general information, if any, you want this Web service to pass to its associated content crawlers:
- To allow users to import documents into the Directory, select one or more of the following options:
  - If this Web service requires only a file path to import a document, select Support Document Submission using file paths.
  - This Web service might require detailed configuration information; for example, a content Web service that imports information from Lotus Notes requires users to navigate to the document they want to import. If this is the case, select Support Document Submission using Remote UI.
  - If you want to allow users to upload documents into the Document Repository (rather than just creating a link to the source document), select Supports Document Submission Upload. If the documents that users submit come from repositories that are not accessible over the network, you should choose this option.
- If content crawlers associated with this Web service can copy the source folder structure into the portal, select Supports mirroring the source folder structure.
- If content crawlers associated with this Web service can copy the source document security into the portal, select Supports importing security with each document.
- To send the time zone of the user from which the portlet request is sent, select Send timezone to Portlets.
- If this Web service requires the user to have an API session (for example, if the Web service uses the SOAP API), select Send Login token to Portlets; in the Login Token duration box, type the number of minutes you want the API session to last.
- To send the ID of the experience definition from which the request is sent, select Send Experience Definition ID to Portlets.

Working with Content Sources

This section describes the following main tasks:

Creating or Editing a Remote Content Source
Creating or Editing a Web Content Source
Deleting a Content Source

It also covers the following low-level tasks:

Gatewaying Imported Content
Providing Access to Web Content Through a Proxy Server
Selecting a Web Service for Gatewayed Content
Providing Access to Web Content by Impersonating a User
Providing Access to Web Content Through a Login Form
Providing Access to Web Content Through Cookies
Providing Access to Web Content Through Header Information

Creating or Editing a Remote Content Source

Before you create a remote content source, you must:

Install the crawl provider on the computer that hosts the portal or on another computer
Create a remote server pointing to the computer that hosts the crawl provider (optional, but recommended)
Create a content Web service on which to base this content source

To create a remote content source you must have the following rights and privileges:

Access Administration activity right
Create Content Sources activity right
At least Edit access to the parent folder (the folder that will store the content source)

To edit a remote content source you must have the following rights and privileges:

Access Administration activity right
At least Edit access to the content source

To create or edit a remote content source:

Click Administration.
Open the Remote Content Source Editor.
- To create a remote content source, open the folder in which you want to store the content source. In the Create Object list, click Content Source - Remote. In the Choose Web Service dialog box, select the Web service that provides the basic settings for your content source and click OK.
- To edit a remote content source, open the folder in which the content source is stored and click the content source name.
On the Main Settings page, perform tasks as necessary:
- If necessary, edit the content Web service associated with this content source by clicking the Web service name.
- Gatewaying Imported Content
Note:

Depending on what type of remote content source you are creating, you might see additional settings and additional pages.
On the Properties and Names page, perform tasks as necessary:
- Naming and Describing an Object
  
  You can instead enter a name and description when you save this content source.
- Managing Object Properties
On the Security page, perform tasks as necessary:
- Setting Security on an Object
If you are editing a content source, on the Migration History and Status page, perform tasks as necessary:
- Viewing Migration History and Status for an Object
Note:

The Migration History and Status page is not available when creating an object.

Users with at least Select access to this content source can now submit documents from this content source or create content crawlers that will access this content source.

Creating or Editing a Web Content Source

Note:

The World Wide Web content source, created upon install, provides access to any unsecured Web site.

To create a Web content source you must have the following rights and privileges:

Access Administration activity right
Create Content Sources activity right
At least Edit access to the parent folder (the folder that will store the content source)

To edit a Web content source you must have the following rights and privileges:

Access Administration activity right
At least Edit access to the content source

To create or edit a Web content source:

Click Administration.
Open the Web Content Source Editor.
- To create a Web content source, open the folder in which you want to store the content source. In the Create Object list, click Content Source - WWW. In the Choose Web Service dialog box, select the Web service that provides the basic settings for your content source and click OK.
- To edit a Web content source, open the folder in which the content source is stored and click the content source name.
On the Main Settings page, perform tasks as necessary:
- Providing Access to Web Content by Impersonating a User
- Gatewaying Imported Content
- Selecting a Web Service for Gatewayed Content (only necessary if you selected to gateway content)
On the Proxy Server Configuration page, perform tasks as necessary:
- Providing Access to Web Content Through a Proxy Server
On the Login Form Settings page, perform tasks as necessary:
- Providing Access to Web Content Through a Login Form
On the Cookie Information page, perform tasks as necessary:
- Providing Access to Web Content Through Cookies
On the Header Information page, perform tasks as necessary:
- Providing Access to Web Content Through Header Information
On the Properties and Names page, perform tasks as necessary:
- Naming and Describing an Object
  
  You can instead enter a name and description when you save this content source.
- Managing Object Properties
On the Security page, perform tasks as necessary:
- Setting Security on an Object
If you are editing a content source, on the Migration History and Status page, perform tasks as necessary:
- Viewing Migration History and Status for an Object
Note:

The Migration History and Status page is not available when creating an object.

Users with at least Select access to this content source can now submit documents from this content source or create content crawlers that will access this content source.

Deleting a Content Source

To delete a content source you must have the following rights and privileges:

Access Administration activity right
Admin access to the content source

To delete a content source:

Click Administration.
Navigate to the content source.
Select the content source you want to delete and click the delete icon.

Note:

Deleting a content source will break the links to any content imported, submitted, or uploaded from that content source.

Gatewaying Imported Content

When users click a link to an imported document, they can either be directed to the actual location of the source document, or the content can be gatewayed. With gatewayed content, the user is redirected to a URL (generated from the settings in your portal configuration file) that, in turn, displays the document. Gatewaying content allows users to view documents they might not otherwise be able to view, either due to security on the source repository or firewall restrictions. You configure gateway settings on the Main Settings page of the Content Source Editor.

If the Content Source Editor is not already open, open it now.
Under URL Type, specify what happens when users follow document links:
- If you want to direct users to the actual location of the document, choose Does not use the Gateway to open documents. Be warned, however, that with this option, even users with access to this content source's documents will not be able to open the documents if the documents are not publicly available and the users are not connected to your network.
- If you want to redirect users to a URL (generated by the settings in your portal configuration file) that, in turn, displays the document, choose Uses the gateway to open documents.
  Note:
  - If you want your users to be able to view documents even when they are not connected to your network, you should choose this option.
  - If the associated content Web service supports content upload (specified on the Advanced Settings page of the Content Web Service Editor), you must use the gateway or content uploads will fail.
By default Web content sources do not gateway content, whereas remote content sources do.

Providing Access to Web Content Through a Proxy Server

If you use a proxy server to access the internet, you can specify the proxy server settings on the Proxy Server Configuration page of the Content Source Editor.

If the Content Source Editor is not already open, open it now.
Click the Proxy Server Configuration page.
In the Address box, type the name of your proxy server.
In the Port box, type the port number for your proxy server.
If this proxy server requires security information:
1. In the User name box, type the name of the user you want the portal to impersonate to access this proxy server.
2. In the Password box, type the password for the user you specified.
3. In the Confirm box, type the password again.
If you do not require the proxy server to access computers hosted on your local network, select Bypass proxy server for local addresses.
If there are other sites that do not require the proxy server, in the Do not use for addresses beginning with box, type the base URLs of these Web sites.

Separate multiple URLs with semicolons (;).
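
The following minimal sketch illustrates how the bypass settings on this page could combine for a single request. It is not the portal's implementation: the function names, the prefix comparison, and the rule that a "local" address is a host name without a dot are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def parse_bypass_list(raw: str) -> list[str]:
    """Split the semicolon-separated box value into clean base URLs."""
    return [part.strip() for part in raw.split(";") if part.strip()]


def uses_proxy(url: str, bypass_prefixes: list[str], bypass_local: bool) -> bool:
    """Return True if a request to url would be sent through the proxy."""
    host = urlsplit(url).hostname or ""
    # Assumption: a host name without a dot is treated as a local address.
    if bypass_local and "." not in host:
        return False
    # Assumption: "addresses beginning with" is a case-insensitive prefix test.
    return not any(url.lower().startswith(p.lower()) for p in bypass_prefixes)


prefixes = parse_bypass_list("http://intranet.example.com; http://wiki.example.com")
print(uses_proxy("http://wiki.example.com/docs", prefixes, bypass_local=True))
```

Trimming whitespace around each entry keeps a value such as "a; b" from producing a prefix that starts with a space and never matches.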

Selecting a Web Service for Gatewayed Content

If you selected to gateway the content from this content source, you must select a Web service to associate with this content source. You can specify that information on the Main Settings page of the Web Content Source Editor.

If the Content Source Editor is not already open, open it now.
Under Web Service, associate a content Web service with this content source:

This section appears only if you selected to gateway content.
- To associate an existing content Web service, click Browse; then, in the Choose Web Service dialog box, choose a content Web service and click OK.
- To remove the association, click Remove.
- To edit the associated content Web service, click its name.

Providing Access to Web Content by Impersonating a User

If the Web site accessed by this content source requires a specific user name and password to access the site, you can specify that information on the Main Settings page of the Web Content Source Editor.

If the Content Source Editor is not already open, open it now.
Under Target Site Security, specify the security information required to access this Web site:
1. In the User name box, type the name of the user that this portal will impersonate to access content from this Web site.
2. In the Password box, type the password for the user.
3. In the Confirm password box, type the password again.

Providing Access to Web Content Through a Login Form

If the Web site accessed by this content source requires users to complete a form to access the site, you can specify the login form settings on the Login Form Settings page of the Content Source Editor.

If the Content Source Editor is not already open, open it now.
Click the Login Form Settings page.
In the Login URL box, type the URL to the login form that must be completed.
In the Post URL box, type the URL to which this login form posts data.

To find the URL, search the form's source HTML for the <FORM> tag; the ACTION attribute contains the URL to which the form posts.
Under Form Fields, specify the information needed to gain access to this site:

To determine this information, you can either contact the person who wrote the form or search the form's source HTML for each <INPUT> tag.
- To add information for an <INPUT> tag:
  1. Click Add.
  2. In the Name box, type the text after “name=” from the <INPUT> tag.
    
    For example, if the form includes <INPUT type="password" name="Password" size="10">, type Password.
  3. In the Value box, type the text you would normally type in the form field.
    
    Using the example from the previous step, you would type the password needed to access the site.
- To remove a name/value pair, select the name/value and click the Remove icon.
  
  To select or clear all of the name/value pair boxes, select or clear the box to the left of Name.
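
If you prefer not to read the form's source HTML by hand, a short script can list the values you need. The sketch below uses Python's standard html.parser module to report the ACTION attribute of the first <FORM> tag and the name of each <INPUT> tag. The sample HTML and the URL in it are invented for illustration; paste the real form source in its place.

```python
from html.parser import HTMLParser


class LoginFormInspector(HTMLParser):
    """Collects the post URL and the named input fields of a login form."""

    def __init__(self):
        super().__init__()
        self.post_url = None   # value for the Post URL box
        self.fields = []       # (name, type) pairs for the Form Fields list

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <FORM ACTION=...>
        # and <form action=...> are handled the same way.
        attributes = dict(attrs)
        if tag == "form" and self.post_url is None:
            self.post_url = attributes.get("action")
        elif tag == "input" and attributes.get("name"):
            self.fields.append((attributes["name"], attributes.get("type", "text")))


# Hypothetical form source, used only to demonstrate the inspector.
SAMPLE = """
<FORM METHOD="post" ACTION="https://www.example.com/login/submit">
  <INPUT type="text" name="UserName" size="10">
  <INPUT type="password" name="Password" size="10">
  <INPUT type="submit" value="Log in">
</FORM>
"""

inspector = LoginFormInspector()
inspector.feed(SAMPLE)
print("Post URL:", inspector.post_url)
for name, field_type in inspector.fields:
    print("Field:", name, "type:", field_type)
```

Inputs without a name attribute, such as the submit button in the sample, are skipped because they have no name to enter in the Name box.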

Providing Access to Web Content Through Cookies

If the Web site accessed by this content source requires information to be sent in the form of cookies, you can specify the cookie settings on the Cookie Information page of the Content Source Editor.

If the Content Source Editor is not already open, open it now.
Click the Cookie Information page.
Determine what cookie information you must send through one of the following methods:
- Contact the person who wrote the form.
- View the cookies through your internet browser:
  1. Set your internet browser to prompt you before accepting cookies.
    
    Refer to your browser's online help for instructions.
  2. Navigate to the Web site this content source will access.
  3. When prompted to accept a cookie, view the cookie information.
    
    For each cookie you receive, make note of the name and data values and the base URL from which it was sent.
Under Cookies, specify the cookie information needed to gain access to this site:
- To add information for a cookie:
  1. Click Add.
  2. In the Name box, type the text displaying in the Name field for the cookie.
  3. In the Value box, type the text displaying in the Data field for the cookie.
  4. In the Cookie URL box, type the base URL from which the cookie was sent.
    
    For example, if you need a cookie to access all areas of a Web site, you might type http://www.mysite.com, but if the cookie is needed to access only a particular area of a Web site, you might type http://www.mysite.com/securedcontent.
- To remove a cookie, select the cookie and click the Remove icon.
  
  To select or clear all of the cookie boxes, select or clear the box to the left of Name.
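
The effect of the Cookie URL value can be sketched as follows. The example assumes that a cookie is sent with every request whose URL begins with the cookie's base URL; that prefix rule, the CookieSetting class, and the cookie names are illustrative assumptions, not a description of the portal's internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CookieSetting:
    name: str        # text from the Name box
    value: str       # text from the Value box
    cookie_url: str  # text from the Cookie URL box


def cookie_header_for(url: str, settings: list[CookieSetting]) -> str:
    """Build a Cookie header value from the settings that apply to url."""
    # Assumption: a cookie applies when the requested URL starts with its base URL.
    matching = [c for c in settings if url.startswith(c.cookie_url)]
    return "; ".join(f"{c.name}={c.value}" for c in matching)


settings = [
    CookieSetting("session", "abc", "http://www.mysite.com"),
    CookieSetting("area", "secure", "http://www.mysite.com/securedcontent"),
]
print(cookie_header_for("http://www.mysite.com/securedcontent/report.html", settings))
```

Under this assumption, the narrower base URL limits the second cookie to the secured area, while the first cookie accompanies every request to the site.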

Providing Access to Web Content Through Header Information

If the Web site accessed by this content source requires header information to access the site, you can specify the header information on the Header Information page of the Content Source Editor.

If the Content Source Editor is not already open, open it now.
Click the Header Information page.
Paste the required header information into the text box if one of the following is true:
- The Web site accessed by this content source only responds to requests with specific information in the included HTTP header.
- Your proxy server sends requests beyond your firewall only if there is specific information in the header.

Working with Content Crawlers

This section describes the following main tasks:

Creating or Editing a Remote Content Crawler
Creating or Editing a Web Content Crawler
Deleting a Content Crawler

It also covers the following low-level tasks:

Specifying Where and How Far to Crawl
Setting Destination Folders for Imported Content
Mirroring the Source Folder Structure
Setting a Content Crawler to Obey Folder Filters
Automatically Approving Imported Documents
Importing Security with Imported Documents
Manually Granting Access to Imported Documents
Avoiding Importing Unwanted Content
Specifying a Time-Out Setting for a Web Content Crawler
Specifying Expiration and Refresh Settings for Imported Documents
Customizing the Content Type Mappings for a Content Crawler
Specifying What to Do with Rejected Documents
Specifying What to Do on Subsequent Crawls
Marking Imported Documents with a Crawler Tag
Configuring the Number of Threads Used to Crawl Content
Testing a Content Crawler
Troubleshooting the Results of a Crawl
Destination Folder Flowchart

Creating or Editing a Remote Content Crawler

You can create a remote content crawler to import content (and security) from external document repositories.

Before you create a remote content crawler, you must:

Install the content provider on the computer that hosts the portal or on another computer.
Create a remote server.
Create a content Web service.
Create a content source.
Create the folders in which you want to store the imported content.
Create and apply any filters to the folders to control the sorting of imported content.
Create any users and groups to which you want to grant access to the imported content.

To create a remote content crawler you must have the following rights and privileges:

Access Administration activity right
Create Content Crawlers activity right
At least Edit access to the parent folder (the folder that will store the content crawler)
At least Select access to the content source on which this content crawler will be based
At least Select access to the folders in which you want to store the imported content
At least Select access to the users and groups to which you want to grant access to the imported content

To edit a remote content crawler you must have the following rights and privileges:

Access Administration activity right
At least Edit access to the remote content crawler
At least Select access to the content Web service on which this content crawler will be based
If you plan to change the folders into which you will store the imported content, at least Select access to the folders
If you plan to change the users and groups to which you will grant access to the imported content, at least Select access to the users and groups

To create or edit a remote content crawler:

Click Administration.
Open the Remote Content Crawler Editor.
- To create a remote content crawler, open the folder in which you want to store the content crawler. In the Create Object list, click Content Crawler — Remote. In the Choose Content Source dialog box, select the content source that provides access to the content you want to crawl and click OK.
- To edit a remote content crawler, open the folder in which the content crawler is stored and click the remote content crawler name.
On the Main Settings page, perform tasks as necessary:
- Define where and how far to crawl. Depending on what type of content repository you are crawling, you see different options.
- Setting Destination Folders for Imported Content
- Mirroring the Source Folder Structure
- Setting a Content Crawler to Obey Folder Filters
- Automatically Approving Imported Documents
- Importing Security with Imported Documents
- Manually Granting Access to Imported Documents
On the Document Settings page, perform tasks as necessary:
- Specifying Expiration and Refresh Settings for Imported Documents
On the Content Type page, perform tasks as necessary:
- Customizing the Content Type Mappings for a Content Crawler
On the Advanced Settings page, perform tasks as necessary:
- In the list under Content Language, choose the language in which the majority of content to import is written.
- Specifying What to Do with Rejected Documents
- Specifying What to Do on Subsequent Crawls
- Marking Imported Documents with a Crawler Tag
- Configuring the Number of Threads Used to Crawl Content
On the Set Job page, perform tasks as necessary:
- Associating an Object with a Job
On the Properties and Names page, perform tasks as necessary:
- Naming and Describing an Object
  
  You can instead enter a name and description when you save this content crawler.
- Managing Object Properties
On the Security page, perform tasks as necessary:
- Setting Security on an Object
The default security for this content crawler is based on the security of the parent folder.
If you are editing a content crawler, on the Migration History and Status page, perform tasks as necessary:
- Viewing Migration History and Status for an Object
Note:

The Migration History and Status page is not available when creating an object.

To import content, run the job you associated with this content crawler.

Creating or Editing a Web Content Crawler

You can create a Web content crawler to import content from Web sites and RSS feeds.

Before you create a Web content crawler, you must:

Create a content source, if necessary, to access secured content.
Create the folders in which you want to store the imported content.
Create and apply any filters to the folders to control the sorting of imported content.
Create any users and groups to which you want to grant access to the imported content.

To create a Web content crawler you must have the following rights and privileges:

Access Administration activity right
Create Content Crawlers activity right
At least Edit access to the parent folder (the folder that will store the content crawler)
At least Select access to the content source on which this content crawler will be based
At least Select access to the folders in which you want to store the imported content
At least Select access to the users and groups to which you want to grant access to the imported content

To edit a Web content crawler you must have the following rights and privileges:

Access Administration activity right
At least Edit access to the Web content crawler
If you plan to change the folders into which you will store the imported content, at least Select access to the folders
If you plan to change the users and groups to which you will grant access to the imported content, at least Select access to the users and groups

To create or edit a Web content crawler:

Click Administration.
Open the Web Content Crawler Editor.
- To create a Web content crawler, open the folder in which you want to store the content crawler. In the Create Object list, click Content Crawler — WWW. In the Choose Content Source dialog box, select the content source that provides access to the content you want to crawl and click OK.
- To edit a Web content crawler, open the folder in which the content crawler is stored and click the Web content crawler name.
On the Main Settings page, perform the following tasks as necessary:
- Specifying Where and How Far to Crawl
- Setting Destination Folders for Imported Content
- Setting a Content Crawler to Obey Folder Filters
- Automatically Approving Imported Documents
- Manually Granting Access to Imported Documents
On the Web Page Exclusions page, perform the following tasks as necessary:
- Avoiding Importing Unwanted Content
On the Target Settings page, perform the following tasks as necessary:
- Specifying a Time-Out Setting for a Web Content Crawler
On the Document Settings page, perform the following tasks as necessary:
- Specifying Expiration and Refresh Settings for Imported Documents
On the Content Type page, perform the following tasks as necessary:
- Customizing the Content Type Mappings for a Content Crawler
On the Advanced Settings page, perform tasks as necessary:
- In the list under Content Language, choose the language in which the majority of content to import is written.
- Specifying What to Do with Rejected Documents
- Specifying What to Do on Subsequent Crawls
- Marking Imported Documents with a Crawler Tag
- Configuring the Number of Threads Used to Crawl Content
On the Set Job page, perform the following tasks as necessary:
- Associating an Object with a Job
On the Properties and Names page, perform tasks as necessary:
- Naming and Describing an Object
  
  You can instead enter a name and description when you save this content crawler.
- Managing Object Properties
On the Security page, perform tasks as necessary:
- Setting Security on an Object
The default security for this content crawler is based on the security of the parent folder.
If you are editing a content crawler, on the Migration History and Status page, perform tasks as necessary:
- Viewing Migration History and Status for an Object
Note:

The Migration History and Status page is not available when creating an object.

To import content, run the job you associated with this content crawler.

Deleting a Content Crawler

To delete a content crawler you must have the following rights and privileges:

Access Administration activity right
Admin access to the content crawler

To delete a content crawler:

Click Administration.
Navigate to the content crawler.
Select the content crawler you want to delete and click the delete icon.

Specifying Where and How Far to Crawl

To specify where and how far to crawl:

If the Content Crawler Editor is not already open, open it now.
On the Main Settings page, in the URL to crawl box, type the URL to the site from which you want to import content.
In the Crawl radius list, specify the maximum number of links away from the target page to crawl. For example, if you select 1, this content crawler attempts to import every page directly linked to the target page; if you select 2, this content crawler attempts to import every page directly linked to the target page, and every page directly linked to those linked pages.
By default, this content crawler creates a link to the URL you entered in the URL to crawl box. If you do not want to create a link to this page, clear the Import the target page check box. For example, if you crawl the results of a search, you would not want to import the target page (the search results page); you would want to import each linked page (each result).
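
The crawl radius behaves like a depth limit on a breadth-first traversal of links. The sketch below shows that idea on an in-memory link graph; the function name, the dictionary of links, and the page names are hypothetical, and a real crawl would fetch pages instead of reading a dictionary.

```python
from collections import deque


def pages_to_import(links, target, radius, import_target=True):
    """Return the pages within `radius` links of `target`, in discovery order."""
    depth = {target: 0}          # page -> number of links away from the target
    queue = deque([target])
    while queue:
        page = queue.popleft()
        if depth[page] == radius:
            continue             # do not follow links beyond the crawl radius
        for linked in links.get(page, []):
            if linked not in depth:
                depth[linked] = depth[page] + 1
                queue.append(linked)
    # Clearing "Import the target page" corresponds to import_target=False.
    return [p for p in depth if import_target or p != target]


# Hypothetical site: a search results page that links to two results.
links = {"search": ["a", "b"], "a": ["c"], "b": ["a"], "c": ["d"]}
print(pages_to_import(links, "search", radius=1, import_target=False))
```

With a radius of 1 only the pages linked directly from the target are reached; raising the radius to 2 also reaches the pages linked from those pages.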

Setting Destination Folders for Imported Content

To set the destination folder for imported content:

If the Content Crawler Editor is not already open, open it now.
Under Destination Folders, specify into which folders you want to import content. The content crawler attempts to import a link to every document it finds into the most subordinate subfolder within the destination folder that allows the link to pass.

To view a flowchart showing how the content crawler determines into which folders it will import content, see Destination Folder Flowchart.
- To add destination folders, click Add Folder; then, in the Choose Folders dialog box, select the folders you want to add and click OK. To crawl documents into a folder, you must have at least Edit access to that folder.
- To remove a folder, select the folder and click the remove icon.
- To select or clear all of the folder check boxes, select or clear the box to the left of Folder Path.
- To toggle the order in which the folders are sorted (ascending/descending), click Folder Path or click the icon to the right of that.

Mirroring the Source Folder Structure

If the content Web service used by the content crawler supports folder mirroring (specified on the Advanced Settings page of the Content Web Service Editor), you can have the content crawler create Directory folders that duplicate the folder structure of the content repository being crawled.

Note:

If you mirror the folder structure and import security information with each document (described in Importing Security with Imported Documents), the folder security is imported for the mirrored folders.
If you mirror the folder structure, upon successive runs the content crawler removes any portal folders that do not have corresponding source folders. For this reason, if you run this content crawler periodically, neither you nor anyone else should modify the mirrored portal folders or documents in any way.
You cannot change the mirror setting after creation of this content crawler. That is, if you set this content crawler to mirror the folder structure, you cannot edit this setting later.

To mirror the source folder structure:

If the Content Crawler Editor is not already open, open it now.
On the Main Settings page, select Mirror the source folder structure.
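
The note about successive runs can be pictured as a comparison of two sets of folder paths. The following sketch is only an illustration of that comparison, with a hypothetical function name and folder paths; it shows why a folder added by hand under a mirrored structure would be removed on the next crawl.

```python
def plan_mirror_sync(source_folders: set[str], portal_folders: set[str]):
    """Return (folders to create, folders to remove) for one mirrored crawl."""
    to_create = sorted(source_folders - portal_folders)
    # Portal folders with no corresponding source folder are removed.
    to_remove = sorted(portal_folders - source_folders)
    return to_create, to_remove


source = {"/specs", "/specs/2010", "/plans"}
portal = {"/specs", "/specs/2010", "/my-manual-folder"}
print(plan_mirror_sync(source, portal))
```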

Setting a Content Crawler to Obey Folder Filters

By default, crawled documents do not have to pass the filters of destination folders, so all documents will be imported into all destination folders. If you want the content crawler to obey the filters applied to the destination folders when importing content, change the setting in the content crawler.

Note:

This feature is not available if you mirror the source folder structure.

To set a content crawler to obey folder filters:

If the Content Crawler Editor is not already open, open it now.
On the Main Settings page, select Apply Filter of Destination Folder.

Automatically Approving Imported Documents

By default, documents require approval, meaning that before the link to the imported document is available to users, it must be approved by a portal administrator with at least Edit access to the destination folder. You can instead automatically approve all imported documents.

To automatically approve imported documents:

If the Content Crawler Editor is not already open, open it now.
On the Main Settings page, select Automatically approve imported documents.

If you are mirroring the folder structure, you might want to set imported documents to be approved automatically and restrict users to Read access (users in the Administrators group always have Admin access). If you set imported documents to require approval, be aware that any portal administrator who has at least Edit access can also modify the folders and content, and can therefore make your portal folders and content out of sync with your source repository.

Importing Security with Imported Documents

If the content Web service used by this content crawler supports security importation and the source repository users and groups correspond to portal users and groups (specified in the Global ACL Sync Map), you can have this content crawler import the security settings for each document. This automatically makes documents that are available to source repository users available to the mapped portal users.

Note:

Only read access is imported. Read access is equivalent in the source repository and the portal, but write access is not: write access to a document in an external repository enables you to edit the document, whereas write access in the portal (referred to as Edit access) enables you to edit the properties and security settings of that document. Write access is therefore ignored.

To import security with imported documents:

If the Content Crawler Editor is not already open, open it now.
On the Main Settings page, select Import security with each document.
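
The following sketch illustrates the rule that only read access is imported. The dictionaries stand in for a source document's access list and for the Global ACL Sync Map; their shape, the principal names, and the function name are assumptions made for illustration, not the portal's data model.

```python
def import_acl(source_acl: dict[str, set[str]], sync_map: dict[str, str]) -> dict[str, str]:
    """Translate a source document's access list into portal access entries."""
    portal_acl = {}
    for principal, rights in source_acl.items():
        portal_principal = sync_map.get(principal)
        if portal_principal is None:
            continue  # no mapped portal user or group, so nothing is imported
        if "read" in rights:
            portal_acl[portal_principal] = "Read"
        # Write access is ignored: it is never translated into Edit access.
    return portal_acl


# Hypothetical source access list and sync map.
source_acl = {"DOMAIN\\writers": {"read", "write"}, "DOMAIN\\guests": {"read"},
              "DOMAIN\\dropbox": {"write"}, "DOMAIN\\unmapped": {"read"}}
sync_map = {"DOMAIN\\writers": "Tech Writers", "DOMAIN\\guests": "Guests",
            "DOMAIN\\dropbox": "Uploaders"}
print(import_acl(source_acl, sync_map))
```

In this sketch a group with both read and write access in the source repository receives only Read access in the portal, and a group with write access alone receives nothing.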

Manually Granting Access to Imported Documents

To manually grant users and groups access to the content imported by a content crawler:

If the Content Crawler Editor is not already open, open it now.
On the Main Settings page, under Document Access Privileges, perform the following actions:
- To add users or groups, click Add Users/Groups; then, in the Choose Groups and Users dialog box, select the users and groups you want to add and click OK.
  
  To add a user or group, you must have at least Select access to that user or group.
- For each user or group, in the associated Privilege list, choose the access privilege you want to grant for content imported by this content crawler.
- To remove a user or group, select the user or group and click the remove icon.
- To select or clear all of the user and group check boxes, select or clear the box to the left of Users/Groups.
- To toggle the order in which the users and groups are sorted (ascending/descending), click Users/Groups or click the icon to the right of that.
- To view the members of a group, click the group name.

Avoiding Importing Unwanted Content

To configure this content crawler to avoid importing unwanted Web pages into your portal:

If the Content Crawler Editor is not already open, open it now.
Click the Web Page Exclusions page.
By default, this content crawler follows the Web server's recommendations about which pages might be of value to automated crawlers. If you want to ignore these recommendations, clear the Obey the target site's robot exclusion protocols check box.

In general, these recommendations help limit unwanted content from being crawled into the portal. However, some sites offer very strict recommendations. If your content crawler is not importing any content from a site, try turning this option off.
By default, this content crawler saves the URLs to imported Web pages in the case used on the source Web site. To change the URLs to lower case, select Convert all URLs to lower case.
To avoid importing content from an area of a Web site or to avoid importing particular pages:
- To specify an area to avoid, click Add exclusion filter; then, in the text box, type the URL to the area of the Web site to avoid.
  
  You can use wildcard notation (*) to make the exclusion more general. For example, to avoid crawling sales information from a site, you might type http://mycompany.com*sales. As a result, this crawler would not import any pages from mycompany.com that have "sales" anywhere in the URL.
  Note:
  - Wildcards are assumed on either side of your text. For example, if you type sales, the crawler will not import any pages from any site accessible from the target URL that have "sales" anywhere in the URL.
  - If you list exclusions and inclusions (described below), the exclusions apply only to the included pages. For example, if you excluded sales and included http://mycompany.com, your crawler would import all pages from http://mycompany.com except for those pages that had "sales" anywhere in the URL.
- To remove an exclusion filter, select it and click the remove icon.
- To select or clear all exclusion filter check boxes, select or clear the box to the left of Exclusion Filters.
By default, this content crawler does not crawl or import any pages specified in the exclusions. If your content crawler will navigate from a link on an excluded page to a page that is not excluded and that should be imported, choose Crawl excluded pages, but do not import them.
To limit your crawl to an area of a Web site or a particular page:
- To specify where this content crawler may crawl, click Add inclusion filter; then, in the text box, type the URL to the area of the Web site to which you want to restrict your crawl. Because Web sites can contain links to other sites, you might want to use inclusions to keep your content crawler on a particular site. To avoid crawling other sites, add the base URL of the site you want to crawl to the inclusion list; for example, http://mycompany.com.
  
  You can use wildcard notation (*) to make the inclusion more general. For example, if you want to crawl only information on single sign-on (SSO), you might type http://mycompany.com*sso. As a result, this content crawler would import only pages from mycompany.com that have "sso" anywhere in the URL.
  Note:
  - Wildcards are assumed on either side of your text. For example, if you type sso, the content crawler will import any pages from any site accessible from the target URL that have "sso" anywhere in the URL.
  - If you list inclusions and exclusions, the exclusions apply only to the included pages. For example, if you included http://mycompany.com and excluded sso, your content crawler would import all pages from http://mycompany.com except for those pages that had "sso" anywhere in the URL.
- To remove an inclusion filter, select it and click the remove icon.
- To select or clear all inclusion filter check boxes, select or clear the box to the left of Inclusion Filters.
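
The way inclusion and exclusion filters combine can be sketched with Python's standard fnmatch module. This is a minimal illustration under stated assumptions: the function names are hypothetical, matching is case-sensitive, and fnmatch also gives special meaning to ? and [ ], which the filters described here might not.

```python
from fnmatch import fnmatchcase


def _matches(url: str, pattern: str) -> bool:
    # Wildcards are assumed on either side of the text that was typed.
    return fnmatchcase(url, f"*{pattern}*")


def should_import(url: str, inclusions: list[str], exclusions: list[str]) -> bool:
    """Decide whether a page passes the inclusion and exclusion filters."""
    # When inclusions are listed, a page must match at least one of them.
    if inclusions and not any(_matches(url, p) for p in inclusions):
        return False
    # Exclusions then apply only to the pages that are still included.
    return not any(_matches(url, p) for p in exclusions)


print(should_import("http://mycompany.com/emea/sales/q1.html",
                    ["http://mycompany.com"], ["sales"]))
```

The sketch covers only the import decision. It does not model the Crawl excluded pages, but do not import them option, which affects which links are followed.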

Specifying a Time-Out Setting for a Web Content Crawler

To specify a time-out for a Web content crawler:

If the Content Crawler Editor is not already open, open it now.
Click the Target Settings page.
To specify the maximum amount of time that this content crawler waits for a Web page to load, next to Time-out period, type a number in the box and choose a period in the drop-down list. If the page does not load in this time, the content crawler moves on to the next page.

Specifying Expiration and Refresh Settings for Imported Documents

You can have the Document Refresh Agent periodically verify that the source documents for links in the portal still exist, update document properties, or expire document links.

You might want to set document links to expire if the document will become irrelevant at some point. For example, if you import forms for 2010 company benefits, you might want to have them expire at the end of the year, so that users do not use outdated forms.

When a link is refreshed, the Document Refresh Agent verifies whether the source document still exists. If the document exists, the Document Refresh Agent updates the associated property values from the source document. If the document does not exist, the Document Refresh Agent applies the settings you specify for dealing with broken links.

These settings are referenced by the Document Refresh Agent when the Document Refresh job runs.

If the Content Crawler Editor is not already open, open it now.
Click the Document Settings page.
Under Document Expiration, choose whether the link to the document should expire:
- To specify that document links should not be deleted due to expiration, choose Never expire.
- To specify that document links should be deleted after a specified period, choose Delete after, type a number in the box, and choose a period in the drop-down list.
  
  When the Document Refresh job runs, it will delete any document links that have reached the expiration date.
  
  Tip:
  
  If you want to delete all documents previously imported by this content crawler, you can set the documents to expire immediately (for example, setting them to delete after 1 minute) and apply these settings to existing documents as described in step 4. The next time the Document Refresh job runs, it deletes all documents previously imported by this content crawler.
Under Link and Property Refresh, specify the refresh settings that should be used by the Document Refresh Agent:
- If you do not want to refresh the document, select Never.
- If you want to refresh the document, select Every, and type a number in the box and choose an interval from the list.
- To prevent the Document Refresh Agent from refreshing document properties, select Only confirm the validity of the links to these documents.
Under Broken Links, specify what happens to links to documents if the Document Refresh Agent finds that the source documents do not exist:
- If you want to leave broken links in the portal, select Left alone.
- If you want to remove broken links from the portal immediately, select Deleted immediately.
- If you want to leave broken links for a specified amount of time, select Deleted after, and type a number in the box and choose an interval from the list.
  
  You might want to leave broken links in the portal for a short while in case the source document repository is temporarily inaccessible.
If you change the settings on this page after this content crawler has run and you want to apply these new settings to previously imported documents, select Apply these settings to existing documents created by this content crawler. These settings will be applied when you click Finish, but documents will not be deleted and properties will not be updated until you run the Document Refresh job.
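
As a rough illustration of how these settings interact when the Document Refresh job runs, consider the sketch below. It is not the Document Refresh Agent's actual logic: the class and function names are hypothetical, expiration is assumed to be measured from the import time, and the Never refresh option is left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class DocumentSettings:
    expire_after: Optional[timedelta]       # None models "Never expire"
    broken_link_grace: Optional[timedelta]  # None models "Left alone"; timedelta(0) models "Deleted immediately"
    confirm_links_only: bool = False        # models "Only confirm the validity of the links to these documents"


def refresh_action(
    imported_at: datetime,
    broken_since: Optional[datetime],
    source_exists: bool,
    now: datetime,
    settings: DocumentSettings,
) -> str:
    # Expired links are deleted regardless of the state of the source document.
    if settings.expire_after is not None and now - imported_at >= settings.expire_after:
        return "delete (expired)"
    if source_exists:
        return "confirm link" if settings.confirm_links_only else "update properties"
    # The source document is missing: apply the Broken Links setting.
    if settings.broken_link_grace is None:
        return "leave broken link"
    first_seen_broken = broken_since or now
    if now - first_seen_broken >= settings.broken_link_grace:
        return "delete (broken)"
    return "keep broken link for now"


now = datetime(2011, 1, 1)
settings = DocumentSettings(expire_after=timedelta(days=365), broken_link_grace=timedelta(days=7))
print(refresh_action(datetime(2010, 1, 1), None, True, now, settings))
print(refresh_action(datetime(2010, 12, 1), datetime(2010, 12, 30), False, now, settings))
```

The second call models a source repository that has been unreachable for only two days, so the broken link is kept until the seven-day grace period ends.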

Customizing the Content Type Mappings for a Content Crawler

By default, a content crawler uses the content type mappings specified in the Global Content Type Map. However, you can customize these mappings to fit the needs of the content you are crawling.

Content type mappings show a content crawler how to assign content types to imported content. When a content crawler finds a new document, it starts at the top of its content type mappings list and looks for an extension that matches the document. It uses the content type that is mapped to the first matching extension. If it cannot find a matching extension, it does not import the document.
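
The first-match rule can be sketched as follows. The mappings and content type names shown are hypothetical examples, not the contents of the Global Content Type Map, and case-insensitive extension matching is an assumption.

```python
from typing import List, Optional, Tuple

# Ordered list of (extension, content type); position in the list matters.
# Hypothetical mappings for illustration only.
MAPPINGS: List[Tuple[str, str]] = [
    (".html", "Web Page"),
    (".htm", "Web Page"),
    (".doc", "MS Word Document"),
    (".pdf", "PDF Document"),
]


def content_type_for(filename: str, mappings: List[Tuple[str, str]] = MAPPINGS) -> Optional[str]:
    name = filename.lower()  # assumption: extensions match case-insensitively
    for extension, content_type in mappings:
        if name.endswith(extension):
            return content_type  # the first matching extension wins
    return None  # no match: the document is not imported


print(content_type_for("benefits-2010.pdf"))
print(content_type_for("archive.zip"))
```

Because the first match wins, the order of the list matters: if a mapping for .gz is listed above a mapping for .tar.gz, a file named backup.tar.gz receives the content type mapped to .gz.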

To customize the content type mappings for a content crawler:

If the Content Crawler Editor is not already open, open it now.
Click the Content Type page.
Perform the following actions to change the mappings for this content crawler:
- To add a mapping to this list, under the appropriate type map grouping, click New Identifier; then, in the Identifier Editor, type the file extension, choose a content type, and click Finish. The new mapping displays at the bottom of the list.
- To create a new content type to map to a new or existing extension, click Create Content Type; then, in the Content Type Editor, choose a document accessor and click Finish.
- To remove a mapping, select the mapping and click the remove icon.
- To select or clear all of the mapping check boxes, select or clear the box to the left of Identifiers.
  - To move a mapping to the top of this list, click the move to top icon.
  - To move a mapping up one space in this list, click the move up icon.
  - To move a mapping down one space in this list, click the move down icon.
  - To move a mapping to the bottom of this list, click the move to bottom icon.
- To edit a mapping, click the edit icon and change the mapping in the Identifier Editor.

Specifying What to Do with Rejected Documents

To specify what to do with rejected documents:

If the Content Crawler Editor is not already open, open it now.
Click the Advanced Settings page.
Under Rejected Documents, specify what to do with documents that do not successfully sort into a folder:
- To import these documents anyway, choose Import into the Unclassified Documents folder.
  
  Note:
  
  The Unclassified Documents folder is available to users with the Access Unclassified Documents activity right. To access unclassified documents, in the Directory menu, click Edit Directory and open the Unclassified Documents folder. You can also click Administration, then, in the Select Utility menu, choose Access Unclassified Documents.
- To avoid importing these documents, choose Do not import.
If you are editing an existing content crawler, you see additional options under Rejected Documents that allow you to specify what to do when this content crawler finds a previously rejected document. The definition of "previously rejected" depends on how you defined new links, as described in Specifying What to Do on Subsequent Crawls.
- If you chose by this Content Crawler, previously rejected documents include all documents rejected by this content crawler.
- If you chose from this Content Source, previously rejected documents include all documents rejected from this content source.
Specify what to do with previously rejected documents:
- To have this content crawler try to import previously rejected documents, select Re-Import.
- To avoid importing these documents, choose Do not import.
If absolutely necessary, you can delete the history of previously rejected documents. Again, the definition of "previously rejected" depends on how you defined new links, as described in Specifying What to Do on Subsequent Crawls. If you chose from this Content Source, you are deleting the rejection history for all content crawlers that import documents from this content source. If you are still sure that you must delete the history of previously rejected documents, click Clear Rejection History.

Specifying What to Do on Subsequent Crawls

You can refresh metadata and import new content from content crawlers that have previously imported content.

If you are editing an existing content crawler, you see the section Importing Documents. Under Importing Documents, specify whether to import only new documents. By default, the content crawler attempts to import only new documents (those that have not been previously imported by this content crawler or other content crawlers that access this same content source). You can change the content crawler setting to import multiple copies of each document, which might be useful while testing your content crawlers. You can also specify whether the content metadata should be updated.

If the Content Crawler Editor is not already open, open it now.
Click the Advanced Settings page.
To import only new documents, select Import only new links.

New options display. If you want to import all content again the next time this content crawler runs, leave the option unselected and skip the rest of the steps.
Specify what new links means:
- To import only those documents that have not been previously imported by this content crawler, choose by this Content Crawler.
- To import only those documents that have not been imported from the associated content source (either by this content crawler, another content crawler, or manually by a user), choose from this Content Source.
Note:

The option you choose here also applies to the rejection history (discussed in Specifying What to Do with Rejected Documents) and deletion history (discussed below). For example, if you select from this Content Source, the rejection history includes content rejected by any content crawler that has crawled the content source.
To refresh the previously imported documents as specified on the Document Settings page, select refresh them.

Generally, refreshing documents is the job of the Document Refresh Agent; refreshing documents slows the content crawler down. However, if you changed the document settings for this content crawler or changed the property mappings in the associated content types, refreshing documents updates these settings for the previously imported documents.

Note:

If you are crawling an RSS feed, the refresh them option refreshes the properties (such as the title and description) with the values from the target documents, not the RSS feed. If you want to retain the properties from the RSS feed, do not select refresh them.
If you created additional folders or applied different filters to destination folders, select try to sort them into additional folders to sort the previously imported documents into new Knowledge Directory folders.

Another content crawler might have imported documents from the same content source but into different folders than the destination folders specified for this content crawler. Ensure that you really want to re-sort those documents into the destination folders specified for this content crawler.
To re-import documents that were previously deleted (manually, due to expiration, or due to missing source documents), select regenerate deleted links.

Note:

This might re-import documents that were at one time deemed inappropriate for your portal.

If absolutely necessary, you can delete the history of documents that have been deleted from the portal. Remember that the deletion history is defined by what you specified as new documents in step 4.

If you chose by this Content Crawler, the history includes all documents imported by this content crawler that have been deleted.
If you chose from this Content Source, the history includes all documents imported from this content source that have been deleted. Therefore, you are deleting the history for all content crawlers that import documents from this content source.

If you are still sure that you must delete the record of documents deleted from the portal, click Clear Deletion History.
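
The difference between the two definitions of new links can be illustrated with a short sketch. The history format, identifiers, and function name are hypothetical; the portal tracks this information internally.

```python
from typing import Iterable, Optional, Tuple

# Each history record is (document URL, importing crawler or None for a manual import, content source).
HistoryRecord = Tuple[str, Optional[str], str]


def is_new_link(
    url: str,
    crawler_id: str,
    source_id: str,
    history: Iterable[HistoryRecord],
    scope: str,
) -> bool:
    for seen_url, seen_crawler, seen_source in history:
        if seen_url != url:
            continue
        # "by this Content Crawler": only this crawler's own imports count.
        if scope == "by this Content Crawler" and seen_crawler == crawler_id:
            return False
        # "from this Content Source": any import from the source counts,
        # whether by this crawler, another crawler, or a user.
        if scope == "from this Content Source" and seen_source == source_id:
            return False
    return True


history = [
    ("doc1", "crawlerA", "source1"),
    ("doc2", "crawlerB", "source1"),
    ("doc3", None, "source1"),  # imported manually by a user
]
print(is_new_link("doc2", "crawlerA", "source1", history, "by this Content Crawler"))
print(is_new_link("doc2", "crawlerA", "source1", history, "from this Content Source"))
```

The same scope applies to the rejection and deletion histories, which is why clearing a history under the from this Content Source setting affects every content crawler that uses the content source.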

Marking Imported Documents with a Crawler Tag

For troubleshooting purposes, you might want to mark imported documents with a unique crawler tag so you know which content crawler imported a particular document.

To mark imported documents with a crawler tag:

If the Content Crawler Editor is not already open, open it now.
Click the Advanced Settings page.
Type a unique tag in the Mark imported documents with the following Content Crawler Tag box.

Configuring the Number of Threads Used to Crawl Content

Note:

The allowable ranges for these settings are set in the portal configuration file. The values set here are also limited by the maximum threads allowable in the automation service used for the job associated with the content crawler.

To configure the number of threads used to crawl content for a content crawler:

If the Content Crawler Editor is not already open, open it now.
Click the Advanced Settings page.
Under Runtime Configuration, set the following:
- In the Maximum document-fetching threads box, type the maximum number of concurrent threads used to fetch content from the content source.
- In the Maximum card-indexing threads box, type the maximum number of concurrent threads used in processing content once it has been crawled into the portal.

Testing a Content Crawler

Before you have a content crawler import content into the public folders of your portal, test it by running a job that crawls document records into a temporary folder.

Create a test folder and remove the Everyone group, and any other public groups, from the Security page on the folder to ensure that users cannot access the test content.

Ensure that the content crawler creates the correct links.

Examine the target folder and ensure the content crawler has generated records and links for desired content and has not created unwanted records and links.

If you iterate this testing step after modifying the content crawler configuration, ensure that you delete the contents of the test folder and clear the deletion history for the content crawler.
Ensure that the content crawler creates correct metadata.

Ensure that all documents are given the right content types, and that these content types correctly map properties to source document attributes.

Go to the Knowledge Directory, and look at the properties and content types of a few of the documents this content crawler imported to see if they are the properties and content types you expected.

To view the properties and content type for a document:
1. Click Directory and navigate to the folder that contains the document whose properties and content type you want to view.
2. Click Properties under the document to display the information about the document. The properties are displayed in a table along with their values. The content type is displayed at the bottom of the page.
If you iterate this testing step after modifying the content crawler configuration, ensure that you configure the content crawler to refresh these links.
Test properties, filters, and search.

To test that document properties have been configured to enable filters and search, browse to the test folder, and perform a search using the same expression used by the filter you are testing. Either cut and paste the text from the filter into the portal search box or use the Advanced Search tool to enter expressions involving properties. Select Search Only in this Folder. The links that are returned by your search are for the documents that will pass your filter.

Troubleshooting the Results of a Crawl

There are several things you can troubleshoot if your content crawler does not import the expected content.

Ensure that your folder filters are correctly filtering content.

To learn about testing your filters, see Testing Filters.
Ensure that your content crawler did not place unwanted content into the target folder.

If a document does not filter into any subfolders, your content crawler might place the document in the target folder. This is determined by a setting on the Main Settings page of the Folder Editor.
Ensure that the content crawler did not place content into the Unclassified Documents folder.

If a document cannot be placed in any target folders or subfolders, your content crawler might place the document in the Unclassified Documents folder. This is determined by a setting on the Advanced Settings page of the Content Crawler Editor.

If you have the correct permissions, you can view the Unclassified Documents folder when you are editing the Knowledge Directory or by clicking Administration and then, in the Select Utility list, selecting Access Unclassified Documents.
Ensure that you have at least Edit access to the target folder.
For Web content crawlers, ensure that the robot exclusion protocols or any exclusions or inclusions are not keeping your content crawler from importing the expected content.

This is determined by a setting on the Web Page Exclusions page of the Content Crawler Editor.
Ensure that the authentication information specified in the associated content source allows the portal to access content.
Review the job history for additional information.

Destination Folder Flowchart

This flowchart shows how a content crawler determines into which folders to import content. The process starts in the upper-left corner. The content crawler goes through this process for the destination folder you select and then continues down the levels of subfolders, if necessary. The content crawler repeats this process for each destination folder you add to the Main Settings page of the Content Crawler Editor.

If the content crawler is set to ignore the filters of destination folders, the first step in this flowchart is treated as if the document passes the filters for the folder. Be aware that only the filters of the destination folders will be ignored; if the destination folder has any subfolders with filters, these subfolder filters will not be ignored.

Note:

If the document does not pass the filters of the destination folder, the content crawler checks to see if the destination folder has a default folder. It is only for subfolders of the destination folder that the content crawler checks to see if the parent folder has a default folder.

Illustration: destination folder flowchart (destinationfolder.jpg)

Mapping External Document Security to Imported Portal Users with the Global ACL Sync Map

Users imported through an authentication source can automatically be granted access to the content imported by some remote content crawlers through mappings in the Global ACL Sync Map. The Global ACL Sync Map maps authentication source prefixes to domain names and external groups to portal groups.

Every authentication source has a prefix. This prefix is used to distinguish the users and groups imported through the authentication source. If you plan to import security information with imported content, you might need to map your authentication source prefixes to the source domains or map portal groups to external groups through the Global ACL Sync Map.

When a content crawler finds an ACL entry for a source document referring to any of the mapped external groups, the content crawler replaces the external group name with a reference to the mapped portal group.

You can use these mappings to unify disparate group names existing across document repositories. For example, you might want to make Notes documents that are available to the Notes group Engineers, and Exchange messages that are available to the Exchange group Engineering, available to the portal group Developers. To do so, you would add the Developers group to this page and map "Engineers,Engineering" to it.

The Global ACL Sync Map is used by content crawlers bringing security settings, in the form of Access Control Lists (ACLs), into your portal along with documents. The Global ACL Sync Map shows content crawlers how the users and groups found on source document ACLs correspond with portal users and groups. Using this information, a content crawler can set portal security on imported content. For an example-based explanation of this process, see Example of Importing Content Security.
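
The following sketch illustrates the idea of translating source ACL entries through the two maps. It is an assumption-laden illustration rather than the portal's algorithm: the map contents, the DOMAIN\name entry format, the PREFIX\name result format, and the choice to drop unmapped entries are all hypothetical.

```python
from typing import Dict, List

# Hypothetical Portal: External Group Map, based on the Developers example.
GROUP_MAP: Dict[str, List[str]] = {"Developers": ["Engineers", "Engineering"]}

# Hypothetical Prefix: Domain Map (authentication source prefix -> source domains).
PREFIX_MAP: Dict[str, List[str]] = {"CORP": ["mycompany"]}


def invert(mapping: Dict[str, List[str]]) -> Dict[str, str]:
    # Turn {portal name: [external names]} into {external name: portal name}.
    return {ext: portal for portal, externals in mapping.items() for ext in externals}


def translate_acl(entries: List[str]) -> List[str]:
    groups = invert(GROUP_MAP)
    domains = invert(PREFIX_MAP)
    translated: List[str] = []
    for entry in entries:
        mapped = None
        if entry in groups:
            # A mapped external group is replaced with the portal group.
            mapped = groups[entry]
        elif "\\" in entry:
            # Assumed entry format: DOMAIN\name, rewritten to PREFIX\name.
            domain, name = entry.split("\\", 1)
            if domain in domains:
                mapped = domains[domain] + "\\" + name
        # Assumption: entries with no mapping are dropped in this sketch.
        if mapped is not None and mapped not in translated:
            translated.append(mapped)
    return translated


acl = translate_acl(["Engineers", "Engineering", "mycompany\\jsmith", "Contractors"])
print(acl)
```

Both Engineers and Engineering collapse to the single portal group Developers, which is how the map unifies group names from different repositories.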

To access the Global ACL Sync Map you must be a member of the Administrators group. To open the Global ACL Sync Map:

Click Administration.
From the Select Utility menu, choose Global ACL Sync Map.
On the Prefix: Domain Map page, map authentication source prefixes to source domains:
- To add a prefix to the map, click Add Mapping; then, in the Select Authentication Sources dialog box, select the authentication sources you want to map and click OK.
  Note:
  - If your authentication source prefix matches the domain name, the mapping occurs automatically and you do not need to add the mapping to this page.
  - If more than one authentication source uses the same prefix, you need to map only one of the authentication sources.
- To edit the prefix in this mapping (this will not affect the prefix in the authentication source), in the Authentication Source Prefix column, click the edit icon. In the text box that displays, edit the name, then click the arrow icon to save your change.
- To specify which domains map to a selected prefix, in the Domain Name column, click the edit icon and, in the text box that displays, type the domains you want to map, separated by commas (,). Click the arrow icon to save the mapping.
- To remove a mapping, select the mapping and click the remove icon.
- To select or clear all of the mapping check boxes, select or clear the box to the left of Authentication Source Prefix.
- To toggle the order in which the mappings are sorted (ascending/descending), click Authentication Source Prefix or click the icon to its right.
On the left, under Utility Settings, click Portal: External Group Map.
On the Portal: External Group Map, map portal groups to external groups:
- To add a portal group to the map, click Add Mapping; then, in the Select Groups dialog box, select the groups you want to map and click OK.
- To edit the portal group name in this mapping (this will not affect the actual portal group), in the Portal Group Name column, click the edit icon. In the text box that displays, edit the name, then click the arrow icon to save your change.
- To specify which external groups map to a selected portal group, in the External Group Name column, click the edit icon and, in the text box that displays, type the external groups you want to map, separated by commas (,). Click the arrow icon to save the mapping.
- To remove a mapping, select the mapping and click the remove icon.
- To select or clear all of the mapping check boxes, select or clear the box to the left of Portal Group Name.
- To toggle the order in which the mappings are sorted (ascending/descending), click Portal Group Name or click the icon to its right.

About Snapshot Queries

Snapshot queries enable you to display the results of a search in a portlet or e-mail the results to users. You can select which repositories to search (including Oracle WebCenter Collaboration), and limit your search by language, object type, folder, property, and text conditions.

Working with Snapshot Queries

This section describes the following tasks:

Creating or Editing a Snapshot Query
Defining Snapshot Query Conditions
Limiting a Snapshot Query
Formatting the Results of a Snapshot Query
Previewing the Results of a Snapshot Query
E-mailing the Results of a Snapshot Query
Creating a Snapshot Portlet to Display the Results of a Snapshot Query

Creating or Editing a Snapshot Query

To create a snapshot query you must have the following rights and privileges:

Access Administration activity right
Create Snapshot Queries activity right
At least Edit access to the parent folder (the folder that will store the snapshot query)
At least Select access to any properties by which you want to filter your results
At least Select access to any Knowledge Directory or administrative folders to which you want to restrict your results

To edit a snapshot query you must have the following rights and privileges:

Access Administration activity right
At least Edit access to the snapshot query
At least Select access to any properties by which you want to filter your results
At least Select access to any Knowledge Directory or administrative folders to which you want to restrict your results

To create or edit a snapshot query:

Click Administration.
Open the Snapshot Query Editor.
- To create a snapshot query, open the folder in which you want to store the snapshot query. In the Create Object list, click Snapshot Query.
- To edit a snapshot query, open the folder in which the snapshot query is stored and click the snapshot query name.
On the Construct Snapshot Query page, perform tasks as necessary:
- Defining Snapshot Query Conditions
- Limiting a Snapshot Query
On the Format Snapshot Query Result page, perform tasks as necessary:
- Formatting the Results of a Snapshot Query
On the Preview Snapshot Query Result page, perform tasks as necessary:
- Previewing the Results of a Snapshot Query
On the Snapshot Portlet List page, perform tasks as necessary:
- Creating a Snapshot Portlet to Display the Results of a Snapshot Query
On the Properties and Names page, perform tasks as necessary:
- Naming and Describing an Object
  
  You can instead enter a name and description when you save this snapshot query.
- Managing Object Properties
On the Security page, perform tasks as necessary:
- Setting Security on an Object
The default security for the snapshot query is based on the security of the parent folder.
If you are editing a snapshot query, on the Migration History and Status page, perform tasks as necessary:
- Viewing Migration History and Status for an Object
Note:

The Migration History and Status page is not available when creating an object.

If you did not create a snapshot portlet on the Snapshot Portlet List page, a snapshot portlet is automatically created when you save this snapshot query.

Deleting a Snapshot Query

To delete a snapshot query you must have the following rights and privileges:

Access Administration activity right
Admin access to the snapshot query

To delete a snapshot query:

Click Administration.
Navigate to the snapshot query.
Select the snapshot query you want to delete and click the delete icon.

Note:

Deleting a snapshot query will break any associated snapshot portlets.

Defining Snapshot Query Conditions

A snapshot query is a combination of a basic fields search and statements. The basic fields search operates on the name, description, and content of documents and objects. Statements can operate on the basic fields or any other additional document or object properties. Statements define what must or must not be true to return the document or object in the results. The statements are collected together in groupings. The grouping defines whether the statements are evaluated with an AND operator (all statements are true) or an OR operator (any statement is true). If some statements should be evaluated with an AND operator and some should be evaluated with an OR operator, you can create separate groupings for the statements. You can also create subgroupings or nested groupings, where one grouping is contained within another grouping. The statements in the lowest-level grouping are evaluated first to define a set of results. Then the statements in the next highest grouping are applied to that set of results to further filter the results. The filtering continues up the levels of groupings until all the groupings of statements are evaluated.

A snapshot query needs at least a basic fields search or a statement.
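
One way to picture how groupings and statements combine is the recursive sketch below. It reads the description above as ordinary nested AND/OR evaluation, which is an interpretation rather than the search server's documented algorithm; the class names, property names, and sample documents are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Statement:
    prop: str                       # property the statement operates on
    test: Callable[[object], bool]  # condition the property value must meet


@dataclass
class Grouping:
    operator: str                                # "AND" or "OR"
    members: List = field(default_factory=list)  # statements and subgroupings


def item_matches(item: dict, node) -> bool:
    if isinstance(node, Statement):
        return node.prop in item and node.test(item[node.prop])
    # Subgroupings are evaluated first, then combined with this grouping's operator.
    results = [item_matches(item, member) for member in node.members]
    return all(results) if node.operator == "AND" else any(results)


# Grouping 1 (AND): content type is PDF AND (author contains "smith" OR title contains "plan")
query = Grouping("AND", [
    Statement("Content Type", lambda v: v == "PDF"),
    Grouping("OR", [
        Statement("Author", lambda v: "smith" in v.lower()),
        Statement("Title", lambda v: "plan" in v.lower()),
    ]),
])

docs = [
    {"Content Type": "PDF", "Author": "J. Smith", "Title": "Budget"},
    {"Content Type": "PDF", "Author": "Lee", "Title": "Notes"},
    {"Content Type": "Word", "Author": "Lee", "Title": "Test plan"},
]
hits = [d["Title"] for d in docs if item_matches(d, query)]
print(hits)
```

Only the first document is returned: the second fails the OR subgrouping, and the third fails the content type statement in the top-level AND grouping.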

If the Snapshot Query Editor is not already open, open it now.
Click the Construct Snapshot Query page.
To search the name, description, and content values, type the text you want to search for in the Basic fields search text box.

You can use the text search rules described in Using Text Search Rules.
Select the operator for the grouping of statements you are about to create:
- If a document or object should be returned only when all statements in the grouping are true, select AND.
- If a document or object should be returned when any statement in the grouping is true, select OR.
Note:

The operator you select for a grouping applies to all statements and subgroupings directly under it.
Define each statement in the grouping:
1. Click Add Statement.
2. In the first list, select the searchable property for which you want to filter the values.
3. In the second list, select the operator to apply to this condition.
  
  This list will vary depending on the property selected:
  - For any text property, you can search for a value that contains your search string, or you can search for properties that have never had a value (Contains No Value).
    
    Note:
    
    If the property contained a value at some point, but the value has been deleted, the property will not match the Contains No Value condition.
  - For any date property, you can search for a value that comes after, comes before, is, or is not the date and time you enter in the boxes. You can also search for a value within the last number of minutes, hours, days, or weeks that you enter in the box.
  - For any number property, you can search for a value that is greater than, is less than, is, is not, is greater than or equal to, or is less than or equal to the number you enter in the text box.
4. In the box (or boxes), specify the value the property must meet.
  
  Note:
  
  If you are searching for a text property, you can use the text search rules described in Using Text Search Rules.
To remove the last statement in a grouping, select the grouping, and click Remove Statement.
If necessary, add more statements by repeating Step 4.
If necessary, add more groupings:
- To add another grouping, select the grouping to which you want to add a subgrouping, click Add Grouping, then define the statements for that grouping (as described in Step 4).
  
  Note:
  
  You cannot add a grouping at the same level as Grouping 1.
- To remove a grouping, select the grouping, and click Remove Grouping.
  Note:
  - Any groupings and statements in that grouping will also be removed.
  - You cannot remove Grouping 1.

You might also want to limit your search by language, object types, or folders.

Limiting a Snapshot Query

You can limit your snapshot query to specific languages, portal repositories, objects, folders, projects, or portlets.

If the Snapshot Query Editor is not already open, open it now.
Click the Construct Snapshot Query page.
To limit your search to a specific language, under Limit Search by Document Language, select a language from the Specify Language list.
Under Specify Range of Search, select which of the repositories to search: Knowledge Directory, Portal (administrative objects), Oracle WebCenter Collaboration.

You will see a check box for Collaboration only if you have this product installed.
Next to Repository General Settings, choose whether to search folders or documents from any source within the repositories you selected, and whether to search all subfolders within those repositories.
If you selected the Knowledge Directory as one of the repositories to search, specify settings under Knowledge Directory Search Settings.
- Next to Search Results Contain, select Knowledge Directory Folders, and/or Knowledge Directory Documents.
  
  Note:
  
  You must select at least one of these options.
- To restrict the search to selected folders, click Add Document Folder; then, in the Add Document Folder dialog box, select the folders to which you want to restrict your search and click OK.
- To remove a folder, select it and click the Remove icon.
If you selected the portal as one of the repositories to search, specify settings under Portal Search Settings.
- Next to Search Results Contain, select the portal administrative object types to include in the search, such as portlets or communities.
  
  Note:
  
  To select or clear all the object types, select or clear the box next to All Types.
- To restrict the search to selected folders, click Add Administrative Folder; then, in the Add Administrative Folder dialog box, select the folders to which you want to restrict your search and click OK.
- To remove a folder, select it and click the Remove icon.
If you selected Collaboration as one of the repositories to search, under Restrict to Selected Collaboration Projects, click Add Project to select one or more Oracle WebCenter Collaboration projects to search, then click Finish.

Next, specify the format for your results.

Formatting the Results of a Snapshot Query

You can define how your snapshot query results appear. By default, results are listed in order of relevance; that is, those results that most closely match your query are listed first. You can change the order in which results are displayed, limit the number of items returned, specify a style in which the snapshot portlet will be displayed, and e-mail results to users.

If the Snapshot Query Editor is not already open, open it now.
Click the Format Snapshot Query Result page.
In the Maximum items displayed box, type the number of items that should appear on a page.
In the Order results by list, select the property type by which you want to sort results.

Note:

You can sort only by numeric fields. For example, you can sort your search results by Content Type ID or Object Last Modified.
To select the available fields for display on search results, under Query Return Fields, click Add Query Fields, select the fields you want to add, and click OK.

The fields you add here can be selected in the administrative preferences of snapshot query portlets associated with this snapshot query. Selecting all or a subset of these fields in the administrative preferences of a particular snapshot query portlet determines what end users see in results appearing in that portlet.
If you want the content snapshot portlet to appear with a subscribe button that enables users to receive e-mail about search results, select Enable e-mail subscriptions.
Note:
- You must configure an external operation to send e-mail notifications for this snapshot query. See E-mailing the Results of a Snapshot Query.
- Users receive the e-mail only if their e-mail addresses are available in their user profiles.
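
The ordering and limiting settings described above can be illustrated with a minimal Python sketch. This is not the portal's implementation. The `Result` type, its field names, and the `format_results` helper are hypothetical, and the ascending sort direction and the numeric timestamp are assumptions. The sketch only shows that results default to relevance order, that an alternative sort key must be numeric, and that the maximum-items setting truncates the list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Result:
    # Hypothetical result record; the field names are illustrative only.
    name: str
    relevance: float       # higher means a closer match to the query
    content_type_id: int   # numeric field, usable as a sort key
    last_modified: int     # timestamp stored as a number (assumption)


def format_results(results: List[Result],
                   max_items: int,
                   order_by: Optional[str] = None) -> List[Result]:
    """Order results, then keep at most max_items of them.

    With no order_by, results are ranked by relevance, best match first.
    Otherwise they are sorted ascending (an assumption) by the named field,
    which must hold numeric values.
    """
    if order_by is None:
        ordered = sorted(results, key=lambda r: r.relevance, reverse=True)
    else:
        for r in results:
            if not isinstance(getattr(r, order_by), (int, float)):
                raise ValueError(f"{order_by} is not a numeric field")
        ordered = sorted(results, key=lambda r: getattr(r, order_by))
    return ordered[:max_items]
```

In this sketch, calling `format_results(results, 10)` mirrors the default behavior, while `format_results(results, 10, "content_type_id")` mirrors choosing a numeric property in the Order results by list.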

Next, preview your results.

Previewing the Results of a Snapshot Query

You can preview the results of a snapshot query before you save it.

If the Snapshot Query Editor is not already open, open it now.
Click the Preview Snapshot Query Result page.

The fields displayed in these results are the ones you added on the Format Snapshot Query Result page, under Query Return Fields. However, for each snapshot portlet associated with this query, you can select all or a subset of the available query return fields in the portlet's administrative preferences.
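
The relationship between the query return fields and a portlet's administrative preferences can be sketched as follows. The `visible_fields` helper and the sample field names are hypothetical. The sketch assumes only what is stated above: each portlet displays all or a subset of the fields that the snapshot query makes available.

```python
from typing import List


def visible_fields(query_return_fields: List[str],
                   portlet_selected_fields: List[str]) -> List[str]:
    """Return the fields a portlet displays, in query-field order.

    A portlet can show all or a subset of the query's return fields;
    a selection that the query does not offer is ignored in this sketch.
    """
    selected = set(portlet_selected_fields)
    return [f for f in query_return_fields if f in selected]


# Hypothetical field names, for illustration only.
available = ["Name", "Description", "Object Last Modified"]
shown = visible_fields(available, ["Object Last Modified", "Name", "Author"])
# "Author" is dropped because the query does not return it.
```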

Next, create a snapshot portlet on the Snapshot Portlet List page or save the snapshot query to automatically create a snapshot portlet.

E-mailing the Results of a Snapshot Query

You can e-mail the results of a snapshot query to users by creating an external operation and editing SavedSearchMailer.bat for Windows or SavedSearchMailer.sh for UNIX or Linux.

Before you create an external operation to e-mail the results of a snapshot query, you must create the snapshot query and snapshot portlet.

To create an external operation, you must have the following rights and privileges:

Access Administration activity right
At least Edit access to the Snapshot Query Mailer external operation
At least Edit access to the parent folder (the folder that will store the external operation)
At least Select access to the job that will run this external operation or Create Jobs activity right to create a job to run this external operation

To e-mail the results of a snapshot query:

If you have not already done so, edit SavedSearchMailer.bat for Windows or SavedSearchMailer.sh for UNIX or Linux to specify the settings for your mail server and customize the e-mail values.

The saved search mailer file is located on the computer that hosts the Automation Service, in install_dir\scripts (for example, C:\Oracle\Middleware\wci\ptportal\10.3.3\scripts for Windows or /oracle/middleware/wci/ptportal/10.3.3/scripts for UNIX or Linux).

You must replace the following argument values:

Argument	Description
SENDER	The name you want to display as the From value in the automated e-mails
MAIL_SERVER	Your SMTP mail server
REPLYTO	The e-mail address that users can reply to from the automated e-mails

Optionally, you can replace the following argument values:

Argument	Description
USER	The name of the user account that sends the automated e-mails
PWD	The password for the user account that sends the automated e-mails
MIMETYPE	The MIME type you want to use for the automated e-mails
SUBJECT	The text you want to display in the automated e-mail subject line. By default, the subject includes the name of the snapshot query (represented by `<search_name>`) and the name of the user receiving the results (represented by `<name>`).
BODY_HEADER	The text you want to display at the top of the automated e-mail body
BODY_SEPARATOR	Any code you want to use to generate a separation between the header and the results
BODY_FOOTER	The text you want to display at the bottom of the automated e-mail body

In the portal, click Administration.
Open the snapshot portlet for which you want to e-mail results.
Click the Properties and Names page.
Copy or make note of the Object ID, then close the snapshot portlet.
Open the Intrinsic Operations folder.
Select the Snapshot Query Mailer external operation and click the External Operation icon.
In the Target Folder dialog box, select the folder in which you want to store your new external operation and click OK.
Open the copy of the snapshot query mailer you just created.
Replace 200 in the arguments with the object ID of the snapshot query you want to e-mail.
Click the Set Job page and complete the following task:
- Associating an Object with a Job
Click the Properties and Names page and rename the external operation.

Set the job to run on a regular basis.
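
The mailer settings described in this procedure can be illustrated with a short Python sketch. It is not the contents of SavedSearchMailer.bat or SavedSearchMailer.sh, and it does not send mail. The helper names, the sample subject template, and the assumption that placeholders are filled by plain text substitution are all hypothetical. Only the argument names, the `<search_name>` and `<name>` placeholders, and the rule that users without an e-mail address in their profile receive nothing come from this section.

```python
from typing import Dict, List, Optional

# Arguments that this section says must be replaced.
REQUIRED_ARGS = ("SENDER", "MAIL_SERVER", "REPLYTO")


def missing_required(args: Dict[str, str]) -> List[str]:
    """List the required mailer arguments that are absent or empty."""
    return [key for key in REQUIRED_ARGS if not args.get(key)]


def render_subject(template: str, search_name: str, user_name: str) -> str:
    """Fill the <search_name> and <name> placeholders in a subject line.

    Plain string substitution is an assumption made for this sketch.
    """
    return (template
            .replace("<search_name>", search_name)
            .replace("<name>", user_name))


def recipients(profiles: Dict[str, Optional[str]]) -> List[str]:
    """Keep only the addresses of users whose profile has an e-mail address."""
    return [email for email in profiles.values() if email]


# Hypothetical values, for illustration only.
subject = render_subject("Results for <search_name> sent to <name>",
                         "Open Bugs", "Pat")
```

A check such as `missing_required` is one way to confirm, before the job first runs, that none of the three mandatory values was left at its placeholder.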

Creating a Snapshot Portlet to Display the Results of a Snapshot Query

You can create a snapshot portlet to display the results of a snapshot query on a portal page.

To create a content snapshot portlet associated with this snapshot query, click Create Content Snapshot Portlet.

The portlet appears under Portlet List, and is added to the same administrative folder as this snapshot query.
Note:
- If you create a content snapshot portlet manually (rather than having one automatically created when saving the snapshot query), the name of the portlet will be New Snapshot Query.
- If you do not manually create a content snapshot portlet on this page, one will be created automatically when you save the snapshot query; the portlet will have the same name as the snapshot query.
- To delete a snapshot portlet, you must delete it from the administrative folder that contains its associated snapshot query.
To change the name of a snapshot portlet:
1. Click the portlet name.
  
  The Portlet Editor opens.
2. Click the Properties and Names page.
3. Edit the name.
To select the fields displayed in the results and the fields that users can search on for a snapshot portlet, edit the administrative preferences:
1. Click the portlet name.
  
  The Portlet Editor opens.
2. Click the Edit button next to Configure this Portlet.
3. Edit the preferences.

Users with at least Select access to this portlet can now add this portlet to their My Pages. Community administrators with at least Select access to this portlet can now add this portlet to their communities.