10 Configuring Elasticsearch in WebCenter Portal
Configure Elasticsearch to index and search objects in WebCenter Portal.
Note:
Beginning with Release 12c (12.2.1.4.0), Oracle WebCenter Portal has deprecated support for Oracle SES. If you have upgraded from a prior release, your upgraded instance might be configured to use Oracle SES. In this case, you must configure WebCenter Portal to use Elasticsearch to index and search objects.
Permissions:
To perform the tasks in this chapter, you must be granted the WebLogic Server Admin role through the Oracle WebLogic Server Administration Console and the Administrator role through WebCenter Portal Administration.
For more information about roles and permissions, see Understanding Administrative Operations, Roles, and Tools.
Understanding Search with Elasticsearch
Elasticsearch is a highly scalable search engine. It allows you to store, search, and analyze large volumes of data quickly, and provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.
Advantages of Elasticsearch
- Elasticsearch provides full-text search capabilities because it is built on Lucene.
- Elasticsearch is document-oriented. It stores data as structured JSON documents and indexes all fields by default, which improves search performance.
- Elasticsearch is API driven; actions can be performed using a simple RESTful API.
- Elasticsearch retrieves search results fast because it searches an index instead of searching the text directly.
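The last point can be illustrated with a toy inverted index. This is a deliberately simplified sketch, not how Lucene or Elasticsearch is implemented (real engines add text analysis, scoring, and on-disk structures); the function names and sample documents are hypothetical.

```python
from collections import defaultdict


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercased word to the ids of the documents that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index: dict[str, set[str]], term: str) -> set[str]:
    # One dictionary lookup replaces a scan over the text of every document.
    return index.get(term.lower(), set())
```

The cost of scanning the text is paid once, when the index is built; each later query is a lookup.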
You can configure Elasticsearch to search the following resources in WebCenter Portal:
- Documents, including wikis and blogs
- Portals, page metadata, page content (contents of HTML, Text, and Styled Text components), lists, and people resources
- Announcements and Discussions (available only for portals upgraded from prior releases)
Prerequisites for Configuring Elasticsearch
Ensure that the following requirements are met:
- Oracle WebCenter Portal is installed.
- Optional. If you choose to use WebCenter Content for search, ensure that WebCenter Content is configured and all required components are enabled. See Managing Connections to Oracle WebCenter Content Server.
Configuration Roadmap for Elasticsearch in WebCenter Portal
Table 10-1 Roadmap - Setting Up Elasticsearch in WebCenter Portal
Actor | Task |
---|---|
Administrator | |
Administrator | |
Administrator | |
Administrator | |
Administrator | Customizing Search Settings in WebCenter Portal Administration |
Administrator | (Optional) Modifying Elasticsearch Global Attributes |
Administrator | (Optional) Configuring Search Custom Attributes for Elasticsearch |
Administrator | (Optional) Creating Custom Facets in Elasticsearch |
Creating a Crawl Admin User in WebCenter Portal
You can designate an existing user as the crawl admin or create a crawl admin user (for example, mycrawladmin) in WebCenter Portal and in your back-end identity management server to search using Elasticsearch. You need to create the crawl admin user only once.
Note:
See your identity management system documentation for information on creating users.
The following example uses Oracle Directory Services Manager to create the mycrawladmin user:
Modifying the Default Connection Settings for Document Content Crawl Plugin in Elasticsearch Server
After installing Elasticsearch, you can modify the default connection settings for the document content crawl plugin using the configuration file.
You can specify the following attributes in the configuration file:
- es.wcc.connection.timeout is the connection timeout interval, in seconds. This is the amount of time the Elasticsearch server waits to establish the connection to the WebCenter Content server. The default value is 30 seconds.
- es.wcc.read.timeout is the read timeout interval, in seconds. Once the Elasticsearch server is connected to the WebCenter Content server, this attribute specifies the amount of time allowed for the WebCenter Content server to respond to a given request. The default value is 30 seconds.
- es.wcc.max.connection.attempts is the maximum number of connection attempts to access the WebCenter Content server. The default value is 3.
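The following Python sketch shows one way the three settings could interact: each attempt is bounded by the connection and read timeouts, and the attempt limit bounds how often a failed connection is retried. It is an illustration only and is not the plugin's code. The class, the function, and the `open_connection` callable are hypothetical, and the assumption that every timeout or connection error counts as one attempt is ours.

```python
from dataclasses import dataclass
from typing import Callable, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class CrawlConnectionSettings:
    # Defaults mirror the documented defaults; the field names are illustrative.
    connection_timeout_s: int = 30     # es.wcc.connection.timeout
    read_timeout_s: int = 30           # es.wcc.read.timeout
    max_connection_attempts: int = 3   # es.wcc.max.connection.attempts


def connect_with_retries(open_connection: Callable[[int, int], T],
                         settings: CrawlConnectionSettings) -> T:
    """Call open_connection(connect_timeout, read_timeout) until it succeeds
    or the configured number of attempts is used up."""
    last_error = None
    for _attempt in range(settings.max_connection_attempts):
        try:
            return open_connection(settings.connection_timeout_s,
                                   settings.read_timeout_s)
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc  # remember the failure and try again
    raise ConnectionError(
        f"gave up after {settings.max_connection_attempts} attempts"
    ) from last_error
```

Under these assumptions, a server that never responds would be given up on after roughly the attempt limit multiplied by the connection timeout.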
Configuring WebCenter Content for Search
This topic describes how to configure WebCenter Content for search.
Note:
The following topics are applicable only if WebCenter Content is configured.
Creating a Crawl User in WebCenter Content
This procedure describes how to create a new crawl user in WebCenter Content.
If you want users with the admin role to crawl, then use an admin user account as the crawl user. If you want non-admin users to crawl, then create a new crawl user.
- Log on to WebCenter Content as an Administrator.
- To create a role sescrawlerrole, do the following:
- To create a user sescrawler, and assign the sescrawlerrole role to the user, do the following:
- On the WebCenter Content home page, expand Administration, then Admin Server. Select General Configuration and append the sceCrawlerRole=sescrawlerrole entry in the Additional Configuration Variables section.
- Restart WebCenter Content.
Configuring the SESCrawlerExport Component
Before you begin, verify that the SESCrawlerExport component is enabled. If not, enable the component (see Enabling the WebCenterConfigure Component) and restart the WebCenter Content server.
To configure the SESCrawlerExport component for admin and non-admin users:
Configuring WebCenter Portal for Search
To configure WebCenter Portal for search, you need to configure the connection between WebCenter Portal and Elasticsearch and grant the crawl application role to the crawl admin user. Finally, you have to configure the WebCenter Content crawl user in Elasticsearch.
Note:
Only one search connection can exist. Before running the createSearchConnection WLST command, ensure that you delete any existing search connection.
Synchronizing Users in WebCenter Portal
Before performing a full portal crawl, we recommend that you run the LDAP synchronization WLST command to ensure that all users are available in WebCenter Portal.
Configuring Search Crawlers
You can configure the following types of crawlers to index WebCenter Portal resources:
- Portal Crawler: This uses the Portal crawl source to crawl certain objects, such as lists, page metadata, page content (contents of HTML, Text, and Styled Text components), portals, and profiles.
- Documents Crawler: This uses the Documents crawl source to crawl documents, including wikis and blogs.
- Discussions Crawler: This uses the Discussions crawl source to crawl discussion forums and announcements. This option is available only for portals upgraded from prior releases that include Discussions.
The following topics describe how to create different crawl sources using the Scheduler UI in WebCenter Portal Administration:
Creating a Portal Crawl Source
Creating a Documents Crawl Source
Taking a Snapshot of the Content
The snapshot generates a configFile.xml file at the location specified by the SESCrawlerExport component FeedLoc parameter. XML feeds are created in the subdirectory with the source name; for example, wikis. Performing a snapshot can take some time depending on the number of items you have stored on the Content Server instance and how many sources you are generating.
Note:
It is important to take a snapshot before the first crawl or any subsequent full crawl of the source.
Modifying Elasticsearch Global Attributes
WebCenter Portal uses Elasticsearch to index and search objects. The attributes wcESConnectionTimeoutPeriod and wcESReadTimeoutPeriod configure the interaction between WebCenter Portal and Elasticsearch. The wcESDocumentsCrawlerThreads attribute configures the number of threads used to crawl documents.
The following are the attributes:
- wcESConnectionTimeoutPeriod is the connection timeout interval, in seconds. This is the amount of time WebCenter Portal waits to establish the connection to the Elasticsearch server. The default value is 30 seconds.
- wcESReadTimeoutPeriod is the read timeout interval, in seconds. Once WebCenter Portal is connected to the Elasticsearch server, this specifies the amount of time allowed for the Elasticsearch server to respond to a given request. The default value is 30 seconds.
- wcESDocumentsCrawlerThreads is the number of threads in the thread pool that crawls documents for search. Crawl tasks are handled by a pool with a fixed number of threads, where each thread handles the crawl for the documents assigned to it. The default value is 10. If no thread is available for a crawl task, the task waits in a queue until another task completes.
You can modify the default values of the attributes on the Attributes page in WebCenter Portal Administration. After you modify a value, you must restart the WebCenter Portal server for the changes to take effect.
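The fixed-size pool behavior described for wcESDocumentsCrawlerThreads can be sketched in Python as follows. This is a generic illustration of a fixed thread pool, not WebCenter Portal code; `crawl_all`, `crawl_one`, and the constant name are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

CRAWLER_THREADS = 10  # stands in for wcESDocumentsCrawlerThreads (default 10)


def crawl_all(document_ids, crawl_one, threads=CRAWLER_THREADS):
    """Run crawl_one for every document id on a fixed-size thread pool.

    When all worker threads are busy, the remaining tasks wait in the
    executor's internal queue until a thread becomes free.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in the order of document_ids.
        return list(pool.map(crawl_one, document_ids))
```

In a pool like this, raising the thread count lets more crawl tasks run at once but also sends more concurrent requests to the servers behind it, so the number of queued tasks is only one factor in choosing a value.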
Enabling AutoSuggest in WebCenter Portal
In WebCenter Portal, you can enable autosuggest so that a list of suggested keywords appears as you type a keyword in the search field. The suggested keywords are based on the portal that you recently accessed.
The wcEnableAutoSuggest attribute configures autosuggest. By default, the attribute is set to false. You can modify the value of the attribute on the WebCenter Portal Administration - Attributes page. After you modify the value, you must restart the WebCenter Portal server for the changes to take effect.
Configuring Search Custom Attributes for Elasticsearch
When you search using WebCenter Portal, only certain predefined attributes show up in the search results. WebCenter Portal also lets you display additional attributes in your search results. You do this on the Search Setting page in portal administration, where the Custom Attributes section lets you select which custom search attributes appear in search results and the order in which they appear. The list on the Search Setting page is driven by the search-service-attributes.xml file, which contains the list of all attributes that are crawled for each service. The types in the Elasticsearch index are defined by this metadata. You can add a new custom attribute or modify an existing one in the search-service-attributes.xml file.
The following procedure describes how to add a new search custom attribute, using the Document service as an example.
Creating Custom Facets in Elasticsearch
Oracle WebCenter Portal supports faceted search to refine the search results without running a new search. With faceted search, search results are grouped under predefined categories, which helps users narrow down the results based on a relevant category, for example, Author, Portal, or Last Modified Date.
In Oracle WebCenter Portal, certain predefined facets are provided by default on the Search Setting page. The list of facets is driven by the search-service-custom-facets.xml metadata file, and each facet in the file is mapped to a search attribute (searchAttribute) defined in the search-service-attributes.xml metadata file.
The following is a sample of the search-service-custom-facets.xml metadata file:
<custom-facet dataType="keyword" displayName="Author" displayNameKey="ES_FACET_AUTHOR" mappedSearchAttribute="author" name="Author" itemsToDisplay="20"/>
<custom-facet dataType="keyword" displayName="Service ID" displayNameKey="ES_FACET_SERVICEID" mappedSearchAttribute="wc_serviceId" name="Service ID"/>
<custom-facet dataType="keyword" displayName="Portal" displayNameKey="ES_FACET_SCOPE" mappedSearchAttribute="wc_scopeGuid" name="Scope GUID" itemsToDisplay="20"/>
<custom-facet dataType="keyword" displayName="Tags" displayNameKey="ES_FACET_TAGS" mappedSearchAttribute="wc_tagWords" name="Tags" itemsToDisplay="20"/>
<custom-facet dataType="keyword" displayName="Mimetype" displayNameKey="ES_FACET_MIMETYPE" mappedSearchAttribute="mimetype" name="Mimetype" itemsToDisplay="20"/>
<custom-facet dataType="date" displayName="Last Modified Date" displayNameKey="ES_FACET_LASTMODIFIED" mappedSearchAttribute="lastmodified" name="Last Modified Date"/>
where,
- name is the name of the facet.
- displayNameKey is the value of the custom facet metadata field.
- displayName is the display name of the facet.
  Note:
  If your business is supported in multiple languages, you can translate the newly added custom facets to the desired languages. See Translating Strings for Search Facets.
- mappedSearchAttribute is one of the attributes defined in the metadata file, search-service-attributes.xml, and is used to map the custom facet to the search attribute, searchAttribute.
- dataType is the type of data. The available data types are keyword and date. This value should be the same as the type value in the search-service-attributes.xml metadata file.
- itemsToDisplay is an optional attribute that defines the maximum number of items to be displayed under a facet. If the value is not specified, the default value is the one configured using the option Administration > Tools and Services > Search in WebCenter Portal.
  Note:
  This attribute is supported only if the dataType keyword is used.
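The attribute rules above can be checked before a modified file is deployed. The following Python sketch is an offline helper that applies only the rules stated in this section; it is not part of WebCenter Portal, and the function name, the wrapper element, and the message wording are hypothetical.

```python
import xml.etree.ElementTree as ET

ALLOWED_TYPES = {"keyword", "date"}


def check_facets(fragment: str) -> list[str]:
    """Return a list of problems found in a run of <custom-facet/> elements."""
    # The input here is a run of elements without a single root element,
    # so wrap it in a throwaway root before parsing (an assumption of
    # this sketch, not a statement about the real file layout).
    root = ET.fromstring(f"<facets>{fragment}</facets>")
    problems = []
    for facet in root.iter("custom-facet"):
        name = facet.get("name", "<unnamed>")
        data_type = facet.get("dataType")
        if data_type not in ALLOWED_TYPES:
            problems.append(f"{name}: unsupported dataType {data_type!r}")
        if not facet.get("mappedSearchAttribute"):
            problems.append(f"{name}: missing mappedSearchAttribute")
        # itemsToDisplay is documented as valid only for keyword facets.
        if facet.get("itemsToDisplay") is not None and data_type != "keyword":
            problems.append(f"{name}: itemsToDisplay requires dataType keyword")
    return problems
```

A fragment that is not well-formed XML, for example one with two attributes run together without a space, makes `ET.fromstring` raise `xml.etree.ElementTree.ParseError`, which is also a useful check before editing the real file.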
Based on your business needs, you can add or modify the list of facets in the search-service-custom-facets.xml metadata file for any of the services available in Oracle WebCenter Portal. For example, you can add custom facets for documents, people, and other services listed in the search-service-custom-facets.xml metadata file.
This section shows how to add a custom facet for the document service. To add a custom facet for the document service, you first need to add a custom metadata field in Oracle WebCenter Content, and then rebuild the content index.
The following steps show how to add the custom facet for the document service:
Adding a Custom Metadata Field in Oracle WebCenter Content
Update the Metadata for the Document in Your Portal
CustomMetadataField
.
Rebuilding the WebCenter Content Index
In Oracle WebCenter Content, after you've created the new metadata field, you need to rebuild the collection and update the search index using the Repository Manager utility.
Configuring the SESCrawlerExport Component
You need to update the Oracle WebCenter Content SESCrawlerExport component with the newly created metadata field.
To configure the SESCrawlerExport component:
Configuring the Search Setting Metadata
You need to add the defined custom attribute in Oracle WebCenter Portal. For the new custom attribute to appear on the search settings page, you need to add it to, or update it in, the search-service-attributes.xml file.
Configuring the Search Custom Facet Metadata
On the WebCenter Portal Search Setting page, you can select which facets to display with search results. The list on the Search Setting page is driven by the search-service-custom-facets.xml metadata file, which contains a list of the facets used in WebCenter Portal. Each facet in the search-service-custom-facets.xml metadata file is mapped to a custom attribute using the mappedSearchAttribute attribute.
Scheduling a Crawl
You can schedule an incremental search crawl or manually start a full crawl. The topics in this section describe how to schedule a crawl and how to start, enable, or disable a crawl.
Scheduling an Incremental Crawl
Enabling and Disabling a Scheduled Crawl
Customizing Search Settings in WebCenter Portal Administration
You can customize Result Types and Filtering, Search Scope, Facets, and Custom Attributes on the Search Settings page in WebCenter Portal Administration. Portal managers can reset only the search scope for the portals that they manage.
To customize search settings for Elasticsearch: