Configuring OpenSearch Integration via YAML File

The YAML file for OpenSearch is a human-readable configuration file that defines the settings and parameters for OpenSearch and the search categories for the Siebel application. It provides a structured way to configure and customize the behavior and environment of a Siebel deployment with OpenSearch. The file also serves as the source of truth for the current status of indexing and model deployment, so administrators can use it to identify issues when failures occur.

Follow these steps to configure the YAML file:

  1. Go to <Siebel_Build_Location>/ses/applicationcontainer_internal/webapps.
  2. Open modernsearchconfig.yaml and configure both the downstream (OpenSearch) and upstream (Siebel) sections.
  3. Save your changes and restart your Siebel Tomcat Server.

Downstream Settings:

Note: If you modify any of the connectivity parameters (username, password, url, or port), you must restart your internal Tomcat Server for the changes to take effect.
Each category below lists its settings, followed by a description.

Connectivity

    name: OpenSearch
    username: <CHANGE_ME>
    password: <CHANGE_ME>
    version: 2.15.0
    url: <CHANGE_ME>
    port: <CHANGE_ME>

Connectivity details for accessing your OpenSearch instance.

Index

    category:
      - servicerequests:
          isIndexed: false
      - contacts:
          isIndexed: false
      - accounts:
          isIndexed: false
      - opportunities:
          isIndexed: false
      - literature:
          isIndexed: false

  • Indexing status for the five seeded categories. The default indexing status of each category is 'false'. Once the indexing process has completed, OpenSearch updates the status to 'true'. Administrators should not update the status value manually.
  • If you add a new category, such as 'Products', also add a new entry to this Index section:

        - products:
            isIndexed: false

    Provide a unique index name that maps to your search category. This step is required so that OpenSearch can index the newly added category.

  • Index Naming Conventions:
    • Lowercase letters only. Uppercase letters are not allowed.
    • Cannot contain spaces.
    • Allowed characters: letters, numbers, hyphens, underscores, and periods.
    • Cannot start with an underscore, hyphen, or plus sign.
    • Length: less than 255 bytes.
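
The naming conventions above can be checked mechanically before a new index entry is added to the file. The following is a minimal sketch, not part of the Siebel or OpenSearch tooling: the function name index_name_problems is hypothetical, and "letters" is assumed to mean the ASCII letters a to z.

```python
import re

# Allowed characters per the conventions above: lowercase letters, digits,
# hyphens, underscores, and periods. Restricting "letters" to ASCII a-z is
# an assumption of this sketch.
_ALLOWED = re.compile(r"[a-z0-9._-]+")
_BAD_FIRST_CHARS = ("_", "-", "+")


def index_name_problems(name):
    """Return a list of convention violations; an empty list means the name passes."""
    if not name:
        return ["name is empty"]
    problems = []
    if name != name.lower():
        problems.append("contains uppercase letters")
    if any(ch.isspace() for ch in name):
        problems.append("contains spaces")
    # Case and plain spaces are reported above, so ignore them for this check.
    if not _ALLOWED.fullmatch(name.lower().replace(" ", "")):
        problems.append("contains characters other than letters, numbers, '-', '_', '.'")
    if name.startswith(_BAD_FIRST_CHARS):
        problems.append("starts with '_', '-', or '+'")
    if len(name.encode("utf-8")) >= 255:
        problems.append("is 255 bytes or longer")
    return problems


for candidate in ("products", "Products", "my index", "_products"):
    print(candidate, "->", index_name_problems(candidate) or "OK")
```

Returning a list of problems rather than a single boolean lets an administrator see every violation in one pass.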
Language

    locale:
      defaultLanguage: English

The value must correspond to a language supported by OpenSearch 2.15.0 language analyzers and match the expected string format defined by OpenSearch.

Search

    maxResults: 30
    percentOfTopScore: 20

maxResults: Maximum number of results retrieved from the OpenSearch search engine.

percentOfTopScore: Threshold expressed as a percentage of the top hit's score. When applied, OpenSearch uses this threshold to decide which documents are close enough in relevance to the highest scoring document to be considered in the rescoring or filtering process.

  • Suppose the top document in your search has a score of 10.0.

  • If percentOfTopScore is set to 90, documents with a score of at least 90% of 10.0 (that is, scores >= 9.0) are included for further processing or rescoring.

  • Documents that score below that threshold are excluded from that step.
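
The threshold arithmetic in the example above can be sketched as follows. This is only an illustration of the calculation, not how OpenSearch applies it internally; the function name and the (doc_id, score) pair format are hypothetical.

```python
def filter_by_percent_of_top_score(hits, percent_of_top_score):
    """Keep hits whose score is at least percent_of_top_score % of the best score.

    `hits` is a list of (doc_id, score) pairs; this shape is an assumption
    made for the example.
    """
    if not hits:
        return []
    top_score = max(score for _, score in hits)
    threshold = top_score * percent_of_top_score / 100
    return [(doc_id, score) for doc_id, score in hits if score >= threshold]


hits = [("doc-a", 10.0), ("doc-b", 9.2), ("doc-c", 8.9), ("doc-d", 2.5)]

# With 90, the threshold is 9.0, so only doc-a and doc-b remain.
print(filter_by_percent_of_top_score(hits, 90))

# With the seeded value of 20, the threshold is 2.0, so all four remain.
print(filter_by_percent_of_top_score(hits, 20))
```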

ML

    modelId: <CHANGE_ME> # DO NOT TOUCH, UPDATED BY SYSTEM
    modelGroupId: <CHANGE_ME> # DO NOT TOUCH, UPDATED BY SYSTEM
    modelRegistrationTaskId: <CHANGE_ME> # DO NOT TOUCH, UPDATED BY SYSTEM
    modelDeploymentTaskId: <CHANGE_ME> # DO NOT TOUCH, UPDATED BY SYSTEM

Note: Do not manually change the values in this section.

modelId: system field, to be updated by OpenSearch

modelGroupId: system field, to be updated by OpenSearch

modelStatus: system field, to be updated by OpenSearch. Sample statuses: INIT, REG_IN_PROGRESS, REG_COMPLETE, DEPLY_IN_PROGRESS, DEPLOY_COMPLETED. Check this value to track the model registration and deployment status.

modelRegistrationTaskId: system field, to be updated by OpenSearch. Refers to a unique identifier used to track the status of a model registration task.

modelDeploymentTaskId: system field, to be updated by OpenSearch. Refers to a unique identifier used to track the status of a model deployment task.

Note: To test a new OpenSearch model, choose one that is compatible with the Siebel application: the model must use a 768-dimensional dense vector space.

Pipeline

    - multi_match:
        weightFactor: 0.5
    - neural:
        weightFactor: 0.5

Multi-Match Weight Factor: used for keyword-based matches

Neural Weight Factor: used for semantic-based matches

Tuning Weight Factors for Hybrid Search:

  • Start with equal weights.
  • Increase the multi_match weightFactor if exact term matching is more important.
  • Increase the neural weightFactor if you are prioritizing semantic similarity.
  • The sum of the two weight factors should not exceed 1.

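
The tuning guidance above can be sanity-checked before the file is saved. The sketch below rests on two assumptions that the guide does not state: each weightFactor is taken to lie between 0 and 1, and the two scores are taken to be normalized to the 0..1 range and combined as a weighted sum. Both function names are hypothetical.

```python
def check_weight_factors(multi_match_weight, neural_weight):
    """Validate the two weightFactor values and return their sum."""
    pairs = (("multi_match", multi_match_weight), ("neural", neural_weight))
    for name, value in pairs:
        # Assumption: a single weight factor is a fraction between 0 and 1.
        if not 0 <= value <= 1:
            raise ValueError(f"{name} weightFactor must be between 0 and 1, got {value}")
    total = multi_match_weight + neural_weight
    # Small tolerance so that floating-point rounding does not trigger the error.
    if total > 1 + 1e-9:
        raise ValueError(f"weight factors sum to {total}, which exceeds 1")
    return total


def combined_score(keyword_score, semantic_score,
                   multi_match_weight=0.5, neural_weight=0.5):
    """Illustrative weighted sum of a keyword score and a semantic score.

    Assumption: both scores are already normalized to the 0..1 range.
    """
    check_weight_factors(multi_match_weight, neural_weight)
    return multi_match_weight * keyword_score + neural_weight * semantic_score


# Equal weights versus weights that favor exact term matching.
print(combined_score(0.8, 0.4))
print(combined_score(0.8, 0.4, multi_match_weight=0.7, neural_weight=0.3))
```

With a keyword score higher than the semantic score, shifting weight toward multi_match raises the combined score, which matches the guidance to increase that factor when exact term matching matters more.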
Suggester

    numberOfSuggestions: 5
    minLength: 3
    transpositions: true
    fuzzyEnabled: true
    fuzziness: AUTO
    prefixLength: 3

  • numberOfSuggestions: 5

    Return up to 5 suggestions (e.g., autocomplete or spellcheck results).

  • minLength: 3

    Only suggest corrections or completions for queries at least 3 characters long.

  • transpositions: true

    Treats a swap of two adjacent characters (such as "teh" → "the") as a single valid edit in fuzzy matching.

  • fuzzyEnabled: true

    Enables fuzzy matching.

  • fuzziness: AUTO

    Fuzziness refers to the tolerance of misspellings or slight variations in the terms you're searching for. When you set fuzziness: AUTO, OpenSearch will automatically determine the fuzziness level based on the length of the term being searched.

  • prefixLength: 3

    Fuzzy matching applies only after the first prefixLength characters of the term (here, 3), which must match exactly.
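
To make the fuzziness and transpositions settings above concrete, the sketch below shows two things: the edit distance that fuzziness AUTO allows under the default OpenSearch thresholds (0 edits for terms of 1 to 2 characters, 1 edit for 3 to 5 characters, 2 edits for longer terms), and how counting an adjacent swap as one edit changes the distance between "teh" and "the". The function names are hypothetical, and the distance routine is a textbook implementation written for illustration, not the code OpenSearch runs.

```python
def auto_fuzziness(term):
    """Maximum edit distance allowed by fuzziness AUTO for a term of this length."""
    length = len(term)
    if length <= 2:
        return 0
    if length <= 5:
        return 1
    return 2


def edit_distance(a, b, transpositions=True):
    """Edit distance between a and b; an adjacent swap optionally counts as one edit."""
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution or match
            )
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                # Swap of two adjacent characters counted as a single edit.
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[-1][-1]


print(auto_fuzziness("teh"))                              # 1 edit allowed
print(edit_distance("teh", "the"))                        # 1 with transpositions
print(edit_distance("teh", "the", transpositions=False))  # 2 without
```

A three-letter term is allowed one edit under AUTO, so "teh" is within reach of "the" only when the swap is counted as a single edit.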

Upstream Settings:

Each category below lists its settings, followed by a description.

Data Source

- type: SiebelDB

Reserved Static Value. Do not change.

Category

    - Service Requests ModernSearch:
        embeddingData: "{{Status}} SR with {{SR Number}} of {{Account}} is having {{Description}}"
    - Contacts ModernSearch:
        embeddingData: "{{First Name}} {{Last Name}} of {{Account}} from {{Personal City}}"
    - Accounts ModernSearch:
        embeddingData: "{{Name}} located at {{Location}} with {{Account Status}} is assigned to {{Sales Rep}}"
    - Opportunities ModernSearch:
        embeddingData: "{{Name}} of {{Account}} is having revenue {{Primary Revenue Amount}} assigned to {{Sales Rep}}"
    - Literature ModernSearch:
        embeddingData: "{{Name}} is having content {{Description}}"

Service Requests ModernSearch, Contacts ModernSearch, Accounts ModernSearch, Opportunities ModernSearch, and Literature ModernSearch are all seeded categories, each with one predefined embeddingData entry.

embeddingData:

  • embeddingData combines the key common fields that convey the meaning of a particular search category.
  • It is defined to support and optimize semantic search and auto suggestion. Both features operate on the embeddingData virtual field only.

Guidelines to define embeddingData:

  • Maintain only one embeddingData entry for each category. Do not add additional embeddingData entries.
  • Administrators can edit the seeded embeddingData as needed. It is recommended to use fields that contain unstructured data, such as Description, Summary, or Notes. Make sure these fields have been indexed; otherwise, the field shows 'NA' on the results page.
  • The field name should match the name defined in Siebel Application → Search Category → Available Fields.
  • If no embeddingData is defined, you can leave the entry blank. For example: embeddingData: null. In this case, OpenSearch will only perform keyword matching. It will not perform semantic search or auto suggestion.
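
To see what text an embeddingData template yields for a given record, the placeholders can be expanded by hand. The sketch below is only an illustration: the guide does not describe how Siebel expands the template internally, the function names and the sample record are hypothetical, and substituting 'NA' for a missing field is an assumption modeled on the results-page behavior described above.

```python
import re

# Matches {{Field Name}} placeholders such as {{First Name}}.
_PLACEHOLDER = re.compile(r"\{\{(.+?)\}\}")


def template_fields(template):
    """List the field names referenced by an embeddingData template."""
    return [name.strip() for name in _PLACEHOLDER.findall(template or "")]


def render_embedding_data(template, record, missing="NA"):
    """Replace each {{Field Name}} placeholder with the record's value.

    Returns None when no template is defined (embeddingData: null).
    """
    if template is None:
        return None

    def substitute(match):
        value = record.get(match.group(1).strip())
        # Assumption: fields that are absent or empty are shown as 'NA'.
        return missing if value in (None, "") else str(value)

    return _PLACEHOLDER.sub(substitute, template)


template = "{{First Name}} {{Last Name}} of {{Account}} from {{Personal City}}"
record = {"First Name": "Ana", "Last Name": "Lopez", "Account": "Acme"}  # sample data

print(template_fields(template))
print(render_embedding_data(template, record))
```

Listing the referenced fields first makes it easy to compare them against the names under Search Category → Available Fields before editing a template.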
Note: In the YAML file, there must be a 1:1 mapping between the category names in the upstream section and the index names in the downstream section. List the category names and the index names in the same order. See the diagram below:
YAML File Workflow
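
The positional 1:1 mapping between upstream category names and downstream index names can be checked with a short script. This is a minimal sketch that assumes the two lists have already been read from the YAML file in file order; the function name is hypothetical, and the sample values are the seeded names shown earlier.

```python
def pair_categories_with_indexes(upstream_categories, downstream_indexes):
    """Pair each upstream category with the downstream index at the same position."""
    if len(upstream_categories) != len(downstream_indexes):
        raise ValueError(
            f"{len(upstream_categories)} categories but "
            f"{len(downstream_indexes)} indexes; the mapping must be 1:1"
        )
    checks = (("category", upstream_categories), ("index", downstream_indexes))
    for label, names in checks:
        if len(set(names)) != len(names):
            raise ValueError(f"duplicate {label} name found")
    return list(zip(upstream_categories, downstream_indexes))


upstream = [
    "Service Requests ModernSearch",
    "Contacts ModernSearch",
    "Accounts ModernSearch",
    "Opportunities ModernSearch",
    "Literature ModernSearch",
]
downstream = ["servicerequests", "contacts", "accounts", "opportunities", "literature"]

for category, index in pair_categories_with_indexes(upstream, downstream):
    print(f"{category} -> {index}")
```

The check catches a missing or duplicated entry, but it cannot detect two entries listed in the wrong order, so the printed pairs still need a visual review.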