Defining Search Objects
Search objects contain algorithms used to search the TMS repository in these situations:
- To find matches for verbatim terms during Autoclassification.
- To find Candidate Terms during manual classification.
In the Define Search Objects window, you can create search objects using predefined algorithms, custom PL/SQL regular expressions, or custom PL/SQL packages defined for a candidate or autocode object. If you currently use custom PL/SQL packages, see Defining a Search Object.
This section includes:
- Defining VT Transformation
- Defining Packages
- Defining a Search Object
- Assigning Search Object to a Dictionary
- Defining the Search Object Order
- Removing a Search Object's Association with a Dictionary
- Deleting a Search Object
- Creating Custom Search Algorithms
Parent topic: Defining Other TMS Elements
Defining VT Transformation
Note:
In the TMS 5.3 release, you can define VT transformations.
A verbatim term transformation allows an administrative user to create a transformation through a regular expression in order to manipulate the verbatim term.
A transformation can be either a substitution of verbatim term text or a replacement of text in a verbatim term. For example, stop words or temperatures can be removed from the verbatim term. You can also replace abbreviations with the full spelling of a term.
Sample substitution and replacement transformations are included in the application. The sample transformations cannot be removed or updated, but you can create a copy of a sample transformation and customize the copy. For more information, see Sample Substitution and Replacement Transformations.
To define a VT transformation:
- From the Definition menu, select Define Search Objects. The window opens with the VT Transformations tab selected.
- Specify the following information about the VT transformation, then save:
Substitution Transformation
It consists of:
- Name is a required text field that uniquely labels a substitution transformation.
- Type is a required drop-down list displaying: Word, Prefix, or Suffix.
  - Word is any alphanumeric text string.
  - Prefix is a partial text string representing the beginning characters of a verbatim term. For example, Anti-, Pre-, or Post-.
  - Suffix is a partial text string representing the ending characters of a verbatim term. For example, -et, -aire, or -es.
- Level is a required drop-down list displaying: Substitution Terminology - Stopwords, Substitution Terminology - Abbreviations, Substitution Terminology - Prefix, Substitution Terminology - Suffix, and all other levels associated with active dictionaries of type Substitution. This field contains the Substitution Dictionary Name and Levels.
  - Substitution Terminology - Stopwords represents words that are removed from the verbatim term. For example, as, an, the, and, or at.
  - Substitution Terminology - Abbreviations represents words that are shortened to represent a whole word. For example, Fx is fracture or Dr is doctor.
  - Substitution Terminology - Prefix is an affix placed before a word, base, or another prefix to modify a term's meaning, as by making the term negative. For example, un- or Ms.
  - Substitution Terminology - Suffix is an affix that follows the element to which it is added. For example, -ly.
- Description is a text field to describe the term substitution transformation.
- Test is a button that triggers the test (for testing purposes only).
- Test Verbatim Term is a field for entering a verbatim term (for testing purposes only).
  Note:
  In the PL/SQL regexp_replace function, this is the value used in the string parameter.
- Test Result is a read-only field displaying the result of the test (for testing purposes only).
- Created By is a read-only field displaying the user who created the record.
- Creation Timestamp is a read-only field displaying the timestamp when the record was created.
- Modified By is a read-only field displaying the user who last modified the record.
- Modification Timestamp is a read-only field displaying the timestamp when the record was last modified.
For each Substitution transformation record, you can test the substitution transformation by entering a text string in the Test Verbatim Term field and clicking the Test button. The Test Result field displays the resulting transformation.
Replacement Transformation
It consists of:
- Name is a required text field that uniquely labels a replacement transformation. The name must be unique across all substitution and replacement transformation objects.
- Pattern is a required field containing the PL/SQL regular expression to apply to a verbatim term. When you save, the expression is tested and an error is raised if the regular expression is not valid.
  Note:
  In the PL/SQL regexp_replace function, this is the value used in the pattern parameter.
  - With replacement transformations, PL/SQL regular expressions are applied to the verbatim term, and text in the verbatim term is replaced wherever the regular expression matches.
  - The format of the REGEXP_REPLACE function is: REGEXP_REPLACE( string, pattern [, replacement_string [, start_position [, nth_appearance [, match_parameter ] ] ] ] )
- Repl. Str. is an optional replacement text field. When the regular expression finds a match in the verbatim term, the matching text is replaced with this value. If left empty, the matching text is simply removed from the verbatim term. The value can be static text or a pattern.
  Note:
  In the PL/SQL regexp_replace function, this is the value used in the replacement_string parameter.
- Space: Select this checkbox when the replacement text in the Repl. Str. field must be a single space (" "). Entering any character other than a space in Repl. Str. automatically clears this checkbox. (The checkbox works around a forms limitation that prevents saving a value consisting of only a single space.)
- Start is an optional start position. This value specifies the position in the string where the regular expression starts searching. If empty, 1 is used by default.
  Note:
  In the PL/SQL regexp_replace function, this is the value used in the start_position parameter.
- Occ is an optional occurrence. This value specifies which occurrence of the pattern within the string to replace. If empty or set to 0, all occurrences of the pattern within the string are replaced.
  Note:
  In the PL/SQL regexp_replace function, this is the value used in the nth_appearance parameter.
- Description is a text field to describe the term replacement transformation.
- Test is a button that triggers the test (for testing purposes only).
- Test Verbatim Term is a field for entering a verbatim term (for testing purposes only).
  Note:
  In the PL/SQL regexp_replace function, this is the value used in the string parameter.
- Test Result is a read-only field displaying the result of the test (for testing purposes only).
- Created By is a read-only field displaying the user who created the record.
- Creation Timestamp is a read-only field displaying the timestamp when the record was created.
- Modified By is a read-only field displaying the user who last modified the record.
- Modification Timestamp is a read-only field displaying the timestamp when the record was last modified.
For each Replacement Transformation record, you can test the replacement transformation by entering text into the Test Verbatim Term field and clicking Test. The Test Result field displays the resulting transformation.
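The way the Pattern, Repl. Str., Start, and Occ fields map onto the REGEXP_REPLACE parameters can be tried in any Oracle SQL session. The queries below are a minimal sketch: the sample terms and patterns are illustrative only and are not shipped with TMS, and the queries show the behavior of the SQL function itself, not necessarily how TMS calls it internally.

```sql
-- Pattern with an empty Repl. Str., Start = 1, Occ = 0 (all occurrences):
-- every parenthesized fragment, plus the spaces before it, is removed.
SELECT REGEXP_REPLACE('Headache (bad) (left side)', ' *\([^)]*\)', NULL, 1, 0) AS result
  FROM dual;
-- result: Headache

-- Occ = 2: only the second match is replaced.
SELECT REGEXP_REPLACE('Fever 38C and 39C', '[0-9]+C', 'X', 1, 2) AS result
  FROM dual;
-- result: Fever 38C and X

-- Start = 4: text before position 4 is never searched, so the leading
-- number is kept and only the trailing number is removed.
SELECT '[' || REGEXP_REPLACE('39 Fever 39', '[0-9]+', NULL, 4, 0) || ']' AS result
  FROM dual;
-- result: [39 Fever ]

-- match_parameter 'i' makes the pattern case insensitive.
SELECT REGEXP_REPLACE('Low fever', 'FEVER', 'pyrexia', 1, 0, 'i') AS result
  FROM dual;
-- result: Low pyrexia
```

Running a candidate pattern this way before saving it can help separate regular expression problems from TMS configuration problems.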
Parent topic: Defining Search Objects
Substitution Terminology Dictionaries
For information, see Substitution Terminology Dictionaries.
Parent topic: Defining VT Transformation
Sample Substitution and Replacement Transformations
- Substitution Terminology Stop words transformation
  - Name: Stop words
  - Type: Word
  - Level: Substitution Terminology > Stopwords
  - Description: Remove stop words from the VT
- Substitution Terminology Abbreviations transformation
  - Name: Abbreviations
  - Type: Word
  - Level: Substitution Terminology > Abbreviations
  - Description: Replace abbreviations in the VT
- Substitution Terminology Prefix transformation
  - Name: Prefix
  - Type: Prefix
  - Level: Substitution Terminology > Prefix
  - Description: Remove prefix in the VT
- Substitution Terminology Suffix transformation
  - Name: Suffix
  - Type: Suffix
  - Level: Substitution Terminology > Suffix
  - Description: Remove suffix in the VT
- Replacement Transformation - Removes (Text) transformation
  - Name: Removes (Text)
  - Pattern: (^|\W)*\(.*?\)(\W|$)
  - Repl. Str: NULL
  - Space: Checked
  - Start: NULL
  - Occ: NULL
  - Match: None
  - Description: Removes text between opening and closing parentheses. For example, "Headache (bad)" = "Headache".
- Replacement Transformation - Removes temperature transformation
  - Name: Removes temperature
  - Pattern: (^|\W)(([0-9])|([0-9](.|,)[0-9]))? *((CELSIUS|FAHRENHEIT|DEGREES|DEGREE|C|F)( ?))(\W|$)
  - Repl. Str: NULL
  - Space: Checked
  - Start: NULL
  - Occ: NULL
  - Match: Case insensitive
  - Description: Removes temperature. For example, "Low Fever 38C" = "Low Fever".
- Replacement Transformation - Removes numbers transformation
  - Name: Removes numbers
  - Pattern: (^|\W)*[0-9]+(\W|$)
  - Repl. Str: NULL
  - Space: Checked
  - Start: NULL
  - Occ: NULL
  - Match: None
  - Description: Removes numbers. For example, "Test 39 times" = "Test times".
- Replacement Transformation - Removes duplicate words transformation
  - Name: Removes duplicate words
  - Pattern: (^|\W)(\w+)(\W|$)\2(\W|$)
  - Repl. Str: \1\2\3
  - Space: NULL
  - Start: NULL
  - Occ: NULL
  - Match: None
  - Description: Removes duplicate words. For example, "Test Test" = "Test".
- Replacement Transformation - Removes date formatted string transformation
  - Name: Removes date formatted string
  - Pattern: *(([0-9] {1,2}(-|/)([A-za-z]{2,3}|[0-9]{1,2} )(-|/)[0-9]{2,4})|([A-za-z]{3}(-|/)[0-9]{2,4} )) *
  - Repl. Str: NULL
  - Space: Checked
  - Start: NULL
  - Occ: NULL
  - Match: None
  - Description: Removes a date-formatted text string. The text string is not validated as a real date. For example, "Test 39-OCT-2018" = "Test".
- Replacement Transformation - Removes all non-letter/numbers transformation
  - Name: Removes all non-letter/numbers
  - Pattern: [^a-zA-Z0-9 ]
  - Repl. Str: NULL
  - Space: NULL
  - Start: NULL
  - Occ: NULL
  - Match: None
  - Description: Removes any non-alphanumeric character. For example, "Headache, bad?" = "Headache bad".
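Two of the sample patterns can be explored outside TMS with plain SQL. This is a minimal sketch that applies the patterns exactly as listed above to the example terms from their descriptions. It assumes a standard Oracle SQL session and shows only what REGEXP_REPLACE returns; any additional whitespace handling that TMS applies afterwards is not covered here.

```sql
-- "Removes all non-letter/numbers": every character that is not a letter,
-- a digit, or a space is removed (no replacement string is supplied).
SELECT REGEXP_REPLACE('Headache, bad?', '[^a-zA-Z0-9 ]') AS result
  FROM dual;
-- result: Headache bad

-- "Removes duplicate words": \2 inside the pattern refers back to the word
-- captured by the second group, so the pattern matches a word that is
-- immediately repeated. The replacement \1\2\3 keeps the leading separator,
-- one copy of the word, and the separator that followed it.
SELECT REGEXP_REPLACE('Test Test', '(^|\W)(\w+)(\W|$)\2(\W|$)', '\1\2\3') AS result
  FROM dual;
```

Wrapping the second result in delimiters, for example `'[' || ... || ']'`, makes any leftover leading or trailing separator visible when you experiment with your own copies of these patterns.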
Parent topic: Defining VT Transformation
Defining Packages
The Packages tab allows you to continue to use the custom PL/SQL packages assigned to a search object, either during autocoding or to determine candidate matches.
To define a package:
Parent topic: Defining Search Objects
Defining a Search Object
In the Search Object tab, you define the search object by assigning a name, selecting the VT Transformation, choosing how to match the term with Term Matching, specifying which terms to include, and entering a description. The Domain Match and Non appr vta match search objects are included as part of the TMS installation and cannot be updated. You can test each search object against a base dictionary to verify that the results are as expected.
To define a search object:
For more information, see:
Parent topic: Defining Search Objects
Testing a Search Object
Note:
In the TMS 5.3 release, you can test a search object.
Once a search object is defined and saved, you can test it to verify that the results are as expected.
- Select a Dictionary from the drop-down list of all active base dictionaries.
- Enter a verbatim term.
- Click Test Search to execute the selected search object on the entered verbatim term.
- The Result field returns Error, Omission, Classified, or Many, depending on the returned results:
  - Error is displayed if there is an unexpected database exception.
  - Omission is displayed if no match is found.
  - Classified is displayed if a direct match is found.
  - Many is displayed if more than one match is found.
Transformed Verbatim Term is a display-only field showing the verbatim term after the search object has been applied.
The Term Match field displays the resulting dictionary term or global VTA when there is a direct match.
Parent topic: Defining a Search Object
Assigning Search Object to a Dictionary
After setting up the Search Object, assign the search object to a dictionary and domain. You must specify the dictionaries with which each Search Object is available for use, as follows:
Note:
If you want to continue to use the Domain VTA or Non appr vta match search objects, you must assign them to a dictionary and domain.
Parent topic: Defining Search Objects
Defining the Search Object Order
In the Search Object Order tab, you can define the order in which search objects are executed during autocoding for a given domain/dictionary. The search objects assigned in the Apply to Dictionary tab become available once you have selected the Dictionary/Domain.
To define the order in which search objects are executed:
Once you have defined the search object execution order, you can test the execution order.
Testing the Search Object Order
Test the search object order in the second table (bottom block). This table displays the following fields:
- Verbatim Term is an enterable free-form text field.
- Result is a display-only drop-down list (Omission, Classified, Error, Setup Error):
  - Omission is returned if no match or multiple matches are found by the search object order, or if a direct match is found when the Action 1:1 Match is set to Omission.
  - Classified is returned if an exact match is found by the search object order.
  - Error is returned if an unexpected exception is thrown in the database, for example because of data corruption in the dictionary.
  - Setup Error is returned if an error is encountered in the database while setting global variables (for example, the instance needs to be registered, synchronization needs to be run on the instance, and so on). It is also returned if, for example, the dictionary is not assigned to a domain or is not a base or virtual dictionary.
- Test Search button.
- The Candidates table displays the matched terms in the dictionary. The results must be the same as those of the Classify VTO candidate search, where only distinct matches are returned.
  - Term is the dictionary term or VTA in the domain/dictionary.
  - (Term) Type is either a dictionary term or verbatim term.
  - Code is the dictionary code field.
  - Alt Code is the alternate code defined for the dictionary term.
  - Approved? checkbox indicates whether the dictionary term/VTA is approved.
  - Comment is a field displaying any text entered about the dictionary term/VTA.
Parent topic: Defining Search Objects
Removing a Search Object's Association with a Dictionary
If you decide that you no longer want to use a particular search object with a dictionary in any domain, you can remove the search object from the Search Object Order, Apply to Dictionary, and Search Objects tabs.
Omissions previously created using the search object still exist and are queryable and viewable in the Classify VT Omissions window. However, the Search Object field for these omissions is blank, and the system does not use the search object to create any additional omissions associated with the dictionary.
To remove a search object from the execution order of a dictionary:
- From the Definition menu, select Define Search Objects.
- Click the Search Object Order tab.
- From the Dictionary list, choose the dictionary from which you want to remove a search object.
- Select the domain from the Domain drop-down list. TMS lists the search objects defined for this dictionary and domain.
- Scroll to, or query for, the search object you want to remove, and select its row.
- With the search object's row highlighted, delete the record.
- Save. TMS deletes the search object's association with this dictionary.
Parent topic: Defining Search Objects
Deleting a Search Object
If you decide that you no longer want to use a search object with any dictionary, you can delete it entirely from TMS. Previously created omissions still exist, and are queryable and viewable in the Classify VT Omissions window. However, the Search Object field for these omissions is blank, and the system does not use the search object to create any additional omissions.
To delete a search object assigned to a dictionary:
- From the Definition menu, select Define Search Objects.
- Click the Apply to Dictionary tab.
- From the Dictionary drop-down list, choose the dictionary from which you want to remove a search object. TMS lists the search objects defined for this dictionary.
- Select the Search Object record and delete the record.
To delete a search object:
- From the Definition menu, select Define Search Objects.
- Click the Search Objects tab.
- Query for the search object you want to delete.
- Select the Search Object record and delete the record.
Parent topic: Defining Search Objects
Creating Custom Search Algorithms
Note:
In the TMS 5.3 release, you can no longer customize search algorithms for Extended Searches.
Each of the search algorithms provided with TMS searches only for direct matches to the verbatim term in the TMS repository. You may want to take advantage of the text-retrieval capabilities available in the Oracle database. The interMedia Text software, formerly known as the Context Server Cartridge, is integrated in the RDBMS.
To create a custom search algorithm, create a PL/SQL function in the database, enter its name in the appropriate field (Autocode Object or Candidate Object) of the Search Object Definition window in TMS, and set candidate_type='package'. For an overview, see Defining a Search Object.
If the function is part of a package, follow the naming convention package_name.function_name. When TMS runs Autoclassification or a Candidate Term search, it calls the functions in the order you specify in the Exec Order field.
You will probably want to make each search algorithm available in both situations, but this is optional.
The input and output parameters vary in each situation as follows:
- Writing a Search Algorithm for Autoclassification
- Writing a Search Algorithm for Candidate Term Generation
Parent topic: Defining Search Objects
Writing a Search Algorithm for Autoclassification
You can create one or more custom search algorithms to supplement TMS's Autoclassification process. Create a PL/SQL function in the database and enter its name in the Autocode Object field in the Packages tab of the Define Search Objects window in TMS (see Defining a Search Object). Each Autoclassification search algorithm must have the following specification:
    FUNCTION function_name (
        pDefDictionaryId   IN tms_def_dictionaries.def_dictionary_id%TYPE
      , pDefDomainId       IN tms_def_domains.def_domain_id%TYPE
      , pVtLevelId         IN tms_def_levels.def_level_id%TYPE
      , pVtCodeLevelId     IN tms_def_levels.def_level_id%TYPE
      , pVTASearchFlag     IN VARCHAR2
      , pTermUpper         IN tms_dict_contents.term_upper%TYPE
      , pVDefDictionaryID  IN tms_def_dictionaries.def_dictionary_id%TYPE
      , pCutoffDate        IN tms_def_dictionaries.cutoff_date%TYPE
      , pDictContentid     IN OUT tms_dict_contents.dict_content_id%TYPE
    ) RETURN PLS_INTEGER;
The function_name may be part of package_name.function_name and must correspond to the value entered in the Autocode Object field of the Define Search Objects window.
The function has one output parameter that identifies the dictionary term or VTA match it finds. The function returns an integer as follows:
Value | Returned if Function Finds | Next, TMS |
---|---|---|
0 | no matches | runs next search object, if any |
1 | one and only one match | stops Autoclassification, returns match |
2 | more than one match | if the exit criterion Stop 1:m is set to Y, Autoclassification stops and returns the search object that found the matches; if the exit criterion is set to N, TMS runs the next search object, if any. |
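The specification and return values above can be combined into a small function. The following is a minimal sketch, not the algorithm TMS itself uses. The function name my_autocode_search is hypothetical, and the query assumes that tms_dict_contents exposes def_dictionary_id and end_ts columns alongside the term_upper and dict_content_id columns named in the specification. Check these assumptions against your TMS schema before using anything like it.

```sql
CREATE OR REPLACE FUNCTION my_autocode_search (  -- hypothetical name
    pDefDictionaryId   IN tms_def_dictionaries.def_dictionary_id%TYPE
  , pDefDomainId       IN tms_def_domains.def_domain_id%TYPE
  , pVtLevelId         IN tms_def_levels.def_level_id%TYPE
  , pVtCodeLevelId     IN tms_def_levels.def_level_id%TYPE
  , pVTASearchFlag     IN VARCHAR2
  , pTermUpper         IN tms_dict_contents.term_upper%TYPE
  , pVDefDictionaryID  IN tms_def_dictionaries.def_dictionary_id%TYPE
  , pCutoffDate        IN tms_def_dictionaries.cutoff_date%TYPE
  , pDictContentid     IN OUT tms_dict_contents.dict_content_id%TYPE
) RETURN PLS_INTEGER
IS
  l_count PLS_INTEGER;
  l_id    tms_dict_contents.dict_content_id%TYPE;
BEGIN
  -- Strip punctuation from the verbatim term, then look for an exact match
  -- among current dictionary terms.
  -- ASSUMPTION: def_dictionary_id and end_ts are columns of tms_dict_contents.
  SELECT COUNT(*), MIN(dc.dict_content_id)
    INTO l_count, l_id
    FROM tms_dict_contents dc
   WHERE dc.def_dictionary_id = pDefDictionaryId
     AND dc.term_upper = REGEXP_REPLACE(pTermUpper, '[^A-Z0-9 ]')
     AND dc.end_ts = TO_DATE(3000000, 'J');

  IF l_count = 0 THEN
    RETURN 0;                  -- no match: TMS runs the next search object
  ELSIF l_count = 1 THEN
    pDictContentid := l_id;    -- report the single match through the OUT parameter
    RETURN 1;                  -- exactly one match: Autoclassification stops
  ELSE
    RETURN 2;                  -- several matches: behavior depends on Stop 1:m
  END IF;
END my_autocode_search;
/
```

The aggregate query always returns exactly one row, so the function needs no NO_DATA_FOUND or TOO_MANY_ROWS handling to distinguish the three return values.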
Parent topic: Creating Custom Search Algorithms
Writing a Search Algorithm for Candidate Term Generation
You can create one or more custom search algorithms to supplement TMS's Candidate Term search process. Create a PL/SQL function in the database and enter its name in the Candidate Object field of the Search Object Definition window in TMS (see Defining a Search Object). Each candidate search algorithm must have the following specification:
    FUNCTION function_name (
        pDefSearchId       IN tms_vt_omissions.def_search_id%TYPE
      , pDefDictionaryId   IN tms_vt_omissions.def_dictionary_id%TYPE
      , pDefDomainId       IN tms_vt_omissions.def_domain_id%TYPE
      , pVtLevelId         IN tms_def_levels.def_level_id%TYPE
      , pVtCodeLevelId     IN tms_def_levels.def_level_id%TYPE
      , pTermUpper         IN tms_dict_contents.term_upper%TYPE
      , pVDefDictionaryID  IN tms_def_dictionaries.def_dictionary_id%TYPE
      , pCutoffDate        IN tms_def_dictionaries.cutoff_date%TYPE
    ) RETURN VARCHAR2;
The function_name may be part of package_name.function_name and must correspond to the value entered in the Candidate Object field of the Define Search Objects window.
The function returns a character string that TMS uses as the WHERE clause of the query that populates the lower block of the Classify VT Omissions window. The user can invoke the function during manual classification (see "Candidate Terms and Search Objects" in Classification Concepts).
In addition, you can restrict an algorithm's search to only current terms by adding the following to the returned WHERE clause:
AND end_ts = to_date(3000000, 'J')
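A candidate search function therefore only has to build and return a piece of SQL text. The following is a minimal sketch under several assumptions: the function name my_candidate_search is hypothetical, the third parameter is assumed to be named pDefDomainId with a single pCutoffDate parameter, term_upper is assumed to be a column available to the query that TMS builds, and the returned text is assumed to be a condition without a leading WHERE keyword. Confirm each of these against your TMS installation.

```sql
CREATE OR REPLACE FUNCTION my_candidate_search (  -- hypothetical name
    pDefSearchId       IN tms_vt_omissions.def_search_id%TYPE
  , pDefDictionaryId   IN tms_vt_omissions.def_dictionary_id%TYPE
  , pDefDomainId       IN tms_vt_omissions.def_domain_id%TYPE
  , pVtLevelId         IN tms_def_levels.def_level_id%TYPE
  , pVtCodeLevelId     IN tms_def_levels.def_level_id%TYPE
  , pTermUpper         IN tms_dict_contents.term_upper%TYPE
  , pVDefDictionaryID  IN tms_def_dictionaries.def_dictionary_id%TYPE
  , pCutoffDate        IN tms_def_dictionaries.cutoff_date%TYPE
) RETURN VARCHAR2
IS
  -- Double any single quotes so the term is safe inside a SQL string literal.
  l_term VARCHAR2(4000) := REPLACE(pTermUpper, '''', '''''');
BEGIN
  -- Candidates are current terms that start with the verbatim term.
  -- ASSUMPTION: term_upper and end_ts are valid columns in the target query.
  RETURN 'term_upper LIKE ''' || l_term || '%'''
      || ' AND end_ts = to_date(3000000, ''J'')';
END my_candidate_search;
/
```

For a verbatim term of HEADACHE, the function returns the text `term_upper LIKE 'HEADACHE%' AND end_ts = to_date(3000000, 'J')`. Escaping the quotes matters because the returned string is executed as SQL, and verbatim terms can contain apostrophes.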
Parent topic: Creating Custom Search Algorithms