You can submit a search to Oracle SES in the following places:
The Search page of the Oracle SES Administration GUI
An Oracle SES secure portlet in an Oracle Application Server Portal page
A query string of the Oracle SES Web services API
See Also:
Oracle SES Search Application Help
Figure 3-1 shows the default Oracle SES Search Application. Enter your search string and click the Search button, as you would with any other search engine.
A search string can consist of one or more words. It is case-insensitive. Clicking the Search button returns all matches for that search string. The match can be in the body of the document or in an attribute value, such as the author.
Figure 3-2 shows sample results for the search string Secure Enterprise Search. You can reorder the way results are presented using the Group by and Sort by lists.
Oracle SES includes KWIC (keyword in context) as part of each search result. KWIC has a size restriction of 4K: if the searched keyword appears in the first 4K of a document, then the KWIC is shown for the search result. If the keyword appears only after the first 4K, then no KWIC is shown.
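The 4K rule can be pictured with a small sketch. It assumes that "4K" means the first 4096 characters of the document text and that matching is a case-insensitive substring test. The name `kwic_snippet` and the context width are illustrative and are not part of Oracle SES.

```python
KWIC_LIMIT = 4096  # assumption: "4K" read as 4096 characters


def kwic_snippet(text, keyword, context=30):
    """Return keyword-in-context text, or None if the keyword is outside the first 4K."""
    window = text[:KWIC_LIMIT]
    pos = window.lower().find(keyword.lower())
    if pos == -1:
        return None  # keyword not in the first 4K: no KWIC is shown
    start = max(0, pos - context)
    end = min(len(window), pos + len(keyword) + context)
    return window[start:end]


early = "Oracle Secure Enterprise Search indexes many sources."
late = "x" * 5000 + " Oracle Secure Enterprise Search"

print(kwic_snippet(early, "enterprise"))  # a snippet around the keyword
print(kwic_snippet(late, "enterprise"))   # None, keyword is past the first 4K
```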
Oracle SES applies stemming to the query term. Stemming expands the term to other terms that share the same root. For example, [banks] returns documents containing the word banks, banking, or bank. Oracle SES uses stemming based on the language of the query, which is determined by the browser language in the default query application or is supplied by the caller in the query API. Implicit stemming expansion applies to individual terms in term search, in proximity search, and in attribute shortcut search for STRING attributes. Implicit stemming expansion does not apply to phrase search, and it can be turned off by enclosing the term in double quotes.
Oracle SES can provide a suggestion for a query in the form of a "do you mean..." message for alternate word pairs. When you select the Auto-Expand option in the Oracle SES Administration GUI, Oracle SES automatically includes the alternate keyword as part of the search. An alternate keyword definition composed of multiple words is handled like a phrase in the search. For example, if the alternate word pair is "RAC" and "Real Application Clusters," then a query for "RAC" returns documents containing the word "RAC" or the phrase "Real Application Clusters."
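As an illustration of the effect of Auto-Expand, the sketch below rewrites a query term into an equivalent OR query. Oracle SES performs this expansion internally. The word list structure, the function names, and the case-insensitive lookup are assumptions made for this example.

```python
# Hypothetical alternate word list: (keyword, alternate keyword, auto_expand flag).
ALTERNATE_WORDS = [("RAC", "Real Application Clusters", True)]


def as_query_item(text):
    """Quote multi-word text so that it is searched as a phrase."""
    return f'"{text}"' if " " in text else text


def expand_with_alternates(term, pairs=ALTERNATE_WORDS):
    """Return an OR query covering the term and its auto-expanded alternates."""
    items = [as_query_item(term)]
    for keyword, alternate, auto_expand in pairs:
        if auto_expand and keyword.lower() == term.lower():
            items.append(as_query_item(alternate))
    return " | ".join(items)


print(expand_with_alternates("RAC"))
print(expand_with_alternates("database"))
```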
If the search administrator has turned on spell checker, then Oracle SES only gives suggestions for term search, phrase search, and proximity search. Spell checker does not give suggestions for terms or phrases in attribute shortcut search.
Alternate words expansion is turned off for searches that use advanced constructs like thesaurus-based search, proximity search, fuzzy search, phrase search, and compulsory exclusion search. See Table 3-1, "Search String Rules".
See Also:
"Alternate Words"
The results can include the following links:
Cached: The cached HTML version of the document.
Links: Pages that link to and from this document.
Source Group: Browse the source group.
Any links on top of the search text box are source groups. Clicking a source group restricts the search to that group.
Table 3-1 describes rules that apply to the search string. Text in square brackets represents characters entered into the search.
Rule | Description |
---|---|
Single term search |
Entering one term finds documents that contain that term. For example, [Oracle] matches all documents that contain the word Oracle anywhere in that document. You can enter any two searchable items (including term, phrase, attribute shortcut, and proximity search) in a query with a white space separating them and the AND operator applies. The operator [&] also explicitly denotes an AND relationship. For example, [oracle text] and [oracle & text] both return documents containing oracle and text. |
Phrase search ["..."] |
Put quotes around a set of words to find documents that contain that exact phrase. Oracle SES does not apply implicit stemming expansion to a query phrase, but it can apply explicit term expansion to terms in a phrase. All operators except term expansion operators in a phrase are not treated as valid operators but as normal special characters. For example, [oracle "RAC performance"] returns documents containing oracle and the phrase "RAC performance". Documents containing the stemming form "RAC performances" are not returned. (There is no implicit stemming expansion on either term.) The query ["sec*re search"] returns documents with the phrase "secure search". The query ["sec^re search"] returns documents with the phrase "sec re search". |
Attribute shortcut search [attribute_name:attribute_value] |
Search on attributes with an attribute name, a colon (:), and then the value to be searched. Implicit stemming is applied to the attribute value term. You can specify operators as options. When no operator is specified, Oracle SES uses Contains for STRING attributes and Equals for NUMBER and DATE attributes. For example, [DocVersion:>1] returns documents whose NUMBER attribute DocVersion has a value larger than 1. The query [title:"oracle text"] returns documents with the phrase "oracle text" in the title attribute. The query [oracle | title:S*S] returns documents with the term oracle or SES in the title attribute. The query [title:^oracle] has the same effect as [title:oracle]. The contains [^] operator applies only to STRING attributes.
See Also: "Searching on Date Attributes" and "Advanced Search" |
Proximity search ["..."~] |
A proximity search specifies the maximum distance within which multiple terms occur. A proximity search must have the search terms in double quotes. When the maximum spanning distance is not specified, Oracle SES applies a default window of 100 terms. The maximum number is 100. When a value larger than 100 is specified, Oracle SES treats it as 100. For example, ["ses performance"~10] returns documents with the terms |
Fuzzy [...~] search |
Put the operator (~) after a single term to return documents that contain terms similar to the query term. For example, [hallo~] returns documents containing the term hello. The query [specifi*tion~] returns documents containing the term specification. Note: If a single term enclosed in double quotes is followed by ~, then the query is not a proximity search but a fuzzy search. The query ["parformance"~] returns documents containing the term performance. |
Thesaurus-based search Synonym [~...] search Narrower term [<] search Broader term [>] search |
Thesaurus-based operators require that a thesaurus be loaded into Oracle SES. Put the operator [~] at the beginning of a term to return documents that contain the original query term or a synonym for it. For example, [title:~"RAC"] returns documents with RAC or the synonym real application clusters in the title. A synonym relationship is symmetric: real application clusters is a synonym of RAC, and RAC is a synonym of real application clusters. In attribute search, it applies only to the STRING attribute. The query [<"Northern California"] returns documents with the thesaurus-defined narrower term San Francisco or the original phrase Northern California. The query [product:>chair] returns documents whose product attributes contain the broader term furniture or the original term chair. Broader and narrower terms are symmetric. Specifying that furniture is a broader term of chair also implicitly specifies that chair is a narrower term of furniture. See Also: "Thesaurus-Based Search" |
OR [|] search |
Use the OR [|] operator to connect any two searchable items. For example, [oracle | "RAC performance"~ ] returns documents with the term oracle or with the terms RAC and performance within any window spanning 100 terms. The query [oracle | title:SES] returns documents with the term |
Grouping ( ) search |
Use parentheses ( ) to group query components to change precedence of the binary logical operators AND and OR. The grouped query components must form a valid query. If the query string inside parentheses is not a valid query, then Oracle SES implicitly rewrites it to the closest valid query. For example, [(oracle | database) sales] returns documents containing sales and containing either oracle or database. The query [(oracle |) sales] returns documents containing oracle and sales. This is because [oracle |] is not a valid query. |
Wildcard matching [*] for multiple characters |
Put the operator [*] in the middle or at the end of a term for wildcard matching. It can be applied multiple times in one term. For example, [ora*] finds documents that contain words beginning with ora, such as Oracle and orator. The query [title:a*e] returns documents with the title containing words such as apple or ape. Multiple character wildcard expansion can match too many terms. For example, [a*] could expand too broadly, in which case Oracle SES raises an error and you must refine the query. The wildcard operator [*] has no effect when the escape character [\] appears just before it, for example [Pro\*c]. Wildcard matching cannot be used with Chinese or Japanese native characters. |
Wildcard matching [?] for single characters |
Put the operator (?) in the middle or at the end of a term for wildcard matching of a single character. It can be applied multiple times in one term. For example, [orac?e] and [or?cl?] both return documents containing terms that replace ? with a single character, such as Oracle. The wildcard operator [?] has no effect when the escape character [\] appears just before it. Wildcard matching cannot be used with Chinese or Japanese native characters. |
Compulsory inclusion [+] search |
Put the operator (+) at the beginning of any searchable item (including term, phrase, attribute, and proximity search) to require that the word be found in all matching documents. There should be no space between the [+] and the search term. For example, searching for [Oracle +Applications] only finds documents that contain the words Oracle and Applications. When compulsory inclusion search is used with the OR (|) operator, the compulsory inclusion operator does not have any effect. For example, searching for [text | +database] returns documents containing the term text or database. |
Compulsory exclusion [-] search |
Put the operator (-) at the beginning of any searchable item (including term, phrase, attribute, and proximity search) to require that the item not be found in any matching document. It can be a single word or a phrase, but there should be no space between the [-] and the token. For example, [oracle -applications] returns documents containing oracle but not containing applications. The query [oracle -"application server"] returns documents containing oracle but not containing the phrase "application server". The query [oracle -title:oracle] returns documents containing oracle but with the title not containing oracle. The query [oracle -"application server"~] returns documents containing oracle but not containing application and server within any window spanning 100 terms. The compulsory exclusion query cannot be the only query. For example, the query [-oracle] raises an error. Also, the compulsory exclusion query cannot be connected with the OR [|] operator. For example, [oracle | -database] raises an error. |
Filetype search [filetype:filetype] |
Use [filetype:filetype] after the search term to limit results to that particular file type. A search can have only one file type. No operator is allowed in file type shortcut search. For example, [documentation filetype:pdf] returns PDF format documents for the term documentation. The "filetype" shortcut must be lowercase, but the file type name is case-insensitive; that is, [documentation filetype:PDF] returns the same documents. The following file types are supported, with their corresponding mimetype: filetype string: mimetype ps: application/postscript ppt: application/vnd.ms-powerpoint, application/x-mspowerpoint doc: application/msword xls: application/vnd.ms-excel, application/x-msexcel, application/ms-excel txt: text/plain html: text/html htm: text/html pdf: application/pdf xml: text/xml rtf: application/rtf |
Site search [site:host] |
Use [site:host] after the search term to limit results to that particular site. For example, [site:www.oracle.com filetype:pdf] returns documents from www.oracle.com in PDF format. The "site" shortcut must be lowercase, but the host name is case-insensitive; that is, [site:www.Oracle.com filetype:pdf] returns the same documents. Oracle SES only supports exact host matching. The query [site:*.oracle.com] does not work. |
Group search [sg:source group] |
Use [sg:source group] to limit results to that particular source group. All other search restrictions are valid in a group search. For example, [sg:intranet] returns documents in the intranet source group. The "sg" shortcut must be lowercase, but the source group name is case-insensitive; that is, [sg:IntraNet] returns the same documents. In federated searches, the source group names are the source groups in the local (broker) node. If the local source groups contain federated sources, then Oracle SES translates the local source group name to the federated source group name by changing the query, which is then sent to federated source for results. |
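Several of the rules in Table 3-1 can be combined when a search string is assembled programmatically. The following is a minimal sketch, not an Oracle SES API: the function `build_query` and its parameters are hypothetical, and the check on exclusions is deliberately conservative because it rejects any query whose only search items are exclusions.

```python
def build_query(include=(), exclude=(), site=None, filetype=None, source_group=None):
    """Compose a search string from items that are ANDed by white space."""
    if exclude and not include:
        # A compulsory exclusion cannot be the only query, for example [-oracle].
        raise ValueError("an exclusion needs at least one other search item")

    def quote(item):
        # Multi-word items become phrases; single terms are left as typed.
        return f'"{item}"' if " " in item else item

    parts = [quote(item) for item in include]
    parts += ["-" + quote(item) for item in exclude]  # no space after '-'
    if site:
        parts.append(f"site:{site}")  # exact host only, no wildcards
    if filetype:
        parts.append(f"filetype:{filetype}")  # one file type per search
    if source_group:
        parts.append(f"sg:{source_group}")
    return " ".join(parts)


print(build_query(["oracle"], exclude=["application server"]))
print(build_query(["documentation"], filetype="pdf"))
print(build_query(site="www.oracle.com", filetype="pdf"))
```

The shortcut names site, filetype, and sg are written in lowercase because the table states that they must be.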
Source groups are groups of sources that can be searched together. A source group consists of one or more sources, and a source can be assigned to multiple source groups. Source groups are defined on the Search - Source Groups page. Infosource nodes, or folders, are only generated for Web, e-mail, and OracleAS Portal source types.
On the basic Search page, users can browse source groups that the administrator created. Click a source group name to see the subgroups under it, or drill down further into the hierarchy by clicking a subgroup name. To view all the documents under a particular group, click either View All or the number next to the source group name. You can also perform a restricted search in the source group from this page.
The source hierarchy lets end users limit search results based on document source type. The hierarchy is generated automatically during crawl time.
Date attribute values on the result list are shown in Greenwich Mean Time (GMT). For example, when you crawl a document on your local computer with a last modified date value of "Sep 13 2007 20:30:00 PDT", the Oracle SES crawler converts this date value to the corresponding GMT date value, which is "Sep 14 2007 3:30:00 GMT". These two values represent the same moment in time, but Oracle SES only displays the date (not the time or time zone). Therefore, the last modified date displayed in the result list is Sep 14 2007 and not Sep 13 2007.
To search on the lastModifiedDate attribute, use the GMT date value. In the previous example, you would search on lastModifiedDate=09/14/2007. The query lastModifiedDate=09/13/2007 does not return the document. Note that you must enter the date using the format mm/dd/yyyy.
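The conversion from a local timestamp to the GMT date used in a search can be sketched as follows. The example reuses the crawl timestamp "Sep 13 2007 20:30:00 PDT" from above. The helper name `gmt_search_date` is illustrative, and PDT is modeled as a fixed UTC-7 offset for this example only.

```python
from datetime import datetime, timedelta, timezone

PDT = timezone(timedelta(hours=-7), "PDT")  # fixed offset, for this example only


def gmt_search_date(local_time):
    """Convert a timezone-aware datetime to the mm/dd/yyyy GMT date used in searches."""
    return local_time.astimezone(timezone.utc).strftime("%m/%d/%Y")


modified = datetime(2007, 9, 13, 20, 30, tzinfo=PDT)
print("lastModifiedDate=" + gmt_search_date(modified))
```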
To display the lastModifiedDate in your local time zone:
Open the ORACLE_HOME/search/webapp/config/search.properties file in a text editor.
Set ses.qapp.convert_timezone=true.
Restart the Oracle SES middle tier:
searchctl restart
The browser picks up your local time zone, and lastModifiedDate is converted to your local time zone before the search results are displayed.
The URL submission feature lets users submit URLs to be crawled and indexed. These URLs are added to the seed URL list for a particular source and are included in the crawler search space.
This feature is disabled on the Search page if it is disabled on the Global Settings - Query Configuration page or if you have not created a Web source.
To allow users to submit URLs:
Log on to the Oracle SES Administration GUI.
On the Global Settings page, select Query Configuration.
Select Allow URL Submission.
Select the Web source to which the submitted URLs are added.
Click Apply.
To examine the submitted URLs before they are indexed by the crawler:
On the Home page, select the Schedules secondary tab.
Click the Edit icon of a schedule.
Under Update Crawling Mode on the Edit Schedule page, select Examine URLs Before Indexing.
Click Update Crawling Mode.
You can configure how search options are displayed on the Search page. You can choose to display the Attribute Filters link or the Advanced Search link next to the basic search option.
This is done by specifying the following setting at the beginning of each of the four Freemarker template files: query.ftl, results.ftl, noresults.ftl, and error.ftl.
<#assign advSearchOpt = "configuration_setting">
For example,
<#assign advSearchOpt = "filters">
The configuration options are:
filters: Displays the Attribute Filters link next to the Search button. Using this link, you can access the attributes filter table. This is the default display setting.
link: Displays the Advanced Search link next to the Search button. Using this link, you can access the advanced search page.
both: Displays the Attribute Filters link next to the Search button. The More Advanced link at the bottom of the attribute filters table takes you to the Advanced Search page.
See Also:
Chapter 10, "Customizing the Search User Interface" for more information about configuring Freemarker template files
Attribute filters enable you to refine your search query by specifying attribute values of documents. Oracle SES includes default attributes for every instance, such as title, description, and keywords. For example, the Author attribute is mapped to the From header in e-mail documents and the Author metatag in HTML documents. Search administrators can also create custom attributes.
The attributes filter table provides several user-friendly features that make it easier for you to create a search query based on attributes. The features include auto-complete for attribute names that match the user's input, and a calendar picker that is displayed when the user selects a date attribute. In the Attribute Filters table, you can specify any number of attribute filters by specifying a boolean condition followed by "any" or "all". Attribute filter rows may be added or removed by clicking on the icons to the right of each row.
To assist the user in selecting search attributes, auto-complete is enabled for the attribute name. As you begin entering the attribute name, the attributes with the matching characters are displayed in a list and you can select from this list. Additionally, the list displays the data type next to the attribute name. If a source group or multiple source groups are selected, then this list of search attributes is restricted to those attributes that are included in any of the selected source groups.
The choice of operator is specific to the data type of the selected attribute. For example, STRING attributes allow the Equals and Contains operators. NUMBER and DATE attributes allow the Equals, Less than, Less than or equals, Greater than, and Greater than or equals operators. The available operators also depend on the attribute itself. For example, with the Tag attribute, you can use only the Equals operator.
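The operator choices described here amount to a lookup by data type with per-attribute exceptions. The sketch below shows one way to model that. The table and function names are hypothetical, and treating Tag as a STRING attribute is an assumption.

```python
RANGE_OPERATORS = [
    "Equals",
    "Less than",
    "Less than or equals",
    "Greater than",
    "Greater than or equals",
]

OPERATORS_BY_TYPE = {
    "STRING": ["Equals", "Contains"],
    "NUMBER": RANGE_OPERATORS,
    "DATE": RANGE_OPERATORS,
}

# Per-attribute exceptions; only the Tag restriction is stated in the text.
ATTRIBUTE_OVERRIDES = {"tag": ["Equals"]}


def allowed_operators(attribute_name, data_type):
    """Return the operators offered for an attribute of the given data type."""
    override = ATTRIBUTE_OVERRIDES.get(attribute_name.lower())
    if override is not None:
        return list(override)
    return list(OPERATORS_BY_TYPE[data_type.upper()])


print(allowed_operators("Author", "STRING"))
print(allowed_operators("Tag", "STRING"))
```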
When a date attribute is selected and a user clicks or tabs into the attribute value input box for the date attribute, a calendar control automatically appears to allow the user to easily select a date. In addition to providing a visual interface, this also aids the user in entering the correct date format for Oracle SES, which is currently MM/DD/YYYY.
When the attribute filters table is collapsed or hidden, the Attribute Filters link indicates the number of valid attribute filters.
Although there may be many more rows in the attribute filter table, this number only counts rows that have both a valid attribute name and a non-empty value. Incomplete rows are ignored if the query is submitted, so this number is an accurate count of the filters in the search query.
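The counting rule can be sketched as a filter over the table rows. The attribute catalog below is hypothetical, and matching attribute names without regard to case is an assumption.

```python
# Hypothetical attribute catalog: attribute name -> data type.
KNOWN_ATTRIBUTES = {"Author": "STRING", "Title": "STRING", "LastModifiedDate": "DATE"}


def count_valid_filters(rows, known=KNOWN_ATTRIBUTES):
    """Count rows that have a known attribute name and a non-empty value."""
    names = {name.lower() for name in known}
    return sum(
        1
        for name, value in rows
        if name.strip().lower() in names and value.strip()
    )


rows = [("Author", "scott"), ("Title", ""), ("", "tiger"), ("NoSuchAttr", "x")]
print(count_valid_filters(rows))  # only the Author row is complete
```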
When a search is performed with attributes specified in the attribute filters table, Oracle SES constructs the equivalent advanced search query with attribute shortcuts and displays it in places such as the title, the hit statistics, and the bottom query box. This shows the user what the equivalent query would be if entered solely in the query box.
See "Search by Attribute" for more information.
Oracle SES 11g provides a feature-rich set of advanced search options for the end user. You can target your search results more accurately using the Advanced Search page, which is shown in Figure 3-3. Advanced Search lets you refine searches in the following ways:
Oracle SES can search documents in different languages. Specifying a language restricts searches to documents that are written in that language. Select a language in the Language box.
The following are possible internationalization issues with supported operators:
Proximity Search ["secure search"~10]: The term distance definition could be different for non-whitespace delimited languages, such as Japanese. The behavior of proximity search for those languages could be different.
Implicit stemming: This is available in English, German, French, Spanish, Italian, and Dutch.
Wildcard search [feat*e] or [featu?e]: The term definition is different for non-whitespace-delimited languages, such as Japanese. The behavior of wildcard expansion for those languages is different.
Fuzzy expansion [hallo~]: This is available in English, German, French, Chinese, Japanese, Spanish, Italian, and Dutch.
If one or more source groups are defined, then corresponding check boxes appear when you select specific categories. You can limit your search to source groups by selecting those check boxes. If no source group is selected, then all documents are searched.
A source group represents a collection of documents. They are created by the Oracle SES administrator.
Similar to the attributes filter table, the Attribute Selection panel enables you to specify attribute values for a search. Date format must be entered in the MM/DD/YYYY format. Click Add more attributes to enter more than four search attribute values. Table 3-2 provides the list of operators supported for each data type.
Table 3-2 Oracle SES Attribute Search Operators and Types of Attributes
Attribute | Contains (^) | Equals (=) | Synonym (~) | Lessthan, Narrower terms (<) | Lessthanequals (<=) | Greaterthan, Broader terms (>) | Greaterthanequals (>=) |
---|---|---|---|---|---|---|---|
String | Yes | Yes | Yes | Yes | No | Yes | No |
Number | No | Yes | No | Yes | Yes | Yes | Yes |
Date | No | Yes | No | Yes | Yes | Yes | Yes |
The Equals operator returns a hit only if the attribute value that you enter exactly matches the attribute value in the document. The Contains operator returns a hit if the attribute value you enter matches any of the tokens in the attribute values in the document. The token must be an exact match; partial matches are not returned.
For example, suppose that the following four documents are indexed:
Document | Author | Number of Tokens |
---|---|---|
doc1 | "scott" | 1 (scott) |
doc2 | "scott tiger" | 2 (scott, tiger) |
doc3 | "scottm tiger" | 2 (scottm, tiger) |
doc4 | "scott.tiger@oracle.com" | 4 (scott, tiger, oracle, com) |
An attribute restricted search for "author equals scott" returns only doc1. But an attribute restricted search for "author contains scott" returns doc1, doc2, and doc4. There is no hit for doc3 because scott is only a partial match for scottm.
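The difference between the two operators can be reproduced with a short sketch over the four sample documents. It assumes that tokens are runs of letters and digits and that both comparisons ignore case. The actual Oracle SES tokenizer is not described here.

```python
import re

DOCS = {
    "doc1": "scott",
    "doc2": "scott tiger",
    "doc3": "scottm tiger",
    "doc4": "scott.tiger@oracle.com",
}


def tokens(value):
    """Split an attribute value into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", value.lower())


def equals(value, query):
    """Equals: the whole attribute value must match."""
    return value.lower() == query.lower()


def contains(value, query):
    """Contains: the query must match one whole token, not part of a token."""
    return query.lower() in tokens(value)


print([d for d, author in DOCS.items() if equals(author, "scott")])
print([d for d, author in DOCS.items() if contains(author, "scott")])
```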
When a query is prefixed with 'otext::', Oracle SES identifies it as an Oracle Text syntax query. Both the Oracle SES Search Application and the Web Services API support this query syntax.
Oracle Text query syntax and Oracle SES query syntax cannot be used in the same query.
To use the Oracle Text query syntax, note the following:
A highlight query highlights terms in returned documents. Use a highlight query only after an Oracle Text-compatible query, prefixed by the string highlight:. When no highlight query is specified, Oracle SES chooses highlight terms from the original queries.
To support thesaurus-based (that is, synonym, broader or narrower term) searching, load a thesaurus.
For 'about' to work, change the index to index themes.
Relevancy boosting is disabled when the otext syntax is used.
Table 3-3 compares the syntax of Oracle Text and Oracle SES.
Table 3-3 Syntax Comparison Between Oracle Text and Oracle SES
Query | Oracle Text | Oracle SES |
---|---|---|
Term | otext::secure | secure |
Phrase | otext::secure search | "secure search" |
Phrase | otext::{secure search} | "secure search" |
Proximity search | otext::secure ; search | "secure search"~ |
Proximity search | otext::near((secure, search),10) | "secure search"~10 |
Attribute search | otext::oracle within "title" | title:oracle |
Attribute search | otext::(oracle & text) within "title" | title:oracle & title:text |
Attribute search | N/A for numeric and date attribute | lastmodifieddate:10/20/2006 |
AND operator | otext::secure & search | secure search |
OR operator | otext::secure \| search | secure \| search |
ACCUM and Weight | otext::secure*10, search *5 | N/A |
Compulsory exclusion | otext::oracle ~apps | oracle -apps |
Compulsory inclusion | N/A | oracle +apps |
Grouping operator | otext::(rac \| {real application clusters}) & whitepaper | (rac \| "real application clusters") whitepaper |
Stemming operator | otext::$feature | N/A (implicit stemming, turned off by using double quotes) |
Multiple character wildcard | otext::feat%e | feat*e |
Single character wildcard | otext::featu_e | featu?e |
Fuzzy expansion | otext::?hallo | hallo~ |
Soundex | otext::!smythe | N/A |
Theme search | otext::about(dogs) | N/A |
Synonym search | otext::syn(dog) | ~dog |
Narrower term search | otext::NT(dog) | <dog |
Broader term search | otext::BT(dog) | >dog |
Query template | otext::<query> <textquery>123</textquery></query> | N/A |
Query relaxation | otext::<query> <textquery> <progression> <seq>secure search</seq> <seq>secure;search </seq> <seq>secure & search</seq> </progression> </textquery></query> | N/A (implicitly done) |
Highlight | otext::oracle highlight:search | N/A (implicitly done) |
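As a small illustration of the wildcard rows in Table 3-3, the sketch below rewrites a single Oracle SES wildcard term into the Oracle Text form. The function name is hypothetical. The sketch covers only the two wildcard characters and does not handle escaped wildcards such as [Pro\*c] or terms that already contain % or _.

```python
def ses_wildcard_to_otext(term):
    """Rewrite an Oracle SES wildcard term in the Oracle Text form shown in Table 3-3."""
    if " " in term or '"' in term:
        raise ValueError("this sketch handles a single unquoted term only")
    # '*' (multiple characters) maps to '%'; '?' (single character) maps to '_'.
    return "otext::" + term.replace("*", "%").replace("?", "_")


print(ses_wildcard_to_otext("feat*e"))
print(ses_wildcard_to_otext("featu?e"))
```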
See Also:
Oracle Text Reference, available on Oracle Technology Network
A thesaurus is a list of terms or phrases with relationships specified among them, such as a synonym, a broader term, and a narrower term. When a user issues a search query, Oracle SES can expand the search results to include matches for the related terms. Users can include the thesaurus operators in their queries:
~: Synonyms
>: Broader terms
<: Narrower terms
If no thesaurus is loaded or if the specified term or phrase cannot be found in the loaded thesaurus, then query expansion is not possible. Oracle SES only returns documents containing the original term.
You can apply thesaurus-based expansion operators to attributes in an attribute search. Because an attribute value is either a term or a phrase, the expansion has the same effect on both, except that the expansion in attribute shortcut search is restricted to attribute value search.
You can expand the search results by providing a thesaurus. A thesaurus is a list of terms or phrases with relationships specified among them, such as a synonym, a broader term, and a narrower term. When a user issues a search query, Oracle SES can expand the search results to include matches for the related terms.
A thesaurus contains domain-specific knowledge. You can build a thesaurus, buy an industry-specific thesaurus, or use utilities to extract a thesaurus from a specific corpus of documents. The thesaurus must be compliant with both the ISO-2788 and ANSI Z39.19 (1993) standards.
Only one thesaurus can be loaded at a time, and it must be named DEFAULT. If no thesaurus is loaded or if the specified term or phrase cannot be found in the loaded thesaurus, then thesaurus-based query expansion is not possible. Oracle SES only returns documents containing the original term or phrase. The default expansion level is 1.
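The expansion behavior can be pictured with a toy in-memory thesaurus. This is a sketch for illustration only and does not reflect how Oracle SES stores or loads a thesaurus. The class and method names are hypothetical, and expansion is limited to one level, matching the default expansion level of 1.

```python
class Thesaurus:
    """Toy thesaurus with symmetric synonym and broader/narrower relationships."""

    def __init__(self):
        self.synonyms = {}
        self.broader = {}
        self.narrower = {}

    def add_synonym(self, a, b):
        # A synonym relationship is symmetric.
        self.synonyms.setdefault(a.lower(), set()).add(b.lower())
        self.synonyms.setdefault(b.lower(), set()).add(a.lower())

    def add_broader(self, term, broader_term):
        # Declaring a broader term implicitly declares the narrower one.
        self.broader.setdefault(term.lower(), set()).add(broader_term.lower())
        self.narrower.setdefault(broader_term.lower(), set()).add(term.lower())

    def expand(self, operator, term):
        """Return the original term plus its one-level expansion for ~, > or <."""
        table = {"~": self.synonyms, ">": self.broader, "<": self.narrower}[operator]
        return [term.lower()] + sorted(table.get(term.lower(), ()))


thesaurus = Thesaurus()
thesaurus.add_synonym("RAC", "real application clusters")
thesaurus.add_broader("chair", "furniture")
thesaurus.add_broader("San Francisco", "Northern California")

print(thesaurus.expand("~", "RAC"))
print(thesaurus.expand(">", "chair"))
print(thesaurus.expand("<", "Northern California"))
print(thesaurus.expand("~", "database"))  # not in the thesaurus: original term only
```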
Oracle SES maintains an alternate word list containing word pairs. The two words in the word pair can be used alternatively. The semantic similarity between the two words is higher than that between two synonyms.
Oracle SES uses alternate words in the following places:
To provide a suggestion for a query. For example, the query [RAC] can result in the suggestion: "did you mean 'real application clusters'?"
To provide an implicit expansion based on alternate words. For example, the query [RAC] returns documents containing RAC or real application clusters.
Oracle SES provides an option for each alternate word pair to do implicit expansion for this pair.
Oracle SES can maintain an alternate word list containing word pairs. The two words in the word pair can be used alternatively. The semantic similarity between the two words is higher than that between two synonyms. Oracle SES uses the list to provide suggestions or to expand the search results.
See Also:
Oracle Secure Enterprise Search Administration API Guide for more information about configuring alternate words and about the Administration APIs related to altWord.
On the Basic search page, users can browse source groups that the administrator created by clicking the Browse link next to the Search box. This action displays a Browse popup window containing a tree view of the browse information hierarchy. Users can click an expand icon (>) to see the infosource nodes under it, or drill down further by clicking additional expand icons.
Clicking the View All link or the document count number of a browse node refreshes the entire page to show the set of all documents within that infosource node. Clicking the node label causes the list of source groups above the Search box to be replaced with the message "Search within: infosource node name".
The infosource node in the browse tree, along with its subtree, is highlighted to indicate which node is selected to search within.
The search result set is not immediately replaced. Instead, users must click Search to perform a restricted search within the selected infosource node. Only one infosource node may be selected at a time for "search-within."
To restrict a search to a set of top-level source groups (rather than an infosource node), select multiple source groups within the Browse Tree popup. Each selected source group has a check mark next to it, and the list of selected source group names is displayed above the Search box. Again, users must click Search to perform a restricted search within the selected source groups.