This chapter explains the attributes provided for the search database. The Database attributes are divided as follows:
Before working with the Search database, you need to know how to partition it. Partitioning requires stopping the search server, so use the run-cs-cli rdmgr -G command to partition the database.
The initial Manage Databases page lists the available databases. To select a database, select the checkbox preceding it. Click the New, Reindex, Purge, Analyze, Manage, or Expire resource descriptions button to perform the corresponding action on the selected database.
You should reindex the database if you have edited the schema to add or remove an indexed field (such as Author), or if a disk error has corrupted the index. You need to restart the server after you change the schema.
Because the time required to reindex the database is proportional to the number of RDs in the database, a large database should be reindexed when the server is not in high demand.
When you purge the contents of the database, disk space used for indexes will be recovered, but disk space used by the main database will not be recovered; instead, it is reused as new data is added to the database.
Expiring a database deletes all RDs that are deemed out-of-date. It does not decrease the size of the database. By default, an RD is scheduled to expire in 90 days from the time of creation.
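The expiry rule can be illustrated with a short sketch. This is not the search server's implementation: it assumes a hypothetical in-memory list of RDs, each with a creation time and an optional explicit Expires value, and applies the 90-day default described above. The names `ResourceDescription`, `expiry_date`, and `expire` are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Default from the text: an RD expires 90 days after it is created.
DEFAULT_LIFETIME = timedelta(days=90)


@dataclass
class ResourceDescription:
    """Hypothetical stand-in for an RD; not the server's data model."""
    url: str
    created: datetime
    expires: Optional[datetime] = None  # assumed: an explicit Expires value wins


def expiry_date(rd: ResourceDescription) -> datetime:
    """Return the explicit Expires value, or creation time plus 90 days."""
    if rd.expires is not None:
        return rd.expires
    return rd.created + DEFAULT_LIFETIME


def expire(rds: List[ResourceDescription], now: datetime) -> List[ResourceDescription]:
    """Return only the RDs that are not yet out of date at `now`."""
    return [rd for rd in rds if expiry_date(rd) > now]
```

In this sketch, an RD created on 1 January 2024 with no explicit Expires value is dropped by any run of `expire` on or after 31 March 2024.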
The table below lists the Database Management attributes and their descriptions.
Table 5–1 Database Management Attributes
Attribute | Default Value | Description |
---|---|---|
Name | True or False | Name for the database used by Search. |
Federated | True or False | For a Federated database, this value is True. Otherwise, the value is False. |
Import agents are the processes that bring resource descriptions from other servers or databases and merge them into your search database.
The initial Manage Import Agents page lists the available import agents. To select an import agent, select the checkbox preceding it. Click the New, Enable, Disable, Delete, or Run All Enabled Import Agents button to perform the corresponding action on the selected import agent. To schedule the import agents, select Scheduling on the lower menu bar.
If you create a new import agent or edit an existing one, the following database import agent attributes are displayed.
The table below lists the Database Import Agent attributes and their descriptions.
Table 5–2 Database Import Agent Attributes
The initial Resource Descriptions page allows you to search for and edit the Resource Descriptions in the database. For example, you can correct a typographical error in an RD or manually assign RDs discovered by the robot to categories.
The table below lists the Resource Descriptions attributes and their descriptions.
Table 5–3 Resource Descriptions Attributes
Attribute | Default Value | Description |
---|---|---|
New | | Opens the New Resource Description page, where you can enter the URL to create a new search RD. |
Edit | | Opens the Edit URL page, where you can modify only the editable attributes of a search RD. |
Edit All | | Opens the Edit Resource Descriptions page, where you can modify a group of search RDs. |
Delete | | Deletes the selected search RD. |
Filter | All | The options available are Categorized (to list categorized RDs), Uncategorized (to list uncategorized RDs), and Custom Filter. |
Custom Filter | | Provides the options Query (selected by default), URL, and Category, and a text box for entering the search string. When you select the Category option, the Choose button appears. Click the Choose button to go to the Select a Category page, where you can select the category. |
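The effect of the Filter options can be sketched as follows. This is only an illustration under stated assumptions: RDs are modeled as plain dictionaries, an unclassified RD carries the value "No Classification" in its Classification field, and the custom filter is reduced to a case-insensitive substring match on the URL or Category. The Query option is left out because it depends on the search engine's own query syntax. The function names are hypothetical.

```python
from typing import Dict, List

# Assumed marker for unclassified RDs, taken from the Classification description.
NO_CLASSIFICATION = "No Classification"


def is_categorized(rd: Dict[str, str]) -> bool:
    """An RD counts as categorized when it has a real category name."""
    return rd.get("Classification", NO_CLASSIFICATION) != NO_CLASSIFICATION


def filter_rds(rds: List[Dict[str, str]], mode: str = "All",
               field: str = "URL", text: str = "") -> List[Dict[str, str]]:
    """Return the RDs selected by a Filter option.

    `mode` is one of "All", "Categorized", "Uncategorized", "Custom Filter".
    For "Custom Filter", `field` is "URL" or "Category" and `text` is the
    string typed in the text box.
    """
    if mode == "All":
        return list(rds)
    if mode == "Categorized":
        return [rd for rd in rds if is_categorized(rd)]
    if mode == "Uncategorized":
        return [rd for rd in rds if not is_categorized(rd)]
    if mode == "Custom Filter":
        key = {"URL": "URL", "Category": "Classification"}[field]
        needle = text.lower()
        return [rd for rd in rds if needle in rd.get(key, "").lower()]
    raise ValueError(f"unknown filter mode: {mode}")
```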
A successful search displays the number of RDs found and a list box containing them. On the Edit page for a resource description, you can modify only the editable attributes. By default, some of the RD attributes listed in the table below cannot be edited. To make any of these attributes editable, except the Classification attribute, change the settings on the Database/Schema/Edit schema attribute page.
The table below lists the Database RD Editable attributes and their descriptions. The default value for these attributes depends on the selected RD.
Table 5–4 Database RD Editable Attributes
Attribute | Description |
---|---|
Author | Author(s) of the document. |
Author e-mail | Email address to contact the Author(s) of the document. |
Classification | Category name if classified; No Classification if not classified. |
ReadACL | Related to document level security. |
Content-Charset | Content-Charset information from HTTP Server. |
Content-Encoding | Content-Encoding information from HTTP Server. |
Content-Language | Content-Language information from HTTP Server. |
Content-Length | Content-Length information from HTTP Server. |
Content-Type | Content-Type information from HTTP Server. |
Description | Description from RD. |
Expires | Date on which resource description is no longer valid. |
Full-Text | Entire contents of the document. |
Keywords | Keywords taken from meta tags. |
Last-Modified | Date when the document was last modified. |
Partial-text | Partial selection of text from the document. |
Phone | Phone number for Author contact. |
Title | Title of RD. |
URL | |
virtual-db | Used to implement virtual database. |
When you click the Schema tab under Databases, the Manage Search Schema page appears. This page lists the available Search Schema attributes. The schema determines what information is in a resource description and what form that information takes. You can add new attributes or fields to an RD and set which ones can be edited and which ones can be indexed. When importing new RDs, you can convert schemas embedded in new RDs into your own schema.
The table below lists the Search Schema attributes and their descriptions.
Table 5–5 Search Schema Attributes
Attribute | Description |
---|---|
Author | Author(s) of the document. |
Author-EMail | Email address to contact the Author(s) of the document. |
Content-Charset | Content-Charset information from HTTP Server. |
Content-Encoding | Content-Encoding information from HTTP Server. |
Content-Language | Content-Language information from HTTP Server. |
Content-Length | Content-Length information from HTTP Server. |
Content-Type | Content-Type information from HTTP Server. |
Description | Brief one-line description for document. |
Expires | Date on which resource description is no longer valid. |
Full-Text | Entire contents of the document. |
Keywords | Keywords that best describe the document. |
Last-modified | Date when the document was last modified. |
Partial-Text | Partial selection of text from the document. |
Phone | Phone number for Author contact. |
ReadACL | Used by Search servers to enforce security. |
Title | Title of the document. |
URL | Uniform Resource Locator for the document. |
virtual-db | Used to implement virtual database. |
When you select the checkbox preceding a search schema attribute and click the attribute, the Edit search schema name page appears. This page displays the settings you can change for that schema attribute. The table below lists these settings and their descriptions.
Table 5–6 Edit Search Schema Attribute Attributes
Attribute | Default Value | Description |
---|---|---|
Name | Author | |
Description | Author(s) of the document | |
Aliases | Blank | When you import new RDs, you can convert schemas embedded in new RDs into your own schema. Use this conversion when the field names in the import database schema differ from the field names in the schema used for RDs in your database. For example, if the imported RDs use Writer as the field for the author and your RDs use Author, the conversion is Writer to Author, so you enter Writer in this text box. |
Editable | false | If true (checked), the selected attribute (field) appears as an editable attribute in the Edit page for a resource description. Description, Keywords, Title and ReadACL are editable. |
Indexable | true | If true (checked), the selected attribute (field) can be used as a basis for indexing. Author, Title and URL appear in the menu in the Advanced Search screen for the end user. This allows end users to search for values in those particular fields. Author, Expires, Keywords, Last Modified, Title, URL and ReadACL can be used as the basis for indexing. |
Score Multiplier | Blank | A weighting field for scoring a particular element. Any positive value is valid. |
Data Type | String | Defines the data type. Choose the data type from the list box. |
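The alias conversion described for the Aliases attribute amounts to renaming fields as RDs are imported. The sketch below illustrates the idea with RDs modeled as plain dictionaries; the function `convert_schema` and the `ALIASES` mapping are hypothetical names, and the exact, case-sensitive matching is an assumption rather than documented server behavior.

```python
from typing import Dict

# Maps a field name used by the import source to the local schema name.
# "Writer" -> "Author" is the example given in the Aliases description.
ALIASES: Dict[str, str] = {"Writer": "Author"}


def convert_schema(imported_rd: Dict[str, str],
                   aliases: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the RD with aliased field names replaced.

    Fields without an alias are kept under their original name.
    """
    return {aliases.get(field, field): value
            for field, value in imported_rd.items()}


imported = {"Writer": "J. Smith", "Title": "Install Guide"}
print(convert_schema(imported, ALIASES))
# prints {'Author': 'J. Smith', 'Title': 'Install Guide'}
```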
The Analysis page shows a sorted list of all sites and the number of resources from each site currently in the search database. Select Update Analysis to update the analysis on file.
The table below lists the Database Analysis attributes and their descriptions.
Table 5–7 Database Analysis Attributes
Attribute | Default Value | Description |
---|---|---|
Number of RDs | Current number of RDs retrieved from the URL. | Lists current number of RDs from that URL. |
URL | URL that the robot has successfully searched. | A URL that has been added. |
Protocol | Protocol it uses to retrieve the RDs from that URL. | Lists the protocol used while collecting the RDs from a web site. |
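The kind of summary the Analysis page presents can be approximated with a few lines of code. The sketch below assumes a plain list of RD URLs as input and sorts the result by site name; the sort order actually used by the Analysis page is not stated here, so that choice, like the function name `analyze`, is an assumption.

```python
from collections import Counter
from typing import Iterable, List, Tuple
from urllib.parse import urlsplit


def analyze(urls: Iterable[str]) -> List[Tuple[str, str, int]]:
    """Count RDs per site and protocol.

    Returns (site, protocol, number_of_rds) tuples sorted by site name.
    """
    counts: Counter = Counter()
    for url in urls:
        parts = urlsplit(url)
        # parts.netloc is the site (host and optional port),
        # parts.scheme is the protocol, for example "http".
        counts[(parts.netloc, parts.scheme)] += 1
    return sorted((site, proto, n) for (site, proto), n in counts.items())


rd_urls = [
    "http://a.example.com/index.html",
    "http://a.example.com/docs/intro.html",
    "https://b.example.com/",
]
for site, proto, n in analyze(rd_urls):
    print(f"{n:5d}  {proto:6s} {site}")
```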