At dbwriter startup, and in continuous mode once an hour, derived values are calculated. You can configure which values to calculate in an XML file, which is by default in $SGE_ROOT/dbwriter/database/<database_type>/dbwriter.xml. <database_type> defines the type of database being used; currently, Oracle and Postgres are supported. The path to the configuration file is passed to dbwriter using the -calculation parameter.
The configuration file uses an XML format, and contains entries of rules for both derived values and deleted values (described in the next section). The rules for derived values have the following format.
The top-level start tag is <derive> It must be specified with three attributes:
object — which can be host, queue, user, group, department or project. Based on this attribute, the derived value is ultimately stored in one of: sge_host_values, sge_queue_values, sge_user_values, sge_group_values, sge_department_values, sge_project_values.
interval — which can be hour, day, month, or year.
variable — which is the name of the new derived value.
A second-level start tag, either <sql> or <auto>, describing the way the value should be derived. These tags are shown in detail as follows.
<sql> – This tag contains an SQL statement used for calculating the derived values. The exact syntax of the entries depends upon the type of database being used.
<auto> - for certain simple derived values, this tag can be used instead of a full SQL query. This tag has two attributes:
function — which gives the aggregate function to apply to the variable. This can be any function valid for the type of database being used. Some typical functions are AVG, SUM, VALUE, COUNT, MIN or MAX.
variable — which can be any variable tracked in the following tables: sge_host_values, sge_queue_values, sge_user_values, sge_group_values, sge_department_values, sge_project_values the variable specified must be from the table indicated by the object attribute of the enclosing <derive> tag, for example, if the object is host, the variable must be found in sge_host_values.
Two end tags matching the two start tags
Here is an example of a derivation rule using the <sql> tag. The sge_queue table has a composed primary key comprising q_qname and q_hostname. For a rule specified for the queue object_type, a query will be made for each entry in the sge_queue table, the placeholders __key_0__ will be replaced by the queue name,ilwill be replaced by the hostname.
<!-- average queue utilization per hour --> <derive object="queue" interval="hour" variable="h_utilized"> <sql> SELECT DATE_TRUNC( 'hour', qv_time_start) AS time_start, DATE_TRUNC( 'hour', qv_time_start) + INTERVAL '1 hour' AS time_end, AVG(qv_dvalue * 100 / qv_dconfig) AS value FROM sge_queue_values WHERE qv_variable = 'slots' AND qv_parent = (SELECT q_id FROM sge_queue WHERE q_qname = __key_0__ AND q_hostname = __key_1__) AND qv_time_start <= '__time_end__' AND qv_time_end > '__time_start__' GROUP BY time_start </sql> </derive>
Here is an example of a derivation rule using the <auto> tag.
<!-- average load per hour --> <derive object="host" interval="hour" variable="h_load"> <auto function="AVG" variable="np_load_avg" /> </derive>
At dbwriter startup, and in continuous mode once an hour, outdated records will be deleted. Which values to calculate can be configured in an XML file, by default in $SGE_ROOT/dbwriter/database/<database_type>/dbwriter.xml. <database_type> is the type of database being used; currently, Oracle and Postgres are supported. The path to the configuration file is passed to dbwriter using the -deletion parameter.
The configuration file uses an XML format, and contains entries of rules for both derived values and deleted values. Deletion rules are of the following format.
A top-level start tag <delete> with three attributes:
scope — which specifies the type of data to be deleted. Valid entries are job, job_log, share_log, ip, queue_values, ip, department_values, user_values, group_values. Based on this attribute, the values are deleted from the table with the same name with sge_ prepended.
time_range — which gives the unit of time_amount.
time_amount — which is the number of units (time_range) a record is kept.
An optional second-level start tag <sub_scope>, which specifies an additional condition for deletion. A subscope can be configured for all *_values scopes and the share_log scope.
One or Two end tags matching the two start tags
If a subscope is configured for a *_values rule, it contains a space separated list of variables to delete. If a subscope is specified for the share_log, it contains a space separated list of sharetree nodes to delete.
The following rule indicates that the four variables given in the subscope should be deleted from the table sge_host_values after 7 days.
<delete scope="host_values" time_range="day" time_amount="7"> <sub_scope>np_load_avg</sub_scope> <sub_scope>cpu</sub_scope> <sub_scope>mem_free</sub_scope> <sub_scope>virtual_free</sub_scope> </delete>
The following rule says to delete all variables from the table sge_host_values after 2 years.
<delete scope="host_values" time_range="year" time_amount="2"/>
The following rule says to delete all records for user fred after 1 month
<delete scope="share_log" time_range="month" time_amount="1"> <sub_scope>fred</sub_scope> </delete>