Sun N1 Grid Engine 6.1 Administration Guide

Calculating Derived Values With dbwriter

The rules for calculating derived values are specified in a derived tag, which is a sub tag of the DbWriterConfig tag. The following table lists the attributes of the derived tag:

Attribute

Description

object

The object for which data is aggregated. The object is one of the following:

  • host

  • queue

  • project

  • department

  • user

  • group

interval

The time range specifying how often to calculate the derived values. The time range is one of the following:

  • hour

  • day

  • month

  • year

variable

The name of the variable to hold the calculated data.

The following table lists the subelements of the derived tag:

sql

The SQL statement that calculates the derived values. The statement must produce the following columns:

  • time_start – Together with time_end, specifies the time period for the calculated value

  • time_end

  • value – The calculated derived value

The SQL statement can contain the following placeholders. dbwriter replaces the placeholders for each query, based on a rule:

  • __time_start__ – Start time for the query. dbwriter searches for the last previously calculated derived value from this rule, and uses this timestamp as the start time for the next query.

  • __time_end__ – End time for the query. This timestamp specifies the end of the last passed time range. For example, if the time range is day, and if derived values are calculated at 00:30, 00:00 is taken as time_end.

  • __key_0__, __key_1__, . . . , __key_n__ – Components of the primary key for the specified object type. For example, the sge_hosts table has the primary h_hostname. If a rule is processed for the host object type, one query is executed per entry in the sge_hosts table, the __key_0__ placeholder in the SQL statement is replaced by the hostname.

    The sge_queue table has a composed primary key that is made up of q_qname and q_hostname.

auto

dbwriter generates the SQL statement for the calculation of derived values.

The autogenerated SQL statement looks like the following template:


SELECT time_start, time_end, <function>(<value_field>) as value
FROM ( SELECT TRUNC( <timestart_field>, <interval> ) as time_start
              TRUNC( <timestart_field>, <interval> ) +
                    INTERVAL '1' <interval> as time_end,
              <value_field>
       FROM <object value table>
       WHERE <primary key field 0> = __key_0__
         AND <primary key field 1> = __key_1__
         AND . . .
         AND <parent key field> =
             (SELECT <parent key field> FROM <parent table>
              WHERE <parent filter> )
         AND <timestart_field> <= {ts __time_start__ }
         AND <timeend_field>   >  {ts __time_end__ }
GROUP BY time_start, time_end

The SQL template parameters are as follows:

Parameter

Description

<function>

Aggregate function for calculating the derived value. Comes from the function attribute of the auto tag in the XML file.

<value_field>

Depends on the object of the derived value.

<timestart_field>

Depends on the object of the derived value.

<timeend_field>

Depends on the object of the derived value.

<interval>

Comes from the interval attribute of the derived tag

<object value table>

Name of the database table where the values are stored. Depends on the object (host => host_values, user => user_values, . . .)

<primary key field n>

Primary key that is necessary to join the value table to the parent table. Depends on the object.

<parent key field>

Name of the field that holds the ID of the parent. Depends on the object.

<parent table>

Name of the parent database table. Depends on the object (host => host, user => user)

<parent filter>

Filter for the parent table. A derived value for each entry of the parent table is calculated, for example, u_user = 'user1').

Here is an example of an autogenerated SQL statement:


<derive object="host" interval="day" variable="d_load">
   <auto function="AVG" variable="h_load" />
</derive>

SELECT time_start, time_end, AGE(hv_dvalue)
FROM ( SELECT TRUNC( hv_time_start, 'day' ) as time_start,
              TRUNC( hv_time_start, 'day' ) +
                             INTERVAL '1' day as time_end,
              hv_dvalue
       FROM sge_host_values
       WHERE hv_variable = 'h_load' AND
             hv_parent =
               (SELECT h_id FROM sge_host
                WHERE h_hostname = 'foo.bar') AND
             hv_time_start <= {ts '2004-05-21 00:00:00.0'} AND
             hv_time_end    > {ts '2004-05-17 00:00:00.0'} )
GROUP BY time_start, time_end