Chapter 4. Using Agent Integrator for Polling

To track faults in critical system components or applications, management systems use polling to determine whether attributes of the managed resource have crossed some significant threshold. Polling consists of checking the value of an attribute of the managed resource at some interval. The BEA Agent Integrator can be configured to act as a proxy for the manager, doing the polling locally on the managed node. By off-loading polling to distributed Integrator agents, the load on the management station is reduced and less network bandwidth is consumed. Communication between the manager and Agent Integrator occurs only when the manager sends a SET request to activate or de-activate the polling, or when the Agent Integrator sends an SNMP trap because it has detected the specified event in the managed resource.

Procedure for Setting Up Local Polling

The steps in using Agent Integrator for local polling can be summarized as follows:

Decide which resources you want to monitor.

The attributes of the resource that you want to monitor must be defined as MIB objects. These MIB objects must be supported by an agent or subagent that has been installed on the managed node.
Make the managed resource accessible to the Agent Integrator.

The Agent Integrator must know how to access the managed object. This means the object identifier for that object must lie within branches of the OID tree that are known to the Agent Integrator. If the managed object you want to monitor is supported by a SMUX subagent that has been installed on the managed node, the subagent automatically registers its section of the OID tree with the Agent Integrator when the subagent is started. This can be modified using OID_CLASS entries in the BEA Manager configuration file, as described in Chapter 7, "Configuration Files." For peer SNMP agents (or DPI or SMUX master agents), you must define the segments of the OID tree supported by those agents in NON_SMUX_PEER entries in the BEA Manager configuration file. This is described in the section "Integrator Access to Managed Objects" in Chapter 3, "Using Multiple SNMP Agents." The Agent Integrator directly supports the following MIB groups: MIB II system and snmp groups, the SMUX MIB, and the BEA Manager beaIntAgtTable in the BEA Manager agent MIB. Additional MIB groups are supported by the unix_snmpd and nt_snmpd SMUX subagents, shipped with Agent Integrator. These are listed in Chapter 6, "Starting the Subagents."
Define polling instructions for the Agent Integrator

We can divide this task into two main subtasks:
- Define the desired threshold.
- Specify the action to take if the threshold is crossed.
Each polling instruction for the SNMP Integrator is called a rule. Rules are defined under the RULE_ACTION entry in the BEA Manager configuration file, beamgr.conf. You can use your favorite text editor to modify this file. (For information on how to create rules, see the section "Creating New Polling Rules.") Rules are explained below under "Introduction to Agent Integrator Rules." Chapter 7, "Configuration Files," provides the complete syntax for the RULE_ACTION entries.
Configure your SNMP management system for Agent Integrator traps

When a polling threshold is crossed, Agent Integrator sends an enterprise-specific SNMP trap notification to the destinations specified by the TRAP_HOST entries in the BEA Manager configuration file. Some configuration is required on your SNMP-compliant management system to make use of these traps. The exact set of steps you need to perform varies depending upon which management system you are using. Typically some configuration or mapping is required to get the management system to perform a desired action (such as turning an icon red) when a trap is received. Consult your management system documentation for specific instructions.
Start Agent Integrator polling.

The Agent Integrator begins executing all valid polling rules when it is started. Refer to the section "Starting and Stopping Polling" for more details.
De-activate or re-activate Agent Integrator polling, when desired.

Polling rules are available as MIB objects; thus an operator can de-activate or re-activate polling from the management station by means of an SNMP SET request. This is described in the section "Starting and Stopping Polling."

Introduction to Agent Integrator Rules

An Agent Integrator rule consists of the following parts:

A unique name for the rule (no more than eight characters in length)
A condition (or threshold) that the Integrator is to check for. (This is described further in the section, "Conditions.")
An action to take if the specified threshold is crossed. (This is described in the section "States and Transitions.")
A polling frequency (specified in seconds), that is, the time delay between each access of the specified object value
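
The four parts listed above can be pictured as a small record. The following Python sketch is only an illustration: the class name `PollingRule`, its field names, and the check for a positive interval are assumptions made for this example and are not part of Agent Integrator. The eight-character name limit comes from the list above, and the sample values are taken from the checkcpu rule shown later in this chapter.

```python
from dataclasses import dataclass

MAX_RULE_NAME_LENGTH = 8  # rule names are limited to eight characters


@dataclass
class PollingRule:
    """Hypothetical container for the four parts of a rule (not a BEA API)."""

    name: str              # unique rule name
    condition: str         # condition text, kept verbatim
    actions: dict          # action keyword -> value, e.g. {"TRAPID_ERR": 200}
    interval_seconds: int  # delay between successive polls

    def __post_init__(self):
        if not 1 <= len(self.name) <= MAX_RULE_NAME_LENGTH:
            raise ValueError(
                f"rule name must be 1-{MAX_RULE_NAME_LENGTH} characters: {self.name!r}"
            )
        # Assumption for this sketch: an interval must be a positive number of seconds.
        if self.interval_seconds <= 0:
            raise ValueError("polling interval must be a positive number of seconds")


checkcpu = PollingRule(
    name="checkcpu",
    condition="(VAL(.1.3.6.1.4.1.140.11.1.0) > 80)",
    actions={"TRAPID_ERR": 200},
    interval_seconds=600,
)
```

A name longer than eight characters, such as `checkcpuload`, is rejected with a `ValueError` in this sketch.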

Conditions

When the Agent Integrator polls, it checks to determine if a specified condition holds. A condition is defined as a relationship between an object (specified by its object identifier) and a value. (Object identifiers are described in Chapter 1, "Agent Integrator Overview." Polling is described in "Polling.")

Relations for Defining Conditions

The condition obtains (the threshold is crossed) if and only if the specified relation holds between the object and the value. For example, the relation greater than defines the following condition:

disk capacity in use greater than 90 percent

In this case, the condition holds (evaluates to true) if the object (percentage of disk capacity in use) has a value that is greater than 90. (In this example, the condition is described in English, not the actual code used to define Agent Integrator polling rules.)

Any of the relations listed in the following table can be used to define conditions.

**Table 4-1 Relations for Defining Conditions**
Symbol	Meaning
==	is identical to
!=	is not identical to
<	is less than (for numeric values); is a substring of (for strings)
>	is greater than (for numeric values); contains (for strings)
<=	is less than or equal to
>=	is greater than or equal to
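
The dual meaning of `<` and `>` for numbers and strings is easier to see in a small model. The following Python sketch is an assumption-laden illustration, not the Agent Integrator implementation: the function name `relation_holds` is hypothetical, and because Table 4-1 does not define `<=` and `>=` for strings, the sketch rejects that combination instead of guessing.

```python
def relation_holds(symbol, object_value, value):
    """Model the relations of Table 4-1 for an object value and a comparison value."""
    both_strings = isinstance(object_value, str) and isinstance(value, str)
    if symbol == "==":
        return object_value == value
    if symbol == "!=":
        return object_value != value
    if symbol == "<":
        # Strings: the object value is a substring of the comparison value.
        return object_value in value if both_strings else object_value < value
    if symbol == ">":
        # Strings: the object value contains the comparison value.
        return value in object_value if both_strings else object_value > value
    if symbol in ("<=", ">="):
        if both_strings:
            # Not defined for strings in Table 4-1; refusing is a choice of this sketch.
            raise ValueError(f"{symbol} is only modeled for numeric values")
        return object_value <= value if symbol == "<=" else object_value >= value
    raise ValueError(f"unknown relation: {symbol!r}")


print(relation_holds(">", 95, 90))          # numeric: 95 is greater than 90
print(relation_holds(">", "eth0", "eth"))   # string: "eth0" contains "eth"
print(relation_holds("<", "eth", "eth0"))   # string: "eth" is a substring of "eth0"
```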

Polling with a SMUX Subagent

For example, suppose that we want the Agent Integrator to check if CPU usage has exceeded 80 percent. This feature of the CPU is represented by the beaSysPerfCpu object in the beaSysPerf group. This MIB group is supported by the unix_snmpd subagent, which is shipped with the Agent Integrator. This subagent uses SNMP Multiplex (SMUX) protocol to talk to the Agent Integrator. Thus, the Agent Integrator can obtain the value of this object from the unix_snmpd subagent if it is running on the same machine. We could use the following condition to define a polling rule for the Agent Integrator:

(VAL(.1.3.6.1.4.1.140.11.1.0) > 80)

The expression VAL() is used to obtain the value of the beaSysPerfCpu object. The specified condition obtains if the percentage of CPU capacity in use exceeds 80 percent. In this example, the initial dot indicates that this is an absolute OID, that is, the path to the beaSysPerfCpu object is defined from the root of the OID tree. The actual OID for the beaSysPerfCpu object is .1.3.6.1.4.1.140.11.1. However, when retrieving the value, it is necessary to specify the instance of the object to be retrieved. The last numeral, 0, is the instance index. Because beaSysPerfCpu is a scalar object - an object that can have only one instance - an index of zero is specified in this case. (How to specify non-scalar objects is discussed in the section "Instance Indexes.")

The following is an example of an Agent Integrator rule that uses the condition previously specified:

RULE_ACTION checkcpu 600 \
if (VAL(.1.3.6.1.4.1.140.11.1.0) > 80) {TRAPID_ERR = 200}

In this example, checkcpu is the name of the rule. The Agent Integrator checks the CPU usage every ten minutes. If the value of beaSysPerfCpu is greater than 80 percent, TRAPID_ERR = 200 instructs the Agent Integrator to generate an enterprise-specific trap with a specific-trap type number of 200. This type number can be used by a system administrator to identify the cause of the trap.

Note: The MIB objects whose values the Agent Integrator can obtain depend on the MIB objects supported by the agents or subagents that the Agent Integrator is managing. In the previous example, the Agent Integrator can poll for the beaSysPerfCpu object value only if the unix_snmpd subagent is running on the managed node. The MIB objects that Agent Integrator can access through a "peer" SNMP agent depend on the NON_SMUX_PEER entries in the BEA Manager configuration file, as explained in Chapter 3, "Using Multiple SNMP Agents."

Polling with SNMP Peer Agents

The Agent Integrator can also obtain MIB object values from SNMP "peer" agents on either the same machine or other machines in the network. For example, suppose that we have a peer SNMP agent that supports the MIB II interfaces group. If so, we might want the Integrator to check if a physical interface is not operational. This feature of the interface is represented by the ifOperStatus object in the ifTable in the MIB II interfaces group. In this case, we want to know whether the value of ifOperStatus is not equal to 1. (An interface is operational if its ifOperStatus value is 1.) If we want to check the ifOperStatus value for the first interface on the machine, we could use the following condition:

(VAL(.1.3.6.1.2.1.2.2.1.8.1) != 1)

This condition holds if and only if the first interface in the ifTable is not operational. The last numeral, 1, specifies the instance index - the first interface entry in the table.

If the condition is satisfied, we want the Agent Integrator to take some action. For example, if the ifOperStatus value for an interface is not 1 (i.e., the interface is not up), we might want the Agent Integrator to notify the management station. To do this, we can specify that the Agent Integrator send an enterprise-specific SNMP trap to the management station with a special specific-trap value, which identifies the cause of the trap to the systems administrator.

Instead of requesting this notification if a specific interface (such as the first one in the ifTable) is down, we might want to be notified if any of the interfaces is down.

Here is an example of a rule entry that would do this:

RULE_ACTION checkIf 120 \
if (VAL(.1.3.6.1.2.1.2.2.1.8.*) != 1) {TRAPID_ERR=300}

In this example, checkIf is a name we have given to this particular rule. We have indicated that Agent Integrator should check the interface every two minutes (120). By using the asterisk wildcard for the instance index, the condition will be satisfied if any interface in the ifTable has an ifOperStatus not equal to 1, that is, all instances will be checked. If the value of the OID is not equal to 1 (the interface is not up) for any instance, an enterprise-specific trap is sent with a specific trap ID of 300.

Note: This rule only causes a trap to be generated when Agent Integrator first detects that an interface is down. If the interface continues to be down, it does not generate additional traps.
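
The "any instance" behavior of the asterisk wildcard can be modeled in a few lines. This Python sketch is illustrative only: the function name `wildcard_condition_holds` and the sample table contents are hypothetical, and the dictionary stands in for values that Agent Integrator would obtain from the peer agent. In MIB II, an ifOperStatus of 1 means up and 2 means down.

```python
def wildcard_condition_holds(instance_values, predicate):
    """Return True if the predicate holds for at least one instance.

    instance_values maps an instance index to the polled value for that instance.
    """
    return any(predicate(value) for value in instance_values.values())


# Hypothetical ifOperStatus values for three interfaces: interface 2 is down.
if_oper_status = {1: 1, 2: 2, 3: 1}

# Models the condition (VAL(.1.3.6.1.2.1.2.2.1.8.*) != 1)
print(wildcard_condition_holds(if_oper_status, lambda status: status != 1))
```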

Use of Logical Operators in Conditions

Conditions are of two types, simple and complex. A simple condition consists of a relation between a managed object and a value. All of the examples in the previous sections have been simple conditions.

You can use the logical operators AND, OR, and NOT to define complex conditions. For example, if A and B are two simple conditions, you can specify a complex condition that consists of both A and B occurring. The symbols listed in the following table can be used to define complex conditions.

**Table 4-2 Logical Operators for Specifying Complex Conditions**
Symbol	Meaning
!(condition_A)	Logical negation. The threshold is crossed if and only if condition_A does not hold.
(condition_A || condition_B)	Logical disjunction. The threshold is crossed if and only if either condition_A or condition_B obtains.
(condition_A && condition_B)	Logical conjunction. The threshold is crossed if and only if both condition_A and condition_B obtain.

Scenario for Using a Complex Condition

For example, we might not want the Agent Integrator to send an alarm when ifOperStatus is not up for an interface if a system administrator has taken that interface down for repair. In that case, we could define a rule that asks the Agent Integrator to determine if two conditions hold: ifOperStatus is not up AND ifAdminStatus is up. In other words, we want to be notified if the interface should be up but is not.

Note: The MIB objects whose values the Agent Integrator can obtain depend on the MIB objects supported by the agents or subagents that the Agent Integrator is managing.

Sample Code for this Scenario

To do this, we might modify our checkIf rule as follows:

RULE_ACTION checkIf 60 \
if ((VAL(.1.3.6.1.2.1.2.2.1.8.*) != 1) && \
(VAL(.1.3.6.1.2.1.2.2.1.7.*) == 1)) \
{TRAPID_ERR=301}

How this Rule Works

In this example, the Agent Integrator checks the interfaces every minute (60) and generates an enterprise-specific trap, with a specific trap value of 301, if any of the interfaces is not up (ifOperStatus not equal to 1) but has an ifAdminStatus value of up (i.e., the interface should be up but it is not).

Note: This rule causes this trap to be generated only when the condition first evaluates to true. As long as the interface continues in the same state, a new trap is not generated.
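
The intent of this complex condition, "should be up but is not," can be sketched as follows. This is a model written for illustration, not a description of how Agent Integrator evaluates wildcards internally: pairing the two wildcard values by interface index is an assumption based on the explanation above, and the function name and sample values are hypothetical.

```python
UP = 1  # value meaning "up" for both ifOperStatus and ifAdminStatus in MIB II


def interfaces_unexpectedly_down(oper_status, admin_status):
    """Return the indexes of interfaces that are administratively up but not operational.

    Both arguments map an interface index to the polled status value.
    """
    return sorted(
        index
        for index, oper in oper_status.items()
        if oper != UP and admin_status.get(index) == UP
    )


# Hypothetical polled values: interface 2 was taken down on purpose, interface 3 failed.
oper_status = {1: 1, 2: 2, 3: 2}
admin_status = {1: 1, 2: 2, 3: 1}

print(interfaces_unexpectedly_down(oper_status, admin_status))  # [3]
```

In this model the condition of the rule holds whenever the returned list is not empty, so an interface that an administrator has deliberately taken down does not raise an alarm.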

Data Types for Defining Conditions

The syntax for a simple condition is as follows:

(VAL(oid) relation value)

where

relation

Is one of the relations described in Table 4-1.

oid

Is specified in one of the formats described in the section "Specifying Object Identifiers in Conditions."

value

Can be one of the following data types:

integer
string
IP address (in the form number1.number2.number3.number4)
Object identifier, surrounded by single quotes (`). The OID should be specified exactly as returned by the agent managing that object.

Specifying Object Identifiers in Conditions

In defining polling conditions, the object identifier (OID) must be specified numerically, not using textual symbols (other than mib-2 or enterprises as indicated in the following list). One of the following formats can be used to specify the object identifier:

An absolute object identifier, that is, the full path to the object is specified from the root of the OID tree. An initial dot is used to indicate that the path starts at the root (for example, .1.3.6.1.2.1.1.1.0). Note that the trailing zero in this example is the instance index.
A relative OID under the MIB II branch can be specified in the form:
```
mib-2.number.number ...
```
When the reserved word mib-2 appears as the leading sub-oid, .1.3.6.1.2.1 is assumed to be prefixed to the rest of the OID. For example:
```
mib-2.1.1.0 
```
represents the absolute OID:
```
.1.3.6.1.2.1.1.1.0
```
A relative OID under the enterprises branch can be specified in the form:
```
enterprises.number.number ...
```
When the reserved word enterprises appears as the leading sub-oid, .1.3.6.1.4.1 is assumed to be prefixed to the rest of the OID. For example:
```
enterprises.140.1.0 
```
represents the absolute OID:
```
.1.3.6.1.4.1.140.1.0
```
A relative OID under the enterprises branch can also be specified in purely numeric form:
```
number.number.number ...
```
If there is no leading "." and the OID starts with a number, .1.3.6.1.4.1 is assumed to be prefixed to the rest of the OID. For example:
```
140.1.1.0 
```
represents the absolute OID:
```
.1.3.6.1.4.1.140.1.1.0 
```
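
The four formats above reduce to a simple prefixing scheme. The following Python sketch models that scheme for illustration; the function name `expand_oid` is hypothetical, and the decision to reject any other textual prefix is an assumption of the sketch that follows from the statement that only mib-2 and enterprises are accepted as textual symbols.

```python
MIB2_PREFIX = ".1.3.6.1.2.1"
ENTERPRISES_PREFIX = ".1.3.6.1.4.1"


def expand_oid(oid):
    """Expand an OID written in one of the four accepted formats to its absolute form."""
    oid = oid.strip()
    if oid.startswith("."):
        return oid  # already absolute
    head, _, rest = oid.partition(".")
    if head == "mib-2":
        return f"{MIB2_PREFIX}.{rest}" if rest else MIB2_PREFIX
    if head == "enterprises":
        return f"{ENTERPRISES_PREFIX}.{rest}" if rest else ENTERPRISES_PREFIX
    if head.isdigit():
        # No leading dot and a leading number: relative to the enterprises branch.
        return f"{ENTERPRISES_PREFIX}.{oid}"
    raise ValueError(f"unsupported OID format: {oid!r}")


for example in ("mib-2.1.1.0", "enterprises.140.1.0", "140.1.1.0", ".1.3.6.1.2.1.1.1.0"):
    print(example, "->", expand_oid(example))
```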

Instance Indexes

Columnar objects are used to represent a column of a tabular MIB group. Columnar objects accordingly can have multiple instances. To specify an instance, the index is appended to the rest of the OID. If the index is a single attribute, the last number in an OID is used to specify the particular instance. If more than one attribute is required to uniquely identify an instance, an instance number for each attribute is appended to the OID, separated by a dot, in the order specified by the INDEX definition in the ASN.1 file.

For example, suppose that you want to check for the condition when the state of a particular server is anything but active. To uniquely specify a server instance, we require both the group number and the server ID. The INDEX entry for tuxTsrvrTbl in the ASN.1 file specifies the following as an INDEX to particular instances.

INDEX (tuxTsrvrGrpNo,tuxTsrvrId)

The relative OID for tuxTsrvrState is the following:

140.300.20.1.1.5

Thus, to specify the particular server instance for group 55 and server ID 3, you use the following OID:

140.300.20.1.1.5.55.3

Note that the order of the two attribute instances added to the tuxTsrvrState OID is indicated by the INDEX definition above: tuxTsrvrGrpNo followed by tuxTsrvrId.

You can thus define the condition that you want to check as follows:

VAL(140.300.20.1.1.5.55.3) != 1

This condition will evaluate to true whenever this particular server instance is not active.
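
Appending index values in INDEX order is mechanical, as the following Python sketch shows. The helper name `instance_oid` is hypothetical and exists only for this illustration; the OID and index values are the ones used in the example above.

```python
def instance_oid(column_oid, *index_values):
    """Append one instance number per INDEX attribute, in INDEX order, to a column OID."""
    if not index_values:
        raise ValueError("at least one index value is required (use 0 for a scalar object)")
    return ".".join([column_oid, *(str(value) for value in index_values)])


# tuxTsrvrState for group number 55 and server ID 3 (tuxTsrvrGrpNo, then tuxTsrvrId)
print(instance_oid("140.300.20.1.1.5", 55, 3))

# A scalar object takes the single instance index 0
print(instance_oid(".1.3.6.1.4.1.140.11.1", 0))
```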

A specific number can be used to specify a particular instance or the asterisk wildcard can be used to specify all instances. Zero is used as the instance index in the case of scalar objects (objects that can have only one instance). The asterisk wildcard is only used to represent all instances of a columnar object. For example:

.1.3.6.1.4.1.140.1.1.0

specifies the single instance of a scalar object while:

.1.3.6.1.4.1.140.2.22.1.2.*

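specifies all of the instances of a columnar object. When a wildcard is used to define a condition, the condition will be satisfied if any instance satisfies the condition.

For example, the following rule, diskchk, checks every ten minutes (600) whether any instance of a columnar object has a value greater than 90: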

RULE_ACTION diskchk 600 \
if (VAL(140.2.22.1.5.*) > 90) {TRAPID_ERR = 102 TRAPID_OK = 202}

In the next example, a TUXEDO application is checked every 20 seconds to determine if the transaction triptime exceeds 35 mSec. If the threshold is crossed, an enterprise-specific trap with a specific trap number of 301 is generated and a user script, logtime, is invoked to log the time of the event. If the triptime is subsequently 35 mSec or less after having crossed that threshold on the previous poll, an enterprise-specific trap with a specific trap number of 302 is generated.

RULE_ACTION triptime 20 \
if (VAL(140.150.1.3.*) > 35) \
{TRAPID_ERR = 301 TRAPID_OK = 302 \
COMMAND_ERR = "/usr/sbin/logtime"}

Note: The object identifier in this example is not defined in the BEA MIB. This is an example of an object that might be defined in a user-supplied custom MIB.

In the next example, Agent Integrator polls every five seconds to check whether the number of requests completed by the TUXEDO server Server1 is greater than six. If it is, an enterprise-specific trap is generated with a specific trap number of 210 and the command c:/etc/srv_reqs.cmd is executed.

RULE_ACTION Server1 5 \
if ((VAL(140.300.20.2.1.12.*) > 6)) \
{ TRAPID_ERR=210  COMMAND_ERR="c:/etc/srv_reqs.cmd" }

In the next example, Agent Integrator checks whether a particular server instance is in any state other than active. The server that is being checked is uniquely identified by its group number and server ID: group number 55 and server ID 3.

RULE_ACTION srvrUp 60 if (VAL(140.300.20.1.1.5.55.3) != 1) \
                         {TRAPID_ERR = 306 TRAPID_OK = 307}

Whenever the server satisfies the condition, the rule transitions to the ERR state and generates an enterprise-specific trap with the specific trap number of 306. Whenever the server becomes active again, it transitions back to the OK state and issues a trap with the specific trap number of 307.
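
The transition behavior described here, in which a trap is sent only when the state changes, can be modeled as a two-state machine. The following Python sketch is illustrative: the class name `RuleState` and its method are hypothetical, and starting in the OK state is an assumption of the sketch. The trap numbers are those of the srvrUp rule.

```python
class RuleState:
    """Two-state model (OK / ERR) of a polling rule that sends traps on transitions only."""

    def __init__(self, trapid_err=None, trapid_ok=None):
        self.state = "OK"  # assumption: a rule starts in the OK state
        self.trapid_err = trapid_err
        self.trapid_ok = trapid_ok

    def poll(self, condition_holds):
        """Record one poll result and return the trap number to send, or None."""
        if condition_holds and self.state == "OK":
            self.state = "ERR"
            return self.trapid_err
        if not condition_holds and self.state == "ERR":
            self.state = "OK"
            return self.trapid_ok
        return None  # no state change, so no new trap


srvr_up = RuleState(trapid_err=306, trapid_ok=307)
# Successive poll results: does the condition "state is not active" hold?
results = [srvr_up.poll(holds) for holds in (False, True, True, False, False, True)]
print(results)  # [None, 306, None, 307, None, 306]
```

The second consecutive `True` produces no trap, which matches the earlier notes that a trap is generated only when the condition first evaluates to true.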

Starting and Stopping Polling

Polling rules are defined as RULE_ACTION entries in the BEA Manager configuration file, beamgr.conf. The default location of this file is /etc on UNIX machines or C:\etc on Windows NT machines. Individual rules are MIB objects, stored as an entry (row) in the beaIntAgtTable.

The status of each rule entry determines whether the Agent Integrator will execute that rule, that is, actively check the condition specified in the rule. The status of each rule entry is stored in the beaIntAgtStatus object. Polling is active for a rule if the status of that rule is valid (integer value of 1). Polling is inactive for a rule if its status has been set to inactive (integer value of 3). The specific rule can be SET from a management station (such as OpenView or SunNet Manager) by using the unique name of the rule as the key field used to specify the entry instance (row).

Note: The Agent Integrator must be running in order to successfully SET objects in the beaIntAgtTable.

The Agent Integrator begins executing all polling rules defined in RULE_ACTION entries in the BEA Manager configuration file (beamgr.conf) when it first starts up. The status of each rule object in the beaIntAgtTable is valid at startup.

Creating New Polling Rules

Rules can be added to the configuration file in two ways:

Use a text editor, such as vi, to add a RULE_ACTION entry to the file, taking care to conform to the syntax of the rule, as described in Chapter 7, "Configuration Files." However, if the Integrator is already running, the new rule will not take effect until you execute the following command:
```
reinit_agents snmp_integrator
```
This causes the Agent Integrator to re-read its configuration file.
Since individual rules are MIB objects, stored as an entry (row) in the beaIntAgtTable, you can use an SNMP manager (or the snmptest utility) to create a new entry (row) in the beaIntAgtTable. (The SNMP manager must have the ability to issue SNMP SET requests that contain multiple objects in a single SET request.) To create the new row, issue a SET request after specifying a new index value that does not already exist in the table. This causes a new RULE_ACTION entry to be created in the configuration file.

Deleting or Modifying Polling Rules

Agent Integrator polling rules can be deleted or modified in the same two ways they can be created:

Using a text editor to delete (or comment out) or modify a RULE_ACTION entry in the beamgr.conf file. This change does not take effect unless you issue the following command to force the Agent Integrator to re-read its configuration file:

reinit_agents snmp_integrator
SNMP SET commands can also be used to delete or modify rules.

Stopping Agent Integrator Polling Activity

Polling can be de-activated in one of two ways:

Remove the RULE_ACTION entry in the configuration file.

You can turn off a polling rule by commenting out or deleting that RULE_ACTION entry in the BEA Manager configuration file (beamgr.conf). However, for this to take effect, you will need to execute reinit_agents snmp_integrator, which causes the Agent Integrator to re-read its configuration file.
Use snmptest or an SNMP-compliant manager to SET the value of the rule status to inactive.

Polling for that rule can be de-activated from the management station (or by using the snmptest utility packaged with the Agent Integrator) by setting the value of that object to inactive (an integer value of 3). Setting the value to 2 (invalid) causes the RULE_ACTION entry to be deleted from the configuration file. Figure 4-1 shows our rule diskchk, discussed earlier, being set to inactive. Note that the read/write community string (in this example, "iview") is required for SET permission.
Figure 4-1 Setting a Polling Rule to Inactive

Restarting Agent Integrator Polling Activity

When a polling rule has been de-activated using a SET request from a management station, the rule can be re-activated using a SET request to set the value of the corresponding beaIntAgtStatus object to valid (integer value of 1).
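
The three status values mentioned in this chapter and their effects can be summarized in a small lookup. This Python sketch only restates the text above; the names `RuleStatus` and `describe_set_effect` are hypothetical and the code does not issue any SNMP request.

```python
from enum import IntEnum


class RuleStatus(IntEnum):
    """Integer values of the beaIntAgtStatus object as described in this chapter."""

    VALID = 1     # polling is active for the rule
    INVALID = 2   # the RULE_ACTION entry is deleted from the configuration file
    INACTIVE = 3  # polling is de-activated for the rule


EFFECTS = {
    RuleStatus.VALID: "polling active",
    RuleStatus.INVALID: "rule entry deleted from the configuration file",
    RuleStatus.INACTIVE: "polling inactive",
}


def describe_set_effect(value):
    """Return the effect of setting a rule's status to the given integer value."""
    return EFFECTS[RuleStatus(value)]


for value in (1, 2, 3):
    print(value, RuleStatus(value).name.lower(), "->", describe_set_effect(value))
```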