Sun Java(TM) System Directory Server 5 2004Q2 Deployment Planning Guide
Chapter 8
Directory Server Monitoring
An effective monitoring and event management strategy is crucial to any successful Directory Server deployment. Such a strategy defines which events should be monitored, which tools to use, and what action to take should an event occur. Having a plan for commonplace events helps prevent outages and reduced levels of service, improving both availability and quality of service.
A monitoring and event management strategy should include specific components of the architecture such as the replication configuration, but should also include system and network monitoring. This chapter examines what an effective monitoring strategy should include, and presents the monitoring features within Directory Server.
Note
This chapter does not focus on system and network monitoring, as this is an area not specific to Directory Server.
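This chapter is divided into the following sections:
- Defining a Monitoring and Event Management Strategy
- Directory Server Monitoring Tools
- Directory Server Monitoring
- SNMP Monitoring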
Defining a Monitoring and Event Management Strategy
This section provides an outline of the stages involved in defining a monitoring and event management strategy. The process can be broken down into the following steps:
- Select the appropriate monitoring tools, whether they be operating system tools, Directory Server monitoring tools, or third party monitoring tools.
- Identify the key areas to be monitored in the directory architecture (these are frequently the same as the sizing and tuning attributes).
- Define what triggers an event or alarm condition when monitoring the key performance measure. This implies defining an acceptable level of performance or operation for each performance measure.
- Determine what action should be taken when an alarm condition occurs.
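The third and fourth steps above, defining an alarm condition and the action tied to it, can be sketched as a small rule table. This is only an illustration: the measure names, threshold values, and actions below are hypothetical and are not taken from Directory Server.

```python
from dataclasses import dataclass


@dataclass
class AlarmRule:
    measure: str      # name of the performance measure being watched
    threshold: float  # limit of acceptable operation for that measure
    action: str       # what should happen when the alarm condition occurs

    def triggered(self, value: float) -> bool:
        # The alarm fires when the observed value reaches or exceeds the limit.
        return value >= self.threshold


def evaluate(rules, observations):
    """Return (measure, action) for every rule whose measure is over its limit."""
    actions = []
    for rule in rules:
        value = observations.get(rule.measure)
        if value is not None and rule.triggered(value):
            actions.append((rule.measure, rule.action))
    return actions


# Hypothetical rules and readings, for illustration only.
RULES = [
    AlarmRule("open_connections", 900, "page the on-call administrator"),
    AlarmRule("search_etime_seconds", 5, "review unindexed searches"),
]
print(evaluate(RULES, {"open_connections": 950, "search_etime_seconds": 1}))
```

Keeping the rules as data rather than as scattered conditionals makes it easier to review the acceptable levels of operation in one place.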
Directory Server Monitoring Tools
This section provides a summary of the monitoring tools available in Directory Server, and other tools that can be used to monitor Directory Server activity. All of the key performance measures, described in the next section, can be monitored using one, or a combination of, these tools.
The access, audit, and error logs provided with Directory Server are a rich source of monitoring information. These logs can be monitored manually or parsed using custom scripts to extract monitoring information relevant to your deployment. The Directory Server Resource Kit provides a log analyzer tool, logconv.pl, that enables you to analyze Directory Server access logs. The log analyzer tool extracts usage statistics and counts the occurrences of significant events. For more information on this tool, refer to Chapter 24, “The Log Analyzer Tool,” in the Directory Server Resource Kit Tools Reference. For information on viewing and configuring log files, refer to Chapter 13, “Monitoring Directory Server Using Log Files,” in the Directory Server Administration Guide.
Directory Server Console enables you to monitor directory operations in real time, via a graphical user interface. The Console provides general server information, including a resource summary, current resource usage, connection status, and global database cache information. It also provides general database information such as the database type, status and entry cache statistics, cache information, and information relative to each index file within the database. In addition, the Console provides information relative to the connections and operations performed on each chained suffix.
The replication monitoring tools provided with Directory Server enable you to:
- monitor the state of synchronization between a master replica and one or more consumer replicas
- compare the same entry on two or more different replicas, enabling you to assess replication status
- depict your complete replication topology, which is particularly beneficial when dealing with complex directory deployments
Directory Server supports monitoring with the Simple Network Management Protocol (SNMP). SNMP is the standard mechanism for global network control and monitoring, and enables network administrators to centralize network monitoring activity.
For a detailed description of SNMP and Directory Server’s SNMP managed object support see SNMP Monitoring. For information on how to set up and configure SNMP refer to Chapter 14, “Monitoring Directory Server Using SNMP” in the Directory Server Administration Guide.
Directory Server Monitoring
The most important step in defining a monitoring and event management strategy is determining the key areas to be monitored on one or more components in your directory architecture. What you monitor, and to what extent, will depend largely on the specifics of your deployment.
This section describes the performance measures that should be monitored, and includes the following:
- Monitoring Directory Server Activity
- Monitoring Database Activity
- Monitoring Disk Status
- Monitoring Replication Activity
- Monitoring Indexing Efficiency
- Monitoring Security
Monitoring Directory Server Activity
Directory Server provides a number of ways in which you can monitor server status. These include, but are not limited to, the following:
- The Servers and Applications tab of Sun Java System Server Console displays general information regarding your server including the installation date, the version, the server status (whether or not it is started) and the port numbers.
- Directory Server Console provides access to additional monitoring information. The Status tab on this console displays the following information:
- The startup and current time on the server.
- A Resource Summary that details connections, initiated and completed operations, and entries and bytes sent to clients.
- Current Resource Usage information, including active threads, open and available connections, number of threads waiting to read from the client, and number of databases in use.
- Information on all Open Connections, including when they were opened, how many operations were started and completed, the distinguished name used by the client to bind to the server, the state of the connection (Blocked or Not blocked), and the type of connection (LDAP or DSML).
For more information regarding the performance counters available through Directory Server Console, refer to “Monitoring Your Server Using the Console” in the Directory Server Administration Guide.
- Running an ldapsearch command on the cn=monitor entry provides access to the same information presented in the Status tab of Directory Server Console. Note that certain monitoring information is accessible only if the user issuing the ldapsearch command is bound as Directory Manager. You can remove this access constraint by reconfiguring the access control associated with this information. For details regarding the performance counters stored under cn=monitor, refer to “Monitoring Your Server From the Command Line” in the Directory Server Administration Guide.
- The ps command displays processes that are currently running. This enables you to determine whether the Directory Server slapd daemon is running. Refer to the ps(1) man page for more information.
- The ldapsearch command-line utility enables you to test whether Directory Server is responding to requests. To avoid launching time-consuming, unindexed searches, it is wise to use base level searches. Where you have more than one database, it is also wise to create an LDAP query for each database suffix to test whether or not the database is online and responding.
- Directory Server access logs enable you to monitor server operations and to establish whether the server is running. For more information on the access logs and connection codes refer to “Access Log Content,” and “Common Connection Codes,” in Chapter 3 of the Directory Server Administration Reference.
- The Directory Server error log records the server’s start and stop status, and enables you to establish that the server is running. For more information about viewing and configuring log files refer to Chapter 13, “Monitoring Directory Server Using Log Files” in the Directory Server Administration Guide.
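The output of a search on cn=monitor, mentioned in the list above, is easy to post-process in a script. The sketch below parses simple "attribute: value" lines into a dictionary and derives the number of operations still in progress. The attribute names and sample values are assumptions made for illustration; check them against the output of your own server. Folded (continuation) lines and base64-encoded values are deliberately not handled.

```python
def parse_ldif_entry(text):
    """Parse plain 'attribute: value' lines into a dict of value lists."""
    entry = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not an attribute line
        entry.setdefault(name.strip().lower(), []).append(value.strip())
    return entry


def outstanding_operations(entry):
    """Operations initiated but not yet completed."""
    return int(entry["opsinitiated"][0]) - int(entry["opscompleted"][0])


# Made-up sample resembling the text output of a cn=monitor search.
SAMPLE = """\
dn: cn=monitor
currentconnections: 12
opsinitiated: 4810
opscompleted: 4809
"""

monitor = parse_ldif_entry(SAMPLE)
print(monitor["currentconnections"][0], outstanding_operations(monitor))
```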
Monitoring Database Activity
Monitoring database activity helps to ensure that your database is online and accessible when it is required. Database monitoring information can be accessed by running an ldapsearch command on a specific area of the cn=config branch. The kind of monitoring information provided and the corresponding area of the cn=config branch are presented in Table 8-1.
The areas of database monitoring information are presented in more detail in the following section.
- The cn=database,cn=monitor,cn=ldbm database,cn=plugins,cn=config branch provides access to cache, transaction, lock, and log information. For a complete list of Directory Server configuration attributes, refer to Chapter 2, “Server Configuration Reference,” in the Directory Server Administration Reference.
The type of general database information you monitor will depend on the specific requirements of your directory deployment. For example, if your Directory Server frequently handles several simultaneous transactions, you may want to monitor the maximum number of transactions being handled at a particular time. If this number (defined by the nsslapd-db-max-txns attribute) approaches the maximum number of transactions allowed (defined by the nsslapd-db-configured-txns attribute), you may want to increase the maximum number of transactions allowed, to prevent operations from failing.
For a complete list of the relevant configuration attributes refer to Chapter 2, “Server Configuration Reference,” in the Directory Server Administration Reference.
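The transaction check described above reduces to comparing two numbers. A minimal sketch follows, assuming the two attribute values have already been read from the server as integers; the 80 percent warning ratio is an arbitrary illustrative choice, not a documented recommendation.

```python
def txn_usage_ratio(max_txns, configured_txns):
    """Fraction of the configured transaction limit that has been reached.

    max_txns        -- value read from nsslapd-db-max-txns
    configured_txns -- value read from nsslapd-db-configured-txns
    """
    if configured_txns <= 0:
        raise ValueError("configured_txns must be positive")
    return max_txns / configured_txns


def needs_more_txns(max_txns, configured_txns, warn_ratio=0.8):
    # warn_ratio is a hypothetical threshold; tune it for your deployment.
    return txn_usage_ratio(max_txns, configured_txns) >= warn_ratio


print(needs_more_txns(18, 20))  # 18 of 20 allowed transactions were in use
print(needs_more_txns(5, 20))
```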
Monitoring Disk Status
Effectively monitoring disk space enables you to prevent the problems associated with inadequate disk resources. The cn=disk,cn=monitor entry provides access to the following monitoring information:
- The path to the database instance. Where several database instances reside on the same disk or an instance refers to several directories on the same disk, the short path name is displayed.
- The amount of disk space available to the server in MB.
- The status of the disk (normal, low or full). This status is based on the available space and on the thresholds configured to trigger a disk “low” and disk “full” warning. It is particularly important to monitor the disk full threshold, since the directory will no longer accept updates once this limit is reached.
For more information on the cn=disk,cn=monitor attributes as well as the configurable disk low or full thresholds, refer to Chapter 2, “Server Configuration Reference,” in the Directory Server Administration Reference.
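As an illustration of how the three disk states might feed an alerting script, the sketch below ranks the states and reports every disk that is not normal, most severe first. It assumes the path, free space in MB, and state have already been read from cn=disk,cn=monitor; the paths and numbers are made up.

```python
# Rank the three states named above so that they can be compared.
SEVERITY = {"normal": 0, "low": 1, "full": 2}


def disk_alerts(disks):
    """disks is a list of (path, free_mb, state) tuples.

    Returns a message for each disk whose state is not 'normal',
    with the most severe states first.
    """
    flagged = [d for d in disks if SEVERITY[d[2]] > 0]
    flagged.sort(key=lambda d: SEVERITY[d[2]], reverse=True)
    return ["%s: %s (%d MB free)" % (path, state, free_mb)
            for path, free_mb, state in flagged]


# Hypothetical readings, for illustration only.
READINGS = [
    ("/db/userRoot", 4000, "normal"),
    ("/db/changelog", 80, "low"),
    ("/db/example", 2, "full"),
]
for message in disk_alerts(READINGS):
    print(message)
```

Treating "full" as the most severe state reflects the point made above: once the disk full threshold is reached, the directory no longer accepts updates.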
Monitoring Replication Activity
Monitoring replication status is an essential element of your global monitoring strategy. The earlier you become aware of potential replication problems, the quicker you can resolve those problems and reestablish correct replication operation.
Directory Server 5.2 provides three replication monitoring tools that enable you to monitor various aspects of replication functionality. The replication monitoring tools function as LDAP clients and can be used over a standard or secure (LDAPS) connection. The following replication monitoring tools are provided:
insync
The insync tool indicates the state of synchronization (or replication delay) between a master replica and one or more consumer replicas. This replication delay is an indication of how accurate the data is on a consumer, compared to the data on the master.
entrycmp
The entrycmp tool allows you to compare the same entry on two or more different servers. An entry is retrieved from the master replica and the entry’s nsuniqueid is used to retrieve the same entry from a given consumer. Entry attributes and values are compared and, if these are identical, the entries are considered to be the same.
repldisc
The repldisc tool allows you to discover a replication topology. Topology discovery starts with one server and constructs a graph of all known servers within the topology. The repldisc tool then prints an adjacency matrix describing the topology. This replication topology discovery tool is useful for large, complex deployments where it might be difficult to recall the global topology you have deployed.
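The idea behind topology discovery, starting from one server, following its links to other servers, and printing an adjacency matrix, can be sketched as a breadth-first walk over a graph. This is a conceptual model only: it does not reproduce how repldisc queries servers or formats its output, and the server names and agreements are hypothetical.

```python
from collections import deque


def discover(start, agreements):
    """Walk outward from one server, following replication agreements.

    agreements maps each supplier to the list of servers it replicates to.
    """
    seen = [start]
    queue = deque([start])
    while queue:
        server = queue.popleft()
        for consumer in agreements.get(server, []):
            if consumer not in seen:
                seen.append(consumer)
                queue.append(consumer)
    return seen


def adjacency_matrix(servers, agreements):
    """Row i, column j is 1 when server i replicates to server j."""
    return [[1 if dst in agreements.get(src, []) else 0 for dst in servers]
            for src in servers]


# Hypothetical topology: two masters feeding each other and one consumer.
AGREEMENTS = {"m1": ["m2", "c1"], "m2": ["m1", "c1"], "c1": []}
servers = discover("m1", AGREEMENTS)
matrix = adjacency_matrix(servers, AGREEMENTS)
for name, row in zip(servers, matrix):
    print(name, row)
```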
Notes
- When using the replication monitoring tools, you must use either all symbolic names or all IP addresses when identifying hosts. Using a combination of the two can be problematic.
- When running the replication monitoring tools over SSL, the server on which you are running the tools must have a copy of all the certificates used by the other servers in the topology.
- These tools are based on LDAP clients, and as such, will need to authenticate to the server and use a bind DN that has read access to cn=config. For more information about the configuration details of these tools and using the tools with SSL enabled refer to Monitoring Replication Status in the Directory Server Administration Guide.
For more information about the replication monitoring tools, refer to “entrycmp,” “insync,” and “repldisc,” in Chapter 1 of the Directory Server Administration Reference.
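The comparison that entrycmp performs, attribute by attribute and value by value, can be pictured with the following sketch. It assumes each entry has already been retrieved and is held as a dictionary of value lists. Values are compared as exact strings here, which is a simplification: a real comparison may apply attribute-specific matching rules. The sample entries are hypothetical.

```python
def normalize(entry):
    # Attribute names are case-insensitive, and the order of values
    # within an attribute is not significant.
    return {name.lower(): sorted(values) for name, values in entry.items()}


def entry_differences(master, consumer):
    """Return the sorted names of attributes whose values differ."""
    a, b = normalize(master), normalize(consumer)
    return sorted(name for name in a.keys() | b.keys()
                  if a.get(name) != b.get(name))


# Hypothetical copies of the same entry on two replicas.
ON_MASTER = {"cn": ["Babs Jensen"], "mail": ["a@example.com", "b@example.com"]}
ON_CONSUMER = {"CN": ["Babs Jensen"], "mail": ["b@example.com"]}

print(entry_differences(ON_MASTER, ON_CONSUMER))
```

An empty result means the two copies are considered the same under this simplified comparison.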
Monitoring Indexing Efficiency
Indexing has a positive impact on read performance and a negative impact on write performance. It is therefore important to monitor indexing efficiency to maintain an appropriate balance between read and write performance. An effective indexing strategy eliminates unnecessary indexes and maintains only those indexes required for client applications.
Indexing efficiency can be monitored in the following ways:
- By consulting the access logs and monitoring the time unindexed searches take to complete, you can identify the unindexed searches that have taken a disproportionate amount of time. (Unindexed searches are identified in the log files by notes=U and long searches have a high value for etime.)
The access log also provides additional information on searches and their filters, enabling you to decide whether it might be worth creating an index to improve performance. The Directory Server Resource Kit provides a log analyzer tool, logconv.pl, that enables you to analyze Directory Server access logs. For more information on this tool, refer to Chapter 24, “The Log Analyzer Tool,” in the Directory Server Resource Kit Tools Reference.
- The Status tab of Directory Server Console allows you to monitor the most frequently used indexes per suffix or chained suffix. It indicates how many attempts have been made to use the indexes and how many attempts have been successful. The same monitoring information can be accessed by running an ldapsearch command on the cn=monitor,cn=suffixName,cn=ldbm database,cn=plugins,cn=config branch.
A list of configured indexes is available in the Configuration tab of Directory Server Console (under the Data > suffixName node). Comparing the frequently used indexes, described above, with the list of configured indexes enables you to identify the indexes that are using resources unnecessarily, and to decide whether they can be removed. If entries contain indexed attributes and the indexes are not used, removing these indexes will improve add performance.
For more information on access log content and connection codes refer to “Access Log Content,” and “Common Connection Codes,” in Chapter 3 of the Directory Server Administration Reference. For a complete list of Directory Server configuration attributes, refer to Chapter 2, “Server Configuration Reference,” in the Directory Server Administration Reference.
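A custom script for the access log check described above could look roughly like the following. It scans RESULT lines for the notes=U marker and a high etime value. The sample log lines are illustrative rather than copied from a real server, and the two-second cutoff is an arbitrary assumption.

```python
import re

# Matches key=value tokens such as conn=11, etime=7, or notes=U.
TOKEN = re.compile(r"(\w+)=(\S+)")


def parse_result(line):
    """Return the key=value fields of a RESULT line, or None for other lines."""
    if " RESULT " not in line:
        return None
    return dict(TOKEN.findall(line))


def slow_unindexed(lines, min_etime=2):
    """Return (conn, op, etime) for unindexed searches at or above min_etime."""
    hits = []
    for line in lines:
        fields = parse_result(line)
        if not fields:
            continue
        etime = float(fields.get("etime", 0))
        if "U" in fields.get("notes", "") and etime >= min_etime:
            hits.append((fields["conn"], fields["op"], etime))
    return hits


# Illustrative lines only; check the exact layout against your own logs.
LOG = [
    "[21/Apr/2004:11:39:55 -0700] conn=11 op=1 msgId=2 - RESULT err=0 tag=101 nentries=1 etime=7 notes=U",
    "[21/Apr/2004:11:39:56 -0700] conn=11 op=2 msgId=3 - RESULT err=0 tag=101 nentries=1 etime=0",
    "[21/Apr/2004:11:39:57 -0700] conn=12 op=0 msgId=1 - BIND dn=\"\" method=128 version=3",
]
print(slow_unindexed(LOG))
```

The conn and op values identify the operation, so the matching SRCH line and its filter can then be looked up in the same log to decide whether an index is worth creating.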
Monitoring Security
Monitoring the security of your deployment is vital in maintaining a secure, accessible directory. Suggestions on how to monitor Directory Server with a view to maintaining an acceptable level of security follow:
- Monitoring the number of failed bind attempts alerts you to attempts to break into your directory. If the SNMP agent is running, failed bind attempts can be monitored by running an ldapsearch command on the SNMP managed object counter dsBindSecurityErrors located under cn=snmp,cn=monitor.
- Monitoring the number of open connections without any activity alerts you to potential denial of service attacks. The number of current connections and the number of completed operations can be accessed via the Status tab of Directory Server Console or by searching the attributes located under cn=monitor.
- The Effective Rights feature enables clients to query the access control rights they have to directory entries and attributes. Being able to request the access rights of a user simplifies user administration, access control policy verification, and configuration decision making.
The Effective Rights feature would most likely be used periodically rather than on a day-to-day operations basis. For more detailed information regarding the Effective Rights feature see Requesting Effective Rights Information.
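A counter such as the number of failed bind attempts only ever increases, so what matters for alerting is how fast it grows between two readings. The sketch below computes that rate, assuming the counter has been sampled twice at a known interval. The limit of 10 failures per minute is a hypothetical value, not a documented recommendation.

```python
def counter_rate(previous, current, interval_seconds):
    """Increase per minute of a monotonically increasing counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = current - previous
    if delta < 0:
        # The counter went backwards, which suggests that the server was
        # restarted between samples; count from zero instead.
        delta = current
    return delta * 60.0 / interval_seconds


def bind_failure_alarm(previous, current, interval_seconds, max_per_minute=10):
    # max_per_minute is an assumed threshold; choose one for your deployment.
    return counter_rate(previous, current, interval_seconds) > max_per_minute


print(counter_rate(100, 160, 120))        # 60 failures in two minutes
print(bind_failure_alarm(100, 160, 120))
```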
SNMP Monitoring
SNMP is the standard mechanism for global network control and monitoring. It allows network administrators to centralize network monitoring activities, and can be used to monitor a wide range of devices in real time. This section describes how SNMP can be used to monitor Directory Server operation, and contains the following topics:
About SNMP
SNMP is a protocol used to exchange data about network activity. With SNMP, data travels between a managed device and a network management station (NMS) where users manage the network remotely. A managed device is anything that runs SNMP, such as hosts, routers, and Directory Server. An NMS is usually a powerful workstation running one or more network management applications. A network management application usually displays graphical information about managed devices (which device is up or down, which and how many error messages were received, and so on).
Information is transferred between the NMS and the managed device through the use of two types of agents: the subagent and the master agent. The subagent gathers information about the managed device and passes the information to the master agent. Directory Server has a subagent. The master agent exchanges information between the various subagents and the NMS. The master agent runs on the same host machine as the subagents it talks to.
Multiple subagents can be installed on a host machine. For example, if Directory Server, Application Server, and Messaging Server are all installed on the same host, the subagents for each of these servers communicate with the same master agent. The master agent is installed with Administration Server.
Values for SNMP attributes that can be queried are kept on the managed device and reported to the NMS as necessary. Each attribute or variable is known as a managed object, which is anything the agent can access and send to the NMS. All managed objects are defined in a management information base (MIB), which is a database with a tree-like hierarchy. The top level of the hierarchy contains the most general information about the network. Each branch below is more specific and deals with a separate network area.
SNMP exchanges network information in the form of protocol data units (PDUs). PDUs contain information about variables stored on the managed device. These variables, also known as managed objects, have values and titles that are reported to the NMS as necessary. Communication between an NMS and a managed device takes place in one of two ways:
Directory Server supports NMS-initiated communication, described in the following section.
NMS-Initiated Communication
This is the most common type of communication between an NMS and a managed device. In this type of communication, the NMS either requests information from the managed device or changes the value of a variable stored on the managed device.
The following steps make up an NMS-initiated SNMP session:
- The NMS determines which managed devices and objects must be monitored.
- The NMS sends a protocol data unit to the managed device’s subagent through the master agent. This protocol data unit either requests information from the managed device or tells the subagent to change the values for variables stored on the managed device.
- The subagent for the managed device receives the protocol data unit from the master agent.
- If the protocol data unit from the NMS is a request for information about variables, the subagent gives information to the master agent and the master agent sends it back to the NMS in the form of another protocol data unit. The NMS then displays the information textually or graphically.
If the protocol data unit from the NMS requests that the subagent set variable values, the subagent sets these values.
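The flow of an NMS-initiated session can be pictured with a toy model in which a master agent forwards requests to the right subagent. This is not an SNMP implementation: the dictionaries stand in for protocol data units, the class and method names are invented for illustration, and returning a response to a set request is a simplification.

```python
class Subagent:
    """Holds the variables (managed objects) for one managed device."""

    def __init__(self, variables):
        self.variables = dict(variables)

    def handle(self, pdu):
        if pdu["type"] == "get":
            # Request for information: report the current values.
            values = {name: self.variables.get(name) for name in pdu["names"]}
            return {"type": "response", "values": values}
        if pdu["type"] == "set":
            # Request to change values: update the stored variables.
            self.variables.update(pdu["values"])
            return {"type": "response", "values": dict(pdu["values"])}
        raise ValueError("unsupported PDU type: %r" % pdu["type"])


class MasterAgent:
    """Relays PDUs between the NMS and the subagents on the same host."""

    def __init__(self):
        self.subagents = {}

    def register(self, name, subagent):
        self.subagents[name] = subagent

    def forward(self, target, pdu):
        return self.subagents[target].handle(pdu)


# The NMS side of the exchange, with a made-up counter value.
master = MasterAgent()
master.register("directory", Subagent({"dsBindSecurityErrors": 3}))
reply = master.forward("directory", {"type": "get", "names": ["dsBindSecurityErrors"]})
print(reply["values"])
```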
SNMP Monitoring in Directory Server
Directory Server supports SNMP monitoring in two ways:
- Monitoring via an SNMP agent. SNMP attributes are mapped to a statistics file which is read each time the SNMP agent is queried. This statistics file is not present if Directory Server is not running.
- Monitoring using the ldapsearch command-line utility. SNMP attributes are stored under the cn=snmp,cn=monitor entry. The following ldapsearch command provides a list of all SNMP attributes in Directory Server:
ldapsearch -h host -p port -s base -b "cn=snmp,cn=monitor" "objectclass=*"
Figure 8-1 shows the two ways in which SNMP monitoring information can be retrieved from Directory Server.
Figure 8-1 SNMP Monitoring in Directory Server
For information on where the MIBs are defined, and how to use SNMP refer to Chapter 14, “Monitoring Directory Server Using SNMP” in the Directory Server Administration Guide.
The SNMP managed objects supported by Directory Server are based on an early draft of the Directory Server Monitoring MIB RFC 2605. The SNMP operations managed objects returned by the SNMP agent are the same as the SNMP monitoring attributes returned by an ldapsearch command. These attributes are described in “Monitoring Attributes,” in Chapter 2 of the Directory Server Administration Reference. Names of attributes returned by the SNMP agent are prefixed with ds.
In addition to the operations managed objects, Directory Server supports managed objects related to the interactions between the monitored server and its peer servers, and entity-related managed objects, which contain information about the current server installation. These objects are described in the “Interactions Table of Supported SNMP Managed Objects,” and the “Entity Table of SNMP Supported Managed Objects,” in the Directory Server Administration Reference.