This chapter provides an architectural overview of Log Central, covering the following topics:

Log Messages in System Management
Agent/Manager Architecture
Log Central and Enterprise Management Systems
Log Central Components on the Managed Node
Central Collector

Log Messages in System Management
Log messages are typically used as a system management tool: to detect problems, track down the source of a fault, or track system performance. Distributed systems typically include a variety of software components that generate message logs, such as operating systems and relational database management systems (RDBMS). In the absence of any standard, software makers use different practices for message logging.
Log Central allows you to extract the information from these diverse logs and map the information into a common format. The information is maintained in a single relational database, providing a single point of access and a unified view of all information contained in log messages. This database approach improves the manageability of distributed systems.
A single failure, such as a file system filled to capacity, can generate a number of different log messages as the problem ripples through the affected software components. A unified view of the various messages means that the source of a problem can be more rapidly diagnosed.
All messages are stored in an RDBMS, and users can view the logs, generate reports, and do online monitoring through a set of graphical user interface tools, called the Log Central Console, and through several commands offered at the operating system level.
Also, each message is associated with a message definition, which includes information such as severity (the degree of impact on the distributed system), the probable cause of the message, and the actions to take when it is logged. This information can be viewed and updated online by the administrator, who can use it to form a knowledge base for resolving problems.
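A message definition can be pictured as a small record associated with each message type. The Python sketch below is purely illustrative; the field and function names are hypothetical and are not the actual Log Central schema. It shows how severity, probable cause, and recommended actions might be kept together and consulted when a message is logged.

    # Illustrative sketch only -- field names are hypothetical, not the Log Central schema.
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class MessageDefinition:
        message_id: str                 # identifies the message type
        severity: str                   # degree of impact on the distributed system
        probable_cause: str             # likely reason the message was generated
        actions: List[str] = field(default_factory=list)   # steps to take when it is logged

    # A small in-memory "knowledge base" that an administrator might build up over time.
    knowledge_base = {
        "FS_FULL": MessageDefinition(
            message_id="FS_FULL",
            severity="CRITICAL",
            probable_cause="A file system has filled to capacity.",
            actions=["Remove or archive old files.", "Extend the file system."],
        ),
    }

    def describe(message_id: str) -> Optional[MessageDefinition]:
        """Look up the definition associated with a logged message, if one exists."""
        return knowledge_base.get(message_id)

    definition = describe("FS_FULL")
    if definition is not None:
        print(definition.severity, "-", definition.probable_cause)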
Log information can be monitored "in real time" as it arrives at the Central Collector, using the Log Central Message Browser (part of the Log Central Console). The Central Collector stores management information in a relational database system which can be queried for analysis of problems or to track trends.
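As a rough illustration of the kind of trend analysis this makes possible, the following Python sketch runs an ad hoc query against a hypothetical, simplified table; the table and column names are invented here and are not the actual Log Central database layout.

    import sqlite3

    # Hypothetical, simplified schema -- not the actual Log Central database layout.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lc_messages (log_time TEXT, host TEXT, severity TEXT, text TEXT)")
    conn.execute(
        "INSERT INTO lc_messages VALUES "
        "('2024-05-14 04:11:02', 'host1', 'CRITICAL', 'file system /var is full')")

    # The kind of query an administrator might run to track trends:
    # how many messages of each severity arrived from each host.
    for row in conn.execute(
            "SELECT host, severity, COUNT(*) FROM lc_messages GROUP BY host, severity"):
        print(row)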
How to use the Log Central Console is discussed in Chapter 9, "Using the Log Central Console."
Agent/Manager Architecture

Log Central is based on an agent/manager architecture, as shown in Figure 1-1. Local data collection agents run on the machines where you have resources to be managed; these machines are called managed nodes. The agents forward log messages to the Central Collector. The Log Central Console, the Central Collector, and the Log Central relational database together play the "manager" role in the Log Central system.
The log agents monitor log messages generated by the resources that you wish to manage, such as messages logged to the UNIX syslog or NT event log, BEA TUXEDO userlogs, or relational database system logs. Agents map the information from these log messages into Log Central's uniform internal format for forwarding to the Central Collector. The data collection agents can be distributed around the network as needed.
To implement fault tolerance, you can configure a secondary Central Collector. If the primary Central Collector becomes unavailable, management information is automatically sent to the secondary Central Collector; control automatically switches back to the primary Central Collector when it becomes available again. At that point, the information that was sent to the secondary Central Collector is also available to the primary Collector, provided the primary and secondary Central Collectors have been configured to use the same RDBMS.
How to configure a backup Central Collector is described in Chapter 6, "Host and Filter Configuration."
When no Central Collector is available to the data collection agent, the agent automatically stores the information in a temporary local backup file. Information in this file is automatically recovered and passed to the Central Collector when the Central Collector becomes available.
Log Central and Enterprise Management Systems

Log Central allows you to integrate information from logs into an enterprise management system using Simple Network Management Protocol (SNMP). Both the data collection agents and the Central Collector can generate SNMP trap notifications, so trap generation can be configured at two levels: at the Central Collector and at the individual data collection agents.
At the agent level, filters (defined in the messaging.conf file) let you specify more complex criteria for selecting the messages that trigger SNMP trap notifications. These criteria allow you to generate SNMP trap notifications from the distributed data collection agents. Defining agent filters to generate SNMP traps is described in Chapter 6, "Host and Filter Configuration."
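The actual filter syntax is covered in Chapter 6; the short Python sketch below is only a conceptual illustration, with invented field names and a stubbed trap sender, of how an agent-side filter evaluates criteria against each message and decides whether to raise an SNMP trap.

    # Conceptual illustration only -- not the messaging.conf filter syntax.
    # Example criteria: trap on high-severity messages from a particular host.
    def matches(message: dict) -> bool:
        return (message.get("severity") in ("ERROR", "CRITICAL")
                and message.get("host") == "db-server-1")

    def send_trap(message: dict) -> None:
        # Placeholder: a real agent would emit an enterprise-specific SNMP trap here.
        print("SNMP trap:", message["severity"], message["text"])

    def apply_filter(message: dict) -> None:
        if matches(message):
            send_trap(message)

    apply_filter({"severity": "CRITICAL", "host": "db-server-1",
                  "text": "file system / is 100% full"})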
Steps for integrating Log Central with your enterprise management system are described in Chapter 7, "Integrating Log Central with an SNMP Manager."
Log Central Components on the Managed Node

Data Collection Agents

The data collection agent is made up of a Message Sender and one or more Log Monitors. Log Monitors monitor log messages generated by the resources that you wish to manage, such as messages logged to the UNIX syslog or Windows NT event log, BEA TUXEDO userlogs, or relational database system (RDBMS) logs. A distinct Log Monitor process is used on each managed node to monitor a particular log and map the information from incoming log messages to the Log Central internal log format. The log messages are then forwarded to the Message Sender.

The flow of information from the managed resource through the agent to the Central Collector is shown in Figure 1-2.

Figure 1-2 Log Central Flow of Information

Note: Although two Log Monitors are shown in this diagram, the data collection agent can in fact have one or any number of Log Monitor processes. You must have a separate Log Monitor process for each log that you wish to monitor.

Log Monitor

The Log Monitor reads the logs generated by the managed resource, such as a computer system, a BEA TUXEDO application, or a relational database system. The Log Monitor maps the attributes in the managed resource's log messages to attributes in Log Central messages. Messages are then placed in the Message Sender queue for forwarding to the Central Collector. You need a dedicated Log Monitor process for each managed resource.

Mapping of information into the Log Central internal format is provided out-of-the-box for these logs: the UNIX syslog, the Windows NT event log, BEA TUXEDO userlogs, and RDBMS logs.
Message definitions are also provided out-of-the-box for these log resources.
This is an extensible system in that other log resources can be incorporated into Log Central as well. To manage another log resource, you need to provide two things: a mapping of its log messages into the Log Central internal format, and message definitions for those messages.
You can define different mappings for the same log file; up to 20 different Log Central messages could be generated from a single message logged by the managed resource. How to devise mappings for log files is discussed in Chapter 4, "Integrating Logs into Log Central."
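As a rough conceptual sketch (the rule format and field names below are invented for illustration and are not the mapping syntax described in Chapter 4), a set of mappings can be pictured as patterns applied to each raw log line; every rule that matches produces one Log Central message, which is how a single logged message can yield several Log Central messages.

    import re

    # Hypothetical mapping rules: each rule that matches a raw log line
    # produces one Log Central-style message.
    MAPPING_RULES = [
        {"pattern": re.compile(r"file system (?P<fs>\S+) is full"),
         "severity": "CRITICAL", "text": "File system {fs} has no free space"},
        {"pattern": re.compile(r"file system (?P<fs>\S+)"),
         "severity": "INFO", "text": "Message concerning file system {fs}"},
    ]

    def map_line(raw_line: str, source: str) -> list:
        """Map one raw log line into zero or more Log Central-style messages."""
        messages = []
        for rule in MAPPING_RULES:
            match = rule["pattern"].search(raw_line)
            if match:
                messages.append({
                    "source": source,
                    "severity": rule["severity"],
                    "text": rule["text"].format(**match.groupdict()),
                })
        return messages

    # Both rules match this syslog line, so two Log Central messages are produced.
    for msg in map_line("Jan 12 04:11:02 host1 kernel: file system /var is full", "syslog"):
        print(msg["severity"], msg["text"])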
For a list of all the procedures for setting up Log Central, refer to Chapter 2, "Getting Started."
Message Sender

The Message Sender reads incoming messages from its queue and forwards them to its primary Central Collector. If the primary Central Collector is not available, the Message Sender sends the messages to a secondary Central Collector, if one has been defined. There should be one Message Sender for each managed node.
Agent filters can be defined for the Message Sender; for example, a filter can select the messages that trigger SNMP trap notifications. Configuring filters is described in Chapter 6, "Host and Filter Configuration."
If none of the Central Collectors configured for this agent is accessible (due to a network outage, for example), the Message Sender writes messages retrieved from its queue to a temporary local file. When the Central Collector becomes available, the Message Sender automatically recovers all messages from the temporary file and forwards them to the Central Collector. During automatic recovery, new incoming messages have the highest priority; recovered messages are forwarded when the Message Sender is not busy with new messages.
Temporary files are automatically deleted once the messages have been forwarded to the Central Collector.
If the Message Sender is unable to save messages in a temporary file, the messages are discarded and an SNMP trap is generated.
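The following Python sketch, with stubbed send and trap helpers and an invented spool-file location, summarizes the forwarding behavior just described: try the primary Central Collector, fall back to the secondary if one is defined, spool to a temporary local file if neither is reachable, and discard the message and raise an SNMP trap if even the spool write fails.

    import json

    SPOOL_FILE = "/tmp/lc_spool"          # hypothetical location of the temporary file

    def send_to_collector(collector: str, message: dict) -> bool:
        """Stub: return True if this Central Collector accepted the message."""
        return False                      # pretend no Collector is reachable

    def send_trap(text: str) -> None:
        """Stub: a real agent would emit an SNMP trap here."""
        print("SNMP trap:", text)

    def forward(message: dict, primary: str, secondary: str = "") -> None:
        if send_to_collector(primary, message):
            return
        if secondary and send_to_collector(secondary, message):
            return
        # No Central Collector is reachable: keep the message in a temporary local
        # file so it can be recovered and forwarded automatically later.
        try:
            with open(SPOOL_FILE, "a") as spool:
                spool.write(json.dumps(message) + "\n")
        except OSError:
            # The message could not even be saved locally: discard it and raise a trap.
            send_trap("unable to save message in temporary file; message discarded")

    forward({"severity": "ERROR", "text": "userlog message"}, primary="collector1")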
For information on starting and stopping Log Central processes on the managed node, refer to Chapter 8, "Starting and Stopping Log Central."
Process Monitor

The Process Monitor (proc_monitor) is a daemon that runs on all managed nodes and on the central host. The Process Monitor is started whenever the start_messaging command is invoked on a particular machine. The Log Central processes running on that machine "register" with the Process Monitor at startup time.
The Process Monitor awakens at fixed intervals and checks all registered processes. If configured to do so, it restarts any dead processes. The Process Monitor restarts the processes with the user and group IDs that were passed to it at startup.
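A minimal sketch of that cycle, assuming a hypothetical registry of process names and pid files (this is not the proc_monitor implementation), might look like the following.

    import os
    import time

    CHECK_INTERVAL = 60            # hypothetical wake-up interval, in seconds
    RESTART_DEAD_PROCESSES = True  # corresponds to "if configured to do so"

    # Hypothetical registry of processes that "registered" at startup:
    # process name -> pid file recorded when the process started.
    registered = {"log_monitor_syslog": "/var/run/log_monitor_syslog.pid"}

    def is_alive(pidfile: str) -> bool:
        """Return True if the process recorded in the pid file is still running."""
        try:
            with open(pidfile) as f:
                pid = int(f.read().strip())
            os.kill(pid, 0)        # signal 0 only checks that the process exists
            return True
        except (OSError, ValueError):
            return False

    def monitor_once() -> None:
        for name, pidfile in registered.items():
            if RESTART_DEAD_PROCESSES and not is_alive(pidfile):
                # Placeholder: the real Process Monitor restarts the process,
                # using the user and group IDs passed to it at startup.
                print("restarting", name)

    if __name__ == "__main__":
        while True:                # awaken at fixed intervals and check all processes
            monitor_once()
            time.sleep(CHECK_INTERVAL)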
Central Collector

The Central Collector receives the messages forwarded by the distributed data collection agents, stores them in the Log Central relational database, and can generate SNMP trap notifications.
For fault tolerance, you can configure two Central Collectors with one serving as the backup or secondary collector in case the primary Central Collector goes down or is unavailable.
The Central Collector is made up of two processes, the Message Receiver and the Message Processor. The flow of information at the Central Collector is shown in Figure 1-3.

Figure 1-3 Log Central Components on the Central Host

Message Receiver
Messages from the distributed data collection agents arrive at the Message Receiver. The Message Receiver stores the incoming messages in an intermediate file. A new intermediate file is created every hour. You can control how frequently the intermediate files are deleted using the Log Central Console Storage Maintenance tool.
For information on using the Storage Maintenance tool, refer to Chapter 9, "Using the Log Central Console."
The Message Receiver generates an enterprise-specific SNMP trap if it cannot log messages to an intermediate file. The number of failures that triggers a trap is defined by the BEA_LC_TRAP_EVERY_FAILURES environment variable. If this environment variable is not set, a trap is generated every 100 failures. The SNMP trap has an enterprise-specific trap number of 90101.
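A minimal sketch of the Message Receiver behavior described above, assuming an invented spool directory and file-naming scheme (the environment variable, the default of 100 failures, and trap number 90101 come from the text), is shown below.

    import json
    import os
    import time

    SPOOL_DIR = "/var/spool/log_central"   # hypothetical directory for intermediate files
    TRAP_EVERY = int(os.environ.get("BEA_LC_TRAP_EVERY_FAILURES", "100"))
    ENTERPRISE_TRAP_NUMBER = 90101
    failure_count = 0

    def emit_trap(trap_number: int, text: str) -> None:
        """Stub: a real Message Receiver sends an enterprise-specific SNMP trap."""
        print("trap", trap_number, ":", text)

    def intermediate_file_name(now: float) -> str:
        # One file per hour; the naming scheme here is illustrative only.
        return os.path.join(SPOOL_DIR,
                            time.strftime("messages.%Y-%m-%d-%H", time.localtime(now)))

    def store(message: dict) -> None:
        """Append an incoming message to the intermediate file for the current hour."""
        global failure_count
        try:
            with open(intermediate_file_name(time.time()), "a") as f:
                f.write(json.dumps(message) + "\n")
        except OSError:
            failure_count += 1
            if failure_count % TRAP_EVERY == 0:        # default: every 100 failures
                emit_trap(ENTERPRISE_TRAP_NUMBER,
                          "cannot log messages to intermediate file")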
Message Processor

The Message Processor performs the following functions: