Oracle Intelligent Agent User's Guide
Release 9.0.2

Part Number A95412-01

B
Troubleshooting

This chapter covers generic troubleshooting strategies in the event your Intelligent Agent does not function properly. The following topics are discussed:

Troubleshooting the Intelligent Agent

Under most circumstances, the Intelligent Agent itself requires very little in the way of configuration. In order to function properly, however, the Agent must be able to communicate with the managing host and managed services. If you are familiar with Oracle and your operating system, using the following abbreviated checklists will likely solve problems that can interfere with Agent operation.


Important:

Because the Agent is continuously being improved from one release to the next, it is strongly recommended that you upgrade to the latest Agent available for your particular server release. Oftentimes, this will resolve problems you may encounter with earlier versions of the Agent.


Quick Checks

The following checklists cover the areas most likely to affect Agent operation. Agent troubleshooting checklists have been divided according to the two most common platforms on which the Agent is run: Windows NT and UNIX. The checklists are abbreviated and assume knowledge of Oracle, the operating system, and related communication protocols. Specific troubleshooting procedures are covered in detail later in this chapter.

Quick Checks for the Windows NT Agent

If you are running an Agent on a Windows NT system, use the following checklist.

  1. Make sure the Agent service is up by checking the OracleAgent service in your control panel. If the Agent did not start up, use any of the hints listed below.

  2. Check for messages written to the NT Event Viewer (under Administrative Tools) since this is where the NT Agent writes any problems associated with startup.

  3. Check if snmp_ro.ora, snmp_rw.ora, and services.ora are created by the Agent on startup. snmp_ro.ora and snmp_rw.ora are in the ORACLE_HOME\network\admin directory, and services.ora is in the ORACLE_HOME\network\agent directory.

    Compare the services listed with the services which are available on the machine. Please refer to Appendix A, "Agent Configuration Files" for valid sample files.

    If services are missing, check the following files for inconsistency or corruption:

    • listener.ora

    • tnsnames.ora

  4. Check that you do not have a system path set to external drives.

    The Agent is a service and runs by default as SYSTEM. It also needs DLLs from the ORACLE_HOME/BIN directory. If you need mapped drives in your path, you MUST NOT set them in the SYSTEM path.

    To set your own path:

    1. Move mapped drive paths out of SYSTEM path variables and into your own.

    2. Reboot to "unset" the system path.

  5. Check if you have TCP/IP installed. TCP/IP is a requirement.

  6. If you still do not know why the Agent did not start, trace the Agent.

    1. Set the following variables in snmp_rw.ora:

      dbsnmp.trace_level=admin (or 16 if you want maximum information)

      dbsnmp.trace_directory=<any directory in which the Oracle user has write privileges>

      dbsnmp.trace_file=<name of the trace output file>

    2. Restart the Agent.

    3. Check the log files located in the ORACLE_HOME\network\log directory.

      DBSNMP.LOG should show general Agent problems.

      DBSNMP.NOHUP should show any errors related to the Agent's "watchdog" dbsnmpwd process.

      DBSNMPCONFIG.LOG should show problems with auto-discovery.

  7. Ensure that the DNS Host entry is set to the node name in the listener.ora and tnsnames.ora files.

    1. Click Start -> Settings -> Control Panel -> Network -> Protocols -> TCP/IP Properties.

    2. Check the DNS Host entry.
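
The tracing parameters in step 6 can be written into snmp_rw.ora by hand. As a minimal sketch, assuming the file is a simple one-parameter-per-line text file, the following Python helper replaces any existing dbsnmp.trace_* lines with new values. The function name, the sample directory, and the sample file name are illustrative and do not come from the Agent itself.

```python
# Names of the three tracing parameters described in the checklist.
TRACE_KEYS = ("dbsnmp.trace_level", "dbsnmp.trace_directory", "dbsnmp.trace_file")


def with_trace_settings(text, level, directory, file_name):
    """Return snmp_rw.ora content with the dbsnmp trace parameters set.

    Existing dbsnmp.trace_* lines are dropped so that each parameter
    appears only once; all other lines are kept unchanged.
    """
    kept = [
        line
        for line in text.splitlines()
        if line.split("=", 1)[0].strip().lower() not in TRACE_KEYS
    ]
    kept += [
        f"dbsnmp.trace_level={level}",
        f"dbsnmp.trace_directory={directory}",
        f"dbsnmp.trace_file={file_name}",
    ]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical existing content and trace directory, for illustration only.
    current = "# existing entries\ndbsnmp.trace_level=off\n"
    print(with_trace_settings(current, "admin", r"C:\temp\agent_trace", "agent"))
```

Restart the Agent after changing the file so that the new settings are read.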

Quick Checks for UNIX Agents

If you are running an Agent on a UNIX system, use the following checklist.

  1. Check the Agent's status. Enter the command:

    agentctl status
    

    Alternatively, you can check to see if the Intelligent Agent is running by entering the following command:

    ps -eaf | grep dbsnmp
    

    If your Agent is running, the agentctl status command should display something similar to the following:

    DBSNMP for Solaris: Version 9.0.0.0.0 - Production on 04-NOV-01 18:44:15
    
    (c) Copyright 2001 Oracle Corporation.  All rights reserved.
    
    The db subagent is already running.
    

    These checks should show that a "dbsnmp" process is running and/or "dbsnmpwd" watchdog script is running.

  2. Check the ORACLE_HOME/network/log/dbsnmp*.log file for errors. (For discovery errors, check nmiconf.log.)

  3. Check that the Oracle user has write permissions to the ORACLE_HOME/network/log directory.

  4. Check snmp_ro.ora, snmp_rw.ora, and services.ora for the entries created by the Agent. snmp_ro.ora and snmp_rw.ora are in the ORACLE_HOME/network/admin directory, and services.ora is in the ORACLE_HOME/network/agent directory. Alternatively, you can check the directory pointed to by the TNS_ADMIN environment variable.

    Compare the services listed with the services which are available on the machine. Please refer to Appendix A, "Agent Configuration Files" for valid sample files.

    If services are missing, check the following files for inconsistency or corruption:

    • listener.ora

    • tnsnames.ora

    • oratab

  5. If you still do not know why the Agent did not start, trace the Agent by setting the following variables in snmp_rw.ora and then restarting the Agent.

    • dbsnmp.trace_level=admin (or 16 if you want more information)

    • dbsnmp.trace_directory=<any directory which the Oracle user can write to>

    • dbsnmp.trace_file=agent

  6. If you have problems running the Intelligent Agent control utility (agentctl), set tracing for agentctl as follows:

    • agentctl.trace_level=admin

    • agentctl.trace_directory

    • agentctl.trace_file

  7. If you have upgraded the database software and one of your machines is having problems with the generated snmp_ro.ora, snmp_rw.ora or services.ora file, follow the instructions below:

    1. Run catsnmp.sql under the INTERNAL or SYS account (NOT the dbsnmp account). Normally the catsnmp.sql script is run from catalog.sql upon database creation but since this is an upgrade, you may not have run this script yet. If the necessary scripts have not been run, the dbsnmp account is not created.

    2. If you have more than one SID or older SIDs referenced in the oratab file, run catsnmp.sql against each of the databases.

    3. The snmp_ro.ora file is a read-only file, which means that all changes to the file will be overwritten each time the Agent is started. You can make changes (if needed) to the snmp_rw.ora file.

    If you are trying to do backups, you must run backupts.sql with the dbsnmp/dbsnmp account.
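
The process check in step 1 can also be scripted. The sketch below parses text in the style of `ps -eaf` output and reports which lines belong to the "dbsnmp" process or the "dbsnmpwd" watchdog. The sample output, user name, and paths are hypothetical, and the code assumes the PID is the second column, as is usual for `ps -eaf`.

```python
def find_agent_processes(ps_output):
    """Return (pid, name) pairs for dbsnmp and dbsnmpwd lines in ps output."""
    found = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)  # UID, PID, rest of the line
        if len(fields) < 3 or not fields[1].isdigit():
            continue  # header line or something unexpected
        rest = fields[2]
        if "grep" in rest:
            continue  # skip the grep command itself
        if "dbsnmpwd" in rest:
            found.append((int(fields[1]), "dbsnmpwd"))
        elif "dbsnmp" in rest:
            found.append((int(fields[1]), "dbsnmp"))
    return found


if __name__ == "__main__":
    # Illustrative output only; real output differs by platform.
    sample = """\
     UID   PID  PPID  C    STIME TTY      TIME CMD
  oracle  2101     1  0 18:40:02 ?        0:00 /bin/sh /u01/oracle/bin/dbsnmpwd
  oracle  2107  2101  0 18:40:03 ?        0:01 dbsnmp
  oracle  2250  2190  0 18:44:15 pts/1    0:00 grep dbsnmp
"""
    for pid, name in find_agent_processes(sample):
        print(pid, name)
```

An empty result suggests that neither the Agent nor its watchdog is running.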


Warning:

Do not modify the Tcl scripts (job and event scripts written in Tool Command Language) that come with the Agent. If you want to submit a job different from the ones that are predefined with the Agent, use the TCL Job where you are allowed to pass in arbitrary scripts and have the Agent execute them.


Additional Checks

If after going through the quick checks your Intelligent Agent still is not functioning correctly, use the following section to cover other areas of Agent operation that are less probable causes of Agent operating problems. In addition, many of the steps in the checklists are covered in greater detail for those users who may be less familiar with Oracle and/or the operating system on which the Agent is running. The following questions are covered in this section:


Note:

You do not need to remove all ".q" files from the $ORACLE_HOME/network/agent directory in order to debug the Agent. Although this approach was recommended in the past, troubleshooting more recent versions of the Intelligent Agent no longer requires this action. There are exceptions to this rule, which will be pointed out later in the chapter.


Is TCP/IP configured and running correctly?

One of the most common problems that prevents the Agent from starting is TCP/IP configuration. To check whether your TCP/IP setup is configured correctly, issue the following commands at the command line:


Note:

To determine the hostname of a Windows NT system, type "hostname" at a command prompt.


Correcting TCP/IP configuration problems

  1. (Windows NT) Edit the WINNT\system32\drivers\etc\hosts and lmhosts files.

    If these files have never been used, only sample files will exist in the directory. Either rename or copy the .sam files to just the file name with no extension.

    (UNIX) Log in as root and edit the /etc/hosts file.

  2. Verify that the IP address and host information for each system are correct.

    Example: (Windows NT)

    (Replace the information in brackets with the actual host information for that system.)

    HOSTS file: 
            <122.111.111.111>   <hostname>
    
    LMHOSTS file: 
            <122.111.111.111>   <netbios name or hostname>  #PRE
    


Note:

You can also verify this information through the Windows NT Control Panel -> Network property sheet.


  3. Delete the $ORACLE_HOME\network\agent\*.q and services.ora files.


Note:

The *.q files contain information about current jobs and events. Do not delete these files without first removing all jobs and events registered against this Agent.


  4. Delete the $ORACLE_HOME\network\admin\snmp_ro.ora and $ORACLE_HOME\network\admin\snmp_rw.ora files.
    

  5. Restart the Agent.
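
The verification of the hosts file entries can be sketched in code. The following Python function reads text in hosts-file format and maps each name to its IP address, so that an entry can be compared with the address you expect. The function name and sample values are illustrative. Text after "#" is ignored, which also drops LMHOSTS keywords such as #PRE.

```python
def hosts_entries(text):
    """Map each hostname or alias (lower-cased) to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments and #PRE
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Keep the first address listed for a name.
            mapping.setdefault(name.lower(), ip)
    return mapping


if __name__ == "__main__":
    # Sample content modeled on the example above; values are placeholders.
    sample = "122.111.111.111   machine-pc  #PRE\n"
    entries = hosts_entries(sample)
    print(entries.get("machine-pc"))
```

If the lookup returns no address, or an address other than the one configured for the system, correct the file before restarting the Agent.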

Do the DNS Name and the Computer Name Match? (Windows NT)

Before Release 8.0.4 of the Agent, the NT Agent required the DNS Hostname and the Computer Name to be identical. These parameters can be checked/changed from the following Windows NT Control Panel property sheets.

To verify the computer name:

To verify the DNS Name:

Are the Oracle Net configuration files correct?

In addition to proper network configuration, which allows nodes in your network to communicate, components of your Oracle environment must also be able to communicate with each other. Oracle Net provides the session and data communication medium between client machines and Oracle servers, or between Oracle servers. For this reason, proper Oracle Net configuration is a prerequisite for Agent communication. This section covers the most common problems that can occur when Agent communication fails.

Oracle Net configuration files are found in $ORACLE_HOME\network\admin or $TNS_ADMIN (Windows NT), or in $ORACLE_HOME/network/admin (UNIX).

Primary configuration files are:

See Appendix A, "Agent Configuration Files" for information and examples of the above files.

TNS_ADMIN variable usage during Agent Discovery

(UNIX)

All versions of the Unix discovery script allow the use of the TNS_ADMIN variable to locate input files (listener.ora and tnsnames.ora). Only Agent versions 7.3.4 and above correctly write the output files (snmp_ro.ora and snmp_rw.ora) into TNS_ADMIN, if set.

(Windows NT)

Beginning with version 8.0.5, the discovery script also reads the TNS_ADMIN value from the NT Registry.

The Agent also uses the TNS alias information found in the listener.ora file. The Agent does so even within an Oracle Names environment. This behavior is intentional, since an Oracle Names server may be temporarily unavailable and the Agent needs to be able to resolve names at all times. Check the following to make sure the local translation of the TNS alias takes place:

  1. Verify that the listener.ora file contains the following for each instance:

    • Two IPC entries

    • One TCP entry

    Do not activate the listener on port 1748, since the Agent listens on this port. (This is the reason you can use TNSPING against the Agent; TNSPING cannot differentiate between a listener and an Agent.)

    The Agent requires IPC entries and TNS alias definitions on the server, in addition to alias definitions from the Console, to perform alias translations. Correct IPC entries and TNS alias definitions are essential for correct Agent/Console (V1) or Agent/Management Server (V2) communications.

  2. Ensure that the DNS Host entry is set to the node name in the listener.ora and tnsnames.ora files.

    1. From the Windows NT menu bar, click Start -> Settings -> Control Panel

    2. Double-click on the Network icon

    3. Click on the Protocols tab

    4. Select TCP/IP Protocol and click Properties.

    5. Check the DNS Host entry.

Is Oracle Net functioning properly?

If your Oracle Net configuration is correct and you are still unable to contact the Agent, the next step is to determine whether services in your Oracle Net network can be reached. You can use the TNSPING utility on each database you want to access by entering the following at the command prompt:

tnsping <network service name>

If you can connect successfully from a client to a server (or from a server to a server) using TNSPING, the command will return an estimate of the round trip time (in milliseconds) it takes to reach the Oracle Net service. This indicates Oracle Net is functioning properly.

Next, add the following alias (Agent debug entry) to the Console's tnsnames.ora file:

        agent_<sid>.world= 
           (DESCRIPTION = 
               (ADDRESS_LIST = 
                   (ADDRESS = 
                       (COMMUNITY =TCP.world) 
                       (PROTOCOL = TCP) 
                       (Host = <your-agent-hostname>) 
                       (Port = 1748) 
                   ) 
               ) 
           )

Then ping the Agent from the OEM console using:

tnsping agent_<sid>

or

tnsping80 agent_<sid> 

If the TNSPING command does not work, add the above alias to the Agent machine's tnsnames.ora file and try using TNSPING from the machine on which the Agent resides. Every Agent must be TNSPING-able using this alias.
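
When several Agents need a debug alias, the entry can be generated instead of typed. This is a minimal sketch that formats the alias shown above for a given SID and hostname. The function name is illustrative, and the ".world" domain and port 1748 are taken from the example in this section.

```python
def agent_debug_alias(sid, host, port=1748, domain="world"):
    """Return a tnsnames.ora entry that points at the Agent's port."""
    return (
        f"agent_{sid}.{domain} =\n"
        "  (DESCRIPTION =\n"
        "    (ADDRESS_LIST =\n"
        "      (ADDRESS =\n"
        f"        (COMMUNITY = TCP.{domain})\n"
        "        (PROTOCOL = TCP)\n"
        f"        (Host = {host})\n"
        f"        (Port = {port})\n"
        "      )\n"
        "    )\n"
        "  )\n"
    )


if __name__ == "__main__":
    # "ORCL" and "machine-pc" are placeholder values.
    print(agent_debug_alias("ORCL", "machine-pc"))
```

Append the generated text to tnsnames.ora and test it with tnsping as described above.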

Did the Agent startup successfully?

To check whether the Agent process is running, issue the following command:

agentctl status

If the Agent did not start up, use any of the hints listed in the following table:

Table B-1 Troubleshooting an Agent that Will Not Start
UNIX:

  • Check the $ORACLE_HOME/network/log/dbsnmp*.log file for errors.

  • Check the $ORACLE_HOME/network/log/nmiconf.log file for errors.

  • Check that the Oracle user has write permissions to the $ORACLE_HOME/network/log directory.

  • Check snmp_ro.ora, snmp_rw.ora, and services.ora for the entries created by the Agent. The snmp_ro.ora and snmp_rw.ora files are located in the $ORACLE_HOME/network/admin directory, and services.ora is in the $ORACLE_HOME/network/agent directory.

  • Compare the services listed with the services which are available on the machine. See Appendix A for valid sample files. If services are missing, check the following files for inconsistency or corruption:

    • listener.ora

    • tnsnames.ora

    • oratab

  • Check if you have TCP/IP installed. TCP/IP is a requirement. See Is TCP/IP configured and running correctly?

  • If you still do not know why the Agent did not start, turn on tracing. (See Tracing the Intelligent Agent.)

Windows NT:

  • Check for messages written to the NT Event Viewer (under Administrative Tools), since this is where the NT Agent writes any problems associated with startup.

  • Check the $ORACLE_HOME/network/log/nmiconf.log file for errors.

  • Check the properties of the Agent service to verify the OS account used by the Agent (the default is 'System'). Check that the Agent user has write permissions to the $ORACLE_HOME/network/log directory.

  • Check if snmp_ro.ora, snmp_rw.ora, and services.ora are created by the Agent on startup. The snmp_ro.ora and snmp_rw.ora files are located in the $ORACLE_HOME\network\admin directory, and services.ora is located in the $ORACLE_HOME\network\agent directory.

  • Compare the services listed with the services which are available on the machine. See Appendix A for valid sample files. If services are missing, check the following files for inconsistency or corruption:

    • listener.ora

    • tnsnames.ora

  • Check if you have TCP/IP installed. TCP/IP is a requirement. See Is TCP/IP configured and running correctly?

  • Check that you DO NOT have a system path variable containing external drives. The Agent is a service and runs by default as SYSTEM. It also needs DLLs from the $ORACLE_HOME/bin directory. If you need external mapped drives in your path, you MUST NOT set them in the SYSTEM path. To set your own path:

    1. Move external mapped drive paths out of the system path variable and into your own.

    2. Reboot to "unset" the system path.

  • If you still do not know why the Agent did not start, turn on tracing. For more information on setting up Agent tracing, see "Tracing the 9i Agent".

For both UNIX and Windows NT systems, check:

$ORACLE_HOME/network/log/dbsnmp.nohup

Did the Agent connect to ALL instances on its node?

To test whether an Agent can connect to the database(s) it monitors on a given node, try connecting to each database with the following connect string:

dbsnmp/dbsnmp@address_list 

You must perform this test on the node where the Agent resides.


Note:

Agents prior to 7.3.3 maintain two permanent connections to their local databases. Post-7.3.3 Agents maintain only one permanent connection.


Is the Agent running with the correct permissions? (UNIX)

To verify whether the Agent has the correct user permissions, see Installing the Intelligent Agent on page 2-2.

Does the OS user exist and does it have the correct permissions? (Windows NT)

An OS user needs to be specified for the node and must have the following permissions:

Are there errors?

(Windows NT) Check the NT EVENT VIEWER -> APPLICATIONS -> LOG for any errors starting the DBSNMP process.

(Windows NT and UNIX) Check the $ORACLE_HOME/network/log/nmiconf.log file for discovery errors.

For both UNIX and Windows NT systems check the following file for additional errors:

$ORACLE_HOME/network/log/dbsnmp.nohup 

Why doesn't the Agent send status notifications back to the Enterprise Manager Console even though the jobs have run?

Most likely the job does actually run, but the Agent is unable to contact the Console to send back notifications. Verify that hostname resolution can occur: the IP address and hostname of the Windows NT machine running the Console must be in the /etc/hosts file on the UNIX machine, or the hostname must be resolvable via DNS/NIS. Then retry the job.

To test the TCP/IP resolution, perform the following tests from a command prompt:

ping <hostname> 
ping <IPaddress>

If the server is running telnet or ftp services (UNIX):

telnet <hostname> 
ftp <hostname>

Since PING uses ICMP over IP and not TCP, it is a good way of determining whether the problem is in the packet routing.

To determine if the problem is actually with TCP, use the telnet or ftp utilities.

Be sure the name and IP address of the Enterprise Manager Console machine are in the /etc/hosts file on the Sun server; otherwise, the Agent is not able to return messages to the Console because it cannot resolve the name of the machine to an IP address.
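
Where telnet or ftp services are not available, name resolution and TCP connectivity can be checked with a short script. The sketch below uses only the Python standard library; the function names are illustrative, and the hostname and port you pass in depend on your own environment.

```python
import socket


def resolve(hostname):
    """Return the IP address for hostname, or None if it cannot be resolved."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # "localhost" is a placeholder; substitute the Console or Agent machine.
    name = "localhost"
    print("resolved to:", resolve(name))
    print("port 1748 reachable:", tcp_port_open(name, 1748))
```

A name that resolves but a port that cannot be reached points to a TCP-level or service-level problem rather than a name resolution problem.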

The default listening address (TNS format) is:

LISTENING ADDRESS = (ADDRESS=(PROTOCOL= TCP)(Host=machine_name)(Port=7770)) 

If a job stays in the scheduled status, delete it using the DEL key (repeatedly, if necessary) and then submit the job again. Sometimes it takes several submits until it starts up. A delay of up to a minute until a job starts is common, especially the first time an Agent tries to sync with the OEM console with old Agents (7.3.2).

Intelligent Agent Error Messages and Resolutions

The following error messages and resolutions are categorized by operating system. Situations that apply to all systems are listed under "Generic Agent."

Generic Agent

ORA-12163: 'TNS:connect descriptor is too long'

Copy the snmp.address.<host_name> parameter from your $ORACLE_HOME\network\admin\snmp_ro.ora file. Paste this address and parameter into your $ORACLE_HOME\network\admin\snmp_rw.ora file. In snmp_rw.ora, reduce the size of this connect string by removing the address entries for IPC. (NMP and SPX may also be removed.)

Shut down and restart the Agent. See the examples below.


Note:

The parameter snmp.address is no longer found in snmp_ro.ora starting with the 7.3.4/8.0.3 Agents. Therefore, you will have to use this example to add a new variable to your snmp_rw.ora.


EXAMPLES:

Entry to be copied out of snmp_ro.ora:

snmp.address.ORCL_MACHINE-PC = (DESCRIPTION=(ADDRESS_LIST=
    (ADDRESS=(PROTOCOL=IPC)(KEY=oracle.world))
    (ADDRESS=(PROTOCOL=IPC)(KEY=ORCL))
    (ADDRESS=(COMMUNITY=TCP.world)(Host=machine-pc)(PROTOCOL=TCP)(Port=1521))
    (ADDRESS=(COMMUNITY=TCP.world)(Host=machine-pc)(PROTOCOL=TCP)(Port=1526)))
    (CONNECT_DATA=(SID=ORCL)(SERVER=DEDICATED)))

Modified entry in snmp_rw.ora:

snmp.address.ORCL_machine-PC = (DESCRIPTION=(ADDRESS_LIST=
    (ADDRESS=(COMMUNITY=TCP.world)(Host=machine-pc)(PROTOCOL=TCP)(Port=1521))
    (ADDRESS=(COMMUNITY=TCP.world)(Host=machine-pc)(PROTOCOL=TCP)(Port=1526)))
    (CONNECT_DATA=(SID=ORCL)(SERVER=DEDICATED)))
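
The manual edit shown in the examples (removing the IPC address entries to shorten the connect descriptor) can be sketched as a small parenthesis-aware filter. This is an illustration only, not a tool supplied with the Agent; the function names are made up for this example, and the descriptor is assumed to have balanced parentheses.

```python
import re

ADDRESS_START = re.compile(r"\(\s*ADDRESS\s*=", re.IGNORECASE)
IPC_PROTOCOL = re.compile(r"\(\s*PROTOCOL\s*=\s*IPC\s*\)", re.IGNORECASE)


def matching_paren(text, start):
    """Return the index of the ')' that closes the '(' at text[start]."""
    depth = 0
    for index in range(start, len(text)):
        if text[index] == "(":
            depth += 1
        elif text[index] == ")":
            depth -= 1
            if depth == 0:
                return index
    raise ValueError("unbalanced parentheses in descriptor")


def remove_ipc_addresses(descriptor):
    """Drop every (ADDRESS=...) group whose protocol is IPC."""
    kept = []
    position = 0
    while position < len(descriptor):
        # ADDRESS_LIST does not match here because "=" must follow ADDRESS.
        if ADDRESS_START.match(descriptor, position):
            end = matching_paren(descriptor, position)
            group = descriptor[position:end + 1]
            if not IPC_PROTOCOL.search(group):
                kept.append(group)
            position = end + 1
        else:
            kept.append(descriptor[position])
            position += 1
    return "".join(kept)


if __name__ == "__main__":
    original = (
        "(DESCRIPTION=(ADDRESS_LIST="
        "(ADDRESS=(PROTOCOL=IPC)(KEY=ORCL))"
        "(ADDRESS=(COMMUNITY=TCP.world)(Host=machine-pc)(PROTOCOL=TCP)(Port=1521)))"
        "(CONNECT_DATA=(SID=ORCL)(SERVER=DEDICATED)))"
    )
    print(remove_ipc_addresses(original))
```

The same approach could be extended to NMP or SPX addresses by adjusting the protocol pattern.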

TNS-12542: 'TNS:address already in use'

This is actually an Oracle Net Listener error.

The following is documented in the 8.0.3.0.0 Intel NT release notes for the Oracle Net Listener. When a client connects to an Oracle8 server in dedicated server mode, WINSOCK2 Shared Sockets feature is used so that the client connection is routed from the listener to the database server. This feature improves the connection time, because the client does not need to close the socket connection with the listener and establish a new connection with the database server.

With the use of Shared Sockets, threads also use the same port as the listener. If you shut down the listener and try to start it up again for the same port, the listener does not start up if the port is in use due to any open connections with the database. Ensure that no client is connected to the database before starting up the listener. Note that if you are using a listener with a different port number you are able to start it up.


Warning:

Do not bring down the listener when any clients are connected to the database. If you need to listen for a new database, modify the listener.ora configuration file, and issue the reload command from the Listener Control Utility LSNRCTL80.


See Oracle Networking Products Getting Started for Windows Platforms for more information about the listener.ora file and the LSNRCTL80 utility. Oracle Corporation attempted to overcome the restriction by using the WINSOCK2 option to allow the re-use of a port, but the option does not work reliably. Oracle Corporation is currently working with Microsoft Corporation to resolve this issue.

For additional information about the reload command, see the Oracle Net Administrator's Guide.

VOC-04816 'Invalid Destination'

While submitting a job, validation fails with "failed to find address for Agent_node", followed by the VOC-04816 Invalid Destination error. This might also be caused by an invalid address in the tnsnames.ora located on the console.

Upgrade your Agent to 7.3.3 or later.

Verify that your SQL*Net configuration files are correct.

'Failed to authenticate user' error when running a job

In order for the Agent to execute jobs on a managed node, the following conditions must be met:

'Login denied', 'Invalid username/password' messages in trace files

This usually happens if you have a database prior to 7.3.3 on the machine. From V7.3.3 onwards, a script called CATSNMP.SQL is included in the CATALOG.SQL dictionary script. This script is responsible for creating the DBSNMP user the Agent needs to connect. Older databases did not have this script yet.

Verify that the user 'DBSNMP' exists. If not, run the catsnmp.sql script.

'ORACLE_HOME does not exist' when starting the Agent

This message comes from the discovery script, nmiconf.tcl. Make sure you have $ORACLE_HOME environment variable set to the ORACLE_HOME of the Agent and re-start the Agent.

The Agent is only finding one database on a certain node

If you have more than one database on a single node, then you need to make sure that each instance has a unique GLOBAL_DBNAME in the listener.ora. You may have to define this manually in the listener.ora.
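
A quick way to spot a repeated GLOBAL_DBNAME is to scan the listener.ora text for the parameter and count its values. The sketch below does this with a regular expression; the function name and the sample SID_LIST fragment are illustrative, and the scan does not skip commented-out lines.

```python
import re
from collections import Counter


def duplicate_global_dbnames(listener_ora_text):
    """Return GLOBAL_DBNAME values (lower-cased) that occur more than once."""
    names = re.findall(
        r"GLOBAL_DBNAME\s*=\s*([^\s()]+)", listener_ora_text, flags=re.IGNORECASE
    )
    counts = Counter(name.lower() for name in names)
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    # Hypothetical fragment with the same GLOBAL_DBNAME used for two SIDs.
    sample = """
    SID_LIST_LISTENER =
      (SID_LIST =
        (SID_DESC = (GLOBAL_DBNAME = orcl.world) (SID_NAME = ORCL))
        (SID_DESC = (GLOBAL_DBNAME = orcl.world) (SID_NAME = TEST))
      )
    """
    print(duplicate_global_dbnames(sample))
```

Any name the function returns should be made unique for each instance.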

No snmp_ro.ora and snmp_rw.ora are generated.

This error can occur if the Agent cannot write to $ORACLE_HOME\network\admin. Refer to the $ORACLE_HOME\network\log\nmiconf.log for errors. For more information on Agent startup problems, see "Did the Agent startup successfully?".

Not all services are discovered.

Check the services.ora file to determine which services have been discovered.

All the services the Agent finds on a machine must be defined in the relevant SQL*Net/Oracle Net configuration files. If the service(s) are not defined, service discovery will fail and, in the worst case, the Agent will hang or return errors.

For the remaining databases, check the oratab file and the SQL*Net/Oracle Net files to see if these files exist and that all definitions are present. Make sure that all of the databases are listed in the listener.ora file. For more information, see "Are the Oracle Net configuration files correct?" and "Is Oracle Net functioning properly?".

'Invalid service name' or 'File operation error' while registering a job or event.

This error is usually seen when the services on the console and the services discovered by the Agent are out of sync. For example, if you have an event registered against TESTDB and someone changes the name of the database to PRODDB, the Agent and Console are out of sync.

To fix this, start by removing all job and event registrations from this service and dropping the node where the services exist from the console. Rediscover the node from the console using the auto-discovery wizard.

NOTE: With 7.3.2, the aliases are case-sensitive.

If you have an NT Agent, please refer to 'Invalid service name' while registering a job or event, under "NT Agent."

'Transport read error' or 'Transport write error' messages

This indicates a problem with the TCP/IP layer. The most likely cause is that the IP address and the hostname do not reference the same physical machine.

Verify that TCP/IP is configured and running correctly. (See Is TCP/IP configured and running correctly?)

'Oralogin failed in orlon'

You may receive this error while executing a TCL script using the oratcl verb oralogon through the Software Developer's Kit. "Oralogin failed in orlon" means that the connect string is either wrong or for some reason, the account used cannot logon to the database.

NT Agent

For any NT Operating System Error when starting the Agent

If you see an OS error when starting the Agent, check to see whether it is an actual Agent error as described in snmimsg.mc. Due to one of the Windows APIs not working as documented, the Agent fails to print out the real cause of the error.

Use the Event Viewer in the Administrative Tools group of Windows NT. You should find the true cause of the problem documented. The source for Agent errors is listed under the service name "dbsnmp". Highlight the most recent dbsnmp entry in the list. Double-click on the event to get the actual results.

To debug the Agent after you have received an OS error, follow these steps:

'Failed to connect to Agent' error.
(Jobs that remain in submitted status)

There are in fact two hostname definitions on NT. One is the NetBios name, used for NT's internal Named Pipes protocol, which is always installed. The other is the TCP/IP hostname, which is configurable only when you install TCP/IP on NT.

To find the NT NetBios hostname:

To find the TCP/IP hostname:

On an NT server, you can 'ping' the two names, even if they are configured differently. Other clients, however, only 'ping' real TCP/IP hostnames. If the Agent is using local IPC connections, it uses Named Pipes and therefore the NetBios name, while all external connections use the TCP/IP name.

A mismatch in these names leads to 'unable to contact Agent', or forever pending jobs in the console. Therefore, make sure that the NetBios and the TCP/IP hostname are identical.

Receive the error failed -> 'output from job lost' while running job.

The Windows NT user that you created for the Agent (see Agent Configuration, Configuration Guide) needs read/write permissions to the $ORACLE_HOME\network\agent directory (and TEMP directory, for some applications) and read permissions to the SYSTEM32 directory.

Verify that the NT user has these permissions.

Agent finds no services after discovery

This problem has been fixed for Agent versions 7.3.4 and higher. For Agent versions 7.3.3 and lower, the following workaround can be used.

Check the listener.ora file, and make sure that no $ORACLE_HOME parameter is specified in the SID_LIST section. Specifying an $ORACLE_HOME in the SID_LIST section prevents the Agent from finding the requisite files for service discovery.

'Invalid service name' while registering a job or event.

If you have an 8.0.4 Agent and a default domain other than ".world", you may experience this problem. The Agent tries to append ".world" to the database name during discovery. For example, if your default domain is nl.oracle.com and you define your GLOBAL_DBNAME = database.nl.oracle.com, the Agent defines the database name to be database.nl.oracle.com.world. This problem only occurs when the Agent and Console reside on the same machine (they share the same configuration files).

The workaround is to append ".world" to all services that do not currently have a specified domain.
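
As a minimal sketch of the workaround, the following helper leaves a name that already carries a domain untouched and appends ".world" otherwise. The function name is illustrative, and treating any name that contains a dot as "having a domain" is an assumption made for this example.

```python
def with_default_domain(service_name, domain="world"):
    """Append the default domain to a service name that has no domain."""
    if "." in service_name:
        return service_name  # a domain is already specified
    return f"{service_name}.{domain}"


if __name__ == "__main__":
    for name in ("ORCL", "database.nl.oracle.com"):
        print(name, "->", with_default_domain(name))
```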

UNIX Agent

Discovery fails with no services at all

First check that all of the SQL*Net files are present and correctly defined. You can then debug discovery by editing your oratab file so that it contains only a valid SID with a listener running. After you get this working, you can add the remaining entries in the oratab file to see which entry is causing the problem.

Check the $ORACLE_HOME/network/log/nmiconf.log files for errors.

NMS-0308 : 'Failed to listen on address : another Agent may be running'.

There are two possible causes for this error:

  1. If two Agents are installed on a machine, in two different ORACLE_HOME directories, you see this message when you try to start the second Agent. This is because both Agents try to listen on the same default port, 1748.

    Run only one Agent on a machine.

  2. Port 1748, where the Agent listens, is being used by another process, or is not being released by dead processes that were formerly using it (unfortunately a common problem on Sun systems).

To confirm that the port is being used by another process:

  1. Use this command in UNIX

    netstat -a | grep 1748 
    
                     ^---- this is port # 
    

    If any line of the output ends in "LISTENING", the port is in use.

  2. If the following is true :

    • netstat -a | grep 1748 ---> results in "LISTENING"

    • agentctl status agent ---> results in "The db subagent is not started."

    Then do this.

    1. ps -ef | grep dbsnmp

    2. kill -9 ______ (fill in process numbers)

    3. restart Agent with agentctl start agent

  3. If the Agent still fails to start, go through the steps again, but before restarting the Agent, do the following:

    1. cd $ORACLE_HOME/network/agent

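    2. Remove the *.q and services.ora files from this directory, and remove the snmp_ro.ora and snmp_rw.ora files from the $ORACLE_HOME/network/admin directory.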

    3. restart Agent with agentctl start agent

    This will re-start the Agent and remove all of the job and event queues it was using in the past.

    If all else fails, re-booting the machine will free up the port.
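
The netstat check above can be sketched as a small parser. The function below looks for a line whose state is LISTEN or LISTENING and whose address ends in the given port. The sample lines are illustrative, since netstat output differs between platforms, and the function name is not part of any Oracle utility.

```python
import re


def port_is_listening(netstat_output, port=1748):
    """Return True if any listening socket in netstat output uses the port."""
    address = re.compile(rf"[.:]{port}$")  # matches "*.1748" or "0.0.0.0:1748"
    for line in netstat_output.splitlines():
        tokens = line.split()
        if not tokens or tokens[-1].upper() not in ("LISTEN", "LISTENING"):
            continue
        if any(address.search(token) for token in tokens[:-1]):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical output lines in two common layouts.
    sample = """\
      *.1521               *.*                0      0 49152      0 LISTEN
      *.1748               *.*                0      0 49152      0 LISTEN
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
"""
    print(port_is_listening(sample))
```

If the port is listening while agentctl reports that the Agent is not started, continue with the process cleanup steps above.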

NMS-001 while starting the Agent

This message indicates that the SNMP Master Agent (the process on UNIX that controls the SNMP protocol) could not be contacted. By default the Agent listens and works over SQL*Net, but the Agent can also work over SNMP on UNIX systems.

This message can safely be ignored unless you are trying to communicate with a Master Agent.

NMS-00207 Agent xxxx user account is locked for database yyyy

Events registered with the Agent for monitoring a "seed" database of version 9.0.0.0 will not work because, by default, the Agent's database account "dbsnmp" is locked when the seed database is created. A "seed" database is a sample database that is created when the user performs a "typical" Oracle Server installation.

Under these conditions, an Enterprise Manager database up-down event will always indicate that the seed database is down. The Agent's log file dbsnmp.log will contain an NMS-00207 error message indicating that the dbsnmp user account for the seed database is locked.

To resolve this problem, you must log in to the seed database and perform the following:

  1. Unlock the "dbsnmp" account by running the SQL statement:

    ALTER USER dbsnmp ACCOUNT UNLOCK;
    
  2. Reset the password for the dbsnmp account by running the SQL statement:

    ALTER USER dbsnmp IDENTIFIED BY <password>;
    
  3. Add the reset password to the Agent configuration file snmp_rw.ora as follows:

    SNMP.CONNECT.<service_name>.PASSWORD=<password>
    

    where service_name is the name of the seed database as discovered by the Agent in snmp_ro.ora/snmp_rw.ora.

  4. Stop and start the Agent using agentctl.
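Step 3 can be sketched as a shell helper that replaces any existing password entry for the service instead of appending a second one. The function name is hypothetical, and the service name and file path in the usage lines are placeholders. The sketch assumes the Agent is stopped while the file is edited.

```shell
# Set SNMP.CONNECT.<service>.PASSWORD in the given snmp_rw.ora file.
# Any existing line for the same key is dropped before the new one is added.
set_agent_password() {
    file=$1 service=$2 password=$3
    key="SNMP.CONNECT.${service}.PASSWORD"
    tmp="${file}.tmp.$$"
    # Keep every line that does not start with the key (literal prefix match).
    awk -v key="$key" 'index($0, key) != 1 { print }' "$file" > "$tmp" || return 1
    printf '%s=%s\n' "$key" "$password" >> "$tmp"
    mv "$tmp" "$file"
}

# Usage (service name and password are placeholders):
#   set_agent_password "$ORACLE_HOME/network/admin/snmp_rw.ora" <service_name> <password>
```

The literal prefix match avoids treating the dots in the parameter name as regular-expression wildcards.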

Run the catsnmp.sql script for that database with either the SYS or INTERNAL accounts.

NMS-205 while starting the Agent

The 'dbsnmp' user could not be located.

Run the catsnmp.sql script for that database with either the SYS or INTERNAL accounts.

NMS-351 while starting the Agent

This happens if there are mismatches between the IDs in the '*.q' files in the $ORACLE_HOME/network/agent directory. Delete all the '*.q' files in the $ORACLE_HOME/network/agent directory, rebuild your repository, and then restart the Agent.
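The queue-file cleanup can be sketched as follows. The function name is hypothetical, and the sketch assumes the Agent has been stopped first. It covers only the deletion step, not the repository rebuild or the restart.

```shell
# Remove all *.q files in the given directory and print how many were removed.
remove_queue_files() {
    dir=$1
    count=0
    for f in "$dir"/*.q; do
        [ -f "$f" ] || continue      # the glob matched nothing
        rm -f "$f" && count=$((count + 1))
    done
    echo "$count"
}

# Usage:
#   remove_queue_files "$ORACLE_HOME/network/agent"
```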

Tracing the 9i Agent

Beginning with release 7.3.3, the Agent reads information from the snmp_ro.ora and snmp_rw.ora files in the $ORACLE_HOME/network/admin directory.


Note: These files only exist after you have started the Agent the first time. If you want to trace the Agent the first time it is started, you can manually create a new file called snmp_rw.ora and add the trace parameters to this file. Otherwise, start the Agent and then modify the snmp_rw.ora file to add the trace information and restart the Agent.


Example of modifications of the snmp_rw.ora file:

DBSNMP.TRACE_LEVEL = (OFF | USER | ADMIN | 16 )

The DBSNMP.TRACE_LEVEL settings mirror those used for SQL*Net.

Optional:

DBSNMP.TRACE_FILE = agent                (default: dbsnmp.trc)
DBSNMP.TRACE_DIRECTORY = /private/temp   (default: $ORACLE_HOME/network/trace)

(Any existing directory where the Agent has write permissions)
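As an illustration, the trace level could be added to snmp_rw.ora with a small shell helper that accepts only the four values listed above. The function name is hypothetical, and the sketch appends to the file (creating it if it does not exist), so check for an existing DBSNMP.TRACE_LEVEL line first.

```shell
# Append a DBSNMP.TRACE_LEVEL line to the given file.
# Only the values documented above are accepted.
add_trace_level() {
    file=$1 level=$2
    case $level in
        OFF|USER|ADMIN|16) ;;
        *) echo "unsupported trace level: $level" >&2; return 1 ;;
    esac
    printf 'DBSNMP.TRACE_LEVEL = %s\n' "$level" >> "$file"
}

# Usage, followed by an Agent restart so the setting is read:
#   add_trace_level "$ORACLE_HOME/network/admin/snmp_rw.ora" ADMIN
```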


Note:

Because the Data Gatherer functionality has been integrated with the 9i Agent, data collection-based tracing can be turned on as follows:

1. setenv VP_DEBUG 1

2. Then start the agent using agentctl start agent

Any collection activity will be logged in

$ORACLE_HOME/network/log/dbsnmp.nohup.
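The setenv command shown above is C shell syntax. For Bourne-style shells, one possible equivalent is a wrapper that sets VP_DEBUG only for the command being started. The wrapper name is hypothetical.

```shell
# Run any command with VP_DEBUG=1 in its environment, without changing
# the environment of the calling shell.
with_vp_debug() {
    env VP_DEBUG=1 "$@"
}

# Usage:
#   with_vp_debug agentctl start agent
```

Limiting the variable to the one command avoids leaving collection tracing switched on for later Agent restarts from the same session.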


The log file, $ORACLE_HOME/network/log/dbsnmp.log, is written by the Agent on every startup, even if tracing is not turned on. It contains the name and version of the Agent and the name and location of the Agent's configuration files. If tracing is turned on, it also contains problems encountered with the database and listener connections.

The log file, $ORACLE_HOME/network/log/nmiconf.log, is created on the first startup of the Agent and appended to on every startup after that. Auto discovery is done by the Tcl script nmiconf.tcl (hence the log file name). This file is written to only during startup.

$ORACLE_HOME/agentbin/ORATCLSH is a special-purpose Tcl shell that supports all standard Tcl verbs (supported in TCL75.dll) plus a large subset (not all) of the ORATCL verbs supported by the OEM Agent. ORATCLSH is not a general-purpose utility and may only be used in combination with the OEM Agent, because it depends on files and data structures maintained by the OEM Agent.

There is no documentation of ORATCLSH and it has never been part of the supported feature set of the OEM Agent. It is provided strictly as a debugging tool to help Oracle customers and developers develop and debug OEM job and event Tcl scripts. Before executing ORATCLSH, set the environment variable TCL_LIBRARY to point to $ORACLE_HOME/network/agent/tcl, the location of the init.tcl file.
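Setting TCL_LIBRARY can be sketched as a shell function that first confirms init.tcl is actually present. The function name is hypothetical, and the sketch does not start ORATCLSH itself.

```shell
# Export TCL_LIBRARY for the given Oracle home, but only if init.tcl exists
# in the expected network/agent/tcl directory.
set_tcl_library() {
    candidate=$1/network/agent/tcl
    if [ -f "$candidate/init.tcl" ]; then
        TCL_LIBRARY=$candidate
        export TCL_LIBRARY
    else
        echo "init.tcl not found in $candidate" >&2
        return 1
    fi
}

# Usage (call directly, not in a subshell, so the export persists):
#   set_tcl_library "$ORACLE_HOME"
```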

Tracing TCL

You may also turn Tcl tracing on by setting the environment variable ORATCL_DEBUG and turning tracing on in the snmp_rw.ora file. ORATCL_DEBUG must be set to the $ORACLE_HOME/network/trace directory. You must shut down and restart the Agent for these settings to take effect. Tcl tracing creates a file, oratcl.trc, in that directory. Every time an event is run, an entry is added to the oratcl.trc file.


Copyright © 2002 Oracle Corporation.

All Rights Reserved.