Monitoring for Administration (Admin) Nodes

The Administrative (Admin) Node is a process running in the Storage Node that is used to configure, deploy, monitor, and change store components. The Admin Node executes the commands issued from the Administrative Command Line Interface (CLI). For more information, see Administration in the Concepts Guide.

See the following section:

Metrics for Admin Nodes

The following metrics are accessible through JMX for monitoring Administrative Nodes in the Oracle NoSQL Database cluster.

  • adminId The unique ID for the Admin Node.

  • adminServiceStatus The status of the administrative service. It can be one of the following:

    • unreachable(0) The Admin Node is unreachable. This can be due to a network error, or the Admin Node may be down.

    • starting(1) The Admin Node agent is booting up.

    • waitingForDeploy(2) The Admin Node is a bootstrap admin that has not been configured, that is, it has not been given a store name. Configuring the admin triggers the creation of the Admin database and changes its status from "WAITING_FOR_DEPLOY" to "RUNNING".

    • running(3) The Admin Node is running.

    • stopping(4) The Admin Node is in the process of shutting down.

    • stopped(5) An intentional clean shutdown of the Admin Node.

    • errorRestarting(6) The Storage Node tried to start the admin several times without success and gave up.

    • errorNoRestart(7) The service is in an error state and will not be restarted automatically. Administrative intervention is required. Start by looking for SEVERE entries in both the service's log file and the log file of the Storage Node Agent (SNA) controlling the failed service. For an Admin Node, the service's log file is the Admin log:

      <kvroot>/<storename>/log/admin*_*.log

      where <kvroot> and <storename> are user-supplied values and * represents a number in the log file name.

      The values of <kvroot> and <storename> are different for every installation. Similarly, the SNA log files are located at:

      <kvroot>/<storename>/log/sn*_*.log
      Examples of SN log file names are sn1_0.log and sn1_1.log.

      You can search these log files for the SEVERE keyword and read the matching messages to fix the errors yourself, or you may need help from Oracle NoSQL Database support. The action to take depends on the nature of the failure and can range from explicitly stopping and restarting the service (easy) to replacing the service instance entirely (slow and more difficult). The issue can be any of the following:

      • Resource issue – A necessary resource, for example, disk space, memory, or network, is not available.

      • Configuration problem – A configuration-related issue that needs to be fixed.

      • Software bug – A bug in the code that needs attention from Oracle NoSQL Database support.

      • On-disk corruption – Something in persistent storage has been corrupted.

      In the rare case that you discover disk corruption, you must get help from Oracle NoSQL Database support.

    • expectedRestarting(9) The Admin Node is executing an expected restart, because some plan CLI commands cause a component to restart. This is different from errorRestarting(6), which is a restart after encountering an error.

  • adminLogFileCount A logging config parameter that represents the maximum number of log files that are retained by the Admin Node. Users can change the value of this parameter, and also the adminLogFileLimit parameter, if they want to reduce the amount of disk space used by debug log files. Note that reducing the amount of debug log data saved may make it harder to debug problems if debug information is deleted before the problem is noticed. For more information on adminLogFileCount, see Admin Parameters and Admin Restart.

  • adminLogFileLimit A logging config parameter that represents the maximum size of a single log file in bytes. For more information on adminLogFileLimit, see Admin Parameters and Admin Restart.

  • adminPollPeriod The interval at which the Admin polls agents (Replication Nodes and Storage Node Agents) for statistics. This polling collects service status changes, performance metrics, and log messages. The period is reported in milliseconds.

  • adminEventExpiryAge The length of time that critical events are kept in the admin database. This value is reported in hours.

  • adminIsMaster A Boolean value that indicates whether this Admin Node is the master node for the admin group.
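The adminServiceStatus codes listed above can be turned into readable labels in a monitoring script. The following is a minimal sketch in Python, not part of Oracle NoSQL Database: it assumes the numeric value has already been read from JMX by some other means, the helper names are hypothetical, and the choice of which codes "need attention" is an illustrative assumption rather than product behavior.

```python
# Status codes and names as listed for adminServiceStatus.
# Code 8 is not listed, so it is treated as unknown here.
ADMIN_SERVICE_STATUS = {
    0: "unreachable",
    1: "starting",
    2: "waitingForDeploy",
    3: "running",
    4: "stopping",
    5: "stopped",
    6: "errorRestarting",
    7: "errorNoRestart",
    9: "expectedRestarting",
}

# Hypothetical grouping chosen for this sketch: codes an operator
# would probably want to be alerted about.
NEEDS_ATTENTION = {0, 6, 7}


def describe_status(code: int) -> str:
    """Return a label such as 'running(3)', or 'unknown(<code>)'."""
    name = ADMIN_SERVICE_STATUS.get(code)
    if name is None:
        return f"unknown({code})"
    return f"{name}({code})"


def needs_attention(code: int) -> bool:
    """True for the alerting codes above and for any unlisted code."""
    return code in NEEDS_ATTENTION or code not in ADMIN_SERVICE_STATUS


if __name__ == "__main__":
    for code in (3, 7, 9):
        print(describe_status(code), needs_attention(code))
```

Treating unlisted codes as needing attention is a conservative choice for this sketch, so that an unexpected value is surfaced rather than silently ignored.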
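The errorNoRestart(7) guidance above suggests searching the Admin and SNA log files for SEVERE entries. A rough Python sketch of that search follows. It is an illustration only: the function name find_severe_entries is hypothetical, the glob patterns simply mirror the <kvroot>/<storename>/log/admin*_*.log and sn*_*.log paths given above, and the kvroot and storename arguments must be supplied for the installation in question.

```python
import glob
import os


def find_severe_entries(kvroot, storename, prefixes=("admin", "sn")):
    """Collect (path, line_number, text) for every line containing SEVERE.

    Looks in <kvroot>/<storename>/log for files matching
    <prefix>*_*.log, for example admin1_0.log or sn1_0.log.
    """
    hits = []
    for prefix in prefixes:
        pattern = os.path.join(kvroot, storename, "log", f"{prefix}*_*.log")
        for path in sorted(glob.glob(pattern)):
            # errors="replace" keeps the scan going if a log file
            # contains bytes that are not valid UTF-8.
            with open(path, encoding="utf-8", errors="replace") as log:
                for line_number, line in enumerate(log, start=1):
                    if "SEVERE" in line:
                        hits.append((path, line_number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Placeholder values; replace with the installation's own
    # kvroot directory and store name.
    for path, line_number, text in find_severe_entries("/path/to/kvroot", "mystore"):
        print(f"{path}:{line_number}: {text}")
```

The sketch only locates the messages. Deciding whether the cause is a resource issue, a configuration problem, a software bug, or on-disk corruption still requires reading them.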
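Because adminLogFileCount is the maximum number of retained log files and adminLogFileLimit is the maximum size of one file in bytes, their product gives an approximate upper bound on the disk space the Admin debug logs can occupy. The Python sketch below shows the arithmetic. The function name is hypothetical and the sample values are made up for illustration; they are not the product defaults.

```python
def max_log_disk_bytes(admin_log_file_count: int, admin_log_file_limit: int) -> int:
    """Approximate upper bound, in bytes, on space used by Admin log files.

    admin_log_file_count: maximum number of log files retained.
    admin_log_file_limit: maximum size of a single log file, in bytes.
    """
    if admin_log_file_count < 0 or admin_log_file_limit < 0:
        raise ValueError("count and limit must be non-negative")
    return admin_log_file_count * admin_log_file_limit


if __name__ == "__main__":
    # Illustrative values only: 20 files of at most 4,000,000 bytes each.
    total = max_log_disk_bytes(20, 4_000_000)
    print(f"{total} bytes, about {total / 1_000_000:.0f} MB")
```

Lowering either parameter lowers this bound, at the cost noted above: less debug history is available when a problem is investigated later.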