Modifying Storage Node HA Port Ranges

When you initially configured your installation, you defined a range of ports for the nodes to use when communicating among themselves. (You did this in Installation Configuration Parameters.) This range of ports is called the HA port range, where HA stands for High Availability, and indicates your store’s replication factor.

If you inadvertently used invalid values for the HA port range, you cannot deploy a Replication Node (RN) or a secondary Administration process (Admin) on any Storage Node. You will discover the problem when you first attempt to deploy a store with a Replication Node. The following symptoms indicate that the Replication Node did not come up on the Storage Node:

  • The Admin logs include an error that the Replication Node is in the ERROR_RESTARTING state. After a number of retries, the state changes to ERROR_NO_RESTART. You can find the Replication Node state in the ping command output.

  • The plan enters an ERROR state. Running the CLI's show plan <planID> command to get more history details returns an error message like this:

    Attempt 1
            state: ERROR
            start time: 10-03-11 22:06:12
            end time: 10-03-11 22:08:12
            DeployOneRepNode of rg1-rn3 on sn3/farley:5200 [RUNNING]
            failed. ....  Failed to attach to RepNodeService for rg1-rn3,
            see log, /KVRT3/<storename>/log/rg1-rn3*.log, on host
            farley for more information.

  • The critical events mechanism, accessible through the Admin CLI show events command, includes an alert containing the same error information from the plan history.

  • The store’s runtime or boot logs for the Storage Node and/or Admin show a port-specific error message, such as:

    [rg1-rn3] Process exiting
    java.lang.IllegalArgumentException: Port number 1 is invalid because
    the port must be outside the range of "well known" ports
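
Because an invalid range only surfaces when you first attempt a deployment, it can help to sanity-check the values beforehand. The following is a minimal sketch and not part of Oracle NoSQL Database: the function name check_ha_range is hypothetical, the range is assumed to be written as start,end, and the "well known" ports are assumed to be 0 through 1023, following the usual IANA convention.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check an HA port range before using it in a deployment.
# Assumptions (not taken from the product documentation):
#   - the range is written as "start,end"
#   - "well known" ports are 0-1023 (IANA convention)

check_ha_range() {
    local range="$1" start end

    # Require exactly two decimal numbers separated by a comma.
    if [[ ! "$range" =~ ^([0-9]{1,5}),([0-9]{1,5})$ ]]; then
        echo "invalid: expected start,end"
        return 1
    fi

    # Force base 10 so that leading zeros are not read as octal.
    start=$((10#${BASH_REMATCH[1]}))
    end=$((10#${BASH_REMATCH[2]}))

    if (( start <= 1023 || end <= 1023 )); then
        echo "invalid: ports must be above the well-known range (0-1023)"
        return 1
    fi
    if (( start > 65535 || end > 65535 )); then
        echo "invalid: ports must not exceed 65535"
        return 1
    fi
    if (( start > end )); then
        echo "invalid: start port is greater than end port"
        return 1
    fi

    echo "ok: $((end - start + 1)) ports available"
}
```

For example, check_ha_range 5010,5020 reports a valid range of 11 ports, while check_ha_range 1,5 is rejected because both ports fall inside the well-known range.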
    

You can correct invalid HA port ranges in a configuration by completing the following steps. Steps that must be executed on the physical node hosting the Oracle NoSQL Database Storage Node begin with the directive On the Storage Node. You can execute the other steps from any node that can access the Admin CLI.

  1. Using the Admin CLI, cancel the plan deploy-sn or plan deploy-admin command that includes the invalid HA port range values.

  2. On the Storage Node, kill the existing, incorrectly configured StorageNodeAgentImpl process and all of its managed processes. You can distinguish managed processes from other processes because their command lines include the parameter -root <KVROOT>.

  3. On the Storage Node, remove all files from the KVROOT directory.

  4. On the Storage Node, recreate the storage node bootstrap configuration file in the KVROOT directory. For directions, see Installation Configuration Parameters.

  5. On the Storage Node, restart the storage node using this Java command:

    java -Xmx64m -Xms64m -jar KVHOME/lib/kvstore.jar restart

  6. Using the Admin CLI, create and execute a deploy-sn or deploy-admin plan, using the same parameters as the initial plan, but with the correct HA port range.
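
Step 2 identifies managed processes by their -root <KVROOT> parameter. The sketch below shows one way to apply that filter to a process listing. The function name managed_pids is hypothetical, the input is assumed to be lines of the form "PID command-line" (as produced by ps -eo pid=,args= on most Unix-like systems), and KVROOT paths containing spaces are not handled. It only lists process IDs so that you can review them before killing anything.

```shell
#!/usr/bin/env bash
# Sketch: list the PIDs of processes started with "-root <KVROOT>".
# Reads "PID command-line" records from standard input.
# Assumption: the KVROOT path contains no whitespace or backslashes.

managed_pids() {
    local kvroot="$1"
    awk -v root="$kvroot" '{
        # Field 1 is the PID; scan the remaining fields for the
        # argument pair "-root <KVROOT>" and print the PID on a match.
        for (i = 2; i < NF; i++) {
            if ($i == "-root" && $(i + 1) == root) {
                print $1
                break
            }
        }
    }'
}
```

A typical invocation, assuming a KVROOT of /var/kvroot, would be:

    ps -eo pid=,args= | managed_pids /var/kvroot

Matching the exact argument pair, rather than searching for the path anywhere in the command line, avoids picking up unrelated processes that merely mention the same directory.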