When you initially configured your installation, you defined a range of ports for the nodes to use when communicating among themselves. (You did this in Installation Configuration.) This range of ports is called the HA port range, where HA is shorthand for high availability.
If you specified invalid values for the HA port range, you cannot deploy a Replication Node (RN) or a secondary Administration process (Admin) on the misconfigured SN. You discover the problem when you first attempt to deploy a store or an Admin replica on the faulty SN. You see these indications that the RN did not come up on that Storage Node:
The Admin displays an error dialog warning that the Replication Node is in the ERROR_RESTARTING state. The Topology tab also shows this state in red, and after a number of retries, it indicates that the Replication Node is in ERROR_NO_RESTART.
The plan goes into ERROR state, and its detailed history, available by clicking the plan in the Admin's Plan History tab or through the CLI's show plan <planID> command, shows an error message like this:
Attempt 1
   state: ERROR
   start time: 10-03-11 22:06:12
   end time: 10-03-11 22:08:12
   DeployOneRepNode of rg1-rn3 on sn3/farley:5200 [RUNNING] failed.
   ....
   Failed to attach to RepNodeService for rg1-rn3, see log,
   /KVRT3/<storename>/log/rg1-rn3*.log, on host farley
   for more information.
The critical events mechanism, accessible through the Admin or the CLI, shows an alert that contains the same error information found in the plan history.
An examination of the specified .log file, or of the store-wide log displayed in the Admin's Log tab, shows a specific error message, such as:
[rg1-rn3] Process exiting java.lang.IllegalArgumentException: Port number 1 is invalid because the port must be outside the range of "well known" ports
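The "well known" ports mentioned in the message are ports 0 through 1023, which are reserved for system services; every port in the HA range must fall above that range. The check being applied can be sketched as follows (the function name and bounds here are illustrative, not actual Oracle NoSQL code):

```shell
# Illustrative only: mimics the rejection of well-known ports seen above.
# An HA port range is written as "low,high", e.g. "5010,5020".
valid_ha_range() {
  low=${1%,*}; high=${1#*,}
  [ "$low" -ge 1024 ] && [ "$high" -le 65535 ] && [ "$low" -le "$high" ]
}

valid_ha_range "5010,5020" && echo "5010,5020 accepted"
valid_ha_range "1,1023"    || echo "1,1023 rejected (well-known ports)"
```

A range such as "1,5" fails this check, which matches the "Port number 1 is invalid" message in the log above.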
The misconfiguration can be addressed with the following steps. Some steps must be executed on the physical node that hosts the Oracle NoSQL Database Storage Node, while others can be done from any node that can access the Admin or the CLI.
Using the Admin or the CLI, cancel the deploy-store or deploy-admin plan that ran afoul of the misconfiguration.
On the Storage Node, kill the existing, misconfigured StorageNodeAgentImpl process and all of its ManagedProcesses. You can distinguish them from other processes because they have the parameter -root <KVROOT>.
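The -root marker makes these processes easy to pick out of a process listing. A minimal sketch of the filtering step (the listing below is fabricated sample output, and /KVRT3 is an assumed KVROOT; on a real host the listing would come from something like ps -ef):

```shell
# Fabricated ps-style listing standing in for real 'ps -ef' output.
listing='1234 java -jar kvstore.jar start -root /KVRT3
5678 java ManagedService -root /KVRT3
9012 java -jar unrelated.jar'

# Keep only the SNA and its managed processes, identified by -root <KVROOT>.
# The surviving PIDs (first column) are the ones to kill.
echo "$listing" | grep -- '-root /KVRT3'
```

Note the `--` before the pattern, which keeps grep from treating `-root` as an option.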
On the Storage Node, remove all files from the KVROOT directory.
On the Storage Node, re-create the storage node bootstrap configuration file in the KVROOT directory. For directions on how to do this, see Installation Configuration.
On the Storage Node, restart the storage node using the java -jar KVHOME/lib/kvstore.jar restart command.
Using the CLI, re-deploy the storage node using the deploy-sn plan.
You can now create and execute a deploy-store or deploy-admin plan, using the same parameters as the initial attempt that uncovered your misconfigured Storage Node.
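Taken together, the recovery steps above amount to the following sequence. Every host name, path, and port below is an illustrative assumption, so substitute your own values; the commands are shown as comments rather than live commands because they are destructive:

```shell
# Commented sketch of the end-to-end recovery; do not run as-is.
# Assumed locations (substitute your own):
KVROOT=/KVRT3
KVHOME=/opt/kv

# 1. On the Storage Node: kill the SNA and its managed processes
#    (those carrying -root $KVROOT), e.g.  kill <pid> ...
# 2. On the Storage Node: clear the old configuration:
#      rm -rf "$KVROOT"/*
# 3. On the Storage Node: re-create the bootstrap configuration with a
#    valid HA range (all ports >= 1024), per Installation Configuration:
#      java -jar "$KVHOME"/lib/kvstore.jar makebootconfig -root "$KVROOT" \
#          -port 5000 -host farley -harange 5010,5020
# 4. On the Storage Node: restart it:
#      java -jar "$KVHOME"/lib/kvstore.jar restart
# 5. From the CLI: re-run the deploy-sn plan, then retry the original
#    deploy-store or deploy-admin plan.
echo "kill -> clean -> makebootconfig -> restart -> deploy-sn -> redeploy"
```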