MySQL Cluster Manager 8.4.3 User Manual

4.5.2.2 Verify All Cluster Process PID Files

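You must verify that each process in the wild cluster has a valid PID file. For purposes of this discussion, a valid PID file is named ndb_node_id.pid (where node_id is the node ID of the process), is located in the data directory used by that process, and contains the process ID of that process (for a data node, the process ID of its angel process). The following steps check each node in turn: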

  1. To check the PID file for the management node process, log in to a system shell on host 198.51.100.102, change to the management node's data directory as specified by the DataDir parameter in the cluster's configuration file, then check whether the PID file is present. On Linux, you can use the command shown here:

    $> ls ndb_*.pid
    ndb_50.pid
    

    Check the content of the matching .pid file using a pager or text editor. We use more for this purpose here:

    $> more ndb_50.pid
    10221
    

    The number shown should match the ndb_mgmd process ID. We can check this on Linux using the ps command:

    $> ps -ef | grep ndb_mgmd
    ari      10221     1  0 19:38 ?        00:00:09 /home/ari/bin/cluster/bin/ndb_mgmd --config-file=/home/ari/bin/cluster/wild-cluster/config.ini --config-cache=false --ndb-nodeid=50
    

    The management node PID file satisfies the requirements listed at the beginning of this section.

  2. Next, we check the PID files for the data nodes, on hosts 198.51.100.103 and 198.51.100.104. Log in to a system shell on 198.51.100.103, then obtain the process ID of the ndbd process on this host, as shown here:

    $> ps -ef | grep ndbd
    ari      12838     1  0 Nov08 ?        00:10:12 ./bin/ndbd --initial --ndb-nodeid=2 --ndb-connectstring=198.51.100.102
    

    As specified in the cluster's configuration file, the node's DataDir is /home/ari/bin/cluster/wild-cluster/2/data. Go to that directory to look for a file named ndb_2.pid:

    $> ls ndb_*.pid
    ndb_2.pid
    

    Now check the content of this file, which should be the process ID of the data node's angel process:

    $> more ndb_2.pid
    12838
    

    You do not need to adjust the PID files so that they contain the data node processes' own PIDs; that is taken care of by the --remove-angel option used with the import cluster command in the last step of the import process. The data nodes are ready for import as long as they have valid PID files containing the PIDs of their angel processes. Repeat these checks on host 198.51.100.104 for the other data node.

    We are ready to proceed to the mysqld node running on host 198.51.100.102.

  3. To check the PID file for the mysqld node, first locate it. Its default location is the node's data directory, as specified by the datadir option either in a configuration file or on the command line when the mysqld process is started. Go to the data directory /home/ari/bin/cluster/wild-cluster/51/data on host 198.51.100.102 and look for the PID file:

    $> ls *.pid
    localhost.pid
    

    Note that the MySQL Server could have been started with the --pid-file option, which places the PID file at a specified location. In the following case, the same mysqld node has been started with the mysqld_safe script, and the ps command reveals the value used for --pid-file:

    $> ps -ef | grep mysqld
    ari      11999  5667  0 13:15 pts/1    00:00:00 /bin/sh ./bin/mysqld_safe --defaults-file=/home/ari/bin/cluster/wild-cluster.cnf --ndb-nodeid=51
    ari      12136 11999  1 13:15 pts/1    00:00:00 /home/ari/bin/cluster/bin/mysqld --defaults-file=/home/ari/bin/cluster/wild-cluster.cnf
    --basedir=/home/ari/bin/cluster/ --datadir=/home/ari/bin/cluster/wild-cluster/51/data/ --plugin-dir=/home/ari/bin/cluster//lib/plugin
    --ndb-nodeid=51 --log-error=/home/ari/bin/cluster/wild-cluster/51/data//localhost.localdomain.err
    --pid-file=/home/ari/bin/cluster/wild-cluster/51/data//localhost.localdomain.pid
    

    As in the example, your PID file is likely not named in the format required for cluster import (ndb_node_id.pid). In addition, if the --pid-file option was used, the PID file might not be in the required location (the data directory). Look at the PID file referred to in the last example:

    $> more /home/ari/bin/cluster/wild-cluster/51/data//localhost.localdomain.pid
    12136
    

    The PID file for the SQL node is in an acceptable location (inside the data directory) and has the correct contents (the right PID), but has the wrong name. Copy the PID file to a correctly named file in the same directory, like this:

    $> cd /home/ari/bin/cluster/wild-cluster/51/data/
    $> cp localhost.localdomain.pid ndb_51.pid
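
The manual checks in this section (file name, location, and content) can also be scripted. The following is a minimal sketch only, not part of MySQL Cluster Manager: check_pid_file is a hypothetical helper name, and it assumes a POSIX shell and that the node processes run under your own user account, as they do in the examples above. It confirms that the PID belongs to a running process, but not which program that process is, so the ps check shown earlier is still needed.

```shell
# Hypothetical helper (not an MCM command).
# Usage: check_pid_file DATADIR NODE_ID
# Returns 0 if DATADIR/ndb_NODE_ID.pid exists, holds a numeric PID on its
# first line, and that PID belongs to a running process.
check_pid_file() {
    datadir=$1
    node_id=$2
    pid_file="$datadir/ndb_$node_id.pid"

    # Name and location: the file must be DATADIR/ndb_NODE_ID.pid.
    if [ ! -f "$pid_file" ]; then
        echo "missing: $pid_file"
        return 1
    fi

    # Content: the first line must be a process ID and nothing else.
    pid=$(head -n 1 "$pid_file")
    case $pid in
        ''|*[!0-9]*)
            echo "invalid content in $pid_file"
            return 2
            ;;
    esac

    # kill -0 sends no signal; it only tests whether the process exists
    # and whether the current user is allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no running process with PID $pid"
        return 3
    fi

    echo "ok: $pid_file -> $pid"
}
```

For the SQL node in this example, you would call check_pid_file /home/ari/bin/cluster/wild-cluster/51/data 51 after copying the PID file. A nonzero return status indicates which of the three checks failed.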