Cloning the Oracle Grid Infrastructure

The following procedure describes how to clone the Oracle Grid Infrastructure onto the replacement compute server. In the commands, working_server represents a working compute server in the cluster, and replacement_server represents the replacement compute server.

To clone the Oracle Grid Infrastructure:

  1. Log in as root to a working compute server in the cluster.

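    For example, assuming you connect over SSH (the host name is a placeholder for your working server):

    $ ssh root@working_server
    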
  2. Verify the hardware and operating system installation using the cluster verification utility (cluvfy):

    $ cluvfy stage -post hwos -n replacement_server,working_server -verbose
    

    The phrase "Post-check for hardware and operating system setup was successful" should appear at the end of the report.
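
    If you want to capture the report and confirm the result from the command line, one approach (a sketch; the log file path is arbitrary) is to save the output and search for that phrase:

    $ cluvfy stage -post hwos -n replacement_server,working_server -verbose \
      | tee /tmp/cluvfy_hwos.log
    $ grep "Post-check for hardware and operating system setup was successful" \
      /tmp/cluvfy_hwos.log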

  3. Verify peer compatibility:

    $ cluvfy comp peer -refnode working_server -n replacement_server  \
      -orainv oinstall -osdba dba | grep -B 3 -A 2 mismatched
    

    The following is an example of the output:

    Compatibility check: Available memory [reference node: ra01db02]
    Node Name     Status                    Ref. node status          Comment
    ------------  ------------------------  ------------------------  ----------
    ra01db01      31.02GB (3.2527572E7KB)   29.26GB (3.0681252E7KB)   mismatched
    Available memory check failed

    Compatibility check: Free disk space for "/tmp" [reference node: ra01db02]
    Node Name     Status                    Ref. node status          Comment
    ------------  ------------------------  ------------------------  ----------
    ra01db01      55.52GB (5.8217472E7KB)   51.82GB (5.4340608E7KB)   mismatched
    Free disk space check failed
    

    If the only failed components are related to physical memory, swap space, and disk space, then it is safe to continue.

  4. Perform the requisite checks for adding the server:

    1. Ensure that the GRID_HOME/network/admin/samples directory has permissions set to 750.

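      For example, a quick way to check the current permissions and set them if needed (a minimal sketch; GRID_HOME stands for the Grid home path, such as /u01/app/12.1.0/grid):

      $ ls -ld GRID_HOME/network/admin/samples
      $ chmod 750 GRID_HOME/network/admin/samples
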
    2. Validate the addition of the compute server:

      $ cluvfy stage -ignorePrereq -pre nodeadd -n replacement_server \
        -fixup -fixupdir /home/oracle/fixup.d
       

      If the only failed component is related to swap space, then it is safe to continue.

      You might get an error similar to the following about a voting disk:

      ERROR: 
      PRVF-5449 : Check of Voting Disk location "o/192.168.73.102/ \
      DATA_CD_00_ra01cel07(o/192.168.73.102/DATA_CD_00_ra01cel07)" \
      failed on the following nodes:
      Check failed on nodes: 
              ra01db01
              ra01db01:No such file or directory
      ...
      PRVF-5431 : Oracle Cluster Voting Disk configuration check failed
      

      If this error occurs, then use the -ignorePrereq option when running the addnode script in the next step.

  5. Add the replacement compute server to the cluster:

    $ cd /u01/app/12.1.0/grid/addnode/
    $ ./addnode.sh -silent "CLUSTER_NEW_NODES={replacement_server}" \
      "CLUSTER_NEW_VIRTUAL_HOSTNAMES={replacement_server-vip}"[-ignorePrereq]
    

    The addnode.sh script starts Oracle Universal Installer, which copies the Oracle Clusterware software to the replacement compute server. A message similar to the following is displayed:

    WARNING: A new inventory has been created on one or more nodes in this session.
    However, it has not yet been registered as the central inventory of this
    system. To register the new inventory please run the script at
    '/u01/app/oraInventory/orainstRoot.sh' with root privileges on nodes
    'ra01db01'. If you do not register the inventory, you may not be able to 
    update or patch the products you installed.
    
    The following configuration scripts need to be executed as the "root" user in
    each cluster node:
     
    /u01/app/oraInventory/orainstRoot.sh #On nodes ra01db01
     
    /u01/app/12.1.0/grid/root.sh #On nodes ra01db01
    
  6. Run the configuration scripts:

    1. Open a terminal window.

    2. Log in as the root user.

    3. Run the configuration scripts listed in the message on the indicated cluster servers.

    After the scripts are run, the following message is displayed:

    The Cluster Node Addition of /u01/app/12.1.0/grid was successful.
    Please check '/tmp/silentInstall.log' for more details.
    
  7. Run the orainstRoot.sh and root.sh scripts:

    # /u01/app/oraInventory/orainstRoot.sh
    Creating the Oracle inventory pointer file (/etc/oraInst.loc)
    Changing permissions of /u01/app/oraInventory.
    Adding read,write permissions for group.
    Removing read,write,execute permissions for world.
    Changing groupname of /u01/app/oraInventory to oinstall.
    The execution of the script is complete.
     
    # /u01/app/12.1.0/grid/root.sh
    

    Check the log files in /u01/app/12.1.0/grid/install/ for the output of the root.sh script. The output file reports that the listener resource on the replacement compute server failed to start. This is an example of the expected output:

    /u01/app/12.1.0/grid/bin/srvctl start listener -n ra01db01 \
    ...Failed
    /u01/app/12.1.0/grid/perl/bin/perl \
    -I/u01/app/12.1.0/grid/perl/lib \
    -I/u01/app/12.1.0/grid/crs/install \
    /u01/app/12.1.0/grid/crs/install/rootcrs.pl execution failed
    
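    To locate the most recent log and scan it for failures, you can use something like the following (a sketch; the root_*.log name pattern is an assumption, so adjust it to the file names you actually see in the directory):

    $ ls -lt /u01/app/12.1.0/grid/install/
    $ grep -i fail /u01/app/12.1.0/grid/install/root_*.log
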
  8. Reenable the listener resource that you stopped in "Removing the Failed Compute Server from the Cluster".

    # GRID_HOME/bin/srvctl enable listener -l LISTENER \
      -n replacement_server

    # GRID_HOME/bin/srvctl start listener -l LISTENER \
      -n replacement_server
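
    To confirm that the listener is now running on the replacement compute server, you can check its status (a sketch using the same placeholder paths and names as above):

    # GRID_HOME/bin/srvctl status listener -l LISTENER \
      -n replacement_server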