3 Migrate an Oracle Data Hub Cloud Service Cluster to Oracle Cloud Infrastructure

These topics list the steps to migrate an Oracle Data Hub Cloud Service cluster to Oracle Cloud Infrastructure.

Deploy the DataStax Cluster

Deploy nodes in Oracle Cloud Infrastructure as a stand-alone DDAC or DSE cluster by using the DataStax Oracle Cloud Infrastructure Terraform templates that you receive from your sales professional. Deploying a stand-alone cluster first lets you check your deployment and configuration, and run any performance tests you want, without touching your production cluster.

Stop Critical Services

Stop the DDAC or DSE service on every Oracle Cloud Infrastructure node by running the command that matches your installation:
sudo service dse stop
or
sudo service ddac stop
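
Because the service must be stopped on every Oracle Cloud Infrastructure node, it may be convenient to loop over the nodes from a single host. The following is a minimal sketch, not part of the official procedure. It assumes passwordless SSH and sudo access to each node, and the node addresses in OCI_NODES are hypothetical placeholders. With DRY_RUN=1 (the default here) the script only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical node addresses; replace with your own OCI nodes.
OCI_NODES="${OCI_NODES:-10.0.0.11 10.0.0.12 10.0.0.13}"
# Service name: "dse" for DSE or "ddac" for DDAC.
SERVICE="${SERVICE:-dse}"
# 1 = only print the commands, 0 = actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Stop the service on each node in turn.
stop_all() {
  for node in $OCI_NODES; do
    run ssh "$node" sudo service "$SERVICE" stop
  done
}

stop_all
```

Review the printed commands first, then rerun with DRY_RUN=0 and SERVICE set to the service that your cluster uses.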

Add OCI Nodes as a New Data Center

Add the Oracle Cloud Infrastructure nodes as a new data center by following these instructions, but stop before running the nodetool rebuild command. Because you are reusing nodes (they previously formed a stand-alone cluster), the prerequisites listed in those instructions are required.

Trigger Migration of Data

At this point you can migrate data.

If there is less than 1 TB of data per node, the quickest approach is to SSH to each Oracle Cloud Infrastructure node and run the nodetool rebuild command described in the instructions linked above. Note that this puts load on the Data Hub nodes. We recommend that you first run the command on one node to gauge the load and the rebuild time. If the load on the existing cluster is reasonable and the rebuild does not take too long, gradually increase the number of nodes being rebuilt at the same time.
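
To gauge rebuild time on the first node, you could wrap the rebuild in a simple timer. This is a sketch under stated assumptions: SOURCE_DC is a hypothetical placeholder for the name of the existing Data Hub data center (as reported by nodetool status), and the script only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical name of the existing Data Hub data center to stream from.
SOURCE_DC="${SOURCE_DC:-DataHubDC}"
# 1 = only print the command, 0 = actually run it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Seconds elapsed between two epoch timestamps.
elapsed_seconds() {
  echo $(( $2 - $1 ))
}

start=$(date +%s)
run nodetool rebuild -- "$SOURCE_DC"
end=$(date +%s)
echo "rebuild step took $(elapsed_seconds "$start" "$end") seconds"
```

While the rebuild runs, watch the load on the Data Hub nodes before deciding how many Oracle Cloud Infrastructure nodes to rebuild in parallel.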

If there is more than 1 TB of data per node, or if the nodetool rebuild command was prohibitively slow, we recommend that you:

  1. Install the Oracle Cloud Infrastructure CLI on all Data Hub and Oracle Cloud Infrastructure nodes.
  2. Snapshot the data for each keyspace that needs to be migrated.
  3. Create a bucket in Oracle Cloud Infrastructure Object Storage.
  4. Copy the snapshots from the Data Hub nodes to the Oracle Cloud Infrastructure nodes by using the Oracle Cloud Infrastructure CLI, as follows:
    • From each Data Hub node, upload each snapshot to Oracle Cloud Infrastructure Object Storage by running a command like the one below. Values in angle brackets are determined either by the snapshot command that you ran earlier or by the bucket name that you chose.
      oci os object bulk-upload --bucket-name <bucket> \
      --src-dir /var/lib/cassandra/data/<keyspace_name>/<table_name-UUID>/snapshots/<snapshot_name> \
      --object-prefix <node_hostname>/<keyspace_name>/<table_name-UUID>/<snapshot_name>
    • On each Oracle Cloud Infrastructure node you can download all snapshots from the corresponding Data Hub node by running:
      oci os object bulk-download --bucket-name <bucket> \
      --download-dir <snapshot_tmp_dir> \
      --prefix <node_hostname>
  5. Finally, follow these instructions to restore the snapshot on the new nodes.

    Note:

    These operations must be completed within gc_grace_seconds (default 10 days); otherwise, deleted data might reappear.
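
Steps 2 and 4 above can be sketched as a single script. This is an illustration only, under stated assumptions: the bucket, keyspace, snapshot, and host names are hypothetical placeholders, the data directory is the one used in the upload command above, and ROLE is a made-up switch that selects the Data Hub side or the Oracle Cloud Infrastructure side. With DRY_RUN=1 (the default here) the script only prints the commands it would run.

```shell
#!/bin/sh
# All default values below are hypothetical placeholders.
BUCKET="${BUCKET:-migration-bucket}"
KEYSPACE="${KEYSPACE:-my_keyspace}"
SNAPSHOT="${SNAPSHOT:-migration_snapshot}"
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"
NODE="${NODE:-$(uname -n)}"                      # this Data Hub node's hostname
SOURCE_NODE="${SOURCE_NODE:-datahub-node-1}"     # Data Hub node to pull from
TMP_DIR="${TMP_DIR:-/tmp/snapshots}"
ROLE="${ROLE:-datahub}"                          # "datahub" or "oci"
DRY_RUN="${DRY_RUN:-1}"                          # 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Upload every table's snapshot directory for the keyspace (step 4, first bullet).
upload_snapshots() {
  for snap_dir in "$DATA_DIR/$KEYSPACE"/*/snapshots/"$SNAPSHOT"; do
    [ -d "$snap_dir" ] || continue
    # The table directory (<table_name-UUID>) is two levels above the snapshot.
    table_dir=$(basename "$(dirname "$(dirname "$snap_dir")")")
    run oci os object bulk-upload --bucket-name "$BUCKET" \
      --src-dir "$snap_dir" \
      --object-prefix "$NODE/$KEYSPACE/$table_dir/$SNAPSHOT"
  done
}

# Download all snapshots uploaded by one Data Hub node (step 4, second bullet).
download_snapshots() {
  run oci os object bulk-download --bucket-name "$BUCKET" \
    --download-dir "$TMP_DIR" \
    --prefix "$1"
}

if [ "$ROLE" = "datahub" ]; then
  # Step 2: snapshot the keyspace under a known tag, then upload it.
  run nodetool snapshot -t "$SNAPSHOT" "$KEYSPACE"
  upload_snapshots
else
  download_snapshots "$SOURCE_NODE"
fi
```

Run it with ROLE=datahub on each Data Hub node and with ROLE=oci on the corresponding Oracle Cloud Infrastructure node, then restore the downloaded files as described in step 5.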