Migrate from DynamoDB JSON file to Oracle NoSQL Database

This example shows how to use the Oracle NoSQL Database Migrator to copy DynamoDB JSON file to Oracle NoSQL Database.

Use Case:

After evaluating multiple options, an organization finalizes Oracle NoSQL Database over DynamoDB database. The organization wants to migrate their tables and data from DynamoDB to Oracle NoSQL Database (On-premises).

See Mapping of DynamoDB table to Oracle NoSQL table for more details.

You can migrate a file or directory containing the DynamoDB exported JSON data from a file system by specifying the path in the source configuration template.

A sample DynamoDB-formatted JSON File is as follows:
{"Item":{"Id":{"N":"101"},"Phones":{"L":[{"L":[{"S":"555-222"},{"S":"123-567"}]}]},"PremierCustomer":{"BOOL":false},"Address":{"M":{"Zip":{"N":"570004"},"Street":{"S":"21 main"},"DoorNum":{"N":"201"},"City":{"S":"London"}}},"FirstName":{"S":"Fred"},"FavNumbers":{"NS":["10"]},"LastName":{"S":"Smith"},"FavColors":{"SS":["Red","Green"]},"Age":{"N":"22"}}}
{"Item":{"Id":{"N":"102"},"Phones":{"L":[{"L":[{"S":"222-222"}]}]},"PremierCustomer":{"BOOL":false},"Address":{"M":{"Zip":{"N":"560014"},"Street":{"S":"32 main"},"DoorNum":{"N":"1024"},"City":{"S":"Wales"}}},"FirstName":{"S":"John"},"FavNumbers":{"NS":["10"]},"LastName":{"S":"White"},"FavColors":{"SS":["Blue"]},"Age":{"N":"48"}}}

You copy the exported DynamoDB table data from AWS S3 storage to a local mounted file system.

Example:

For this demonstration, you will learn how to migrate a DynamoDB JSON file to Oracle NoSQL Database(On-premises). You will use a manually created configuration file for this example.

Prerequisites

  • Identify the source and sink for the migration.
    • Source: DynamoDB JSON File
    • Sink: Oracle NoSQL Database (On-premises)
  • In order to import DynamoDB table data to Oracle NoSQL Database, you must first export the DynamoDB table to S3. Refer to steps provided in Exporting DynamoDB table data to Amazon S3 to export your table. While exporting, you select the format as DynamoDB JSON. The exported data contains DynamoDB table data in multiple gzip files as shown below.
    / 01639372501551-bb4dd8c3 
    |-- 01639372501551-bb4dd8c3 ==> exported data prefix
    |----data
    |------sxz3hjr3re2dzn2ymgd2gi4iku.json.gz  ==>table data
    |----manifest-files.json
    |----manifest-files.md5
    |----manifest-summary.json
    |----manifest-summary.md5
    |----_started
  • You must download the files from AWS s3. The structure of the files after the download will be as shown below.
    download-dir/01639372501551-bb4dd8c3     
    |----data    
    |------sxz3hjr3re2dzn2ymgd2gi4iku.json.gz  ==>table data   
    |----manifest-files.json   
    |----manifest-files.md5   
    |----manifest-summary.json   
    |----manifest-summary.md5   
    |----_started

Procedure

To migrate the DynamoDB JSON data to the Oracle NoSQL Database:
  1. Prepare the configuration file (in JSON format) with the identified source and sink details.See Source Configuration Templates and Sink Configuration Templates .
    You can choose one of the following two options.
    • Option 1: Importing DynamoDB table a as JSON document using default schema config.
      Here the defaultSchema is TRUE and so the migrator creates the default schema at the sink. You need to specify the DDBPartitionKey and the corresponding NoSQL column type. Else an error is thrown.
      {
       "source" : {
         "type" : "file",
         "format" : "dynamodb_json",
         "dataPath" : "<complete/path/to/the/DynamoDB/Formatted/JSON/file>"
       },
       "sink" : {
          "type" : "nosqldb",
          "table" : "<table_name>",
          "storeName" : "kvstore",
          "helperHosts" : ["<hostname>:5000"]
          "schemaInfo" : {
             "defaultSchema" : true,    
             "DDBPartitionKey" : "<PrimaryKey:Datatype>",
           },  
        },
        "abortOnError" : true,
        "migratorVersion" : "1.0.0"
      }
      For a DynamoDB JSON source, the default schema for the table will be as shown below:
      CREATE TABLE IF NOT EXISTS <TABLE_NAME>(DDBPartitionKey_name DDBPartitionKey_type, 
      [DDBSortKey_name DDBSortKey_type], DOCUMENT JSON, 
      PRIMARY KEY(SHARD(DDBPartitionKey_name),[DDBSortKey_name]))

      Where

      TABLE_NAME = value provided for the sink 'table' in the configuration

      DDBPartitionKey_name = value provided for the partiiton key in the configuration

      DDBPartitionKey_type = value provided for the data type of the partition key in the configuration

      DDBSortKey_name = value provided for the sort key in the configuration if any

      DDBSortKey_type = value provided for the data type of the sort key in the configuration if any

      DOCUMENT = All attributes except the partition and sort key of a Dynamo DB table item aggregated into a NoSQL JSON column

    • Option 2: Importing DynamoDB table as fixed columns using a user-supplied schema file.
      Here the defaultSchema is FALSE and you specify the schemaPath as a file containing your DDL statement. See Mapping of DynamoDB types to Oracle NoSQL types for more details.

      Note:

      If the Dynamo DB table has a data type that is not supported in NoSQL, the migration fails.
      A sample schema file is shown below.
      CREATE TABLE IF NOT EXISTS sampledynDBImp (AccountId INTEGER,document JSON, 
      PRIMARY KEY(SHARD(AccountId)));
      The schema file is used to create the table at the sink as part of the migration. As long as the primary key data is provided, the input JSON record will be inserted, otherwise it throws an error.

      Note:

      If the input data does not contain a value for a particular column(other than the primary key) then the column default value will be used. The default value should be part of the column definition while creating the table. For example id INTEGER not null default 0. If the column does not have a default definition then SQL NULL is inserted if no values are provided for the column.
      {
        "source" : {
          "type" : "file",
          "format" : "dynamodb_json",
          "dataPath" : "<complete/path/to/the/DynamoDB/Formatted/JSON/file>"
        },
        "sink" : {
          "type" : "nosqldb",
          "table" : "<table_name>",
          "schemaInfo" : {
            "defaultSchema" : false,
            "readUnits" : 100,
            "writeUnits" : 60,
            "schemaPath" : "<full path of the schema file with the DDL statement>",
            "storageSize" : 1
          },
          "storeName" : "kvstore",
          "helperHosts" : ["<hostname>:5000"]
        },
        "abortOnError" : true,
        "migratorVersion" : "1.0.0"
      }
  2. Open the command prompt and navigate to the directory where you extracted the NoSQL Database Migrator utility.
  3. Run the runMigrator command by passing the configuration file using the --config or -c option.
    [~/nosqlMigrator/nosql-migrator-1.0.0]$./runMigrator 
    --config <complete/path/to/the/JSON/config/file>
  4. The utility proceeds with the data migration, as shown below.
    Records provided by source=7..,
    Records written to sink=7,
    Records failed=0,
    Records skipped=0.
    Elapsed time: 0 min 2sec 50ms
    Migration completed.

Validation

Start the SQL prompt in your KVStore.
 java -jar lib/sql.jar -helper-hosts localhost:5000 -store kvstore
Verify that the new table is created with the source data:
desc <table_name>
SELECT * from <table_name>