Generic Binding

Using a Single Generic Schema Binding
Using Multiple Generic Schema Bindings
Using Embedded Records
Managing Generic Schemas Dynamically

Generic bindings provide the widest support for the Avro data types. They are also flexible in that your application does not need to know, at compile time, the entire set of schemas in use in the store. This is useful if your store has a constantly expanding set of schemas.

The downside to generic bindings is that they do not provide compile-time type safety. Generic bindings identify fields using a string (as opposed to getter and setter methods provided by specific bindings), so it is not possible for the compiler to know, for example, whether you are using an integer where a real is expected.
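
The trade-off can be illustrated without any Avro classes. The following is a minimal sketch, assuming a plain java.util.Map as a stand-in for a generic record; the class and method names are hypothetical and are not part of the Oracle NoSQL Database or Avro APIs. Because fields are looked up by string and returned as Object, the compiler accepts a cast to the wrong type, and the mistake only surfaces at run time:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map plays the role of a generic record whose
// fields are addressed by name and returned as Object.
class StringKeyedFieldDemo {

    // Builds a "record" with a single int field named ID.
    static Map<String, Object> newPerson(int id) {
        final Map<String, Object> person = new HashMap<String, Object>();
        person.put("ID", id); // autoboxed to Integer
        return person;
    }

    // Correct read: the stored value is an Integer, so the cast succeeds.
    static int readId(Map<String, Object> person) {
        return (Integer) person.get("ID");
    }

    // Incorrect read: this compiles, because the compiler only sees Object,
    // but the cast throws ClassCastException when it executes.
    static boolean readingIdAsDoubleFails(Map<String, Object> person) {
        try {
            final double wrong = (Double) person.get("ID");
            System.out.println("Unexpectedly read " + wrong);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        final Map<String, Object> person = newPerson(100011);
        System.out.println("ID = " + readId(person));
        System.out.println("Wrong cast fails at run time: "
            + readingIdAsDoubleFails(person));
    }
}
```

With a specific binding, the equivalent mistake would be rejected by the compiler, because the generated getter declares its return type.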

Generic binding uses AvroCatalog.getGenericBinding() for a single schema binding, and uses AvroCatalog.getGenericMultiBinding() when using multiple schemas.

Using a Single Generic Schema Binding

Suppose you have the following schema:

{
    "type": "record",
    "name": "PersonInformation",
    "namespace": "avro",
    "fields": [
        {"name": "ID", "type": "int"}
    ]
} 

Further, suppose you placed that schema in a file named PersonSchema.avsc.

Then to use that schema, first add it to your store using the ddl add-schema command:

> java -jar <kvhome>/kvstore.jar runadmin -port <port> \
-host <host>
kv-> ddl add-schema -file PersonSchema.avsc 

In your Oracle NoSQL Database client code, you must make the schema available to the code. One way to do this is to read the schema directly from the file where you created it:

package avro;

import java.io.File;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.Schema;

import oracle.kv.avro.AvroCatalog;
import oracle.kv.avro.GenericAvroBinding;

...

final Schema.Parser parser = new Schema.Parser();
parser.parse(new File("PersonSchema.avsc")); 

Next, retrieve the parsed schema from the parser using its full name (the namespace followed by the record name):

final Schema personSchema = 
    parser.getTypes().get("avro.PersonInformation"); 

Finally, before you can begin serializing and deserializing values that use the Avro data format, you must create a binding and then create an Avro record for that binding. In this example, we use the generic binding. But as we explain later in this chapter, there are other bindings and the generic binding may not necessarily be the best one for your purposes.

/**
 * Here, for the sake of brevity, we skip the necessary steps of 
 * declaring and opening the store handle.
 */
final AvroCatalog catalog = store.getAvroCatalog();
final GenericAvroBinding binding = 
    catalog.getGenericBinding(personSchema);

Once you have the binding, you need a way for your application to represent the fields in the schema, so that they can be read and written. You do this by creating an Avro record, which is a data structure that allows you to read and write the fields in the schema. (Do not confuse an Avro record, which is a handle to a binary object containing data, with an Oracle NoSQL Database record, which is a single key-value pair contained in your store. An Oracle NoSQL Database record can contain a value that uses the Avro data format. An instance of the Avro data format, in turn, is managed within your client code using an Avro record.)

Because we are using the generic binding for this example, we will use the GenericRecord to manage the contents of the binding.

For example, assume we performed a store read, and now we want to examine the information stored with the Oracle NoSQL Database record.

/**
  * Assume a store read was performed here, and resulted in a 
  * ValueVersion instance called 'vv'. Then, to deserialize
  * the value in the returned record:
  */
final GenericRecord member;
final int ID;
if (vv != null) {
    /* Deserialize the value */
    member = binding.toObject(vv.getValue());
    /* Retrieve the contents of the ID field. Because we are 
     * using a generic binding, we must type cast the retrieved
     * value.
     */
     ID = (Integer) member.get("ID");
} 

If we want to write to a field (that is, we want to serialize some data), we use the record's put() method. As an example, suppose we wanted to create a brand new Avro object to be written to the store. Then:

final GenericRecord person = new GenericData.Record(personSchema);
final int ID = 100011;
person.put("ID", ID); 

/**
  * To serialize this information so that it can be written to 
  * the store, use GenericAvroBinding.toValue() as the value for the
  * store put(). That is, assuming you already have a store handle 
  * and a key:
  */
store.put(key, binding.toValue(person)); 

Using Multiple Generic Schema Bindings

It is unlikely that you will use only one schema with your application. In order to use more than one schema:

  1. Specify each schema individually in separate files.

  2. Add all these schemas to your store as described in Managing Avro Schema in the Store.

  3. Use HashMap to organize your schemas, and then pass that to AvroCatalog.getGenericMultiBinding() in order to create your binding.

For example, suppose you had the following two schemas:

{
 "type": "record",
 "namespace": "avro",
 "name": "PersonInfo",
 "fields": [
   { "name": "first", "type": "string" },
   { "name": "last", "type": "string" },
   { "name": "age", "type": "int" }
 ]
}


{
 "type": "record",
 "namespace": "avro",
 "name": "AnimalInfo",
 "fields": [
   { "name": "species", "type": "string"},
   { "name": "name", "type": "string"},
   { "name": "age", "type": "int"}
 ]
} 

Then put avro.PersonInfo in a file (call it PersonSchema.avsc) and avro.AnimalInfo in a second file (AnimalSchema.avsc). Add these schemas to your store using the command line interface.

At this point, you could simply create one binding for each schema that you are using, but that quickly becomes awkward as the number of schemas in your code grows. Instead, collect your schemas in a HashMap and pass it to (in this case) AvroCatalog.getGenericMultiBinding(). To do this, first create the HashMap that you use to organize your schemas:

package avro;

import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

...

import oracle.kv.ValueVersion;
import oracle.kv.avro.AvroCatalog;
import oracle.kv.avro.GenericAvroBinding;

...

HashMap<String, Schema> schemas = new HashMap<String, Schema>(); 

Then, parse each schema and add it to the HashMap:

final Schema.Parser parser = new Schema.Parser();

Schema personSchema = parser.parse(new File("PersonSchema.avsc"));
schemas.put(personSchema.getFullName(), personSchema);

Schema animalSchema = parser.parse(new File("AnimalSchema.avsc"));
schemas.put(animalSchema.getFullName(), animalSchema);

Then create your binding. You will only need one, because you are using a multi binding which is capable of using multiple schemas.

/*
 * Store creation is skipped for brevity
 */

final AvroCatalog catalog = store.getAvroCatalog();
final GenericAvroBinding binding =
    catalog.getGenericMultiBinding(schemas); 

To use the binding, you call toObject() or toValue() in the same way as you would if you were using an ordinary single-schema binding. The multi-binding is capable of determining which schema you are using, and serializing or deserializing accordingly. For example, suppose you retrieve a record that uses the avro.AnimalInfo schema. Then you can deserialize it as if you were using a single-schema binding:

/*
 * Key creation and store retrieval skipped.
 * Assume we have retrieved a ValueVersion (vv1) that
 * contains an AnimalInfo value.
 */

final GenericRecord animalObject;
if (vv1 != null) {
    animalObject = binding.toObject(vv1.getValue());
    final String species = animalObject.get("species").toString();
    final String name = animalObject.get("name").toString();
    final int age = (Integer) animalObject.get("age");

    /* Do something with the data */
} 
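
When one multi-binding serves several schemas, the code that consumes a deserialized record usually has to branch on which schema it received. With a real GenericRecord you would obtain the name from getSchema().getFullName(). The following is a minimal sketch of that dispatch, assuming a hypothetical NamedRecord class as a stand-in for GenericRecord so that the example runs without the Avro or Oracle NoSQL Database libraries:

```java
import java.util.HashMap;
import java.util.Map;

class SchemaDispatchDemo {

    // Hypothetical stand-in for a deserialized generic record: it carries
    // the full name of its schema and its field values keyed by name.
    static class NamedRecord {
        final String schemaFullName;
        final Map<String, Object> fields = new HashMap<String, Object>();

        NamedRecord(String schemaFullName) {
            this.schemaFullName = schemaFullName;
        }
    }

    // Chooses how to read the record based on its schema's full name.
    // Full names are case sensitive: namespace "avro" plus the record name.
    static String describe(NamedRecord r) {
        if ("avro.AnimalInfo".equals(r.schemaFullName)) {
            return r.fields.get("name") + " the " + r.fields.get("species");
        } else if ("avro.PersonInfo".equals(r.schemaFullName)) {
            return r.fields.get("first") + " " + r.fields.get("last");
        }
        throw new IllegalArgumentException(
            "Unexpected schema: " + r.schemaFullName);
    }

    public static void main(String[] args) {
        final NamedRecord animal = new NamedRecord("avro.AnimalInfo");
        animal.fields.put("species", "dog");
        animal.fields.put("name", "Rex");
        animal.fields.put("age", 4);
        System.out.println(describe(animal));
    }
}
```

Rejecting unrecognized schema names explicitly, as describe() does here, makes it obvious when a value with an unanticipated schema reaches code that was not written for it.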

You can also create a new avro.PersonInfo object for placement in the store using the same binding, like this:

final GenericRecord personObject = 
    new GenericData.Record(personSchema);
/* Field names must match those declared in the schema. */
personObject.put("first", "Sam");
personObject.put("last", "Brown");
personObject.put("age", 34);

/*
 * Key creation and store handle creation skipped
 * for brevity's sake.
 */

store.put(aKey, binding.toValue(personObject)); 

Using Embedded Records

Suppose you have a schema that looks like this:

{
    "type" : "record",
    "name" : "hatInventory",
    "namespace" : "avro",
    "fields" : [{"name" : "sku", "type" : "string", "default" : ""},
                  {"name" : "description",
                     "type" : {
                         "type" : "record",
                         "name" : "hatInfo",
                         "fields" : [
                                     {"name" : "style", 
                                      "type" : "string", 
                                      "default" : ""},

                                     {"name" : "size", 
                                      "type" : "string", 
                                      "default" : ""},

                                     {"name" : "color", 
                                      "type" : "string", 
                                      "default" : ""},

                                     {"name" : "material", 
                                      "type" : "string", 
                                      "default" : ""}
                            ]}
                }
    ]
}

In order to address the fields in the embedded record hatInfo, you treat it as if it is a second piece of standalone schema. You only have to parse the schema file once. You then create two schemas and two records, but only one binding. For example, to create a serialized object that uses this schema:

package avro;

import java.io.File;
import java.io.IOException;
import java.util.Arrays;

import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.Schema;

import oracle.kv.KVStore;
import oracle.kv.Key;
import oracle.kv.ValueVersion;
import oracle.kv.avro.AvroCatalog;
import oracle.kv.avro.GenericAvroBinding;

...

// Parse our schema file
final Schema.Parser parser = new Schema.Parser();
try {
    parser.parse(new File("HatSchema.avsc"));
} catch (IOException io) {
    io.printStackTrace();
}


// Get two Schema objects. We need two because of the 
// embedded record.
final Schema hatInventorySchema =
    parser.getTypes().get("avro.hatInventory");
final Schema hatInfoSchema =
    parser.getTypes().get("avro.hatInfo");

// Get two GenericRecords so we can manipulate both of
// the records in the schema
final GenericRecord hatRecord = 
    new GenericData.Record(hatInventorySchema);
final GenericRecord hatInfoRecord = 
    new GenericData.Record(hatInfoSchema);

// Now populate our records. Start with the 
// embedded record.
hatInfoRecord.put("style", "western");
hatInfoRecord.put("size", "medium");
hatInfoRecord.put("color", "black");
hatInfoRecord.put("material", "leather");

// Now the top-level record. Notice that we
// set the embedded record as the value for the 
// description field.
hatRecord.put("sku", "289163009");
hatRecord.put("description", hatInfoRecord);

// Now we need a binding. Only one is required,
// and we use the top-level schema to create it.
final AvroCatalog catalog = store.getAvroCatalog();
final GenericAvroBinding hatBinding =
    catalog.getGenericBinding(hatInventorySchema);

// Create a Key and write the value to the store.
final Key key = Key.createKey(Arrays.asList("hats", "0000000033"));
store.put(key, hatBinding.toValue(hatRecord)); 

On retrieval, you edit values of this type in the following way:

// Perform the retrieval
final ValueVersion vv = store.get(key);
if (vv != null) {
    // Deserialize the ValueVersion as normal
    final GenericRecord hatR = hatBinding.toObject(vv.getValue());

    // To access the embedded record, get it from the field on
    // the top-level record that contains it, and cast the
    // result to GenericRecord.
    final GenericRecord hatInfoR =
        (GenericRecord) hatR.get("description");

    // Finally, you can write to the top-level record and the
    // embedded record like this:

    // Modify a field on the embedded record:
    hatInfoR.put("style", "Fedora");

    // Modify the top-level record:
    hatR.put("sku", "300");
    hatR.put("description", hatInfoR);

    store.put(key, hatBinding.toValue(hatR));
} 

Managing Generic Schemas Dynamically

Generic bindings are especially useful when you do not know, at the time you write your code, all the schemas that your store will ever use. The HashMap used in the multiple-schema example is somewhat brittle if you are operating in an environment with a constantly growing list of schemas. In that scenario, whenever you add a schema to your store, you might need to rewrite your client code to add the new schema to the HashMap used by your client. Otherwise, your code could retrieve a value that uses a schema unknown to your code, which can cause problems depending on what your code is doing.

If this is a problem for you, you can avoid it by using AvroCatalog.getCurrentSchemas() with AvroCatalog.getGenericMultiBinding() so that you do not need to build a HashMap of all your schemas.

For example, the multiple-schema example earlier showed client code that used two known schemas. We can change that example to use getCurrentSchemas() in the following way:

package avro;

import java.io.File;
import java.io.IOException;
import java.util.Arrays;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

...

import oracle.kv.ValueVersion;
import oracle.kv.avro.AvroCatalog;
import oracle.kv.avro.GenericAvroBinding;
import oracle.kv.avro.SchemaNotAllowedException;

...

final Schema.Parser parser = new Schema.Parser();
Schema animalSchema = parser.parse(new File("AnimalSchema.avsc"));


/*
 * We skip creating a HashMap of Schemas because
 * it is not needed.
 */


/*
 * Store creation is skipped for brevity
 */

final AvroCatalog catalog = store.getAvroCatalog();
final GenericAvroBinding binding =
    catalog.getGenericMultiBinding(catalog.getCurrentSchemas());

If we then perform a read on the store, we might retrieve an object that uses a schema which was not in use when our binding was created. (This is particularly likely for long-running client code.) To handle this problem, we catch SchemaNotAllowedException.

/*
 * Key creation and store retrieval skipped.
 */

final GenericRecord animalObject;
if (vv1 != null) {
    try {
        animalObject = binding.toObject(vv1.getValue());
    } catch (SchemaNotAllowedException e) {
        // Take some action here. Potentially you could
        // recreate your binding, and then retry the
        // deserialization process
    }

    /* 
     * Do something with the data. If your client code is 
     * using more than one schema, you can identify which
     * schema the retrieved value is using by testing 
     * the schema name. That is:
     *
     * String sName = animalObject.getSchema().getFullName();
     * if (sName.equals("avro.AnimalInfo")) { ... }
     */
}
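
The "recreate your binding, and then retry" action suggested in the catch block can be sketched as a small helper. This is only an illustration under stated assumptions: UnknownSchemaException is a hypothetical stand-in for SchemaNotAllowedException, and the binding is modeled as a java.util.function.Function so that the example runs without the Oracle NoSQL Database libraries. In real client code, the rebuild step would presumably call getGenericMultiBinding(catalog.getCurrentSchemas()) again.

```java
import java.util.function.Function;
import java.util.function.Supplier;

class RebindAndRetryDemo {

    // Hypothetical stand-in for oracle.kv.avro.SchemaNotAllowedException.
    static class UnknownSchemaException extends RuntimeException {
        UnknownSchemaException(String message) {
            super(message);
        }
    }

    // Deserializes 'bytes' with 'current'. If the value uses a schema that
    // 'current' does not know, asks 'rebuild' for a fresh deserializer and
    // retries exactly once. A second failure propagates to the caller.
    static <T> T deserializeWithRetry(Function<byte[], T> current,
                                      Supplier<Function<byte[], T>> rebuild,
                                      byte[] bytes) {
        try {
            return current.apply(bytes);
        } catch (UnknownSchemaException e) {
            return rebuild.get().apply(bytes);
        }
    }

    public static void main(String[] args) {
        // A stale "binding" that rejects every value it is given.
        final Function<byte[], String> stale = b -> {
            throw new UnknownSchemaException("schema not known to binding");
        };
        // A rebuilt "binding" that accepts the value.
        final Supplier<Function<byte[], String>> rebuild =
            () -> b -> "decoded " + b.length + " bytes";

        System.out.println(deserializeWithRetry(stale, rebuild, new byte[3]));
    }
}
```

Retrying only once keeps the helper from looping if the value's schema is still unknown after the rebuild. A long-running client would also want to keep the rebuilt binding for subsequent reads rather than discarding it, which this sketch does not do.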