Generic bindings provide the widest support for the Avro data types. They are also flexible in that your application does not need to know the entire set of schemas in use in the store at compile time. This is useful if your store has a constantly expanding set of schemas.
The downside to generic bindings is that they do not provide compile-time type safety. Generic bindings identify fields using a string (as opposed to getter and setter methods provided by specific bindings), so it is not possible for the compiler to know, for example, whether you are using an integer where a real is expected.
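The trade-off can be illustrated without any Avro classes. The following is a minimal sketch that uses a plain java.util.Map as a hypothetical stand-in for a generic record (the class and method names are invented for illustration, and this is not the Avro GenericRecord API). Because the field name is only a string and the value is only an Object, a type mismatch compiles cleanly and surfaces as a ClassCastException at run time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a generic record: fields are addressed by
// string name and hold untyped Objects. This is NOT the Avro API.
final class StringKeyedFieldSketch {

    // Reads a field that the caller expects to be an int. The cast is
    // checked only at run time, never by the compiler.
    static int readIntField(Map<String, Object> record, String field) {
        return (Integer) record.get(field);
    }

    // Returns true if reading the field as an int fails with a
    // ClassCastException, false if the read succeeds.
    static boolean failsAtRuntime(Map<String, Object> record, String field) {
        try {
            readIntField(record, field);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> record = new HashMap<>();
        record.put("ID", 100011);  // an Integer, as the reader expects
        record.put("age", 34.5);   // a Double where an int is expected

        System.out.println(readIntField(record, "ID"));
        System.out.println(failsAtRuntime(record, "age"));
    }
}
```

Both put() calls and both reads compile, even though the second read cannot succeed. A binding with generated getter and setter methods would reject the equivalent mistake at compile time.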
To create a generic binding, use AvroCatalog.getGenericBinding() for a single-schema binding, or AvroCatalog.getGenericMultiBinding() when using multiple schemas.
For example, suppose you have the following schema:
{
    "type": "record",
    "name": "PersonInformation",
    "namespace": "avro",
    "fields": [
        {"name": "ID", "type": "int"}
    ]
}
Further, suppose you placed that schema in a file named PersonSchema.avsc. Then to use that schema, first add it to your store using the ddl add-schema command:
> java -jar <kvhome>/kvstore.jar runadmin -port <port> \
    -host <host>
kv-> ddl add-schema -file PersonSchema.avsc
In your Oracle NoSQL Database client code, you must make the schema available to the code. One way to do this is to read the schema directly from the file where you created it:
package avro;

import java.io.File;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.Schema;
import oracle.kv.avro.AvroCatalog;
import oracle.kv.avro.GenericAvroBinding;

...

final Schema.Parser parser = new Schema.Parser();
parser.parse(new File("PersonSchema.avsc"));
Next, retrieve the parsed schema from the parser using its full name (the namespace followed by the record name):
final Schema personSchema = parser.getTypes().get("avro.PersonInformation");
Finally, before you can begin serializing and deserializing values that use the Avro data format, you must create a binding and then create an Avro record for that binding. In this example, we use the generic binding. But as we explain later in this chapter, there are other bindings and the generic binding may not necessarily be the best one for your purposes.
/**
 * Here, for the sake of brevity, we skip the necessary steps of
 * declaring and opening the store handle.
 */
final AvroCatalog catalog = store.getAvroCatalog();
final GenericAvroBinding binding = catalog.getGenericBinding(personSchema);
Once you have the binding, you need a way for your application to represent the fields in the schema, so that they can be read and written. You do this by creating an Avro record, which is a data structure that allows you to read and/or write the fields in the schema. (Do not confuse an Avro record, which is a handle to a binary object containing data, with an Oracle NoSQL Database record, which is a single key-value pair contained in your store. An Oracle NoSQL Database record can contain a value that uses the Avro data format. An instance of the Avro data format, in turn, is managed within your client code using an Avro record.)
Because we are using the generic binding for this example, we will use a GenericRecord to manage the contents of the binding.
For example, assume we performed a store read, and now we want to examine the information stored with the Oracle NoSQL Database record.
/**
 * Assume a store read was performed here, and resulted in a
 * ValueVersion instance called 'vv'. Then, to deserialize
 * the value in the returned record:
 */
final GenericRecord member;
final int ID;
if (vv != null) {
    /* Deserialize the value */
    member = binding.toObject(vv.getValue());

    /* Retrieve the contents of the ID field. Because we are
     * using a generic binding, we must type cast the retrieved
     * value.
     */
    ID = (Integer) member.get("ID");
}
If we want to write to a field (that is, we want to serialize some data), we use the record's put() method. As an example, suppose we wanted to create a brand new Avro object to be written to the store. Then:
final GenericRecord person = new GenericData.Record(personSchema);
final int ID = 100011;
person.put("ID", ID);

/**
 * To serialize this information so that it can be written to
 * the store, use GenericAvroBinding.toValue() as the value for the
 * store put(). That is, assuming you already have a store handle
 * and a key:
 */
store.put(key, binding.toValue(person));
It is unlikely that you will use only one schema with your application. In order to use more than one schema:
Specify each schema individually in separate files.
Add all these schemas to your store as described in Managing Avro Schema in the Store.
Use a HashMap to organize your schemas, and then pass that to AvroCatalog.getGenericMultiBinding() in order to create your binding.
For example, suppose you had the following two schemas:
{
    "type": "record",
    "namespace": "avro",
    "name": "PersonInfo",
    "fields": [
        { "name": "first", "type": "string" },
        { "name": "last", "type": "string" },
        { "name": "age", "type": "int" }
    ]
}

{
    "type": "record",
    "namespace": "avro",
    "name": "AnimalInfo",
    "fields": [
        { "name": "species", "type": "string" },
        { "name": "name", "type": "string" },
        { "name": "age", "type": "int" }
    ]
}
Then put avro.PersonInfo in a file (call it PersonSchema.avsc) and avro.AnimalInfo in a second file (AnimalSchema.avsc). Add these schemas to your store using the command line interface.
At this point, you could simply create one binding for each schema that you are using, but that can quickly become awkward depending on how many schemas your code is using. Instead, organize your schemas in a HashMap and create a single binding for all of them using (in this case) AvroCatalog.getGenericMultiBinding().
To do this, first create a HashMap that you use to organize your schemas:
package avro;

import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
...
import oracle.kv.ValueVersion;
import oracle.kv.avro.AvroCatalog;
import oracle.kv.avro.GenericAvroBinding;

...

HashMap<String, Schema> schemas = new HashMap<String, Schema>();
Then, parse each schema and add it to the HashMap:
final Schema.Parser parser = new Schema.Parser();

Schema personSchema = parser.parse(new File("PersonSchema.avsc"));
schemas.put(personSchema.getFullName(), personSchema);

Schema animalSchema = parser.parse(new File("AnimalSchema.avsc"));
schemas.put(animalSchema.getFullName(), animalSchema);
Then create your binding. You will only need one, because you are using a multi binding which is capable of using multiple schemas.
/*
 * Store creation is skipped for brevity
 */
final AvroCatalog catalog = store.getAvroCatalog();
final GenericAvroBinding binding = catalog.getGenericMultiBinding(schemas);
To use the binding, you call toObject() or toValue() in the same way as you would if you were using an ordinary single-schema binding. The multi-binding is capable of determining which schema you are using, and serializing/deserializing accordingly. For example, suppose you retrieve a record that uses the avro.AnimalInfo schema. Then you can deserialize as if you are using a single-schema binding:
/*
 * Key creation and store retrieval skipped.
 * Assume we have retrieved a ValueVersion (vv1) that
 * contains an AnimalInfo value.
 */
final GenericRecord animalObject;
if (vv1 != null) {
    animalObject = binding.toObject(vv1.getValue());
    final String species = animalObject.get("species").toString();
    final String name = animalObject.get("name").toString();
    final int age = (Integer) animalObject.get("age");
    /* Do something with the data */
}
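Because a single multi-binding can return records of any schema it was created with, reading code often has to branch on the schema name before touching fields. The following is a minimal sketch of that dispatch step. StubRecord is a hypothetical stand-in invented for illustration, not an Avro or Oracle NoSQL Database class; with a real GenericRecord, the name would come from getSchema().getFullName().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a deserialized generic record: it carries the
// full name of its schema plus string-keyed field values. NOT the Avro API.
final class SchemaDispatchSketch {

    static final class StubRecord {
        final String schemaFullName;
        final Map<String, Object> fields = new HashMap<>();

        StubRecord(String schemaFullName) {
            this.schemaFullName = schemaFullName;
        }

        // Sets a field and returns this record so calls can be chained.
        StubRecord with(String field, Object value) {
            fields.put(field, value);
            return this;
        }
    }

    // Chooses how to interpret a record by testing its schema name.
    static String describe(StubRecord r) {
        switch (r.schemaFullName) {
            case "avro.PersonInfo":
                return r.fields.get("first") + " " + r.fields.get("last");
            case "avro.AnimalInfo":
                return r.fields.get("name") + " (" + r.fields.get("species") + ")";
            default:
                throw new IllegalArgumentException(
                    "Unexpected schema: " + r.schemaFullName);
        }
    }

    public static void main(String[] args) {
        StubRecord animal = new StubRecord("avro.AnimalInfo")
            .with("species", "cat").with("name", "Tom").with("age", 3);
        StubRecord person = new StubRecord("avro.PersonInfo")
            .with("first", "Sam").with("last", "Brown").with("age", 34);

        System.out.println(describe(animal));
        System.out.println(describe(person));
    }
}
```

Failing loudly in the default branch is a deliberate choice in this sketch: a value with an unrecognized schema name is reported rather than silently misread.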
You can also create a new avro.PersonInfo object for placement in the store using the same binding, like this:
final GenericRecord personObject = new GenericData.Record(personSchema);
personObject.put("first", "Sam");
personObject.put("last", "Brown");
personObject.put("age", 34);

/*
 * Key creation and store handle creation skipped
 * for brevity's sake.
 */
store.put(aKey, binding.toValue(personObject));
Suppose you have a schema that looks like this:
{
    "type" : "record",
    "name" : "hatInventory",
    "namespace" : "avro",
    "fields" : [
        {"name" : "sku", "type" : "string", "default" : ""},
        {"name" : "description",
         "type" : {
            "type" : "record",
            "name" : "hatInfo",
            "fields" : [
                {"name" : "style", "type" : "string", "default" : ""},
                {"name" : "size", "type" : "string", "default" : ""},
                {"name" : "color", "type" : "string", "default" : ""},
                {"name" : "material", "type" : "string", "default" : ""}
            ]}
        }
    ]
}
In order to address the fields in the embedded record hatInfo, you treat it as if it is a second piece of standalone schema. You only have to parse the schema file once. You then create two schemas and two records, but only one binding. For example, to create a serialized object that uses this schema:
package avro;

import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.Schema;
import oracle.kv.KVStore;
import oracle.kv.Key;
import oracle.kv.ValueVersion;
import oracle.kv.avro.AvroCatalog;
import oracle.kv.avro.GenericAvroBinding;

...

// Parse our schema file
final Schema.Parser parser = new Schema.Parser();
try {
    parser.parse(new File("HatSchema.avsc"));
} catch (IOException io) {
    io.printStackTrace();
}

// Get two Schema objects. We need two because of the
// embedded record.
final Schema hatInventorySchema =
    parser.getTypes().get("avro.hatInventory");
final Schema hatInfoSchema =
    parser.getTypes().get("avro.hatInfo");

// Get two GenericRecords so we can manipulate both of
// the records in the schema
final GenericRecord hatRecord =
    new GenericData.Record(hatInventorySchema);
final GenericRecord hatInfoRecord =
    new GenericData.Record(hatInfoSchema);

// Now populate our records. Start with the
// embedded record.
hatInfoRecord.put("style", "western");
hatInfoRecord.put("size", "medium");
hatInfoRecord.put("color", "black");
hatInfoRecord.put("material", "leather");

// Now the top-level record. Notice that we
// set the embedded record as the value for the
// description field.
hatRecord.put("sku", "289163009");
hatRecord.put("description", hatInfoRecord);

// Now we need a binding. Only one is required,
// and we use the top-level schema to create it.
final AvroCatalog catalog = store.getAvroCatalog();
final GenericAvroBinding hatBinding =
    catalog.getGenericBinding(hatInventorySchema);

// Create a Key and write the value to the store.
final Key key = Key.createKey(Arrays.asList("hats", "0000000033"));
store.put(key, hatBinding.toValue(hatRecord));
On retrieval, you edit values of this type in the following way:
// Perform the retrieval
final ValueVersion vv = store.get(key);
if (vv != null) {
    // Deserialize the ValueVersion as normal
    final GenericRecord hatR = hatBinding.toObject(vv.getValue());

    // To access the embedded record, get it from the field
    // on the top-level record that contains it, and cast it
    // to GenericRecord.
    final GenericRecord hatInfoR =
        (GenericRecord) hatR.get("description");

    // Finally, you can write to the top-level record and the
    // embedded record like this:

    // Modify a field on the embedded record:
    hatInfoR.put("style", "Fedora");

    // Modify the top-level record:
    hatR.put("sku", "300");
    hatR.put("description", hatInfoR);

    store.put(key, hatBinding.toValue(hatR));
}
A special consideration with generic bindings is that, at the time you write your code, you might not know about all the schemas that will ever be used by your store. That is, the use of a HashMap in the earlier multi-schema example is somewhat brittle if you are operating in an environment with a constantly growing list of schemas. In that scenario, whenever you add to the schemas in use by your store, you might need to rewrite your client code to add the new schemas to the HashMap used by your client. Otherwise, your code could retrieve a value that uses a schema which is unknown to your code. Depending on what your code is doing, this can cause problems.
If this is a problem for you, you can avoid it by using AvroCatalog.getCurrentSchemas() with AvroCatalog.getGenericMultiBinding() so that you do not need to build a HashMap of all your schemas.
For example, the earlier multi-schema example showed client code that used two known schemas. We can change that example to use getCurrentSchemas() in the following way:
package avro;

import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
...
import oracle.kv.ValueVersion;
import oracle.kv.avro.AvroCatalog;
import oracle.kv.avro.GenericAvroBinding;
import oracle.kv.avro.SchemaNotAllowedException;

...

final Schema.Parser parser = new Schema.Parser();
Schema animalSchema = parser.parse(new File("AnimalSchema.avsc"));

/*
 * We skip creating a HashMap of Schemas because
 * it is not needed.
 */

/*
 * Store creation is skipped for brevity
 */
final AvroCatalog catalog = store.getAvroCatalog();
final GenericAvroBinding binding =
    catalog.getGenericMultiBinding(catalog.getCurrentSchemas());
If we then perform a read on the store, there is the possibility that we will retrieve an object that uses a schema which was not in use when our binding was created. (This is particularly true for long-running client code.) To handle this problem, we catch SchemaNotAllowedException.
/*
 * Key creation and store retrieval skipped.
 */
final GenericRecord animalObject;
if (vv1 != null) {
    try {
        animalObject = binding.toObject(vv1.getValue());
    } catch (SchemaNotAllowedException e) {
        // Take some action here. Potentially you could
        // recreate your binding, and then retry the
        // deserialization process.
    }

    /*
     * Do something with the data. If your client code is
     * using more than one schema, you can identify which
     * schema the retrieved value is using by testing
     * the schema name. That is:
     *
     * String sName = animalObject.getSchema().getFullName();
     * if (sName.equals("avro.AnimalInfo")) { ... }
     */
}
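The "recreate your binding, and then retry" action mentioned in the catch block can be sketched as a small helper. The following is an illustration only, using stand-in types so that it runs without a store: UnknownSchemaException is a hypothetical substitute for SchemaNotAllowedException, a Set of schema names substitutes for the catalog, and a java.util.function.Function substitutes for the binding. None of these names come from the Oracle NoSQL Database API.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Supplier;

final class RebindAndRetrySketch {

    // Hypothetical stand-in for SchemaNotAllowedException.
    static final class UnknownSchemaException extends RuntimeException {
        UnknownSchemaException(String message) {
            super(message);
        }
    }

    // Stand-in "binding": it snapshots the schema names known when it is
    // created and rejects any value whose schema is not in that snapshot.
    static Function<String, String> bindingFor(Set<String> knownSchemas) {
        final Set<String> snapshot = new HashSet<>(knownSchemas);
        return schemaName -> {
            if (!snapshot.contains(schemaName)) {
                throw new UnknownSchemaException(schemaName);
            }
            return "decoded:" + schemaName;
        };
    }

    // Tries the current binding first. If it rejects the value, builds a
    // fresh binding once and retries. A second failure is propagated.
    static <V, R> R deserializeWithRetry(Function<V, R> current,
                                         Supplier<Function<V, R>> rebuild,
                                         V value) {
        try {
            return current.apply(value);
        } catch (UnknownSchemaException e) {
            return rebuild.get().apply(value);
        }
    }

    public static void main(String[] args) {
        Set<String> catalog = new HashSet<>();
        catalog.add("avro.AnimalInfo");

        // The binding is created before a new schema is added.
        Function<String, String> stale = bindingFor(catalog);
        catalog.add("avro.PersonInfo");

        System.out.println(deserializeWithRetry(
            stale, () -> bindingFor(catalog), "avro.PersonInfo"));
    }
}
```

Retrying only once keeps the helper from looping when a value uses a schema that the store genuinely does not have. Real client code would probably also keep the rebuilt binding for later reads instead of discarding it, which this sketch omits for brevity.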