The RDF Graph feature supports loading RDF triples into the default graph or a named graph in Oracle NoSQL Database. RDF data can be loaded into the graph using two approaches:

Triples can be inserted incrementally using the graph.add(Triple.create()) API, as illustrated in "Example1.java: Create a default graph and add/delete triples" and "Example1b.java: Create a named graph and add/delete triples".

Triples can be bulk loaded from an RDF file using the DatasetGraphNoSql.load() API, as illustrated in "Example2.java: Load an RDF file" and "Concurrent RDF data loading".

To load RDF data files containing thousands to millions of records into an Oracle NoSQL Database, you can use concurrent loading in the RDF Graph feature to speed up the task.
Concurrent (parallel) loading is an optimized approach to data loading in the RDF Graph feature. Triples are grouped into batches, and a batch is written only when it is full or when the last triple in the RDF file has been read. To improve write performance, each batch is then stored in Oracle NoSQL Database using multiple threads and connections.
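The batching behavior described above can be sketched in plain Java. This is only an illustration of the idea (fill a batch, flush it with several threads, flush the remainder at end of input), not the RDF Graph feature's actual implementation. The names BatchedLoadSketch, InMemoryStore, flush, and load are hypothetical, and an in-memory queue stands in for Oracle NoSQL Database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BatchedLoadSketch {

    /** Hypothetical stand-in for the store: records every triple written. */
    static final class InMemoryStore {
        private final ConcurrentLinkedQueue<String> rows = new ConcurrentLinkedQueue<>();
        void put(String triple) { rows.add(triple); }
        int size() { return rows.size(); }
    }

    /** Writes one batch using dop worker tasks and waits until all of them finish. */
    static void flush(List<String> batch, InMemoryStore store,
                      ExecutorService pool, int dop) throws Exception {
        List<Future<?>> pending = new ArrayList<>();
        for (int w = 0; w < dop; w++) {
            final int worker = w;
            pending.add(pool.submit(() -> {
                // Worker w writes every dop-th triple of the batch, starting at index w.
                for (int i = worker; i < batch.size(); i += dop) {
                    store.put(batch.get(i));
                }
            }));
        }
        for (Future<?> f : pending) {
            f.get(); // propagate failures and block until the batch is fully written
        }
    }

    /** Loads all triples and returns the number of batch flushes performed. */
    static int load(Iterable<String> triples, InMemoryStore store,
                    int batchSize, int dop) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(dop);
        List<String> batch = new ArrayList<>(batchSize);
        int flushes = 0;
        try {
            for (String triple : triples) {
                batch.add(triple);
                if (batch.size() == batchSize) {   // flush only when the batch is full
                    flush(batch, store, pool, dop);
                    batch.clear();
                    flushes++;
                }
            }
            if (!batch.isEmpty()) {                // ... or when the input is exhausted
                flush(batch, store, pool, dop);
                flushes++;
            }
        } finally {
            pool.shutdown();
        }
        return flushes;
    }

    public static void main(String[] args) throws Exception {
        List<String> triples = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            triples.add("<s" + i + "> <p> <o" + i + "> .");
        }
        InMemoryStore store = new InMemoryStore();
        int flushes = load(triples, store, 4, 2);
        System.out.println(flushes + " flushes, " + store.size() + " triples stored");
    }
}
```

With 10 triples, a batch size of 4, and a DOP of 2, the sketch performs two full flushes plus one final partial flush and prints "3 flushes, 10 triples stored".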
To use parallel loading, specify the degree of parallelism (the number of threads used in load operations) and the batch size when calling the load method of the DatasetGraphNoSql class.
The following example loads an RDF data file into Oracle NoSQL Database using parallel loading. The degree of parallelism (DOP) and the batch size are controlled by the input parameters iDOP and iBatchSize, respectively.
On a balanced hardware setup with 4 or more CPU cores, setting the DOP to 8 (or 16) can significantly improve the speed of the load operation when many triples are processed.
// Imports assume the Jena 2.x package layout used with the RDF Graph feature;
// adjust them to match the installed libraries.
import com.hp.hpl.jena.query.*;
import com.hp.hpl.jena.sparql.core.DatasetImpl;
import org.openjena.riot.Lang;
import oracle.rdf.kv.client.jena.*;

// The class name is illustrative.
public class ConcurrentLoadExample
{
  public static void main(String[] args) throws Exception
  {
    // Arguments: <store name> <host name> <host port> <batch size> <DOP>
    String szStoreName  = args[0];
    String szHostName   = args[1];
    String szHostPort   = args[2];
    int iBatchSize      = Integer.parseInt(args[3]);
    int iDOP            = Integer.parseInt(args[4]);

    // Create Oracle NoSQL connection
    OracleNoSqlConnection conn
      = OracleNoSqlConnection.createInstance(szStoreName,
                                             szHostName,
                                             szHostPort);

    // Create Oracle NoSQL datasetgraph
    OracleGraphNoSql graph = new OracleGraphNoSql(conn);
    DatasetGraphNoSql datasetGraph = DatasetGraphNoSql.createFrom(graph);

    // Close graph, as it is no longer needed
    graph.close();

    // Clear datasetgraph
    datasetGraph.clearRepository();

    // Load N-QUADS data from a file into the Oracle NoSQL Database
    DatasetGraphNoSql.load("example.nt",
                           Lang.NQUADS,         // data format
                           conn,
                           "http://example.org",
                           iBatchSize,          // batch size
                           iDOP);               // degree of parallelism

    // Create dataset from Oracle NoSQL datasetgraph to execute queries
    Dataset ds = DatasetImpl.wrap(datasetGraph);

    // Query every triple in every named graph
    String szQuery = "select * where { graph ?g { ?s ?p ?o }  }";
    System.out.println("Execute query " + szQuery);
    Query query = QueryFactory.create(szQuery);
    QueryExecution qexec = QueryExecutionFactory.create(query, ds);
    try {
      ResultSet results = qexec.execSelect();
      ResultSetFormatter.out(System.out, results, query);
    }
    finally {
      qexec.close();
    }

    // Release the dataset and the connection
    ds.close();
    conn.dispose();
  }
}
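
The example above reads the batch size and DOP straight from the command line. As a rough sketch of how those two values might be validated and defaulted before being passed to the load call, the helper below applies the tuning guidance given earlier (a DOP of 8 on machines with 4 or more cores). The class LoadSettingsSketch, the methods suggestDop and parsePositive, the fallback of one thread per core on smaller machines, and the default batch size of 1000 are all assumptions made for illustration; none of them come from the RDF Graph API.

```java
public class LoadSettingsSketch {

    /**
     * Hypothetical rule of thumb: use a DOP of 8 on machines with 4 or more
     * cores, otherwise one thread per core.
     */
    static int suggestDop(int cores) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be positive: " + cores);
        }
        return cores >= 4 ? 8 : cores;
    }

    /** Parses a strictly positive integer, failing with a readable message. */
    static int parsePositive(String name, String value) {
        int parsed;
        try {
            parsed = Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(name + " must be an integer: " + value, e);
        }
        if (parsed <= 0) {
            throw new IllegalArgumentException(name + " must be positive: " + value);
        }
        return parsed;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Assumed default batch size; tune it for the data set being loaded.
        int batchSize = args.length > 0 ? parsePositive("batch size", args[0]) : 1000;
        int dop = args.length > 1 ? parsePositive("DOP", args[1]) : suggestDop(cores);
        System.out.println("cores=" + cores + " batchSize=" + batchSize + " dop=" + dop);
    }
}
```

The resulting batchSize and dop values would take the place of iBatchSize and iDOP in the load call. Whether 8 or 16 threads performs better depends on the hardware and the store configuration, so measuring both on a representative file is advisable.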