4 Configuring Property Graph Support

This chapter explains how to configure the support for property graphs in a Big Data environment.

It assumes that you have already performed the installation on a Big Data Appliance (see Installing Oracle Big Data Spatial and Graph on an Oracle Big Data Appliance), an Apache Hadoop system (see Installing Property Graph Support on a CDH Cluster or Other Hardware), or an Oracle NoSQL Database.

You might be able to improve the performance of property graph support by altering the database and Java configuration settings. The suggestions provided are guidelines, which you should follow only after carefully and thoroughly evaluating your system.

4.1 Tuning Apache HBase for Use with Property Graphs

Modifications to the default Apache HBase and Java Virtual Machine configurations can improve performance.

4.1.1 Modifying the Apache HBase Configuration

To modify the Apache HBase configuration, follow the steps in this section for your CDH release. (Note that specific steps might change from one CDH release to the next.)

For CDH 5.2.x and CDH 5.3.x:

  1. Log in to Cloudera Manager as the admin user.

  2. On the Home page, click HBase in the list of services on the left.

  3. On the HBase page, click the Configuration tab.

  4. In the Category panel on the left, expand Service-Wide, and then choose Advanced.

  5. Edit the value of HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml as follows:

    <property>
      <name>hbase.regionserver.handler.count</name>
      <value>32</value>
    </property>
    <property>
      <name>hbase.hregion.max.filesize</name>
      <value>1610612736</value>
    </property>
    <property>
      <name>hbase.hregion.memstore.block.multiplier</name>
      <value>4</value>
    </property>
    <property>
      <name>hbase.hregion.memstore.flush.size</name>
      <value>134217728</value>
    </property>
    <property>
      <name>hbase.hstore.blockingStoreFiles</name>
      <value>200</value>
    </property>
    <property>
      <name>hbase.hstore.flusher.count</name>
      <value>1</value>
    </property>
    

    If the property already exists, then replace the value as required. Otherwise, add the XML property description.

  6. Click Save Changes.

  7. Expand the Actions menu, and then choose Restart or Rolling Restart, whichever option better suits your situation.

For CDH 5.4.x:

  1. Log in to Cloudera Manager as the admin user.

  2. On the Home page, click HBase in the list of services on the left.

  3. On the HBase page, click the Configuration tab.

  4. Expand SCOPE.

  5. Click HBase (Service-wide), scroll to the bottom of the page, and select Display All Entries (not Display 25 Entries).

  6. On this page, locate HBase Service Advanced Configuration Snippet (Safety Valve) for hbase-site.xml, and enter the following <property> elements:

    <property>
      <name>hbase.regionserver.handler.count</name>
      <value>32</value>
    </property>
    <property>
      <name>hbase.hregion.max.filesize</name>
      <value>1610612736</value>
    </property>
    <property>
      <name>hbase.hregion.memstore.block.multiplier</name>
      <value>4</value>
    </property>
    <property>
      <name>hbase.hregion.memstore.flush.size</name>
      <value>134217728</value>
    </property>
    <property>
      <name>hbase.hstore.blockingStoreFiles</name>
      <value>200</value>
    </property>
    <property>
      <name>hbase.hstore.flusher.count</name>
      <value>1</value>
    </property>
    

    If the property already exists, then replace the value as required. Otherwise, add the XML property description.

  7. Click Save Changes.

  8. Expand the Actions menu, and then choose Restart or Rolling Restart, whichever option better suits your situation.
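Before pasting the snippet into Cloudera Manager, it can help to confirm that it is well-formed and to see the byte values in more readable units. The following is a minimal Python sketch, not part of HBase or Cloudera Manager; the function name parse_snippet is hypothetical, and the property values are the ones listed in this section.

```python
import xml.etree.ElementTree as ET

# Property values copied from the steps in this section.
SNIPPET = """
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>32</value>
</property>
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>1610612736</value>
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>4</value>
</property>
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>134217728</value>
</property>
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>200</value>
</property>
<property>
  <name>hbase.hstore.flusher.count</name>
  <value>1</value>
</property>
"""


def parse_snippet(snippet):
    """Return {name: value} for every <property> element in the snippet.

    Hypothetical helper for illustration only.
    """
    # The snippet has several top-level elements, so wrap it in a single
    # root element before parsing. Malformed XML raises ET.ParseError.
    root = ET.fromstring("<configuration>" + snippet + "</configuration>")
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is None or value is None:
            raise ValueError("a <property> is missing <name> or <value>")
        name = name.strip()
        if name in props:
            raise ValueError("duplicate property: " + name)
        props[name] = value.strip()
    return props


props = parse_snippet(SNIPPET)
MIB = 1024 * 1024
for name, value in props.items():
    number = int(value)
    if number >= MIB and number % MIB == 0:
        # Show byte counts that are whole multiples of 1 MiB in MiB.
        print(f"{name} = {value} ({number // MIB} MiB)")
    else:
        print(f"{name} = {value}")
```

With these values, hbase.hregion.max.filesize is 1536 MiB (1.5 GiB) and hbase.hregion.memstore.flush.size is 128 MiB.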

4.1.2 Modifying the Java Memory Settings

To modify the Java memory settings, follow the steps in this section for your CDH release. (Note that specific steps might change from one CDH release to the next.)

For CDH 5.2.x and CDH 5.3.x:

  1. Log in to Cloudera Manager as the admin user.

  2. On the Home page, click HBase in the list of services on the left.

  3. On the HBase page, click the Configuration tab.

  4. For RegionServer Group (default and others), click Advanced, and use the following for Java Configuration Options for HBase RegionServer:

    -Xmn256m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly
    
  5. Click Resource Management, and enter an appropriate value (for example, 18G) for Java Heap Size of HBase RegionServer.

  6. Click Save Changes.

  7. Expand the Actions menu, and then choose Restart or Rolling Restart, whichever option better suits your situation.

For CDH 5.4.x:

  1. Log in to Cloudera Manager as the admin user.

  2. On the Home page, click HBase in the list of services on the left.

  3. On the HBase page, click the Configuration tab.

  4. Expand SCOPE.

  5. Click RegionServer, scroll to the bottom of the page, and select Display All Entries (not Display 25 Entries).

  6. On this page, for Java Configuration Options for HBase RegionServer, enter the following value:

    -Xmn256m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly
    
  7. For Java Heap Size of HBase RegionServer in Bytes, enter an appropriate value (for example, 18G).

  8. Click Save Changes.

  9. Expand the Actions menu, and then choose Restart or Rolling Restart, whichever option better suits your situation.
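To see how the example values relate to each other, the sizes can be converted to bytes. The sketch below is illustrative only: the helper names size_to_bytes and young_gen_bytes are hypothetical, and it assumes the JVM convention that the k, m, and g suffixes are multiples of 1024. Whether a given Cloudera Manager field accepts a suffixed value or a plain byte count is not covered here.

```python
import re

# Options string from the steps above.
JVM_OPTS = (
    "-Xmn256m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC "
    "-XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly"
)

# JVM size suffixes are binary multiples.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def size_to_bytes(text):
    """Convert a size such as '256m' or '18G' to a byte count."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", text.strip())
    if match is None:
        raise ValueError("unrecognized size: " + repr(text))
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


def young_gen_bytes(opts):
    """Return the -Xmn (young generation) size in bytes, or None if unset."""
    for opt in opts.split():
        if opt.startswith("-Xmn"):
            return size_to_bytes(opt[len("-Xmn"):])
    return None


heap = size_to_bytes("18G")          # example heap size from the steps above
young = young_gen_bytes(JVM_OPTS)
print(heap)            # 19327352832
print(young)           # 268435456
print(heap // young)   # 72
```

With the example values, the 256 MB young generation is 1/72 of the 18 GB heap, so almost all of the heap is left to the old generation that the concurrent collector manages.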

See Also:

For detailed information about Java garbage collection, see:

http://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/

For descriptions of all settings, see the Java Tools Reference:

https://docs.oracle.com/javase/8/docs/technotes/tools/unix/java.html

4.2 Tuning Oracle NoSQL Database for Use with Property Graphs

To obtain the best performance from Oracle NoSQL Database, do the following:

  • Ensure that the replication groups (shards) are balanced.

  • Adjust the user process resource limit setting (ulimit). For example:

    ulimit -u 131072
    
  • Set the heap size of the Java Virtual Machines (JVMs) on the replication nodes to enable the B-tree indexes to fit in memory.

    To set the heap size, use either the -memory_mb option of the makebootconfig command or the memory_mb parameter for the storage node.

    Oracle NoSQL Database uses 85% of memory_mb as the heap size for processes running on the storage node. If the storage node hosts multiple replication nodes, then the heap is divided equally among them. Each replication node uses a cache that is 70% of the heap.

    For example, if you set memory_mb to 3000 MB on a storage node that hosts two replication nodes, then each replication node has the following:

    • 1275 MB heap, calculated as (3000 MB * .85)/2

    • Approximately 892 MB cache, calculated as 1275 MB * .70
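The same arithmetic can be repeated for other values of memory_mb and other numbers of replication nodes. The following Python sketch only restates the percentages given above; the function name replication_node_memory is hypothetical, and rounding down to whole megabytes is an assumption made to match the figures in the example.

```python
def replication_node_memory(memory_mb, replication_nodes,
                            heap_pct=85, cache_pct=70):
    """Return (heap_mb, cache_mb) for each replication node.

    heap_pct and cache_pct are the percentages stated in this section.
    Integer arithmetic is used, so results are rounded down to whole MB
    (an assumption for illustration).
    """
    if memory_mb <= 0 or replication_nodes <= 0:
        raise ValueError("memory_mb and replication_nodes must be positive")
    total_heap_mb = memory_mb * heap_pct // 100     # 85% of memory_mb
    heap_mb = total_heap_mb // replication_nodes    # divided equally
    cache_mb = heap_mb * cache_pct // 100           # 70% of each heap
    return heap_mb, cache_mb


heap_mb, cache_mb = replication_node_memory(3000, 2)
print(heap_mb, cache_mb)  # 1275 892
```

To size a node so that the B-tree indexes fit in memory, increase memory_mb until the computed cache is at least as large as your estimated index size.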