Oracle Big Data Spatial and Graph delivers advanced spatial and graph analytic capabilities to supported Apache Hadoop and NoSQL Database Big Data platforms.
The spatial features include data enrichment of location information; spatial filtering and categorization based on distance and location-based analysis; vector and raster processing of digital map, sensor, satellite, and aerial imagery data; and APIs for map visualization.
The property graph features support Apache Hadoop HBase and Oracle NoSQL Database for graph operations, indexing, queries, search, and in-memory analytics.
The multimedia analytics features provide a framework for processing video and image data in Apache Hadoop, including built-in face recognition using OpenCV.
Spatial location information is a common element of Big Data.
Businesses can use spatial data as the basis for associating and linking disparate data sets. Location information can also be used to track and categorize entities based on proximity to another person, place, or object, or on their presence in a particular area. Location information can facilitate location-specific offers to customers entering a particular geography, something known as geo-fencing. Georeferenced imagery and sensor data can be analyzed for a variety of business benefits.
The spatial features of Oracle Big Data Spatial and Graph support those use cases with the following kinds of services.
Vector Services:
Ability to associate documents and data with names, such as cities or states, or longitude/latitude information in spatial object definitions for a default administrative hierarchy
Support for text-based 2D and 3D geospatial formats, including GeoJSON files, Shapefiles, GML, and WKT, or you can use the Geospatial Data Abstraction Library (GDAL) to convert popular geospatial encodings such as Oracle SDO_Geometry, ST_Geometry, and other supported formats
An HTML5-based map client API and a sample console to explore, categorize, and view data in a variety of formats and coordinate systems
Topological and distance operations: Anyinteract, Inside, Contains, Within Distance, Nearest Neighbor, and others
Spatial indexing for fast retrieval of data
Raster Services:
Support for many image file formats supported by GDAL and image files stored in HDFS
A sample console to view the set of images that are available
Raster operations, including subsetting, georeferencing, mosaics, and format conversion
Graphs manage networks of linked data as vertices, edges, and properties of the vertices and edges. Graphs are commonly used to model, store, and analyze relationships found in social networks, cyber security, utilities and telecommunications, life sciences and clinical data, and knowledge networks.
Typical graph analyses encompass graph traversal, recommendations, finding communities and influencers, and pattern matching. Industries including telecommunications, life sciences and healthcare, security, and media and publishing can benefit from graphs.
The property graph features of Oracle Big Data Spatial and Graph support those use cases with the following capabilities:
A scalable graph database on Apache HBase and Oracle NoSQL Database
Developer APIs based upon Tinkerpop Blueprints, and Java graph APIs
Text search and query through integration with Apache Lucene and SolrCloud
Scripting languages support for Groovy and Python
A parallel, in-memory graph analytics engine
A fast, scalable suite of social network analysis functions that includes ranking, centrality, recommendation, community detection, and path finding
Parallel bulk load and export of property graph data in Oracle-defined flat file format
Manageability through a Groovy-based console to execute Java and Tinkerpop Gremlin APIs
The following are recommendations for property graph installation.
Table 1-1 Property Graph Sizing Recommendations
| Graph Size | Recommended Physical Memory to be Dedicated | Recommended Number of CPU Processors |
|---|---|---|
| 10M to 100M edges | Up to 14 GB RAM | 2 to 4 processors, and up to 16 processors for more compute-intensive workloads |
| 100M to 1B edges | 14 GB to 100 GB RAM | 4 to 12 processors, and up to 16 to 32 processors for more compute-intensive workloads |
| Over 1B edges | Over 100 GB RAM | 12 to 32 processors, or more for especially compute-intensive workloads |
The multimedia analytics feature of Oracle Big Data Spatial and Graph provides a framework for processing video and image data in Apache Hadoop. The framework enables distributed processing of video and image data.
A main use case is performing facial recognition in videos and images.
The Mammoth command-line utility for installing and configuring the Oracle Big Data Appliance software also installs the Oracle Big Data Spatial and Graph option, including the spatial, property graph, and multimedia capabilities.
You can enable this option during an initial software installation, or afterward using the bdacli
utility.
To use Oracle NoSQL Database as a graph repository, you must have an Oracle NoSQL Database cluster.
To use Apache HBase as a graph repository, you must have an Apache Hadoop cluster.
See Also:
Oracle Big Data Appliance Owner's Guide for software configuration instructions.
Installing and configuring the Image Processing Framework depends upon the distribution being used.
The Oracle Big Data Appliance cluster distribution comes with a pre-installed setup, but you must follow a few steps in Installing the Image Processing Framework for Oracle Big Data Appliance Distribution to get it working.
For a commodity distribution, follow the instructions in Installing the Image Processing Framework for Other Distributions (Not Oracle Big Data Appliance).
For both distributions:
You must download and compile PROJ libraries, as explained in Getting and Compiling the Cartographic Projections Library.
After performing the installation, verify it (see Post-installation Verification of the Image Processing Framework).
If the cluster has security enabled, make sure that the user executing the jobs is in the princs
list and has an active Kerberos ticket.
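A quick way to confirm an active ticket from the shell, assuming a standard MIT Kerberos client, is the check below (the principal and realm are placeholders):

```shell
# Report whether the current user holds a valid (non-expired) Kerberos ticket.
# klist -s is silent and exits non-zero when no valid ticket exists.
if klist -s 2>/dev/null; then
  msg="Active Kerberos ticket found"
else
  msg="No valid ticket: obtain one with kinit <user>@<REALM>"
fi
echo "$msg"
```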
Before installing the Image Processing Framework, you must download the Cartographic Projections Library and perform several related operations.
Download the PROJ.4 source code and datum shifting files:
$ wget http://download.osgeo.org/proj/proj-4.9.1.tar.gz $ wget http://download.osgeo.org/proj/proj-datumgrid-1.5.tar.gz
Untar the source code, and extract the datum shifting files in the nad
subdirectory:
$ tar xzf proj-4.9.1.tar.gz $ cd proj-4.9.1/nad $ tar xzf ../../proj-datumgrid-1.5.tar.gz $ cd ..
Configure, make, and install PROJ.4:
$ ./configure $ make $ sudo make install $ cd ..
libproj.so
is now available at /usr/local/lib/libproj.so
.
Create a link to the libproj.so
file in the spatial installation directory:
sudo ln -s /usr/local/lib/libproj.so /opt/oracle/oracle-spatial-graph/spatial/raster/gdal/lib/libproj.so
Provide read and execute permissions for the libproj.so
library for all users:
sudo chmod 755 /opt/oracle/oracle-spatial-graph/spatial/raster/gdal/lib/libproj.so
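A minimal sanity check of the link and permissions created in the preceding steps (the path comes from those steps):

```shell
# Verify that the libproj.so link exists and is readable; print its details.
LIBPROJ=/opt/oracle/oracle-spatial-graph/spatial/raster/gdal/lib/libproj.so
if [ -e "$LIBPROJ" ]; then
  ls -lL "$LIBPROJ"   # -L follows the symlink to the real library
else
  echo "libproj.so not found at $LIBPROJ; repeat the link step above"
fi
```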
The Oracle Big Data Appliance distribution comes with a pre-installed configuration, though you must ensure that the image processing framework has been installed.
Be sure that the actions described in Getting and Compiling the Cartographic Projections Library have been performed, so that libproj.so
(PROJ.4
) is accessible to all users and is set up correctly.
For Big Data Appliance (BDA) environments, ensure that the following directories exist:
SHARED_DIR (shared directory for all nodes in the cluster): /opt/shareddir
ALL_ACCESS_DIR (shared directory for all nodes in the cluster with Write access to the hadoop group): /opt/shareddir/spatial
For Big Data Spatial and Graph in environments other than the Big Data Appliance, follow the instructions in this section.
Ensure that HADOOP_LIB_PATH
is under /usr/lib/hadoop
. If it is not there, find the path and use it as your HADOOP_LIB_PATH
.
Install NFS.
Have at least one folder, referred to in this document as SHARED_FOLDER, on the Resource Manager node and accessible to every Node Manager node through NFS.
Provide write access to this SHARED_FOLDER for all users involved in job execution, as well as the yarn user.
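The NFS setup above can be sketched as follows. This is a dry run that only prints the commands, because they require root privileges and your cluster's hostnames; /opt/shareddir and rm-node.example.com are placeholder values:

```shell
SHARED_FOLDER=/opt/shareddir    # example path for the shared folder
RM_NODE=rm-node.example.com     # hypothetical Resource Manager hostname

# On the Resource Manager node: export the folder over NFS.
echo "echo '${SHARED_FOLDER} *(rw,sync)' >> /etc/exports"
echo "exportfs -a"

# On every Node Manager node: mount it at the same path.
echo "mkdir -p ${SHARED_FOLDER}"
echo "mount -t nfs ${RM_NODE}:${SHARED_FOLDER} ${SHARED_FOLDER}"

# Give the job users and the yarn user write access.
echo "chgrp hadoop ${SHARED_FOLDER}"
echo "chmod 775 ${SHARED_FOLDER}"
```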
Download oracle-spatial-graph-<version>.x86_64.rpm
from the Oracle e-delivery web site.
Execute oracle-spatial-graph-<version>.x86_64.rpm
using the rpm command (for example, sudo rpm -i oracle-spatial-graph-<version>.x86_64.rpm).
After the rpm command executes, verify that the directory structure created at /opt/oracle/oracle-spatial-graph/spatial/raster
contains these folders: console
, examples
, jlib
, gdal
, and tests
. Additionally, index.html
describes the content, and javadoc.zip
contains the Javadoc for the API.
Several test scripts are provided to perform the following verification operations.
Test the image loading functionality
Test the image processing functionality
Test a processing class for slope calculation in a DEM and a map algebra operation
Verify the image processing of a single raster with no mosaic process (it includes a user-provided function that calculates hill shade in the mapping phase).
Test processing of two rasters using a mask operation
Execute these scripts to verify a successful installation of the image processing framework.
If the cluster has security enabled, make sure the current user is in the princs
list and has an active Kerberos ticket.
Make sure the user has write access to ALL_ACCESS_FOLDER and belongs to the owner group of this directory. For Big Data Appliance, it is recommended that jobs be executed on the Resource Manager node. If jobs are executed on a different node, the default owner group is the hadoop group.
This script loads a set of six test rasters into the ohiftest
folder in HDFS: 3 rasters of byte data type with 3 bands, 1 raster (DEM) of float32 data type with 1 band, and 2 rasters of int32 data type with 1 band. No parameters are required for BDA environments; for non-BDA environments, a single parameter with the ALL_ACCESS_FOLDER value is required.
Internally, the job creates a split for every raster to load. The split size depends on the block size configuration; for example, if a block size of 64 MB or more is configured, 4 mappers will run. As a result, the rasters are loaded into HDFS, and a corresponding thumbnail is created for visualization. An external image editor is required to view the thumbnails, and the output path of the thumbnails is provided to the users upon successful completion of the job.
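You can confirm the block size your cluster actually uses with hdfs getconf (printed here as a dry run, since it needs a running cluster); for reference, a 64 MB block is 67108864 bytes:

```shell
# Print the command that reports the configured HDFS block size, and the
# byte value that corresponds to a 64 MB block.
echo "hdfs getconf -confKey dfs.blocksize"
echo $((64 * 1024 * 1024))   # 67108864 bytes = 64 MB
```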
The test script can be found here:
/opt/oracle/oracle-spatial-graph/spatial/raster/tests/runimageloader.sh
For BDA environments, enter:
./runimageloader.sh
For non-BDA environments, enter:
./runimageloader.sh ALL_ACCESS_FOLDER
Upon successful execution, the message GENERATED OHIF FILES ARE LOCATED IN HDFS UNDER
is displayed, with the path in HDFS where the files are located (this path depends on the definition of ALL_ACCESS_FOLDER) and a list of the created images and thumbnails on HDFS. The output may include:
THUMBNAILS CREATED ARE:
----------------------------------------------------------------------
total 13532
drwxr-xr-x 2 yarn yarn    4096 Sep  9 13:54 .
drwxr-xr-x 3 yarn yarn    4096 Aug 27 11:29 ..
-rw-r--r-- 1 yarn yarn 3214053 Sep  9 13:54 hawaii.tif.ohif.tif
-rw-r--r-- 1 yarn yarn 3214053 Sep  9 13:54 inputimageint32.tif.ohif.tif
-rw-r--r-- 1 yarn yarn 3214053 Sep  9 13:54 inputimageint32_1.tif.ohif.tif
-rw-r--r-- 1 yarn yarn 3214053 Sep  9 13:54 kahoolawe.tif.ohif.tif
-rw-r--r-- 1 yarn yarn 3214053 Sep  9 13:54 maui.tif.ohif.tif
-rw-r--r-- 1 yarn yarn 4182040 Sep  9 13:54 NapaDEM.tif.ohif.tif
YOU MAY VISUALIZE THUMBNAILS OF THE UPLOADED IMAGES FOR REVIEW FROM THE FOLLOWING PATH:
If the installation and configuration were not successful, then the output is not generated and a message like the following is displayed:
NOT ALL THE IMAGES WERE UPLOADED CORRECTLY, CHECK FOR HADOOP LOGS
The amount of memory required to execute mappers and reducers depends on the configured HDFS block size. By default, 1 GB of memory is assigned for Java, but you can modify that and other properties in the imagejob.prop
file that is included in this test directory.
This script executes the processor job by setting three source rasters of the Hawaii islands and coordinates that include all three. The job creates a mosaic based on these coordinates; the resulting raster combines the three source rasters into a single one.
runimageloader.sh
should be executed as a prerequisite, so that the source rasters exist in HDFS. These are 3 band rasters of byte data type.
No parameters are required for BDA environments. For non-BDA environments, a single parameter, -s, with the ALL_ACCESS_FOLDER value is required.
Additionally, if the output should be stored in HDFS, the -o parameter must be used to set the HDFS folder where the mosaic output will be stored.
Internally, the job filters the tiles using the coordinates specified in the configuration input XML; only the required tiles are processed in a mapper, and finally, in the reduce phase, all of them are put together into the resulting mosaic raster.
The test script can be found here:
/opt/oracle/oracle-spatial-graph/spatial/raster/tests/runimageprocessor.sh
For BDA environments, enter:
./runimageprocessor.sh
For non-BDA environments, enter:
./runimageprocessor.sh -s ALL_ACCESS_FOLDER
Upon successful execution, the message EXPECTED OUTPUT FILE IS: ALL_ACCESS_FOLDER/processtest/hawaiimosaic.tif
is displayed, with the path to the output mosaic file. The output may include:
EXPECTED OUTPUT FILE IS: ALL_ACCESS_FOLDER/processtest/hawaiimosaic.tif
total 9452
drwxrwxrwx 2 hdfs    hdfs    4096 Sep 10 09:12 .
drwxrwxrwx 9 zherena dba     4096 Sep  9 13:50 ..
-rwxrwxrwx 1 yarn    yarn 4741101 Sep 10 09:12 hawaiimosaic.tif
MOSAIC IMAGE GENERATED
----------------------------------------------------------------------
YOU MAY VISUALIZE THE MOSAIC OUTPUT IMAGE FOR REVIEW IN THE FOLLOWING PATH: ALL_ACCESS_FOLDER/processtest/hawaiimosaic.tif
If the installation and configuration were not successful, then the output is not generated and a message like the following is displayed:
MOSAIC WAS NOT SUCCESSFULLY CREATED, CHECK HADOOP LOGS TO REVIEW THE PROBLEM
To test the output storage in HDFS, use the following commands.
For BDA environments, enter:
./runimageprocessor.sh -o hdfstest
For non-BDA environments, enter:
./runimageprocessor.sh -s ALL_ACCESS_FOLDER -o hdfstest
This script executes the processor job for a single raster, in this case a DEM source raster of North Napa Valley. The purpose of this job is to process the complete input by using the user processing classes configured for the mapping phase. This class calculates the hillshade of the DEM, which is set to the output file. No mosaic operation is performed here.
runimageloader.sh
should be executed as a prerequisite, so that the source raster exists in HDFS. This is a 1-band DEM raster of float32 data type.
No parameters are required for BDA environments. For non-BDA environments, a single parameter, -s, with the ALL_ACCESS_FOLDER value is required.
The test script can be found here:
/opt/oracle/oracle-spatial-graph/spatial/raster/tests/runsingleimageprocessor.sh
For BDA environments, enter:
./runsingleimageprocessor.sh
For non-BDA environments, enter:
./runsingleimageprocessor.sh -s ALL_ACCESS_FOLDER
Upon successful execution, the message EXPECTED OUTPUT FILE: ALL_ACCESS_FOLDER/processtest/NapaDEM.tif
is displayed, with the path to the output DEM file. The output may include:
EXPECTED OUTPUT FILE: ALL_ACCESS_FOLDER/processtest/NapaDEM.tif
total 4808
drwxrwxrwx 2 hdfs    hdfs    4096 Sep 10 09:42 .
drwxrwxrwx 9 zherena dba     4096 Sep  9 13:50 ..
-rwxrwxrwx 1 yarn    yarn 4901232 Sep 10 09:42 NapaDEM.tif
IMAGE GENERATED
----------------------------------------------------------------------
YOU MAY VISUALIZE THE OUTPUT IMAGE FOR REVIEW IN THE FOLLOWING PATH: ALL_ACCESS_FOLDER/processtest/NapaDEM.tif
If the installation and configuration were not successful, then the output is not generated and a message like the following is displayed:
IMAGE WAS NOT SUCCESSFULLY CREATED, CHECK HADOOP LOGS TO REVIEW THE PROBLEM
This script executes the processor job by setting a DEM source raster of North Napa Valley and some coordinates that surround it. The job will create a mosaic based on these coordinates and will also calculate the slope on it by setting a processing class in the mosaic configuration XML.
runimageloader.sh
should be executed as a prerequisite, so that the source raster exists in HDFS. This is a 1-band DEM raster of float32 data type.
No parameters are required for BDA environments. For non-BDA environments, a single parameter, -s, with the ALL_ACCESS_FOLDER value is required.
The test script can be found here:
/opt/oracle/oracle-spatial-graph/spatial/raster/tests/runimageprocessordem.sh
For BDA environments, enter:
./runimageprocessordem.sh
For non-BDA environments, enter:
./runimageprocessordem.sh -s ALL_ACCESS_FOLDER
Upon successful execution, the message EXPECTED OUTPUT FILE: ALL_ACCESS_FOLDER/processtest/NapaSlope.tif
is displayed, with the path to the slope output file. The output may include:
EXPECTED OUTPUT FILE: ALL_ACCESS_FOLDER/processtest/NapaSlope.tif
total 4808
drwxrwxrwx 2 hdfs    hdfs    4096 Sep 10 09:42 .
drwxrwxrwx 9 zherena dba     4096 Sep  9 13:50 ..
-rwxrwxrwx 1 yarn    yarn 4901232 Sep 10 09:42 NapaSlope.tif
MOSAIC IMAGE GENERATED
----------------------------------------------------------------------
YOU MAY VISUALIZE THE MOSAIC OUTPUT IMAGE FOR REVIEW IN THE FOLLOWING PATH: ALL_ACCESS_FOLDER/processtest/NapaSlope.tif
If the installation and configuration were not successful, then the output is not generated and a message like the following is displayed:
MOSAIC WAS NOT SUCCESSFULLY CREATED, CHECK HADOOP LOGS TO REVIEW THE PROBLEM
You may also test the "if" algebra function, where every pixel in this raster with a value greater than 2500 is replaced by the value you set on the command line using the -c flag. For example:
For BDA environments, enter:
./runimageprocessordem.sh -c 8000
For non-BDA environments, enter:
./runimageprocessordem.sh -s ALL_ACCESS_FOLDER -c 8000
You can visualize the output file and notice the difference between the simple slope calculation and this altered output, where the areas with pixel values greater than 2500 appear brighter.
This script executes the processor job for two rasters that cover a very small area of North Napa Valley in the US state of California.
These rasters have the same MBR, pixel size, SRID, and data type, all of which are required for complex multiple-raster operation processing. The purpose of this job is to process both rasters by using the mask operation, which checks every pixel in the second raster to determine whether its value is contained in the mask list. If it is, the output raster has the pixel value of the first raster for this output cell; otherwise, the zero (0) value is set. No mosaic operation is performed here.
runimageloader.sh
should be executed as a prerequisite, so that the source rasters exist in HDFS. These are 1-band rasters of int32 data type.
No parameters are required for BDA environments. For non-BDA environments, a single parameter, -s,
with the ALL_ACCESS_FOLDER value is required.
The test script can be found here:
/opt/oracle/oracle-spatial-graph/spatial/raster/tests/runimageprocessormultiple.sh
For BDA environments, enter:
./runimageprocessormultiple.sh
For non-BDA environments, enter:
./runimageprocessormultiple.sh -s ALL_ACCESS_FOLDER
Upon successful execution, the message EXPECTED OUTPUT FILE: ALL_ACCESS_FOLDER/processtest/MaskInt32Rasters.tif
is displayed, with the path to the mask output file. The output may include:
EXPECTED OUTPUT FILE: ALL_ACCESS_FOLDER/processtest/MaskInt32Rasters.tif
total 4808
drwxrwxrwx 2 hdfs    hdfs    4096 Sep 10 09:42 .
drwxrwxrwx 9 zherena dba     4096 Sep  9 13:50 ..
-rwxrwxrwx 1 yarn    yarn 4901232 Sep 10 09:42 MaskInt32Rasters.tif
IMAGE GENERATED
----------------------------------------------------------------------
YOU MAY VISUALIZE THE OUTPUT IMAGE FOR REVIEW IN THE FOLLOWING PATH: ALL_ACCESS_FOLDER/processtest/MaskInt32Rasters.tif
If the installation and configuration were not successful, then the output is not generated and a message like the following is displayed:
IMAGE WAS NOT SUCCESSFULLY CREATED, CHECK HADOOP LOGS TO REVIEW THE PROBLEM
You can access the image processing framework through the Oracle Big Data Spatial Image Server, which provides a web interface for loading and processing images.
Installing and configuring the Spatial Image Server depends upon the distribution being used.
After you perform the installation, verify it.
To perform an automatic installation using the provided script, follow these steps:
Run the following script:
sudo /opt/oracle/oracle-spatial-graph/spatial/configure-server/install-bdsg-consoles.sh
If the active nodes have changed since the installation, update the configuration in the web console.
Start the server:
cd /opt/oracle/oracle-spatial-graph/spatial/web-server sudo ./start-server.sh
If any errors occur, see the README file located in /opt/oracle/oracle-spatial-graph/spatial/configure-server
.
The preceding instructions configure the entire server. If no further configuration is required, you can go directly to Post-Installation Verification Example for the Image Server Console.
If you need more information or need to perform other actions, see the following topics.
Ensure that you have the prerequisite software installed.
Copy the asm-3.1.jar
file under /opt/oracle/oracle-spatial-graph/spatial/raster/jlib/asm-3.1.jar
to WEB_SERVER_HOME/webapps/imageserver/WEB-INF/lib
.
Note:
The jersey-core*
jars will be duplicated at WEB_SERVER_HOME/webapps/imageserver/WEB-INF/lib
. Make sure you remove the old ones and leave just jersey-core-1.17.1.jar
in the folder, as in the next step.
Enter the following command:
ls -lat jersey-core*
Delete the listed libraries, except do not delete jersey-core-1.17.1.jar
.
In the same directory (WEB_SERVER_HOME/webapps/imageserver/WEB-INF/lib
), delete the xercesImpl
and servlet
jar files:
rm xercesImpl* rm servlet*
Start the web server.
If you need to change the port, specify it. For example, in the case of the Jetty server, set jetty.http.port=8081
.
Ignore any warnings, such as the following:
java.lang.UnsupportedOperationException: setXIncludeAware is not supported on this JAXP implementation or earlier: class oracle.xml.jaxp.JXDocumentBuilderFactory
Enter the address http://thehost:8045/imageserver
in your browser to open the web console.
From the Administrator tab, go to the Configuration tab. In the Hadoop Configuration Parameters section, change the following three properties as appropriate for your cluster configuration:
fs.defaultFS
: Type the active namenode
of your cluster in the format hdfs://<namenode>:8020
(Check with the administrator for this information).
yarn.resourcemanager.scheduler.address
: The active Resource Manager scheduler address of your cluster, in the format <schedulername>:8030.
yarn.resourcemanager.address
: Active Resource Manager address in the format <resourcename>:8032
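If your cluster runs in high-availability mode, the active NameNode and Resource Manager can be identified with the standard HA admin commands, shown here as a dry run (the service IDs nn1/nn2 and rm1/rm2 are common defaults but depend on your configuration):

```shell
# Print the HA state-query commands; run them on the cluster to find which
# NameNode / Resource Manager is currently active.
for nn in nn1 nn2; do
  echo "hdfs haadmin -getServiceState $nn"
done
for rm in rm1 rm2; do
  echo "yarn rmadmin -getServiceState $rm"
done
```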
Note:
Keep the default values for the rest of the configuration. They are pre-loaded for your Oracle Big Data Appliance cluster environment.
Click Apply Changes to save the changes.
Tip:
You can review the missing configuration information under the Hadoop Loader tab of the console.
To install and configure the image server web application for other systems (not Big Data Appliance), see these topics.
Before installing the image server on other systems, you must install the image processing framework as specified in Installing the Image Processing Framework for Other Distributions (Not Oracle Big Data Appliance).
The steps to install the image server web application on other systems are the same as for installing it on BDA.
Follow the instructions specified in "Prerequisites for Performing a Manual Installation."
Follow the instructions specified in "Installing Dependencies on the Image Server Web on an Oracle Big Data Appliance."
Follow the instructions specified in "Configuring the Environment for Other Systems."
Configure the environment as described in Configuring the Environment for Big Data Appliance
, and then continue with the following steps. From the Configuration tab, in the Global Init Parameters section, change these properties as appropriate for the cluster configuration:
shared.gdal.data
: Specify the gdal shared data folder. Follow the instructions in Installing the Image Processing Framework for Other Distributions (Not Oracle Big Data Appliance) .
gdal.lib
: Location of the gdal .so
libraries.
start
: Specify a shared folder to start browsing the images. This folder must be shared between the cluster and NFS mountpoint (SHARED_FOLDER).
saveimages
: Create a child folder named saveimages
under start
(SHARED-FOLDER) with full write access. For example, if start=/home
, then saveimages=/home/saveimages
.
nfs.mountpoint
: If the cluster requires a mount point to access the SHARED_FOLDER, specify a mount point. For example, /net/home
. Otherwise, leave it blank.
From the Configuration tab in the Hadoop Configuration Parameters section, update the following property:
yarn.application.classpath
: The classpath for Hadoop to find the required jars and dependencies. Usually this is under /usr/lib/hadoop
. For example:
/etc/hadoop/conf/,/usr/lib/hadoop/*,/usr/lib/hadoop/lib/*,/usr/lib/hadoop-hdfs/*,/usr/lib/hadoop-hdfs/lib/*,/usr/lib/hadoop-yarn/*,/usr/lib/hadoop-yarn/lib/*,/usr/lib/hadoop-mapreduce/*,/usr/lib/hadoop-mapreduce/lib/*
Note:
Keep the default values for the rest of the configuration.
Click Apply Changes to save the changes.
Tip:
You can review any missing configuration information under the Hadoop Loader tab of the console.
In this example, you will:
Load the images from the local server to HDFS on the Hadoop cluster.
Run a job to create a mosaic image file and a catalog with several images.
View the mosaic image.
Note:
If no errors were shown, then you have successfully installed the Image Loader web interface.
The image server has two ready-to-use web services, one for the HDFS loader and the other for the HDFS mosaic processor.
These services can be called from a Java application. They are currently supported only for GET operations. The formats for calling them are:
Loader: http://host:port/imageserver/rest/hdfsloader?path=string&overlap=string
where:
path
: The images to be processed; can be the path of a single file, or of one or more whole folders. For more than one folder, use commas to separate folder names.
overlap
(optional): The overlap between images (default = 10).
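From the shell, the same loader call can be made with curl. The block below builds the request URL and prints the curl command (a dry run, since it needs a running image server); host, port, and image path are placeholder values:

```shell
HOST=system123.example.com   # hypothetical image server host
PORT=7101                    # hypothetical port
IMG="/net/system123/scratch/user3/installers/hawaii/hawaii.tif"

# Build the loader GET request; overlap is optional (default 10).
url="http://${HOST}:${PORT}/imageserver/rest/hdfsloader?path=${IMG}&overlap=2"
echo "curl -s '$url'"
```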
Mosaic: http://host:port/imageserver/rest/mosaic?mosaic=string&config=string
where:
mosaic
: The XML mosaic file that contains the images to be processed. If you are using the image server web application, the XML file is generated automatically. Example of a mosaic XML file:
<?xml version='1.0'?>
<catalog type='HDFS'>
  <image>
    <source>Hadoop File System</source>
    <type>HDFS</type>
    <raster>/hawaii.tif.ohif</raster>
    <bands datatype='1' config='1,2,3'>3</bands>
  </image>
  <image>
    <source>Hadoop File System</source>
    <type>HDFS</type>
    <raster>/kahoolawe.tif.ohif</raster>
    <bands datatype='1' config='1,2,3'>3</bands>
  </image>
</catalog>
config
: Configuration file; created the first time a mosaic is processed using the image server web application. Example of a configuration file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<mosaic>
  <output>
    <SRID>26904</SRID>
    <directory type="FS">/net/system123/scratch/user3/installers</directory>
    <tempFsFolder>/net/system123/scratch/user3/installers</tempFsFolder>
    <filename>test</filename>
    <format>GTIFF</format>
    <width>1800</width>
    <height>1406</height>
    <algorithm order="0">1</algorithm>
    <bands layers="3"/>
    <nodata>#000000</nodata>
    <pixelType>1</pixelType>
  </output>
  <crop>
    <transform>294444.1905688362,114.06068372059636,0,2517696.9179752027,0,-114.06068372059636</transform>
  </crop>
  <process/>
  <operations>
    <localnot/>
  </operations>
</mosaic>
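The mosaic service can be invoked the same way from the shell. Because both XML documents travel as GET parameters, they must be URL-encoded, which curl -G --data-urlencode does automatically (the name@file form reads the value from a file). A dry-run sketch with placeholder host and file names:

```shell
HOST=system123.example.com   # hypothetical image server host
PORT=7101                    # hypothetical port

# mosaic.xml and config.xml are assumed to hold the two XML documents above.
cmd="curl -G 'http://${HOST}:${PORT}/imageserver/rest/mosaic' \
  --data-urlencode 'mosaic@mosaic.xml' \
  --data-urlencode 'config@config.xml'"
echo "$cmd"
```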
Java Example: Using the Loader
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

public class RestTest {
    public static void main(String args[]) {
        try {
            // Loader: http://localhost:7101/imageserver/rest/hdfsloader?path=string&overlap=string
            // Mosaic: http://localhost:7101/imageserver/rest/mosaic?mosaic=string&config=string
            String path = "/net/system123/scratch/user3/installers/hawaii/hawaii.tif";
            URL url = new URL(
                "http://system123.example.com:7101/imageserver/rest/hdfsloader?path="
                + path + "&overlap=2"); // overlap is optional
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            if (conn.getResponseCode() != 200) {
                throw new RuntimeException("Failed : HTTP error code : "
                    + conn.getResponseCode());
            }
            BufferedReader br = new BufferedReader(new InputStreamReader(
                conn.getInputStream()));
            String output;
            System.out.println("Output from Server .... \n");
            while ((output = br.readLine()) != null) {
                System.out.println(output);
            }
            conn.disconnect();
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Java Example: Using the Mosaic Processor
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLEncoder;

public class NetClientPost {
    public static void main(String[] args) {
        try {
            String mosaic = "<?xml version='1.0'?>\n"
                + "<catalog type='HDFS'>\n"
                + "  <image>\n"
                + "    <source>Hadoop File System</source>\n"
                + "    <type>HDFS</type>\n"
                + "    <raster>/user/hdfs/newdata/net/system123/scratch/user3/installers/hawaii/hawaii.tif.ohif</raster>\n"
                + "    <url>http://system123.example.com:7101/imageserver/temp/862b5871973372aab7b62094c575884ae13c3a27_thumb.jpg</url>\n"
                + "    <bands datatype='1' config='1,2,3'>3</bands>\n"
                + "  </image>\n"
                + "</catalog>";
            String config = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n"
                + "<mosaic>\n"
                + "<output>\n"
                + "<SRID>26904</SRID>\n"
                + "<directory type=\"FS\">/net/system123/scratch/user3/installers</directory>\n"
                + "<tempFsFolder>/net/system123/scratch/user3/installers</tempFsFolder>\n"
                + "<filename>test</filename>\n"
                + "<format>GTIFF</format>\n"
                + "<width>1800</width>\n"
                + "<height>1269</height>\n"
                + "<algorithm order=\"0\">1</algorithm>\n"
                + "<bands layers=\"3\"/>\n"
                + "<nodata>#000000</nodata>\n"
                + "<pixelType>1</pixelType>\n"
                + "</output>\n"
                + "<crop>\n"
                + "<transform>739481.1311601736,130.5820811245199,0,2254053.5858749463,0,-130.5820811245199</transform>\n"
                + "</crop>\n"
                + "<process/>\n"
                + "</mosaic>";
            // Both XML documents are passed as GET parameters, so they must be URL-encoded.
            URL url = new URL("http://system123.example.com:7101/imageserver/rest/mosaic?"
                + "mosaic=" + URLEncoder.encode(mosaic, "UTF-8")
                + "&config=" + URLEncoder.encode(config, "UTF-8"));
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            if (conn.getResponseCode() != 200) {
                throw new RuntimeException("Failed : HTTP error code : "
                    + conn.getResponseCode());
            }
            BufferedReader br = new BufferedReader(new InputStreamReader(
                conn.getInputStream()));
            String output;
            System.out.println("Output from Server .... \n");
            while ((output = br.readLine()) != null) {
                System.out.println(output);
            }
            conn.disconnect();
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
To install the Oracle Big Data SpatialViewer web application (SpatialViewer), follow the instructions in this topic.
The following assumptions and prerequisites apply for installing and configuring SpatialViewer.
The API and jobs described here run on a Cloudera CDH5.7, Hortonworks HDP 2.4, or similar Hadoop environment.
Java 8 or a newer version is present in your environment.
In addition to the Hadoop and Spark environment jars, the libraries listed here are required by the Vector Analysis API.
sdohadoop-vector.jar
sdospark-vector.jar
hadoop-spatial-commons.jar
sdoutil.jar
sdoapi.jar
ojdbc8.jar
commons-fileupload-1.3.1.jar
commons-io-2.4.jar
jackson-annotations-2.1.4.jar
jackson-core-2.1.4.jar
jackson-core-asl-1.8.1.jar
jackson-databind-2.1.4.jar
javacsv.jar
lucene-analyzers-common-4.6.0.jar
lucene-core-4.6.0.jar
lucene-queries-4.6.0.jar
lucene-queryparser-4.6.0.jar
mvsuggest_core.jar
You can install SpatialViewer on Big Data Appliance as follows:
Run the following script:
sudo /opt/oracle/oracle-spatial-graph/spatial/configure-server/install-bdsg-consoles.sh
Start the web application by using one of the following commands (the second command enables you to view logs):
sudo service bdsg start
sudo /opt/oracle/oracle-spatial-graph/spatial/web-server/start-server.sh
If any errors occur, see the README file located in /opt/oracle/oracle-spatial-graph/spatial/configure-server.
Open: http://<oracle_big_data_spatial_vector_console>:8045/spatialviewer/
If the active nodes have changed after the installation or if Kerberos is enabled, then update the configuration file as described in Configuring SpatialViewer on Oracle Big Data Appliance.
Optionally, upload sample data (used with examples in other topics) to HDFS:
sudo -u hdfs hadoop fs -mkdir /user/oracle/bdsg
sudo -u hdfs hadoop fs -put /opt/oracle/oracle-spatial-graph/spatial/vector/examples/data/tweets.json /user/oracle/bdsg/
Follow the steps for manual configuration described in Installing SpatialViewer on Oracle Big Data Appliance.
Then, change the configuration, as described in Configuring SpatialViewer for Other Systems (Not Big Data Appliance).
To configure SpatialViewer on Oracle Big Data Appliance, follow these steps.
Open the console: http://<oracle_big_data_spatial_vector_console>:8045/spatialviewer/?root=swadmin
Change the general configuration, as needed:
Local working directory: SpatialViewer local working directory (absolute path). The default directory, /usr/oracle/spatialviewer, is created when installing SpatialViewer.
HDFS working directory: SpatialViewer HDFS working directory. The default directory, /user/oracle/spatialviewer, is created when installing SpatialViewer.
Hadoop configuration file: The Hadoop configuration directory. By default: /etc/hadoop/conf
eLocation URL: URL used to get the eLocation background maps. By default: http://elocation.oracle.com
Kerberos keytab: If Kerberos is enabled, provide the full path to the file that contains the keytab file.
Display logs: If necessary, disable the display of the jobs in the Spatial Jobs screen. Disable this display if the logs are not in the default format. The default format is: Date LogLevel LoggerName: LogMessage. The Date must have the default format yyyy-MM-dd HH:mm:ss,SSS. For example: 2012-11-02 14:34:02,781.
If the logs are not displayed and the Display logs field is set to Yes, then ensure that yarn.log-aggregation-enable in yarn-site.xml is set to true. Also ensure that the Hadoop jobs configuration parameters yarn.nodemanager.remote-app-log-dir and yarn.nodemanager.remote-app-log-dir-suffix are set to the same value as in yarn-site.xml.
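As a sketch, the relevant yarn-site.xml entries might look like the following; the directory and suffix values shown are the standard Hadoop defaults, so substitute the values actually used on your cluster:

```xml
<!-- Enable log aggregation so SpatialViewer can display job logs. -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<!-- Remote directory where aggregated application logs are stored. -->
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/tmp/logs</value>
</property>
<!-- Per-user suffix appended under the remote log directory. -->
<property>
  <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
  <value>logs</value>
</property>
```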
Change the Hadoop configuration, as needed.
Change the Spark configuration, as needed.
If Kerberos is enabled, then you may need to add the following parameters:
spark.authenticate: must be set to true.
spark.authenticate.enableSaslEncryption: set the value to true or false, depending on your cluster configuration.
spark.yarn.keytab: the full path to the file that contains the keytab for the principal.
spark.yarn.principal: the principal to be used to log in to Kerberos. The format of a typical Kerberos V5 principal is primary/instance@REALM.
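For example, the resulting Spark properties might look like the following sketch; the keytab path, host name, and realm are placeholders, not values from this installation:

```properties
# Kerberos-related Spark settings (keytab path and principal are placeholders).
spark.authenticate=true
spark.authenticate.enableSaslEncryption=true
spark.yarn.keytab=/etc/security/keytabs/spatialviewer.keytab
spark.yarn.principal=spatialviewer/host01.example.com@EXAMPLE.COM
```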
Ensure that the user can read the keytab file.
Copy the keytab file to the same location on all the nodes of the cluster.
Follow the steps mentioned in Configuring SpatialViewer on Oracle Big Data Appliance.
Additionally, change the Hadoop configuration, replacing the Hadoop property yarn.application.classpath value /opt/cloudera/parcels/CDH/lib/ with the actual library path, which by default is /usr/lib/.
You can use property graphs on either Oracle Big Data Appliance or commodity hardware.
See Also:
The following prerequisites apply to installing property graph support in HBase.
Linux operating system
Cloudera's Distribution including Apache Hadoop (CDH)
For the software download, see: http://www.cloudera.com/content/cloudera/en/products-and-services/cdh.html
Apache HBase
Java Development Kit (JDK) (Java 8 or higher)
Details about supported versions of these products, including any interdependencies, will be provided in a My Oracle Support note.
The installation directory for Oracle Big Data Spatial and Graph property graph features has the following structure:
$ tree -dFL 2 /opt/oracle/oracle-spatial-graph/property_graph/
/opt/oracle/oracle-spatial-graph/property_graph/
|-- dal
| |-- groovy
| |-- opg-solr-config
| `-- webapp
|-- data
|-- doc
| |-- dal
| `-- pgx
|-- examples
| |-- dal
| |-- pgx
| `-- pyopg
|-- lib
|-- librdf
`-- pgx
|-- bin
|-- conf
|-- groovy
|-- scripts
|-- webapp
`-- yarn
Follow this installation task if property graph support is installed on a client without Hadoop, and you want to read graph data stored in the Hadoop Distributed File System (HDFS) into the in-memory analyst and write the results back to HDFS, or use Hadoop NextGen MapReduce (YARN) scheduling to start, monitor, and stop the in-memory analyst.
When running a Java application using in-memory analytics and HDFS, make sure that $HADOOP_HOME/etc/hadoop is on the classpath, so that the configurations get picked up by the Hadoop client libraries. However, you do not need to do this when using the in-memory analyst shell, because it adds $HADOOP_HOME/etc/hadoop automatically to the classpath if HADOOP_HOME is set.
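A minimal sketch of assembling such a classpath follows; the HADOOP_HOME fallback value and the application jar name myapp.jar are placeholders, not part of the product:

```shell
# Put the Hadoop configuration directory on the classpath ahead of the
# application jar so the Hadoop client libraries pick up the cluster
# configuration. HADOOP_HOME default and jar name are placeholders.
HADOOP_HOME="${HADOOP_HOME:-/usr/lib/hadoop}"
CP="$HADOOP_HOME/etc/hadoop:myapp.jar"
echo "$CP"
```

You would then launch the application with java -cp "$CP" followed by your main class.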
You do not need to put any extra Cloudera Hadoop libraries (JAR files) on the classpath. The only time you need the YARN libraries is when starting the in-memory analyst as a YARN service. This is done with the yarn command, which automatically adds all necessary JAR files from your local installation to the classpath.
You are now ready to load data from HDFS or start the in-memory analyst as a YARN service. For further information about Hadoop, see the CDH 5.x.x documentation.
To use the Multimedia analytics feature, the video analysis framework must be installed and configured.
If you have licensed Oracle Big Data Spatial and Graph with Oracle Big Data Appliance, the video analysis framework for Multimedia analytics is already installed and configured. However, you must set $MMA_HOME to point to /opt/oracle/oracle-spatial-graph/multimedia.
Otherwise, you can install the framework on Cloudera CDH 5 or a similar Hadoop environment, as follows:
Install the framework by using the following command on each node on the cluster:
rpm2cpio oracle-spatial-graph-<version>.x86_64.rpm | cpio -idmv
Set $MMA_HOME to point to /opt/oracle/oracle-spatial-graph/multimedia.
Identify the locations of the following libraries:
Hadoop jar files (available in $HADOOP_HOME/jars)
Video processing libraries (see Transcoding Software (Options))
OpenCV libraries (available with the product)
If necessary, install the desired video processing software to transcode video data (see Transcoding Software (Options)).
The following options are available for transcoding video data:
JCodec
FFmpeg
Third-party transcoding software
To use Multimedia analytics with JCodec (which is included with the product), when running the Hadoop job to recognize faces, set the oracle.ord.hadoop.ordframegrabber property to the following value: oracle.ord.hadoop.decoder.OrdJCodecFrameGrabber
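As a sketch, the property can be passed on the job command line with the generic Hadoop -D option; the jar name mma-facerecognition.jar and the input/output arguments below are placeholders for illustration, not the actual product invocation:

```shell
# Build the face-recognition job command line with the frame grabber
# property set to JCodec. Only the property name and class are from the
# documentation; the jar name and arguments are placeholders.
GRABBER=oracle.ord.hadoop.decoder.OrdJCodecFrameGrabber
CMD="hadoop jar mma-facerecognition.jar \
  -D oracle.ord.hadoop.ordframegrabber=$GRABBER \
  /user/oracle/video_input /user/oracle/video_output"
echo "$CMD"
```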
To use Multimedia analytics with FFmpeg:
Download FFmpeg from: https://www.ffmpeg.org/.
Install FFmpeg on the Hadoop cluster.
Set the oracle.ord.hadoop.ordframegrabber property to the following value: oracle.ord.hadoop.decoder.OrdFFMPEGFrameGrabber
To use Multimedia analytics with custom video decoding software, implement the abstract class oracle.ord.hadoop.decoder.OrdFrameGrabber. See the Javadoc for more details.