Type Parameters:
T - the type of the records in the RDD
public class SpatialJavaRDD<T>
extends <any>
This class represents a spatially enabled RDD. A SpatialJavaRDD encapsulates an existing RDD and adds spatial transformations and functions.
Spatial information is extracted from the source RDD records using a user-provided implementation of SparkRecordInfoProvider, which is expected to return the geometry of each spatial record.
The following example shows how to create a SpatialJavaRDD from a text RDD, using a SparkRecordInfoProvider that extracts the spatial information from text records:
JavaRDD<String> rdd = sc.textFile("someFile.txt");
SparkRecordInfoProvider<String> recordInfoProvider = new MySparkRecordInfoProvider();
SpatialJavaRDD<String> spatialRDD = SpatialJavaRDD.fromJavaRDD(rdd, recordInfoProvider, String.class);
The previous example creates a SpatialJavaRDD whose records are of type String, the same type as the source RDD. Every time a spatial operation is performed, the provided SparkRecordInfoProvider instance is used to extract the spatial information. For some data set formats, such as JSON, this means parsing each record every time its spatial information is needed, so a record may be parsed several times if multiple spatial transformations are applied.
One way to ensure that records are parsed only once by the SparkRecordInfoProvider is to create a SpatialJavaRDD whose records are of type SparkRecordInfo. A SparkRecordInfo holds the spatial information and any desired additional data from the source RDD records. The static method SpatialJavaRDD#fromJavaRDD(JavaRDD, SparkRecordInfoProvider) can be used to create a SpatialJavaRDD of type SparkRecordInfo.
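The cost difference between the two approaches can be illustrated without Spark. The following is a minimal plain-Java sketch, not the library's implementation: ParsedPoint, parse and both strategy methods are hypothetical names, and the "x,y" record format is an assumption made only for illustration. It counts how often the parser runs when raw records are kept versus when they are converted once up front.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ParseOnceSketch {

    // Hypothetical stand-in for the data a SparkRecordInfo would hold:
    // the extracted spatial information plus the original record.
    static final class ParsedPoint {
        final double x;
        final double y;
        final String source;

        ParsedPoint(double x, double y, String source) {
            this.x = x;
            this.y = y;
            this.source = source;
        }
    }

    // Number of times parse() has run since the last reset.
    static int parseCalls = 0;

    // Hypothetical parser for records of the form "x,y"; it plays the role
    // of a user-provided record info provider.
    static ParsedPoint parse(String record) {
        parseCalls++;
        String[] parts = record.split(",");
        return new ParsedPoint(Double.parseDouble(parts[0].trim()),
                               Double.parseDouble(parts[1].trim()),
                               record);
    }

    // Strategy 1: keep the raw records and parse them again for every operation.
    static int parseCallsKeepingRawRecords(List<String> records, int operations) {
        parseCalls = 0;
        for (int op = 0; op < operations; op++) {
            for (String record : records) {
                parse(record);
            }
        }
        return parseCalls;
    }

    // Strategy 2: convert every record once, then run all operations on the
    // already parsed values.
    static int parseCallsConvertingFirst(List<String> records, int operations) {
        parseCalls = 0;
        List<ParsedPoint> parsed = new ArrayList<>();
        for (String record : records) {
            parsed.add(parse(record));
        }
        int visited = 0;
        for (int op = 0; op < operations; op++) {
            for (ParsedPoint p : parsed) {
                // Reads the cached coordinates; no parsing happens here.
                if (!Double.isNaN(p.x)) {
                    visited++;
                }
            }
        }
        return parseCalls;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("1.0,2.0", "3.5,4.5", "-7.0,8.25");
        System.out.println(parseCallsKeepingRawRecords(records, 3)); // prints 9
        System.out.println(parseCallsConvertingFirst(records, 3));   // prints 3
    }
}
```

With three records and three operations, the first strategy parses nine times and the second only three, which is the saving the SparkRecordInfo-typed RDD is meant to provide.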
Spatial versions of existing transformations are provided. In addition to the function of the regular transformation, spatial transformations take a SpatialOperationConfig, which contains the information of the spatial operation to be performed to filter records.

Modifier and Type | Method and Description |
---|---|
long | count() |
DataFrame | createSpatialDataFrame(SQLContext sqlContext, java.util.List<java.lang.String> fieldsList): Creates a data frame based on the SpatialJavaRDD. |
SpatialTransformationContext<T> | createSpatialTransformationContext(): Creates an instance of SpatialTransformationContext. |
SpatialTransformationContext<T> | createSpatialTransformationContext(SpatialOperationConfig spatialOperationConf): Creates an instance of SpatialTransformationContext associated with the given SpatialOperationConfig. |
<R> <any> | enrich(<any> f, GeoEnricher enricher): Associates records from the current spatial RDD with spatial features using an instance of GeoEnricher. |
SpatialJavaRDD<T> | filter(<any> f, SpatialOperationConfig spatialOpConf): Returns a new spatial RDD containing only the elements that satisfy both the filtering function and the spatial operation. |
<R> <any> | flatMap(<any> fmFun, SpatialOperationConfig spatialOpConf): Returns a new RDD by first spatially filtering the RDD elements using the spatial operation given by spatialOpConf and then applying a function to all the remaining elements. |
static <T> SpatialJavaRDD<SparkRecordInfo> | fromJavaRDD(<any> rdd, SparkRecordInfoProvider<T> recordInfoProvider): Creates a spatial RDD of type SparkRecordInfo from the given RDD. |
static <T> SpatialJavaRDD<T> | fromJavaRDD(<any> rdd, SparkRecordInfoProvider<T> recordInfoProvider, java.lang.Class<T> recordType): Creates a spatial RDD from the given Java RDD. |
static <T> SpatialJavaRDD<T> | fromRDD(<any> rdd, SparkRecordInfoProvider<T> recordInfoProvider): Creates a spatial RDD from the given RDD. |
double[] | getMBR(): Gets the minimum bounding rectangle of the RDD. |
SparkRecordInfoProvider<T> | getRecordInfoProvider(): Gets the RDD's SparkRecordInfoProvider instance. |
java.lang.Class<T> | getType(): Gets the type of the records in the RDD. |
java.util.List<<any>> | nearestNeighbors(<any> f, int k, SpatialOperationConfig spatialOpConf): Returns the k elements which are closest to a given query window defined in the given spatial operation configuration. |
java.util.List<<any>> | nearestNeighbors(JGeometry qryWindow, int k, double tol): Returns the k elements which are closest to the given query window. |
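The idea behind the nearestNeighbors methods can be sketched in plain Java. This is an illustration under stated assumptions, not the library's algorithm: records are reduced to two-dimensional points, the query window is a single point, distance is planar Euclidean, and the tolerance and optional filtering function are not modeled. The class and method names are hypothetical. A bounded max-heap keeps only the k closest candidates seen so far.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class NearestNeighborsSketch {

    // Squared planar distance between two points given as {x, y}.
    static double squaredDistance(double[] a, double[] b) {
        double dx = a[0] - b[0];
        double dy = a[1] - b[1];
        return dx * dx + dy * dy;
    }

    // Returns up to k points closest to the query point, nearest first.
    static List<double[]> nearestNeighbors(List<double[]> points, double[] query, int k) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        Comparator<double[]> byDistance =
                Comparator.comparingDouble(p -> squaredDistance(p, query));
        // Max-heap on distance: the farthest retained candidate sits at the head.
        PriorityQueue<double[]> heap = new PriorityQueue<>(byDistance.reversed());
        for (double[] p : points) {
            heap.add(p);
            if (heap.size() > k) {
                heap.poll(); // discard the farthest candidate
            }
        }
        List<double[]> result = new ArrayList<>(heap);
        result.sort(byDistance);
        return result;
    }

    public static void main(String[] args) {
        List<double[]> points = Arrays.asList(
                new double[] {0, 0}, new double[] {1, 1},
                new double[] {5, 5}, new double[] {2, 2});
        for (double[] p : nearestNeighbors(points, new double[] {0.9, 0.9}, 2)) {
            System.out.println(Arrays.toString(p));
        }
    }
}
```

Keeping the heap at size k means memory stays proportional to k rather than to the number of records, which matters when the input is large.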
public long count()

public DataFrame createSpatialDataFrame(SQLContext sqlContext, java.util.List<java.lang.String> fieldsList)
Creates a data frame based on the SpatialJavaRDD.
sqlContext - the sqlContext
fieldsList - the extra fields to include. The geometry will always be the first field.

public SpatialTransformationContext<T> createSpatialTransformationContext()
Creates an instance of SpatialTransformationContext.

public SpatialTransformationContext<T> createSpatialTransformationContext(SpatialOperationConfig spatialOperationConf)
Creates an instance of SpatialTransformationContext associated with the given SpatialOperationConfig.
spatialOperationConf - a spatial operation used to filter records

public <R> <any> enrich(<any> f, GeoEnricher enricher)
Associates records from the current spatial RDD with spatial features using an instance of GeoEnricher.
f - a lambda function which is called for each record from the spatial RDD and its associated spatial features, if any
enricher - a component used to associate a geometry with spatial features from different spatial data layers

public SpatialJavaRDD<T> filter(<any> f, SpatialOperationConfig spatialOpConf)
Returns a new spatial RDD containing only the elements that satisfy both the filtering function and the spatial operation.
f - a filtering function
spatialOpConf - a spatial operation used to filter records
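The combined condition applied by filter can be sketched in plain Java. This is only an illustration: City, insideWindow and the rectangular query window {minX, minY, maxX, maxY} are hypothetical stand-ins, and a SpatialOperationConfig can describe operations that this sketch does not model. A record is kept only when it passes both the spatial test and the filtering function.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class SpatialFilterSketch {

    // Hypothetical record type: a named point with one extra attribute.
    static final class City {
        final String name;
        final double x;
        final double y;
        final int population;

        City(String name, double x, double y, int population) {
            this.name = name;
            this.x = x;
            this.y = y;
            this.population = population;
        }
    }

    // True when (x, y) lies inside or on the border of the rectangle
    // given as {minX, minY, maxX, maxY}.
    static boolean insideWindow(double x, double y, double[] window) {
        return x >= window[0] && y >= window[1] && x <= window[2] && y <= window[3];
    }

    // Keeps the records that satisfy both the spatial test and the predicate f.
    static List<City> filter(List<City> cities, Predicate<City> f, double[] window) {
        List<City> result = new ArrayList<>();
        for (City c : cities) {
            if (insideWindow(c.x, c.y, window) && f.test(c)) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<City> cities = Arrays.asList(
                new City("A", 1, 1, 500),
                new City("B", 2, 2, 50),
                new City("C", 10, 10, 900));
        double[] window = {0, 0, 5, 5};
        for (City c : filter(cities, c -> c.population > 100, window)) {
            System.out.println(c.name); // prints A
        }
    }
}
```

City B is inside the window but fails the population predicate, and city C passes the predicate but lies outside the window, so only A remains.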
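
public <R> <any> flatMap(<any> fmFun, SpatialOperationConfig spatialOpConf)
Returns a new RDD by first spatially filtering the RDD elements using the spatial operation given by spatialOpConf and then applying a function to all the remaining elements.
fmFun - a function to apply to each element
spatialOpConf - a spatial operation used to filter records

public static <T> SpatialJavaRDD<SparkRecordInfo> fromJavaRDD(<any> rdd, SparkRecordInfoProvider<T> recordInfoProvider)
Creates a spatial RDD of type SparkRecordInfo from the given RDD.
rdd - an existing RDD
recordInfoProvider - an implementation of SparkRecordInfoProvider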

public static <T> SpatialJavaRDD<T> fromJavaRDD(<any> rdd, SparkRecordInfoProvider<T> recordInfoProvider, java.lang.Class<T> recordType)
Creates a spatial RDD from the given Java RDD.
rdd - an existing JavaRDD
recordInfoProvider - an implementation of SparkRecordInfoProvider
recordType - the type of the source RDD records

public static <T> SpatialJavaRDD<T> fromRDD(<any> rdd, SparkRecordInfoProvider<T> recordInfoProvider)
Creates a spatial RDD from the given RDD.
rdd - an existing RDD
recordInfoProvider - an implementation of SparkRecordInfoProvider

public double[] getMBR()
Gets the minimum bounding rectangle of the RDD.
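A minimum bounding rectangle is the smallest axis-aligned rectangle that contains every geometry. The plain-Java sketch below shows the computation for two-dimensional points. The layout of the array returned by getMBR() is not described here, so the {minX, minY, maxX, maxY} ordering used in the sketch is an assumption, and MbrSketch and mbr are hypothetical names.

```java
import java.util.Arrays;

public class MbrSketch {

    // Computes the minimum bounding rectangle of points given as {x, y}.
    // Assumed result layout: {minX, minY, maxX, maxY}.
    static double[] mbr(double[][] points) {
        if (points.length == 0) {
            throw new IllegalArgumentException("at least one point is required");
        }
        double minX = Double.POSITIVE_INFINITY;
        double minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY;
        double maxY = Double.NEGATIVE_INFINITY;
        for (double[] p : points) {
            minX = Math.min(minX, p[0]);
            minY = Math.min(minY, p[1]);
            maxX = Math.max(maxX, p[0]);
            maxY = Math.max(maxY, p[1]);
        }
        return new double[] {minX, minY, maxX, maxY};
    }

    public static void main(String[] args) {
        double[][] points = {{1, 5}, {-2, 3}, {4, -1}};
        System.out.println(Arrays.toString(mbr(points))); // prints [-2.0, -1.0, 4.0, 5.0]
    }
}
```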

public SparkRecordInfoProvider<T> getRecordInfoProvider()
Gets the RDD's SparkRecordInfoProvider instance.

public java.lang.Class<T> getType()
Gets the type of the records in the RDD.

public java.util.List<<any>> nearestNeighbors(<any> f, int k, SpatialOperationConfig spatialOpConf)
Returns the k elements which are closest to a given query window defined in the given spatial operation configuration.
f - an optional filtering function which should return true for the elements which can be part of the solution
k - the number of nearest neighbors to return
spatialOpConf - a spatial configuration containing the query window to be used. SpatialOperation is ignored for this action.

public java.util.List<<any>> nearestNeighbors(JGeometry qryWindow, int k, double tol)
Returns the k elements which are closest to the given query window.
qryWindow - the geometry from which the nearest neighbors are calculated
k - the number of neighbors
tol - the tolerance used

Copyright © 2017, 2019 Oracle and/or its affiliates. All Rights Reserved.