12 Querying a Cache (C++)
This chapter includes the following sections:
- Overview of Query Functionality
  Coherence can perform queries and indexes against currently cached data that meets a given set of criteria.
- Performing Simple Queries
  You can use a value extractor and filter to query a cache.
- Understanding Query Concepts
  The concept of querying is based on the ValueExtractor interface.
- Performing Queries Involving Multi-Value Attributes
  Coherence supports indexing and querying of multi-value attributes including collections and arrays.
- Using a Chained Extractor in a Query
  The ChainedExtractor implementation allows chained invocation of zero-argument (accessor) methods.
- Using a Query Recorder
  The QueryRecorder class produces an explain or trace record for a given filter.
Parent topic: Creating C++ Extend Clients
Overview of Query Functionality
Queries apply only to currently cached data; they do not use the CacheLoader interface to retrieve additional data that may satisfy the query. The data set should therefore be loaded entirely into the cache before queries are performed. If the data set is too large to fit into available memory, it may be possible to restrict the cache contents along a specific dimension (for example, "date") and manually switch between cache queries and database queries based on the structure of the query. For maintainability, this is usually best implemented inside a cache-aware data access object (DAO).
Indexing requires the ability to extract attributes on each Partitioned cache node. For dedicated CacheServer instances, this usually implies that application classes must be installed in the CacheServer classpath.
For Local and Replicated caches, queries are evaluated locally against unindexed data. For Partitioned caches, queries are performed in parallel across the cluster, using indexes if available. Coherence includes a Cost-Based Optimizer (CBO). Access to unindexed attributes requires object deserialization (though indexing on other attributes can reduce the number of objects that must be evaluated).
Performing Simple Queries
The following example uses a value extractor and a filter to query a cache.
    ValueExtractor::Handle hExtractor = ReflectionExtractor::create("getAge");
    Filter::View vFilter = GreaterEqualsFilter::create(hExtractor, Integer32::valueOf(18));

    // iterate over every entry whose getAge() value is at least 18
    for (Iterator::Handle hIter = hCache->entrySet(vFilter)->iterator(); hIter->hasNext(); )
        {
        Map::Entry::Handle hEntry  = cast<Map::Entry::Handle>(hIter->next());
        Integer32::View    vKey    = cast<Integer32::View>(hEntry->getKey());
        Person::Handle     hPerson = cast<Person::Handle>(hEntry->getValue());
        std::cout << "key=" << vKey << " person=" << hPerson << std::endl;
        }
Coherence provides a wide range of filters in the coherence::util::filter namespace. A LimitFilter may be used to limit the amount of data sent to the client, and also to provide "paging" for users:
    int32_t nPageSize = 25;
    ValueExtractor::Handle hExtractor = ReflectionExtractor::create("getAge");
    Filter::View vFilter = GreaterEqualsFilter::create(hExtractor, Integer32::valueOf(18));

    // get entries 1-25
    LimitFilter::Handle hLimitFilter = LimitFilter::create(vFilter, nPageSize);
    Set::View vEntries = hCache->entrySet(hLimitFilter);

    // get entries 26-50
    hLimitFilter->nextPage();
    vEntries = hCache->entrySet(hLimitFilter);
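The paging behavior can be pictured without a cluster. The following is a minimal standalone sketch in standard C++; it does not use the Coherence API, and `pageOf` and its parameters are hypothetical names chosen for illustration. It applies a predicate and returns only the requested page of matches, which is roughly what a limit filter asks the cache to do.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper, not part of Coherence: returns page `page` (0-based)
// of the values that satisfy `pred`, with at most `pageSize` values per page.
std::vector<int> pageOf(const std::vector<int>& values,
                        const std::function<bool(int)>& pred,
                        std::size_t pageSize,
                        std::size_t page)
{
    assert(pageSize > 0);
    std::vector<int> result;
    std::size_t matched = 0;                      // matches seen so far
    const std::size_t first = page * pageSize;    // index of the first match on this page
    for (int v : values)
    {
        if (!pred(v))
        {
            continue;
        }
        if (matched >= first && result.size() < pageSize)
        {
            result.push_back(v);
        }
        ++matched;
        if (result.size() == pageSize)
        {
            break;                                // page is full; stop scanning
        }
    }
    return result;
}
```

With a page size of 25, page 0 of this sketch corresponds to entries 1-25 in the example above and page 1 to entries 26-50.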
Any queryable attribute may be indexed with the addIndex method of the QueryMap class:
    // addIndex(ValueExtractor::View vExtractor, boolean_t fOrdered, Comparator::View vComparator)
    hCache->addIndex(hExtractor, true, NULL);
The fOrdered argument specifies whether the index structure is sorted. Sorted indexes are useful for range queries, including "select all entries that fall between two dates" and "select all employees whose family name begins with 'S'". For "equality" queries, an unordered index may be used, which may have better efficiency in terms of space and time.
The vComparator argument provides a custom coherence::util::Comparator for ordering the index.
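The difference between a sorted and an unsorted index can be illustrated with standard containers. This is only a sketch under stated assumptions: `OrderedIndex`, `UnorderedIndex`, `keysInRange`, and `countEqual` are hypothetical names, and the containers merely stand in for whatever structure the cache actually maintains.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Illustration only: each container maps an extracted attribute value (an age)
// to the keys of the cache entries that have that value.
using OrderedIndex   = std::multimap<int, std::string>;
using UnorderedIndex = std::unordered_multimap<int, std::string>;

// Range query: keys of all entries with lo <= age <= hi. A sorted index can
// jump to the first candidate and stop after the last one.
std::vector<std::string> keysInRange(const OrderedIndex& index, int lo, int hi)
{
    assert(lo <= hi);
    std::vector<std::string> keys;
    for (auto it = index.lower_bound(lo), end = index.upper_bound(hi); it != end; ++it)
    {
        keys.push_back(it->second);
    }
    return keys;
}

// Equality query: a hash-based index answers this without any ordering.
std::size_t countEqual(const UnorderedIndex& index, int age)
{
    return index.count(age);
}
```

A range lookup on the hash-based container would have to visit every element, which is why an ordered index is the better fit for range queries.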
Note:
This method is only intended as a hint to the cache implementation, and as such it may be ignored by the cache if indexes are not supported or if the desired index (or a similar index) exists. It is expected that an application calls this method to suggest an index even if the index exists, just so that the application is certain that index has been suggested. For example, in a distributed environment each server likely suggests the same set of indexes when it starts, and there is no downside to the application blindly requesting those indexes regardless of whether another server has requested the same indexes.
Note that Coherence can combine queries if necessary, and that it includes a cost-based optimizer (CBO) to prioritize the usage of indexes. To take advantage of an index, a query must use an extractor that is equal (as determined by Object::equals()) to the one used to create the index.
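The role of extractor equality can be sketched as follows. This is a simplified standalone model, not Coherence code: `NamedExtractor` and `IndexRegistry` are hypothetical types, and extractors are assumed to be equal when they name the same accessor method. It also shows why suggesting the same index more than once is harmless.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a reflection-based extractor: two extractors are
// equal when they name the same accessor method.
struct NamedExtractor
{
    std::string method;
    bool operator==(const NamedExtractor& other) const { return method == other.method; }
};

// Hypothetical registry of the extractors for which an index exists.
class IndexRegistry
{
public:
    // Suggest an index; a repeated request for an equal extractor is ignored.
    void addIndex(const NamedExtractor& extractor)
    {
        assert(!extractor.method.empty());
        if (!hasIndexFor(extractor))
        {
            m_indexed.push_back(extractor);
        }
    }

    // A query can use an index only if its extractor equals an indexed one.
    bool hasIndexFor(const NamedExtractor& extractor) const
    {
        return std::find(m_indexed.begin(), m_indexed.end(), extractor) != m_indexed.end();
    }

    std::size_t size() const { return m_indexed.size(); }

private:
    std::vector<NamedExtractor> m_indexed;
};
```

In this model, a query built with a separately created "getAge" extractor still finds the index, whereas a "getName" extractor does not and would fall back to evaluating each entry.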
Querying Partitioned Caches
The Partitioned Cache implements the QueryMap interface using the Parallel Query feature, which delivers high-performance queries even for large data sets.
Querying Near Caches
Although queries can be executed through a near cache, the query does not use the front portion of a near cache. If using a near cache with queries, the best approach is to use the following sequence:
    Set::View vSetKeys   = hCache->keySet(vFilter);
    Map::View vMapResult = hCache->getAll(vSetKeys);
Understanding Query Concepts
The concept of querying is based on the ValueExtractor interface. A value extractor is used to extract an attribute from a given object for querying (and similarly, indexing). Most developers only need the ReflectionExtractor implementation of this interface. The ReflectionExtractor uses reflection to extract an attribute from a value object by referring to a method name, typically a "getter" method like getName().
ReflectionExtractor::Handle hExtractor = ReflectionExtractor::create("getName");
Any zero-argument method can be used, including Object methods like toString() (useful for prototyping and debugging). Indexes may be either traditional field indexes (indexing fields of objects) or function-based indexes (indexing virtual object attributes). For example, if a class has field accessors getFirstName and getLastName, the class may define a function getFullName that concatenates those names, and this function may be indexed.
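The idea of an extractor, including one over a virtual attribute, can be sketched in standard C++. The `Person` struct, `StringExtractor`, and `makeExtractor` below are hypothetical and exist only for this illustration; a pointer to member function is used in place of reflection, which standard C++ does not provide.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical value class used only for this sketch.
struct Person
{
    std::string firstName;
    std::string lastName;

    const std::string& getFirstName() const { return firstName; }
    const std::string& getLastName() const { return lastName; }

    // A "virtual" attribute: computed on demand, not stored as a field.
    std::string getFullName() const { return firstName + " " + lastName; }
};

// An extractor modeled as a callable that pulls one attribute out of a value.
using StringExtractor = std::function<std::string(const Person&)>;

// Build an extractor from any zero-argument accessor that yields a string.
template <typename R>
StringExtractor makeExtractor(R (Person::*accessor)() const)
{
    assert(accessor != nullptr);
    return [accessor](const Person& p) { return std::string((p.*accessor)()); };
}
```

For example, `makeExtractor(&Person::getFullName)` yields "Bob Smith" for a person whose first name is "Bob" and last name is "Smith", so the computed attribute can be queried or indexed just like a stored field.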
To query a cache that contains objects with getName attributes, a Filter must be used. A filter has a single method which determines whether a given object meets a criterion.
Filter::Handle hEqualsFilter = EqualsFilter::create(hExtractor, String::create("Bob Smith"));
To select the entries of a cache that satisfy a particular filter:
    for (Iterator::Handle hIter = hCache->entrySet(hEqualsFilter)->iterator(); hIter->hasNext(); )
        {
        Map::Entry::Handle hEntry  = cast<Map::Entry::Handle>(hIter->next());
        Integer32::View    vKey    = cast<Integer32::View>(hEntry->getKey());
        Person::Handle     hPerson = cast<Person::Handle>(hEntry->getValue());
        std::cout << "key=" << vKey << " person=" << hPerson << std::endl;
        }
To select and also sort the entries:
    // entrySet(Filter::View vFilter, Comparator::View vComparator)
    Iterator::Handle hIter = hCache->entrySet(hEqualsFilter, NULL)->iterator();
The additional NULL argument specifies that the result set should be sorted using the "natural ordering" of Comparable objects within the cache. The client may explicitly specify the ordering of the result set by providing an implementation of Comparator. Note that sorting places significant restrictions on the optimizations that Coherence can apply, as sorting requires that the entire result set be available before sorting.
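The cost of sorting can be seen in a small standalone sketch. It does not use the Coherence API; `Entry` and `selectSorted` are hypothetical names, and a `std::map` stands in for the cache. Note that every matching entry has to be collected before the comparator can order the result.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<int, std::string>;   // key and value (a name)

// Hypothetical helper: select the entries whose value passes `filter`, then
// order them with `comparator`.
std::vector<Entry> selectSorted(
    const std::map<int, std::string>& cache,
    const std::function<bool(const std::string&)>& filter,
    const std::function<bool(const Entry&, const Entry&)>& comparator)
{
    assert(filter && comparator);
    std::vector<Entry> result;
    for (const auto& entry : cache)
    {
        if (filter(entry.second))
        {
            result.emplace_back(entry.first, entry.second);
        }
    }
    // Sorting needs the complete result set, so nothing can be returned early.
    std::sort(result.begin(), result.end(), comparator);
    return result;
}
```

An unsorted query could, in principle, hand back matches as they are found; the sorted form cannot, which is the restriction the paragraph above describes.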
Using the keySet form of the queries, combined with getAll(), may provide more control over memory usage:
    // keySet(Filter::View vFilter)
    Set::View   vSetKeys     = hCache->keySet(vFilter);
    Set::Handle hSetPageKeys = HashSet::create();
    int32_t     PAGE_SIZE    = 100;

    for (Iterator::Handle hIter = vSetKeys->iterator(); hIter->hasNext();)
        {
        hSetPageKeys->add(hIter->next());
        if (hSetPageKeys->size() == PAGE_SIZE || !hIter->hasNext())
            {
            // get a block of values
            Map::View vMapResult = hCache->getAll(hSetPageKeys);

            // process the block
            // ...

            hSetPageKeys->clear();
            }
        }
Performing Queries Involving Multi-Value Attributes
The ContainsAllFilter, ContainsAnyFilter, and ContainsFilter are used to query against collections with multi-value attributes.
    Set::Handle hSearchTerms = HashSet::create();
    hSearchTerms->add(String::create("java"));
    hSearchTerms->add(String::create("clustering"));
    hSearchTerms->add(String::create("books"));

    // The cache contains instances of a class "Document" which has a method
    // "getWords" which returns a Collection<String> containing the set of
    // words that appear in the document.
    ValueExtractor::Handle hExtractor = ReflectionExtractor::create("getWords");
    Filter::View vFilter = ContainsAllFilter::create(hExtractor, hSearchTerms);

    Set::View vEntrySet = hCache->entrySet(vFilter);

    // iterate through the search results
    // ...
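The matching rules implied by the three filter names can be sketched with standard sets. This is an illustration of the semantics only, assuming that "contains all" means every search term is present and "contains any" means at least one is; `Words`, `containsAll`, `containsAny`, `contains`, and `containsExample` are hypothetical names, not Coherence functions.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

using Words = std::set<std::string>;

// True when every search term appears in the document's words.
bool containsAll(const Words& words, const Words& terms)
{
    // std::includes requires sorted ranges; std::set guarantees that.
    return std::includes(words.begin(), words.end(), terms.begin(), terms.end());
}

// True when at least one search term appears in the document's words.
bool containsAny(const Words& words, const Words& terms)
{
    return std::any_of(terms.begin(), terms.end(),
                       [&words](const std::string& t) { return words.count(t) > 0; });
}

// True when the single given term appears in the document's words.
bool contains(const Words& words, const std::string& term)
{
    return words.count(term) > 0;
}

// Usage: the sample document mentions "java" and "books" but not "clustering".
inline void containsExample()
{
    const Words doc   = {"java", "books", "cache"};
    const Words terms = {"java", "clustering", "books"};
    assert(!containsAll(doc, terms));
    assert(containsAny(doc, terms));
    assert(contains(doc, "cache"));
}
```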
Using a Chained Extractor in a Query
The ChainedExtractor implementation allows chained invocation of zero-argument (accessor) methods. The following example extractor first uses reflection to call getName() on each cached Person object, and then uses reflection to call length() on the returned String. This extractor could be passed into a query, allowing queries (for example) to select all people with names not exceeding 10 letters.
ChainedExtractor::Handle hExtractor = ChainedExtractor::create(ChainedExtractor::createExtractors("getName.length"));
Method invocations may be chained indefinitely, for example: getName.trim.length.
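Chaining amounts to function composition, which the following standalone sketch makes explicit. It does not use the Coherence API: `Person` and `chain` are hypothetical, and `std::function` objects stand in for the reflective method calls.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical value class used only for this sketch.
struct Person
{
    std::string name;
    const std::string& getName() const { return name; }
};

// Compose two extraction steps: apply `first`, then apply `second` to its result.
template <typename A, typename B, typename C>
std::function<C(const A&)> chain(std::function<B(const A&)> first,
                                 std::function<C(const B&)> second)
{
    assert(first && second);
    return [first, second](const A& a) { return second(first(a)); };
}
```

Given one step that returns a person's name and another that returns a string's length, `chain` produces a single extractor equivalent to "getName.length". Because the result is itself an extractor, it can be chained again to model longer paths.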
POF extractors and POF updaters offer the same functionality as ChainedExtractors through the use of the SimplePofPath class. See Using POF Extractors and POF Updaters in Developing Applications with Oracle Coherence.
Using a Query Recorder
The QueryRecorder class produces an explain or trace record for a given filter. The class is an implementation of a parallel aggregator that is capable of querying all nodes in a cluster and aggregating the results. The class supports two record types: a QueryRecorder::explain record that provides the estimated cost of evaluating a filter as part of a query operation, and a QueryRecorder::trace record that provides the actual cost of evaluating a filter as part of a query operation. Both query records take into account whether an index can be used by a filter. See Interpreting Query Records in Developing Applications with Oracle Coherence.
To create a query record, create a new QueryRecorder instance that specifies a RecordType parameter. Pass the filter to be tested and the QueryRecorder instance as parameters of the aggregate method. The following example creates an explain record:
    NamedCache::Handle hCache = CacheFactory::getCache("MyCache");

    IdentityExtractor::View vExtract = IdentityExtractor::getInstance();
    OrFilter::Handle hFilter = OrFilter::create(
        GreaterEqualsFilter::create(vExtract, Integer32::create(50)),
        LessEqualsFilter::create(vExtract, Integer32::create(20)));

    // run the recorder as an aggregator to obtain the explain record
    QueryRecord::View vRecord = cast<QueryRecord::View>(hCache->aggregate(
        (Filter::View) hFilter, QueryRecorder::create(QueryRecorder::explain)));

    std::cout << vRecord << std::endl;
To create a trace record, change the RecordType parameter to trace:
QueryRecord::View vRecord = cast<QueryRecord::View>(hCache->aggregate(
(Filter::View) hFilter, QueryRecorder::create(QueryRecorder::trace)));