A dimension is a collection of related dimension values, organized into a tree. Dimension values are tags, or labels, you use to classify the records in your data set.
Access detailed information about a dimension and its dimension values, modify a dimension, or determine the order in which dimensions are returned from the MDEX Engine.
The Dimensions view displays information about all of the dimensions in your project in tabular format. You can add, remove, or modify dimensions here, as well as access more detailed information about an individual dimension and its dimension values. The Dimensions view is also where you rank your dimensions to determine the order in which they are returned from the MDEX Engine.
All dimensions are visible in the Dimensions view, opened from the Project Explorer.
To open the Dimensions view:
In the Project Explorer Project tab, double-click Dimensions.
The Dimensions view appears in the work area.
Choose from the following options:
To make them easier to find and work with, the dimensions in your Dimensions view can be sorted by name (in alphabetical ascending or descending order) or by rank.
Note
The procedure described below changes the way dimensions are sorted in Dimensions view only. It does not determine how dimensions are ranked when they are returned from the MDEX Engine. See Ranking dimensions manually for more information.
To sort dimensions in the Dimensions view:
Create new dimensions from the Dimensions view, on the Project tab.
To add dimension values to the new dimension:
You can modify one dimension at a time using the Dimension editor.
To make changes to a particular dimension:
Delete dimensions from the Dimensions view.
You use the Dimension editor to create a new dimension or modify the attributes that affect how an existing dimension is evaluated and displayed.
The top of the Dimension editor contains the following information that identifies the dimension:
Option | Description |
---|---|
Name | The name of this dimension. Dimension names are case sensitive. |
ID | A unique system-generated identifier. |
Member of this dimension group | Allows you to select from existing dimension groups or add a new one. |
Refinements sort order | Specifies the sort type for any refinement dimension values that are returned for this dimension: Alpha, Integer, or Floating point. |
The lower half of the Dimension editor contains five tabs. See tables below for details.
The General tab contains the following settings:
Option | Description |
---|---|
Prepare sort offline | When checked, record sorting on this dimension is optimized. |
Hidden | Specifies whether or not this dimension is shown in the navigation controls. |
Show with record list | When checked, enables this dimension to appear in the record list display. Any records that are tagged with a value from this dimension will have the value shown as part of their entry in the record list. |
Show with record | When checked, allows this dimension to appear on the record page. Any records that are tagged with a value from this dimension will have that value shown as part of their entry on the record page. |
Language | Specifies the language for this dimension so that the MDEX Engine can perform language-specific operations correctly. If your application tends to have mixed-language records, and the languages are segregated into different dimensions, setting a per-dimension language ID might be appropriate. For more information about language settings, see the Endeca Advanced Development Guide. |
The Search tab contains the following settings:
Option | Description |
---|---|
Search hierarchy for dimension search | When checked, allows dimension search to consider ancestor dimension values when matching a dimension search query. |
Enable record search | Specifies whether or not record search should be enabled for this dimension. Record search finds all records in an Endeca application that have a dimension whose value matches a term the user provides. Checking Enable record search makes the following additional options available. |
Search hierarchy for record search | When checked, allows record search to consider ancestor dimension values when matching a record search query. This setting is only enabled when Enable record search is checked. |
Enable wildcard search | When checked, indicates that a user query can contain a wildcard character (*) to match against fragments of words in a dimension value. You must enable each dimension that you want available for wildcard searching. |
The Advanced tab contains the following settings:
Option | Description |
---|---|
Primary | Specifies whether this dimension is the project's sole primary dimension. All other dimensions are secondary. (Note that a primary dimension is no longer required and is ignored by the MDEX Engine. The MDEX Engine treats all dimensions as secondary, no matter what you specify in this field.) |
Multiselect | Allows the end user to select more than one dimension value from a dimension. |
Enable for rollup | Enables aggregated Endeca record creation by allowing rollups based on this dimension. |
Compute refinement statistics | Enables the computation of refinement statistics. |
Collapsible dimension threshold | Allows you to set your application to collapse a deep hierarchy to make it shallower when available data is small. |
The Dynamic Ranking tab contains the following settings:
The Cluster Discovery tab contains the following settings:
Option | Description |
---|---|
Enable clustering | Specifies whether Cluster Discovery is enabled for this dimension. The checkbox also makes the following controls available. For more information, see the Endeca Relationship Discovery Guide. |
Sample size | This parameter governs how many documents are sampled. Clustering processing time and memory consumption are both roughly linear with this number; thus, lowering the value results in smaller memory consumption and faster turnaround. However, statistical errors are likely to occur when the sample size is small. Setting this value higher will overcome statistical errors for data sets where fewer terms are tagged onto each document. Range: Integer, 50-2000 (default: 500) Recommended value: 500 |
Maximum clusters | This parameter limits the number of clusters that will be generated by the MDEX Engine. Range: Integer, 2-10 (default: 10) Recommended value: 6 |
Coherence | This parameter governs the decision of whether a set of terms is coherent enough to form a cluster (that is, each cluster should have only closely related documents). Low values are permissive (that is, they do not demand much coherence) and will result in fewer, larger clusters; high values are strict and will result in more, smaller clusters. The average value is recommended. Range: Integer, 2-10 (default: 10) Recommended value: 6 |
Maximum precision | Terms that are extracted from sampled documents are filtered by their precision p (where p = number of sampled documents that this term is tagged onto divided by the number of all sampled documents). Terms that have too high a value of p are likely to be the search term (or be synonymous with it) or be too general to make for a good clustering term. If you use the recommended tuning values of the term extractor, each term is tagged onto only roughly 1/3 of the documents, which means that the search term, if present, will have p of roughly 0.33 (more or less stringent tuning of the term extractor will change this value). There usually is a gap in the values of p between the search term and the more useful terms, which start at approximately p = 0.25 and less. Range: Float, 0.0 - 1.0 (default: 1.0) Recommended value: 0.25 |
Maximum cluster size | Sets the maximum number of terms that can be in a cluster. Each cluster will have at least 2 terms. Because of the match-partial cluster selection mechanism, the more terms there are in the cluster, the (potentially) higher its coverage will be. On the other hand, clusters that are too large take up too much space to display and take too long for users to read. Range: Integer, 2 - 10 (default: 10) Recommended value: 8 |
Maximum cluster overlap | If two clusters overlap (that is, if the document sets that each cluster maps to overlap), then the smaller one (as measured by the estimated size of the document set it maps to) can be removed, depending on how big this overlap is. This parameter dictates the overlap above which the smaller cluster is removed. Thus, the default setting of 10 (out of ten) means that clusters that overlap by more than 10 out of 10 documents will be removed. Since this is impossible, a setting of 10 disables cluster overlap filtering, which is the most extreme level of coarseness for this filter. Tuning this parameter down will make the cluster overlap filter more and more fine-grained. Thus, a value of 9 will remove only the clusters that greatly overlap; setting it to the recommended value of 5 will remove only clusters overlapping half-way or so (remember that the overlap is merely estimated). Setting this parameter to lower values (less than 5) will make overlap filtering quite sensitive and will remove clusters which overlap even by a small amount. Note that clusters that do not overlap at all will never be filtered. Range: Integer, 0-10 (default: 10) Recommended value: 5 |
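The precision filter described under Maximum precision can be illustrated with a small sketch. This is not the MDEX Engine's implementation; the function names, the sample documents, and the choice to keep terms whose p equals the threshold are all assumptions made for illustration.

```python
from collections import Counter


def term_precision(sampled_docs):
    """Return {term: p}, where p = docs tagged with the term / all sampled docs."""
    total = len(sampled_docs)
    counts = Counter(term for doc in sampled_docs for term in set(doc))
    return {term: n / total for term, n in counts.items()}


def filter_by_max_precision(sampled_docs, max_precision=0.25):
    """Keep only terms whose precision does not exceed max_precision.

    Treating the threshold as inclusive is an assumption of this sketch.
    """
    precisions = term_precision(sampled_docs)
    return {t: p for t, p in precisions.items() if p <= max_precision}


# Hypothetical sample: each document is the set of terms tagged onto it.
docs = [
    {"merlot", "oak"},
    {"merlot", "cherry"},
    {"merlot", "oak"},
    {"merlot"},
    {"plum"},
    {"merlot", "tannin"},
    {"merlot"},
    {"merlot"},
]

kept = filter_by_max_precision(docs, max_precision=0.25)
print(sorted(kept))  # ['cherry', 'oak', 'plum', 'tannin']
```

Here "merlot" plays the role of the search term: it is tagged onto 7 of the 8 sampled documents (p = 0.875), so it is filtered out, while the less frequent terms remain as clustering candidates.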
A dimension ID is system-generated and cannot be changed, but you can view it.
Each dimension has a unique numeric ID.
To check the ID of a dimension:
Dimension groups allow you to organize dimensions into groupings for presentation purposes.
Note
A dimension group must exist before you can add a dimension to it.
To add a dimension to an existing dimension group:
Note
A dimension can belong to only one dimension group.
Map a source property to a dimension to enable navigation on that property.
You must map a source property to a dimension in order to enable navigation on that property. You use the property mapper component to map source properties to dimensions. See the Endeca Forge Guide for details.
Note
By default, Forge removes source properties that have not been mapped to an Endeca property or dimension; therefore, be sure to create a mapping for a source property if you intend to use it in your Endeca-enabled Web application.
The Dimension Values view displays hierarchical dimension value information about the selected dimension.
In the example below, the Red dimension value has seven child dimension values: Beaujolais, Bordeaux, Cabernet Franc, and so on. The Sparkling and White dimension values also have children, as indicated by the plus sign (+) next to their names.
You can provide name/value pairs with descriptive information about a given dimension value.
For example, putting a property that contains a database lookup key on a dimension value will allow your Web application to access an existing database system that stores data related to that dimension value.
Note
Do not confuse these name/value pairs with source properties or Endeca properties. They are purely for descriptive information about a given dimension value.
To associate one or more properties with a dimension value:
Open the Dimension Values view from any dimension in the Dimensions view.
To view dimension values for a particular dimension:
In the Dimensions view, select a dimension and click Values. The Dimension Values view opens.
To expand a node in the dimension hierarchy, click the plus sign (+) next to the node's name. To collapse a node, click the minus sign (-). In addition to buttons that allow you to add, edit, delete, or adjust the rank of dimension values, the Load and Promote buttons support two features: editing of auto-generated dimension values and working with externally-generated taxonomies. This view also contains the following columns:
Column | Description |
---|---|
Dimension Value | Lists dimension values in hierarchical order. |
External Type | If the dimension, and its dimension values, were not manually created in Developer Studio, indicates what type of external dimension they are. Auto Gen: The dimension's values are being automatically generated because the match mode in the Property Mapper is set to Auto Generate. External: Indicates that this dimension has been created and is being maintained in a third-party tool, outside of Developer Studio. Externally managed dimensions and dimension values are read-only. Prop Mapped: Indicates that this dimension, and its dimension values, are being created using the "If no mapping is found, map source properties to Endeca: Dimensions" option on the Property Mapper's Advanced tab. |
Synonyms | Lists any dimension value synonyms that you have applied. |
Bounds | Lists any bounds you have set for a range or sift dimension value. |
Inert | Indicates whether a dimension value is non-navigable. |
Collapsible | Indicates whether a dimension value is a candidate for collapsing. |
Properties | Lists any properties associated with a dimension value. |
Set the dimension value's name, type (exact, range, or sift), properties, bounds, and synonyms, and specify whether the dimension value is inert and/or collapsible.
Adding or editing a dimension value affects how records are classified, and which records are available under a given dimension value in the client browser.
Set a dimension value's type and navigability, add synonyms, or associate properties with the dimension value.
To configure a dimension value:
In the Dimension Value editor, type the dimension value's name in the Name text box.
In the Type list, choose a dimension value type: Exact, Range, or Sift. A dimension value's type determines how it matches to property values during mapping.
Option | Description |
---|---|
Exact | Dimension values of type exact will only match to property values that match one of the synonyms exactly. |
Range | Dimension values of type range will match to ranges of property values (specified by the upper and lower bounds of the range dimension value). |
Sift | Auto sifting is an extension to autogeneration which positions newly auto-generated dimension values within an existing 'sift' hierarchy. A sift hierarchy is a normal hierarchy that has dimension values of type Sift. Sift dimension values are specified using ranges. Auto-generated dimension values "sift" through the hierarchy according to the ranges they match. |
(Optional) Check Inert if the dimension value is non-navigable.
Check Collapsible if the dimension value is a candidate for collapsing.
If you chose Range or Sift in step 2 above, do the following:
From the Bound Type list, choose one of the following: String, Floating point, or Integer.
In the Lower Bound text box, enter the lower number in the range. If you want the range to include the number you enter, check Include Value in Range.
In the Upper Bound text box, enter the higher number in the range. If you want the range to include the number you enter, check Include Value in Range.
(Optional) Click Properties to associate any properties to the dimension value.
Make all changes to a dimension value in the Dimension Value editor.
To edit a dimension value:
Remove dimension values and their child dimension values from within the Dimension Values view.
To delete a dimension value:
Rename only a dimension's child dimension values.
You cannot rename a dimension's root dimension value, but you can rename any of its child dimension values.
See Editing dimension values and Deleting dimension values for instructions.
Synonyms provide a textual way to refer to a dimension value, rather than by ID alone. A dimension value can have multiple synonyms. All synonyms that you assign to dimension values must be unique.
You specify the way each synonym is used by the MDEX Engine with the Search, Classify, and (Display) options:
Enabling the Search option indicates that this synonym should be considered during record and dimension searches. You can enable search for multiple synonyms, allowing you to create a more robust dimension value for searching.
Enabling the Classify option indicates that this synonym should be considered when attempting to map a source property value to this dimension value. In order for a source property value to match a dimension value, the dimension value’s definition must contain a synonym that:
By enabling classification for multiple synonyms, you increase the mapping potential for a dimension value because a source property can map to any of the synonyms that have been marked with Classify. If a synonym does not have its Classify option enabled, it is ignored during mapping, regardless of whether or not it is a text match to a source property value.
While you can have multiple synonyms for a dimension value, only one synonym can be marked for display. This is the synonym whose text is displayed in your implementation whenever this dimension value is shown. By default, the first synonym you create is set to be displayed, as is indicated by the parentheses around the synonym’s name, but you can set any synonym for display in the Synonyms dialog box.
To better understand these three options, consider the following example. This dimension value has an ID of 100 (automatically assigned by Developer Studio) and three synonyms:
Dimension Value ID = 100
Synonyms =
2002 SEARCH=enabled CLASSIFY=enabled DISPLAY=yes
'02 SEARCH=enabled CLASSIFY=enabled DISPLAY=no
02 SEARCH=enabled CLASSIFY=enabled DISPLAY=no
In this example, dimension searches on any of the following terms would return the dimension value ID 100:
2002
'02
02
Records that have source property values that match any of the following would be tagged with the dimension value ID 100:
2002
'02
02
Finally, anytime the dimension value with an ID of 100 is displayed in the implementation, the text used to represent the dimension value is "2002".
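The example above can be modeled in a few lines of Python. This is only an illustrative sketch: the `Synonym` and `DimensionValue` classes are hypothetical, and matching is reduced to exact string equality, which is simpler than what the MDEX Engine and Forge actually do.

```python
from dataclasses import dataclass, field


@dataclass
class Synonym:
    text: str
    search: bool = True      # considered during record and dimension search
    classify: bool = True    # considered when mapping source property values
    display: bool = False    # the one synonym shown in the application


@dataclass
class DimensionValue:
    dval_id: int
    synonyms: list = field(default_factory=list)

    def matches_search(self, term):
        return any(s.search and s.text == term for s in self.synonyms)

    def classifies(self, source_value):
        return any(s.classify and s.text == source_value for s in self.synonyms)

    def display_name(self):
        shown = [s.text for s in self.synonyms if s.display]
        if len(shown) != 1:
            raise ValueError("exactly one synonym must be marked for display")
        return shown[0]


year = DimensionValue(100, [
    Synonym("2002", display=True),
    Synonym("'02"),
    Synonym("02"),
])

print(year.matches_search("'02"))  # True
print(year.classifies("02"))       # True
print(year.display_name())         # 2002
```

Disabling `classify` on a synonym in this model makes `classifies` ignore it even when the text matches, which mirrors the behavior described for the Classify option.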
Configure a synonym to display in the navigation controls, and associate other synonyms with a dimension value.
To add a synonym to a dimension value:
In the Dimension Values view, double-click the dimension value's name to open it in the Dimension Value editor.
The Synonym editor appears.
If this synonym is to be included in record and dimension searches, check Search.
If this synonym is to be used during dimension value tagging, check Classify.
This allows you to expand the mapping potential of a dimension value.
If you want this synonym to be the one that is displayed in the navigation controls, click (Display).
Only this synonym is displayed, regardless of the other synonyms for the dimension value or the original property value for the record.
(Optional) Repeat steps 2 through 7 to add additional dimension value synonyms.
You can only have one synonym marked for display.
Configuring a dimension to use range dimension values is useful for continuous data that should be navigated as discrete ranges rather than as individual values.
For example, a price value can be organized into discrete ranges ($0 - $50, $51 - $100, $101 - $200, and so on) with a Price Range dimension.
You can create a hierarchy of range dimension values. Normally the higher levels of a dimension's hierarchy are more general and the lower levels are more specific. For a range dimension, this translates to broader ranges at higher levels and more narrow ranges at lower levels.
This section describes the work necessary to construct range dimension values and how to configure the pipeline to assign the appropriate range values to records.
You must set upper and lower bounds for a range dimension value.
Dimensions that use ranges are very similar in structure to ordinary dimensions. The dimension root and dimension hierarchy are created using the same basic process. The difference is in the configuration of the dimension values, which have a type of Range (as opposed to Exact) and require lower and upper bounds.
To set the bounds for a range dimension value:
In the Dimension Values view, double-click the dimension value you want to change to open it in the Dimension Value editor.
In the Lower Bound text box, enter the lower number in the range. If you want the range to include the value you enter, check Include Value in Range.
In the Upper Bound text box, enter the higher number in the range. If you want the range to include the value you enter, check Include Value in Range.
If necessary, configure the range dimension value's synonyms.
Synonym configuration for range dimension values is the same as for ordinary dimension values. You can create more than one synonym for any given dimension value (although you are still restricted to only one displayable synonym per dimension value).
The value of the synonym does not have to correspond to the bounds. For example, a dimension value in a Price range dimension might have a range of [0,10], and the synonyms Under $10 (set for display) and Bargains.
You should configure synonyms for classification and search using the same logic you would for ordinary dimensions. For example, if an end-user searches for Bargains, it would make sense to return the Under $10 dimension value from the Price dimension. In this case, you would configure Under $10 for display while configuring Bargains as searchable.
In most cases, synonyms for range dimension values should have their Classify option disabled so that only the bounds of the range itself are used for record classification and tagging. There are some cases where it is useful to combine range matching on the bounds with exact matching on the synonyms. See "Combining exact and range matching" for details.
Note
For more general information about synonyms, see "Adding synonyms to a dimension value".
Matching a source property value against the bounds of a range dimension value is called range matching (as opposed to exact matching, where the source property value and the dimension value must be identical).
Matching between a source property value and a range dimension value is a two-step process:
You configure mapping between a source property and a range dimension as you would with non-range dimensions (see Establishing a dimension mapping). When Forge maps a source property to a dimension that contains range dimension values, it uses range matching for both Normal and Must Match modes. For the AutoGen mode, Forge still uses range matching, but when a property value does not match any ranges, a new exact match dimension value is generated, not a range dimension value.
See Choosing a match mode for more information on Normal, Must Match, and AutoGen match modes.
A single dimension can contain both exact and range dimension values. During source property mapping, this type of dimension participates in both exact matching, for the exact dimension values, and range matching, for the range dimension values.
If a range dimension value has synonyms that are enabled for classification, it will participate in both exact matching (for the synonyms) and range matching (for the bounds of the range). The ability to interweave ranges and classifiable synonyms allows you to match heterogeneous property values to a single range dimension.
For example, consider a wine rating property that is mapped to a Wine Rating dimension. Some of the source records have numeric ratings (56, 75, 92, and so on) while others have verbal ratings (poor, good, excellent). Using the technique described above, you can create a range dimension value that has bounds of 90 to 100 and a classifiable synonym for 'excellent' so that all records with numeric ratings from 90 to 100 are classified the same as records with 'excellent' ratings.
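The wine rating example can be sketched as follows. This is a simplified model rather than Forge's matching logic; the function name is hypothetical, and treating both bounds as inclusive is an assumption.

```python
def matches_excellent(source_value):
    """True if the value equals a classifiable synonym or falls in [90, 100]."""
    classify_synonyms = {"excellent"}
    if source_value in classify_synonyms:   # exact matching on synonyms
        return True
    try:
        rating = float(source_value)        # range matching on the bounds
    except ValueError:
        return False                        # non-numeric, non-synonym value
    return 90 <= rating <= 100


ratings = ["56", "75", "92", "poor", "good", "excellent"]
tagged = [r for r in ratings if matches_excellent(r)]
print(tagged)  # ['92', 'excellent']
```

Both the numeric rating 92 and the verbal rating 'excellent' end up under the same dimension value, which is the effect of interweaving a range with a classifiable synonym.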
Specify either of these values as the lower or upper bound for a dimension value, to indicate a value less than or greater than all other values in the range.
You can use two special values, NEG_INF and POS_INF, when creating the bounds for your dimension values. NEG_INF indicates less than all other values while POS_INF indicates greater than all other values. For example, to specify a range of greater than 100, you would use a lower bound of 100 and an upper bound of POS_INF.
Likewise, less than 100 would use a lower bound of NEG_INF and an upper bound of 100.
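As a rough illustration of how the two special bounds behave for numeric ranges, the sketch below uses Python's floating-point infinities as stand-ins for NEG_INF and POS_INF. The `in_range` helper and its inclusion flags are hypothetical and only mirror the idea of the Include Value in Range check boxes.

```python
import math

NEG_INF = -math.inf   # stand-in for the NEG_INF bound
POS_INF = math.inf    # stand-in for the POS_INF bound


def in_range(value, lower, upper, include_lower=False, include_upper=False):
    """Range test with a separate inclusion flag for each bound."""
    above = value >= lower if include_lower else value > lower
    below = value <= upper if include_upper else value < upper
    return above and below


# "Greater than 100": lower bound 100 (not included), upper bound POS_INF.
over_100 = [v for v in (50, 100, 101, 1e9) if in_range(v, 100, POS_INF)]
# "Less than 100": lower bound NEG_INF, upper bound 100 (not included).
under_100 = [v for v in (-1e9, 50, 100, 101) if in_range(v, NEG_INF, 100)]

print(over_100)   # [101, 1000000000.0]
print(under_100)  # [-1000000000.0, 50]
```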
You can use POS_INF and NEG_INF with string values as well. For example, setting a lower bound of S, inclusive, and an upper bound of POS_INF, inclusive, would match all strings starting with S and going to the end of the alphabet, including values such as S, Style, Trigger, and Zzzzz.
The order of symbols depends on your locale setting, which is external to the Endeca software. On UNIX, it is determined by a set of environment variables, typically LOCALE, LANG, or LC_ALL. On Windows, there are separate system and user locales which can be set from the Regional and Language Options control panel. For example, in ASCII, using [NEG_INF, A) as the bounds includes all numerics and many symbols (the '[' symbol indicates the value is inclusive while ')' indicates it is not). Using (Z, POS_INF] includes the rest of the symbols, as well as lower-case letters. This is not the case for other encodings, such as Unicode, which intersperses symbols and numbers with letters much more than ASCII. To use NEG_INF and POS_INF effectively, you must have a good understanding of the order of symbols in your locale's encoding.
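The ASCII case described above can be checked with a short sketch. Note the assumption: Python compares strings by code point and does not consult the system locale, so this reproduces only the ASCII ordering, not the ordering of whatever locale your Endeca environment uses.

```python
# Printable ASCII characters, excluding the space character.
ascii_chars = [chr(code) for code in range(0x21, 0x7F)]

# [NEG_INF, A): everything that sorts strictly before "A".
below_a = "".join(c for c in ascii_chars if c < "A")
# (Z, POS_INF]: everything that sorts strictly after "Z".
above_z = "".join(c for c in ascii_chars if c > "Z")

print(below_a)  # !"#$%&'()*+,-./0123456789:;<=>?@
print(above_z)  # [\]^_`abcdefghijklmnopqrstuvwxyz{|}~
```

The first group contains the digits and many symbols; the second contains the remaining symbols and the lower-case letters, as the paragraph above states for ASCII.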
When working with range and sift dimension values, it can be useful to have an Other dimension value to capture any source property values that don't fall within the defined ranges.
To catch source property values outside of your defined ranges, you must use a Perl manipulator, placed after the property mapper, to look at each record that has been mapped, and determine if it has any dimension values assigned to it from the range/sift dimension. If not, the Perl manipulator should assign the Other dimension value to the record.
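The logic such a manipulator needs can be sketched as below. This is a Python model of the idea only, not Forge Perl manipulator code: the record structure, the dimension value IDs, and the function names are all hypothetical.

```python
# Hypothetical range dimension values: ID -> (lower bound, upper bound),
# with the lower bound included and the upper bound excluded.
PRICE_RANGES = {
    101: (0, 50),
    102: (50, 100),
}
OTHER_DVAL_ID = 199  # hypothetical ID of the "Other" dimension value


def map_price(price):
    """Stand-in for the property mapper: return the matching range value IDs."""
    return [dval for dval, (lo, hi) in PRICE_RANGES.items() if lo <= price < hi]


def assign_other(record):
    """Stand-in for the post-mapping step: tag unmatched records with Other."""
    if not record["dvals"]:
        record["dvals"] = [OTHER_DVAL_ID]
    return record


records = [{"price": p, "dvals": map_price(p)} for p in (10, 75, 250)]
records = [assign_other(r) for r in records]
print([r["dvals"] for r in records])  # [[101], [102], [199]]
```

The important design point is the ordering: the check runs after mapping, so it only touches records that received no dimension value from the range dimension.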
Common causes of range dimension value issues include the way a dimension value's bounds are configured and the assignment of a value to an incorrect range.
The following information will help you troubleshoot range dimension value issues:
The displayable synonym for a range dimension value provides the text displayed in your Endeca-enabled Web application’s user interface. The bounds of the range are independent of this text. It is the developer's responsibility to correctly align a range dimension value's synonym text with its bounds.
Particular attention should be paid to correctly configuring the inclusion or exclusion of bounds values. (This is controlled by the Include Value in Range check boxes in the Bounds frame.) Incorrectly setting this field can lead to unwanted bounds overlap or holes in ranges. For example, say you have one range from 0 to 10, and another from 10 to 20. If you included the value of 10 in both ranges, your ranges would overlap (that is, the value 10 would match both ranges). Conversely, if you had the same ranges (0 to 10 and 10 to 20) but did not include the bounds values in either range, you would have a hole (that is, the value 10 would match neither range).
Ranges, by nature, have an order. However, this order is not preserved by the Presentation API. You can maintain the correct order by manually ranking each dimension value.
It is valid to create a range dimension containing dimension values with overlapping bounds. In performing the match, all dimension values that bound the property value will match successfully and will be assigned to the record.
If a dimension value cannot be precisely represented as an IEEE double, the dimension value might get assigned to an incorrect range.
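The inclusion and floating-point points above can be made concrete with a small sketch. The `RangeValue` class is a hypothetical model of a range dimension value with two inclusion flags; it is not how Forge stores or evaluates bounds.

```python
from dataclasses import dataclass


@dataclass
class RangeValue:
    name: str
    lower: float
    upper: float
    include_lower: bool
    include_upper: bool

    def contains(self, value):
        above = value >= self.lower if self.include_lower else value > self.lower
        below = value <= self.upper if self.include_upper else value < self.upper
        return above and below


def matching(ranges, value):
    """Return the names of every range that bounds the value."""
    return [r.name for r in ranges if r.contains(value)]


# Both ranges include 10: the value 10 matches both (overlap).
overlapping = [RangeValue("0-10", 0, 10, True, True),
               RangeValue("10-20", 10, 20, True, True)]
# Neither range includes 10: the value 10 matches neither (hole).
with_hole = [RangeValue("0-10", 0, 10, False, False),
             RangeValue("10-20", 10, 20, False, False)]
# Lower bound included, upper bound excluded: 10 matches exactly one range.
half_open = [RangeValue("0-10", 0, 10, True, False),
             RangeValue("10-20", 10, 20, True, False)]

print(matching(overlapping, 10))  # ['0-10', '10-20']
print(matching(with_hole, 10))    # []
print(matching(half_open, 10))    # ['10-20']

# 0.1 + 0.2 is not exactly 0.3 as an IEEE double, so it falls just
# outside a range whose inclusive upper bound is 0.3.
small = RangeValue("0-0.3", 0.0, 0.3, True, True)
print(small.contains(0.1 + 0.2))  # False
```

Including the lower bound and excluding the upper bound on adjacent ranges is one common way to avoid both overlaps and holes.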
You can provide name/value pairs with descriptive information about a given dimension value. For example, associating a key/value pair that contains a database lookup key with a dimension value will allow your Web application to access an existing database system that stores data related to that dimension value.
Note
Do not confuse these name/value pairs with source properties or Endeca properties. They are purely for descriptive information about a given dimension value.
To associate one or more properties with a dimension value:
If a dimension's hierarchy has been created manually, you should also rank its dimension values manually. Default dimension value ranking is used with dimensions that are auto-generated.
There are two ways to control the order in which dimension values are returned. You can:
In an auto-generated dimension, you generally don't have direct access to and, hence, can't manually rank, the dimension values (see the notes below for an exception to this statement).
Note
You can load auto-generated dimension values and then rank them manually. See "Editing auto-generated dimension values" for details.
If dimension values are assigned ranks with values greater than 16,000,000, unpredictable ranking behavior may result.
A related feature allows you to prune the list of refinement dimension values returned for a query to those values that occur most frequently in the requested navigation state. See "Pruning dimension value refinements by frequency of occurrence" for more information.
For information on ranking dimensions, see "Ranking dimensions manually".
Manual dimension value ranking defines the order in which dimension values appear in your Web application and overrides any default dimension value ranking you have specified.
A dimension value is only ranked relative to its siblings (that is, other dimension values at the same level of hierarchy within the same dimension).
To manually rank dimension values:
If necessary, in the Dimension Values view, expand the dimension hierarchy to display the dimension value you want to rank.
Click Up to move the value up in rank, or Down to move it down, within its own level of hierarchy.
Continue moving dimension values as necessary.
Note
The order in which the dimension values appear in Dimension Values view will be the order in which they appear in your application.
Note
You can load auto-generated dimension values and then rank them manually. See "Editing auto-generated dimension values" for details.
Use of ranked dimensions and dimension values does not affect MDEX Engine performance. However, indexing time is slightly increased by heavy use of this feature.
Configuring a dimension so that its dimension values are pruned according to their popularity overrides any manual or default dimension value ranking you may have specified.
In an auto-generated dimension, you don't have direct access to and, hence, can't manually rank, the dimension values. Instead, you must set a default rank order.
Default dimension value ranking is used with dimensions that are auto-generated.
Note
The paragraph above describes Developer Studio's default behavior with respect to auto-generated dimensions. Developer Studio also offers features that allow you to load and/or promote an auto-generated dimension in Developer Studio so that you can edit it, including setting manual ranking for its values.
To set a default dimension value rank order:
To better understand the difference between the dimension value ordering types, consider an example of a dimension called Score that has the dimension values 1, 5, 5.5, 9, and 10. The following table shows what the ordering would be for each type.
Original dimension | Alpha ordering | Integer ordering | Floating point ordering |
---|---|---|---|
-- Score | 1 | 1 | 1 |
-- 1 | 10 | 5 | 5 |
-- 5 | 5 | 5 | 5.5 |
-- 5.5 | 5.5 | 9 | 9 |
-- 9 | 9 | 10 | 10 |
-- 10 | | | |
With integer ordering, the value 5.5 has been truncated to 5, and it is unclear which 5 is the original version and which is the truncated version. For some applications this may be acceptable, for others it is not. Additionally, integer ordering is significantly faster than floating point ordering. When choosing a numeric ordering type, you must balance the needs of your application against the extra time it takes to use floating point ordering.
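The three ordering types can be reproduced with ordinary sorting, as in the sketch below. It only imitates the resulting order for the Score example; how the MDEX Engine implements each sort type is not specified here, and truncating toward zero for integer ordering is an assumption based on the description above.

```python
values = ["1", "5", "5.5", "9", "10"]

# Alpha ordering: compare the values as strings.
alpha = sorted(values)
# Integer ordering: compare after truncating each value to an integer.
integer_keys = sorted(int(float(v)) for v in values)
# Floating point ordering: compare the values as floats.
floating = sorted(values, key=float)

print(alpha)         # ['1', '10', '5', '5.5', '9']
print(integer_keys)  # [1, 5, 5, 9, 10]
print(floating)      # ['1', '5', '5.5', '9', '10']
```

The integer result shows the ambiguity discussed above: after truncation, the two entries with key 5 can no longer be told apart.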
Note
The Refinements Sort Order setting has no bearing on how records are sorted. It only controls how refinement dimension values are sorted.
Configuring a dimension so that its dimension values are pruned according to their popularity overrides any manual or default dimension value ranking you may have specified.
To configure multiple synonyms, collapsibility, or inertness, or to create a hierarchy for dimension values, you must manually populate a dimension. Otherwise, set a dimension to auto-generate its values.
Every dimension must be populated with dimension values. There are two ways to do this:
Manually create the dimension values, and any dimension value hierarchy, using the Dimension Values editor.
Configure the dimension to have its dimension values automatically generated from the source data. Unlike manually created dimension values, which are defined during the dimension editing process, auto-generated dimensions must be configured when you create your source property-to-dimension mappings in your property mapper component.
You must populate a dimension manually if any of the following is true:
If a dimension has none of these requirements, you can set it to auto-generate and avoid the labor involved in manually entering dimension values.
The following section, "Working with manually created dimension values," describes how to create, edit, and configure manually created dimension values. See Choosing a match mode for details on automatically generating dimension values.
Marking a dimension value as inert (or non-navigable) indicates that the dimension value should not be included in any navigation state, although it can be displayed in a user interface to help guide the end-user toward a selection.
When a user selects an inert dimension value, the navigation state is not changed, but the children of the dimension value are displayed for selection.
For example, if a Wineries dimension contains 1000 wineries and there is no geographic information from which to create a meaningful hierarchy, you can create a non-navigable alphabetical hierarchy. The first set of refinements returned for the Wineries dimension would be the non-navigable refinements (such as A, B, C, and so on). When a user selects 'A', the resulting query returns the same record set, but the winery refinements are limited to those wineries whose names begin with A.
Note
For a common inert dimension use case scenario, see A typical scenario.
To make a dimension value non-navigable:
Fully implementing this feature requires additional work outside of Developer Studio. Please refer to the Endeca Basic Development Guide for details.
This describes a typical scenario for setting dimensions and dimension values.
A typical scenario involves combining three features in one dimension:
Create a sift dimension with a set of discrete ranges such as A - D, E - H, and so on. Auto sifting is an extension to autogeneration which positions newly generated dimension values within an existing hierarchy, such that Applebrandy Wine appears under A - D, Four Grapes Merlot appears in E - H, and so on.
Configure the dimension values in the sift dimension as non-navigable so that they can appear in the user interface to assist the end user but do not have any effect on the navigation state.
Configure the dimension values in the sift dimension so that they are collapsible. A collapsible hierarchy is an ordinary hierarchy, in which some or all of the internal (non-root and non-leaf) dimension values are flagged as potentially collapsible. The MDEX Engine automatically removes, or collapses, these dimension values when there are only a few leaves available for refinement, creating a more streamlined, user-friendly navigation experience for your users.
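The positioning step of auto sifting can be pictured with a small Python sketch. The range table and the sift_range helper are hypothetical names used for illustration only; they are not part of Forge or Developer Studio, and matching on the first letter is an assumption based on the A - D and E - H example above.

```python
# Hypothetical sift ranges: (label, first letter, last letter).
SIFT_RANGES = [("A - D", "A", "D"), ("E - H", "E", "H")]

def sift_range(name):
    """Return the label of the range that contains the name's first letter."""
    first = name[0].upper()
    for label, low, high in SIFT_RANGES:
        if low <= first <= high:
            return label
    return None  # no matching range in this sketch

print(sift_range("Applebrandy Wine"))    # A - D
print(sift_range("Four Grapes Merlot"))  # E - H
```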
You can prune the list of refinement dimension values returned for a query to those values that occur most frequently in the requested navigation state.
You can limit the number of frequently-occurring (popular) refinements returned, as well as control the order in which they are returned. Note that configuring a dimension so that its dimension values are pruned according to their popularity overrides any manual or default dimension value ranking you may have specified.
To prune dimension value refinements according to their popularity:
In Dimensions view, double-click the dimension you want to edit to open it in the Dimension editor.
Click Enable Dynamic Ranking to specify that this dimension should calculate which refinements are most popular.
Type the number of popular refinements to return in the Maximum Dimension Values to Return box. The default value is 10.
Choose a method for sorting the popular refinements:
Alphabetically uses whatever order you've selected for the Refinements Sort Order setting on the main part of the Dimension editor.
Dynamically orders the most popular refinement values according to their frequency of appearance within a data set. Dimension values that occur more frequently are returned before those that occur less frequently.
(Optional) Click Generate "More..." Dimension Value.
When this option is checked, if the actual number of refinement options exceeds the number set in Maximum Dimension Values to Return, then an additional option called More is returned for that dimension. If the user selects the More option, then the MDEX Engine will return all of the refinement options for that dimension. If Generate "More..." Dimension Value is not checked, only the number of dimension values defined in Maximum Dimension Values to Return is displayed.
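The effect of these settings can be sketched in Python. The popular_refinements function and its parameters are hypothetical stand-ins for the editor options described above, and plain alphabetical sorting stands in for the Refinements Sort Order setting; the sketch does not reflect how the MDEX Engine computes popularity internally.

```python
from collections import Counter

def popular_refinements(tagged_values, max_values=10, dynamic=True, generate_more=False):
    """Prune refinements to the most frequent values (illustrative only)."""
    counts = Counter(tagged_values)
    # Most frequent first; ties are broken by name to keep the result stable.
    top = sorted(counts, key=lambda v: (-counts[v], v))[:max_values]
    if not dynamic:
        # Stand-in for the Refinements Sort Order setting.
        top = sorted(top)
    if generate_more and len(counts) > max_values:
        top.append("More")
    return top

tags = ["Red", "Red", "Red", "White", "White", "Rose", "Sparkling"]
print(popular_refinements(tags, max_values=2, generate_more=True))
# ['Red', 'White', 'More']
print(popular_refinements(tags, max_values=3, dynamic=False))
# ['Red', 'Rose', 'White']
```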
Manual dimension value ranking defines the order in which dimension values appear in your Web application and overrides any default dimension value ranking you have specified.
A dimension value is ranked only relative to its siblings (that is, other dimension values at the same level of the hierarchy within the same dimension).
To manually rank dimension values:
If necessary, in the Dimension Values editor, expand the dimension hierarchy to display the dimension value you want to rank.
Click Up to move the value up in rank, or Down to move it down, within its own level of hierarchy.
Continue moving dimension values as necessary.
Note
The order in which the dimension values appear in Dimension Values view will be the order in which they appear in your application.
Note
You can load auto-generated dimension values and then rank them manually. See "Editing auto-generated dimension values" for details.
Use of ranked dimensions and dimension values does not affect MDEX Engine performance. However, indexing time is slightly increased by heavy use of this feature.
Configuring a dimension so that its dimension values are pruned according to their popularity overrides any manual or default dimension value ranking you may have specified.
Load an auto-generated dimension and then promote its dimension values to make them editable.
Normally, auto-generated dimension values cannot be edited. They are generated by Forge behind the scenes and maintained in state files. With an auto-generated dimension, you can configure the dimension's behavior, but you cannot configure the behavior of individual dimension values within the dimension. Endeca's load and promote functionality, however, allows you to load an auto-generated dimension and then promote its dimension values so that they become editable.
The process of converting an auto-generated dimension has been broken down into two distinct steps, loading and promoting. Loading displays the auto-generated dimension values so that you can inspect them before promoting them. In addition, loaded dimension values can be used in the following ways. You can:
After loading a dimension, you have the option of promoting its dimension values. Promoting a dimension's values converts them to manual dimension values, with all of the editing capability of a regular manually created dimension value. Promotion is done on a per dimension basis. In other words, when you promote a dimension, all of its dimension values are promoted; you cannot pick individual dimension values to promote and leave others to be auto-generated. It is important to note that, after promotion, you can no longer treat a promoted dimension as auto-generated. All configuration and editing must be performed manually at this point.
Loading and promoting auto-generated dimensions has two requirements:
You must use Endeca Workbench and EAC. Endeca Workbench stores temporary copies of auto-generated dimensions. This is the location that Developer Studio retrieves them from during loading.
A dimension must be auto-generated before it can be loaded and/or promoted. This means that you must have already run Forge at least once before attempting to load and promote auto-generated dimensions.
After you promote auto-generated dimension values, you must run a baseline update with Forge's --pruneAutoGen flag. The flag cleans out any promoted dimensions from the auto-generated state files. This step is necessary in order to avoid any potential duplicate dimensions in your output records.
Load and promote an auto-generated dimension from within the Dimensions view.
You must use Endeca Workbench and EAC. Endeca Workbench stores temporary copies of auto-generated dimensions. This is the location that Developer Studio retrieves them from during loading.
A dimension must be auto-generated before it can be loaded and/or promoted. This means that you must have already run Forge at least once before attempting to load and promote auto-generated dimensions.
To load and promote an auto-generated dimension:
In Dimensions view, select an auto-generated dimension. Auto-generated dimensions are indicated by this icon:
Click Values to display the Dimension Values view.
You see the root of the auto-generated dimension.
The dimension is populated with the auto-generated dimension values.
The icons next to the promoted dimension values change to indicate that they are now treated as manual dimensions:
Run a baseline update to remove promoted dimensions from auto-generated state files.
After you promote auto-generated dimension values, you must run a baseline update with Forge's --pruneAutoGen flag. The flag cleans out any promoted dimensions from the auto-generated state files. This step is necessary in order to avoid any potential duplicate dimensions in your output records.
To clean up after promoting auto-generated dimensions:
You must enable a dimension for record list display before you can access its dimension values using the methods in the Endeca Presentation API.
When record list display is enabled for a dimension, any records that are tagged with dimension values from that dimension display those values as part of their entry in the record list.
To enable record set display:
Fully implementing this feature requires additional work outside of Developer Studio. Please refer to the Endeca Basic Development Guide for details.
When record page display is enabled for a dimension, any records that are tagged with dimension values from that dimension display those values as part of their entry on the record page.
Note
You must enable a dimension for record display before you can access it using the methods in the Endeca Presentation API.
To allow the dimension values for this dimension to appear on record pages:
Fully implementing this feature requires additional work outside of Developer Studio. Please refer to the Endeca Basic Development Guide for details.
Hiding dimensions that are composed of many dimension values improves Presentation API and MDEX Engine performance: navigation query results no longer have to include these large dimensions, which reduces the processing cycles and the amount of data the MDEX Engine must return.
You prevent a dimension from appearing in the navigation controls by designating it a hidden dimension. Hidden dimensions, like regular dimensions, are composed of dimension values that allow the user to refine a set of records. The difference between regular dimensions and hidden dimensions is that regular dimensions are returned for both navigation and record queries, while hidden dimensions are only returned for record queries. This means that hidden dimensions cannot be displayed as part of your navigation controls, but can be displayed as part of a record page (assuming the hidden dimension is configured to render on the record page).
Also, although hidden dimensions are not rendered in the navigation UI, records are still indexed with relevant values from these dimensions. Therefore, an end-user can search for records based on values within hidden dimensions.
To configure a hidden dimension:
Example 7. Hidden dimension example
Marking a dimension as hidden is useful in cases where the dimension is composed of numerous dimension values, and returning these values as navigation options does not add useful navigation information. Consider, for example, an Authors dimension in a bookstore. Scanning thousands of authors for a specific name is less useful than simply using keyword search to find the desired author.
In this case, you can configure the Authors dimension as hidden. The end-user will be able to perform a keyword search on a particular author but will not be able to browse on author names in order to find books by the author. Once the end-user has browsed to the record page for a particular book (either by keyword search or by navigating within other dimensions), he or she may be interested in other books by the same author. Because the hidden dimension is included in the record query results, the user can formulate a new navigation query, including the hidden dimension, that returns a list of books by that author. This process, in effect, creates a store of books by the same author.
Fully implementing this feature requires additional work outside of Developer Studio. Please refer to the Endeca Basic Development Guide for details.
If an Endeca-enabled Web application does not specify sort order as part of the query, the MDEX Engine returns query results using the default sort order, if one has been specified.
You specify the default sort order and sort direction (ascending or descending) by using the --sort flag when running Dgidx. The --sort flag has the following syntax:
--sort "key|dir"
where key is the name of an Endeca property or dimension on which to sort and dir is either asc for an ascending order or desc for descending (if not specified, the order will be ascending).
You can also specify multiple sort keys in the format:
--sort "key_1|dir_1||key_2|dir_2||...||key_n|dir_n"
If you specify multiple sort keys, the records are sorted by the first sort key, with ties being resolved by the second sort key, whose ties are resolved by the third sort key, and so on.
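The tie-breaking behavior of multiple sort keys can be illustrated in Python. This is a sketch of the ordering rule only; sort_records and the sample records are hypothetical and do not represent Dgidx itself.

```python
def sort_records(records, sort_keys):
    """Sort by a list of (key, dir) pairs, where dir is 'asc' or 'desc'."""
    result = list(records)
    # Python's sort is stable, so applying the keys from last to first lets
    # each earlier key take priority while later keys resolve its ties.
    for key, direction in reversed(sort_keys):
        result.sort(key=lambda r: r[key], reverse=(direction == "desc"))
    return result

records = [
    {"Name": "A", "Price": 12, "Year": 1996},
    {"Name": "B", "Price": 10, "Year": 1998},
    {"Name": "C", "Price": 12, "Year": 1998},
]
ordered = sort_records(records, [("Price", "asc"), ("Year", "desc")])
print([r["Name"] for r in ordered])  # ['B', 'C', 'A']
```

Records A and C tie on Price, so the second key (Year, descending) places C before A.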
Note
If you are using the Endeca Application Controller (EAC) to control your environment, you must omit the quotation marks from the --sort flag. Instead, use the following syntax:
--sort key_1|dir_1||key_2|dir_2||...||key_n|dir_n
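If you generate these arguments from a script, a small helper can assemble the value in the format shown above. The build_sort_value function is a hypothetical convenience, not an Endeca utility, and the key names are examples.

```python
def build_sort_value(sort_keys):
    """Join (key, dir) pairs into key_1|dir_1||key_2|dir_2||... form."""
    parts = []
    for key, direction in sort_keys:
        if direction not in ("asc", "desc"):
            raise ValueError(f"unsupported sort direction: {direction}")
        parts.append(f"{key}|{direction}")
    return "||".join(parts)

value = build_sort_value([("Price", "asc"), ("Name", "desc")])
print(f'--sort "{value}"')  # quoted form for the command line
print(f"--sort {value}")    # unquoted form for EAC
```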
Precomputing sort can save query time.
To precompute sort indices on a dimension:
Note
An explicit record sort key, specified as part of an MDEX Engine query, takes priority over any other type of record sorting (default sorting and relevance ranking). See "Controlling the order of record results" for details.
If the Web application does not specify sort order as part of the query, the MDEX Engine returns records in the default sort order, if one has been specified. See "Specifying a default record sort order" for details.
Record sorting only affects the order of records. It does not affect the ordering of dimensions or dimension values that are returned for query refinement. You use dimension and dimension value ranking to affect the order of dimensions and dimension values.
Fully implementing this feature requires additional work outside of Developer Studio. Please refer to the Endeca Basic Development Guide for details.
Relevance ranking is used to control the order of results that are returned in response to a keyword search. Record sorting is used to control the order of records that are returned in response to any type of MDEX Engine query that returns records.
Relevance ranking and record sorting are closely related features but there are some distinct differences.
Relevance ranking determines which results are more relevant to the user, based on a set of rules you define. For example, you can configure a rule that says "for multi-term searches, rank records that match more of the terms higher than those that match fewer terms." Relevance ranking is configured either as part of a search interface, where each search interface has its own relevance ranking strategy, or is specified in the record search query itself.
Unlike relevance ranking, which is limited to keyword search queries, record sorting can be used with any type of query that returns records. Record sorting is based on a sort key. The sort key can either be defined as a default, or identified by the Web application as part of the query.
Generally, if you have relevance ranking enabled, you would not specify a record sort key within a record search query because record sort keys take priority over all other types of ordering, making the relevance ranking settings useless.
Note
A search interface is a named collection of properties and dimensions, each of which has its Enable Record Search option checked. Search interfaces allow your end-users to search on multiple properties and/or dimensions simultaneously. The search interface's name is used just like a normal property or dimension when performing record searches. A record search query on a search interface returns results that match any of the properties or dimensions in the interface.
Specify an explicit sort key in the MDEX Engine query, set a default sort order, or use relevance ranking (for records returned in response to record search queries).
There are three ways of controlling the order in which records are returned:
Specifying an explicit sort key in the MDEX Engine query, either via a URL parameter (Ns) or an ENEQuery setter method (setNavActiveSortKeys())
Setting a default sort order, which applies when the query does not specify a sort key
Using relevance ranking, for records returned in response to record search queries only. Relevance ranking settings can either be implicitly defined as part of the search interface, or explicitly defined in the search query itself.
The priority of record sorting/relevance ranking is as follows:
If none of these three sorting methods is specified, records are returned in an arbitrary, but consistent, order as determined by an internal ID generated by Dgidx during indexing.
If the MDEX Engine query includes an explicit sort key parameter, that sort key overrides all other sorting and relevance ranking settings.
If a default sort key is specified, and no other sort parameters are set, records are returned in default sort order. Ties are broken using the arbitrary internal order described above.
When searching against a search interface that incorporates a relevance ranking strategy, the relevance ranking strategy takes priority but ties are broken using the default sort key, if one has been specified. If there is no default sort key, ties are broken using the internal ID order described above.
If the MDEX Engine query includes a relevance ranking parameter, that setting overrides any relevance ranking strategies configured in the search interface that is being searched against.
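The priority rules can be summarized in a Python sketch. The order_records function, the id field standing in for the internal ID, and the relevance scores are all hypothetical; breaking ties under an explicit sort key by internal ID is an assumption of this sketch, not documented behavior.

```python
def order_records(records, explicit_key=None, relevance=None, default_key=None):
    """Apply the ordering priority described above (illustrative only)."""
    # Arbitrary but consistent base order, modeled here as an internal ID.
    ordered = sorted(records, key=lambda r: r["id"])
    if explicit_key is not None:
        # An explicit sort key overrides every other setting.
        return sorted(ordered, key=lambda r: r[explicit_key])
    if default_key is not None:
        ordered = sorted(ordered, key=lambda r: r[default_key])
    if relevance is not None:
        # Relevance ranking takes priority; the stable sort keeps the
        # default (or internal ID) order among records with equal scores.
        ordered = sorted(ordered, key=lambda r: -relevance[r["id"]])
    return ordered

records = [
    {"id": 3, "Name": "C", "Price": 10},
    {"id": 1, "Name": "A", "Price": 20},
    {"id": 2, "Name": "B", "Price": 10},
]
scores = {1: 5, 2: 5, 3: 9}
print([r["id"] for r in order_records(records)])                       # [1, 2, 3]
print([r["id"] for r in order_records(records, default_key="Price")])  # [2, 3, 1]
print([r["id"] for r in order_records(records, relevance=scores, default_key="Price")])  # [3, 2, 1]
print([r["id"] for r in order_records(records, explicit_key="Name", relevance=scores)])  # [1, 2, 3]
```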
Note
Fully implementing this feature requires additional work outside of Developer Studio. Please refer to the Endeca Basic Development Guide for details.
To fix issues with record sorts, check for property type, number of values assigned to each record, and uniqueness of property and dimension names.
If the records returned with a navigation request do not seem to respect the sort key parameter, here are some things to check:
Was the Endeca property specified as numeric when it is actually alphanumeric, or vice versa? In this case, the MDEX Engine returns a valid response, but the sorting may not be what you expected. Also, if you are sorting by a dimension, then the sort is always alphabetic.
In general, properties and dimensions that are enabled for sorting should only have one value assigned per record. If a record has multiple property values or dimension values for a single Endeca property or dimension, the MDEX Engine sorts the records based on the first value associated with the key. If the application is displaying any value other than the first one, then the records may not appear to be sorted correctly.
If an application has properties and dimensions with the same name and a sort is requested by that name, the MDEX Engine will arbitrarily pick either the property or dimension for sorting. In general, all properties and dimensions should have a unique name.
Record search finds all records in an Endeca application that are tagged with a dimension value or Endeca property that matches a term the user provides. In order for a dimension to be considered during record searches, you must enable it for record search.
To enable record search for a dimension:
Fully implementing search features requires additional work outside of Developer Studio. Please refer to the Endeca Basic Development Guide for details.
You use the Dimension Search Configuration editor to configure search options for all dimensions in your project.
The Dimension Search Configuration editor does not specify the same options as the Search tab of the Dimension editor. The Search tab of the Dimension editor affects record searches.
The Dimension Search Configuration editor specifies the following information:
Option | Description |
---|---|
Enable positional indexing | When checked, indicates that Dgidx builds an index that describes word position with respect to other words in a dimension. A word position index is highly recommended when using the relevance ranking Phrase module and also recommended to improve the speed of phrase search. |
Enable wildcard search | When checked, indicates that a user query can contain a wildcard character (*) to match against fragments of words in a dimension value. |
Return highest ancestor dimension | When checked, the results of a dimension search return only the highest ancestor dimension value. This means that if both red zinfandel and red wine match a search query for 'red' and this checkbox is enabled, only the red wine dimension value is returned. When unchecked, both dimension values are returned. |
Include inert dimension values | When checked, indicates that certain inert (non-navigable) dimension values, such as dimension roots, are also returned as the result of a dimension search query. |
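The Return highest ancestor dimension option can be pictured with the red wine example from the table. The hierarchy below (red zinfandel as a child of red wine) and the function names are assumptions made for this Python sketch.

```python
# Hypothetical hierarchy: child -> parent (None marks a top-level value).
PARENT = {"red wine": None, "red zinfandel": "red wine", "white wine": None}

def ancestors(value):
    """Yield each ancestor of a dimension value, nearest first."""
    parent = PARENT[value]
    while parent is not None:
        yield parent
        parent = PARENT[parent]

def dimension_search(term, highest_ancestor_only):
    matches = [v for v in PARENT if term in v.split()]
    if not highest_ancestor_only:
        return matches
    matched = set(matches)
    # Drop any match that has an ancestor which also matched.
    return [v for v in matches if not any(a in matched for a in ancestors(v))]

print(dimension_search("red", highest_ancestor_only=True))   # ['red wine']
print(dimension_search("red", highest_ancestor_only=False))  # ['red wine', 'red zinfandel']
```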
For both record and dimension search, you can expand the search so that a dimension value's search context includes not only its own searchable synonyms but also any searchable synonyms of its ancestors. This type of expanded search is called "hierarchical search."
Record search and dimension search each have their own option to enable this feature.
Consider an example where Record A is tagged with Merlot from the Wine Type dimension below, and the end-user has searched for "red." Without hierarchical search enabled, Record A would not be returned as part of the search results. With hierarchical search enabled, it would.
[Figure: Wine Type dimension hierarchy]
Normally, dimension search considers only the text in individual dimension value synonyms when searching for dimension values that match a user's search term(s). With hierarchical search enabled, the MDEX Engine considers ancestor dimension values as well.
To enable hierarchical search during dimension searches:
Fully implementing search features requires additional work outside of Developer Studio. Please refer to the Endeca Basic Development Guide for details.
Normally, record search only returns a record if the specific dimension value tagged to the record matches the search term(s). When hierarchical search is enabled, the MDEX Engine considers ancestor dimension values as well.
To enable hierarchical search during record searches:
Fully implementing search features requires additional work outside of Developer Studio. Please refer to the Endeca Basic Development Guide for details.
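The difference hierarchical search makes can be sketched in Python using the Record A example. The hierarchy Wine Type > Red > Merlot is an assumption, as are the function names; real matching also involves synonyms and other search settings that this sketch ignores.

```python
# Assumed hierarchy for the example: child -> parent.
PARENT = {"Wine Type": None, "Red": "Wine Type", "Merlot": "Red"}

def search_context(value, hierarchical):
    """Return the texts searched for a dimension value."""
    texts = [value.lower()]
    if hierarchical:
        parent = PARENT[value]
        while parent is not None:
            texts.append(parent.lower())
            parent = PARENT[parent]
    return texts

def record_matches(tagged_value, term, hierarchical):
    return any(term.lower() in text.split()
               for text in search_context(tagged_value, hierarchical))

# Record A is tagged with Merlot; the end-user searches for "red".
print(record_matches("Merlot", "red", hierarchical=False))  # False
print(record_matches("Merlot", "red", hierarchical=True))   # True
```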
A range filter allows an Endeca-enabled Web application to select a subset of the total dataset for display, based on an arbitrary, dynamic range that uses an Endeca property or dimension as the filter key.
Navigation queries that use a range filter return only those records that are included in the selected data subset, along with the refinement dimension values that are appropriate for the filtered records. Range filters are supported for:
For values of properties and dimensions of type Floating point, you can specify values using both decimal (0.00...68), and scientific notation (6.8e-10).
It is important to remember that range filters are simply modifiers for a navigation query. The range filter acts in the same manner as a dimension value, even though it is not a specific system-defined dimension value. Consider the following records and examples:
Record ID | Sample Dimension Value (Wine_Type) | Sample Property (Price) | Sample Property (Description) |
---|---|---|---|
1 | Red (Dim Value 101) | 10 | Dark ruby in color, with extremely ripe… |
2 | Red (Dim Value 101) | 12 | Dense, rich and complex describes this '96 California… |
3 | White (Dim Value 102) | 19 | Dense and vegetal, with celery, pear and spice flavors… |
4 | Other (Dim Value 103) | 20 | Big, ripe and generous, layered with honey… |
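As an illustration of how a range filter modifies a navigation query, the following Python sketch applies a price range to the sample records above. The navigate function is hypothetical, and treating both bounds as inclusive is an assumption of the sketch.

```python
RECORDS = [
    {"id": 1, "Wine_Type": "Red", "Price": 10},
    {"id": 2, "Wine_Type": "Red", "Price": 12},
    {"id": 3, "Wine_Type": "White", "Price": 19},
    {"id": 4, "Wine_Type": "Other", "Price": 20},
]

def navigate(records, wine_type=None, price_range=None):
    """Return matching record IDs and the refinements valid for them."""
    result = records
    if wine_type is not None:
        result = [r for r in result if r["Wine_Type"] == wine_type]
    if price_range is not None:
        low, high = price_range  # inclusive bounds assumed
        result = [r for r in result if low <= r["Price"] <= high]
    refinements = sorted({r["Wine_Type"] for r in result})
    return [r["id"] for r in result], refinements

print(navigate(RECORDS, price_range=(11, 19)))
# ([2, 3], ['Red', 'White'])
print(navigate(RECORDS, wine_type="Red", price_range=(11, 19)))
# ([2], ['Red'])
```

The range filter narrows the record set in the same way a selected dimension value does, and the refinements returned reflect only the filtered records.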
Make sure your dimension is of Numeric type to enable it for range filtering.
In order to use a dimension as the key for a range filter, you must make sure your dimension is of a compatible type. Range filters are supported for dimensions of Numeric type with Integer or Floating point values.
You may also use an Endeca property as the key for a range filter.
To enable a dimension for range filtering:
In the Dimensions view, double-click the dimension you want to change to open it in the Dimension editor.
In the Refinements Sort Order list, select a type that is compatible with range filters: Integer or Floating point.
For values of dimensions of type Floating point, you can specify values using both decimal (0.00...68), and scientific notation (6.8e-10). Be careful of dollar signs or other characters in dimension values that would prevent a dimension from being defined as numeric.
Fully implementing this feature requires additional work outside of Developer Studio. Please refer to the Endeca Basic Development Guide for details.
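Before configuring a dimension as numeric, you might check its source values with a quick script. The is_numeric helper below is a hypothetical pre-check written in Python; it uses Python's own number parsing, which may accept or reject some inputs differently than Endeca does.

```python
def is_numeric(value):
    """Return True if the text parses as a floating point number."""
    try:
        float(value)
        return True
    except ValueError:
        return False

print(is_numeric("19.99"))    # True  (decimal notation)
print(is_numeric("6.8e-10"))  # True  (scientific notation)
print(is_numeric("$19.99"))   # False (dollar sign prevents numeric parsing)
```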
Wildcard searching allows user queries that contain a wildcard character (*) to match against fragments of words in a dimension and its child dimension values.
You can either enable each dimension that you want available for wildcard searching or you can enable all dimensions using the Dimension Search Configuration editor.
To enable wildcard searching for a dimension:
Fully implementing this feature requires additional work outside of Developer Studio. Please refer to the Endeca Basic Development Guide for details.
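The idea of matching word fragments can be approximated in Python with the standard fnmatch module. This is only an analogy: wildcard_search is a hypothetical function, case-insensitive matching is an assumption, and fnmatch also interprets ? and [ ] patterns, which the text above does not mention.

```python
from fnmatch import fnmatchcase

def wildcard_search(pattern, dimension_values):
    """Return values in which at least one word matches the pattern."""
    pattern = pattern.lower()
    return [value for value in dimension_values
            if any(fnmatchcase(word, pattern) for word in value.lower().split())]

values = ["Red Zinfandel", "Red Wine", "White Wine", "Merlot"]
print(wildcard_search("zin*", values))  # ['Red Zinfandel']
print(wildcard_search("*lot", values))  # ['Merlot']
```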
Dimensions are always ranked manually: you place them in ranked order in the Dimensions view, and the MDEX Engine returns them in that order.
Note
The use of manually ranked dimensions does not affect MDEX Engine performance. However, indexing time is slightly increased by heavy use of this feature.
To rank your dimensions:
Dimension statistics count the number of records, for a given result set, that are tagged with each of a dimension's dimension values.
Giving the end-user an indication of the number of records that will be returned for each refinement enhances an Endeca application's navigation experience by providing more context to the end-user at each point during the navigation. In the example illustration, the query for differential crawl has seven results. Within the Documents dimension, two of the results are tagged with Developer Studio Help, one is tagged with Features Guide, one is tagged with Release Notes, and three are tagged with XML Reference.
Dimension statistics are returned as a property on each dimension value. See the Endeca Basic Development Guide for details on accessing and displaying dimension statistics properties.
You enable refinement statistics on a per dimension basis. The Endeca MDEX Engine dynamically computes dimension statistics at run-time.
To enable statistics for a dimension:
Fully implementing this feature requires additional work outside of Developer Studio. For details, refer to "Displaying refinement statistics" in the Endeca Basic Development Guide.
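Conceptually, dimension statistics are a per-value count over the current result set. The Python sketch below reproduces the counts from the differential crawl example; the list of tags is a hypothetical stand-in for the seven results, and the sketch says nothing about how the MDEX Engine computes the counts.

```python
from collections import Counter

# One entry per result: the Documents dimension value it is tagged with.
result_set = [
    "Developer Studio Help", "Developer Studio Help",
    "Features Guide",
    "Release Notes",
    "XML Reference", "XML Reference", "XML Reference",
]

stats = Counter(result_set)
for value, count in sorted(stats.items()):
    print(f"{value} ({count})")
```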
Aggregated records let you treat a group of separate records as a single record when they share the same value for a rollup key.
An aggregated record is a collection of individual Endeca records that have been rolled up based on a rollup key (an Endeca property or dimension name). All records in the current record set that have the same value for the rollup key are collected together into an aggregated record. For example, rolling up on a Name key causes all wines in the current record set that have the value 'My Red Wine' for the Name key to be rolled up into one aggregated record.
Commonly, aggregated records are used to eliminate duplicate display entries. For example, in a music store catalog, an album by the same title may exist in several formats, with multiple prices. Each title is represented in the MDEX Engine as a distinct record. However, from a business perspective, it might be useful to treat these separate records as a single record by creating an aggregated record.
Record aggregation affects the current record set only. In other words, if you have 10,000 Endeca records total but only 3,000 are displayed in the current record set, then the aggregation affects those 3,000 records only.
The aggregated records feature requires that each record have at most one value from the dimension or Endeca property that has been specified as the rollup key. Also, if an Endeca record has a unique value for the rollup key, it is 'rolled up' into an aggregated record that contains only one sub-record.
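The rollup itself amounts to grouping the current record set by the rollup key. The following Python sketch uses a hypothetical roll_up function and made-up album records to show both cases: several records sharing a key value, and a record with a unique key value.

```python
def roll_up(records, rollup_key):
    """Group records that share the same value for the rollup key."""
    aggregated = {}
    for record in records:
        aggregated.setdefault(record[rollup_key], []).append(record)
    return aggregated

albums = [
    {"Title": "Album X", "Format": "CD", "Price": 12},
    {"Title": "Album X", "Format": "Vinyl", "Price": 25},
    {"Title": "Album Y", "Format": "CD", "Price": 10},
]

rolled = roll_up(albums, "Title")
for title, sub_records in rolled.items():
    print(title, len(sub_records))
# Album X 2
# Album Y 1
```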
Note
Fully implementing this feature requires additional work outside of Developer Studio. Please refer to the Endeca Basic Development Guide for details.
By default, the Endeca MDEX Engine only permits a single dimension value from any given dimension to be added to the navigation state. For some applications, however, it may be useful to allow the user to select more than one dimension value from a dimension.
After a user selects a leaf refinement from any dimension, that dimension is removed from the list of dimensions available for refinement in the query results.
For example, it might be useful to give a user the ability to show wines that have a flavor of Apple AND Apricot, or movies that star Cary Grant OR Clark Gable. This is accomplished by tagging the dimension as multi-select AND or multi-select OR, respectively.
Enable a dimension for multi-select AND or OR from the Advanced tab of the editor for that dimension.
To enable multi-select AND or OR:
In the Dimensions view, double-click the dimension you want to change to open it in the Dimension editor.
If you are enabling multi-select OR for a flat dimension, no more configuration is required. If you are enabling multi-select OR for a hierarchical dimension, you must configure all non-leaf dimension values as Inert. See "Multi-select OR and hierarchical dimensions" for details.
In a hierarchical dimension, you must configure all non-leaf dimension values to be inert, or non-navigable, to prevent them from appearing in the navigation query.
Multi-select OR queries are restricted to leaf dimension values. In a flat dimension, all possible refinements are leaf dimension values, so no extra configuration is necessary.
Collapsible dimension values are non-root, non-leaf dimension values in a hierarchy that the MDEX Engine can remove from the navigation controls.
A collapsible hierarchy is an ordinary hierarchy, in which some or all of the internal (non-root and non-leaf) dimension values are flagged as potentially collapsible. The MDEX Engine automatically removes, or collapses, these dimension values when there are only a few leaves available for refinement, creating a more streamlined, user-friendly navigation experience for your users.
For example, a dimension containing many state names could have a collapsible hierarchy introduced to group the names alphabetically. At query time, the available refinements are determined by the dimension values tagged to the records in the current navigation state. If there are many refinement values, it is easier for a user to select first a letter range, then a letter, and then the state name they want. But if there are only a few values, it is easier for the user to look at a brief list and select the state name directly. In this case, the letter-based dimension values can be collapsed, or removed, so that only the list of state names is displayed.
Dimension values that are configured as collapsible have the potential to be collapsed. Whether or not a dimension value is actually collapsed is controlled by the collapsible dimension threshold.
Note
This feature requires that you make all dimension values collapsible, and set a collapsible dimension threshold.
Making a dimension value collapsible means that it is a candidate for collapsing.
See "About collapsible dimensions" for details.
To make a dimension value collapsible:
The collapsible dimension threshold determines how many refinement values must exist for a set of query results in order for dimension value collapsing to occur.
If the number of refinement values is less than or equal to the threshold, dimension values that have been configured as potentially collapsible are collapsed (that is, removed from the hierarchy). This creates a shallower hierarchy with fewer conceptual levels of refinement.
If the number of refinement values is greater than the threshold, dimension values that have been configured as potentially collapsible are not collapsed. They remain in the hierarchy, creating a deeper hierarchy with more conceptual levels of refinement.
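The threshold comparison can be sketched as follows. This is a minimal Python illustration that assumes the count being compared is the number of leaf refinements still available; the refinements_to_display function, the hierarchy layout, and the state names are hypothetical.

```python
def refinements_to_display(hierarchy, available, threshold):
    """hierarchy maps collapsible internal values to their leaf values."""
    grouped = {}
    for internal_value, leaves in hierarchy.items():
        remaining = [leaf for leaf in leaves if leaf in available]
        if remaining:
            grouped[internal_value] = remaining
    leaf_count = sum(len(leaves) for leaves in grouped.values())
    if leaf_count <= threshold:
        # At or below the threshold: collapse the internal values
        # and show the leaves directly.
        return [leaf for leaves in grouped.values() for leaf in leaves]
    # Above the threshold: keep the deeper hierarchy.
    return grouped


hierarchy = {
    "A - D": ["Alabama", "Alaska", "Colorado"],
    "M - P": ["Maine", "Nevada", "Ohio"],
}

few = refinements_to_display(hierarchy, {"Alaska", "Ohio"}, threshold=3)
many = refinements_to_display(
    hierarchy,
    {"Alabama", "Alaska", "Colorado", "Maine", "Nevada", "Ohio"},
    threshold=3,
)
print(few)
print(many)
```

With two leaves available and a threshold of 3, the letter ranges are collapsed and only the state names are returned; with six leaves, the letter ranges remain.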
To set a collapsible dimension threshold:
This describes a typical scenario for setting dimensions and dimension values.
A typical scenario involves combining three features in one dimension:
Create a sift dimension with a set of discrete ranges such as A - D, E - H, and so on. Auto sifting is an extension to autogeneration which positions newly generated dimension values within an existing hierarchy, such that Applebrandy Wine appears under A - D, Four Grapes Merlot appears in E - H, and so on.
Configure the dimension values in the sift dimension as non-navigable so that they can appear in the user interface to assist the end user but do not have any effect on the navigation state.
Configure the dimension values in the sift dimension so that they are collapsible. A collapsible hierarchy is an ordinary hierarchy, in which some or all of the internal (non-root and non-leaf) dimension values are flagged as potentially collapsible. The MDEX Engine automatically removes, or collapses, these dimension values when there are only a few leaves available for refinement, creating a more streamlined, user-friendly navigation experience for your users.
Auto sifting is an extension to autogeneration which positions newly generated dimension values within an existing hierarchy.
The hierarchy must be a sift hierarchy, which is a normal hierarchy that contains dimension values of type Sift. As with autogeneration, the dimension mapping in the property mapper must be configured for auto generation.
A sift dimension value must be configured as type Sift and specify a range. In the illustration below, all of the dimension values except the root and Quintana-Roo are sift dimension values. The first layer of dimension values after the root specifies ranges that span a set of letters, A to D, E to H, and so on. The set of child dimension values below Q-T are also ranges, although they aren't as obvious. The Q dimension value includes all strings that are greater than or equal to Q but less than R. Q, therefore, includes strings such as Q, QWERTY, QUEEN, QQQ, and QZZZ, but not PEZ or RA.
Note
Although the range for a sift dimension value is specified the same way as for a range dimension value, sift and range are separate, and mutually exclusive, features.
When a new dimension value is auto-generated, it sifts down the hierarchy according to the ranges it matches. For example, in the illustration above, the auto-generated Quintana-Roo dimension value would first sift into Q-T, and then into Q. The sifting of auto-generated dimension values is handled according to these rules:
Dimension values that do not match any sift range in the hierarchy are added as children of the dimension root.
Dimension values that match the ranges of multiple children of a sift dimension value are placed under the first child they match.
Dimension values that do not match any child of a sift dimension value are placed under the deepest value they match.
It is a good idea to configure the last child under each sift dimension value as a catch-all, with a range of NEG_INF to POS_INF, inclusive. That way, auto-generated dimension values that do not match any other range can be gathered under an Other dimension value.
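The sifting rules above can be sketched in Python as a walk down the hierarchy. The SiftValue class and sift function are hypothetical, each range is assumed to have an inclusive lower bound and an exclusive upper bound, and strings are compared by code point rather than by the locale-dependent order that applies in a real project.

```python
from dataclasses import dataclass, field


@dataclass
class SiftValue:
    name: str
    lower: str = ""
    upper: str = ""
    children: list = field(default_factory=list)   # child sift dimension values
    generated: list = field(default_factory=list)  # auto-generated values placed here

    def matches(self, value):
        # Assumption: lower bound inclusive, upper bound exclusive.
        return self.lower <= value < self.upper


def sift(root, value):
    """Place an auto-generated value in the hierarchy and return its parent."""
    node = root
    while True:
        # The first matching child wins when several ranges match.
        child = next((c for c in node.children if c.matches(value)), None)
        if child is None:
            # No child matches: stay at the deepest match (or the root).
            node.generated.append(value)
            return node
        node = child


root = SiftValue("Root", children=[
    SiftValue("A-D", "A", "E"),
    SiftValue("Q-T", "Q", "U", children=[
        SiftValue("Q", "Q", "R"),
        SiftValue("R", "R", "S"),
    ]),
])

print(sift(root, "Quintana-Roo").name)
print(sift(root, "Tabasco").name)
print(sift(root, "Yucatan").name)
```

In this partial hierarchy, Quintana-Roo sifts into Q-T and then Q, Tabasco stops at Q-T because no child range matches it, and Yucatan matches no range and is attached to the root.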
Sift dimension values do not participate in matching. Even though the range Q-T appears within the dimension above, the property value Quintana-Roo will not match it. Following the match failure, a new Quintana-Roo dimension value is automatically generated and sifted through the hierarchy. Any further properties with the value Quintana-Roo will match this auto-generated dimension value.
Sift hierarchies are frequently made collapsible. A collapsible hierarchy is a form of conditional hierarchy in which some or all of the internal (non-root and non-leaf) dimension values are flagged as potentially collapsible. The MDEX Engine automatically removes, or collapses, these dimension values when there are only a few leaves available for refinement, creating a more streamlined, user-friendly navigation experience for your users. See "About collapsible dimensions" for details.
In addition, sift dimension values are typically ranked, to ensure that the ranges appear in order during navigation.
Note: All dimension values in a sift dimension must be specified as type Sift or there may be unexpected behavior.
At the end of each data processing run, the dimension server saves the state of an auto sift dimension, just like any other auto-generated dimension. At the start of the next run, the sift hierarchy may be updated; therefore, all of the previously auto-generated dimension values are re-sifted into the new sift hierarchy. This ensures that they are positioned correctly within the hierarchy, even when the hierarchy has been changed since it was initially generated.
Note
For a common sift dimension use case scenario, see "A typical scenario".
Defining a sift dimension is a two-step process. First, you create the sift dimension hierarchy, then you specify that the dimension should use auto-generation during its mapping process.
The procedure below shows you how to create a sift dimension for a hierarchy.
To create a sift dimension:
Add dimension values to the dimension and define them as Sift by doing the following:
From the Bound Type list, choose one of the following: String, Floating point, or Integer.
In the Lower Bound text box, enter the lower bound of the range.
Note
If you want the range to include the value you enter, check Include value in range.
In the Upper Bound text box, enter the upper bound of the range. If you want the range to include the value you enter, check Include Value in Range. This illustration shows how to set the range for the Q - T dimension value.
Note
By specifying R as the upper bound for the Q dimension value, but not including it in the range, we ensure that all strings that begin with Q are included in the range, such as Queen, Quick, QWERTY, and so on.
This illustration shows how to set the range for the Q dimension value.
Continue to populate the dimension with sift dimension values as needed. This illustration shows what the Dimension Values editor looks like for the example established in this procedure.
Note
Although the range for a Sift dimension value is specified the same way as for a Range dimension value, Sift and Range are separate, and mutually exclusive, features.
As mentioned in step 2 a in the procedure above, all dimension values in a sift dimension must be specified as type Sift or there may be unexpected behavior.
If a dimension value cannot be precisely represented as an IEEE double, the dimension value might get assigned to an incorrect range.
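A small Python example shows how this kind of misassignment can arise. It only demonstrates IEEE double rounding in general; the bound and the value are made up, and how the Endeca software stores a particular bound type is not covered here.

```python
exact_value = 2**53 + 1          # 9007199254740993, not representable as a double
as_double = float(exact_value)   # rounds to the nearest double, which is 2**53
lower_bound = 2**53              # exclusive lower bound of a hypothetical range

# Compared exactly, the value lies above the bound and belongs in the range.
print(exact_value > lower_bound)

# Compared as doubles, the value equals the bound and falls outside the range.
print(as_double > float(lower_bound))
```

The first comparison is true and the second is false, so the same value lands in different ranges depending on whether it was converted to a double first.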
After creating a sift dimension hierarchy, you can enable Auto Generate as the match mode for the sift dimension.
If you have not already done so, create a sift dimension hierarchy.
Specify NEG_INF or POS_INF as the lower or upper bound for a dimension value to indicate a value less than or greater than all other values in the range.
You can use two special values, NEG_INF and POS_INF, when creating the bounds for your dimension values. NEG_INF indicates less than all other values while POS_INF indicates greater than all other values. For example, to specify a range of greater than 100, you would use a lower bound of 100 and an upper bound of POS_INF.
Likewise, less than 100 would use a lower bound of NEG_INF and an upper bound of 100.
You can use POS_INF and NEG_INF with string values as well. For example, setting a lower bound of S, inclusive, and an upper bound of POS_INF, inclusive, would match all strings starting with S and going to the end of the alphabet, including values such as S, Style, Trigger, and Zzzzz.
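The bound semantics described above can be sketched in Python. The in_range function and the NEG_INF and POS_INF sentinel objects are stand-ins written for this example, and strings are compared by code point, which may differ from the ordering in your locale.

```python
NEG_INF = object()  # stand-in for "less than all other values"
POS_INF = object()  # stand-in for "greater than all other values"


def in_range(value, lower, upper, include_lower=True, include_upper=True):
    """Return True if value falls between lower and upper."""
    if lower is not NEG_INF:
        if value < lower or (value == lower and not include_lower):
            return False
    if upper is not POS_INF:
        if value > upper or (value == upper and not include_upper):
            return False
    return True


# Greater than 100: lower bound 100 (not inclusive), upper bound POS_INF.
print(in_range(150, 100, POS_INF, include_lower=False))
print(in_range(100, 100, POS_INF, include_lower=False))

# Less than 100: lower bound NEG_INF, upper bound 100 (not inclusive).
print(in_range(99, NEG_INF, 100, include_upper=False))

# Strings from S to the end of the alphabet.
print(in_range("Trigger", "S", POS_INF))
print(in_range("Rose", "S", POS_INF))
```

Only the second and last calls return False: 100 is excluded from "greater than 100", and Rose sorts before S.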
The order of symbols depends on your locale setting, which is external to the Endeca software. On UNIX, it is determined by a set of environment variables, typically LOCALE, LANG, or LC_ALL. On Windows, there are separate system and user locales which can be set from the Regional and Language Options control panel. For example, in ASCII, using [NEG_INF, A) as the bounds includes all numerics and many symbols (the '[' symbol indicates the value is inclusive while ')' indicates it is not). Using (Z, POS_INF] includes the rest of the symbols, as well as lower-case letters. This is not the case for other encodings, such as Unicode, which intersperses symbols and numbers with letters much more than ASCII. To use NEG_INF and POS_INF effectively, you must have a good understanding of the order of symbols in your locale's encoding.
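To see the ASCII ordering described above, the following Python sketch compares a few sample characters against A and Z. Python compares strings by code point, which coincides with ASCII for these characters; the sample list is arbitrary, and the ordering in your own locale may differ.

```python
samples = ["#", "0", "9", "A", "Z", "_", "a", "z"]

# Characters that fall in [NEG_INF, A): numerals and some symbols.
below_a = [s for s in samples if s < "A"]

# Characters that fall in (Z, POS_INF]: other symbols and lower-case letters.
above_z = [s for s in samples if s > "Z"]

print(below_a)
print(above_z)
```

Here # and the numerals sort below A, while the underscore and the lower-case letters sort above Z.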
When working with range and sift dimension values, it can be useful to have an Other dimension value to capture any source property values that don't fall within the defined ranges.
To catch source property values outside of your defined ranges, you must use a Perl manipulator, placed after the property mapper, to look at each record that has been mapped, and determine if it has any dimension values assigned to it from the range/sift dimension. If not, the Perl manipulator should assign the Other dimension value to the record.
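The check that such a manipulator performs can be outlined as follows. This sketch is written in Python purely to show the logic; an actual implementation would be a Perl manipulator in the pipeline, and the record layout, the assign_other function, and the Price Range dimension name are all hypothetical.

```python
def assign_other(record, dimension_name, other_value="Other"):
    """Tag a mapped record with the Other value if the dimension is empty."""
    dimension_values = record.setdefault("dimension_values", {})
    if not dimension_values.get(dimension_name):
        # No value from the range/sift dimension was assigned during mapping.
        dimension_values[dimension_name] = [other_value]
    return record


mapped_records = [
    {"id": 1, "dimension_values": {"Price Range": ["10 - 20"]}},
    {"id": 2, "dimension_values": {}},
]

for record in mapped_records:
    assign_other(record, "Price Range")
    print(record["id"], record["dimension_values"]["Price Range"])
```

Records that already carry a value from the dimension are left unchanged; only unmatched records receive the Other value.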
Prior to the MDEX Engine version 6.1.0, each project had to have a single primary dimension which functioned as the root for all other secondary dimensions in the project. You needed to set up precedence rules between the primary dimension and each secondary dimension to ensure that the secondary dimensions would appear in the navigation controls.
If a primary dimension was not explicitly defined in Developer Studio, Dgidx created one for you. Dgidx also created the required precedence rules between the primary dimension and the other dimensions in your project, which were, by default, considered secondary. For most projects, this was sufficient and developers did not have to worry about creating a primary dimension.
Partial updates, however, required an explicit primary dimension that had explicit precedence rules to all secondary dimensions. Starting with the MDEX Engine version 6.1.0, you no longer need to specify a primary dimension; if one is specified, the MDEX Engine ignores it. Partial updates also no longer require that a primary dimension be specified explicitly.
The procedure below describes how to set a primary dimension. See "About precedence rules" and "Creating, modifying, and deleting precedence rules" for information on creating precedence rules.
Note
Explicitly defined precedence rules override any internally generated rules.
To set the primary dimension:
Be sure that source data reflects dimension values, match modes are correctly set, and that source properties are mapped to a dimension or an Endeca property.
If a dimension value does not appear as expected in the client browser, double-check the following potential issues:
Check that the value exists in the source data. If the dimension value 2010 is added to the Year dimension, but there are no records with that year, then the MDEX Engine will never present 2010 as an option in the client browser.
Matches to dimension values of type Exact must be, as the name implies, exact, and they are case-sensitive. For example, the property values $20 and Twenty do not match the dimension values 20 or twenty.
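The exact-match check can be illustrated with a short Python sketch. The exact_match function is hypothetical and only mirrors the case-sensitive, character-for-character comparison described above.

```python
def exact_match(property_value, dimension_values):
    """Return the matching dimension value, or None if there is no exact match."""
    return property_value if property_value in dimension_values else None


dimension_values = {"20", "twenty"}

for property_value in ["$20", "Twenty", "20", "twenty"]:
    print(repr(property_value), "->", exact_match(property_value, dimension_values))
```

Only the last two property values match, because $20 contains an extra character and Twenty differs in case.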
A source property must be mapped to either a dimension or an Endeca property in order to appear in your Endeca-enabled Web application. If a source property that you want to map to a dimension is not appearing in your Web application, make sure the following is true:
Your source property is mapped correctly to a dimension in your pipeline's property mapper. See "Establishing a dimension mapping" for details.
The dimension is either configured for auto-generation, or the source property values on the records match the dimension values that you explicitly defined for the dimension in the Dimension editor.
The Show with Record List and Show with Record options are set correctly in the Dimension editor.
Note
By default, Forge removes source properties that have not been mapped to an Endeca property or dimension. See "Removing source properties after mapping" for details.