Top FAQs to Index Content and Data

This topic answers the top FAQs about indexing data models and catalog content.

What can I index?

You can choose to index:

  • Data models - Subject areas, dimension names and values, and measure names and values. You must be an administrator to modify the data model indexing preferences.
  • Catalog content - Projects, analyses, dashboards, and reports. You must be an administrator to modify the catalog indexing preferences.
  • File-based data sets - You can index a file-based data set so that the specified users can build visualizations with the data set's data. Or you can certify a file-based data set so that the specified users can search for its data from the home page. Any user can index or certify a file-based data set.

What is a certified data set?

Any user can upload a spreadsheet to create a data set, and uploaded spreadsheets can be of varying quality. When a user certifies a shared data set, the user confirms that the data set contains good, reliable data that other users can search for from the home page. When you and users who've been granted access to data sets search from the home page, the data in a certified data set is ranked high in the search results.

How often should I schedule a crawl?

The index updates automatically as users add or modify catalog content. By default, the catalog and data model crawls run once per day. In some cases, you might want to change this default: after importing a BAR file, if automatic indexing didn't run, or if your data is updated less frequently (for example, monthly).

Are there considerations when indexing subject areas with large tables?

You can index a table of any size, but large tables take longer to index. For large subject areas that have many tables or large tables, consider indexing only the columns that your users need to search for.

Because index files are compact, it is rare to exceed the storage space that Analytics Cloud reserves for indexing.

How are search results ordered?

Search results are listed in this order:

  1. Data model (semantic layer)
  2. Certified data sets
  3. Personal data sets
  4. Catalog items (projects, analyses, dashboards, and reports)
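
The ordering above can be pictured as a sort on the source type of each result. The following Python sketch is illustrative only: the category labels, the SOURCE_PRIORITY mapping, and the order_results function are hypothetical names, and the sketch doesn't represent how Analytics Cloud implements ranking internally.

```python
# Hypothetical priority per source type, mirroring the documented order.
SOURCE_PRIORITY = {
    "data_model": 0,
    "certified_data_set": 1,
    "personal_data_set": 2,
    "catalog_item": 3,
}


def order_results(results):
    """Return results sorted by source priority; ties keep their input order."""
    return sorted(results, key=lambda r: SOURCE_PRIORITY[r["source"]])


# Made-up search hits, deliberately out of order.
hits = [
    {"name": "Sales Dashboard", "source": "catalog_item"},
    {"name": "my_upload.xlsx", "source": "personal_data_set"},
    {"name": "Revenue", "source": "data_model"},
    {"name": "Regional Targets", "source": "certified_data_set"},
]

ordered = order_results(hits)
for hit in ordered:
    print(hit["source"], "-", hit["name"])
```

Because Python's sorted function is stable, results within the same source type keep whatever relative order they arrived in.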

Should I use Don't Index to secure my catalog items?

No. Oracle doesn't recommend setting the Crawl Status field to Don't Index as a way of hiding a catalog item from users. Users won't see the item in search results or on the home page, but they can still access the item. Instead, use permissions to apply the proper security to the item.
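
The difference between hiding an item from search and securing it can be modeled with a small Python sketch. Everything here is hypothetical (the CatalogItem class, its fields, and the search and can_open functions), and the sketch assumes that search also honors permissions. It isn't the product's API. It only illustrates that crawl status affects discovery, while permissions decide access.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogItem:
    """Hypothetical stand-in for a catalog item."""
    name: str
    crawl_status: str = "Index"  # "Index" or "Don't Index"
    allowed_users: set = field(default_factory=set)


def search(items, user, term):
    """List indexed items that match the term (assumes search honors permissions)."""
    return [
        item.name
        for item in items
        if item.crawl_status == "Index"
        and user in item.allowed_users
        and term.lower() in item.name.lower()
    ]


def can_open(item, user):
    """Direct access depends on permissions alone, not on crawl status."""
    return user in item.allowed_users


hidden = CatalogItem("Payroll Report", crawl_status="Don't Index", allowed_users={"ana"})
secured = CatalogItem("Payroll Report (secured)", allowed_users={"hr_admin"})

print(search([hidden, secured], "ana", "payroll"))  # neither item is listed for ana
print(can_open(hidden, "ana"))   # hidden from search, yet still accessible
print(can_open(secured, "ana"))  # blocked by permissions
```

In this model, only the permission check on the second item actually prevents access.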

How do I build an index most effectively?

For best results, index only the subject areas, dimensions, and catalog items that users need to find, and certify only the data sets that they need to find. Indexing all items yields too many search results. Oracle recommends that you deselect all data model and catalog items and then select only the items that users need. You can then add items to the index as needed.

Why are there many select distinct queries on the database during indexing?

This is most likely because the data model's indexing option is set to Index. When you set this option to Index, the metadata and values are indexed, which means that during indexing the select distinct queries are run to fetch the data values for all of the columns in all of the subject areas that are configured for indexing.
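
To show why value indexing produces this query pattern, the following Python sketch runs one select distinct query per column against an in-memory SQLite database. The table, the column names, and the distinct_values function are hypothetical, and the SQL that Analytics Cloud actually generates depends on your data source and isn't shown here.

```python
import sqlite3

# Hypothetical table standing in for one indexed subject area.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (brand TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("Acme", "Audio"), ("Acme", "Video"), ("Zenith", "Audio")],
)


def distinct_values(conn, table, columns):
    """Run one SELECT DISTINCT per column and collect the values found."""
    values = {}
    for column in columns:
        # Identifiers are fixed in this sketch; never build SQL from untrusted input.
        query = f'SELECT DISTINCT "{column}" FROM "{table}" ORDER BY 1'
        values[column] = [row[0] for row in conn.execute(query)]
    return values


indexed = distinct_values(conn, "products", ["brand", "category"])
print(indexed)
```

The number of queries grows with the number of columns configured for indexing, which is why limiting the indexed columns reduces the load on the database.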

If this system overhead isn't acceptable, or if users don't need the additional functionality to visualize data values from the search bar on the home page, then go to the Console, click Search Index, and set the indexing option to Index Metadata Only. The Index Metadata Only option indexes dimension and measure names only, and doesn't run select distinct queries.