Oracle® Fusion Middleware System Administrator's Guide for Oracle Business Intelligence Enterprise Edition 11g Release 1 (11.1.1) E10541-01
The Oracle BI Server supports the collection of usage tracking data. When usage tracking is enabled, the Oracle BI Server collects usage tracking data for each query and writes statistics to a usage tracking log file or inserts them directly to a database table. It is strongly recommended that you use direct insertion instead of writing to a log file.
Note: A sample usage tracking implementation is provided with the Oracle Business Intelligence installation at: ORACLE_INSTANCE\bifoundation\OracleBIServerComponent\coreapplication_obisn\sample\usagetracking
If you are upgrading from previous versions of Usage Tracking, then see the usage tracking topics in Oracle Fusion Middleware Installation Guide for Oracle Business Intelligence.
This chapter contains the following topics:
The Oracle BI Server supports the accumulation of usage tracking statistics that can be used in a variety of ways such as database optimization, aggregation strategies, or billing users or departments based on the resources that they consume. The BI Server tracks usage at the detailed query level.
When you enable usage tracking, statistics for every query are inserted into a database table or are written to a usage tracking log file. If you use direct insertion, then the BI Server directly inserts the usage tracking data into a relational database table. It is recommended that you use direct insertion to write statistics to a database table.
When the BI Server starts, it validates the column names in the metadata against the list of valid columns in the usage tracking table. The following checks occur:
Column names. If there is a mismatch between the columns in the database table and the columns in the metadata, then a database error occurs on insert.
Varchar length. If the length that is set in the metadata does not match the length that is set in the table, then an error is written to the nqserver.log file and usage tracking is disabled.
This section contains the following topics:
Section 9.1.1, "Setting Up Direct Insertion to Collect Information for Usage Tracking"
Section 9.1.2, "Setting Up a Log File to Collect Information for Usage Tracking"
Direct insertion is the recommended method for setting up usage tracking. To set up direct insertion, update the following parameters in the Usage Tracking section of the NQSConfig.INI file:
ENABLE. Set this parameter to YES to enable usage tracking.
DIRECT_INSERT. This parameter determines whether the query statistics are inserted directly into a database table or are written to a file for subsequent loading. Set this parameter to YES to enable direct insertion.
PHYSICAL_TABLE_NAME. To insert query statistic information into a table, you must provide the name of the table and the connection pool that is used to access the table (see the CONNECTION_POOL parameter).
The fully qualified physical table name consists of as many as four components (database name, catalog name, schema name, and table name). Each component is surrounded by double quotes (") and separated by a period (.). The physical table name must be fully qualified. This fully qualified physical table name must match a table name in the physical layer of the loaded repository. The following is an example of a physical table name for the Usage Tracking table in the Oracle Business Intelligence repository:
PHYSICAL_TABLE_NAME = "Oracle BI Usage"."Catalog"."dbo"."S_NQ_ACCT" ;
In this example, Oracle BI Usage represents the database component, Catalog represents the catalog component, dbo represents the schema component, and S_NQ_ACCT represents the table name.
CONNECTION_POOL. The fully specified connection pool name has two parts: database name and connection pool name. Each part is surrounded by double quotes (") and separated by a period (.). The fully qualified connection pool name should match a connection pool name in the physical layer of the loaded repository. For an example, see the following connection pool name in the Oracle Business Intelligence repository:
CONNECTION_POOL = "Oracle BI Usage"."Connection Pool" ;
In this example, Oracle BI Usage represents the database component and Connection Pool represents the actual connection pool name.
For Usage Tracking insertions to succeed, the connection pool must be configured with a user ID that has write access to the back-end database.
Note: It is recommended that you use a connectivity type that supports international data.
See Oracle Fusion Middleware Metadata Repository Builder's Guide for Oracle Business Intelligence Enterprise Edition for information about configuring connection pools.
BUFFER_SIZE. This parameter indicates how much memory the BI Server should allocate for buffering the insert statements. Such a buffer lets the BI Server submit multiple insert statements as part of a single transaction, improving Usage Tracking insert throughput. It also means that ordinary analyses do not have to wait on Usage Tracking insertions, which improves average query response time. You might want to adjust this value based on available memory and memory utilization on the server computer.
BUFFER_TIME_LIMIT_SECONDS. This parameter indicates the maximum amount of time that an insert statement remains in the buffer before the Usage Tracking subsystem attempts to issue it. This time limit ensures that the BI Server issues the insert statements quickly, even during periods of extended quiescence.
NUM_INSERT_THREADS. This parameter indicates the number of threads that remove insert statements from the buffer and issue them to the Usage Tracking database. Assuming separate connection pools for readers and inserters, the number of insert threads should typically equal the Maximum Connections setting in the connection pool.
MAX_INSERTS_PER_TRANSACTION. This parameter indicates the maximum number of insert statements that the Usage Tracking subsystem attempts to issue as part of a single transaction. The larger this number, the greater the potential throughput for Usage Tracking inserts. However, a larger number also increases the likelihood of transactions failing due to deadlocks. A small value for BUFFER_TIME_LIMIT_SECONDS can limit the number of inserts per transaction.
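Taken together, the direct-insertion parameters above might look as follows in the Usage Tracking section of NQSConfig.INI. This is an illustrative sketch only; the buffer, thread, and transaction values shown are example settings, not recommended defaults, and should be tuned for your environment as described above.

```ini
; Usage Tracking section of NQSConfig.INI (illustrative sketch)
[USAGE_TRACKING]
ENABLE = YES ;
DIRECT_INSERT = YES ;
PHYSICAL_TABLE_NAME = "Oracle BI Usage"."Catalog"."dbo"."S_NQ_ACCT" ;
CONNECTION_POOL = "Oracle BI Usage"."Connection Pool" ;
; Example tuning values -- adjust for your server's memory and database
BUFFER_SIZE = 250 KB ;
BUFFER_TIME_LIMIT_SECONDS = 5 ;
NUM_INSERT_THREADS = 5 ;
MAX_INSERTS_PER_TRANSACTION = 1 ;
```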
See Appendix A, "NQSConfig.INI File Configuration Settings" for additional information about the usage tracking configuration parameters.
Log file collection is an alternate method for configuring usage tracking. Although this feature is supported, it is recommended that you use direct insertion to collect information for usage tracking. See Section 9.1.1, "Setting Up Direct Insertion to Collect Information for Usage Tracking" for details.
The following sections explain how to configure log file collection for usage tracking. Before you configure the log file parameters, you must set the ENABLE parameter in the Usage Tracking section of the NQSConfig.INI file to YES to enable usage tracking. See Appendix A, "NQSConfig.INI File Configuration Settings" for more information about the usage tracking configuration parameters.
This section contains the following topics:
The STORAGE_DIRECTORY parameter in the Usage Tracking section of the NQSConfig.INI file determines the location of usage tracking log files. If usage tracking is enabled, but no storage folder is specified, then the files are written in the logs folder in the software installation folder (for example, in \OBI11g\logs).
Current files are periodically written to disk, and new files are created. The CHECKPOINT_INTERVAL_MINUTES parameter controls the frequency with which usage tracking data is flushed to disk, and the FILE_ROLLOVER_INTERVAL_MINUTES parameter controls the frequency with which the current usage tracking log file is closed and a new file is created.
When usage tracking is enabled, every query is logged to a usage tracking log file. This might require a large amount of available storage. For example, assume an average of 300 bytes of data output for each query and 10 queries per second over an 8 hour day. This results in approximately 83 MB of usage tracking data that is written to storage per day. If this example is extended to a 24 x 7 operation, then the result is approximately .25 GB of storage per day.
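The storage estimate above can be checked with simple arithmetic. The sketch below uses the figures from the text (roughly 300 bytes per query at 10 queries per second) to compute the daily volume for an 8-hour day and for 24 x 7 operation:

```python
# Rough storage estimate for usage tracking log files, using the
# figures quoted in the text: ~300 bytes per query, 10 queries/second.
BYTES_PER_QUERY = 300
QUERIES_PER_SECOND = 10

def daily_bytes(hours_per_day):
    """Bytes of usage tracking data written per day."""
    return BYTES_PER_QUERY * QUERIES_PER_SECOND * 3600 * hours_per_day

mb_8h = daily_bytes(8) / (1024 * 1024)   # 8-hour day, in MB
gb_24h = daily_bytes(24) / (1024 ** 3)   # 24 x 7 operation, in GB
print(f"{mb_8h:.1f} MB/day, {gb_24h:.2f} GB/day")  # ~82.4 MB, ~0.24 GB
```

These values round to the approximately 83 MB and .25 GB per day cited above.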
The BI Server has no limit on the size or quantity of usage tracking log files that can exist in the specified location. You must ensure that sufficient space is available and that you remove or archive old usage tracking files.
The file naming scheme for the usage tracking log files is NQAcct.yyyymmdd.hhmmss.log, where yyyy is the year, mm is the month, dd is the day, hh is the hour, mm is the minute, and ss is the second of the timestamp when the file was created. For example, if the server creates the usage tracking log file at 07:15:00 AM on February 12, 2010, the filename would be NQAcct.20100212.071500.log. After the specified rollover interval, this file is flushed to disk and closed and a new log file, with the current date and timestamp, is created.
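Because the file name encodes the creation timestamp, it can be parsed directly when archiving or pruning old files. The helper below is a hypothetical sketch (not part of the product) that recovers the creation time from a name of the form NQAcct.yyyymmdd.hhmmss.log:

```python
from datetime import datetime

# Hypothetical helper: recover the creation timestamp from a usage
# tracking log file name of the form NQAcct.yyyymmdd.hhmmss.log.
def log_file_timestamp(filename):
    _, date_part, time_part, _ = filename.split(".")
    return datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")

ts = log_file_timestamp("NQAcct.20100212.071500.log")
print(ts)  # 2010-02-12 07:15:00
```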
The usage tracking log files are text files, in semicolon-delimited ( ; ) format. A semicolon is used as the column delimiter because the logical SQL text contains commas. A line feed delimits the end of each row of data.
While there is no guaranteed unique key for the usage tracking data, a combination of User name, Node ID, Start timestamp, and Query text is usually sufficient. Query_Key can be used as a unique key, but Query_Key is available only with direct insertion.
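A record in this semicolon-delimited format can be split with ordinary string handling. The sketch below is illustrative only, mapping the first few fields to the column names in Table 9-1 (the field names are this sketch's own labels, not product identifiers):

```python
# Minimal sketch of splitting one usage tracking log record.  The file
# is semicolon-delimited because the logical SQL text contains commas;
# a line feed ends each record.  Only the first five columns of
# Table 9-1 are labeled here, for brevity.
FIELDS = ["user_name", "repository_name", "subject_area_name",
          "node_id", "start_timestamp"]

def parse_record(line):
    values = line.rstrip("\n").split(";")
    return dict(zip(FIELDS, values))  # zip stops at the shorter list

row = parse_record("jsmith;SampleRepo;Sales;bihost1;2010-02-12-07:15:00\n")
print(row["user_name"], row["node_id"])  # jsmith bihost1
```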
For information about sample scripts to help you extract data from usage tracking log files and load it to appropriately formatted relational database tables, see Chapter 9, "Managing Usage Tracking." See also Section 9.2, "Description of the Usage Tracking Data" for more information about the contents of each column.
Table 9-1 shows the format of the usage tracking output file.
Table 9-1 Usage Tracking Output File Format
Column Number | Column Name | Data Type | Max Data Size | Nullable |
---|---|---|---|---|
1 | User name | Varchar | 128 | No |
2 | Repository name | Varchar | 128 | No |
3 | Subject area name | Varchar | 128 | No |
4 | Node ID | Varchar | 15 | No |
5 | Start timestamp | Char (Timestamp) | 19 | No |
6 | Start date | Char (yyyy-mm-dd) | 10 | No |
7 | Start hourMin | Char (hh:mm) | 5 | No |
8 | End timestamp | Char (Timestamp) | 19 | No |
9 | End date | Char (yyyy-mm-dd) | 10 | No |
10 | End hourMin | Char (hh:mm) | 5 | No |
11 | Query Text | Varchar | 1024 | No |
12 | Success indicator | Integer | 4 | No |
13 | Row count | Integer | 4 | Yes |
14 | Total time (secs) | Integer | 4 | Yes |
15 | Compilation time (secs) | Integer | 4 | Yes |
16 | Number of database queries | Integer | 4 | Yes |
17 | Cumulative db time (secs) | Integer | 4 | Yes |
18 | Cumulative db rows | Integer | 4 | Yes |
19 | Cache indicator | Char | 1 | No |
20 | Query source | Varchar | 30 | No |
21 | Presentation Catalog path | Varchar | 250 | No |
22 | Dashboard name | Varchar | 150 | Yes |
Table 9-1 describes the schema. The following list contains more detail about Integer data types, Timestamp columns, and the Query Text column:
Integer data types. All data in the output file is in character format. The data in columns 12 through 18 are output as text representations of integer numbers. Therefore, they behave more like Varchar(10) columns than integers. For example, if the row count is one million rows, then 1000000 appears in the output file in column 13 (Row count). This constitutes seven bytes of data, even though the data represents a 4-byte internal integer value.
In column 12, a Success indicator value of 0 signifies a successful query. All nonzero values indicate failure. The following failure indicators are currently defined:
1 indicates timeout
2 indicates row limit violation
3 indicates unknown error
The subsequent integer columns are valid only if the Success indicator (column 12) signifies a successful query (value is 0).
Timestamp columns. The Start timestamp and End timestamp columns indicate the wall clock time (local time) when the logical query started and finished. Each value is 19 bytes of character data that represents a SQL-92 timestamp. The format is yyyy-mm-dd-hh:mm:ss. The related columns, Start date and End date, contain just the date component from the respective timestamps (in the yyyy-mm-dd format). Finally, the related columns, Start hourMin and End hourMin, contain just the hour and minute components from the respective timestamps (in a char hh:mm format).
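The relationship between the timestamp columns and their derived date and hourMin columns can be sketched as follows. This example assumes the yyyy-mm-dd-hh:mm:ss layout described above:

```python
from datetime import datetime

# The Start/End timestamp columns use the yyyy-mm-dd-hh:mm:ss layout;
# Start date and Start hourMin are just components of that timestamp.
start = datetime.strptime("2010-02-12-07:15:00", "%Y-%m-%d-%H:%M:%S")

start_date = start.strftime("%Y-%m-%d")    # Start date column (10 chars)
start_hour_min = start.strftime("%H:%M")   # Start hourMin column (5 chars)
print(start_date, start_hour_min)  # 2010-02-12 07:15
```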
When usage tracking is enabled, the BI Server collects usage tracking data for every query. This data, however, is written to disk only at user-specified intervals, known as checkpoints. The default setting is to have checkpoints every 5 minutes.
While you can modify this value in the NQSConfig.INI file, reducing the interval adds overhead and, if set low enough, could potentially impact server performance. Setting the value higher increases the amount of usage tracking data that could be lost in the unlikely event of an abnormal shutdown of the BI Server.
The BI Server periodically initiates usage tracking log file rollovers. A rollover consists of closing the current usage tracking log file and opening a newly created one for writing subsequent data. The frequency at which rollovers occur is called a rollover interval. The default rollover interval is 240 minutes (every 4 hours).
Usage tracking log files that are closed are available for analysis. Setting a lower rollover interval makes usage tracking log files available for analysis sooner, but at the cost of additional overhead.
If the checkpoint interval equals or exceeds the rollover interval, then only the rollover occurs explicitly; the checkpoint occurs implicitly when the old usage tracking log file is closed.
Table 9-2 describes each column in the usage tracking table. Where appropriate, the data type and length is also included.
Table 9-2 Usage Tracking Data
Column | Description |
---|---|
CACHE_IND_FLG | Default is N. Y indicates a cache hit for the query; N indicates a cache miss. |
COMPILE_TIME_SEC | The time in seconds that is required to compile the query. |
CUM_DB_TIME_SEC | The total amount of time in seconds that the BI Server waited for back-end physical databases on behalf of a logical query. |
CUM_NUM_DB_ROW | The total number of rows that are returned by the back-end databases. |
END_DT | The date the logical query was completed. |
END_HOUR_MIN | The hour and minute the logical query was completed. |
END_TS | The date and time the logical query finished. The start and end timestamps also reflect any time that the query spent waiting for resources to become available. Note: If the user submitting the query navigates away from the page before the query finishes, then the final fetch never happens, and a timeout value of 3600 is recorded. However, if the user navigates back to the page before the timeout, then the fetch completes at that time, and this is recorded as the end_ts time. |
ERROR_TEXT | Default is Null. Varchar(250). The error message from the back-end database. This column applies only if the query failed (SUCCESS_FLG is not 0). |
NODE_ID | The host name of the computer where the BI Server is running. |
NUM_CACHE_HITS | Default is Null. Number(10,0). For DB2, the data type and length is Decimal(10,0). Indicates the number of times that the cache result was returned for the query. |
NUM_CACHE_INSERTED | Default is Null. Number(10,0). For DB2, the data type and length is Decimal(10,0). Indicates the number of times that the query generated a cache entry. |
NUM_DB_QUERY | The number of queries that were submitted to back-end databases to satisfy the logical query request. For successful queries (SUCCESS_FLG = 0), this number is 1 or greater. |
PRESENTATION_NAME | Default is Null. Varchar(128). The name of the Oracle BI Presentation Catalog. |
QUERY_BLOB | Contains the entire logical SQL statement without any truncation. |
QUERY_KEY | Default is Null. Varchar(128). An MD5 hash key that is generated by Oracle Business Intelligence from the logical SQL statement. |
QUERY_SRC_CD | The source of the request (for example, Drill or Report). |
QUERY_TEXT | Varchar(1024). The SQL statement that was submitted for the query. You can change the length of this column (using the ALTER TABLE command), but note that the text that is written into this column is always truncated to the size that is defined in the physical layer. It is the responsibility of the repository administrator not to set the length of this column to a value greater than the maximum query length that is supported by the back-end physical database. For example, Oracle databases enable a maximum Varchar of 4000, but Oracle truncates to 4000 bytes, not 4000 characters. Hence, if you use a multibyte character set, the actual maximum string size has a varying number of characters, depending on the character set and characters used. |
REPOSITORY_NAME | The name of the repository that the query accesses. |
ROW_COUNT | The number of rows that are returned to the query client. Note: When a large amount of data is returned from a query, this column is not populated until the user displays all of the data. |
IMPERSONATOR_USER_NAME | Default is Null. Varchar(128). The user name of the impersonated user. If the request is not run as an impersonated user, then the value is NULL. |
SAW_DASHBOARD | The path name of the dashboard. If the query was not submitted through a dashboard, then the value is NULL. |
SAW_DASHBOARD_PG | Default is Null. Varchar(150). The page name in the dashboard. If the request is not a dashboard request, then the value is NULL. |
SAW_SRC_PATH | The path name in the Oracle BI Presentation Catalog for the analysis. |
START_DT | The date that the logical query was submitted. |
START_HOUR_MIN | The hour and minute that the logical query was submitted. |
START_TS | The date and time that the logical query was submitted. |
SUBJECT_AREA_NAME | The name of the business model that is being accessed. |
SUCCESS_FLG | The completion status of the query: 0 indicates success, 1 indicates timeout, 2 indicates a row limit violation, and 3 indicates an unknown error. |
TOTAL_TIME_SEC | The time in seconds that the BI Server spent working on the query while the client waited for responses to its analyses. This value is the same as the Response time in the nqquery.log file, as described in Section 8.3.1.1, "Setting the Query Logging Level." |
USER_NAME | The name of the user who submitted the query. |
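The QUERY_KEY column is described above as an MD5 hash of the logical SQL statement. The sketch below shows the general shape of such a key; the server's exact canonicalization of the statement (whitespace, case, encoding) is not documented here, so real QUERY_KEY values may differ from this illustration:

```python
import hashlib

# Illustrative only: an MD5 hex digest of a logical SQL string.  Real
# QUERY_KEY values depend on how the BI Server normalizes the statement
# before hashing, which is not specified in this chapter.
def query_key(logical_sql):
    return hashlib.md5(logical_sql.encode("utf-8")).hexdigest()

key = query_key('SELECT "Sales"."Revenue" FROM "Sales"')
print(key)  # a 32-character hex string
```

Because MD5 is deterministic, identical logical SQL always yields the same key, which is why it can serve as a unique key for direct-insert usage tracking rows.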