62 Data Management

You can use the Data Management pages to manage the storage space for Insights data that has been generated by the skills on your Oracle Digital Assistant instance.

For instances provisioned with the Development shape, you're allotted 40GB of storage. For instances provisioned with the Production shape, you're allotted 100GB of storage. Insights reporting stops when this storage has been depleted. To keep Insights reporting running, use the Data Management pages to monitor storage availability and to free up space, either by exporting data to an archive file before purging it or by simply purging it. You can perform archive and purge tasks manually, or schedule these tasks based on the retention period that’s designated by your organization’s document retention policy.

To access Data Management, click Settings in the left navbar, then click Data Management. Here’s how you use the Data Management pages:
  • Monitor – Check for data usage threshold alerts and monitor capacity.
  • Manage – Manually purge and archive data, check the status of archive and purge tasks that have been triggered either manually or automatically, and view the history of archive and purge tasks.
  • Auto Purge Preference – Implement your organization’s document retention policy by automating the archive and purge tasks related to the document retention period.

Monitor Insights Data Storage Capacity

Use the Data Monitoring dashboard to view storage consumption on both a daily and a monthly basis for a given time period.

View Storage Indicators

Use the tiles on the Data Monitoring page to:
  • Track the alerts generated during the selected period – The Monitor dashboard displays warning- and critical-level alerts that correspond to the 60% and 80% data usage thresholds that are set in the page’s Cumulative Percentage graph. In addition to enumerating these alerts, this tile also tallies the info-level alerts, which confirm that data has been removed from the database after the successful completion of purge tasks, both manual and automated.

    The report starts generating alerts when the storage in your allocated space reaches the 60% warning level. Clicking the Alerts tile displays a history of alerts for the period.
    [Illustration: alert_history.png]

  • Monitor the total amount of allocated storage space – For the selected period, you can find out how much of it’s been used, how much storage remains, and the average amount of storage that the skills in your instance consume.
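
The relationship between the storage allotment and the alert thresholds can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not part of the product: the function names are hypothetical, and treating a threshold as reached when usage is equal to or greater than it is an assumption.

```python
# Storage allotments per instance shape, in GB (from the text above).
ALLOTMENT_GB = {"Development": 40, "Production": 100}

# Alert thresholds, as percentages of the allotted storage.
WARNING_PCT = 60
CRITICAL_PCT = 80


def usage_percent(used_gb, shape):
    """Return the share of the allotment that is in use, as a percentage."""
    return 100.0 * used_gb / ALLOTMENT_GB[shape]


def alert_level(used_gb, shape):
    """Return 'critical', 'warning', or None for the given usage.

    Assumption: a threshold counts as reached when usage is equal to
    or greater than it.
    """
    pct = usage_percent(used_gb, shape)
    if pct >= CRITICAL_PCT:
        return "critical"
    if pct >= WARNING_PCT:
        return "warning"
    return None
```

For example, 24GB used on a Development instance is 60% of the 40GB allotment, so `alert_level(24, "Development")` returns `"warning"`.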

View Storage Capacity

The Data Management report provides two views of storage consumption:
  • Cumulative Percentage – This graph gives you the percentage of storage that’s used up for each day during a selected time period. It plots data up to, but not including, the last day of the selected date range. For example, if January 7th is the last day of the selected date range, then the graph won’t include the capacity used up for that day. The line concludes with the usage for January 6th instead.
    The line continues to rise until an Export and Purge task succeeds. (The task itself might have been signaled by the trend line crossing the graph’s 60% or 80% thresholds.) After the data has been purged, the drop in the graph indicates how much capacity has been freed up.
  • Data Volume – This bar chart measures the actual amount of available storage (as opposed to a percent) by plotting the data consumption by day, week, or month (depending on the selected time span).
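
To make the date handling of the Cumulative Percentage graph concrete, here is a minimal sketch of how such a series could be computed. It is an illustration only: the function, its parameters, and the input format (a mapping of dates to GB added that day) are hypothetical and do not reflect how the report is implemented.

```python
from datetime import date, timedelta


def cumulative_percentage(daily_added_gb, start, end, allotment_gb):
    """Return [(day, percent_used)] for each day from start up to,
    but not including, end.

    daily_added_gb is a hypothetical mapping of date -> GB stored that day.
    """
    points = []
    used = 0.0
    day = start
    while day < end:  # the last day of the range is excluded
        used += daily_added_gb.get(day, 0.0)
        points.append((day, 100.0 * used / allotment_gb))
        day += timedelta(days=1)
    return points
```

With a range of January 1st through January 7th, the returned series ends on January 6th, which matches the behavior described for the graph.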

Manage Data Capacity with Archive and Purge Tasks

The Archive and Archive & Purge options on the Management page allow skill developers and administrators to maintain Insights data. While both these tasks export the conversations logged by Insights into a CSV file, they have different uses. Archive and purge tasks free up data capacity while archive tasks do not.

Skill developers typically create archive tasks to review customer input for potential additions to the training data. An archive task is part of the ongoing effort to improve skill quality, not to manage storage.

An archive and purge task, on the other hand, does free up storage. System administrators, not skill developers, usually create these tasks. They are either performed manually in response to a capacity alert, or they are triggered automatically based on the schedule set forth in a document retention policy. When archive and purge tasks complete, they generate a ZIP file, which itself contains a ZIP file for each skill that generated Insights data within the date range specified for the task. If only a single skill generated Insights data for the selected period, then an archive task generates a ZIP file that contains a single CSV file. Although archive and purge tasks allow you to maintain the Insights data in a CSV file, the actual data no longer exists in storage, so it can’t be recovered.
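
If you post-process downloaded archives, the nested layout described above (a ZIP file that contains one ZIP file per skill, or a single CSV file when only one skill produced data) can be walked with Python's standard zipfile module. The following is a sketch under assumptions: the function name is hypothetical, and the names of the inner ZIP and CSV files are not documented here, so the code relies only on the file extensions.

```python
import io
import zipfile


def list_skill_csvs(archive_bytes):
    """Map each inner ZIP name to the CSV file names it contains.

    CSV files found at the top level of the archive (the single-skill
    case) are listed under the empty-string key.
    """
    found = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as outer:
        for name in outer.namelist():
            lower = name.lower()
            if lower.endswith(".csv"):
                found.setdefault("", []).append(name)
            elif lower.endswith(".zip"):
                # Open the per-skill ZIP in memory and collect its CSV entries.
                with zipfile.ZipFile(io.BytesIO(outer.read(name))) as inner:
                    found[name] = [
                        n for n in inner.namelist() if n.lower().endswith(".csv")
                    ]
    return found
```

You would pass in the bytes of a downloaded archive, for example the result of `open(path, "rb").read()`.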

Free Capacity Manually with Archive and Purge Tasks

When there’s not enough free space to support ongoing Insights reporting, administrators can free up space manually by using the Archive & Purge option on the Manage page. For example, when the Monitor dashboard displays a warning-level alert, you (an administrator) would use this option to submit a task that archives and purges the Insights data that’s been logged for a specific period. If that period ends with the current date, then the data for conversations that are in progress may also get removed. After you’ve created the task, you can track it on the Manage page.

Note:

You can't preemptively purge data to maintain future storage capacity. You can only manually archive and purge data that’s been collected up to, and including, the current date.

Schedule Automated Archive and Purge Tasks

While administrators can dispose of excess data manually when critical levels of consumption threaten Insights reporting, their organization’s document retention policy may require automated archive and purge cycles that can be tracked for auditing purposes. As an administrator, you can implement your document retention policy’s requirements for retention periods and scheduled data purges by setting the properties on the Auto Purge Preference page. The Manage report maintains the record of the automated archive and purge tasks generated by your auto purge configuration.

By default, auto purge is not enabled. To set the retention period, purge schedule, data usage threshold, and other properties, you must first switch on Enable Auto Purge. After you’ve activated this option, data can be purged from storage when it has been stored for longer than the number of days specified for the retention period, when a data usage threshold has been reached, or when both conditions apply. For example, if your organization’s document retention policy sets the document retention period to 90 days and the data storage threshold to 60% capacity, then data that is older than 90 days gets purged whenever consumption rises above the 60% threshold.

The Auto Purge Preferences

To implement an auto purge policy:
  • Enable Auto Purge – Switch on to purge data from storage when it has been stored for longer than the Retention Period, when the Data Usage Threshold has been reached, or both.

    Note:

    Switching this option off deletes, rather than disables, the auto purge policy created by these settings.
  • Enable Archiving – By default, this option is enabled so that data gets archived before it’s purged from storage. If you switch this option off, then the data will just be purged after the retention period has ended or the usage threshold has been reached. Data that’s purged from storage cannot be recovered.
  • Retention Period – The number of days, according to your data retention policy, that data should be retained in storage before it can be purged. Purge and archive tasks can only be run on data that falls outside of this period. For example, if the retention period is 90 days, then only the data that has been added to storage in the last 90 days will be kept. Any data that's been stored for longer than 90 days will be purged. If you do not want to set a retention period, then enter 0. In this case, all data will be archived, purged, or both, depending on the auto purge preferences.
  • Data Usage Threshold – A number between 0 and 99 that represents the storage limit as a percentage. 60 means 60 percent, for example. Data gets purged when storage consumption exceeds this cap. If you've set a Retention Period, then older data will be purged when the volume of data exceeds the Data Usage Threshold. If you don't want to set a threshold, then enter 0. In this case, data will be purged per the Retention Period only.
  • Schedule – Specifies the day (or days) on which the auto-purge and archiving process can be run. You can set this in combination with the Retention Period according to your data retention policy.
  • Timeout – The amount of time (in seconds) that automatic archive and purge tasks can run before they time out and fail. The time it takes to complete these tasks varies depending on the amount of data within the selected date range, so large jobs may take a longer time to complete.
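
The way the Retention Period and Data Usage Threshold combine can be summarized as a small decision function. This is a sketch of one reading of the preferences above, not the service's actual logic: the function and parameter names are hypothetical, and the handling of a 0 value for each setting follows the descriptions in the list.

```python
from datetime import date


def eligible_for_auto_purge(stored_on, today, usage_pct,
                            retention_days, threshold_pct):
    """Return True if data stored on stored_on would be purged on a
    scheduled run, under the assumptions stated above.

    retention_days == 0 means no retention period (all data qualifies).
    threshold_pct == 0 means no threshold (purge per retention only).
    """
    # Data must have been stored for longer than the retention period.
    past_retention = (
        retention_days == 0 or (today - stored_on).days > retention_days
    )
    # Storage consumption must exceed the threshold, if one is set.
    over_threshold = threshold_pct == 0 or usage_pct > threshold_pct
    return past_retention and over_threshold
```

With a 90-day retention period and a 60% threshold, data stored 181 days ago is eligible when consumption is at 65%, but not when consumption is at 55%.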

Manage, Track, and Monitor Archive Tasks

In addition to creating archive and purge tasks, the Manage page lets you monitor, and search for, manually created and automated archive and purge tasks. For the automated tasks, you can use the page’s filter and sort functions to create an audit log. In addition to searching through tasks, you can use this page to:
  • Create manual archive and purge tasks.
  • Download ZIP files of completed tasks.
  • Remove archive tasks (and delete their archived data).
After a task has been submitted, it's listed in a table with the following columns.
  • Task: The type of task: Archive, Archive & Purge, Purge, or Auto Purge.
  • Name: The task name
  • Run: The timestamp marking when the task was completed
  • Created By: The name of the task creator. For auto purge tasks, the task creator name is Automation.
  • Date Range: The starting and ending date for the Insights data that has been purged and/or archived.
  • Status: Submitted, Archive Failed, Purge Failed, No Data (when there's no data to export within the date range defined for the task), Archive Succeeded, or Archive & Purge Succeeded. The Archive Succeeded and Archive & Purge Succeeded statuses are hyperlinks that let you download a ZIP file that contains separate CSVs for each skill that generated Insights data during the selected period.

    Tip:

    You can filter the table's display using various criteria, such as the name of the task and task creator, the task status or the task type. To track the auto purge tasks for auditing purposes, enter Automation in the Filter by Name or Created By field.
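
If you keep your own copy of the task table for auditing, the same filter can be reproduced offline. The sketch below assumes a hypothetical list of dictionaries keyed by the column names above, with Run values stored as ISO 8601 strings so that they sort chronologically. The product does not document such an export format, so treat the data shape as an assumption.

```python
def automated_task_log(tasks):
    """Return the tasks created by auto purge, oldest run first."""
    automated = [t for t in tasks if t["Created By"] == "Automation"]
    return sorted(automated, key=lambda t: t["Run"])


# Hypothetical sample rows that mirror the table columns.
sample_tasks = [
    {"Task": "Auto Purge", "Name": "auto-2", "Run": "2024-02-01T02:00:00",
     "Created By": "Automation", "Status": "Archive & Purge Succeeded"},
    {"Task": "Archive", "Name": "review-q1", "Run": "2024-01-15T10:30:00",
     "Created By": "jdoe", "Status": "Archive Succeeded"},
    {"Task": "Auto Purge", "Name": "auto-1", "Run": "2024-01-01T02:00:00",
     "Created By": "Automation", "Status": "No Data"},
]

audit = automated_task_log(sample_tasks)
```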