System Backup and Restore
The integrated backup service protects the Private Cloud Appliance system configuration against data loss and corruption. It stores the data required for system and service operation so that any crucial service or component can be restored to its last known healthy state. Backups of user environments in the Compute Enclave are not included.
The following file systems are backed up:
- PCA project: obj_share and MGMT_ROOT
- public_ostore_project: file systems related to each tenancy created on the rack (for COS operations)
- private_ostore_project: file system in private_ostore_project
In line with the microservice-based deployment model, the backup service orchestrates the various backup operations across the entire system and ensures data consistency and integrity, but it does not define the individual component backup requirements. That logic is part of the component backup plugins.
The backup plugin is the key element that determines which files are backed up for a given system component or service, and how the data is collected. For example, a simple file copy may work for certain files while a snapshot is required for other data, or in some cases a service may need to be stopped to allow the backup to be created. The plugin also determines the backup frequency and retention time. Each plugin registers with the backup service so that the backup service is aware of the active plugins and can schedule the required backup operations in a consistent manner as Kubernetes CronJobs. Plugins are aggregated into a backup profile; the backup profile is the list of tasks that the backup service performs when a backup job is run.
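The plugin contract itself is internal to the appliance and is not published; the following Python sketch only illustrates the division of responsibilities described above. All names in it (BackupPlugin, BackupSpec, register_plugin, MySQLBackupPlugin) are invented for this example and do not come from the actual code base.

```python
# Illustrative only: a hypothetical plugin contract mirroring the roles the
# text describes (what to back up, how to collect it, schedule, retention).
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class BackupSpec:
    name: str            # component the plugin protects
    schedule: str        # cron expression used for the Kubernetes CronJob
    retention_days: int  # how long the backup service keeps the data


class BackupPlugin(ABC):
    """One plugin per component; the backup service only sees this interface."""

    @abstractmethod
    def spec(self) -> BackupSpec:
        """Declare the backup frequency and retention for this component."""

    @abstractmethod
    def collect(self, target_dir: str) -> None:
        """Gather the component's data into target_dir.

        Implementations differ: a plain file copy, a snapshot, or stopping
        a service first, as the component requires.
        """


class MySQLBackupPlugin(BackupPlugin):
    def spec(self) -> BackupSpec:
        return BackupSpec(name="mysql-cluster", schedule="0 0 * * *", retention_days=14)

    def collect(self, target_dir: str) -> None:
        ...  # e.g. dump the cluster database into target_dir


# Registration makes the backup service aware of the plugin so it can
# schedule the work as a Kubernetes CronJob and include it in a profile.
REGISTERED_PLUGINS: list[BackupPlugin] = []


def register_plugin(plugin: BackupPlugin) -> None:
    REGISTERED_PLUGINS.append(plugin)


register_plugin(MySQLBackupPlugin())
```

A backup profile, in these terms, is simply the list of registered plugins whose tasks a backup job executes.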
The backup data collected through the plugins is then stored by the backup service in a dedicated NFS share on the internal ZFS Storage Appliance, using ZFS encryption to ensure that the data at rest is secure. If required, the backup files can optionally be replicated to an external storage location.
When restoring a service or component from a backup, the service again relies on the logic provided by the plugin. A component restore process has two major phases: verification and data management. In the verification phase, the backup is evaluated for completeness and appropriateness in relation to the current condition of the component. In the data management phase, the required actions are taken to stop or suspend a component, replace the data, and restart or resume normal operation. As with backup, the operations to restore the data are specific to the component in question.
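The two phases can be pictured schematically as follows. This is a sketch, not the appliance's actual restore interface; the method names (verify, stop, replace_data, start) are assumptions made for illustration.

```python
# Schematic sketch of the two restore phases described above.
def restore_component(plugin, backup_dir: str) -> None:
    # Phase 1: verification. Evaluate the backup for completeness and for
    # appropriateness relative to the component's current condition.
    if not plugin.verify(backup_dir):
        raise RuntimeError(f"backup in {backup_dir} failed verification")

    # Phase 2: data management. Quiesce the component, replace its data,
    # then resume normal operation. The details are plugin-specific.
    plugin.stop()
    try:
        plugin.replace_data(backup_dir)
    finally:
        plugin.start()
```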
The default backup and restore implementation performs all tasks in a global backup profile, which covers the MySQL cluster database, a snapshot of the ZFS projects on the storage appliance, and all registered component backup plugins. The default profile is processed daily at midnight UTC and has a 14-day retention policy; older backups are deleted. Backups are stored in /nfs/shared_storage/backups/backup_*. All restore operations must be performed manually on a per-component basis.
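To make the retention behavior concrete, the sketch below lists which backup directories fall outside the 14-day window. How the service itself tracks expiry is internal; checking directory modification times here is an assumption made for this illustration.

```python
# Illustration of the 14-day retention policy: list backup directories under
# the documented path and flag those older than the cutoff.
from datetime import datetime, timedelta, timezone
from pathlib import Path

RETENTION = timedelta(days=14)
BACKUP_ROOT = Path("/nfs/shared_storage/backups")

cutoff = datetime.now(timezone.utc) - RETENTION
for backup in sorted(BACKUP_ROOT.glob("backup_*")):
    mtime = datetime.fromtimestamp(backup.stat().st_mtime, tz=timezone.utc)
    status = "expired (will be purged)" if mtime < cutoff else "retained"
    print(f"{backup.name}: {status}")
```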
- Automated restore operations based on the backup plugins are not possible. If a manual restore from a backup is required, contact Oracle for assistance.
- Monitoring data from Prometheus is not included in automated backups. To preserve your Prometheus data, create a backup and restore it manually. For more information, refer to the support note with Doc ID 3021643.1.
- Automatic purging applies to all backups, whether created by the standard daily job or triggered manually, and is especially critical for backups of the MySQL database. If a MySQL backup must be kept longer than the retention period, for example because it represents an important restore point, copy the data to another location before the retention period expires (see the sketch after this list). Contact your Oracle representative for assistance.
- Certain snapshots are not governed by any retention policy. If these are no longer needed, you must delete them manually. See Converted Snapshots.
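For the MySQL note above, the following is a minimal sketch of preserving a backup beyond the retention window by copying it to another location before the purge deletes it. Both the backup directory name and the destination path are hypothetical.

```python
# Minimal sketch: copy a backup out of the backup share before it is purged.
import shutil
from pathlib import Path

backup = Path("/nfs/shared_storage/backups/backup_20240101")  # hypothetical name
destination = Path("/mnt/external/preserved") / backup.name   # hypothetical target

destination.parent.mkdir(parents=True, exist_ok=True)
shutil.copytree(backup, destination)
print(f"copied {backup} -> {destination}")
```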