C Creating a Backup of the Prometheus Time Series Database (TSDB) Using the Snapshot Utility
This section describes how to create a backup of the Prometheus Time Series Database (TSDB) using the snapshot utility.
Capturing TSDB Snapshots
Perform the following steps to capture the TSDB snapshots:
- Enable the web.enable-admin-api flag provided in ocoso_csar_25_2_100_0_0_0_prom_custom_values.yaml:

    extraFlags:
      - web.enable-lifecycle
      ## The web.enable-admin-api flag controls access to the administrative HTTP API,
      ## which includes functionality such as deleting time series. It is disabled by default.
      - web.enable-admin-api

- Install OSO using Installing OSO Using CSAR.
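For reference, the administrative snapshot endpoint that this flag enables returns a small JSON document naming the snapshot directory. The snapshot name can be pulled out of that response with standard tools; the following is a minimal sketch, using the example response shown later in this section:

```shell
# The JSON below is the example response returned by the
# /api/v1/admin/tsdb/snapshot endpoint elsewhere in this section.
response='{"status":"success","data":{"name":"20250820T165101Z-0eb533ff8d43ed64"}}'

# Extract the "name" field with sed (assumes the single-line response shape above).
name=$(printf '%s' "$response" | sed -n 's/.*"name":"\([^"]*\)".*/\1/p')
echo "$name"
```

The extracted name matches the timestamped directory created under the Prometheus data directory's snapshots/ folder.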
- Use the ephemeral (Debug) container to capture the snapshot and export it out of Prometheus. After the export completes successfully, the ephemeral container exits and stops.
Note:
- Take a backup of the current Prometheus data. You can wait a couple of days for the Prometheus database to fill up with the required data, or until OSO's retention period is over (the default period is 7 days). Then, perform the following steps to take the backup.
- Use the occne.io/occne/oso_snapshot:25.2.100 image for the ephemeral container. This image is created to handle the creation and removal of TSDB snapshots. You must load and push this image into the customer's central or system registry.
- Capture snapshots by connecting the Debug container to the Prometheus server.

  When OSO is configured with CLUSTER_NAME_PREFIX, run the following command:

    $ kubectl -n <oso-namespace> debug <oso-prom-pod-name> -it --image=<oso_snapshot_image_url> --target=<name of Prometheus server container> --env OSO_PROMETHEUS_SERVICE_NAME=<oso-prometheus-service-name> --env CLUSTER_NAME_PREFIX=<cluster-name-prefix> &

  When OSO is not configured with CLUSTER_NAME_PREFIX, provide only the OSO_PROMETHEUS_SERVICE_NAME and run the following command:

    $ kubectl -n <oso-namespace> debug <oso-prom-pod-name> -it --image=<oso_snapshot_image_url> --target=<name of Prometheus server container> --env OSO_PROMETHEUS_SERVICE_NAME=<oso-prometheus-service-name> &

  For example:

    $ kubectl -n oso debug oso-p-prom-svr-55f8d47c74-4vwfx -it --image=occne-repo-host:5000/occne.io/occne/oso_snapshot:25_2_100 --target=prom-svr --env OSO_PROMETHEUS_SERVICE_NAME=oso-p-prom-svr --env CLUSTER_NAME_PREFIX=occne3-n2 &

  Sample output:

    Targeting container "prom-svr". If you don't see processes from this container it may be because the container runtime doesn't support this feature.
    --profile=legacy is deprecated and will be removed in the future. It is recommended to explicitly specify a profile, for example "--profile=general".
    Defaulting debug container name to debugger-x25bq.
    If you don't see a command prompt, try pressing enter.
    [1]+  Stopped                 kubectl -n oso debug oso-p-prom-svr-55f8d47c74-4vwfx -it --image=occne-repo-host:5000/occne.io/occne/oso_snapshot:25_2_100 --target=prom-svr --env OSO_PROMETHEUS_SERVICE_NAME=oso-p-prom-svr --env CLUSTER_NAME_PREFIX=occne3-n2

  Note:
There is an ampersand (&) character at the end of the command. It runs the process in the background and keeps the terminal available for other processes. In the sample output above, the line "Defaulting debug container name to debugger-x25bq." indicates that the ephemeral container name of the active container is debugger-x25bq. This name is used to copy the snapshot out of the pod and to remove it later.

- While the snapshot process is running in the background, run the following command to view the logs. The output should contain a "200 OK" code.

    kubectl -n <oso-namespace> logs <oso-prom-pod-name> -c <ephemeral-container-name>

  For example:

    $ kubectl -n oso logs oso-p-prom-svr-55f8d47c74-4vwfx -c debugger-x25bq

  Sample output:
    total 36K
    drwxr-sr-x. 2 nobody nobody 4.0K Aug 19 17:12 chunks_head
    -rw-r--r--. 1 nobody nobody    0 Aug 19 17:12 lock
    drwxrws---. 2 root   nobody  16K Aug 19 17:12 lost+found
    -rw-r--r--. 1 nobody nobody  20K Aug 19 17:12 queries.active
    -rw-r--r--. 1 nobody nobody 7.2K Aug 20 16:53 snapshots.log
    drwxr-sr-x. 2 nobody nobody 4.0K Aug 19 17:12 wal
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
      0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
    * Trying 10.233.11.145:80...
    * Connected to oso-p-prom-svr (10.233.11.145) port 80 (#0)
    > POST /occne3-n2/prometheus/api/v1/admin/tsdb/snapshot HTTP/1.1
    > Host: oso-p-prom-svr
    > User-Agent: curl/7.76.1
    > Accept: */*
    >
    * Mark bundle as not supporting multiuse
    < HTTP/1.1 200 OK
    < Content-Type: application/json
    < Date: Wed, 20 Aug 2025 16:53:24 GMT
    < Content-Length: 72
    <
    { [72 bytes data]
    100    72  100    72    0     0   1440      0 --:--:-- --:--:-- --:--:--  1440
    * Connection #0 to host oso-p-prom-svr left intact

- Verify that the curl command that issues the snapshot creation logic inside the debug container succeeded by looking for "< HTTP/1.1 200 OK" in the above sample output. Then export the tarball (.tgz artifact) out of the debug container using the following command:

    $ kubectl cp <oso-namespace>/<oso-prom-svr-pod-name>:/proc/1/root/data/snapshots.tgz -c <debug-container-name> /tmp/<snapshot-folder-name>/snapshots.tgz

  Figure C-1 Exporting tgz artifact from Debug container
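The 200 OK check can also be scripted. The sketch below assumes the debug container's logs were first saved to a local file (snapshot.log is a placeholder name); a one-line stand-in log is written here so the example is self-contained:

```shell
# Stand-in for:
#   kubectl -n <oso-namespace> logs <oso-prom-pod-name> -c <ephemeral-container-name> > snapshot.log
printf '< HTTP/1.1 200 OK\n' > snapshot.log

# The snapshot request succeeded if the administrative API answered 200 OK.
if grep -q 'HTTP/1.1 200 OK' snapshot.log; then
  echo 'snapshot request succeeded'
else
  echo 'snapshot request failed or still running'
fi
```

In practice, rerun the kubectl logs command and this check until the 200 OK line appears or the debug container reports an error.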

- Once the snapshots tar is available in the local system, bring the snapshot process back to the foreground by running the following command:

    $ fg $(jobs | awk -F '[][]' '/oso_snapshot/{print $2}')

  Sample output:

    $ fg $(jobs | awk -F '[][]' '/oso_snapshot/{print $2}')
    kubectl -n oso debug oso-p-prom-svr-55f8d47c74-4vwfx -it --image=occne-repo-host:5000/occne.io/occne/oso_snapshot:25_2_100 --target=prom-svr --env OSO_PROMETHEUS_SERVICE_NAME=oso-p-prom-svr --env CLUSTER_NAME_PREFIX=occne3-n2

  The above step leaves the terminal in a waiting state. Press Enter in the terminal to finalize the process and to see the full log of the snapshot creation process.
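For reference, the fg command in this step works by parsing the shell's jobs table: awk splits each jobs line on the bracket characters, so field 2 is the job number of the line mentioning oso_snapshot. A self-contained sketch of that parsing step (using a sample jobs line, since the real one exists only in your interactive shell):

```shell
# Sample jobs line, as printed when the backgrounded kubectl debug command stops.
jobs_line='[1]+  Stopped                 kubectl -n oso debug oso-p-prom-svr-55f8d47c74-4vwfx -it --image=occne-repo-host:5000/occne.io/occne/oso_snapshot:25_2_100 --target=prom-svr'

# -F '[][]' splits on "[" and "]", so $2 is the job number; the /oso_snapshot/
# pattern selects only the snapshot job.
jobnum=$(printf '%s\n' "$jobs_line" | awk -F '[][]' '/oso_snapshot/{print $2}')
echo "$jobnum"
```

If several jobs are running, the /oso_snapshot/ filter ensures fg receives only the snapshot job's number.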
    ------------------------------- Snapshot procedure started ----------------------------------
    STEP 1: CHECKING FOR EXISTING SNAPSHOT ARCHIVES AND CLEANING UP IF NECESSARY
    Listing current contents of the directory for snapshot archives:
    total 32K
    drwxrws---. 2 root   nobody  16K Aug 19 17:12 lost+found
    -rw-r--r--. 1 nobody nobody  20K Aug 19 17:12 queries.active
    drwxr-sr-x. 2 nobody nobody 4.0K Aug 19 17:12 wal
    -rw-r--r--. 1 nobody nobody    0 Aug 19 17:12 lock
    drwxr-sr-x. 2 nobody nobody 4.0K Aug 19 17:12 chunks_head
    -rw-r--r--. 1 nobody nobody  239 Aug 20 16:51 snapshots.log
    No previous snapshots.tgz found. All clear!
    No previous snapshots directory found. All clear!
    STEP 2: CAPTURING SNAPSHOT OF CURRENT PROMETHEUS DB USING ADMINISTRATIVE API
    Wed Aug 20 16:51:01 UTC 2025
    Sending POST request to Prometheus API to create a snapshot with Cluster Name Prefix...
    Executing: curl -vvv -XPOST "http://oso-p-prom-svr/occne3-n2/prometheus/api/v1/admin/tsdb/snapshot"
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
      0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
    * Trying 10.233.11.145:80...
    * Connected to oso-p-prom-svr (10.233.11.145) port 80 (#0)
    > POST /occne3-n2/prometheus/api/v1/admin/tsdb/snapshot HTTP/1.1
    > Host: oso-p-prom-svr
    > User-Agent: curl/7.76.1
    > Accept: */*
    >
    * Mark bundle as not supporting multiuse
    < HTTP/1.1 200 OK
    < Content-Type: application/json
    < Date: Wed, 20 Aug 2025 16:51:01 GMT
    < Content-Length: 72
    <
    { [72 bytes data]
    100    72  100    72    0     0    911      0 --:--:-- --:--:-- --:--:--   911
    * Connection #0 to host oso-p-prom-svr left intact
    {"status":"success","data":{"name":"20250820T165101Z-0eb533ff8d43ed64"}}
    ✅ Snapshot is successfully created!
    Directory contents of Prometheus DB AFTER successful Snapshot:
    total 36K
    drwxrws---. 2 root   nobody  16K Aug 19 17:12 lost+found
    -rw-r--r--. 1 nobody nobody  20K Aug 19 17:12 queries.active
    drwxr-sr-x. 2 nobody nobody 4.0K Aug 19 17:12 wal
    -rw-r--r--. 1 nobody nobody    0 Aug 19 17:12 lock
    drwxr-sr-x. 2 nobody nobody 4.0K Aug 19 17:12 chunks_head
    drwxr-sr-x. 3 nobody nobody 4.0K Aug 20 16:51 snapshots
    -rw-r--r--. 1 nobody nobody 2.0K Aug 20 16:51 snapshots.log
    STEP 3: CREATING EXPORTABLE ARCHIVE OF THE SNAPSHOT DATA
    ✅ Snapshots is successfully archived!
    total 40K
    drwxrws---. 2 root   nobody  16K Aug 19 17:12 lost+found
    -rw-r--r--. 1 nobody nobody  20K Aug 19 17:12 queries.active
    drwxr-sr-x. 2 nobody nobody 4.0K Aug 19 17:12 wal
    -rw-r--r--. 1 nobody nobody    0 Aug 19 17:12 lock
    drwxr-sr-x. 2 nobody nobody 4.0K Aug 19 17:12 chunks_head
    drwxr-sr-x. 3 nobody nobody 4.0K Aug 20 16:51 snapshots
    -rw-r--r--. 1 nobody nobody  162 Aug 20 16:51 snapshots.tgz
    -rw-r--r--. 1 nobody nobody 2.5K Aug 20 16:51 snapshots.log
    Non-interactive shell detected. Skipping interactive prompt...
    ----------------------------------- Snapshot creation procedure completed ---------------------------------------
    (The three-step procedure log repeats for each subsequent snapshot run; later runs differ only in timestamps and are truncated here.)
    Session ended, the ephemeral container will not be restarted but may be reattached using 'kubectl attach oso-p-prom-svr-55f8d47c74-4vwfx -c debugger-x25bq -i -t' if it is still running

- (Mandatory) Clean up the snapshot archive. Run the following
command to remove the snapshots created.
WARNING:
Failing to perform this step will leave a snapshot hanging in your system and may fill up your system's storage.

    kubectl -n <oso-namespace> debug <oso-prom-pod-name> -it --image=<oso_snapshot_image_url> --target=prom-svr --env REMOVE=yes

- Verify the output of the above command to confirm that the snapshots
were removed.
For example:
    $ kubectl -n oso debug oso-p-prom-svr-55f8d47c74-4vwfx -it --image=occne-repo-host:5000/occne.io/occne/oso_snapshot:25_2_100 --target=prom-svr --env REMOVE=yes

  Sample output:
    Targeting container "prom-svr". If you don't see processes from this container it may be because the container runtime doesn't support this feature.
    --profile=legacy is deprecated and will be removed in the future. It is recommended to explicitly specify a profile, for example "--profile=general".
    Defaulting debug container name to debugger-fc5q6.
    ---------------------------------- REMOVE flag is set to 'yes'. Cleaning up snapshot archives ------------------------------------
    Prometheus DB Directory contents BEFORE cleanup:
    total 48K
    drwxr-sr-x. 2 nobody nobody 4.0K Aug 19 17:12 chunks_head
    -rw-r--r--. 1 nobody nobody    0 Aug 19 17:12 lock
    drwxrws---. 2 root   nobody  16K Aug 19 17:12 lost+found
    -rw-r--r--. 1 nobody nobody  20K Aug 19 17:12 queries.active
    drwxr-sr-x. 3 nobody nobody 4.0K Aug 20 16:53 snapshots
    -rw-r--r--. 1 nobody nobody 9.5K Aug 20 16:54 snapshots.log
    -rw-r--r--. 1 nobody nobody  163 Aug 20 16:53 snapshots.tgz
    drwxr-sr-x. 2 nobody nobody 4.0K Aug 19 17:12 wal
    ✅ Snapshots archives removed successfully!
    Prometheus DB Directory contents AFTER cleanup:
    total 28K
    drwxr-sr-x. 2 nobody nobody 4.0K Aug 19 17:12 chunks_head
    -rw-r--r--. 1 nobody nobody    0 Aug 19 17:12 lock
    drwxrws---. 2 root   nobody  16K Aug 19 17:12 lost+found
    -rw-r--r--. 1 nobody nobody  20K Aug 19 17:12 queries.active
    drwxr-sr-x. 2 nobody nobody 4.0K Aug 19 17:12 wal
    ---------------------------------- Snapshot cleanup process completed -------------------------------------

- (Mandatory) Validate the contents of the snapshot by
running the following command on the snapshot
file:
    $ tar tvf snapshots.tgz

  For example:
    $ tar tvf snapshots.tgz
    drwxr-sr-x nobody/nobody     0 2025-08-25 21:11 snapshots/
    drwxr-sr-x nobody/nobody     0 2025-08-25 21:11 snapshots/20250825T211118Z-0aa128d1c1f37e71/
    drwxr-sr-x nobody/nobody     0 2025-08-25 21:11 snapshots/20250107T084350Z-2a422bf3ea6cee2c/01JGZYYEXF1DR08XFAHX99Z1EW/
    drwxr-sr-x nobody/nobody     0 2025-08-25 21:11 snapshots/20250107T084350Z-2a422bf3ea6cee2c/01JGZYYEXF1DR08XFAHX99Z1EW/chunks/
    -rw-r--r-- nobody/nobody 14004 2025-08-25 21:11 snapshots/20250107T084350Z-2a422bf3ea6cee2c/01JGZYYEXF1DR08XFAHX99Z1EW/chunks/000001
    -rw-r--r-- nobody/nobody     9 2025-08-25 21:11 snapshots/20250107T084350Z-2a422bf3ea6cee2c/01JGZYYEXF1DR08XFAHX99Z1EW/tombstones
    -rw-r--r-- nobody/nobody 69072 2025-08-25 21:11 snapshots/20250107T084350Z-2a422bf3ea6cee2c/01JGZYYEXF1DR08XFAHX99Z1EW/index
    -rw-r--r-- nobody/nobody   273 2025-08-25 21:11 snapshots/20250107T084350Z-2a422bf3ea6cee2c/01JGZYYEXF1DR08XFAHX99Z1EW/meta.json

  Note:
If the snapshot in your system does not have the structure shown above, delete the snapshot and try the process again.
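The structure check in the note above can be automated. A minimal sketch: feed the tar tvf listing to a small helper (check_listing is a hypothetical name) that requires at least one meta.json under a timestamped snapshot directory; any other layout means the snapshot should be deleted and recreated:

```shell
# Hypothetical helper: validate a `tar tvf snapshots.tgz` listing.
check_listing() {
  # A valid snapshot contains meta.json under snapshots/<UTC timestamp>Z-<hash>/...
  if printf '%s\n' "$1" | grep -Eq 'snapshots/[0-9]{8}T[0-9]{6}Z-[0-9a-f]+/.*meta\.json'; then
    echo 'structure OK'
  else
    echo 'unexpected structure: delete the snapshot and retry'
  fi
}

# Sample entry taken from the listing above.
check_listing 'snapshots/20250107T084350Z-2a422bf3ea6cee2c/01JGZYYEXF1DR08XFAHX99Z1EW/meta.json'
```

In practice, pass the full listing, for example check_listing "$(tar tvf /tmp/<snapshot-folder-name>/snapshots.tgz)".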