2 Installing or Upgrading Ceph Storage for Oracle Linux

This chapter discusses how to enable the repositories to install the Ceph Storage for Oracle Linux packages, how to perform an installation of those packages, and how to perform an upgrade.

Hardware and Network Requirements

Ceph Storage for Oracle Linux does not require specific hardware; however, certain Ceph operations are CPU and memory intensive. The X6 and X7 lines of Oracle x86 Servers are suitable to host Ceph nodes. For more information on Oracle x86 Servers, see:

https://www.oracle.com/servers/x86/index.html

A minimum node configuration is:

  • 4 CPU cores

  • 4GB RAM

  • 2 x 1Gb Ethernet NICs

  • 1 TB storage for object data

Your deployment needs may require nodes with a larger footprint. Additional considerations are detailed in the Ceph upstream documentation.

Operating System Requirements

Ceph Storage for Oracle Linux Release 3.0 is available for Oracle Linux 7 (x86_64) running the Unbreakable Enterprise Kernel Release 5 (UEK R5). A minimum of Oracle Linux 7.5 is required.
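
For example, you can confirm the operating system and kernel releases on each node before installation (the version strings shown are illustrative):

# cat /etc/oracle-release
Oracle Linux Server release 7.5
# uname -r
4.14.35-1818.3.3.el7uek.x86_64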

Enabling Access to the Ceph Storage for Oracle Linux Packages

The ceph-deploy package is available on the Oracle Linux yum server in the ol7_ceph30 repository, or on the Unbreakable Linux Network (ULN) in the ol7_x86_64_ceph30 channel. However, there are also dependencies across other repositories and channels, and these must also be enabled on each system included in the Ceph Storage Cluster.

If you are using the Oracle Linux yum server, you must enable the following repositories:

  • ol7_ceph30

  • ol7_addons

  • ol7_latest

  • ol7_optional_latest

  • ol7_UEKR5

If you are using ULN, you must enable the following channels:

  • ol7_x86_64_ceph30

  • ol7_x86_64_addons

  • ol7_x86_64_latest

  • ol7_x86_64_optional_latest

  • ol7_x86_64_UEKR5

Enabling Repositories with ULN

If you are registered to use ULN, use the ULN web interface to subscribe the system to the appropriate channels:

  1. Log in to https://linux.oracle.com with your ULN user name and password.

  2. On the Systems tab, click the link named for the system in the list of registered machines.

  3. On the System Details page, click Manage Subscriptions.

  4. On the System Summary page, select each required channel from the list of available channels and click the right arrow to move the channel to the list of subscribed channels.

    Subscribe the system to the ol7_x86_64_ceph30, ol7_x86_64_addons, ol7_x86_64_latest, ol7_x86_64_optional_latest and ol7_x86_64_UEKR5 channels.

  5. Click Save Subscriptions.

Enabling Repositories with the Oracle Linux Yum Server

To enable the required repositories on the Oracle Linux yum server, ensure that your system is up to date and that you have transitioned to use the modular yum repository configuration by installing the oraclelinux-release-el7 package and running the /usr/bin/ol_yum_configure.sh script.

# yum install oraclelinux-release-el7
# /usr/bin/ol_yum_configure.sh

Install the oracle-ceph-release-el7 release package to set up the appropriate yum repository configuration.

# yum install oracle-ceph-release-el7

Enable the following repositories:

  • ol7_ceph30

  • ol7_addons

  • ol7_latest

  • ol7_optional_latest

  • ol7_UEKR5

Use the yum-config-manager tool to update your yum configuration:

# yum-config-manager --enable ol7_ceph30 ol7_latest ol7_optional_latest ol7_addons ol7_UEKR5
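
You can confirm that the required repositories are enabled, for example:

# yum repolist enabled | grep ol7_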

You can now prepare the Ceph Storage Cluster nodes for installation of the Ceph Storage for Oracle Linux packages. See Installing and Configuring a Ceph Storage Cluster.

Installing and Configuring a Ceph Storage Cluster

A Ceph Storage Cluster consists of several systems, known as nodes. The nodes run various software daemons:

  • Every node runs the Ceph Object Storage Device (OSD) daemon.

  • One or more nodes run the Ceph Monitor and Ceph Manager daemons. Ceph Monitor and Ceph Manager should run on the same nodes.

  • Optionally, one or more nodes run the Ceph Object Gateway daemon.

  • Optionally, one or more nodes run the Metadata Server daemon to use Ceph File System.

A node is selected as an administration node from which Ceph commands can be run to control and monitor the cluster. Typically the administration node is also used as the deployment node, from which other systems can automatically be set up and configured as nodes in the cluster.

For data integrity, a Ceph Storage Cluster should contain two or more nodes for storing copies of an object. For high availability, a Ceph Storage Cluster should contain three or more nodes that store copies of an object.

In the example used in the following steps, the administration and deployment node is ceph-node1.example.com. The nodes used in the Ceph Storage Cluster are ceph-node2.example.com, ceph-node3.example.com, and ceph-node4.example.com.

Preparing Ceph Storage Cluster Nodes

There are some basic requirements for each Oracle Linux system that you intend to use as a Ceph Storage Cluster node. These include the following items, for which some preparatory work may be required before you can begin your deployment.

  1. Time must be accurate and synchronized across the nodes within the Ceph Storage Cluster. This is achieved by installing and configuring NTP on each system that you wish to run as a node in the cluster. If the NTP service is not already configured, install and start it; a minimal example follows the note below. See Oracle® Linux 7: Administrator's Guide for more information on configuring NTP.

    Note:

    Use the hwclock --show command on each node in the cluster to ensure that all nodes agree on the time. If the clocks on the nodes differ by more than 50 milliseconds, the ceph health command displays the warning:

    health HEALTH_WARN clock skew detected on mon
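
    For example, a minimal NTP setup on each node (this sketch assumes the ntpd service; chronyd is an alternative):

    # yum install ntp
    # systemctl enable --now ntpd
    # ntpq -p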

  2. Ceph Storage Cluster network communications must be able to take place between nodes within the cluster. If firewall software is running on any of the nodes, it must either be disabled or, preferably, configured to facilitate network traffic on the required ports.

    Preferably, leave the firewall running and configure the following rules:

    1. Allow TCP traffic on port 6789 to enable the Ceph Monitor:

      # firewall-cmd --zone=public --add-port=6789/tcp --permanent
    2. Allow TCP traffic for ports 6800 to 7300 to enable the traffic for the Ceph OSD daemon:

      # firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent
    3. Allow TCP traffic on port 7480 to enable the Ceph Object Gateway:

      # firewall-cmd --zone=public --add-port=7480/tcp --permanent
    4. Allow TCP traffic on ports 3260 and 5000 on Ceph iSCSI Gateway nodes:

      # firewall-cmd --zone=public --add-port=3260/tcp --add-port=5000/tcp --permanent
    5. Allow TCP traffic on port 2049 on any NFS server nodes:

      # firewall-cmd --zone=public --add-port=2049/tcp --permanent
    6. After modifying firewall rules, reload and restart the firewall daemon service:

      # firewall-cmd --reload
      # systemctl restart firewalld.service
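
      You can then confirm the configured ports are active, for example:

      # firewall-cmd --zone=public --list-ports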

    Alternatively, stop and disable the firewall daemon on Oracle Linux 7:

    # systemctl stop firewalld
    # systemctl disable firewalld
  3. Ceph Storage Cluster nodes must be able to resolve the fully qualified domain name for each node within the cluster. You may either use DNS for this purpose, or provide entries within /etc/hosts for each system. If you choose to rely on DNS, it must have sufficient redundancy to ensure that the cluster is able to perform name resolution at any time. If you choose to edit /etc/hosts, add entries for the IP address and host name of all of the nodes in the Ceph Storage Cluster, for example:

    192.168.1.51    ceph-node1.example.com ceph-node1
    192.168.1.52    ceph-node2.example.com ceph-node2
    192.168.1.53    ceph-node3.example.com ceph-node3
    192.168.1.54    ceph-node4.example.com ceph-node4

    Note:

    Although you can use DNS to configure host name to IP address mapping, Oracle recommends that you also configure /etc/hosts in case the DNS service becomes unavailable.
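
    You can confirm that each node resolves correctly, for example (the output reflects the example /etc/hosts entries above):

    # getent hosts ceph-node2.example.com
    192.168.1.52    ceph-node2.example.com ceph-node2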

  4. The Ceph Storage Cluster deployment node must be able to connect to each prospective node in the cluster over SSH, to facilitate deployment. To do this, you must generate an SSH key on the deployment node and copy the public key to each of the other nodes in the Ceph Storage Cluster.

    1. On the deployment node, generate the SSH key, specifying an empty passphrase:

      # ssh-keygen
    2. From the deployment node, copy the key to the other nodes in the Ceph Storage Cluster, for example:

      # ssh-copy-id root@ceph-node2
      # ssh-copy-id root@ceph-node3
      # ssh-copy-id root@ceph-node4
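
    You can then verify passwordless access from the deployment node, for example:

    # ssh root@ceph-node2 hostname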
  5. To prevent errors when running ceph-deploy as a user with passwordless sudo privileges, use visudo to comment out the Defaults requiretty setting in /etc/sudoers or change it to Defaults:ceph !requiretty.

You can now install and configure the Ceph Storage Cluster deployment node, which is usually the same system as the administration node. See Installing and Configuring a Deployment Node.

Installing and Configuring a Deployment Node

In the example used in the following steps, the deployment node is ceph-node1.example.com (192.168.1.51), which is the same as the administration node.

Perform the following steps on the deployment node:

  1. Install the ceph-deploy package.

    # yum install ceph-deploy
  2. Create a Ceph configuration directory for the Ceph Storage Cluster and change to this directory, for example:

    # mkdir /var/mydom_ceph
    # cd /var/mydom_ceph

    This is the working configuration directory used by the deployment node to roll out configuration changes to the cluster and to client and gateway nodes. If you need to make changes to the Ceph configuration files in the future, make the changes in this directory and then use the ceph-deploy config push command to update the configuration on the other nodes in the cluster.

    All ceph-deploy commands should be run from this working directory. Future commands in this guide use the notation $CEPH_CONFIG_DIR to represent this directory.

  3. Use the ceph-deploy command to define the members of the Ceph Storage Cluster, for example:

    # $CEPH_CONFIG_DIR/ceph-deploy new ceph-node2

    To define multiple nodes, provide them in a space separated list, for example:

    # $CEPH_CONFIG_DIR/ceph-deploy new ceph-node2 ceph-node3 ceph-node4
  4. (Optional) The first time you run the ceph-deploy command, a set of files is created to configure the Ceph Storage Cluster. You can edit $CEPH_CONFIG_DIR/ceph.conf to set options for Ceph features. For example, if you are setting up a test environment with only a few OSDs, you can reduce the default number of object replicas from 3 to 2:

    osd pool default size = 2
  5. (Optional) BlueStore is the default backend for new OSDs. As well as BlueStore, Oracle supports FileStore, which uses either Btrfs or XFS file systems on OSDs. Oracle recommends Btrfs or XFS file systems on OSDs for Oracle workloads. You can use a mixture of BlueStore and FileStore OSDs in a Ceph Storage Cluster.

    To use Btrfs on OSDs, add the following entry to the Ceph configuration file:

    enable experimental unrecoverable data corrupting features = btrfs
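
For reference, a $CEPH_CONFIG_DIR/ceph.conf generated by ceph-deploy new, with the optional replica setting above applied, might look similar to the following sketch (the fsid, node names, and addresses are illustrative):

[global]
fsid = <UUID generated by ceph-deploy>
mon initial members = ceph-node2, ceph-node3, ceph-node4
mon host = 192.168.1.52,192.168.1.53,192.168.1.54
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd pool default size = 2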

You can now install Ceph on the remaining Ceph Storage Cluster nodes. See Installing and Configuring Ceph Storage Cluster Nodes.

Installing and Configuring Ceph Storage Cluster Nodes

Having installed and configured the Ceph Storage for Oracle Linux deployment node, you can use this node to install Ceph Storage for Oracle Linux on the other nodes participating in the Ceph Storage Cluster.

To install Ceph Storage for Oracle Linux on all the Ceph Storage Cluster nodes, run the following command on the deployment node:

# $CEPH_CONFIG_DIR/ceph-deploy install ceph-node2

To install the software on multiple nodes, provide them in a space separated list, for example:

# $CEPH_CONFIG_DIR/ceph-deploy install ceph-node2 ceph-node3 ceph-node4

After installing the Ceph packages on each node, you may need to add the Ceph services to the firewall rules on each node, for example:

# sudo firewall-cmd --zone=public --add-service=ceph-mon --permanent
# sudo firewall-cmd --zone=public --add-service=ceph --permanent
# sudo firewall-cmd --reload
# sudo systemctl restart firewalld.service

To configure the Ceph Storage Cluster, perform the following steps on the administration node:

  1. Initialize Ceph Monitor:

    # $CEPH_CONFIG_DIR/ceph-deploy mon create-initial
  2. Deploy a Ceph Monitor on one or more nodes in the Ceph Storage Cluster, for example:

    # $CEPH_CONFIG_DIR/ceph-deploy mon create ceph-node2

    To deploy a Ceph Monitor on multiple nodes, provide them in a space separated list, for example:

    # $CEPH_CONFIG_DIR/ceph-deploy mon create ceph-node2 ceph-node3 ceph-node4

    For high availability, Oracle recommends that you configure at least three nodes as Ceph Monitors.

  3. Deploy the Ceph Manager daemon to one or more nodes in the Ceph Storage Cluster. Oracle recommends you deploy the Ceph Manager daemon to the same nodes you use as Ceph Monitors.

    # $CEPH_CONFIG_DIR/ceph-deploy mgr create ceph-node2

    To deploy Ceph Manager on multiple nodes, provide them in a space separated list, for example:

    # $CEPH_CONFIG_DIR/ceph-deploy mgr create ceph-node2 ceph-node3 ceph-node4
  4. Gather the monitor keys and the OSD and MDS bootstrap keyrings from one of the Ceph Monitors, for example:

    # $CEPH_CONFIG_DIR/ceph-deploy gatherkeys ceph-node3
  5. Create the OSD nodes to use in the Ceph Storage Cluster. For information on creating OSDs, see Creating an Object Storage Device (OSD) Node.

  6. Create an administration node to use the Ceph Command Line Interface (CLI), ceph, to manage the Ceph Storage Cluster. For information on creating an administration host, see Creating a Ceph Storage Cluster Administration Host.

  7. You can check the status and health of the cluster using the ceph command. On the administration node, enter:

    # ceph health
    # ceph status

    It usually takes several minutes for the cluster to stabilize before its health is shown as HEALTH_OK. You can also check the quorum status of the cluster monitors:

    # ceph quorum_status --format json-pretty

    Refer to the upstream Ceph documentation for help troubleshooting any issues with the health or status of your cluster.

  8. (Optional) If you want to use Ceph File System (Ceph FS), deploy a Ceph Metadata Server (MDS). You can install the MDS service on an existing Monitor node within your Ceph Storage Cluster, as the service does not have significant resource requirements. For more information on setting up Ceph FS, see Setting Up and Using Ceph FS.

Creating an Object Storage Device (OSD) Node

This section discusses creating Object Storage Device (OSD) nodes in a Ceph Storage Cluster.

BlueStore is the default backend for OSDs. As well as BlueStore, Oracle supports the FileStore option, which uses either XFS or Btrfs file systems on OSDs. Oracle recommends XFS or Btrfs file systems on OSDs for Oracle workloads.

Creating a BlueStore OSD

Before you add a BlueStore OSD node to a Ceph Storage Cluster, you should first delete all data on the specified device:

# $CEPH_CONFIG_DIR/ceph-deploy disk zap node device

Replace node with the node name or host name where the disk is located. Replace device with the path to the device on the host where the disk is located. For example, to delete the data on a device named /dev/sdb on a node named ceph-node2 in the Ceph Storage Cluster:

# $CEPH_CONFIG_DIR/ceph-deploy disk zap ceph-node2 /dev/sdb

To create a BlueStore OSD, enter:

# $CEPH_CONFIG_DIR/ceph-deploy osd create --data device node

This command creates a volume group and logical volume using the disk you specify. Data and journal reside on the same logical volume. Replace node with the node name or host name where the disk is located. Replace device with the path to the device on the host where the disk is located. For example, run:

# $CEPH_CONFIG_DIR/ceph-deploy osd create --data /dev/sdb ceph-node2
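
After the OSD is created, you can confirm from the administration node that it is up and in the cluster, for example:

# ceph osd tree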

Creating a FileStore OSD

To create a FileStore OSD, you should first create the required partitions: one for data, and one for the journal. This should be done on the OSD node. This example creates a data partition on /dev/sdb1 with a size of 40GB, and a journal partition on /dev/sdb2 with a size of 13GB:

# parted /dev/sdb --script -- mklabel gpt 
# parted --script /dev/sdb mkpart primary 0MB 40000MB 
# parted --script /dev/sdb mkpart primary 42000MB 55000MB 
# dd if=/dev/zero of=/dev/sdb1  bs=1M count=1000
# sgdisk --zap-all --clear --mbrtogpt -g -- /dev/sdb2 
# ceph-volume lvm zap /dev/sdb2

On the deployment node, create the FileStore OSD. To specify the OSD backend and file system type, use the --filestore and --fs-type parameters when creating OSDs. This example shows how to create a FileStore OSD with XFS:

# $CEPH_CONFIG_DIR/ceph-deploy osd create --filestore --fs-type xfs \
   --data /dev/sdb1 --journal /dev/sdb2 ceph-node2

To use Btrfs on OSDs, use the --fs-type btrfs option when creating OSDs, for example:

# $CEPH_CONFIG_DIR/ceph-deploy osd create --filestore --fs-type btrfs \
   --data /dev/sdb1 --journal /dev/sdb2 ceph-node2

Creating a Ceph Storage Cluster Administration Host

When you have set up a Ceph Storage Cluster, you should provide the client admin key and the Ceph configuration file to another host so that a user on the host can use the ceph command line as an administrative user. In this example, ceph-node1 is the host name of the client system configured to manage the cluster using the Ceph Command Line Interface (CLI). This node is also used as the deployment node in examples.

On the deployment node, use ceph-deploy to install the Ceph CLI, for example:

# $CEPH_CONFIG_DIR/ceph-deploy admin ceph-node1

If you are using a separate host (rather than the deployment node, as in these examples), there are a few more steps:

  1. On the deployment node, copy the SSH key to the host you want to use for Ceph administration, for example:

    # ssh-copy-id root@ceph-client

    This example assumes that you have configured entries for the Ceph client system in DNS and/or in /etc/hosts.

  2. On the deployment node, use ceph-deploy to install the Ceph Storage for Oracle Linux packages, for example:

    # $CEPH_CONFIG_DIR/ceph-deploy install ceph-client
  3. On the deployment node, copy the Ceph configuration file and Ceph keyring, and install the Ceph CLI (ceph), for example:

    # $CEPH_CONFIG_DIR/ceph-deploy admin ceph-client

You can now use the Ceph CLI (ceph) to manage the Ceph Storage Cluster.
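
For example, a quick check that the configuration file and admin keyring are in place on the administration host:

# ceph -s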

Deploying the Ceph Manager Dashboard

The Ceph Manager service (ceph-mgr) includes a web-based user interface to view monitoring information about a Ceph Storage Cluster. The Ceph Manager dashboard is read-only.

If you want to increase the security of the Ceph Manager dashboard, you should configure the module to listen on a local port, and use a proxy server to provide TLS or access control for remote access.
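
For example, to bind the dashboard to the loopback interface before placing it behind a proxy, you can set the dashboard address and then restart the Ceph Manager daemon on the manager node (a sketch assuming the config-key mechanism in this release; the address shown is an assumption for local-only access):

# ceph config-key put mgr/dashboard/server_addr 127.0.0.1
# systemctl restart ceph-mgr.target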

To deploy and connect to the Ceph Manager dashboard:

  1. Enable the Ceph Manager dashboard:

    # ceph mgr module enable dashboard

    If you want to enable the dashboard when performing a Ceph deployment, add the following to the $CEPH_CONFIG_DIR/ceph.conf file:

    [mon]
            mgr initial modules = dashboard
  2. Run the ceph mgr services command on the administration node to display the URL to access the dashboard, for example:

    # ceph mgr services
    {
        "dashboard": "http://ceph-node1.example.com:7000/"
    }

    The Ceph Manager dashboard runs on port 7000 by default. Connect to the URL displayed using a web browser to access the dashboard.

Removing a Ceph Storage Cluster Node

This section shows you how to remove OSD and Ceph Monitor nodes from a Ceph Storage Cluster. For information on removing other node types, see the upstream Ceph documentation.

Attention:

Performing the steps in this section removes all data on the Ceph Storage Cluster node.

Removing an OSD Node

This section shows you how to remove an OSD node from a Ceph Storage Cluster.

To remove an OSD node:

  1. On the OSD node, find the ID:

    # systemctl status ceph-osd@*

    The results list the ID of the OSD in a format similar to:

    ceph-osd@1.service - Ceph object storage daemon osd.1

    In this case the ID is 1.

  2. On the administration node, remove the OSD from the cluster. For example, run:

    # ceph osd out osd.1
  3. On the OSD node, stop and disable the service using the ID. For example, run:

    # systemctl stop ceph-osd@1.service
    # systemctl disable ceph-osd@1.service
  4. On the administration node, remove the OSD from the CRUSH map, remove the authorization keys, and delete the OSD from the cluster. For example, run:

    # ceph osd crush remove osd.1
    # ceph auth del osd.1
    # ceph osd rm 1
  5. To remove the data in the /var/lib/ceph directory, and uninstall the Ceph packages, from the deployment node, run:

    # $CEPH_CONFIG_DIR/ceph-deploy purge hostname

    Alternatively you can remove all data from the /var/lib/ceph directory, but leave the Ceph packages installed. To do this, from the deployment node, run:

    # $CEPH_CONFIG_DIR/ceph-deploy purgedata hostname
  6. On the OSD node, delete the logical volumes and volume groups used for OSDs on the disk. For example, to remove a logical volume, use the lvdisplay command to list the logical volumes, and delete any volumes:

    # lvdisplay
    ...
    # lvremove --force /dev/ceph-dc39f7cc-e423-48d3-a466-9701e7bf972a/osd-block-f7db38d2-...

    Use the vgdisplay command to list the volume groups, and delete any groups:

    # vgdisplay
    ...
    # vgremove --force ceph-dc39f7cc-e423-48d3-a466-9701e7bf972a
  7. If there are any configuration entries specific to the OSD you are removing in the Ceph configuration file, you should remove them. On the deployment node, edit the $CEPH_CONFIG_DIR/ceph.conf file to remove any entries for the node.

    On the deployment node, push the configuration change to all remaining OSD nodes. For example:

    # $CEPH_CONFIG_DIR/ceph-deploy --overwrite-conf config push ceph-node2 ceph-node3 ceph-node4

    On each remaining OSD node, restart the ceph-osd daemon:

    # systemctl restart ceph-osd.target

Removing a Ceph Monitor Node

This section shows you how to remove a Ceph Monitor node from a Ceph Storage Cluster.

To remove a Ceph Monitor node:

  1. On the deployment node, remove the Ceph Monitor node from the cluster. For example, run:

    # $CEPH_CONFIG_DIR/ceph-deploy mon destroy hostname
  2. On the Ceph Monitor node, stop and disable the service. For example, run:

    # systemctl stop ceph-mon@hostname.service
    # systemctl disable ceph-mon@hostname.service
  3. To remove the data in the /var/lib/ceph directory, and uninstall the Ceph packages, from the deployment node, run:

    # $CEPH_CONFIG_DIR/ceph-deploy purge hostname

    Alternatively you can remove all data from the /var/lib/ceph directory, but leave the Ceph packages installed. To do this, from the deployment node, run:

    # $CEPH_CONFIG_DIR/ceph-deploy purgedata hostname
  4. If there are any configuration entries specific to the Ceph Monitor node you are removing in the Ceph configuration file, you should remove them. On the deployment node, edit the $CEPH_CONFIG_DIR/ceph.conf file to remove any entries for the node.

    On the deployment node, push the configuration change to all remaining Ceph Monitor nodes. For example, run:

    # $CEPH_CONFIG_DIR/ceph-deploy --overwrite-conf config push ceph-node2 ceph-node3 ceph-node4

    On each remaining Ceph Monitor node, restart the ceph-mon daemon:

    # systemctl restart ceph-mon.target

Upgrading Ceph Storage Cluster Nodes

This section describes how to upgrade the Oracle Linux operating system, and update to the supported Unbreakable Enterprise Kernel. When the operating system on each node in the Ceph Storage Cluster has been upgraded, you can upgrade Ceph Storage for Oracle Linux on each node.

Upgrading Oracle Linux

The operating system must be at least Oracle Linux 7.5. Follow the steps to upgrade your operating system on each node in the Ceph Storage Cluster.

  1. If you are using ULN, subscribe to the Oracle Linux 7 ol7_x86_64_latest channel. Alternatively, if you are using the Oracle Linux yum server, enable access to the Oracle Linux 7 ol7_latest repository.

  2. Run the yum update command to update the packages.

    # yum update
  3. When the upgrade has completed, reboot the system.

    # systemctl reboot
  4. Select the new Oracle Linux 7 kernel version during the system startup if it is not the default boot kernel.

More detailed information on upgrading Oracle Linux 7 is available in Oracle® Linux 7: Installation Guide.

Upgrading Unbreakable Enterprise Kernel

The Unbreakable Enterprise Kernel must be at least UEK R5. Follow the steps to upgrade to UEK R5 on each node in the Ceph Storage Cluster.

  1. If you are using ULN, subscribe to the UEK R5 channel ol7_x86_64_UEKR5. Alternatively, if you are using the Oracle Linux yum server, enable access to the UEK R5 repository ol7_UEKR5. Make sure you also disable the UEK R4 channel and yum repository if they are enabled.

  2. Run the yum update command to update the packages.

    # yum update
  3. When the upgrade has completed, reboot the system.

    # systemctl reboot
  4. Select the UEK R5 kernel version during the system startup if it is not the default boot kernel.
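
You can confirm that the system is now running a UEK R5 (4.14 series) kernel, for example (the exact version string varies):

# uname -r
4.14.35-1818.3.3.el7uek.x86_64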

More detailed information on upgrading to UEK R5 is available in the Unbreakable Enterprise Kernel: Release Notes for Unbreakable Enterprise Kernel Release 5. Also review the release notes for the latest update release in the Unbreakable Enterprise Kernel documentation.

Upgrading Ceph Storage for Oracle Linux

This section discusses upgrading the Ceph Storage for Oracle Linux Release 2.0 components to the current release. You do not have to stop the Ceph daemons during the upgrade, but Oracle recommends that the Ceph Storage Cluster not be in use while it takes place. To protect data integrity, Oracle recommends:

  1. Remove the Ceph Storage Cluster from use.

  2. Unmount all file systems in the cluster, including those for Ceph Block Device, Ceph Object Gateway, and Ceph File System (Ceph FS).

  3. Upgrade the cluster using the instructions in this section.

  4. Confirm the cluster is healthy.

  5. Remount all file systems.

  6. Return the cluster to use.

Where component upgrades are required, Oracle recommends performing them in the following order:

  1. Ceph Deploy Package

  2. Ceph Monitors

  3. Ceph Managers (deploy new daemon)

  4. Ceph OSDs

  5. Ceph Metadata Servers

  6. Ceph Object Gateways

Oracle recommends that all daemons of a specific type are upgraded together to ensure that they are all on the same release, and that all of the components within a Ceph Storage Cluster are upgraded before you attempt to configure or use any new functionality in the current release.

The following instructions provide an outline of some of the common steps required to perform an upgrade for a Ceph Storage Cluster.

  1. To begin the upgrade process, the yum configuration on all systems that are part of the Ceph Storage Cluster must be updated to provide access to the appropriate yum repositories and channels as described in Enabling Access to the Ceph Storage for Oracle Linux Packages.

  2. From the administration node, check the Ceph Storage Cluster status with the following command:

    # ceph health

    If the health of the cluster is HEALTH_OK, continue with the upgrade. If not, make sure all cluster nodes are stable and healthy.

    Do not create any new erasure-code pools while upgrading the monitors.

    You can monitor the progress of your upgrade at any time using the ceph versions command, which lists the Ceph version for each daemon.

  3. From the administration node, set the sortbitwise option to change the internal sorting algorithm. This option is used by the new object enumeration API and by the new BlueStore backend:

    # ceph osd set sortbitwise
  4. From the administration node, set the noout option to prevent the CRUSH algorithm from attempting to rebalance the cluster during the upgrade:

    # ceph osd set noout
  5. Edit the Ceph configuration file and remove the workaround required in Ceph Storage for Oracle Linux Release 2.0:

    rbd default features = 3
  6. From the deployment node, push the updated Ceph configuration file to all the nodes in the cluster, for example:

    # $CEPH_CONFIG_DIR/ceph-deploy --overwrite-conf config push ceph-node2 ceph-node3 ceph-node4
  7. From the deployment node (usually the same as the administration node), upgrade the ceph-deploy package:

    # yum update ceph-deploy
  8. On the deployment node, use ceph-deploy to upgrade the Ceph Command Line Interface (CLI) on the administration node, for example:

    # $CEPH_CONFIG_DIR/ceph-deploy admin ceph-node1

    In this example, ceph-node1 is the host name of the client system configured to manage the cluster using the Ceph CLI. This node is also used as the deployment node in examples.

  9. From the deployment node, use ceph-deploy to upgrade the packages on each node in the Ceph Storage Cluster, for example:

    # $CEPH_CONFIG_DIR/ceph-deploy install ceph-node2 ceph-node3 ceph-node4

    Note:

    The upstream documentation mentions the --release switch, which is meant to allow you to control which release you are upgrading to; however, this switch does not have any effect when used in an Oracle Linux environment, and the packages are simply updated to the latest version using yum.

  10. On each Ceph Monitor node, restart the Ceph Monitor daemon:

    # systemctl restart ceph-mon.target

    When all Ceph Monitor daemons are upgraded, on the administration node, verify the monitor upgrade is complete:

    # ceph mon feature ls
    ...
    on current monmap (epoch 1)
    	persistent: [kraken,luminous]
    	required: [kraken,luminous]

    The output should include luminous under persistent features in the monmap list.

  11. From the deployment node, gather the monitor keys and the OSD and MDS bootstrap keyrings from one of the Ceph Monitor nodes, for example:

    # $CEPH_CONFIG_DIR/ceph-deploy gatherkeys ceph-node2
  12. From the deployment node, deploy new Ceph Manager daemons on each node that runs Ceph Monitor daemons, for example:

    # $CEPH_CONFIG_DIR/ceph-deploy --overwrite-conf mgr create ceph-node2 ceph-node3 ceph-node4

    On the administration node, you can verify the Ceph Manager daemons are running with the command:

    # ceph -s
    ... 
      services:
        mon: 3 daemons, quorum ceph-node2,ceph-node3,ceph-node4
        mgr: ceph-node2(active), standbys: ceph-node3, ceph-node4
        osd: 3 osds: 3 up, 3 in
    ...

    This command lists the mon, mgr and osd daemons.

    You can also verify the ceph-mgr daemon is running on each Ceph Manager node. For example, on a node named ceph-node2, which includes the Ceph Manager daemon:

    [ceph-node2 ~]# systemctl status ceph-mgr@ceph-node2
    ● ceph-mgr@ceph-node2.service - Ceph cluster manager daemon
       Loaded: loaded (/usr/lib/systemd/system/ceph-mgr@.service; enabled; vendor preset: disabled)
       Active: active (running) since <date>
     Main PID: 4902 (ceph-mgr)
       CGroup: /system.slice/system-ceph\x2dmgr.slice/ceph-mgr@ceph-node2.service
               └─4902 /usr/bin/ceph-mgr -f --cluster ceph --id ceph-node2 --setuser ceph --setgroup ceph...
    ...
  13. On each OSD node, restart the ceph-osd daemon:

    # systemctl restart ceph-osd.target

    From the administration node, you can monitor the progress of the OSD upgrades with the ceph osd versions command, for example:

    # ceph osd versions
    {
       "ceph version 12.2.4 (...) luminous (stable)": 2,
       "ceph version 10.2.6 (...)": 1,
    }

    The output shows two OSDs upgraded to Luminous, and one OSD not yet upgraded.

  14. On each node that runs the Ceph Metadata Server daemon, restart the ceph-mds daemon:

    # systemctl restart ceph-mds.target
  15. On each node that runs the Ceph Object Gateway daemon, restart the radosgw daemon:

    # systemctl restart radosgw.target
  16. On the administration node, prevent old (pre-Luminous) OSD nodes from joining the Ceph Storage Cluster:

    # ceph osd require-osd-release luminous
  17. Check the Ceph Storage Cluster status with the following command:

    # ceph health

    If the health of the cluster is HEALTH_OK, re-enable the CRUSH algorithm's ability to balance the cluster by unsetting the noout option:

    # ceph osd unset noout
  18. (Optional) BlueStore is the default backend for new OSDs in Ceph Storage for Oracle Linux Release 3.0. The FileStore backend can be used with upgraded OSDs and a mixture of BlueStore and FileStore OSDs is supported. If you want to migrate OSDs from FileStore to BlueStore, see the upstream documentation at:

    https://docs.ceph.com/en/latest/rados/operations/bluestore-migration/