Deploy a Scalable, Distributed File System Using GlusterFS

A scalable, distributed network file system is suitable for data-intensive tasks such as image processing and media streaming. When used in high-performance computing (HPC) environments, GlusterFS delivers high-performance access to large data sets, especially immutable files.

Architecture

This reference architecture contains the infrastructure components required for a distributed network file system. It contains three bare metal instances, which is the minimum required to set up high availability for GlusterFS.

In a three-server configuration, at least two servers must be online to allow write operations to the cluster. Data is replicated across all the nodes, as shown in the diagram.
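The three-node replicated cluster described above can be sketched with the gluster CLI. The hostnames (gfs-node1 through gfs-node3) and the brick path /bricks/brick1 are hypothetical; run the peer probes from one node after GlusterFS is installed on all three.

```shell
# From gfs-node1: add the other nodes to the trusted storage pool
# (hostnames are hypothetical placeholders).
gluster peer probe gfs-node2
gluster peer probe gfs-node3

# Create a replica-3 volume: every file is written to all three bricks.
gluster volume create gfs-vol replica 3 \
  gfs-node1:/bricks/brick1 \
  gfs-node2:/bricks/brick1 \
  gfs-node3:/bricks/brick1
gluster volume start gfs-vol

# With server-side quorum enabled, a majority (2 of 3) of the nodes
# must be online for the cluster to accept writes.
gluster volume set gfs-vol cluster.server-quorum-type server
```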

Description of the illustration glusterfs-oci.png
  • Regions

    A region is a localized geographic area composed of one or more availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or continents).

  • Availability domains

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.

  • Fault domains

    A fault domain is a grouping of hardware and infrastructure within an availability domain. Each availability domain has three fault domains with independent power and hardware. When you place Compute instances across multiple fault domains, applications can tolerate physical server failure, system maintenance, and many common networking and power failures inside the availability domain.

  • Virtual cloud network and subnets

    A VCN is a software-defined network that you set up in an Oracle Cloud Infrastructure region. VCNs can be segmented into subnets, which can be specific to a region or to an availability domain. Both region-specific and availability domain-specific subnets can coexist in the same VCN. A subnet can be public or private.

    This architecture uses two subnets: a public subnet to create the DMZ and host the bastion server, and a private subnet to host the GlusterFS nodes.

  • Network security groups

    Network security groups (NSGs) act as virtual firewalls for your compute instances. In a zero-trust security model, all traffic is denied by default, and you control exactly which traffic is allowed inside the VCN. In this architecture, the GlusterFS servers accept traffic only from the cluster nodes and clients on the required ports; all other traffic is denied.
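    As a concrete illustration of the ports involved, GlusterFS uses TCP port 24007 for the management daemon (glusterd) and, by default, one port per brick starting at 49152. The host-level firewalld commands below sketch the equivalent rules, assuming an Oracle Linux or RHEL-style node; NSG rules at the cloud layer would mirror them.

```shell
# Sketch: open the GlusterFS ports on a node's host firewall (firewalld assumed).
sudo firewall-cmd --permanent --add-port=24007/tcp          # glusterd management
sudo firewall-cmd --permanent --add-port=49152-49156/tcp    # brick ports (one per brick; widen as needed)
sudo firewall-cmd --reload
```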

  • Security lists

    For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.

  • GFS-nodes

    These are the GlusterFS headends, with 1 TB of block storage attached to each instance.

  • /gfs-data

    In the reference architecture, the client mounts the GlusterFS volume at the mount point /gfs-data, through which your application accesses the file system. Multiple servers can access the headend nodes in parallel.
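    A client-side mount along these lines might look like the following sketch; the package name, volume name, and hostnames are assumptions, and the native FUSE client is shown:

```shell
# On a client node (Oracle Linux / RHEL package name assumed):
sudo dnf install -y glusterfs-fuse

# Mount the volume through the native FUSE client. Any node can serve
# the volume file; backup-volfile-servers provides mount-time failover.
sudo mkdir -p /gfs-data
sudo mount -t glusterfs \
  -o backup-volfile-servers=gfs-node2:gfs-node3 \
  gfs-node1:/gfs-vol /gfs-data

# Optional /etc/fstab entry to remount at boot:
# gfs-node1:/gfs-vol  /gfs-data  glusterfs  defaults,_netdev,backup-volfile-servers=gfs-node2:gfs-node3  0 0
```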

  • Bastion host

    The bastion host is a compute instance that serves as a secure, controlled entry point to the topology from outside the cloud. The bastion host is typically provisioned in a demilitarized zone (DMZ). It enables you to protect sensitive resources by placing them in private networks that can't be accessed directly from outside the cloud. The topology has a single, known entry point that you can monitor and audit regularly. So, you can avoid exposing the more sensitive components of the topology without compromising access to them.

Recommendations

Your requirements might differ from the architecture described here. Use the following recommendations as a starting point.

  • GlusterFS architecture

    This architecture uses replicated GlusterFS volumes; the data is replicated across all the nodes. This configuration provides the highest availability of data but also uses the largest amount of space. As shown in the architecture diagram, when File1 is created, it is replicated across the nodes.

    GlusterFS supports the following architectures. Select an architecture that suits your requirements:
    • Distributed volumes

      This architecture is the default GlusterFS configuration and is used to obtain maximum volume size and scalability. It provides no data redundancy; if a brick in the volume fails, the data stored on that brick is lost.
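      For comparison with the replicated setup used in this architecture, a purely distributed volume is created without the replica keyword (hostnames and paths are hypothetical):

```shell
# Files are hashed across the bricks; total capacity is the sum of
# all bricks, but a failed brick loses the files placed on it.
gluster volume create dist-vol \
  gfs-node1:/bricks/brick1 \
  gfs-node2:/bricks/brick1 \
  gfs-node3:/bricks/brick1
gluster volume start dist-vol
```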

    • Replicated volumes

      This architecture is most commonly used where high availability is critical. Data loss from brick failures is avoided by replicating data across two or more bricks. This reference architecture uses the replicated volumes configuration.

    • Distributed replicated volumes

      This architecture is a combination of distributed and replicated volumes and is used for obtaining larger volume sizes than a replicated volume and higher availability than a distributed volume. In this configuration, data is replicated onto a subset of the total number of bricks. The number of bricks must be a multiple of the replica count. For example, four bricks of 1 TB each will give you a distributed space of 2 TB with a two-fold replication.
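      The four-brick example above can be sketched as follows. Brick order matters because consecutive bricks form a replica set, so bricks from different servers should be paired (hostnames are hypothetical):

```shell
# Four 1 TB bricks with replica 2 -> two replica pairs, ~2 TB usable.
gluster volume create dist-rep-vol replica 2 \
  gfs-node1:/bricks/brick1 gfs-node2:/bricks/brick1 \
  gfs-node3:/bricks/brick1 gfs-node4:/bricks/brick1
gluster volume start dist-rep-vol
```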

    • Striped volumes

      In this architecture, large files are divided into smaller chunks, and each chunk is stored in a separate brick. The load is distributed across the bricks and files can be fetched faster, but no data redundancy is available.

    • Distributed striped volumes

      This architecture is used for large files that are distributed across a larger number of bricks. The trade-off with this configuration is that to increase the volume size, you must add bricks in multiples of the stripe count.

  • Compute shapes

    This architecture uses a bare metal shape (BM.Standard2.52) for all the GlusterFS nodes. These bare metal compute instances have two physical NICs that can push traffic at 25 Gbps each. The second physical NIC is dedicated to GlusterFS traffic.

  • Block storage

    This architecture uses 1 TB of block storage. We recommend that you configure a logical volume manager (LVM) to allow the volume to grow if you need more space. Each block volume is configured to use balanced performance and provides 35K IOPS and 480 MB/s of throughput.
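    An LVM-backed brick that can grow later might be set up as in this sketch; the device names (/dev/sdb, /dev/sdc) and the volume group name are assumptions:

```shell
# Create the initial brick on the attached block volume.
sudo pvcreate /dev/sdb
sudo vgcreate vg_gluster /dev/sdb
sudo lvcreate -l 100%FREE -n lv_brick1 vg_gluster
sudo mkfs.xfs -i size=512 /dev/vg_gluster/lv_brick1   # 512-byte inodes leave room for GlusterFS xattrs
sudo mkdir -p /bricks/brick1
sudo mount /dev/vg_gluster/lv_brick1 /bricks/brick1

# Later, after attaching an additional block volume (for example /dev/sdc):
# sudo pvcreate /dev/sdc
# sudo vgextend vg_gluster /dev/sdc
# sudo lvextend -l +100%FREE -r /dev/vg_gluster/lv_brick1   # -r also grows the XFS filesystem
```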

  • Virtual cloud network (VCN)

    When you create the VCN, determine how many IP addresses your cloud resources in each subnet require. Using Classless Inter-Domain Routing (CIDR) notation, specify a subnet mask and a network address range large enough for the required IP addresses.

    Select an address range that doesn’t overlap with your on-premises network, so that you can set up a connection between the VCN and your on-premises network later, if necessary.

    After you create a VCN, you can't change its address range.

    When you design the subnets, consider your functionality and security requirements. Attach all the compute instances within the same tier or role to the same subnet, which can serve as a security boundary.

  • Network security groups (NSGs)

    You can use NSGs to define a set of ingress and egress rules that apply to specific VNICs. We recommend using NSGs rather than security lists, because NSGs enable you to separate the VCN's subnet architecture from the security requirements of your application. In the reference architecture, all the network communication is controlled through NSGs.

  • Security lists

    Use security lists to define ingress and egress rules that apply to the entire subnet.

Considerations

  • Performance

    To get the best performance, use dedicated NICs for communication from the application to your users and to the GlusterFS headends. Use the primary NIC for communication between your application and the users. Use the secondary NIC for communication with the GlusterFS headends. You can also change the volume performance for block storage to increase or decrease the IOPS and throughput of your disk.

  • Availability

    Fault domains provide the best resiliency within an availability domain. If you need higher availability, consider using multiple availability domains or multiple regions. For mission-critical workloads, consider using distributed replicated GlusterFS volumes.

  • Cost

    The cost of your GlusterFS deployment depends on your requirements for disk performance and availability:
    • You can choose from the following performance options: high performance, balanced performance, and low cost.
    • For higher availability, you need a larger number of GlusterFS nodes and volumes.

Deploy

The Terraform code for this reference architecture is available as a stack in Oracle Cloud Marketplace. You can also download the code from GitHub, and customize it to suit your specific business requirements.

  • Deploy using the stack in Oracle Cloud Marketplace:
    1. Go to Oracle Cloud Marketplace.
    2. Click Get App.
    3. Follow the on-screen prompts.
  • Deploy using the Terraform code in GitHub:
    1. Go to GitHub.
    2. Clone or download the repository to your local computer.
    3. Follow the instructions in the README document.

More Information

GlusterFS documentation