Deploy a Predictive, Federated Healthcare Analytics Platform on Oracle Cloud

Federated learning may soon become the de facto standard for analyzing patient healthcare data on a massive scale. Because the federated learning process occurs locally at any participating hospital, clinic, or pharmacy, the training algorithms can be built without exporting patient data outside of any one system's firewall.

New Jersey-based SymetryML uses federated learning algorithms to run its predictive healthcare analytics platform on Oracle Cloud Infrastructure (OCI). The platform provides data contributors and consumers with federated learning (FL) models that leverage billions of petabytes of anonymized patient healthcare data in near real time.

Although federated learning models are often difficult to create without adhering to stringent governance models and a global federation configuration, SymetryML's app bypasses these constraints because it doesn't rely on a predefined governance model. Instead, the app allows each peer to share some knowledge about its own data with the other peers in a HIPAA-compliant way. Each peer can then build any model it wants, or access various analytics, from the shared knowledge in the SymetryML federation. This dynamic approach to governance protects against common federated learning data leakage and eliminates the need for each federation peer node to have similar, independent and identically distributed (IID) datasets. SymetryML allows federation between multiple peers whose data follows different probability distribution functions and has other differing statistical characteristics.
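
As a rough illustration of how peers can share knowledge about their data rather than the data itself, the following sketch has each peer reduce its records to a few additive summary statistics, which are then merged and used to fit a simple one-feature linear model. This is not SymetryML's actual algorithm or data format. The `Summary` class, the function names, and the sample values are hypothetical and are chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Additive summary statistics for one-feature least squares (hypothetical)."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0

    def merge(self, other: "Summary") -> "Summary":
        # Summaries combine by simple addition, so no raw rows are exchanged.
        return Summary(
            self.n + other.n,
            self.sx + other.sx,
            self.sy + other.sy,
            self.sxx + other.sxx,
            self.sxy + other.sxy,
        )


def summarize(rows):
    """Reduce local (x, y) rows to a Summary; runs inside each peer."""
    s = Summary()
    for x, y in rows:
        s = s.merge(Summary(1, x, y, x * x, x * y))
    return s


def fit(s):
    """Ordinary least squares slope and intercept from a Summary."""
    denom = s.n * s.sxx - s.sx ** 2
    slope = (s.n * s.sxy - s.sx * s.sy) / denom
    intercept = (s.sy - slope * s.sx) / s.n
    return slope, intercept


# Made-up local datasets; each stays with its owner.
hospital_a = [(1.0, 3.0), (2.0, 5.0)]
hospital_b = [(3.0, 7.0), (4.0, 9.0)]

# Only the summaries are shared and merged.
shared = summarize(hospital_a).merge(summarize(hospital_b))
slope, intercept = fit(shared)
print(f"slope={slope:.2f} intercept={intercept:.2f}")
```

Because the summaries are additive, the model fitted from the merged summary matches the one that would be fitted on the pooled rows, even though the peers hold differently distributed data. Sharing summaries alone does not make a system HIPAA compliant; that depends on what the summaries reveal and on the surrounding controls.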

Highlights of the SymetryML deployment on OCI include:

  • Machine learning instances can be provisioned for virtual machines and bare metal, both of which run on NVIDIA GPUs
  • Data for the federated learning models is ingested by using Oracle Cloud Infrastructure Streaming, and the resulting shared knowledge, the data element matrix (DEM), is stored in Redis databases and in Oracle Cloud Infrastructure Object Storage
  • Real-time data analytics models can run on Kafka, which is deployed in containers on OCI

Architecture

The SymetryML app contains two systems: one for the software-as-a-service (SaaS) application interface, and the other for the machine-learning interface.

The SaaS application is deployed on a single virtual machine instance, alongside a Redis database virtual machine instance, in a virtual cloud network (VCN). Each customer that uses the SymetryML app is allocated a SaaS instance.

The machine-learning interface is deployed in a second VCN that contains a subnet for data streaming and analytics; a machine-learning subnet for autoML, predictive modeling, and real-time metrics; a database subnet for persistent data storage using Redis; and an optional federation subnet that contains a NATS cluster.
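
The subnet layout described above can be sketched as a small address plan. The source does not give any CIDR ranges, so the VCN range, the /24 subnet size, and the subnet names below are assumptions for illustration only. The sketch uses Python's standard `ipaddress` module to carve the subnets out of the VCN range and to check that they stay inside it and do not overlap.

```python
import ipaddress

# Hypothetical address plan; the source does not specify CIDR ranges.
ML_VCN = ipaddress.ip_network("10.1.0.0/16")
SUBNET_NAMES = ["streaming-analytics", "machine-learning", "database", "federation"]


def plan_subnets(vcn, names, new_prefix=24):
    """Assign consecutive blocks of the VCN range to the named subnets."""
    blocks = vcn.subnets(new_prefix=new_prefix)
    return {name: next(blocks) for name in names}


def validate(vcn, plan):
    """Raise ValueError if a subnet leaves the VCN range or overlaps another."""
    nets = list(plan.items())
    for i, (name, net) in enumerate(nets):
        if not net.subnet_of(vcn):
            raise ValueError(f"{name} ({net}) is outside the VCN range {vcn}")
        for other_name, other in nets[i + 1:]:
            if net.overlaps(other):
                raise ValueError(f"{name} ({net}) overlaps {other_name} ({other})")


plan = plan_subnets(ML_VCN, SUBNET_NAMES)
validate(ML_VCN, plan)
for name, net in plan.items():
    print(f"{name}: {net}")
```

Because the federation subnet is optional, it can be dropped from `SUBNET_NAMES` without affecting the other ranges.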

Data consumers are allocated SaaS application instances, which they can log on to and configure for their needs. Consumers can choose among machine-learning instance types based on their analytics requirements: virtual machines or bare metal instances, both with NVIDIA GPUs. After a consumer chooses the instance type, the instances are provisioned. From the SaaS instance, users can issue commands to stop, start, and restart instances as needed and can review the data that has been processed.
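
The source does not describe how the SaaS instance issues the stop, start, and restart commands. One plausible approach, sketched below, maps each command to an action on the OCI Compute API through a client that exposes `instance_action(instance_id, action)`, as the OCI Python SDK's `ComputeClient` does. The `ACTIONS` mapping and the `control_instance` helper are hypothetical, and the choice of graceful actions (`SOFTSTOP`, `SOFTRESET`) is an assumption.

```python
# Hypothetical mapping from user-facing commands to Compute instance actions.
# SOFTSTOP and SOFTRESET ask the operating system to shut down gracefully.
ACTIONS = {"stop": "SOFTSTOP", "start": "START", "restart": "SOFTRESET"}


def control_instance(compute_client, instance_id, command):
    """Translate a command into an instance action and send it.

    compute_client is any object with an instance_action(instance_id, action)
    method. With the OCI Python SDK this would typically be
    oci.core.ComputeClient(oci.config.from_file()).
    """
    try:
        action = ACTIONS[command.lower()]
    except KeyError:
        raise ValueError(f"unsupported command: {command!r}") from None
    return compute_client.instance_action(instance_id, action)
```

Passing the client in as a parameter keeps the helper independent of how credentials are loaded and makes it easy to exercise with a stand-in object before it is pointed at real instances.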

The following diagram illustrates the architecture:

[Diagram: symetryml-oci-architecture-oracle.zip]

To provide data for machine learning, a data contributor can use the SymetryML Desktop Application to point to a variety of data sources such as databases, streaming data, and object stores that contain patient raw data.

If a customer requires real-time streaming and analytics, they can take advantage of Kafka. If the data consumer wants to take advantage of federation, they can choose to deploy a NATS cluster or use a NATS cluster outside of the OCI VCN. The NATS cluster is optional; it is not automatically deployed as part of the SymetryML architecture but is required for enabling SymetryML federated learning.

Considerations for SymetryML's future state roadmap include:

  • Deploy an Apache Spark Service, using Oracle Cloud Infrastructure Data Flow

    Because SymetryML's home-grown Java applications run as Spark applications, Oracle suggests using Data Flow, a cloud-based serverless platform that processes tasks on extremely large datasets. It allows Spark developers and data scientists to create, edit, and run Spark jobs at any scale without the need for clusters, an operations team, or highly specialized Spark knowledge. Because it is serverless, there is no infrastructure to deploy or manage. Data Flow is driven entirely by REST APIs, which makes it easy to integrate with applications or workflows. By using Data Flow Pools, you can support streaming session workloads from various users at the same time in the same tenancy. Because Data Flow commands are also available as part of the Oracle Cloud Infrastructure command line interface, users can easily:

    • Connect to Apache Spark data sources
    • Create reusable Apache Spark applications
    • Launch Apache Spark jobs in seconds
    • Create Apache Spark applications by using SQL, Python, Java, Scala, or the spark-submit script
    • Manage all Apache Spark applications from a single platform
    • Process data in the Cloud or on-premises in your data center
    • Create Big Data building blocks that you can easily assemble into advanced Big Data applications
  • Replace Apache Kafka with Oracle Cloud Infrastructure Streaming

    Oracle Cloud Infrastructure Streaming lets Apache Kafka users offload the setup, maintenance, and infrastructure management that hosting their own ZooKeeper and Kafka cluster requires. Because the service is compatible with most Kafka APIs, applications written for Kafka can send and receive messages through it without being rewritten. The service can also use the Kafka Connect ecosystem to interface directly with external sources such as databases, object stores, or any microservice in Oracle Cloud. Kafka connectors can automatically create, publish to, and deliver topics while taking advantage of the service's high throughput and durability. Oracle Cloud Infrastructure Streaming is a fully managed, scalable, and durable messaging solution for ingesting continuous, high-volume streams of data that you can consume and process in real time. It is serverless: Oracle manages the networking, storage, and configuration needed to stream your data, so you do not have to provision infrastructure, perform ongoing maintenance, or apply security patches. The service synchronously replicates data across three availability domains, providing high availability and data durability. In regions with a single availability domain, the data is replicated across three fault domains. Some possible uses for Oracle Cloud Infrastructure Streaming include:

    • Messaging: Producers and consumers can use Oracle Cloud Infrastructure Streaming as an asynchronous message bus and act independently and at their own pace.
    • Metric and log ingestion: Use Oracle Cloud Infrastructure Streaming as an alternative to traditional file-scraping approaches to help make critical operational data quickly available for indexing, analysis, and visualization.
    • Web or mobile activity data ingestion: Use Oracle Cloud Infrastructure Streaming for capturing activity from websites or mobile apps, such as page views, searches, or other user actions. You can use this information for real-time monitoring and analytics, and in data warehousing systems for offline processing and reporting.
    • Infrastructure and apps event processing: Use Oracle Cloud Infrastructure Streaming as a unified entry point for cloud components to report their lifecycle events for audit, accounting, and related activities.
  • Deploy Autonomous Database

    Oracle Autonomous Database (available only on Exadata infrastructure) is a fully managed PaaS service offering auto indexing, tuning, backup, provisioning, auto patching, and upgrades. Autonomous Database also encrypts the database, backups, and all network connections. In addition to providing a vault, data masking, and immediate scaling without downtime, Autonomous Database comes with two disaster recovery options: Active Data Guard can be applied synchronously in the same region and Data Guard can be configured asynchronously across regions, which provides protection in case of a disaster in the entire primary region. Some of the important features include:

    • Cloud elasticity that lowers costs by autoscaling up to 3X (and back down)
    • Prioritization of database connections for efficient resource utilization
    • Local and remote standby databases hosted by using Oracle Data Guard
    • Automatic failover to a standby with zero data loss, which is completely transparent to end-user applications and provides a 99.995% SLA
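
For the Kafka replacement item in the list above, the practical meaning of Kafka API compatibility is that an existing Kafka client can usually be pointed at Oracle Cloud Infrastructure Streaming by changing its connection settings. The sketch below builds such settings as a plain dictionary that uses librdkafka-style keys. The `streaming_kafka_config` helper and all argument values are hypothetical, and the endpoint and user name formats follow the commonly documented pattern for the service as an assumption. Check them against the Kafka connection settings shown for your stream pool.

```python
def streaming_kafka_config(region, tenancy, user, stream_pool_ocid, auth_token):
    """Build Kafka client settings for OCI Streaming (assumed formats).

    The service is accessed over SASL_SSL with the PLAIN mechanism. The user
    name combines the tenancy name, user name, and stream pool OCID, and the
    password is an auth token generated for that user.
    """
    return {
        "bootstrap.servers": f"cell-1.streaming.{region}.oci.oraclecloud.com:9092",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.username": f"{tenancy}/{user}/{stream_pool_ocid}",
        "sasl.password": auth_token,
    }


# Placeholder values only; none of these identify a real tenancy or stream pool.
config = streaming_kafka_config(
    region="us-ashburn-1",
    tenancy="exampletenancy",
    user="example.user@example.com",
    stream_pool_ocid="ocid1.streampool.oc1..example",
    auth_token="<auth-token>",
)
print(config["bootstrap.servers"])
```

Keeping the settings in one function means the application code that produces and consumes messages can stay unchanged when the cluster behind it changes.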

The following diagram illustrates the future architecture.

[Diagram: symetryml-oci-future-oracle.zip]

The architecture has the following components:

  • Tenancy

    A tenancy is a secure and isolated partition that Oracle sets up within Oracle Cloud when you sign up for Oracle Cloud Infrastructure. You can create, organize, and administer your resources in Oracle Cloud within your tenancy. A tenancy is synonymous with a company or organization. Usually, a company will have a single tenancy and reflect its organizational structure within that tenancy. A single tenancy is usually associated with a single subscription, and a single subscription usually only has one tenancy.

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Availability domain

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain shouldn't affect the other availability domains in the region.

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Route table

    Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.

  • Security list

    For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.

  • Internet gateway

    The internet gateway allows traffic between the public subnets in a VCN and the public internet.

  • Service gateway

    The service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and does not traverse the internet.

  • Local peering gateway (LPG)

    An LPG enables you to peer one VCN with another VCN in the same region. Peering means the VCNs communicate using private IP addresses, without the traffic traversing the internet or routing through your on-premises network.

  • Compute

    The Oracle Cloud Infrastructure Compute service enables you to provision and manage compute hosts in the cloud. You can launch compute instances with shapes that meet your resource requirements for CPU, memory, network bandwidth, and storage. After creating a compute instance, you can access it securely, restart it, attach and detach volumes, and terminate it when you no longer need it.

  • Bare metal

    Oracle’s bare metal servers provide isolation, visibility, and control by using dedicated compute instances. The servers support applications that require high core counts, large amounts of memory, and high bandwidth. They can scale up to 160 cores (the largest in the industry), 2 TB of RAM, and up to 1 PB of block storage. Customers can build cloud environments on Oracle’s bare metal servers with significant performance improvements over other public clouds and on-premises data centers.

  • Identity and Access Management (IAM)

    Oracle Cloud Infrastructure Identity and Access Management (IAM) is the access control plane for Oracle Cloud Infrastructure (OCI) and Oracle Cloud Applications. The IAM API and the user interface enable you to manage identity domains and the resources within the identity domain. Each OCI IAM identity domain represents a standalone identity and access management solution or a different user population.

Acknowledgments

  • Authors: Robert Huie, Sasha Banks-Louie
  • Contributors: Brad Goodwin, Kyle Finnerty, Kyle Adams, Puneet Khana, Ganesh Pitchaiah, Robert Lies
  • SymetryML Team: Dustin O'Dell, Neil Couture