Deploy NVIDIA GPUs for Molecular Modeling and Drug Discovery on Oracle Cloud

GridMarkets' Envoy platform runs on Oracle Cloud in data centers worldwide to dramatically reduce the time required to simulate a drug molecule’s reaction to different proteins.

With the advent of computer-aided drug design (CADD) as a method of modeling medicinal compounds more than four decades ago, drug researchers have been able to screen larger numbers of molecules and identify the most promising drug candidates faster and more cheaply than they could in a lab. By combining advances in machine learning techniques, compute power, parallelization, and cloud-native simulation platforms such as GridMarkets, drug researchers have reduced the time it takes to simulate a drug molecule’s reaction to different proteins from several weeks or months to just hours.

By using GridMarkets' proprietary Envoy application, which integrates into popular molecular modeling platforms such as AMBER, GROMACS, NAMD, and MOE, drug researchers can submit thousands of ligands, all of which can run in parallel within a day, depending on the number of machines and their processing power.
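As a rough illustration of how the machine count drives turnaround, the following sketch estimates wall-clock time for a batch of ligands. The ligand count, the runtime per ligand, and the machine counts are assumed values chosen for illustration, not GridMarkets figures. The model also assumes that each machine runs one ligand at a time and ignores scheduling and data transfer overhead.

```python
import math


def estimated_wall_clock_hours(num_ligands, num_machines, hours_per_ligand):
    """Estimate batch turnaround when each machine runs one ligand at a time."""
    if num_machines < 1:
        raise ValueError("at least one machine is required")
    # Ligands are processed in waves; the last wave may be only partly filled.
    waves = math.ceil(num_ligands / num_machines)
    return waves * hours_per_ligand


# Hypothetical inputs: 2,000 ligands that each take 3 hours to simulate.
for machines in (10, 250, 2000):
    hours = estimated_wall_clock_hours(2000, machines, 3.0)
    print(f"{machines:>5} machines -> {hours:.0f} hours")
```

Under these assumptions, 10 machines need 600 hours, 250 machines finish within a day, and one machine per ligand brings the batch down to the runtime of a single simulation.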

Cofounded in 2011 by serial entrepreneurs Mark Ross and Hakim Karim, GridMarkets runs its molecular simulation platform on high-performance servers located on Oracle Cloud in data centers around the world. With Oracle Cloud Infrastructure, there’s no need to queue requests or to schedule simulations. Instead, GridMarkets' customers can access an (almost) unlimited number of machines whenever they need them, without having to pay for unused capacity when they don’t.

Within seconds after customers select the number of machines on which to run their simulations, GridMarkets configures the software and compute resources, encrypts the data, and submits the request. When the job’s finished, the results are automatically returned and the machines are shut down, so there are no lingering costs. Because GridMarkets' workflow doesn't tie up local resources, drug researchers don’t have to sit and wait behind a company’s firewall for their results. Instead, they can run their simulations from a laptop anywhere in the world. Using zero-trust methods for defense-in-depth security, GridMarkets has secured its platform on Oracle Cloud to protect its own environment, as well as the intellectual property of its customers.

Architecture

GridMarkets is a multicloud platform, accessed through a proprietary application called Envoy from an end user's desktop.

The Envoy client uses an API to request access to the head-end region hosted on Oracle Cloud Infrastructure (OCI). Oracle Cloud Infrastructure Load Balancing provides high availability (HA) for the front end, user interface, and microservices that track users, jobs, tasks, and billing. These microservices run in Docker containers. Oracle MySQL Database Service stores the data collected from the front end, while Redis and RabbitMQ handle the ephemeral transactional data needed to run the service.

After users are authenticated and have established a connection with the head-end region, they can request the number of machines and CPUs or GPUs necessary to run their simulations. The head-end region determines where to send the request. The request can be sent to any cloud service provider or any region within OCI depending on the availability of the types of machines requested by GridMarkets’ clients.
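The placement decision described above can be sketched as a simple selection over pools of available machines. This is a minimal illustration, not GridMarkets' actual logic: the `Pool` type, the `choose_pool` function, the preference for OCI, and the capacity numbers are all assumptions, and the region names are used only as examples.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pool:
    provider: str      # "oci" or another cloud provider (hypothetical labels)
    region: str
    machine_type: str  # "cpu" or "gpu" in this sketch
    available: int     # machines currently free


def choose_pool(pools: List[Pool], machine_type: str, count: int,
                preferred_provider: str = "oci") -> Optional[Pool]:
    """Return a pool that can satisfy the whole request, or None."""
    candidates = [p for p in pools
                  if p.machine_type == machine_type and p.available >= count]
    if not candidates:
        return None
    # Assumed policy: prefer the home provider, then the pool with most headroom.
    return max(candidates,
               key=lambda p: (p.provider == preferred_provider, p.available))


pools = [
    Pool("oci", "us-ashburn-1", "gpu", 4),
    Pool("oci", "eu-frankfurt-1", "gpu", 16),
    Pool("other-cloud", "region-a", "gpu", 64),
]
print(choose_pool(pools, "gpu", 8))
print(choose_pool(pools, "gpu", 32))
```

A request for 8 GPU machines stays in the OCI region that has enough capacity, while a request for 32 falls through to the other provider because no OCI pool in this example can satisfy it.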

After the request is completed, Envoy uploads the data to be modeled, simulated, or rendered to the Oracle Cloud Infrastructure Object Storage building block. The data is then pulled from object storage and stored in Oracle Cloud Infrastructure Block Volumes attached to a NAS filer (scratch and staging building block) for faster storage access during application execution. Based on the requested compute shape, the management server initiates the application to start processing the data using the HPC cluster in the OCI region requested (CPU or GPU building block). After the modeling or simulations are complete, the result is pushed back to object storage and is automatically downloaded to the user through the Envoy client.
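The data flow in this paragraph can be traced with a small in-memory model. In the sketch below, plain dictionaries stand in for the object storage bucket and the scratch volumes, and the `run_job` function, the key layout, and the `simulate` callback are hypothetical names introduced for illustration. No real OCI API is called.

```python
def run_job(job_id, input_files, simulate):
    """Trace one job through upload, staging, processing, and download."""
    object_storage = {}  # stand-in for an Object Storage bucket
    scratch = {}         # stand-in for block volumes behind the NAS filer

    # 1. The client uploads the input data to object storage.
    for name, data in input_files.items():
        object_storage[f"{job_id}/input/{name}"] = data

    # 2. Inputs are pulled onto scratch storage for faster access.
    in_prefix = f"{job_id}/input/"
    for key, data in object_storage.items():
        if key.startswith(in_prefix):
            scratch[key[len(in_prefix):]] = data

    # 3. The cluster processes the staged data.
    results = {name: simulate(data) for name, data in scratch.items()}

    # 4. Results are pushed back to object storage.
    for name, data in results.items():
        object_storage[f"{job_id}/output/{name}"] = data

    # 5. The client downloads everything under the job's output prefix.
    out_prefix = f"{job_id}/output/"
    return {key[len(out_prefix):]: value
            for key, value in object_storage.items()
            if key.startswith(out_prefix)}


# Placeholder "simulation" that only measures the size of each input.
outputs = run_job("job-1", {"ligand_a.pdb": b"ATOM", "ligand_b.pdb": b"HETATM"},
                  simulate=len)
print(outputs)
```

Keeping object storage as the only hand-off point between the client and the cluster means the compute nodes never need a direct connection to the user's machine.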

In the background, the management server schedules the jobs, allocates resources, performs queue and file management, and reports availability, usage, and billing information back to the head-end region.
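To make these responsibilities concrete, the following sketch models a management server as a first-in, first-out queue with a fixed pool of machines. The class and method names, the strict arrival-order policy, and the machine-hour usage figure are assumptions for illustration. The real scheduler is not described in this document.

```python
from collections import deque


class ManagementServer:
    def __init__(self, total_machines):
        self.free = total_machines
        self.queue = deque()  # (job_id, machines) waiting, in arrival order
        self.running = {}     # job_id -> machines allocated
        self.usage = {}       # job_id -> machine-hours consumed

    def submit(self, job_id, machines):
        self.queue.append((job_id, machines))
        self._dispatch()

    def _dispatch(self):
        # Start queued jobs in order for as long as the head of the queue fits.
        while self.queue and self.queue[0][1] <= self.free:
            job_id, machines = self.queue.popleft()
            self.free -= machines
            self.running[job_id] = machines

    def finish(self, job_id, hours):
        machines = self.running.pop(job_id)
        self.free += machines
        self.usage[job_id] = machines * hours  # figure reported for billing
        self._dispatch()

    def report(self):
        """Summary that would be sent back to the head-end region."""
        return {"available": self.free,
                "queued": len(self.queue),
                "usage": dict(self.usage)}


server = ManagementServer(total_machines=10)
server.submit("job-a", 6)
server.submit("job-b", 6)    # waits until job-a releases its machines
print(server.report())
server.finish("job-a", hours=2.0)
print(server.report())
```

In this sketch, a job that asks for more machines than are free blocks the jobs behind it. A production scheduler would likely use a more flexible policy.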

The following diagram illustrates this reference architecture.

[Architecture diagram: gridmarkets-oci-arch-oracle.zip]

On the roadmap for GridMarkets is the use of NVIDIA A10 Tensor Core GPUs on virtual machines when that option becomes available. GridMarkets is also exploring options for integrating artificial intelligence (AI) and machine learning (ML) for resource management.

The architecture has the following components:

  • Tenancy

    A tenancy is a secure and isolated partition that Oracle sets up within Oracle Cloud when you sign up for Oracle Cloud Infrastructure. You can create, organize, and administer your resources in Oracle Cloud within your tenancy. A tenancy is synonymous with a company or organization. Usually, a company will have a single tenancy and reflect its organizational structure within that tenancy. A single tenancy is usually associated with a single subscription, and a single subscription usually only has one tenancy.

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Availability domain

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Security list

    For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.

  • Route table

    Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.

  • Internet gateway

    The internet gateway allows traffic between the public subnets in a VCN and the public internet.

  • Service gateway

    The service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and never traverses the internet.

  • Load balancer

    The Oracle Cloud Infrastructure Load Balancing service provides automated traffic distribution from a single entry point to multiple servers in the back end.

  • Compute

    The Oracle Cloud Infrastructure Compute service enables you to provision and manage compute hosts in the cloud. You can launch compute instances with shapes that meet your resource requirements for CPU, memory, network bandwidth, and storage. After creating a compute instance, you can access it securely, restart it, attach and detach volumes, and terminate it when you no longer need it.

  • Bare metal

    Oracle’s bare metal servers provide isolation, visibility, and control by using dedicated compute instances. The servers support applications that require high core counts, large amounts of memory, and high bandwidth. They can scale up to 160 cores (the largest in the industry), 2 TB of RAM, and up to 1 PB of block storage. Customers can build cloud environments on Oracle’s bare metal servers with significant performance improvements over other public clouds and on-premises data centers.

  • Object storage

    Object storage provides quick access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store and then retrieve data directly from the internet or from within the cloud platform. You can seamlessly scale storage without experiencing any degradation in performance or service reliability. Use standard storage for "hot" storage that you need to access quickly, immediately, and frequently. Use archive storage for "cold" storage that you retain for long periods of time and rarely access.

  • Block volume

    With block storage volumes, you can create, attach, connect, and move storage volumes, and change volume performance to meet your storage, performance, and application requirements. After you attach and connect a volume to an instance, you can use the volume like a regular hard drive. You can also disconnect a volume and attach it to another instance without losing data.

  • Oracle MySQL Database Service

    Oracle MySQL Database Service is a fully managed Oracle Cloud Infrastructure (OCI) database service that lets developers quickly develop and deploy secure, cloud native applications. Optimized for and exclusively available in OCI, Oracle MySQL Database Service is 100% built, managed, and supported by the OCI and MySQL engineering teams.

    Oracle MySQL Database Service has an integrated, high-performance analytics engine (HeatWave) to run sophisticated real-time analytics directly against an operational MySQL database.

  • Monitoring

    The Oracle Cloud Infrastructure Monitoring service actively and passively monitors your cloud resources. It uses metrics to track the resources and alarms to notify you when those metrics meet alarm-specified triggers.

  • Logging

    Logging is a highly scalable and fully managed service that provides access to the following types of logs from your resources in the cloud:

      • Audit logs: Logs related to events emitted by the Audit service.
      • Service logs: Logs emitted by individual services such as API Gateway, Events, Functions, Load Balancing, Object Storage, and VCN flow logs.
      • Custom logs: Logs that contain diagnostic information from custom applications, other cloud providers, or an on-premises environment.

  • Events

    Oracle Cloud Infrastructure services emit events, which are structured messages that describe the changes in resources. Events are emitted for create, read, update, or delete (CRUD) operations, resource lifecycle state changes, and system events that affect cloud resources.

  • Email Delivery

    Oracle Cloud Infrastructure Email Delivery is a highly scalable, cost effective, and reliable email delivery service for sending high-volume, application-generated emails for mission-critical marketing, notification, and transactional communications such as receipts, fraud detection alerts, multifactor identity verification, and password resets.
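
The VCN and subnet rules in the component list (each subnet sits inside the VCN's address range and does not overlap any other subnet) can be checked before provisioning. The sketch below uses Python's standard `ipaddress` module. The CIDR blocks, the subnet names, and the `validate_subnets` helper are hypothetical and do not describe GridMarkets' actual network layout.

```python
import ipaddress

# Hypothetical address plan for illustration only.
vcn = ipaddress.ip_network("10.0.0.0/16")
subnets = {
    "public-lb": ipaddress.ip_network("10.0.0.0/24"),
    "private-app": ipaddress.ip_network("10.0.1.0/24"),
    "private-hpc": ipaddress.ip_network("10.0.16.0/20"),
}


def validate_subnets(vcn, subnets):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    items = list(subnets.items())
    for i, (name, net) in enumerate(items):
        # Every subnet must fall inside the VCN's CIDR block.
        if not net.subnet_of(vcn):
            problems.append(f"{name} ({net}) is outside the VCN {vcn}")
        # Compare each pair of subnets once to detect overlaps.
        for other_name, other in items[i + 1:]:
            if net.overlaps(other):
                problems.append(f"{name} ({net}) overlaps {other_name} ({other})")
    return problems


print(validate_subnets(vcn, subnets))
```

Adding a subnet such as 10.0.1.128/25 to this plan would be reported as overlapping `private-app`, and one such as 192.168.0.0/24 would be reported as outside the VCN.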

Acknowledgments

  • Authors: Robert Huie, Sasha Banks-Louie
  • Contributors: Brad Goodwin, Anup Ojah, Robert Lies

    Oracle Extended Team: James Michels

    GridMarkets Team: Hakim Karim