Use OCI Vision to extract data from images and scanned documents

Oracle Cloud Infrastructure (OCI) Vision is one of several AI services available on Oracle Cloud Infrastructure.

OCI Vision lets you apply machine learning and artificial intelligence without data science expertise. Its pre-trained models let you quickly perform optical character recognition (OCR), image classification, object detection, document classification, anomaly detection, and more.

You can tune the pre-trained models with custom data using transfer learning. You can use existing labeled datasets for the tuning or, if your data is not yet labeled, use the Oracle Data Labeling service to ease the task.

Architecture

This architecture demonstrates the relationship among the various components in a typical system that has OCI Vision at its core.

In this system, an end user uploads a photograph or an image of a business document using a web application. The application stores the file in Object Storage. The Oracle Cloud Infrastructure Events service detects the new file and triggers a serverless function, which makes a REST API call to the Vision service. Vision retrieves the file from Object Storage and analyzes the image. The results are stored in a database where applications can use them.

The following diagram illustrates this reference architecture.

[Figure: architecture-ai-vision.png]

The architecture has the following components:

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Availability domains

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.

  • Fault domains

    A fault domain is a grouping of hardware and infrastructure within an availability domain. Each availability domain has three fault domains with independent power and hardware. When you distribute resources across multiple fault domains, your applications can tolerate physical server failure, system maintenance, and power failures inside a fault domain.

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Compartment

    Compartments are cross-region logical partitions within an Oracle Cloud Infrastructure tenancy. Use compartments to organize your resources in Oracle Cloud, control access to the resources, and set usage quotas. To control access to the resources in a given compartment, you define policies that specify who can access the resources and what actions they can perform.

  • Load Balancer

    The Oracle Cloud Infrastructure Load Balancing service provides automated traffic distribution from a single entry point to multiple servers in the back end.

    The load balancer provides access to different applications.

  • Security List

    For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.

  • Object Storage

    Object Storage provides quick access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can securely store data and then retrieve it directly from the internet or from within the cloud platform. You can scale storage without any degradation in performance or service reliability. Use standard storage for "hot" data that you need to access quickly and frequently. Use archive storage for "cold" data that you retain for long periods and rarely access.

  • FastConnect

    Oracle Cloud Infrastructure FastConnect provides an easy way to create a dedicated, private connection between your data center and Oracle Cloud Infrastructure. FastConnect provides higher-bandwidth options and a more reliable networking experience when compared with internet-based connections.

  • Oracle Cloud Infrastructure Vision

    OCI Vision is used to extract information from PDFs and images. Vision supports OCR, document understanding, table classification, object detection, and image classification.

  • Application

    The application in this architecture allows users to upload images and uses the metadata from the images for improved search and context.
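
As a rough illustration of how the function or application might ask Vision for several kinds of analysis on an image stored in Object Storage, the following sketch builds a JSON request body. The field names (`features`, `featureType`, `image`, `source`, `namespaceName`, `bucketName`, `objectName`, `compartmentId`) and the feature-type constants are assumptions based on the public Vision image-analysis REST API and should be checked against the current API reference. The helper name and all values are hypothetical placeholders.

```python
import json

# Feature types assumed from the Vision image-analysis API; verify before use.
SUPPORTED_FEATURES = {"TEXT_DETECTION", "OBJECT_DETECTION", "IMAGE_CLASSIFICATION"}


def build_analyze_image_request(namespace, bucket, object_name, compartment_id, features):
    """Return a dict shaped like an image-analysis request body (assumed field names)."""
    unknown = set(features) - SUPPORTED_FEATURES
    if unknown:
        raise ValueError(f"Unsupported feature types: {sorted(unknown)}")
    return {
        "compartmentId": compartment_id,
        # One entry per requested analysis, for example OCR plus object detection.
        "features": [{"featureType": feature} for feature in features],
        # Vision reads the file from Object Storage, so only its location is sent.
        "image": {
            "source": "OBJECT_STORAGE",
            "namespaceName": namespace,
            "bucketName": bucket,
            "objectName": object_name,
        },
    }


if __name__ == "__main__":
    body = build_analyze_image_request(
        namespace="example-namespace",          # placeholder
        bucket="uploads",                       # placeholder
        object_name="receipt-001.jpg",          # placeholder
        compartment_id="ocid1.compartment.oc1..example",  # placeholder OCID
        features=["TEXT_DETECTION", "OBJECT_DETECTION"],
    )
    print(json.dumps(body, indent=2))
```

Sending only the object location, rather than the image bytes, matches the flow in this architecture, where Vision retrieves the file from Object Storage itself.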

Recommendations

Your requirements might differ from the architecture described here. Use the following recommendations as a starting point.

  • VCN

    When you create a VCN, determine the number of CIDR blocks required and the size of each block based on the number of resources that you plan to attach to subnets in the VCN. Use CIDR blocks that are within the standard private IP address space.

    Select CIDR blocks that don't overlap with any other network (in Oracle Cloud Infrastructure, your on-premises data center, or another cloud provider) to which you intend to set up private connections.

    After you create a VCN, you can change, add, and remove its CIDR blocks.

    When you design the subnets, consider your traffic flow and security requirements. Attach all the resources within a specific tier or role to the same subnet, which can serve as a security boundary.

    Use regional subnets.

  • Security

    Use Oracle Cloud Guard to monitor and maintain the security of your resources in Oracle Cloud Infrastructure proactively. Cloud Guard uses detector recipes that you can define to examine your resources for security weaknesses and to monitor operators and users for risky activities. When any misconfiguration or insecure activity is detected, Cloud Guard recommends corrective actions and assists with taking those actions, based on responder recipes that you can define.

    For resources that require maximum security, Oracle recommends that you use security zones. A security zone is a compartment associated with an Oracle-defined recipe of security policies that are based on best practices. For example, the resources in a security zone must not be accessible from the public internet and they must be encrypted using customer-managed keys. When you create and update resources in a security zone, Oracle Cloud Infrastructure validates the operations against the policies in the security-zone recipe, and denies operations that violate any of the policies.

  • Cloud Guard

    Clone and customize the default recipes provided by Oracle to create custom detector and responder recipes. These recipes enable you to specify what type of security violations generate a warning and what actions are allowed to be performed on them. For example, you might want to detect Object Storage buckets that have visibility set to public.

    Apply Cloud Guard at the tenancy level to cover the broadest scope and to reduce the administrative burden of maintaining multiple configurations.

    You can also use the Managed List feature to apply certain configurations to detectors.

  • Load Balancer Bandwidth

    While creating the load balancer, you can either select a predefined shape that provides a fixed bandwidth, or specify a custom flexible shape where you set a bandwidth range and let the service scale the bandwidth automatically based on traffic patterns. With either approach, you can change the shape at any time after creating the load balancer.

  • Oracle Functions

    This architecture uses a function to call the OCI Vision REST API with a specific image and then store the metadata that is returned by Vision. The function can be built using the Java or Python SDK.

  • Events

    In this architecture, the Oracle Cloud Infrastructure Events service is configured to listen for object-creation events in Object Storage. When an object is uploaded to Object Storage, the service emits an event that invokes the function for processing.
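
The hand-off from Events to the function described above can be sketched as follows. This is a minimal example, assuming that the object-creation event type is `com.oraclecloud.objectstorage.createobject` and that the event payload carries the object name in `data.resourceName` and the bucket and namespace in `data.additionalDetails`. The rule condition, the bucket name `uploads`, and the helper name are hypothetical, and the payload shape should be confirmed against the Events documentation.

```python
import json

# Event type assumed for new objects in Object Storage.
CREATE_OBJECT_EVENT = "com.oraclecloud.objectstorage.createobject"

# Assumed shape of a rule condition that matches uploads to one bucket only.
RULE_CONDITION = {
    "eventType": [CREATE_OBJECT_EVENT],
    "data": {"additionalDetails": {"bucketName": ["uploads"]}},
}


def object_location_from_event(raw_event):
    """Extract namespace, bucket, and object name from an event payload.

    Returns None for any event that is not an object-creation event.
    """
    event = json.loads(raw_event)
    if event.get("eventType") != CREATE_OBJECT_EVENT:
        return None
    data = event["data"]
    details = data["additionalDetails"]
    return {
        "namespace": details["namespace"],
        "bucket": details["bucketName"],
        "object": data["resourceName"],
    }


if __name__ == "__main__":
    sample = json.dumps({
        "eventType": CREATE_OBJECT_EVENT,
        "data": {
            "resourceName": "receipt-001.jpg",
            "additionalDetails": {
                "bucketName": "uploads",
                "namespace": "example-namespace",
            },
        },
    })
    print(object_location_from_event(sample))
```

The function would then pass the extracted location to Vision and store the returned metadata. Filtering on the event type inside the function, in addition to the rule, guards against the function being invoked by a rule that is broader than intended.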

Considerations

Consider the following points when deploying this architecture.

  • Performance

    For performance and scalability, this architecture uses Functions to call the Vision REST API. An alternative is to call the Vision REST API directly from the application. If you do this, consider running the REST API calls as background jobs.

  • Access

    OCI Vision supports access through the OCI Console, the Java and Python SDKs, and the OCI CLI. For testing, use the CLI or the Console.

  • Availability

    In this example, the database is not highly available. For critical applications, consider running MySQL Database Service in high-availability (HA) mode with three replicas.

  • Integration

    When scanning business documents such as receipts and application forms in PDF format, consider using Oracle Integration Cloud to pull the PDFs from systems such as email, call OCI Vision, and then push the extracted content to a destination system such as an ERP or CRM system.
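
For the alternative noted under Performance, where the application calls Vision directly, one possible way to keep those calls off the request path is a worker pool. The sketch below uses Python's standard `concurrent.futures` module. `call_vision` is a hypothetical stand-in that returns fake metadata, not a real Vision client, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def call_vision(object_name):
    """Hypothetical stand-in for a Vision REST call; returns fake metadata."""
    return {"object": object_name, "labels": []}


def analyze_in_background(object_names, max_workers=4, analyze=call_vision):
    """Run one analysis per object on worker threads and collect the results.

    A failed call is recorded as an error entry so that one bad image
    does not discard the results for the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(analyze, name) for name in object_names}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:
                results[name] = {"object": name, "error": str(exc)}
    return results


if __name__ == "__main__":
    print(analyze_in_background(["receipt-001.jpg", "form-002.png"]))
```

This helper waits for all calls to finish before returning. A web application would more likely submit the work and return to the user immediately, then store the metadata when each call completes.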