Automate Invoice Images with OCI Vision and OCI Generative AI

Introduction

Companies often receive thousands of invoices in unstructured formats as scanned images or PDFs originating from suppliers and service providers. Manually extracting data from these invoices, such as invoice number, customer name, items purchased, and total amount, is a time-consuming and error-prone process.

Slow manual processing not only affects accounts payable cycles and cash flow visibility but also creates bottlenecks in compliance, auditing, and reporting.

This tutorial demonstrates how to implement an automated pipeline that monitors a bucket in Oracle Cloud Infrastructure (OCI) for incoming invoice images, extracts their text using OCI Vision, and then uses a large language model (LLM) in OCI Generative AI to extract structured fiscal data such as the invoice number, customer, and item list.
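The flow described above can be sketched as a single polling pass. This is a minimal illustration, not the tutorial's main.py: the function name and the four callables (list_names, ocr, extract, save) are hypothetical stand-ins for the OCI Object Storage, OCI Vision, and OCI Generative AI calls, so the control flow can be read and tested without any cloud access.

```python
from typing import Callable, Iterable, List, Set


def process_new_objects(
    list_names: Callable[[], Iterable[str]],
    ocr: Callable[[str], str],
    extract: Callable[[str], str],
    save: Callable[[str, str], None],
    seen: Set[str],
) -> List[str]:
    """Run one polling pass and return the names processed in this pass."""
    processed = []
    for name in list_names():          # stand-in for listing the input bucket
        if name in seen:
            continue                   # already handled in an earlier pass
        raw_text = ocr(name)           # stand-in for the OCI Vision OCR call
        json_text = extract(raw_text)  # stand-in for the OCI Generative AI call
        save(name, json_text)          # stand-in for writing to the output bucket
        seen.add(name)
        processed.append(name)
    return processed
```

Calling this function repeatedly with the same seen set (for example, inside a loop with a short sleep) approximates bucket monitoring: each pass handles only objects that have appeared since the previous pass.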

OCI services used in this tutorial are:

Service              Purpose
OCI Vision           Performs OCR on uploaded invoice images.
OCI Generative AI    Extracts structured JSON data from raw OCR text using few-shot prompts.
OCI Object Storage   Stores input invoice images and output JSON results.
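To illustrate what a few-shot prompt for this kind of extraction might look like, the sketch below assembles one from example pairs of OCR text and expected JSON. The function name, the instruction wording, and the sample field names are assumptions made for illustration; the prompt actually used in the tutorial may differ.

```python
import json
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, dict]], ocr_text: str) -> str:
    """Build a prompt with worked examples followed by the new OCR text."""
    parts = ["Extract the invoice fields from the OCR text and answer with JSON only."]
    for sample_text, sample_json in examples:
        # Each example shows the model an input and the exact output expected.
        parts.append("OCR text:\n" + sample_text)
        parts.append("JSON:\n" + json.dumps(sample_json, ensure_ascii=False))
    # The new invoice goes last, with the answer left open for the model.
    parts.append("OCR text:\n" + ocr_text)
    parts.append("JSON:")
    return "\n\n".join(parts)
```

Ending the prompt with an open "JSON:" label nudges the model to continue with the structured answer in the same shape as the examples.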

Objectives

Prerequisites

Task 1: Configure Python Packages

  1. Install the Python packages listed in requirements.txt using the following command.

    pip install -r requirements.txt
    
  2. Run the Python script (main.py).

  3. Upload invoice images (for example, .png, .jpg) to your input bucket.

  4. Wait for the image to be processed and the extracted JSON to be saved in the output bucket.
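As a small illustration of steps 3 and 4, the sketch below decides whether an uploaded object looks like an invoice image and derives a name for its JSON result. Both helper names are hypothetical, the .jpeg extension is an addition to the .png and .jpg examples above, and the rule of swapping the extension for .json is an assumption about the naming scheme rather than something the tutorial states.

```python
import posixpath

# Assumed set of accepted extensions; the tutorial names .png and .jpg as examples.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def is_invoice_image(object_name: str) -> bool:
    """Return True if the object name ends with an accepted image extension."""
    return object_name.lower().endswith(IMAGE_EXTENSIONS)


def output_name_for(object_name: str) -> str:
    """Replace the image extension with .json (assumed naming convention)."""
    stem, _extension = posixpath.splitext(object_name)
    return stem + ".json"
```

Object Storage names use forward slashes as prefixes, which is why posixpath is used here instead of the platform-dependent os.path.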

Task 2: Understand the Code

Task 3: Run the Code

Run the code using the following command.

python main.py

Task 4: Test Suggestions

  1. Use real or dummy invoices with legible product lines and a legible customer name.

  2. Upload multiple images to the input bucket in sequence to see the automated processing.

  3. Log in to the OCI Console and navigate to Object Storage to verify the results in both buckets.
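The check in step 3 can also be done on lists of object names, for example after exporting them from each bucket. The helper below is a hypothetical sketch that assumes each result is named after its image with the extension replaced by .json; adjust the matching rule if your output names follow a different pattern.

```python
import posixpath
from typing import Iterable, List


def unprocessed_images(input_names: Iterable[str], output_names: Iterable[str]) -> List[str]:
    """Return input objects that have no matching .json object in the output list."""
    # Collect the name stems of every JSON result found in the output bucket.
    done = {
        posixpath.splitext(name)[0]
        for name in output_names
        if name.lower().endswith(".json")
    }
    # An image counts as processed when a result shares its stem.
    return sorted(name for name in input_names if posixpath.splitext(name)[0] not in done)
```

An empty list means every uploaded image has a result; any names returned are candidates for a closer look at the script's logs.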

Note: The sample used in this tutorial is a Brazilian invoice, chosen to illustrate the complexity of the attributes and their layout, and to show how the prompt was designed to handle this case.

[Image: sample invoice]

Task 5: View Expected Output

For each uploaded invoice image, check the output bucket for the processed file. A corresponding .json file is generated with structured content, as shown in the following image.

[Image: example of the generated JSON output]
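A downloaded result can be sanity-checked before it is passed to other systems. The sketch below parses the file content and confirms that a few expected fields are present. The field names invoice_number, customer, and items are hypothetical placeholders based on the data this tutorial says it extracts; replace them with the keys your prompt actually produces.

```python
import json

# Hypothetical field names; adjust to match the JSON your prompt generates.
REQUIRED_FIELDS = ("invoice_number", "customer", "items")


def check_invoice_json(text: str, required=REQUIRED_FIELDS) -> dict:
    """Parse the JSON text and raise ValueError if it is malformed or incomplete."""
    data = json.loads(text)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = [field for field in required if field not in data]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return data
```

Because LLM output is not guaranteed to be well formed, a check like this helps catch truncated or incomplete results early.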

Note:

Acknowledgments

More Learning Resources

Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.

For product documentation, visit Oracle Help Center.