Note:

This tutorial requires access to Oracle Cloud. To sign up for a free account, see Get started with Oracle Cloud Infrastructure Free Tier.
It uses example values for Oracle Cloud Infrastructure credentials, tenancy, and compartments. When completing your lab, substitute these values with ones specific to your cloud environment.

Build Llama Optical Character Recognition Web Application using OCI Generative AI

Introduction

If you are a developer, cloud architect, or AI enthusiast who likes Llama Optical Character Recognition (OCR), this tutorial is for you. In this tutorial, you will build a simple Llama OCR web application that:

Uses the Meta Llama vision large language models (LLMs) available in Oracle Cloud Infrastructure (OCI) Generative AI.
Extracts structured text from images (such as receipts and scanned forms).
Runs locally on your machine with Streamlit.
Does not require any frontend coding.

Objectives

We will build a web user interface (UI) that allows you to:

Upload an image (receipt, invoice, screenshot) in the application.
Get the extracted Markdown output from the image using the LLM.
View and copy the structured text.

Prerequisites

Configure Oracle Cloud Infrastructure Command Line Interface (OCI CLI) (~/.oci/config).

Access to the OCI Generative AI service in one of the following regions.

Regions with OCI Generative AI

Region Name	Location	Region Identifier	Region Key
Brazil East (Sao Paulo)	Sao Paulo	sa-saopaulo-1	GRU
Germany Central (Frankfurt)	Frankfurt	eu-frankfurt-1	FRA
Japan Central (Osaka)	Osaka	ap-osaka-1	KIX
UAE East (Dubai)	Dubai	me-dubai-1	DXB
UK South (London)	London	uk-london-1	LHR
US Midwest (Chicago)	Chicago	us-chicago-1	ORD

Deploy a vision-capable model (such as meta.llama-3.2-90b-vision-instruct or Llama 4).
Install Python version 3.8 or later and required Python packages.
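
The Python prerequisite can be checked with a short script. The following is a minimal sketch, not part of the tutorial code: the function names are hypothetical, and the package list assumes the two packages installed in Task 2 (streamlit and oci). The package check only becomes meaningful once those dependencies are installed.

```python
import importlib.util
import sys

# Assumption: these are the two packages installed with pip in Task 2.
REQUIRED_PACKAGES = ("streamlit", "oci")
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version meets the minimum."""
    current = tuple((version_info or sys.version_info)[:2])
    return current >= minimum


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version OK:", python_is_supported())
    print("Missing packages:", missing_packages() or "none")
```

Run it with the same interpreter you plan to use for the application so that the result reflects the right environment.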

Task 1: Download Python Code and Set up Config File

Download the code from here: llama-ocr-oci.py
Make sure the file ~/.oci/config contains a correctly configured, named profile. For example, OCI_PROFILE.
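
To confirm which profiles exist in the config file and which region each one points to, a sketch like the following may help. It assumes the file uses the usual INI layout with a `region` key per profile. The helper name `profile_regions` is hypothetical, and the region set is copied from the table in the Prerequisites section.

```python
import configparser
from pathlib import Path

# Region identifiers copied from the Prerequisites table.
GENAI_REGIONS = {
    "sa-saopaulo-1", "eu-frankfurt-1", "ap-osaka-1",
    "me-dubai-1", "uk-london-1", "us-chicago-1",
}


def profile_regions(config_text):
    """Map each profile name in an OCI-style config to its region (or None)."""
    # Interpolation is disabled so values containing '%' are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    profiles = {}
    # configparser treats [DEFAULT] specially and omits it from sections().
    if parser.defaults():
        profiles["DEFAULT"] = parser.defaults().get("region")
    for name in parser.sections():
        profiles[name] = parser.get(name, "region", fallback=None)
    return profiles


if __name__ == "__main__":
    config_file = Path.home() / ".oci" / "config"
    if config_file.exists():
        for name, region in profile_regions(config_file.read_text()).items():
            note = "" if region in GENAI_REGIONS else " (not in the region table)"
            print(f"{name}: {region}{note}")
    else:
        print(f"No config file found at {config_file}")
```

A profile whose region is not in the table can still be valid for other OCI services. The note only flags that OCI Generative AI may not be reachable from it.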

Task 2: Set up a Virtual Environment

Creating a virtual environment helps isolate dependencies and ensures your Streamlit OCR app does not interfere with other Python projects on your system.

Windows: Run the following commands.
1. Open the Command Prompt (cmd) or PowerShell and navigate to your project folder.
```
cd path\to\your\project
```
2. Create a virtual environment.
```
python -m venv venv
```
3. Activate the virtual environment.
```
venv\Scripts\activate
```
4. Install dependencies.
```
pip install streamlit oci
```
macOS/Linux: Run the following commands.
1. Open Terminal and navigate to your project directory.
```
cd ~/path/to/your/project
```
2. Create a virtual environment.
```
python3 -m venv venv
```
3. Activate the virtual environment.
```
source venv/bin/activate
```
4. Install dependencies.
```
pip install streamlit oci
```
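
To confirm that the virtual environment is active before installing or launching anything, you can compare the interpreter prefixes. This is a small sketch with a hypothetical helper name. It relies on the documented behavior that `sys.prefix` differs from `sys.base_prefix` inside an environment created with `venv`.

```python
import sys


def in_virtual_environment(prefix=None, base_prefix=None):
    """Return True when the interpreter runs from a venv-style environment."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return prefix != base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtual_environment())
```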

Task 3: Launch the Application

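Run the following command to launch the application. If the file you downloaded in Task 1 has a different name (for example, llama-ocr-oci.py), use that file name instead.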

streamlit run ocr_vision_app.py

You should see the application launch in your browser.


Task 4: Upload an Image and Extract the Text

In Select OCI Config Profile, select your config profile from the drop-down menu.
In Enter Compartment OCID, enter the Oracle Cloud Identifier (OCID) of the compartment where you have access to the OCI Generative AI service.
In Select Vision Model, select a model.
Click Upload and select an image (receipt, invoice, screenshot).
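
A mistyped compartment OCID is an easy mistake at this step. The following sketch performs a loose format check before you paste the value. It assumes the general OCID layout of dot-separated fields that start with `ocid1` followed by the resource type. Accepting a tenancy OCID (the root compartment) is an assumption of this example, and the function name is hypothetical. It does not verify that the compartment exists or that you have access to it.

```python
def looks_like_compartment_ocid(value):
    """Loosely check that a string is shaped like a compartment OCID."""
    parts = value.strip().split(".")
    return (
        len(parts) >= 5                              # version, type, realm, region, ..., unique ID
        and parts[0] == "ocid1"                      # OCID version prefix
        and parts[1] in ("compartment", "tenancy")   # assumption: root compartment allowed
        and bool(parts[2])                           # realm must be present
        and bool(parts[-1])                          # unique ID must be present
    )


if __name__ == "__main__":
    # Placeholder value, not a real OCID.
    print(looks_like_compartment_ocid("ocid1.compartment.oc1..exampleuniqueid"))
```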

The application will process the image and display the extracted text.
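
Vision chat models commonly receive an image as a base64-encoded data URL embedded in the request, alongside a text prompt. Whether llama-ocr-oci.py prepares the image in exactly this way is an assumption. The sketch below only illustrates the encoding step, with a hypothetical function name, and does not call OCI.

```python
import base64


def image_to_data_url(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a data URL string."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


if __name__ == "__main__":
    # Stand-in bytes; a real application would pass the uploaded file's content.
    url = image_to_data_url(b"example image bytes", "image/jpeg")
    print(url[:40])
```

Base64 encoding grows the payload by roughly one third, so large images may be worth resizing before upload.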

Related Links

Meta Llama 4 is now available in OCI Generative AI

Acknowledgments

Authors - Mukund Murali (Principal Cloud Architect)

More Learning Resources

Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.

For product documentation, visit Oracle Help Center.

Title and Copyright Information

Build Llama Optical Character Recognition Web Application using OCI Generative AI

G35920-01
