Prepare Documents to Analyze with an OCI Document Understanding Model
You use buckets in OCI Object Storage to store the documents that you want to analyze, then create a dataset to access these documents in Oracle Analytics.
You typically store input documents and AI models in the same Oracle Cloud account (tenancy), which makes setup in Oracle Analytics easier.
If your input documents and AI models are stored in different tenancies:
- Make sure that the visibility of the storage bucket containing your input documents is public. See Change the visibility of a bucket.
- Populate the input dataset for the data flow with individual document URLs instead of a single URL for the OCI bucket where documents are stored.
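As an illustration of the second point, the following minimal Python sketch builds one URL per document and writes them to a CSV file that could be uploaded as the input dataset. It assumes the standard OCI Object Storage URL layout (`https://objectstorage.<region>.oraclecloud.com/n/<namespace>/b/<bucket>/o/<object>`). The region, namespace, bucket, and object names are placeholders, and the `document_url` column name is a hypothetical choice, not a name required by Oracle Analytics.

```python
import csv
import io
from urllib.parse import quote


def object_url(region, namespace, bucket, object_name):
    """Build the URL of a single object in an OCI Object Storage bucket."""
    # Percent-encode every path component, including "/" and spaces in
    # object names, so each document maps to one unambiguous URL.
    return (
        f"https://objectstorage.{region}.oraclecloud.com"
        f"/n/{quote(namespace, safe='')}"
        f"/b/{quote(bucket, safe='')}"
        f"/o/{quote(object_name, safe='')}"
    )


def urls_to_csv(urls, column="document_url"):
    """Return CSV text with a header row and one document URL per row."""
    # "document_url" is an assumed column name used only for illustration.
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column])
    for url in urls:
        writer.writerow([url])
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values; replace them with your own tenancy details.
    names = ["2024/inv 01.pdf", "2024/inv 02.pdf"]
    urls = [object_url("us-ashburn-1", "mytenancy", "invoices", n) for n in names]
    with open("input_documents.csv", "w", newline="") as handle:
        handle.write(urls_to_csv(urls))
```

The resulting file has one row per document, which matches the requirement to list individual document URLs rather than a single bucket URL.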
Data flows in Oracle Analytics can process up to 10,000 documents in one run. If you have more than 10,000 documents, create multiple buckets in OCI Object Storage & Archive Storage, each containing no more than 10,000 documents. Then create a separate dataset and data flow for each bucket, and use a sequence to run the data flows one after another.
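To plan how many buckets you need, you can split the list of document names into groups of at most 10,000. The following is a minimal Python sketch of that partitioning only. The function and constant names are hypothetical, and the sketch does not create buckets or upload anything.

```python
MAX_DOCS_PER_RUN = 10_000  # per-run limit for a data flow, as stated above


def split_into_batches(names, batch_size=MAX_DOCS_PER_RUN):
    """Split a list of document names into consecutive batches.

    Each batch holds at most batch_size names and is intended to map to
    one bucket, one dataset, and one data flow.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [names[i:i + batch_size] for i in range(0, len(names), batch_size)]


if __name__ == "__main__":
    # Placeholder document names used only for illustration.
    documents = [f"doc{i}.pdf" for i in range(25_000)]
    for number, batch in enumerate(split_into_batches(documents), start=1):
        print(f"bucket {number}: {len(batch)} documents")
```

With 25,000 documents, this yields three batches, so you would create three buckets, three datasets, and three data flows, then chain the data flows in a sequence.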
You can use a private or public bucket, provided that it is accessible to the OCI user and complies with OCI's generic limits on documents. See OCI documentation.