Document Classification
Document Classification assigns a document type to a document.
Document Understanding returns a list of possible
document types for the analyzed document, each with a confidence score. The
confidence score is a decimal number between 0 and 1. Scores closer to 1 indicate higher confidence in the
predicted document type, while lower scores indicate lower confidence.
The list of possible document types is:
- Invoice
- Receipt
- Resume or CV
- Tax form
- Driver's license
- Passport
- Bank statement
- Check
- Payslip
- Other
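As a minimal sketch of how the confidence scores might be used, the following Python snippet picks the most likely type from a list of candidates and rejects it when the score is too low. The function name `pick_document_type` and the 0.5 threshold are hypothetical choices for illustration, not part of the service; the `documentType` and `confidence` field names follow the API response shown later in this section.

```python
def pick_document_type(candidates, min_confidence=0.5):
    """Return the candidate type with the highest confidence.

    Returns None when the list is empty or the best score is below
    min_confidence (0.5 is an arbitrary, assumed threshold).
    """
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c["confidence"])
    if best["confidence"] < min_confidence:
        return None
    return best["documentType"]


# Candidate values copied from the example response in this section.
candidates = [
    {"documentType": "RECEIPT", "confidence": 1},
    {"documentType": "TAX_FORM", "confidence": 6.465067e-9},
]
print(pick_document_type(candidates))  # RECEIPT
```

A suitable threshold depends on the application; a stricter value sends more documents to manual review.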
Supported features are:
- Classify document
- Confidence score
- Single request
- Batch request
Document Classification Example
An example of using document classification in Document Understanding.
- Input document:
[Image: Document Classification Input]
API Request:
{
  "processorConfig": {
    "processorType": "GENERAL",
    "features": [
      {
        "featureType": "DOCUMENT_CLASSIFICATION",
        "maxResults": 5
      }
    ]
  },
  "inputLocation": {
    "sourceType": "OBJECT_STORAGE_LOCATIONS",
    "objectLocations": [
      {
        "source": "OBJECT_STORAGE",
        "namespaceName": "",
        "bucketName": "",
        "objectName": ""
      }
    ]
  },
  "compartmentId": "",
  "outputLocation": {
    "namespaceName": "",
    "bucketName": "",
    "prefix": ""
  }
}
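The request body above could be assembled programmatically. The following Python sketch builds the same structure as a dictionary; the helper name `build_classification_request` and all sample values (namespace, bucket, object, and compartment identifiers) are hypothetical placeholders. It only constructs the JSON payload and does not call the service.

```python
import json


def build_classification_request(namespace, bucket, object_name,
                                 compartment_id, output_namespace,
                                 output_bucket, output_prefix,
                                 max_results=5):
    """Build a request body shaped like the example API request."""
    return {
        "processorConfig": {
            "processorType": "GENERAL",
            "features": [
                {
                    "featureType": "DOCUMENT_CLASSIFICATION",
                    "maxResults": max_results,
                }
            ],
        },
        "inputLocation": {
            "sourceType": "OBJECT_STORAGE_LOCATIONS",
            "objectLocations": [
                {
                    "source": "OBJECT_STORAGE",
                    "namespaceName": namespace,
                    "bucketName": bucket,
                    "objectName": object_name,
                }
            ],
        },
        "compartmentId": compartment_id,
        "outputLocation": {
            "namespaceName": output_namespace,
            "bucketName": output_bucket,
            "prefix": output_prefix,
        },
    }


# All values below are made-up placeholders.
body = build_classification_request(
    namespace="my-namespace",
    bucket="input-bucket",
    object_name="receipt.jpg",
    compartment_id="my-compartment-id",
    output_namespace="my-namespace",
    output_bucket="output-bucket",
    output_prefix="results",
)
print(json.dumps(body, indent=2))
```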
- Output:
- API Response:
{
  "documentMetadata": {
    "pageCount": 1,
    "mimeType": "image/jpeg"
  },
  "pages": [
    {
      "pageNumber": 1,
      "dimensions": {
        "width": 361,
        "height": 600,
        "unit": "PIXEL"
      },
      "detectedDocumentTypes": [
        { "documentType": "RECEIPT", "confidence": 1 },
        { "documentType": "TAX_FORM", "confidence": 6.465067e-9 },
        { "documentType": "CHECK", "confidence": 6.031838e-9 },
        { "documentType": "BANK_STATEMENT", "confidence": 5.413888e-9 },
        { "documentType": "PASSPORT", "confidence": 1.5554872e-9 }
      ],
      ...
  "detectedDocumentTypes": [
    { "documentType": "RECEIPT", "confidence": 1 }
  ],
  ...
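To illustrate how the per-page results in a response like the one above might be read, here is a short Python sketch that reports the highest-confidence type for each page. The function name `top_type_per_page` is hypothetical, and the sample dictionary is a trimmed copy of the example response, since the original is elided.

```python
def top_type_per_page(response):
    """Map each page number to its (documentType, confidence) with the top score."""
    result = {}
    for page in response.get("pages", []):
        types = page.get("detectedDocumentTypes", [])
        if types:
            best = max(types, key=lambda t: t["confidence"])
            result[page["pageNumber"]] = (best["documentType"], best["confidence"])
    return result


# Trimmed copy of the example response; elided fields are left out.
sample = {
    "documentMetadata": {"pageCount": 1, "mimeType": "image/jpeg"},
    "pages": [
        {
            "pageNumber": 1,
            "detectedDocumentTypes": [
                {"documentType": "RECEIPT", "confidence": 1},
                {"documentType": "TAX_FORM", "confidence": 6.465067e-9},
                {"documentType": "CHECK", "confidence": 6.031838e-9},
            ],
        }
    ],
}
print(top_type_per_page(sample))  # {1: ('RECEIPT', 1)}
```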