8.2 Image Processing

This topic describes image processing.

Text recognition depends on a variety of factors, and the quality of the text output depends heavily on the quality of the input image. The guidelines in this topic help the document extraction engine produce accurate results.

Image preprocessing improves the quality of the input image so that the engine produces accurate output. The main objective of the preprocessing phase is to make it easy for the system to distinguish characters from the background.

Preprocessing is controlled through configuration files, which are explained at the end of this topic. The configuration varies by document type and country.

The following image processing operations are used to improve the quality of the input image:
  • Image Scaling – OCR is most accurate on images with a resolution of about 300 DPI. Below 200 DPI, results become unclear and incomprehensible; above 600 DPI, the output file grows unnecessarily large without any improvement in quality. A resolution of 300 DPI therefore works best for this purpose.
  • Image Skew Correction – A skewed image is a document image that is not straight, for example a page scanned at an angle. Skew directly affects the OCR engine's line segmentation, which reduces its accuracy, so such images are processed to correct the text skew.
  • Background Cropping – If a scanned image contains background around the document, that background is cropped. This step is important because it removes areas of the image that contain no text.
  • Noise Removal – Noise is removed from images because it reduces the readability of the text. The main objective of this stage is to smooth the image by removing small dots or patches whose intensity is higher than that of the rest of the image. Noise removal can be performed on both color and binary images.
  • Binarization – A color image is converted into black and white pixels only, which can be achieved by setting a threshold value and assigning each pixel to black or white depending on which side of the threshold it falls.
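
Several of the operations above can be illustrated with a minimal sketch. This is not the engine's implementation: the function names, the default threshold of 128, and the use of a 3x3 median filter for noise removal are assumptions chosen for illustration, and the image is modeled as a plain list of rows of grayscale values (0 = black, 255 = white). Skew correction is omitted because it needs more machinery than fits in a short example.

```python
from statistics import median

TARGET_DPI = 300  # resolution recommended above for OCR


def scale_factor(current_dpi, target_dpi=TARGET_DPI):
    """Return the resize factor that brings an image to the target DPI."""
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    return target_dpi / current_dpi


def remove_noise(gray):
    """Apply a 3x3 median filter (an assumed choice of smoothing filter).

    Each pixel is replaced by the median of its neighbourhood, which
    suppresses isolated dots while keeping larger shapes intact.
    """
    height, width = len(gray), len(gray[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbourhood, clipped at the image borders.
            window = [
                gray[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(int(median(window)))
        out.append(row)
    return out


def binarize(gray, threshold=128):
    """Fixed-threshold binarization: values >= threshold become 255, others 0."""
    return [[255 if p >= threshold else 0 for p in row] for row in gray]


def crop_background(binary):
    """Crop a binarized image to the bounding box of its black (0) pixels."""
    rows = [y for y, row in enumerate(binary) if 0 in row]
    if not rows:
        return binary  # no foreground found, nothing to crop
    cols = [x for x in range(len(binary[0])) if any(r[x] == 0 for r in binary)]
    return [r[cols[0]:cols[-1] + 1] for r in binary[rows[0]:rows[-1] + 1]]
```

For example, `scale_factor(150)` returns 2.0, meaning a 150 DPI scan would be enlarged to twice its size to reach 300 DPI. In this sketch cropping runs on the binarized image because that makes the foreground easy to detect; the order in which the engine actually applies the operations is not specified here.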