Open Source Python API to Integrate OCR Capabilities
Open Source Python library that allows software developers to easily integrate optical character recognition (OCR) capabilities into their applications.
PaddleOCR is an open source Python library for optical character recognition (OCR). It is built on top of PaddlePaddle, an open source deep learning platform, and uses state-of-the-art deep learning models to achieve high accuracy and performance. PaddleOCR simplifies the OCR process by providing a high-level API that abstracts away many of the low-level details, making it easy for developers to add OCR capabilities to their applications.
PaddleOCR supports a wide range of languages and scripts. It currently supports more than 80 languages, including Arabic, Chinese, English, French, German, Japanese, Korean, Russian, and Spanish. This makes it a valuable tool for developers who need to work with multilingual content. In addition to its OCR capabilities, the library includes a number of utilities for working with images and text. For example, it includes tools for image preprocessing, such as deskewing and binarization, as well as post-processing tools for improving the accuracy of the OCR output.
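To make the binarization step mentioned above more concrete, here is a minimal sketch of one common approach, Otsu's method, which picks the threshold that best separates dark and light pixels. This is an illustration only and not PaddleOCR's own implementation: the function names `otsu_threshold` and `binarize` are hypothetical, and the image is represented as nested lists of grayscale values so that the example needs no third-party packages.

```python
def otsu_threshold(pixels):
    """Return the grayscale level (0-255) that best separates dark from light pixels."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        weight_bg += histogram[level]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        sum_bg += level * histogram[level]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance (up to a constant factor)
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(rows):
    """Map every pixel to 0 (black) or 255 (white) using the Otsu threshold."""
    threshold = otsu_threshold([p for row in rows for p in row])
    return [[255 if p > threshold else 0 for p in row] for row in rows]


# Tiny made-up grayscale image: dark text pixels next to a light background
image = [[10, 12, 200],
         [11, 210, 205]]
print(binarize(image))
```

In a real project the same idea would normally be applied through an image library rather than hand-written loops; the sketch only shows what the preprocessing step does to the pixel values.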
PaddleOCR provides several different OCR models, each optimized for a different task. For example, the Text Detection model locates text regions in an image, while the Text Recognition model recognizes the actual text within those regions. There is also a Model Ensemble feature that allows developers to combine multiple models to achieve even higher accuracy. Overall, PaddleOCR is a powerful and easy-to-use library for adding OCR capabilities to your Python applications. Its support for a wide range of languages and scripts, together with its customizable models and post-processing tools, makes it a valuable tool for developers working with OCR.
Getting Started with PaddleOCR
The recommended way to install PaddleOCR is with pip. Use the following command for a smooth installation:
Install PaddleOCR via pip
pip install paddleocr
You can also install it manually by downloading the latest release files directly from the GitHub repository.
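After installing, it can be useful to confirm that Python can actually find the packages. The sketch below is one way to do that with the standard library; the helper name `is_installed` is hypothetical. It also checks for `paddle`, the import name of the PaddlePaddle framework that PaddleOCR is built on, which typically has to be installed as well (for example with `pip install paddlepaddle`).

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None


# Report whether the framework and the OCR library are available
for name in ("paddle", "paddleocr"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

Because `find_spec` only locates a top-level module without importing it, the check is fast and does not load any models.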
Image Text Recognition via PaddleOCR API
Image text recognition is the process of extracting text from images. It is useful for applications such as document scanning and digitization. PaddleOCR provides a set of state-of-the-art OCR models that can recognize text in many kinds of images, including scanned documents, screenshots, and photographs. The library covers the main steps of image text recognition: loading an image, initializing an OCR model, identifying text regions in the image, recognizing the text in those regions, and extracting the text from the result. The following example shows how to recognize text from an image inside Python applications.
Perform Image Text Recognition inside Python Projects
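from paddleocr import PaddleOCR
# Create the OCR engine; lang='en' selects the English models
ocr = PaddleOCR(lang='en')
# Run text detection and recognition on an image file
result = ocr.ocr('example.jpg')
# In PaddleOCR 2.x, result[0] holds the lines found in the first image,
# each as [bounding box, (text, confidence)]
for line in result[0]:
    print(line[1][0])  # recognized text
    print(line[1][1])  # confidence score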
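The loop above can be wrapped in a small helper that also drops low-confidence matches. The following is a sketch under stated assumptions: `sample_result` is made-up data that mimics the nested list layout returned by `ocr.ocr(...)` in PaddleOCR 2.x (one list per image, each entry being a bounding box followed by a text and confidence pair), and `extract_text` and its `min_confidence` parameter are hypothetical names, not part of the library.

```python
# Illustrative stand-in for the value returned by ocr.ocr(...) in PaddleOCR 2.x:
# one list per image, each entry being [bounding_box, (text, confidence)].
sample_result = [[
    [[[10, 10], [120, 10], [120, 40], [10, 40]], ("Invoice", 0.98)],
    [[[10, 60], [90, 60], [90, 90], [10, 90]], ("T0tal", 0.42)],
    [[[100, 60], [180, 60], [180, 90], [100, 90]], ("42.00", 0.95)],
]]


def extract_text(result, min_confidence=0.5):
    """Return (text, confidence) pairs for the first image, dropping weak matches."""
    lines = result[0] or []  # assumption: an image with no text may yield None
    return [
        (text, confidence)
        for _box, (text, confidence) in lines
        if confidence >= min_confidence
    ]


for text, confidence in extract_text(sample_result):
    print(f"{confidence:.2f}  {text}")
```

A threshold of 0.5 is an arbitrary starting point; the right value depends on image quality and on how costly a wrong character is for the application.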
OCR Document Recognition using Python API
Document recognition has long been one of the prominent research areas for OCR, since documents are part of everyday life. Applying OCR to a document makes it possible to retrieve important information, extract form fields, analyze layout, store the content digitally, and read old manuscripts. The open source PaddleOCR library allows software developers to load various types of documents, run OCR on them, and extract the recognized text using Python code. Recognition is generally accurate, and the library can also detect special characters and spaces.
Perform OCR Document Recognition using Python API
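# Reuses the `ocr` instance created in the previous example
img_path = './input_images/11-document-1.jpg'
result = ocr.ocr(img_path)
# Display the output: one [bounding box, (text, confidence)] entry per detected line
for line in result[0]:
    print(line)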
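For a scanned document, the recognized lines usually need to be put back into reading order before the text is stored or searched. The sketch below shows one simple way to do that by sorting on the top-left corner of each bounding box. It assumes a single-column page and the PaddleOCR 2.x entry layout of a bounding box followed by a text and confidence pair; `sample_lines` is made-up data and `to_plain_text` is a hypothetical helper.

```python
# Made-up OCR lines for one page, deliberately out of order.
# Each entry is [bounding_box, (text, confidence)].
sample_lines = [
    [[[20, 80], [300, 80], [300, 110], [20, 110]], ("Dear customer,", 0.97)],
    [[[20, 20], [200, 20], [200, 50], [20, 50]], ("ACME Corp.", 0.99)],
    [[[20, 140], [340, 140], [340, 170], [20, 170]], ("Your order has shipped.", 0.96)],
]


def to_plain_text(lines):
    """Join OCR lines top-to-bottom, then left-to-right, one line of text each."""
    def position(entry):
        box = entry[0]
        top = min(point[1] for point in box)
        left = min(point[0] for point in box)
        return (top, left)

    ordered = sorted(lines, key=position)
    return "\n".join(entry[1][0] for entry in ordered)


print(to_plain_text(sample_lines))
```

Multi-column layouts or rotated pages would need a more careful ordering step, such as grouping boxes into columns first.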
Table Recognition Support inside Python Apps
The open source PaddleOCR library enables software developers to recognize tabular data inside their Python applications. Table recognition relies mainly on three models: single-line text detection (DB), single-line text recognition (CRNN), and table structure and cell coordinate prediction (SLANet). The following example runs OCR on an image that contains a table and then uses the draw_ocr method, which takes the image, the bounding boxes, the texts, the scores, and the path to a font file. It returns an annotated image with the bounding boxes and the detected text, which you can display using the show method.
Load an Image and Detect Text inside It via Python API
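from paddleocr import PaddleOCR, draw_ocr
from PIL import Image
# Load the image that contains the table
img_path = 'table_image.png'
image = Image.open(img_path).convert('RGB')
# Create an instance of the PaddleOCR object and run OCR on the image
ocr = PaddleOCR(lang='en')
result = ocr.ocr(img_path)[0]
# In PaddleOCR 2.x, each entry is [bounding box, (text, confidence)]
boxes = [line[0] for line in result]
texts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
# Draw the bounding boxes and text; font_path must point to a font file on your system
im_show = draw_ocr(image, boxes, texts, scores, font_path='arial.ttf')
# draw_ocr returns a NumPy array, so convert it to a PIL image before displaying it
im_show = Image.fromarray(im_show)
im_show.show()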
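Once the text boxes of a table have been detected, they still have to be arranged into rows and columns. The sketch below shows a simple geometric heuristic for this: boxes whose vertical centres are close together are treated as one row, and each row is then sorted left to right. This is not the table structure model described above, only an illustration of the idea; `sample_cells` is made-up data in the PaddleOCR 2.x entry layout, and `group_into_rows` and `row_tolerance` are hypothetical names.

```python
# Made-up OCR entries for a 2x2 table, deliberately out of order.
# Each entry is [bounding_box, (text, confidence)].
sample_cells = [
    [[[10, 10], [90, 10], [90, 30], [10, 30]], ("Item", 0.99)],
    [[[110, 12], [190, 12], [190, 32], [110, 32]], ("Price", 0.98)],
    [[[110, 52], [190, 52], [190, 72], [110, 72]], ("1.20", 0.97)],
    [[[10, 50], [90, 50], [90, 70], [10, 70]], ("Apple", 0.96)],
]


def group_into_rows(cells, row_tolerance=10):
    """Group cells whose vertical centre is within row_tolerance pixels of the row's first cell."""
    def centre(entry):
        box = entry[0]
        xs = [point[0] for point in box]
        ys = [point[1] for point in box]
        return sum(xs) / len(xs), sum(ys) / len(ys)

    rows = []  # each item: (centre_y of the row's first cell, [(centre_x, text), ...])
    for entry in sorted(cells, key=lambda e: centre(e)[1]):
        cx, cy = centre(entry)
        if rows and abs(cy - rows[-1][0]) <= row_tolerance:
            rows[-1][1].append((cx, entry[1][0]))
        else:
            rows.append((cy, [(cx, entry[1][0])]))

    # Sort the cells of every row from left to right and keep only the text
    return [[text for _cx, text in sorted(row)] for _cy, row in rows]


for row in group_into_rows(sample_cells):
    print(" | ".join(row))
```

The heuristic works for regular grids with straight rows; merged cells, skewed scans, or multi-line cells are the cases where a dedicated structure model becomes necessary.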