
Aspose.OCR Cloud SDK for Python

 
 

Python OCR API to Read & Extract Text from Images

Read and Extract text from Images, Photos, Screenshots, Scanned documents, and PDF Files via Python OCR Library.

Aspose.OCR Cloud SDK for Python is an advanced and flexible optical character recognition (OCR) solution that helps software developers to create OCR applications without any external dependencies. It allows software developers to read and extract text from images, photos, screenshots, scanned documents, and PDFs in a large number of European, Cyrillic, and Eastern scripts, returning results in the most popular document formats. The API makes it easy for developers to add OCR functionality to almost any device or platform, including netbooks, mini PCs, or even entry-level smartphones.

The Aspose.OCR Cloud SDK for Python is straightforward and easy to handle. It provides a wide range of features that make it an ideal OCR solution for developers working with Python, such as reading an entire image, reading a scanned PDF document, extracting text from a specific region of the image, extracting data from a scanned or photographed receipt, fetching PDF recognition results, extracting text from scanned or photographed tables, converting the recognition results into a natural human voice, and many more.

Aspose.OCR Cloud SDK for Python is built on top of the Aspose.OCR Cloud API, a cloud-based OCR engine that supports 45 recognition languages, including English, French, German, Spanish, Chinese, Japanese, Arabic, and many more. Using the OCR SDK, Python programmers can integrate OCR functionality into their Python applications without having to worry about the complexities of OCR technology. The SDK provides a simple and intuitive interface that allows users to upload images, perform OCR, and retrieve text in just a few lines of code. If you need to add OCR functionality to your Python applications, the Aspose.OCR Cloud SDK for Python is definitely worth checking out.


Getting Started with Aspose.OCR Cloud SDK for Python

The recommended way to install Aspose.OCR Cloud SDK for Python is with pip. Please use the following command for a smooth installation.

Install Aspose.OCR Cloud SDK for Python via pip

 pip install aspose-ocr-cloud

You can also download the SDK directly from the Aspose.OCR Cloud SDK for Python product page.
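
After installing, it can help to confirm that the package is visible to the interpreter you intend to use. The sketch below uses only the Python standard library. It assumes the distribution name `aspose-ocr-cloud` from the pip command above and the import name `asposeocrcloud` used in the examples on this page; the helper name `sdk_status` is made up for illustration.

```python
from importlib import metadata, util


def sdk_status(module_name="asposeocrcloud", dist_name="aspose-ocr-cloud"):
    """Return (importable, version) for a module and its pip distribution.

    version is None when no installed metadata is found for dist_name.
    Both default names are assumptions taken from this page.
    """
    # find_spec returns None for a top-level module that is not installed
    importable = util.find_spec(module_name) is not None
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return importable, version


importable, version = sdk_status()
print(f"importable={importable}, version={version}")
```

If the first value is False, the package was most likely installed into a different Python environment than the one running the script.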

Image Recognition using Python Apps

Aspose.OCR Cloud SDK for Python allows software developers to perform OCR and achieve image recognition inside their own Python applications. The API is easy to use, and image recognition can be performed from any platform with Internet access. You can use the OCR REST API to select and send images for recognition, fetch the results, and store them in any supported file format with just a couple of lines of code. The following example shows how to perform OCR on an image using Python code.

Perform OCR on an image inside Python Apps

import asposeocrcloud

# create an instance of the OCR client
client = asposeocrcloud.OcrApi(api_key='your_api_key', app_sid='your_app_sid')

# read the image file
with open('image.jpg', 'rb') as image_file:
    image_data = image_file.read()

# call the OCR API to extract text from the image
result = client.post_ocr(image_data=image_data, language='eng', use_default_dictionaries=True)

# print the extracted text
print(result.text)
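
The example above passes placeholder credentials as string literals. In a real application you would probably keep them out of the source code. The following is a minimal sketch of reading them from environment variables, using only the standard library. The variable names `CLIENT_ID` and `CLIENT_SECRET` are borrowed from the text-to-speech example later on this page, and the helper name `load_credentials` is hypothetical; neither is something the SDK requires.

```python
import os


def load_credentials(env=os.environ):
    """Return (client_id, client_secret) read from a mapping of variables.

    The variable names are an assumption; adapt them to your deployment.
    """
    required = ("CLIENT_ID", "CLIENT_SECRET")
    # Treat unset and empty values the same way
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["CLIENT_ID"], env["CLIENT_SECRET"]


# Demonstration with an in-memory mapping instead of the real environment
demo_env = {"CLIENT_ID": "demo-id", "CLIENT_SECRET": "demo-secret"}
client_id, client_secret = load_credentials(demo_env)
print(client_id)
```

Failing early with a clear message tends to be easier to debug than an authentication error returned by the remote service.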

Extract Text from PDF Files via Python API

Portable Document Format (PDF), developed by Adobe in 1992 to present documents, is one of the world's most popular business document file formats. Aspose.OCR Cloud SDK for Python includes a powerful feature for extracting text from PDF files inside Python applications. To accomplish this, upload the PDF file to Aspose cloud storage and then perform OCR recognition on the uploaded file. The following example shows how software developers can extract text from a PDF file using Python code.

How to Extract Text from a PDF File via Python API?

import asposeocrcloud
from asposeocrcloud.apis.ocr_api import OcrApi
from asposeocrcloud.configuration import Configuration

configuration = Configuration(api_key='your_api_key', app_sid='your_app_sid')
api = OcrApi(asposeocrcloud.ApiClient(configuration))

# Upload the PDF file to the Aspose cloud storage

with open('your_pdf_file.pdf', 'rb') as file:
    api.upload_file(path='your_pdf_file.pdf', file=file)

# Perform the OCR recognition on the uploaded PDF file
result = api.post_recognize_ocr_from_url_or_content(file_path='your_pdf_file.pdf')

# Store the recognized text
recognized_text = result['text']
print(recognized_text)
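
Because the workflow above uploads the file before recognizing it, a quick local check can avoid sending something that is not a PDF at all. The sketch below is not part of the SDK. It only tests whether the file starts with the `%PDF-` header that PDF files begin with, and the helper name `looks_like_pdf` is hypothetical.

```python
import os
import tempfile


def looks_like_pdf(path):
    """Return True if the file at path starts with the PDF header bytes."""
    with open(path, "rb") as handle:
        return handle.read(5) == b"%PDF-"


# Demonstration with two throwaway files
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.pdf")
    bad = os.path.join(tmp, "bad.pdf")
    with open(good, "wb") as handle:
        handle.write(b"%PDF-1.7\n")
    with open(bad, "wb") as handle:
        handle.write(b"not a pdf")
    print(looks_like_pdf(good), looks_like_pdf(bad))
```

This is a cheap sanity check only: it does not validate the rest of the file, and it rejects files that carry stray bytes before the header.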

Convert Text to Speech via Python API

Aspose.OCR Cloud SDK for Python enables software developers to convert text from images into speech without installing any third-party software. Using the API, programmers can convert the recognition results into a natural human voice that can be played in the background or downloaded. First, send the image to the Aspose OCR Cloud server and extract the text from it; then convert the text to speech using the Aspose OCR Cloud Text-to-Speech API. After a successful conversion, you can save the speech file to disk.

How to Convert Text to Speech using Python API?

import os
from asposeocrcloud import OcrApi, OcrClient, SpeechApi

client_id = os.environ['CLIENT_ID']
client_secret = os.environ['CLIENT_SECRET']
ocr_api = OcrApi(OcrClient(client_id, client_secret))
speech_api = SpeechApi(OcrClient(client_id, client_secret))

# Upload the image containing the text
filename = 'image.png'
with open(filename, 'rb') as file:
    response = ocr_api.post_recognize_from_content(file.read(), language='English', use_default_dictionaries=True)

# Extract the recognized text

text = ''
for result in response.parts:
    for line in result.lines:
        for word in line.words:
            text += word.text + ' '

# Convert the text to speech
response = speech_api.post_recognize_from_text(text, language='en-US', voice_name='Ben')

# Save the speech file to disk

with open('output.wav', 'wb') as file:
    file.write(response.content)
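
The nested loop in the example above builds the text by repeated string concatenation, which leaves a trailing space. A more idiomatic way to flatten the same parts, lines, and words structure is `str.join` over a generator expression. The sketch below uses small stand-in dataclasses (`Part`, `Line`, `Word`) that only mirror the attribute names in the example; they are assumptions for illustration, not the SDK's actual response types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Word:
    text: str


@dataclass
class Line:
    words: List[Word] = field(default_factory=list)


@dataclass
class Part:
    lines: List[Line] = field(default_factory=list)


def join_words(parts):
    """Join every recognized word with single spaces, no trailing space."""
    return " ".join(
        word.text for part in parts for line in part.lines for word in line.words
    )


# Demonstration with hand-built stand-in data
demo = [Part([Line([Word("Hello"), Word("world")]), Line([Word("again")])])]
print(join_words(demo))
```

The function reads only the `lines`, `words`, and `text` attributes, so it should work with any response object that exposes those names.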