Free Ruby Library to Convert Image to Text & Searchable PDFs

Open Source Ruby OCR Library That Enables Software Developers to Perform Optical Character Recognition to Extract Text from Scanned Documents, Images, or Even Screenshots

What is RTesseract?

Optical Character Recognition (OCR) serves as a transformative bridge between visual media and digital data, enabling machines to extract text from images and scanned documents with precision. For developers within the Ruby ecosystem, the open source RTesseract library provides a high-efficiency gateway to the power of the Tesseract OCR engine. Originally pioneered by Hewlett-Packard and further refined by Google, Tesseract is widely recognized as a premier solution for image-to-text conversion. By acting as a sophisticated interface, RTesseract allows for the seamless integration of Tesseract OCR into any Ruby project, eliminating the need to manually manage complex command-line interactions while maintaining the accuracy required for professional-grade document digitization.

Beyond its ease of use, RTesseract supports common image formats, including PNG, JPEG, BMP, and TIFF, so it works with most scanned or photographed sources. Because it relies on Tesseract’s trained language data files, it can recognize text in many languages, provided the corresponding data is installed. The library also exposes a confidence score for each recognized word, so developers can evaluate the quality of the OCR output programmatically. As an open source project, RTesseract offers a cost-effective and customizable way for teams to automate data extraction workflows and build text-aware Ruby applications.

At A Glance

An overview of RTesseract features.

Features Overview

Convert Image to Text
Add OCR Capabilities
Recognize Image text
Load Images via URL
Convert PDF to text
Recognize Font text
Image to Searchable PDF
Other Languages
Create OCR apps
Save to browser
Extract Text
Multi-threading Support

RTesseract

RTesseract supports the popular image file formats listed below.

Reader

PNG, JPEG, BMP, TIFF, TGA, DICOM

Writer

PNG, JPEG, BMP, TIFF

Platform Independence

RTesseract requires a Ruby runtime and a working installation of the Tesseract OCR engine, which it calls to perform the recognition.

Ruby 5.1 and above.

Getting Started with RTesseract

The recommended way to install RTesseract is with RubyGems. Use the following command for a smooth installation.

Install RTesseract via Rubygems

$ gem install rtesseract

Install RTesseract via GitHub

$ git clone https://github.com/dannnylo/rtesseract.git

This clones the library's source code from the GitHub repository.
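Because RTesseract drives the Tesseract engine rather than bundling it, it can help to confirm that a tesseract executable is reachable before running any OCR. The following is a minimal sketch using only the Ruby standard library; the helper name find_executable is hypothetical and is not part of RTesseract.

```ruby
# Returns the full path of an executable found on PATH, or nil if absent.
# find_executable is an illustrative helper, not an RTesseract method.
def find_executable(name, path_env = ENV.fetch('PATH', ''))
  # Windows lists executable extensions in PATHEXT; elsewhere none is needed.
  extensions = ENV['PATHEXT'] ? ENV['PATHEXT'].split(';') : ['']

  path_env.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?

    extensions.each do |ext|
      candidate = File.join(dir, "#{name}#{ext}")
      return candidate if File.file?(candidate) && File.executable?(candidate)
    end
  end
  nil
end

if (binary = find_executable('tesseract'))
  puts "Tesseract found at #{binary}"
else
  puts 'Tesseract not found; install it before using RTesseract.'
end
```

If the check fails, install Tesseract with your platform's package manager and run it again.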

Image to Text Conversion via Ruby API

The RTesseract library makes it easy for developers to load an image and convert it to text inside Ruby applications. The most straightforward use case is converting an image into a string of text, which takes only a few lines of code. The following example loads an image, processes it with Tesseract, and returns the recognized text as a Ruby string.

How to Load an Image and Convert It to Text via Ruby API?

require 'rtesseract'
image = RTesseract.new("path/to/your_image.jpg")
text = image.to_s
puts "Extracted Text: #{text}"
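In practice the image path may be wrong, and OCR output often carries stray whitespace. The sketch below wraps the conversion in a small helper that checks the file first and trims the result. The names DEFAULT_OCR, extract_text, and extract_all are hypothetical; the only RTesseract calls used are RTesseract.new and to_s from the example above. The OCR step is passed in as a callable so the helper can be exercised with a stub when Tesseract is not installed.

```ruby
# Default OCR step. The gem is loaded lazily so the helpers below can be
# tested with a stub on machines without rtesseract or Tesseract.
DEFAULT_OCR = lambda do |path|
  require 'rtesseract'
  RTesseract.new(path).to_s
end

# Returns the recognized text for one image, with surrounding whitespace
# removed. Raises ArgumentError when the file does not exist.
def extract_text(path, ocr: DEFAULT_OCR)
  raise ArgumentError, "image not found: #{path}" unless File.file?(path)

  ocr.call(path).strip
end

# Converts several images and returns a hash of path => recognized text.
def extract_all(paths, ocr: DEFAULT_OCR)
  paths.to_h { |path| [path, extract_text(path, ocr: ocr)] }
end

# Example (requires the gem and Tesseract):
#   puts extract_text('path/to/your_image.jpg')
```

Injecting the OCR step is a design choice for testability; in application code the default is usually enough.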

Image Conversion to Searchable PDF via Ruby

The open source RTesseract library supports converting an image to a searchable PDF inside Ruby applications while preserving the image’s layout and colors. The following example demonstrates how software developers can load an image and generate a searchable PDF document from it.

How to Convert a JPEG Image to Searchable PDF File via Ruby Library?

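require 'rtesseract'
image = RTesseract.new("path/to/my_image.jpg")
pdf_file = image.to_pdf  # Returns an open file handle for the PDF
pdf_file.binmode         # PDF data is binary, so avoid text-mode translation
File.binwrite("output.pdf", pdf_file.read)
pdf_file.close           # Release the handle once the copy is written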
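For large scans it can be preferable to stream the PDF to its destination rather than read it fully into memory. The sketch below assumes, as the example above states, that to_pdf returns an open file handle. The names DEFAULT_TO_PDF and save_searchable_pdf are hypothetical helpers; the PDF-producing step is injected so the copy logic can be tested without Tesseract.

```ruby
require 'fileutils'

# Default PDF step, loaded lazily; assumes to_pdf returns an open file handle.
DEFAULT_TO_PDF = lambda do |image_path|
  require 'rtesseract'
  RTesseract.new(image_path).to_pdf
end

# Streams the generated PDF to +destination+ in binary mode and always
# closes the source handle. Returns the destination path.
def save_searchable_pdf(image_path, destination, to_pdf: DEFAULT_TO_PDF)
  pdf = to_pdf.call(image_path)
  begin
    FileUtils.mkdir_p(File.dirname(destination))
    pdf.binmode
    File.open(destination, 'wb') { |out| IO.copy_stream(pdf, out) }
  ensure
    pdf.close
  end
  destination
end

# Example (requires the gem and Tesseract):
#   save_searchable_pdf('path/to/my_image.jpg', 'out/output.pdf')
```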

Custom Configuration for Tesseract

The open source RTesseract library allows software professionals to configure Tesseract’s settings, such as the language, the page segmentation mode, the OCR engine mode, and restricting recognition to digits. This lets you fine-tune the OCR process for better accuracy. In RTesseract 3.x, settings are passed as keyword options when creating the RTesseract object. In the following example, psm (page segmentation mode) is set to 6, which treats the image as a single uniform block of text, and oem (OCR engine mode) is set to 1, which selects the LSTM engine.

How to Customize Tesseract’s settings inside Ruby Apps?

require 'rtesseract'
image = RTesseract.new('path/to/image.png', psm: 6, oem: 1)
text = image.to_s

puts text
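An out-of-range mode usually surfaces only as a Tesseract error at run time, so it can be useful to validate settings up front. The sketch below assumes the value ranges documented for Tesseract 4 and later (page segmentation modes 0 to 13, engine modes 0 to 3). The helper ocr_options and its constants are hypothetical; it only builds a plain hash and does not call RTesseract.

```ruby
# Value ranges as documented for Tesseract 4 and later (an assumption here;
# check `tesseract --help-extra` for the installed version).
PSM_RANGE = (0..13)
OEM_RANGE = (0..3)

# Builds a hash of OCR settings, rejecting out-of-range modes early.
# Settings left as nil are omitted from the result.
def ocr_options(psm: nil, oem: nil, lang: nil)
  raise ArgumentError, "psm must be in #{PSM_RANGE}" if psm && !PSM_RANGE.cover?(psm)
  raise ArgumentError, "oem must be in #{OEM_RANGE}" if oem && !OEM_RANGE.cover?(oem)

  { psm: psm, oem: oem, lang: lang }.compact
end

p ocr_options(psm: 6, oem: 1)
```

The resulting hash can then be handed to RTesseract.new in whichever form your installed version of the gem expects.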

Multi-Language Support

If your image contains text in a specific language, you can improve accuracy by specifying the language. The library supports languages such as English (default), German, French, Italian, Dutch, Portuguese, Spanish, Vietnamese, and so on. Please make sure that the corresponding language pack is installed with Tesseract. In the following example, the lang option is set to 'fra' for French. You can use any language code supported by Tesseract.

How to Convert Image to Text in Other Languages via Ruby Library?

require 'rtesseract'
image = RTesseract.new('path/to/image.png', lang: 'fra') # French language
text = image.to_s

puts text
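Since recognition depends on the language pack being installed, a quick check against the output of the tesseract --list-langs command can catch a missing pack before OCR runs. The sketch below is one way to do that with the standard library. The helper names are hypothetical, and the parsing assumes the usual output layout of a "List of available languages" header followed by one language code per line. Tesseract also accepts several codes joined with a plus sign, such as 'eng+fra', which the last helper splits.

```ruby
require 'open3'

# Extracts language codes from `tesseract --list-langs` output,
# skipping blank lines and the "List of available languages" header.
def parse_languages(output)
  output.lines.map(&:strip).reject { |line| line.empty? || line.start_with?('List of') }
end

# Asks the local Tesseract for its installed languages.
# Raises Errno::ENOENT if the tesseract executable is not on PATH.
def installed_languages
  output, status = Open3.capture2e('tesseract', '--list-langs')
  raise "tesseract --list-langs failed: #{output}" unless status.success?

  parse_languages(output)
end

# Returns the codes in a lang option (for example 'eng+fra') that are
# not present in the installed list.
def missing_languages(lang_option, installed)
  lang_option.split('+') - installed
end

# Example (requires Tesseract):
#   missing = missing_languages('fra', installed_languages)
#   warn "Missing language packs: #{missing.join(', ')}" unless missing.empty?
```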