Free Python API to Generate & Customize PDF Documents

Open Source Python Library that allows Software Developers to Generate and Customize PDFs from HTML as well as Combine Multiple HTML Sources into a Single PDF inside Python Apps.

Python has long been a go-to language for software developers and data scientists due to its simplicity and versatility. One of its main advantages is a rich ecosystem of libraries covering many domains. One such library is Python-PDFKit, a tool for generating PDF documents from Python. It produces PDFs from HTML files, URLs, or raw HTML strings with a single function call. Reading data back out of existing PDFs is outside its scope and is typically handled by a companion library.

Python-PDFKit is a Python wrapper for the popular PDF conversion tool wkhtmltopdf, which is written in C++. With this library, developers can integrate PDF generation into their Python applications. Its main features include creating PDFs from HTML files, creating PDFs from URLs, converting HTML strings to PDF directly, combining multiple HTML sources into a single PDF file, managing PDF headers/footers, setting the PDF page size and margins, and customizing the generation process in many other ways.

The Python-PDFKit library provides a small, straightforward interface to the underlying wkhtmltopdf command-line tool, so software developers can create PDF documents without building command lines by hand. Its configuration options map directly onto wkhtmltopdf's command-line flags, which allows fine-tuning of the PDF output to specific requirements. With its simple installation and usage, Python-PDFKit is a useful library for any Python developer looking to streamline PDF generation and produce professional-looking documents.


Getting Started with Python-PDFKit

The recommended and easiest way to install Python-PDFKit is with pip, using the following command. Note that pip installs only the Python wrapper; the wkhtmltopdf tool itself must be installed separately on the system.

Install Python-PDFKit via pip

 pip install pdfkit 

You can also install it manually by downloading the latest release files directly from the GitHub repository.
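Because the wrapper depends on an external binary, it can help to confirm that wkhtmltopdf is reachable before converting anything. The following is a minimal sketch, not part of the library: the helper names find_wkhtmltopdf and make_config are hypothetical, and the check simply assumes the binary is named wkhtmltopdf on the system PATH.

```python
import shutil


def find_wkhtmltopdf():
    """Return the full path to the wkhtmltopdf binary, or None if it is not on PATH."""
    return shutil.which("wkhtmltopdf")


def make_config(binary_path):
    """Build a pdfkit configuration that points at an explicit binary path."""
    import pdfkit  # imported lazily so the PATH check works without pdfkit

    return pdfkit.configuration(wkhtmltopdf=binary_path)


path = find_wkhtmltopdf()
if path is None:
    print("wkhtmltopdf not found; install it before calling pdfkit")
else:
    print("wkhtmltopdf found at", path)
```

When the binary lives outside PATH, the object returned by make_config can be passed as the configuration argument of calls such as pdfkit.from_file.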

Extract Text from PDF via Python

Python-PDFKit itself focuses on creating PDFs and does not read text back out of existing files; in a Python application, that task is usually delegated to a companion library. Retrieving data from a PDF is difficult because the format stores text as positioned glyphs rather than as a logical flow of words and paragraphs. The example below assumes the separate pypdf package (installed with pip install pypdf), whose PdfReader class exposes each page as an object; calling extract_text() on a page returns its text content.

Extract Text from PDF via Python

 # Extract text from the first page of a PDF (requires: pip install pypdf)
 from pypdf import PdfReader

 reader = PdfReader("example.pdf")
 page = reader.pages[0]
 print(page.extract_text())

Generating PDF Documents via Python API

The open source Python-PDFKit library lets software developers generate PDF files inside their Python applications from several kinds of sources: HTML files, HTML strings, or URLs. It is also possible to include images, add headers and footers, set the page size, set margins and so on. The following example demonstrates how to generate PDF files from each kind of source with just a couple of lines of Python code.

Generate a PDF from an HTML File, String or URL via Python API

import pdfkit

# Generate a PDF from an HTML file
pdfkit.from_file("source.html", "output.pdf")

# Generate a PDF from an HTML string
html_string = "<h1>Hello, PDFKit!</h1>"
pdfkit.from_string(html_string, "output.pdf")

# Generate a PDF from a URL
pdfkit.from_url("https://example.com", "output.pdf")
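The input argument of these functions may also be a list, in which case every source is rendered into the same output document. The sketch below illustrates this with two throwaway HTML files; the helper names write_sample_pages and combine_to_pdf are hypothetical, and the conversion step is skipped with a message when pdfkit or wkhtmltopdf is not installed.

```python
import tempfile
from pathlib import Path


def write_sample_pages(folder):
    """Create two small HTML files and return their paths as strings."""
    paths = []
    for number in (1, 2):
        page = Path(folder) / f"page{number}.html"
        page.write_text(f"<h1>Page {number}</h1>", encoding="utf-8")
        paths.append(str(page))
    return paths


def combine_to_pdf(html_paths, output_path):
    """Render several HTML files, in order, into one PDF."""
    import pdfkit  # lazy import: only needed when a conversion is requested

    # Passing a list instead of a single path asks wkhtmltopdf to
    # place every input in the same output document.
    pdfkit.from_file(html_paths, output_path)


with tempfile.TemporaryDirectory() as folder:
    pages = write_sample_pages(folder)
    try:
        combine_to_pdf(pages, str(Path(folder) / "combined.pdf"))
        print("combined PDF written")
    except (ImportError, OSError) as error:
        # Raised when pdfkit or the wkhtmltopdf binary is missing or fails
        print("conversion skipped:", error)
```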

Customizing PDF Generation inside Python Apps

The open source Python-PDFKit library enables software developers to customize the PDF generation process inside their own applications. Developers can specify options such as page size, margins, headers/footers and more in a dictionary. These options are passed to wkhtmltopdf as command-line arguments. The following example shows how to set the page size and remove all margins.

How to Customize PDF Generation Process via Python API?

import pdfkit

options = {
    'page-size': 'A4',
    'margin-top': '0mm',
    'margin-right': '0mm',
    'margin-bottom': '0mm',
    'margin-left': '0mm',
}

pdfkit.from_file("source.html", "output.pdf", options=options)
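Headers and footers are configured through the same dictionary. The following sketch builds such a dictionary in a hypothetical helper named build_options; the keys are wkhtmltopdf flag names without the leading dashes, and the title value is illustrative only. Header and footer flags may depend on how the installed wkhtmltopdf binary was built, so treat this as a starting point rather than a guaranteed recipe.

```python
def build_options(title, landscape=False):
    """Return a dict of wkhtmltopdf options for pdfkit's options argument."""
    return {
        "page-size": "A4",
        "orientation": "Landscape" if landscape else "Portrait",
        "margin-top": "20mm",      # leave room for the header
        "margin-bottom": "20mm",   # leave room for the footer
        "encoding": "UTF-8",
        "header-center": title,
        "footer-right": "[page] / [topage]",  # filled in by wkhtmltopdf per page
    }


options = build_options("Monthly Report")
for name, value in options.items():
    # Each entry corresponds to one command-line flag and its value
    print(f"--{name} {value}")
```

The resulting dictionary can be passed as options=options to pdfkit.from_file, pdfkit.from_string or pdfkit.from_url.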

Convert HTML to PDF via Python Library

The open source Python-PDFKit library makes it easy to load HTML content and convert it to PDF inside Python applications. Besides writing PDFs to disk, the library can return the converted PDF directly as bytes, without saving an intermediate file. This is useful when dealing with dynamic content or generating PDFs on the fly, for example in a web request handler. Below is a simple example that converts an HTML string to PDF bytes and then writes those bytes to a file.

 

How to Convert HTML Documents to PDF Files via Python?

import pdfkit

html_string = "<h1>Hello, PDFKit!</h1>"

# Passing False as the output path returns the PDF as bytes
pdf_bytes = pdfkit.from_string(html_string, False)

# Save the PDF bytes to a file
with open("output.pdf", "wb") as f:
    f.write(pdf_bytes)
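For dynamic content, the HTML is typically assembled from data just before conversion. The sketch below uses the standard library's string.Template for that step; the template, the helper names render_html and html_to_pdf_bytes, and the sample values are all illustrative assumptions. The conversion is skipped with a message when pdfkit or wkhtmltopdf is unavailable.

```python
from html import escape
from string import Template

PAGE = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")


def render_html(title, body):
    """Fill the template, escaping values so they cannot break the markup."""
    return PAGE.substitute(title=escape(title), body=escape(body))


def html_to_pdf_bytes(html):
    """Convert an HTML string to PDF bytes without writing a file."""
    import pdfkit  # lazy import: only needed when a conversion is requested

    # False as the output path makes pdfkit return the PDF as bytes
    return pdfkit.from_string(html, False)


html = render_html("Invoice #42", "Total due: 10 < 20")
try:
    pdf_bytes = html_to_pdf_bytes(html)
    print("valid PDF header:", pdf_bytes.startswith(b"%PDF"))
except (ImportError, OSError) as error:
    # Raised when pdfkit or the wkhtmltopdf binary is missing or fails
    print("conversion skipped:", error)
```

Keeping the rendering step separate from the conversion step means the HTML can be inspected or unit-tested without wkhtmltopdf installed.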