Html To Pdf With Python A Step By Step Guide Medium

Dr. Aris Thorne
-
html to pdf with python a step by step guide medium

Top 10 Python PDF generator libraries: Complete guide for developers (2025) Table of contents 10 Python PDF libraries for different use cases: - Add forms, annotations, signatures → Nutrient API - Generate basic PDFs → FPDF or pdfdocument - Convert HTML/CSS to PDF → PDFKit (wkhtmltopdf) or WeasyPrint - Design complex layouts and charts → ReportLab - Build rich documents with tables and barcodes → borb - Turn images into PDFs → img2pdf (lossless) or Pillow + img2pdf How to generate PDF in Python PDF is the standard format for documents across industries.

A Python PDF library lets you: - Generate reports and invoices — Create invoices, statements, and reports without manual layout - Build document workflows — Fillable forms, digital signatures, merge/split files - Convert HTML to PDF — Turn web templates into print-ready brochures and ebooks - Archive images — Bundle scans and photos into searchable PDFs - Add PDF endpoints — Serve PDF generation from web apps or serverless functions Python has many PDF libraries — from lightweight, dependency-free tools to HTML-to-PDF engines and commercial APIs.

The sections below compare 10 Python PDF libraries across five criteria to help you choose. 1. Nutrient API Nutrient DWS API generates, customizes, and processes PDFs at scale. It handles form filling, signatures, annotations, OCR, and merging via a cloud REST API — no local rendering engines required.

Key features - HTML to PDF — CSS, custom fonts, headers/footers, and page numbers - Document conversion — Convert Office files (Word, Excel, PowerPoint), images, and Markdown to PDF - PDF operations — Form filling, digital signatures, annotations, watermarking, OCR, merging, compression, and redaction - Accessibility — PDF/UA auto-tagging for accessible documents - Platform-agnostic — Call from Python, JavaScript, or any HTTP client - SOC 2 Type 2 and GDPR compliant — Documents processed in memory, never stored Getting started - Sign up — Visit the Nutrient website(opens in a new tab) and sign up for an account.

Request an API key — After signing up, obtain an API key from the dashboard. - Pricing — Credit-based. Free trial available. Example: Generate a PDF from HTML To generate a PDF, the instructions dictionary specifies that the PDF should be generated from the index.html file. The code then sends a POST request to the Nutrient API with the HTML content. The API processes this data and returns a PDF file. If the request is successful, the PDF is saved as result.pdf.

If there’s an error, the response is printed for troubleshooting. Advanced usage - Dynamic content — Combine with data sources to generate invoices, reports, or certificates - Other Nutrient services — Use with annotation, form filling, or OCR - Batch processing — Handle large-scale PDF generation Why use Nutrient API - 30+ PDF tools — Conversion, watermarking, OCR, form processing, digital signatures, compression, and redaction in one API - Credit-based pricing — Form filling costs 1 credit, OCR costs 3 credits, digital signatures cost 10 credits.

Combine multiple operations in a single request - Python client — Official async Python client with type hints, available via pip install nutrient-dws - Rate limit — 100 requests per minute per API key Sign up for a free account(opens in a new tab) to test the API. Official Python client for Nutrient Processor API Nutrient offers an official Python client(opens in a new tab), available via pip install nutrient-dws. It requires Python 3.10+.

The client provides: - Direct methods — merge_pdfs,ocr_pdf,watermark_pdf,compress_pdf,extract_text - Builder pattern — Chain multiple operations: add parts → apply actions → set output → execute - Input flexibility — File paths, bytes, file-like objects, and remote URLs - Async and type-safe — Full type hints with async/await support - Error handling — Specific exceptions for validation, API, authentication, and network errors The Python client doesn’t expose HTML-to-PDF generation directly. Use requests to call the /build endpoint for HTML conversion. Then use the client for further processing. 2.

FPDF Python: Create PDF with pure Python FPDF(opens in a new tab) is a lightweight, dependency-free library for building PDFs from scratch. Key features - Quick text and image insertion — Add paragraphs, pictures, and simple lines. - Multipage support — Loop through pages with just a few commands. - Basic formatting — Set fonts, colors, and alignments easily. - Fast and small — No dependencies, works in minimal environments. Best for: Simple multipage documents, or image-rich flyers that only need basic text, images, and custom formatting.

Installation Install FPDF using pip: Usage example Create a PDF with FPDF: In this example, you created a PDF document, added a single page, and inserted a line of centered text. The PDF was saved to output.pdf. Advanced example: Generating a multipage PDF with images and custom formatting In this example, you’ll create a Python script to generate a multipage PDF that includes both text and images. You’ll use the fpdf library to: - Add multiple pages to the PDF. - Insert custom text on each page.

Include an image (e.g. example.jpg ) on each page. Ensure you have an image file (e.g. example.jpg ) you want to include in the PDF. Place this image in the same directory as your script. Save the following Python code in a.py file (e.g. generate_pdf.py ): Open your terminal, navigate to the folder where the script and image are stored, and run: This will generate a multi_page.pdf file in the same directory. Each page will have custom text and the example.jpg image placed at specified coordinates.

Try Nutrient API for free and generate PDFs in minutes. 3. Generating PDFs with ReportLab ReportLab(opens in a new tab) is an open source library for creating PDFs with text, images, charts, and graphics.

Key features - PDF generation — Text, images, charts, and custom graphics - Advanced graphics — Lines, shapes, curves, and illustrations - Document templates — Consistent layout across documents - Custom fonts and styles — Full control over fonts, colors, and styles Best for: Complex layouts with charts, graphics, and custom typography — reports, brochures, or invoices. Licensing ReportLab has a free community edition under the ReportLab Open Source License. A commercial edition is available with additional features and support.

Installation To install ReportLab, use pip: Alternatively, if you’re using a system like Anaconda, you can install it via conda: Usage example Generate a PDF with ReportLab: In this example, a PDF file named example.pdf is created with a simple line of text. More advanced features ReportLab offers more advanced features, outlined below. Adding graphics You can draw more than just text — for example, lines, shapes, and even custom images: Creating charts You can also generate various types of charts using ReportLab’s rlbcharts module.

It allows you to create bar charts, line charts, and pie charts with extensive customization. ReportLab works well for detailed reports with charts, graphs, and complex page layouts. For simpler PDFs, consider FPDF or img2pdf. 4. Generating PDFs with PDFKit PDFKit(opens in a new tab) is a thin Python wrapper around wkhtmltopdf(opens in a new tab), letting you generate pixel-perfect PDFs directly from HTML/CSS. Key features - Full HTML/CSS ↔︎ PDF conversion — Supports media queries, page breaks, and inline JavaScript execution.

Complex layout handling — Preserves grids, tables, and layered elements without extra tweaking. - Drop-in integration — Call one function ( pdfkit.from_file/string/url ) and deploy on any system that haswkhtmltopdf installed. Best for: Converting webpages or HTML templates into styled PDFs — perfect for reports, ebooks, and other web-native documents. Installation To use PDFKit, install both the pdfkit library and the wkhtmltopdf tool. Follow the steps below to set it up.

Install the PDFKit library: Download and install wkhtmltopdf from its official website(opens in a new tab), or use a package manager:macOS (Homebrew): Ubuntu/Debian: Windows: Download the installer from the official website(opens in a new tab) and follow the installation instructions. Converting HTML to PDF Here’s a simple guide for converting an HTML file into a PDF: To learn more about converting HTML to PDF using Python, check out our blog post: 5.

Generating PDFs with WeasyPrint WeasyPrint(opens in a new tab) is a Python library that turns modern HTML + CSS into print-ready PDFs while faithfully preserving the source layout and styles. Key features - Accurate HTML/CSS rendering — Handles flexbox, grid, media queries, and other modern web layouts. - Rich styling support — Full CSS3, responsive design, and custom fonts. - Unicode and multilingual — Reliable output for RTL and non-Latin scripts. - SVG embedding — Renders inline or linked SVG graphics without rasterization.

Best for: Web content conversion, styled reports, invoices, receipts, and ebooks from HTML + CSS. Installation Install WeasyPrint via pip: WeasyPrint requires system dependencies. On Ubuntu/Debian: Usage example Generate a styled PDF with WeasyPrint: In this example, HTML(string=html_content).write_pdf("output.pdf") converts the provided HTML content into a PDF file named output.pdf. To learn more about WeasyPrint, visit our blog post on how to generate PDF from HTML using Python. 6. Generating PDFs with borb borb(opens in a new tab) is a modern, pure-Python library for both creating and manipulating PDFs.

It has high-level layout primitives (Paragraph, Table, Chart) and low-level drawing commands. Key features - Rich layout engine — Paragraphs, images, tables, barcodes, SVG, pie and bar charts. - Interactive elements — Forms, annotations, document outlines. - Post-processing — Merge, split, redact, encrypt existing PDFs. Best for: Complex pages (tables, charts, barcodes) and PDF manipulation (merge, split, encrypt) with no external binaries. Installing borb borb can be installed via pip: Usage example 7.

Generating PDFs with img2pdf img2pdf(opens in a new tab) is a tiny utility for stitching images into a single, lossless PDF. Key features - Any image ↔︎ PDF — Handles JPEG, PNG, TIFF, and more. - Batch combine — Merge dozens of images into one document in file order. - Lossless output — Keeps original pixels; no reencoding artefacts. - Fast and minimal — Pure Python, no heavy dependencies.

Best for: Bundling images — such as scans, photos, and graphics — into a single, lossless PDF when you don’t need additional text or formatting. Installation Install via pip: Usage example Convert images to PDF: In this example, img2pdf.convert() takes a list of image file paths and writes them into a PDF file named output.pdf. Try Nutrient API for free and generate PDFs in minutes. 8. Using Pillow with img2pdf Pillow(opens in a new tab) lets you resize, crop, rotate, and convert images before passing them to img2pdf.

Key features - Image editing — Resize, crop, rotate, filter, or watermark images - Format conversion — Convert TIFF, BMP, or RAW to JPEG/PNG - Works with img2pdf — Pass processed images to img2pdf.convert(...) Best for: Preprocessing images before converting to PDF. Installation Install both libraries: Code examples Example 1: Preprocess and convert a single image Example 2: Preprocess and combine multiple images 9. Generating PDFs with xhtml2pdf xhtml2pdf(opens in a new tab) converts HTML + CSS to PDF with a single function call.

Best for: Styled invoices, browser-ready reports, marketing flyers, or offline documents generated straight from web templates. Key features - Layout-faithful HTML → PDF — Preserves CSS, page breaks, and headers and footers. - Embedded assets — Packs fonts and images so the PDF matches the original design. - Multipage and complex structures — Handles tables, floats, and long documents with ease.

Installation To use xhtml2pdf, you can install it via pip: Usage example Here’s a simple example of how to convert an HTML file to a PDF using xhtml2pdf: In this example, xhtml2pdf is used to convert a simple HTML string into a PDF file named output.pdf. The library handles the HTML structure and CSS styling, enabling you to produce a well-formatted PDF. 10. Generating PDFs with pdfdocument pdfdocument(opens in a new tab) is a minimal Python library for creating basic PDFs with minimal setup.

Its intuitive API makes it ideal for simple documents like reports, memos, or letters — without the overhead of more complex libraries. Key features - Straightforward API — Quickly add headings, paragraphs, and images. - Lightweight and dependency-free — Good for small projects or scripting. - Customizable layout — Basic support for formatting, heading levels, and visual structure. Best for: Simple text-based documents and quick PDF output where you don’t need advanced layout engines.

Installation You can install pdfdocument via pip: Usage example Here’s a simple example of how to use pdfdocument to generate a PDF: In this example, pdfdocument is used to create a PDF file named output.pdf. The code adds a title and a paragraph of text, demonstrating how easily you can generate a basic PDF. The generate() method finalizes and saves the document. Comparison: Which Python PDF generator is right for you?

Recommendations - Nutrient API — For production apps needing forms, signatures, annotations, OCR, and support - xhtml2pdf — Simple HTML-to-PDF with good community support - Pillow with img2pdf — Image preprocessing before PDF conversion - img2pdf — Fast image-to-PDF conversion - WeasyPrint and PDFKit — HTML/CSS rendering with good community support - ReportLab — Custom graphics and charts - FPDF and pdfdocument — Basic PDFs with minimal dependencies Quick guide: - Simple PDF generation → FPDF or img2pdf - HTML/CSS conversion → WeasyPrint or PDFKit - Complex layouts → ReportLab - Production apps → Nutrient API Common issues while generating PDFs in Python 1.

Handling large documents - Problem: Generating large PDFs with many pages, images, or complex content can lead to high memory consumption and slow processing times. - Solutions and best practices: - Incremental PDF writing — Instead of generating an entire document in memory, write pages to disk incrementally with streaming options using libraries like reportlab. - Optimize image sizes — Compress and resize images before adding them to the PDF to reduce memory usage.

Chunk processing — Divide content generation into smaller chunks and process them sequentially to avoid memory overload. - Incremental PDF writing — Instead of generating an entire document in memory, write pages to disk incrementally with streaming options using libraries like Example: This example generates 1,000 pages, processing each one incrementally without loading all pages at once, improving memory usage. 2. Managing memory usage - Problem: Memory leaks or excessive memory consumption when processing multiple PDF files or handling large datasets.

Solutions and best practices: - Use generators — Leverage Python generators to process data lazily instead of loading everything into memory at once. - Garbage collection — Explicitly clear unused objects using gc.collect() after processing large chunks of data. - Use streaming APIs — Libraries like pdfkit support streaming outputs instead of storing content in memory. Example: This example creates a PDF and then it explicitly clears the memory by deleting the object and invoking garbage collection to optimize memory usage. 3.

Ensuring cross-platform compatibility - Problem: PDF output may look different on various operating systems due to font availability or encoding issues. - Solutions and best practices: - Embed fonts — Use built-in font embedding features in libraries like FPDF orreportlab to ensure consistency across platforms. - Use standard fonts — Stick to common fonts like Helvetica, Times, and Courier to avoid OS-specific font dependencies. - Encoding handling — Always specify text encoding (e.g. UTF-8) to avoid compatibility issues when generating multilingual PDFs.

Embed fonts — Use built-in font embedding features in libraries like Example: This example uses a built-in font (Arial) and sets page breaks, ensuring compatibility across different systems with consistent font handling. 4. Optimizing performance - Problem: PDF generation can be slow, especially with large datasets, high-resolution images, or complex formatting. - Solutions and best practices: - Minimize draw calls — Reduce the number of drawing operations by batching similar elements together. - Use cached resources — Cache repeated elements (e.g. logos, headers) to avoid redundant processing.

Asynchronous processing — Use asynchronous processing for high-performance document generation in web applications. Example: This example demonstrates parallel PDF generation with a thread pool, allowing chunks of data to be processed simultaneously, improving overall speed. 5. Maintaining layout consistency - Problem: Inconsistent layout issues arise when adding dynamic content such as tables, charts, or paragraphs. - Solutions and best practices: - Define layout templates — Use a consistent document template to standardize layout across all generated PDFs.

Auto-adjust layout — Use libraries that support automatic content fitting and page breaks, such as reportlab. - Test different screen sizes — Check the final PDF layout on different devices to ensure consistency. Example: This example uses multi_cell to handle long paragraphs automatically, ensuring text wraps properly without breaking the layout. By following these solutions and best practices, developers can efficiently generate PDFs in Python while overcoming common challenges related to performance, compatibility, and security.

Conclusion This guide covered 10 Python PDF libraries — from lightweight tools like FPDF and img2pdf to commercial options like Nutrient API. Choose based on your needs: - Simple PDFs — FPDF or img2pdf - HTML conversion — WeasyPrint or PDFKit - Production apps — Nutrient API for forms, signatures, and support Start your free trial(opens in a new tab) to get 200 API credits. FAQ FPDF is a lightweight library for generating simple PDFs with text, images, and basic formatting. It requires no external dependencies. Yes.

You can generate PDFs with embedded images using libraries like ReportLab, WeasyPrint, and img2pdf. Each of these libraries provides support for adding images to your PDFs, with img2pdf being specifically designed for converting images to PDFs. Pillow preprocesses images (resize, crop, convert formats) before passing them to img2pdf for PDF conversion. Nutrient API supports HTML-to-PDF conversion, fillable forms, document merging, watermarks, annotations, OCR, and digital signatures. It has an official Python client library. PDFKit wraps wkhtmltopdf to convert HTML documents into PDFs with CSS styling and JavaScript execution.

People Also Asked

How%20to%20ConvertHTMLtoPDFwithxhtml2pdf%3F

Top%2010%20Python%20PDF%20generator%20libraries%3A%20Complete%20guide%20for%20developers%20%282025%29%20Table%20of%20contents%2010%20Python%20PDF%20libraries%20for%20different%20use%20cases%3A%20-%20Add%20forms%2C%20annotations%2C%20signatures%20%u2192%20Nutrient%20API%20-%20Generate%20basic%20PDFs%20%u2192%20FPDF%20or%20pdfdocument%20-%20Convert%20HTML/CSS%20to%20PDF%20%u2192%20PDFKit%20%28wkhtmltopdf%29%20or%20WeasyPrint%20-%20Design%20complex%20layouts%20and%20charts%20%u2192%20ReportLab%20-%20Build%20rich%20documents%20with%20tables%20and%20barcodes%20%u2192%20...

Top%2010PythonPDFgenerator%20libraries%3A%20Completeguidefor...%3F

The%20sections%20below%20compare%2010%20Python%20PDF%20libraries%20across%20five%20criteria%20to%20help%20you%20choose.%201.%20Nutrient%20API%20Nutrient%20DWS%20API%20generates%2C%20customizes%2C%20and%20processes%20PDFs%20at%20scale.%20It%20handles%20form%20filling%2C%20signatures%2C%20annotations%2C%20OCR%2C%20and%20merging%20via%20a%20cloud%20REST%20API%20%u2014%20no%20local%20rendering%20engines%20required.

ConvertHTMLtoPDFOnline%3F

Try%20Nutrient%20API%20for%20free%20and%20generate%20PDFs%20in%20minutes.%203.%20Generating%20PDFs%20with%20ReportLab%20ReportLab%28opens%20in%20a%20new%20tab%29%20is%20an%20open%20source%20library%20for%20creating%20PDFs%20with%20text%2C%20images%2C%20charts%2C%20and%20graphics.

WkhtmltopdfPythonAlternatives%20ForPythonPDFLibrary%3F

Generating%20PDFs%20with%20img2pdf%20img2pdf%28opens%20in%20a%20new%20tab%29%20is%20a%20tiny%20utility%20for%20stitching%20images%20into%20a%20single%2C%20lossless%20PDF.%20Key%20features%20-%20Any%20image%20%u2194%uFE0E%20PDF%20%u2014%20Handles%20JPEG%2C%20PNG%2C%20TIFF%2C%20and%20more.%20-%20Batch%20combine%20%u2014%20Merge%20dozens%20of%20images%20into%20one%20document%20in%20file%20order.%20-%20Lossless%20output%20%u2014%20Keeps%20original%20pixels%3B%20no%20reencoding%20artefacts.%20-%20Fast%20and%20minimal%20%u2014%20Pure%20Python%2C%20no%20heavy%20dependencies.

Build%20a%20Web%20App%20to%20convert%20Image%20document%20to%20scanned...%20%7CMedium%3F

A%20Python%20PDF%20library%20lets%20you%3A%20-%20Generate%20reports%20and%20invoices%20%u2014%20Create%20invoices%2C%20statements%2C%20and%20reports%20without%20manual%20layout%20-%20Build%20document%20workflows%20%u2014%20Fillable%20forms%2C%20digital%20signatures%2C%20merge/split%20files%20-%20Convert%20HTML%20to%20PDF%20%u2014%20Turn%20web%20templates%20into%20print-ready%20brochures%20and%20ebooks%20-%20Archive%20images%20%u2014%20Bundle%20scans%20and%20photos%20into%20searchable%20PDFs%20-%20Add%20PDF%20endpoints%20%u2014%20Serve%20PDF%20generation%20f...