Our legacy stack has years of workarounds for specific PDFs. Will PDFluent handle them?

PDFluent is tolerant of the malformed and non-standard PDFs encountered in production. For the edge cases that matter to you, running old and new code in parallel during the migration will surface any differences before you cut over.

We rely on Ghostscript for PDF/A conversion. Does PDFluent support that?

PDFluent supports PDF/A validation and conversion. This is one of the highest-value replacements for Ghostscript shell invocations, as it removes a large system dependency and enables serverless deployment.

How much smaller will our Docker image be?

A typical Java + Ghostscript + Python stack produces images of 400-600 MB. A Rust binary with PDFluent can be deployed in a debian:slim image under 30 MB. Lambda deployment packages shrink proportionally.

Can we migrate team members who only know Java or Python?

PDFluent provides Python and Node bindings in addition to the native Rust crate. Your Java or Python team can use the bindings while the Rust team migrates the core service, reducing the learning curve.

Migrate from a legacy PDF stack

Many teams accumulate a mix of old tools: PDFBox for reading, iText for writing, custom scripts for forms, and shell wrappers around Ghostscript. Consolidate to one Rust crate.

Migrating from legacy PDF stack to PDFluent. Install with cargo add [email protected]

Migration steps

Audit your current PDF toolchain

Before replacing anything, list every library and tool touching PDFs in your system. Common legacy stacks include PDFBox for reading, iText for writing, a separate form library, and Ghostscript invoked via Runtime.exec() or shell scripts. Each one has its own dependency, license, and failure mode.

legacy PDF stack (before)

# Typical legacy PDF stack
# pom.xml: iText 5 (AGPL), PDFBox 2.0 (Apache)
# scripts/fill_form.sh: calls pdftk or ghostscript
# lib/pdf_util.py: Python wrapper around pdfminer
# Dockerfile: installs ghostscript, pdftk, java, python
#
# Result: 4 runtimes, 3 licenses, 500+ MB image

PDFluent (after)

# PDFluent replaces all of the above
# Cargo.toml: pdfluent = "0.9"  (MIT/commercial)
#
# One crate for: reading, writing, text extraction,
# form filling, XFA, annotations, and metadata.
# Dockerfile: copies a single static binary.
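
The audit step can be partly automated. The sketch below is a minimal illustration, not part of PDFluent: it flags files whose contents mention a known PDF tool. The `MARKERS` list and the `audit` function are hypothetical names chosen for this example, and the inline file contents stand in for files you would read from your own repository.

```rust
use std::collections::BTreeMap;

/// Substrings that suggest a file touches a PDF tool, paired with a display name.
/// Illustrative only: extend the list for your own stack (for example, scripts
/// that call the `gs` binary directly will not match "ghostscript").
const MARKERS: &[(&str, &str)] = &[
    ("itext", "iText"),
    ("pdfbox", "PDFBox"),
    ("pdfminer", "pdfminer"),
    ("pdftk", "pdftk"),
    ("ghostscript", "Ghostscript"),
];

/// For each tool, collects the paths of files whose contents mention it
/// (case-insensitive substring match).
fn audit<'a>(files: &[(&'a str, &str)]) -> BTreeMap<&'static str, Vec<&'a str>> {
    let mut hits: BTreeMap<&'static str, Vec<&'a str>> = BTreeMap::new();
    for (path, contents) in files {
        let lower = contents.to_lowercase();
        for (needle, tool) in MARKERS {
            if lower.contains(*needle) {
                hits.entry(*tool).or_default().push(*path);
            }
        }
    }
    hits
}

fn main() {
    // Inline stand-ins for real files; in practice, read them from disk.
    let files = [
        ("pom.xml", "<artifactId>itextpdf</artifactId> <artifactId>pdfbox</artifactId>"),
        ("scripts/fill_form.sh", "pdftk template.pdf fill_form data.fdf output out.pdf"),
        ("lib/pdf_util.py", "from pdfminer.high_level import extract_text"),
        ("Dockerfile", "RUN apt-get install -y ghostscript pdftk"),
    ];
    for (tool, paths) in audit(&files) {
        println!("{tool}: {}", paths.join(", "));
    }
}
```

A substring scan like this only finds direct mentions, so treat its output as a starting list to review by hand rather than a complete inventory.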

Replace reading and text extraction first

Start with the lowest-risk part of the stack: reading documents and extracting text. This is typically handled by PDFBox or pdfminer and is safe to swap without changing any downstream logic. Verify output parity before moving to the next operation.

legacy PDF stack (before)

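// PDFBox 2.x text extraction
import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

String text;
// try-with-resources closes the document even if extraction throws
try (PDDocument document = PDDocument.load(new File("report.pdf"))) {
    PDFTextStripper stripper = new PDFTextStripper();
    stripper.setStartPage(1);
    stripper.setEndPage(document.getNumberOfPages());
    text = stripper.getText(document);
}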

PDFluent (after)

use pdfluent::PdfDocument;

let doc = PdfDocument::open("report.pdf")?;
let pages = doc.page_count();
let mut all_text = String::new();
for i in 0..pages {
    all_text.push_str(&doc.page(i)?.text()?);
}
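
Verifying output parity usually means comparing the text produced by the old and new extractors for the same document. The sketch below shows one possible way to do that, assuming whitespace-only differences (line breaks, repeated spaces) are acceptable; if layout matters in your pipeline, compare the raw strings instead. `Mismatch` and `first_difference` are hypothetical helpers written for this example and do not call PDFluent or PDFBox.

```rust
/// Describes the first word position where two extractions disagree.
/// `None` in `old` or `new` means that side ran out of words first.
#[derive(Debug, PartialEq)]
struct Mismatch {
    word_index: usize,
    old: Option<String>,
    new: Option<String>,
}

/// Compares two text extractions word by word, ignoring whitespace layout.
/// Returns `None` when they match, otherwise the first point of divergence.
fn first_difference(old: &str, new: &str) -> Option<Mismatch> {
    let old_words: Vec<&str> = old.split_whitespace().collect();
    let new_words: Vec<&str> = new.split_whitespace().collect();
    let len = old_words.len().max(new_words.len());
    for i in 0..len {
        let a = old_words.get(i);
        let b = new_words.get(i);
        if a != b {
            return Some(Mismatch {
                word_index: i,
                old: a.map(|s| s.to_string()),
                new: b.map(|s| s.to_string()),
            });
        }
    }
    None
}

fn main() {
    // Stand-in strings; in practice these come from the legacy and new extractors.
    let legacy = "Quarterly  report\nRevenue: 42";
    let candidate = "Quarterly report Revenue: 42\n";
    println!("{:?}", first_difference(legacy, candidate));

    let drifted = "Quarterly report Revenue: 43";
    println!("{:?}", first_difference(legacy, drifted));
}
```

Reporting the first diverging word, rather than a plain true/false, makes it easier to tell a harmless reordering from a genuine extraction error.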

Consolidate form filling and document writing

Legacy stacks often use a different library for writing than for reading. Replace iText or pdftk form-filling with PDFluent's acroform API. Replace Ghostscript shell invocations with PDFluent's document manipulation methods. Each replacement removes a runtime dependency from your Docker image.

legacy PDF stack (before)

// iText 5 form filling (AGPL)
PdfReader reader = new PdfReader("template.pdf");
PdfStamper stamper = new PdfStamper(
    reader,
    new FileOutputStream("filled.pdf")
);
AcroFields form = stamper.getAcroFields();
form.setField("company_name", "Acme Corp");
form.setField("invoice_date", "2024-04-14");
stamper.setFormFlattening(true);
stamper.close();
reader.close();

PDFluent (after)

use pdfluent::PdfDocument;

let mut doc = PdfDocument::open("template.pdf")?;
let mut form = doc.acroform()?;
form.set_field("company_name", "Acme Corp")?;
form.set_field("invoice_date", "2024-04-14")?;
form.flatten()?;
doc.save("filled.pdf")?;

Things to watch out for

iText 5 is AGPL-licensed. If your team has been quietly ignoring this, migration to PDFluent resolves the license risk.
Shell wrappers around pdftk or Ghostscript are fragile: they depend on specific installed versions and break silently when the binary is missing. PDFluent eliminates all external process invocations.
If you are using PDFBox 2.x, note that PDFBox 3.0 changed several APIs. Rather than upgrading PDFBox, migrating to PDFluent at this point may be less work.
Migrate one operation type at a time and run both the old and new code in parallel on the same documents to verify output parity before decommissioning the old tool.
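
Running old and new code side by side can be organised as a small shadow-run harness. The following is a sketch under stated assumptions, not a prescribed workflow: `shadow_run`, `Outcome`, and `ready_to_cut_over` are hypothetical names, and `legacy_extract` and `new_extract` are stubs standing in for your existing tool and the PDFluent-based replacement.

```rust
/// Result of running both implementations on one document.
#[derive(Debug, PartialEq)]
enum Outcome {
    Match,
    Mismatch,
    OldFailed,
    NewFailed,
    BothFailed,
}

/// Runs the old and new implementation on every document and classifies the result.
fn shadow_run<O, N>(docs: &[&str], old: O, new: N) -> Vec<(String, Outcome)>
where
    O: Fn(&str) -> Result<String, String>,
    N: Fn(&str) -> Result<String, String>,
{
    docs.iter()
        .map(|&doc| {
            let outcome = match (old(doc), new(doc)) {
                (Ok(a), Ok(b)) if a == b => Outcome::Match,
                (Ok(_), Ok(_)) => Outcome::Mismatch,
                (Err(_), Ok(_)) => Outcome::OldFailed,
                (Ok(_), Err(_)) => Outcome::NewFailed,
                (Err(_), Err(_)) => Outcome::BothFailed,
            };
            (doc.to_string(), outcome)
        })
        .collect()
}

/// True only when every document produced identical output on both paths.
fn ready_to_cut_over(results: &[(String, Outcome)]) -> bool {
    results.iter().all(|(_, outcome)| *outcome == Outcome::Match)
}

// Stub: stands in for the legacy tool (for example a PDFBox or pdftk wrapper).
fn legacy_extract(path: &str) -> Result<String, String> {
    Ok(format!("text of {path}"))
}

// Stub: stands in for the replacement; fails on one file to show the report.
fn new_extract(path: &str) -> Result<String, String> {
    if path == "scan.pdf" {
        Err("unsupported encoding".to_string())
    } else {
        Ok(format!("text of {path}"))
    }
}

fn main() {
    let results = shadow_run(&["report.pdf", "scan.pdf"], legacy_extract, new_extract);
    for (doc, outcome) in &results {
        println!("{doc}: {outcome:?}");
    }
    println!("ready to cut over: {}", ready_to_cut_over(&results));
}
```

Keeping `OldFailed` separate from `NewFailed` is a deliberate choice here: documents the legacy tool already rejects are not regressions, while new failures are the ones that should block decommissioning.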

Frequently asked questions
