About PDFluent

Built to solve the PDF problems nobody else wanted to touch.

PDFluent is a pure-Rust PDF SDK. No C++ at the core, no PDFium under the hood, no memory corruption bugs. It runs natively and in the browser — same code, same behaviour.

Why this exists

I got frustrated. I was building a document-processing pipeline that needed to handle XFA forms — the kind of PDF forms that governments and banks have been generating for the past 20 years. Every library I tried either ignored XFA entirely, crashed on anything slightly malformed, or dragged in a full PDFium build that weighed 80 MB and came with its own set of C++ vulnerabilities.

So in 2024 I started writing a PDF parser in Rust. Just for myself, just to see if it was possible. It worked. Then I added an XFA engine. Then a renderer. Then a PDF/A converter. At some point it became a proper SDK — one I would actually want to use in production.

PDFluent is the result. It's been tested against 20,000 real-world PDFs: the kind that come from governments, banks, and enterprise software, not PDF spec examples. XFA forms parse and flatten reliably. Visual accuracy is actively improving, and we test against a golden set derived from Adobe output. The PDF/A converter passes a 20,000-PDF corpus with zero false negatives.

What makes it different

Pure Rust, zero C deps at the core

The parser, XFA engine, and renderer are written in safe Rust. No FFI boundary to cross, no C++ exceptions to catch, no need to ship a platform binary for every OS.

Runs in the browser

Compile to WebAssembly and the same code runs in the browser without a server. Useful for client-side previews, offline processing, or reducing API round-trips.

XFA support — the hard part

XFA (XML Forms Architecture) is a 1,000-page spec that almost every modern PDF library ignores. PDFluent parses XFA forms, evaluates their FormCalc scripts, and flattens them to standard PDF.

Tested on real-world PDFs

Not just the spec examples. A corpus of 20,000 PDFs from the wild — government forms, invoices, scanned documents, certified PDFs — with a MuPDF-rendered oracle for SSIM comparison.
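
For readers who haven't met SSIM before, here is a minimal sketch of the idea in Rust. It is not PDFluent's actual test harness: the function name `global_ssim` is made up for illustration, it assumes both pages have already been rendered to 8-bit grayscale buffers of the same size, and it computes a single global score rather than the windowed average that production SSIM tools usually report.

```rust
/// Global (single-window) SSIM between two 8-bit grayscale images.
/// Hypothetical helper for illustration only.
/// Returns None if the buffers are empty or differ in length.
fn global_ssim(a: &[u8], b: &[u8]) -> Option<f64> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let n = a.len() as f64;

    // Mean intensity of each image.
    let mean = |p: &[u8]| p.iter().map(|&v| v as f64).sum::<f64>() / n;
    let (mu_a, mu_b) = (mean(a), mean(b));

    // Variances and covariance around those means.
    let (mut var_a, mut var_b, mut cov) = (0.0, 0.0, 0.0);
    for (&x, &y) in a.iter().zip(b) {
        let (dx, dy) = (x as f64 - mu_a, y as f64 - mu_b);
        var_a += dx * dx;
        var_b += dy * dy;
        cov += dx * dy;
    }
    var_a /= n;
    var_b /= n;
    cov /= n;

    // Stabilising constants from the standard SSIM formulation
    // (K1 = 0.01, K2 = 0.03, dynamic range L = 255).
    let c1 = (0.01 * 255.0_f64).powi(2);
    let c2 = (0.03 * 255.0_f64).powi(2);

    Some(
        ((2.0 * mu_a * mu_b + c1) * (2.0 * cov + c2))
            / ((mu_a * mu_a + mu_b * mu_b + c1) * (var_a + var_b + c2)),
    )
}
```

Two identical renders score 1.0, and the score falls as structure diverges, which is what makes it useful as a regression signal across a large corpus.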

Memory safe by default

Rust's ownership model catches entire classes of bugs at compile time, and out-of-bounds accesses are stopped by runtime bounds checks. No buffer overflows, no use-after-free, no null pointer dereferences: the kind of bugs that fill PDF-library CVE lists.
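
A small sketch of what that means for a parser facing a malformed file. This is not code from PDFluent: `read_span` is a hypothetical helper, shown only to illustrate how safe Rust handles an offset that points past the end of the buffer.

```rust
/// Borrows `len` bytes starting at `offset` from `data` without copying.
/// Hypothetical helper for illustration only.
/// Returns None instead of reading out of bounds.
fn read_span(data: &[u8], offset: usize, len: usize) -> Option<&[u8]> {
    // checked_add guards against integer overflow in offset + len.
    let end = offset.checked_add(len)?;
    // slice::get returns None for any range outside the buffer.
    data.get(offset..end)
}
```

Given the bytes `%PDF-1.7`, asking for 3 bytes at offset 5 yields `1.7`. Asking for 100 bytes at the same offset yields `None`, where the equivalent unchecked C read would walk off the end of the buffer.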

One SDK, six runtimes

Rust, Python (PyO3), Node.js (napi-rs), Java (JNI), C (cdylib header), and WebAssembly. Same underlying engine everywhere.

What's open source, what's not

The core engine is open source under the MIT licence. The commercial parts fund continued development and keep the lights on.

Component                  Open source   Notes
PDF parser (pdf-syntax)                  Read-only, zero-copy parsing
PDF/A validator                          ISO 19005 compliance checks
XFA engine                               Parse, flatten, FormCalc
Renderer                                 Page-to-image, tested on 20,000 PDFs
Language bindings                        Python, Node.js, Java, C
WASM bundles                             Browser-ready, tree-shaken
PDF/A converter                          Production-grade compliance pipeline
Test corpus & oracle DB                  20,000 real-world PDFs + ground truth
Commercial support                       SLA, priority fixes, private Slack

If you want to read the engine source, the GitHub repo is public. If you want to use PDFluent in production, you need a licence key — see pricing.

The team

It's one person for now. I prefer to be upfront about that.

Jasper de Winter
Founder

Spent years processing enterprise PDFs in production. Got tired of the tools. Built something better.

Want to contribute?

PDFluent is built in the open. If you find a bug, have a feature request, or want to submit a fix — open an issue or PR on GitHub. The test corpus is large enough that even a single PDF that exposes a new edge case is genuinely useful.

Contact to request access

Get in touch

Questions about the SDK, a custom integration, enterprise pricing, or something PDFluent doesn't support yet? I reply to everything.