How do I check if a PDF contains an XFA form?

doc.xfa().is_some() returns true if an /XFA key is present in the AcroForm dictionary.

Can I extract XFA data from a flattened PDF?

No. Flattening converts XFA content to static PDF graphics. The XFA data packet is removed during flattening.

How do I handle repeated subforms (table rows)?

Repeated subforms appear as sibling nodes with the same name. Use datasets.fields_at_path() to get a Vec of all instances.
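
To make the "sibling nodes with the same name" idea concrete, here is a minimal standard-library sketch. DataNode, instances_of, and sample_table are hypothetical names invented for this illustration; they are not PDFluent types, and the library's own fields_at_path() may behave differently.

```rust
/// Hypothetical stand-in for a parsed data node; not a PDFluent type.
#[derive(Debug, Clone, PartialEq)]
struct DataNode {
    name: String,
    text: Option<String>,
    children: Vec<DataNode>,
}

impl DataNode {
    fn leaf(name: &str, text: &str) -> Self {
        DataNode { name: name.to_string(), text: Some(text.to_string()), children: Vec::new() }
    }

    fn branch(name: &str, children: Vec<DataNode>) -> Self {
        DataNode { name: name.to_string(), text: None, children }
    }
}

/// Collect every direct child of `parent` named `name`, in document order.
fn instances_of<'a>(parent: &'a DataNode, name: &str) -> Vec<&'a DataNode> {
    parent.children.iter().filter(|child| child.name == name).collect()
}

/// Build a small tree with two `row` siblings, mimicking a two-row table.
fn sample_table() -> DataNode {
    DataNode::branch("table1", vec![
        DataNode::leaf("title", "Order lines"),
        DataNode::branch("row", vec![DataNode::leaf("item", "Widget")]),
        DataNode::branch("row", vec![DataNode::leaf("item", "Gadget")]),
    ])
}
```

Calling instances_of(&sample_table(), "row") yields one entry per table row, and the differently named title sibling is skipped.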

What is the difference between xfa:data and xfa:datasets?

xfa:datasets is the top-level packet wrapper. xfa:data is the child element inside it that contains the actual field data. PDFluent's datasets() returns the xfa:data subtree.
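
The nesting can be illustrated with a tiny sample packet. The sketch below uses naive string slicing from the standard library to pull out the content of xfa:data. It assumes an xfa:data start tag without attributes. inner_data and SAMPLE are hypothetical names for this illustration only, and a real XML parser is the better choice for production input.

```rust
/// Sample packet: `xfa:datasets` wraps `xfa:data`, which holds the field data.
const SAMPLE: &str = r#"<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
  <xfa:data>
    <form1><subform1><firstName>Ada</firstName></subform1></form1>
  </xfa:data>
</xfa:datasets>"#;

/// Return the trimmed text between `<xfa:data>` and `</xfa:data>`,
/// or None if either tag is missing.
/// Naive slicing for illustration; it does not handle attributes or nesting.
fn inner_data(datasets_xml: &str) -> Option<&str> {
    const OPEN: &str = "<xfa:data>";
    const CLOSE: &str = "</xfa:data>";
    // Index of the first byte after the opening tag.
    let start = datasets_xml.find(OPEN)? + OPEN.len();
    // Index of the closing tag, searched only after the opening tag.
    let end = datasets_xml[start..].find(CLOSE)? + start;
    Some(datasets_xml[start..end].trim())
}
```

Applied to SAMPLE, inner_data returns only the form1 subtree, which is the part of the packet that carries field values.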

Extract data from an XFA PDF form in Rust

Read the XFA data packet from a dynamic XFA form and access field values as typed Rust data.

rust

use pdfluent::PdfDocument;

fn main() -> pdfluent::Result<()> {
    let doc = PdfDocument::open("form.pdf")?;

    let xfa = doc.xfa().ok_or(pdfluent::Error::NoXfa)?;
    let datasets = xfa.datasets()?;

    // Read a field value by its fully-qualified name
    let first_name = datasets.field_value("form1.subform1.firstName")?;
    println!("First name: {}", first_name.as_str().unwrap_or(""));

    Ok(())
}

Install: cargo add pdfluent

Step by step

Open the PDF and access the XFA root

doc.xfa() returns an Option<XfaDocument>. If the PDF does not contain an XFA structure, it returns None.

rust

let doc = PdfDocument::open("form.pdf")?;
let xfa = doc.xfa().ok_or(pdfluent::Error::NoXfa)?;

Access the datasets packet

XFA forms store their field data in the xfa:datasets XML packet. xfa.datasets() parses that XML into a queryable tree.

rust

let datasets = xfa.datasets()?;

Read a field value by name

Field names are dot-separated paths from the root node. The path mirrors the XFA form template hierarchy.

rust

let val = datasets.field_value("form1.subform1.firstName")?;
match val {
    pdfluent::xfa::FieldType::Str(s)  => println!("string: {}", s),
    pdfluent::xfa::FieldType::Date(d) => println!("date: {:?}", d),
    pdfluent::xfa::FieldType::Num(n)  => println!("number: {}", n),
    pdfluent::xfa::FieldType::Empty   => println!("(empty)"),
}
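
How a dot-separated path maps onto a tree can be sketched without the library. The example below is an assumption-laden illustration: Node, resolve, and sample_form are hypothetical names, and PDFluent's actual lookup may differ, for example in how it treats repeated names.

```rust
/// Hypothetical data-tree node used only for this sketch; not a PDFluent type.
#[derive(Debug)]
struct Node {
    name: &'static str,
    value: Option<&'static str>,
    children: Vec<Node>,
}

/// Walk `path` segment by segment. The first segment must match the root's
/// name; each later segment selects the first child with that name.
fn resolve<'a>(root: &'a Node, path: &str) -> Option<&'a Node> {
    let mut segments = path.split('.');
    if segments.next()? != root.name {
        return None;
    }
    let mut current = root;
    for segment in segments {
        current = current.children.iter().find(|c| c.name == segment)?;
    }
    Some(current)
}

/// Build form1 > subform1 > firstName = "Ada".
fn sample_form() -> Node {
    Node {
        name: "form1",
        value: None,
        children: vec![Node {
            name: "subform1",
            value: None,
            children: vec![Node { name: "firstName", value: Some("Ada"), children: vec![] }],
        }],
    }
}
```

With this tree, resolve(&sample_form(), "form1.subform1.firstName") reaches the firstName node, while a path with an unknown segment or a different root name yields None.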

Iterate all field nodes

Use datasets.fields() to get every leaf node in the data tree.

rust

for field in datasets.fields() {
    println!("{} = {:?}", field.path(), field.value());
}
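
For intuition about what "every leaf node" means, the following sketch performs a depth-first walk over a hand-built tree and records each leaf's dotted path. Node, collect_leaves, leaves, and sample_form are hypothetical names for this illustration, not part of PDFluent, and the order in which the library yields fields is not confirmed here.

```rust
/// Hypothetical data-tree node used only for this sketch; not a PDFluent type.
struct Node {
    name: String,
    value: Option<String>,
    children: Vec<Node>,
}

impl Node {
    fn leaf(name: &str, value: &str) -> Node {
        Node { name: name.into(), value: Some(value.into()), children: Vec::new() }
    }

    fn branch(name: &str, children: Vec<Node>) -> Node {
        Node { name: name.into(), value: None, children }
    }
}

/// Depth-first walk that records (dotted path, value) for nodes without children.
fn collect_leaves(node: &Node, prefix: &str, out: &mut Vec<(String, Option<String>)>) {
    let path = if prefix.is_empty() {
        node.name.clone()
    } else {
        format!("{}.{}", prefix, node.name)
    };
    if node.children.is_empty() {
        out.push((path, node.value.clone()));
    } else {
        for child in &node.children {
            collect_leaves(child, &path, out);
        }
    }
}

/// Convenience wrapper that starts the walk at the root with an empty prefix.
fn leaves(root: &Node) -> Vec<(String, Option<String>)> {
    let mut out = Vec::new();
    collect_leaves(root, "", &mut out);
    out
}

/// Build form1 > subform1 > { firstName, lastName }.
fn sample_form() -> Node {
    Node::branch("form1", vec![Node::branch("subform1", vec![
        Node::leaf("firstName", "Ada"),
        Node::leaf("lastName", "Lovelace"),
    ])])
}
```

Note that in this sketch a branch with no children counts as a leaf with no value, which is one possible design choice among several.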

Export the raw datasets XML

If you need the raw XML for custom processing, access the bytes directly.

rust

let xml_bytes = datasets.to_xml_bytes()?;
std::fs::write("form_data.xml", &xml_bytes)?;

Notes and tips

XFA forms come in two variants: static XFA (fixed layout) and dynamic XFA (auto-layout). Both use the same data model. PDFluent parses both.
The XFA template and datasets are separate XML streams. Modifying datasets without updating the template rendering may produce inconsistent results.
Adobe Acrobat Reader is the primary renderer for dynamic XFA. Most other viewers, including Foxit and Chrome's built-in PDF viewer, do not fully support dynamic XFA.
XFA is deprecated in PDF 2.0. New forms should use AcroForm instead. PDFluent supports both for reading existing documents.

Why PDFluent for this

Pure Rust

No JVM, no runtime, no DLL dependencies. Ships as a single native binary or WASM module.

Memory safe

Rust's ownership model prevents buffer overflows and use-after-free in safe code, which removes a common source of segfaults in PDF parsing.

Runs anywhere

Same code runs server-side, in Docker, on AWS Lambda, on Cloudflare Workers, or in the browser via WASM.

Frequently asked questions
