
Extract embedded file attachments from a PDF in Rust

List and extract all embedded file streams from a PDF document. Save attachments to disk or read them directly as byte buffers.

rust
use pdfluent::PdfDocument;
use std::fs;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let doc = PdfDocument::open("invoice_with_attachment.pdf")?;

    // fs::write does not create missing directories, so make sure "output" exists.
    fs::create_dir_all("output")?;

    for attachment in doc.attachments() {
        let filename = attachment.filename();
        // read_data() returns the decompressed bytes of the embedded file.
        let data = attachment.read_data()?;
        fs::write(format!("output/{}", filename), &data)?;
        println!("Extracted {} ({} bytes)", filename, data.len());
    }
    Ok(())
}
Install: cargo add pdfluent

Step by step

1

Add PDFluent to your project

Add the pdfluent crate to Cargo.toml.

toml
[dependencies]
pdfluent = "0.9"
2

Open the PDF

Open the document. An immutable binding is enough, because reading attachments does not modify the document.

rust
use pdfluent::PdfDocument;

let doc = PdfDocument::open("package.pdf")?;
3

List all attachments

Call doc.attachments() to get attachment metadata. No file data is read at this point.

rust
let attachments = doc.attachments();
println!("Found {} attachment(s):", attachments.len());

for att in &attachments {
    println!(
        "  {} - {} - {} bytes",
        att.filename(),
        att.mime_type().unwrap_or("unknown"),
        att.size(),
    );
}
4

Extract a specific attachment by name

Find the attachment you want by filename and extract its bytes. This example picks the first attachment whose name ends in .xml.

rust
let xml_att = doc
    .attachments()
    .into_iter()
    .find(|a| a.filename().ends_with(".xml"));

if let Some(att) = xml_att {
    let data = att.read_data()?;
    std::fs::write("extracted_invoice.xml", &data)?;
    println!("Extracted: {} bytes", data.len());
} else {
    println!("No XML attachment found.");
}
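
The ends_with(".xml") check above is case-sensitive, so an attachment named INVOICE.XML would be skipped. A minimal sketch of a case-insensitive extension check follows. It uses only the standard library, and has_extension is a hypothetical helper name, not part of PDFluent.

```rust
use std::path::Path;

/// Returns true when `filename` has the extension `ext`, ignoring ASCII case.
/// Hypothetical helper for illustration; not a PDFluent API.
fn has_extension(filename: &str, ext: &str) -> bool {
    Path::new(filename)
        .extension()                 // text after the last '.', if any
        .and_then(|e| e.to_str())    // skip names that are not valid UTF-8
        .map_or(false, |e| e.eq_ignore_ascii_case(ext))
}
```

Assuming filename() yields a string slice, the closure in the snippet above could then be written as |a| has_extension(a.filename(), "xml").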
5

Extract all attachments to a directory

Loop over all attachments and save each to a target folder.

rust
use std::fs;
use std::path::Path;

let output_dir = Path::new("extracted_files");
fs::create_dir_all(output_dir)?;

for att in doc.attachments() {
    let dest = output_dir.join(att.filename());
    let data = att.read_data()?;
    fs::write(&dest, &data)?;
    println!("Saved: {}", dest.display());
}
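
Attachment filenames come from the PDF itself, so for untrusted documents a name such as ../../etc/passwd or an absolute path could make join() point outside the target folder. The sketch below shows one way to guard against that by keeping only the final path component. It uses only the standard library, safe_destination is a hypothetical helper name, and whether PDFluent already sanitizes filenames is not stated in this guide.

```rust
use std::path::{Path, PathBuf};

/// Builds a destination inside `output_dir` from an untrusted attachment name.
/// Returns None when the name has no usable final component (for example "" or "..").
/// Hypothetical helper for illustration; not a PDFluent API.
fn safe_destination(output_dir: &Path, untrusted_name: &str) -> Option<PathBuf> {
    // Treat backslashes as separators too, so Windows-style names are handled on Unix.
    let normalized = untrusted_name.replace('\\', "/");
    // Keep only the last component, dropping any directories, "..", or root prefix.
    let base = Path::new(&normalized).file_name()?;
    Some(output_dir.join(base))
}
```

In the loop above, output_dir.join(att.filename()) could be replaced by a call to safe_destination(output_dir, att.filename()), skipping the attachment when it returns None.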

Notes and tips

  • attachments() returns document-level embedded files from the EmbeddedFiles name tree. Page-level file annotations are accessed separately via page.annotations().
  • read_data() decompresses the embedded file stream and returns raw bytes. No temporary files are created.
  • Attachments can be any file type: XML, CSV, XLSX, images, or other PDFs.
  • If the PDF is encrypted, decrypt it first. Otherwise, attachment streams are not accessible.
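
Because read_data() hands back the bytes in memory, an attachment can be inspected before anything is written to disk. As a rough sketch, the function below guesses the file type from well-known leading bytes, which can help when mime_type() is missing. The SniffedKind enum and sniff_kind function are hypothetical names for illustration, not part of PDFluent, and the check covers only a few formats.

```rust
/// Coarse file kinds recognised from the first bytes of a buffer.
/// Hypothetical type for illustration; not a PDFluent API.
#[derive(Debug, PartialEq, Eq)]
enum SniffedKind {
    Pdf,
    Zip, // XLSX and other Office Open XML files are ZIP containers
    Png,
    Xml,
    Unknown,
}

/// Guesses the kind of an attachment from its leading "magic" bytes.
fn sniff_kind(data: &[u8]) -> SniffedKind {
    if data.starts_with(b"%PDF-") {
        SniffedKind::Pdf
    } else if data.starts_with(b"PK\x03\x04") {
        SniffedKind::Zip
    } else if data.starts_with(b"\x89PNG\r\n\x1a\n") {
        SniffedKind::Png
    } else if data.starts_with(b"<?xml") || data.starts_with(b"\xEF\xBB\xBF<?xml") {
        // Second pattern allows for a UTF-8 byte order mark before the declaration.
        SniffedKind::Xml
    } else {
        SniffedKind::Unknown
    }
}
```

A caller could pass the buffer returned by read_data() as sniff_kind(&data) and fall back to the filename extension when the result is Unknown.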

Why PDFluent for this

Pure Rust

No JVM, no runtime, no DLL dependencies. Ships as a single native binary or WASM module.

Memory safe

In safe Rust, the ownership model and bounds checking prevent use-after-free and buffer overflows, the usual sources of segfaults in PDF parsers.

Runs anywhere

Same code runs server-side, in Docker, on AWS Lambda, on Cloudflare Workers, or in the browser via WASM.

Frequently asked questions