Use PDFluent's recovery parser to rebuild the cross-reference table and salvage as many objects as possible from a damaged file.
use pdfluent::{Document, OpenOptions};

fn main() -> pdfluent::Result<()> {
    // Open with the recovery parser enabled.
    let doc = Document::open_with_options(
        "corrupted.pdf",
        OpenOptions::default().recovery_mode(true),
    )?;

    // Summarize what the recovery pass managed to salvage.
    let report = doc.repair_report();
    println!("Objects recovered: {}", report.objects_recovered());
    println!("Xref rebuilt: {}", report.xref_rebuilt());

    doc.save("repaired.pdf")?;
    Ok(())
}

Standard parsing is faster than recovery mode, so try it first. Fall back to recovery mode only if the standard open returns an error.
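The fallback decision can also be sketched on its own, independent of any PDF library. The example below is a minimal illustration with stand-in types: `OpenError`, `OpenedVia`, and `open_with_fallback` are hypothetical names invented for this sketch and are not part of PDFluent. It retries only when the first attempt fails with a structural error and passes every other error through unchanged.

```rust
/// Stand-in error type for illustration only; this is NOT PDFluent's `Error`.
#[derive(Debug, PartialEq)]
enum OpenError {
    BrokenXref,
    UnexpectedEof,
    NotFound,
}

impl OpenError {
    /// Structural damage is worth a recovery attempt; other failures are not.
    fn is_structural(&self) -> bool {
        matches!(self, OpenError::BrokenXref | OpenError::UnexpectedEof)
    }
}

/// Records which path produced the document (hypothetical helper type).
#[derive(Debug, PartialEq)]
enum OpenedVia {
    Standard,
    Recovery,
}

/// Runs `standard` first and calls `recovery` only after a structural error.
fn open_with_fallback<T>(
    standard: impl FnOnce() -> Result<T, OpenError>,
    recovery: impl FnOnce() -> Result<T, OpenError>,
) -> Result<(T, OpenedVia), OpenError> {
    match standard() {
        Ok(doc) => Ok((doc, OpenedVia::Standard)),
        Err(e) if e.is_structural() => {
            recovery().map(|doc| (doc, OpenedVia::Recovery))
        }
        // Unrelated failures (a missing file, for instance) are returned as-is.
        Err(e) => Err(e),
    }
}
```

Because the slower path is passed as a closure, it never runs when the fast path succeeds. With PDFluent's own types, the first half of this pattern looks like this: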
use pdfluent::{Document, Error};

// Fragment: assumes an enclosing function that returns pdfluent::Result<_>.
let result = Document::open("damaged.pdf");
match result {
    Ok(_doc) => println!("Opened normally."),
    // Structural damage: worth retrying in recovery mode.
    Err(Error::BrokenXref | Error::UnexpectedEof | Error::InvalidStructure(_)) => {
        println!("Standard open failed, trying recovery mode...");
    }
    // Any other error is propagated unchanged.
    Err(e) => return Err(e),
}

Recovery mode scans the file linearly to find all PDF objects instead of relying on the cross-reference table.
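To make the idea of a linear scan concrete, here is a minimal, library-independent sketch. It is not PDFluent's implementation: `scan_objects` and its helpers are hypothetical names written for this illustration. The sketch walks the raw bytes looking for object headers of the form `<number> <generation> obj` and records the byte offset of each one. A production recovery parser has to do more, for example rejecting false matches inside stream data and handling objects stored in compressed object streams.

```rust
use std::collections::BTreeMap;

/// Byte offset of each object header, keyed by (object number, generation).
type XrefTable = BTreeMap<(u32, u16), usize>;

/// The whitespace characters defined by the PDF file format.
fn is_pdf_whitespace(b: u8) -> bool {
    matches!(b, b' ' | b'\t' | b'\r' | b'\n' | 0x0C | 0x00)
}

/// Walks backwards from `end` over ASCII digits.
/// Returns (start index, parsed value), or None if no digits precede `end`.
fn digits_before(data: &[u8], end: usize) -> Option<(usize, u64)> {
    let mut start = end;
    while start > 0 && data[start - 1].is_ascii_digit() {
        start -= 1;
    }
    if start == end {
        return None;
    }
    let text = std::str::from_utf8(&data[start..end]).ok()?;
    Some((start, text.parse().ok()?))
}

/// Walks backwards from `end` over whitespace; None if there is none.
fn whitespace_before(data: &[u8], end: usize) -> Option<usize> {
    let mut start = end;
    while start > 0 && is_pdf_whitespace(data[start - 1]) {
        start -= 1;
    }
    if start == end { None } else { Some(start) }
}

/// Tries to read "<number> <generation> obj" ending at the keyword at `kw`.
fn parse_header(data: &[u8], kw: usize) -> Option<((u32, u16), usize)> {
    // Reject longer words such as "object".
    if data.get(kw + 3).map_or(false, |b| b.is_ascii_alphanumeric()) {
        return None;
    }
    // "endobj" fails here because 'd' is not whitespace.
    let gen_end = whitespace_before(data, kw)?;
    let (gen_start, generation) = digits_before(data, gen_end)?;
    let num_end = whitespace_before(data, gen_start)?;
    let (num_start, number) = digits_before(data, num_end)?;
    // The object number must start the file or follow whitespace.
    if num_start > 0 && !is_pdf_whitespace(data[num_start - 1]) {
        return None;
    }
    let key = (u32::try_from(number).ok()?, u16::try_from(generation).ok()?);
    Some((key, num_start))
}

/// Scans the whole buffer and returns the offsets of all object headers found.
fn scan_objects(data: &[u8]) -> XrefTable {
    let mut table = XrefTable::new();
    let mut i = 0;
    while i + 3 <= data.len() {
        if &data[i..i + 3] == b"obj" {
            if let Some((key, offset)) = parse_header(data, i) {
                // A later definition replaces an earlier one, matching how
                // incremental updates append newer versions of an object.
                table.insert(key, offset);
            }
            i += 3;
        } else {
            i += 1;
        }
    }
    table
}
```

Calling `scan_objects` on the raw bytes of a file (read with `std::fs::read`, for example) yields a map from each `(object number, generation)` pair to its byte offset, which is the information a rebuilt cross-reference table needs. With PDFluent none of this is written by hand; recovery mode is enabled through `OpenOptions`: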
use pdfluent::{Document, OpenOptions};

let doc = Document::open_with_options(
    "damaged.pdf",
    OpenOptions::default().recovery_mode(true),
)?;

The repair report describes what was found and what could not be recovered.
let report = doc.repair_report();
println!("Objects recovered: {}", report.objects_recovered());
println!("Objects missing: {}", report.objects_missing());
println!("Xref rebuilt: {}", report.xref_rebuilt());
println!("Truncated at byte: {:?}", report.truncated_at());

Check that the expected pages are present. Some pages may be unrecoverable if their stream data was overwritten.
println!("Pages recovered: {}", doc.page_count());
for (i, page) in doc.pages().enumerate() {
    // Treat a page whose text cannot be extracted as empty.
    let text = page.extract_text().unwrap_or_default();
    // chars().count() counts characters; len() would count UTF-8 bytes.
    println!("Page {}: {} chars", i + 1, text.chars().count());
}

Write the repaired document. The output is a structurally valid PDF even if some content was lost.
doc.save("repaired.pdf")?;
println!("Saved repaired.pdf");

No JVM, no runtime, no DLL dependencies. Ships as a single native binary or WASM module.
Rust's ownership model and bounds checks prevent buffer overflows and use-after-free in safe code, which guards PDF parsing against segfaults on malformed input.
Same code runs server-side, in Docker, on AWS Lambda, on Cloudflare Workers, or in the browser via WASM.