Error: failed to parse PDF structure at byte offset 0
This error means PDFluent could not parse the file as a valid PDF. Common causes are passing the wrong file, a truncated download, or a corrupted cross-reference table.
The file does not start with the PDF magic bytes (%PDF-). Common cases: an HTML error page saved with a .pdf extension, a ZIP archive, or a file path that resolves to an empty or placeholder file.
PDFs downloaded over HTTP without Content-Length validation can be incomplete. A valid PDF must end with a %%EOF marker and a complete cross-reference section. If the transfer cut off mid-file, the xref table is missing and PDFluent cannot locate the document catalog.
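A truncated download can often be spotted before parsing by looking for the %%EOF marker near the end of the file. The sketch below uses only the Rust standard library; the helper name `has_eof_marker` and the idea of passing a search window are illustrative assumptions, not part of PDFluent.

```rust
use std::fs::File;
use std::io::{Read, Seek, SeekFrom};

/// Returns true if `%%EOF` appears within the last `window` bytes of the file.
/// Hypothetical helper for illustration; not a PDFluent API.
fn has_eof_marker(path: &str, window: u64) -> std::io::Result<bool> {
    let mut f = File::open(path)?;
    let len = f.metadata()?.len();
    // Read at most `window` bytes from the end of the file.
    let start = len.saturating_sub(window);
    f.seek(SeekFrom::Start(start))?;
    let mut tail = Vec::new();
    f.read_to_end(&mut tail)?;
    // Search the tail rather than requiring an exact suffix,
    // because whitespace or newlines may follow the marker.
    Ok(tail.windows(5).any(|w| w == b"%%EOF"))
}
```

Calling `has_eof_marker("upload.pdf", 1024)` checks the final kilobyte; the window size is a judgment call. A missing marker strongly suggests truncation, but a present marker does not prove the file is complete or well formed.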
The xref table maps object numbers to byte offsets in the file. If a buggy tool modified the file and left the offsets wrong, PDFluent cannot locate objects and returns this error. The PDF may still be recoverable with repair mode.
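To see whether a file's xref pointer is plausible, one can read the offset recorded after the final `startxref` keyword and check what sits at that position. This is a rough diagnostic sketch using only the standard library. Both function names are hypothetical, and it only recognizes classic `xref` tables, so files that use cross-reference streams will be reported as not matching even when they are fine.

```rust
/// Returns the byte offset written after the last `startxref` keyword, if any.
/// Hypothetical helper for illustration; not a PDFluent API.
fn startxref_offset(data: &[u8]) -> Option<usize> {
    const KEY: &[u8] = b"startxref";
    // Use the last occurrence: incremental updates append newer trailers.
    let pos = data.windows(KEY.len()).rposition(|w| w == KEY)?;
    let rest = &data[pos + KEY.len()..];
    // Skip whitespace, then collect the decimal digits of the offset.
    let digits: Vec<u8> = rest
        .iter()
        .copied()
        .skip_while(|b| b.is_ascii_whitespace())
        .take_while(|b| b.is_ascii_digit())
        .collect();
    std::str::from_utf8(&digits).ok()?.parse().ok()
}

/// Returns true if the recorded offset points at a classic `xref` keyword.
/// Cross-reference streams are deliberately not handled in this sketch.
fn points_at_classic_xref(data: &[u8]) -> bool {
    startxref_offset(data)
        .and_then(|off| data.get(off..))
        .map_or(false, |s| s.starts_with(b"xref"))
}
```

With the file loaded through `std::fs::read`, a `false` result from `points_at_classic_xref` on a file known to use a classic table suggests the offsets were damaged and that repair mode is worth trying.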
Before calling Document::open, verify the file starts with %PDF-. This catches the wrong-file-type case immediately without a more expensive parse attempt.
use std::io::Read;

// Returns Ok(true) when the file begins with the PDF magic bytes.
// A file shorter than 5 bytes makes read_exact fail with UnexpectedEof.
fn is_pdf(path: &str) -> std::io::Result<bool> {
    let mut buf = [0u8; 5];
    let mut f = std::fs::File::open(path)?;
    f.read_exact(&mut buf)?;
    Ok(&buf == b"%PDF-")
}

if !is_pdf("upload.pdf")? {
    eprintln!("File is not a PDF");
    return Ok(());
}
Document::open_repair() attempts to rebuild the xref table by scanning the file for object markers. It is slower but can recover many files with corrupted or missing xref sections.
use pdfluent::Document;
// Try normal open first; fall back to repair if it fails
let doc = Document::open("damaged.pdf")
    .or_else(|_| Document::open_repair("damaged.pdf"))?;
println!("Recovered {} pages", doc.page_count());
Check that the path is correct and that the file has non-zero size. A common mistake in server code is writing to a temp path that resolves differently at runtime.
use std::fs;
use pdfluent::Document;
let meta = fs::metadata("document.pdf")?;
if meta.len() == 0 {
    return Err("PDF file is empty".into());
}
let doc = Document::open("document.pdf")?;
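When the suspicion is that a path resolves differently at runtime, logging the resolved absolute path alongside the file size can make the mismatch visible. The following is a minimal sketch with the standard library only; `resolve_and_size` is a hypothetical helper name, not something PDFluent provides.

```rust
use std::fs;
use std::path::PathBuf;

/// Resolves `path` to an absolute path and returns it with the file size in bytes.
/// Hypothetical helper for illustration; not a PDFluent API.
fn resolve_and_size(path: &str) -> std::io::Result<(PathBuf, u64)> {
    // canonicalize follows symlinks and fails if the path does not exist,
    // which is itself a useful signal when debugging.
    let resolved = fs::canonicalize(path)?;
    let size = fs::metadata(&resolved)?.len();
    Ok((resolved, size))
}
```

Logging the returned pair just before opening the document shows which file the process actually sees and whether it is empty.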