Split a PDF at each top-level bookmark

Use the PDF outline to split a document into sections automatically. Each top-level bookmark becomes its own output file.

rust
use pdfluent::PdfDocument;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let doc = PdfDocument::open("manual.pdf")?;

    let results = doc
        .split_by_top_level_bookmarks()
        .write_files("{title}.pdf")?;

    for r in &results {
        println!("{} -> {} pages", r.filename, r.page_count);
    }
    Ok(())
}

Install: cargo add pdfluent

Step by step

1. Open the PDF and inspect its outline

Check that the document has top-level bookmarks before splitting. PDFluent exposes the full outline tree via outline().

rust
use pdfluent::PdfDocument;

let doc = PdfDocument::open("manual.pdf")?;
let outline = doc.outline()?;

println!("Top-level sections: {}", outline.len());
for item in &outline {
    println!("  {} -> page {}", item.title, item.destination_page);
}

2. Split at top-level bookmarks

split_by_top_level_bookmarks() computes the page range for each bookmark automatically. The range ends where the next bookmark starts.

rust
let splitter = doc.split_by_top_level_bookmarks();

3. Write output files using the bookmark title as filename

Use {title} in the pattern to name each file after its bookmark. PDFluent sanitises the title to produce a valid filename.

rust
splitter.write_files("{title}.pdf")?;
// "Introduction.pdf", "Chapter 1.pdf", "Chapter 2.pdf", ...
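
To make the naming step concrete, here is a minimal sketch of how a `{title}` or `{n}` pattern could be expanded into a safe filename. It uses only the standard library. The helpers `sanitize_title` and `expand_pattern` are hypothetical names for illustration, and the replacement rules are an assumption; PDFluent's actual sanitisation may differ.

```rust
/// Hypothetical helper: replace characters that are invalid in filenames on
/// common platforms. These rules are an assumption, not PDFluent's own.
fn sanitize_title(title: &str) -> String {
    let cleaned: String = title
        .chars()
        .map(|c| match c {
            // Reserved on Windows and/or path separators on Unix.
            '/' | '\\' | ':' | '*' | '?' | '"' | '<' | '>' | '|' => '_',
            // Control characters such as tabs or newlines.
            c if c.is_control() => '_',
            c => c,
        })
        .collect();

    // Drop leading whitespace, then trailing dots and whitespace,
    // which Windows does not allow at the end of a filename.
    let trimmed = cleaned
        .trim_start()
        .trim_end_matches(|c: char| c == '.' || c.is_whitespace());

    if trimmed.is_empty() {
        "untitled".to_string()
    } else {
        trimmed.to_string()
    }
}

/// Hypothetical helper: expand `{n}` and `{title}` in a filename pattern.
/// `{n}` is replaced first so a title containing "{n}" is left untouched.
fn expand_pattern(pattern: &str, title: &str, n: usize) -> String {
    pattern
        .replace("{n}", &n.to_string())
        .replace("{title}", &sanitize_title(title))
}

fn main() {
    // prints "Chapter 1_ Setup.pdf"
    println!("{}", expand_pattern("{title}.pdf", "Chapter 1: Setup", 2));
    // prints "section_2.pdf"
    println!("{}", expand_pattern("section_{n}.pdf", "Chapter 1: Setup", 2));
}
```

If two bookmarks share a title, a `{title}`-only pattern would map both to the same filename, so combining it with `{n}` (for example `{n}_{title}.pdf`) is a safer choice in that case.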

4. Split by a specific outline depth

To split at second-level bookmarks instead of the top level, set the depth parameter.

rust
use pdfluent::SplitDepth;

doc.split_by_bookmarks(SplitDepth::Level(2))
    .write_files("section_{n}.pdf")?;
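
To illustrate what "outline depth" means, the following sketch walks a small outline tree and collects the entries found at a given level. `OutlineNode`, `bookmarks_at_depth`, and the sample data are hypothetical stand-ins, not PDFluent types. The sketch also assumes that only entries exactly at the requested depth become split points; how the library treats branches that are shallower than the requested depth is not covered here.

```rust
/// Hypothetical outline entry; not a PDFluent type.
struct OutlineNode {
    title: String,
    page: usize, // 1-based destination page
    children: Vec<OutlineNode>,
}

fn node(title: &str, page: usize, children: Vec<OutlineNode>) -> OutlineNode {
    OutlineNode { title: title.to_string(), page, children }
}

/// Recursively collect (title, page) pairs at `depth`, where 1 is the top level.
fn collect_at_depth<'a>(
    nodes: &'a [OutlineNode],
    depth: usize,
    out: &mut Vec<(&'a str, usize)>,
) {
    if depth == 0 {
        return; // depth 0 is not a valid level in this sketch
    }
    for entry in nodes {
        if depth == 1 {
            out.push((entry.title.as_str(), entry.page));
        } else {
            // Descend one level; entries without children contribute nothing.
            collect_at_depth(&entry.children, depth - 1, out);
        }
    }
}

fn bookmarks_at_depth(nodes: &[OutlineNode], depth: usize) -> Vec<(&str, usize)> {
    let mut out = Vec::new();
    collect_at_depth(nodes, depth, &mut out);
    out
}

/// Sample outline used for illustration only.
fn sample_outline() -> Vec<OutlineNode> {
    vec![
        node("Introduction", 1, vec![]),
        node(
            "Chapter 1",
            3,
            vec![node("1.1 Install", 3, vec![]), node("1.2 Configure", 7, vec![])],
        ),
        node("Chapter 2", 12, vec![node("2.1 Usage", 12, vec![])]),
    ]
}

fn main() {
    let outline = sample_outline();
    for (title, page) in bookmarks_at_depth(&outline, 2) {
        println!("{title} -> page {page}");
    }
}
```

With the sample outline, depth 1 yields the three chapters and depth 2 yields the three subsections; "Introduction" has no children, so it contributes nothing at depth 2 in this sketch.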

5. Collect results for further processing

If you need the split data in memory, use to_vec() to get a Vec of SplitSegment values without writing to disk.

rust
let segments = doc
    .split_by_top_level_bookmarks()
    .to_vec()?;

for seg in segments {
    println!("{}: {} bytes", seg.title, seg.data.len());
    // upload seg.data to S3, etc.
}

Notes and tips

  • If the last bookmark has no following bookmark, its range extends to the last page of the document.
  • Bookmarks that point to the same page as the next bookmark produce a zero-page segment. PDFluent skips these by default.
  • Child bookmarks are included in the parent segment, not extracted separately, unless you use SplitDepth::Level(n).
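
The range rules above can be sketched in a few lines of standard-library Rust. This is an illustration of the logic, not PDFluent's implementation: `Bookmark`, `PageRange`, and `page_ranges` are hypothetical names, pages are assumed to be 1-based, and the bookmarks are assumed to be sorted by start page. What happens to pages before the first bookmark is not specified here, so the sketch leaves them out.

```rust
/// Hypothetical input: a bookmark reduced to its title and 1-based start page.
struct Bookmark {
    title: String,
    start_page: usize,
}

/// Hypothetical output: an inclusive, 1-based page range for one segment.
#[derive(Debug, PartialEq)]
struct PageRange {
    title: String,
    first: usize,
    last: usize,
}

fn bookmark(title: &str, start_page: usize) -> Bookmark {
    Bookmark { title: title.to_string(), start_page }
}

/// Compute one range per bookmark. Assumes `bookmarks` is sorted by start page.
fn page_ranges(bookmarks: &[Bookmark], total_pages: usize) -> Vec<PageRange> {
    let mut ranges = Vec::new();
    for (i, bm) in bookmarks.iter().enumerate() {
        // Exclusive end: the next bookmark's start page, or one past the
        // last page of the document when there is no next bookmark.
        let end = bookmarks
            .get(i + 1)
            .map(|next| next.start_page)
            .unwrap_or(total_pages + 1);

        // A bookmark on the same page as its successor would yield
        // a zero-page segment, so it is skipped.
        if end <= bm.start_page {
            continue;
        }

        ranges.push(PageRange {
            title: bm.title.clone(),
            first: bm.start_page,
            last: end - 1,
        });
    }
    ranges
}

fn main() {
    let bookmarks = vec![
        bookmark("Introduction", 1),
        bookmark("Chapter 1", 3),
        bookmark("Empty", 8), // same page as the next bookmark
        bookmark("Chapter 2", 8),
    ];
    for r in page_ranges(&bookmarks, 12) {
        println!("{}: pages {}-{}", r.title, r.first, r.last);
    }
}
```

For a 12-page document, the sample input produces three segments: "Introduction" covers pages 1-2, "Chapter 1" covers pages 3-7, and "Chapter 2" covers pages 8-12. The "Empty" bookmark is dropped because it starts on the same page as its successor.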

Why PDFluent for this

Pure Rust

No JVM, no runtime, no DLL dependencies. Ships as a single native binary or WASM module.

Memory safe

Rust's ownership model and bounds checking prevent use-after-free and buffer overflows in safe code, the usual causes of segfaults in PDF parsers.

Runs anywhere

Same code runs server-side, in Docker, on AWS Lambda, on Cloudflare Workers, or in the browser via WASM.

Frequently asked questions