Utilities to operate on lots of PDF files


Turns a directory tree of PDFs into a single bookmarked PDF. Automatically handles the table of contents.

Tested on Linux and Mac.


If you arrange your PDF files in folders like this:

book/01-Table of Contents.pdf
book/02-First Generation/01-Mary Cunningham.pdf
book/02-First Generation/02-Peter Cunningham.pdf
book/02-First Generation/02-:more-notes.pdf
book/03-Second Generation/01-John Mendell Cunningham.pdf

and run:

$ pdfdir-join book

you will find the result in "book.pdf".

The PDF's table of contents will be automatically generated from the filenames:

Table of Contents
First Generation
  Mary Cunningham
  Peter Cunningham
Second Generation
  John Mendell Cunningham

The 01-, 02- prefixes determine the order of the chapters in the final book and don't appear in the bookmarks.

If you don't want a file to appear in the TOC, put a : at the beginning of its name, right after the numeric prefix (02-:more-notes.pdf above). The file's pages are still included in the book; only the bookmark is suppressed.
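
The naming rules above can be sketched in a few lines of Python. This is only an illustration of the convention as described in this README, not the tool's actual code: the function names bookmark_title and outline are hypothetical, and the exact prefix pattern (digits followed by a hyphen) is an assumption inferred from the example tree.

```python
import re

# Assumed convention, inferred from the example tree: an optional numeric
# prefix such as "01-", then the display name. A ":" right after the prefix
# hides the entry from the table of contents.
_PREFIX = re.compile(r"^\d+-")


def bookmark_title(filename):
    """Return the bookmark title for a file or folder name, or None if hidden."""
    name = filename[:-4] if filename.lower().endswith(".pdf") else filename
    name = _PREFIX.sub("", name)
    if name.startswith(":"):
        return None
    return name


def outline(paths):
    """Yield (depth, title) pairs for relative paths under the book root."""
    seen = set()
    for path in sorted(paths):
        parts = path.split("/")
        for depth, part in enumerate(parts):
            key = tuple(parts[: depth + 1])
            if key in seen:
                continue  # this folder was already emitted as a chapter
            seen.add(key)
            title = bookmark_title(part)
            if title is not None:
                yield depth, title
```

Feeding it the five example paths (relative to book/) yields the same six entries as the table of contents shown above, with depth 0 for chapters and depth 1 for the files inside them.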


Requires Ghostscript:

macOS: brew install ghostscript
Linux: apt-get install ghostscript

Ruby is also required. Hopefully this is temporary.

Verify PDFs

This package also includes some tools to help assemble the input files. This will find corrupt PDFs:

$ pdfdir-verify book

It uses Ghostscript to process every page of every PDF file, which is very slow. You can specify --quick for a roughly 10x speedup, at the risk of missing some obscure corruption.
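
For a sense of what a Ghostscript-based check can look like, here is a minimal Python sketch. It is not how pdfdir-verify is implemented (this README does not say which options it passes); it simply asks Ghostscript to interpret every page and discard the output. The helper names gs_check_command and pdf_is_readable are hypothetical, and treating any stderr output as a failure is an assumption that may be stricter than you want.

```python
import subprocess


def gs_check_command(pdf_path):
    """Build a Ghostscript command that interprets every page but writes nothing."""
    return [
        "gs", "-q", "-dNOPAUSE", "-dBATCH",
        "-dPDFSTOPONERROR",   # stop on PDF errors instead of trying to recover
        "-sDEVICE=nullpage",  # process pages without producing output
        pdf_path,
    ]


def pdf_is_readable(pdf_path):
    """Return True if Ghostscript processes the file without complaint."""
    result = subprocess.run(
        gs_check_command(pdf_path), capture_output=True, text=True
    )
    # Assumption: a non-zero exit code or anything on stderr counts as corrupt.
    return result.returncode == 0 and not result.stderr.strip()
```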

Re-encode PDFs

If you're having trouble with encrypted or corrupt PDFs, try using pdfdir-copy to duplicate your entire directory structure. It takes a while, but because it re-encodes each PDF, the resulting files should be valid.

$ pdfdir-copy book /tmp/book-fixed