Merging PDFs on Mac OS X from a non-duplex scanner

Goal: scan hundreds of duplex documents on a non-duplex scanner and combine them into one PDF in an automated way. Status: it was harder than it should have been, and it's not that automated, but this works.

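Scan the papers as PDFs using your scanner's paper feeder. Scan them right side up, then flip the stack over and scan the other sides. The two PDFs will contain the odd pages (1, 3, 5…) and the even pages. Because the stack was flipped, the even pages come out last page first (…6, 4, 2), which is why the next step is needed.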
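To make the page order concrete, here is a small model of the two scanning passes in plain Ruby, with no PDF tools involved. It assumes that flipping the whole stack feeds the back sides last sheet first; the method names scan_passes and collate are made up for this illustration.

```ruby
# Model of a duplex document scanned on a single-sided feeder.
# Assumption: flipping the whole stack feeds the back sides last sheet first.
def scan_passes(page_count)
  pages  = (1..page_count).to_a
  fronts = pages.select(&:odd?)           # first pass, in reading order
  backs  = pages.select(&:even?).reverse  # second pass, in reverse order
  [fronts, backs]
end

# Reverse the backs, then take pages alternately, starting with the fronts.
# compact drops the gap left when the last sheet has no back side.
def collate(fronts, backs)
  fronts.zip(backs.reverse).flatten.compact
end

fronts, backs = scan_passes(6)
p fronts                  # => [1, 3, 5]
p backs                   # => [6, 4, 2]
p collate(fronts, backs)  # => [1, 2, 3, 4, 5, 6]
```

The two scripts in this post do the same thing to real files: one reverses the second PDF, the other interleaves the result with the first.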

Reverse the even pages.

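if __FILE__ == $0
  puts "Run this on Ubuntu or somewhere pdftk is easy to install (which isn't OS X)."
  if ARGV.length != 1
    puts "Syntax: #{__FILE__} pdf_to_reverse.pdf"
    exit 1
  end
  pdf = ARGV[0]
  # Anchor the match so only the trailing extension is replaced.
  reversed_pdf = pdf.sub(/\.pdf\z/i, "_reversed.pdf")
  # pdfinfo prints a line like "Pages:          12"; pull out the number.
  page_count = `pdfinfo "#{pdf}"`[/^Pages:\s+(\d+)/, 1].to_i
  # A descending range (N-1) makes pdftk emit the pages in reverse order.
  `pdftk "#{pdf}" cat #{page_count}-1 output "#{reversed_pdf}"`
end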
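pdftk's page ranges also accept the keyword end for the last page, so a range of end-1 should reverse a document without asking pdfinfo for the page count. A minimal sketch under that assumption; the helper name reverse_command is made up for illustration, and passing the arguments to system as a list avoids shell quoting trouble with file names that contain spaces.

```ruby
# Hypothetical helper: builds the pdftk argument list for reversing a PDF.
# Relies on pdftk's "end" keyword in page ranges ("end-1" = last page down to first).
def reverse_command(pdf)
  reversed_pdf = pdf.sub(/\.pdf\z/i, "") + "_reversed.pdf"
  ["pdftk", pdf, "cat", "end-1", "output", reversed_pdf]
end

# Only runs pdftk when exactly one file name is given on the command line.
if __FILE__ == $0 && ARGV.length == 1
  system(*reverse_command(ARGV[0])) or abort "pdftk failed"
end
```

I have not compared the output of the two variants page by page, so treat this as a shortcut to try rather than a drop-in replacement.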

Lastly, combine the two PDFs, interleaving the pages and starting with the odds. Note that the reversing script depends on pdftk and pdfinfo (which are excruciatingly difficult to install on OS X), while the merging script depends on OS X itself.

if __FILE__ == $0
  puts "Run this on OS X to shuffle two PDFs, where the
        even pages are already reversed (reverse them with the other script)."
  if ARGV.length != 3
    puts "Syntax: #{__FILE__} odds.pdf reversed_evens.pdf output.pdf"
    exit 1
  end
  odds_pdf = ARGV[0]
  reversed_evens_pdf = ARGV[1]
  output_pdf = ARGV[2]
  # Obviously, this only works on OS X. I didn't see an easy way to interleave PDFs
  # in pdftk or the other tools I searched for.
  `python '/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py' --output '#{output_pdf}' --shuffle '#{odds_pdf}' '#{reversed_evens_pdf}'`
end
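The backtick call above breaks if a file name contains a single quote. One way around that, sketched here, is to hand the arguments to system as a list so no shell quoting is involved. The path and the --output and --shuffle flags are the ones used in the script above; the helper names join_py_path and join_command are made up for illustration.

```ruby
# Location of Automator's join.py, as used in the script above.
def join_py_path
  "/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py"
end

# Hypothetical helper: builds the argument list for the join.py call.
def join_command(odds_pdf, reversed_evens_pdf, output_pdf)
  ["python", join_py_path, "--output", output_pdf, "--shuffle", odds_pdf, reversed_evens_pdf]
end

# Only runs when all three file names are given on the command line.
if __FILE__ == $0 && ARGV.length == 3
  abort "join.py not found (is this OS X?)" unless File.exist?(join_py_path)
  system(*join_command(*ARGV)) or abort "join.py failed"
end
```

Called as: ruby shuffle.rb odds.pdf reversed_evens.pdf output.pdf (the script name shuffle.rb is just an example).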


  • pdftk – the PDF toolkit. I could have installed it on OS X with port install pdftk, but that has a very long build dependency on gcj.
  • Another technique, using Automator, which would work if you didn’t need to reverse pages. And one without Automator (which is what I do, with a script directly).
  1. Just thought I might shoot you a quick blog post I’d written while trying to do a very similar task. Using Automator makes it integrate with Finder a bit better, and it’s a bit easier.


    Scott O'Brien

    24 Apr 12 at 10:44 pm

