JAW Speak

Jonathan Andrew Wolter

Merging pdf’s on Mac OS X from a non-duplex scanner

Goal: scan hundreds of double-sided documents on a non-duplex scanner and combine them into one PDF in an automated way. Status: it was harder than it should have been, and not that automated, but this works.

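Scan in the papers as PDFs from your paper-feed equipped scanner. Scan them right side up, then flip the stack over and scan the other sides. The two PDFs will contain the odd pages (1, 3, 5…) and the even pages (2, 4, 6…). Because the stack was flipped, the even pages come out in reverse order, last one first.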
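
To make the page order concrete, here is a minimal sketch that simulates the two scans. It assumes the flipped stack feeds the last sheet first (which depends on the scanner) and that every sheet has two sides; scan_orders is a hypothetical helper name, not part of the scripts in this post.

```ruby
# Simulate which page numbers land in each scanned PDF, and in what order.
# Assumes page_count is even (every sheet has a front and a back) and that
# the flipped stack feeds the last sheet first.
def scan_orders(page_count)
  odds  = (1..page_count).select(&:odd?)           # first pass: fronts, top sheet first
  evens = (1..page_count).select(&:even?).reverse  # second pass: backs, last sheet first
  [odds, evens]
end

odds, evens = scan_orders(6)
puts "first scan:  #{odds.join(', ')}"   # 1, 3, 5
puts "second scan: #{evens.join(', ')}"  # 6, 4, 2
```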

Reverse the even pages.

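#!/usr/bin/ruby
require 'shellwords'
 
if __FILE__ == $0
  puts "Run this on Ubuntu or somewhere pdftk is easy to install (which isn't OS X)."
 
  if ARGV.length != 1
    puts "Syntax: #{__FILE__} pdf_to_reverse.pdf"
    exit 1
  end
 
  pdf = ARGV[0]
  # Only rewrite the extension at the very end of the filename.
  reversed_pdf = pdf.sub(/\.pdf\z/i, "_reversed.pdf")
 
  # pdfinfo prints a line like "Pages: 12"; take the first number on it.
  # (scan returns an array, which does not interpolate as a bare number.)
  page_count = `pdfinfo #{pdf.shellescape} | grep Pages`[/\d+/]
 
  # A descending page range (N-1) makes pdftk emit the pages in reverse order.
  `pdftk #{pdf.shellescape} cat #{page_count}-1 output #{reversed_pdf.shellescape}`
end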
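
The page-count lookup is the fragile part of that script, so it may help to see it isolated. This is a sketch only: parse_page_count is a hypothetical helper, and the sample text is an illustrative imitation of pdfinfo's "Pages:" line rather than captured output.

```ruby
# Pull the page count out of pdfinfo-style text.
# Returns an Integer, or nil when no "Pages:" line is present.
def parse_page_count(pdfinfo_output)
  line = pdfinfo_output.lines.find { |l| l.start_with?("Pages:") }
  return nil unless line
  line[/\d+/].to_i
end

# Illustrative sample, not real pdfinfo output.
sample = "Title:          scan\nPages:          12\nEncrypted:      no\n"
puts parse_page_count(sample)
```

Returning nil for a missing line lets the caller stop before handing pdftk a malformed page range.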

Lastly, combine the two PDFs, shuffling every other page, starting with the odds. Note that the reversing step depends on pdftk and pdfinfo (which are excruciatingly difficult to install on OS X), and the merging step depends on OS X.

#!/usr/bin/ruby
 
if __FILE__ == $0
  puts "Run this on OS X to shuffle two PDFs, where the even pages " \
       "are already reversed (reverse them with the other script)."
 
  if ARGV.length != 3
    puts "Syntax: #{__FILE__} odds.pdf reversed_evens.pdf output.pdf"
    exit 1
  end
 
  odds_pdf = ARGV[0]
  reversed_evens_pdf = ARGV[1]
  output_pdf = ARGV[2]
 
  # Obviously, this only works on OS X. I didn't see an easy way to combine PDFs
  # in pdftk or other tools I searched for.
  join_script = '/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py'
  # Passing the arguments separately avoids shell quoting problems with odd filenames.
  system('python', join_script, '--output', output_pdf, '--shuffle', odds_pdf, reversed_evens_pdf)
end
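
Conceptually, the shuffle interleaves the two page lists, taking one page from each in turn and starting with the odds. The sketch below shows that idea on plain page numbers; it is not how join.py is implemented, and shuffle_pages is a hypothetical name.

```ruby
# Interleave two page lists: one from odds, one from evens, and so on.
# If one list is shorter, the leftover pages of the longer list are kept.
def shuffle_pages(odds, evens)
  odds.zip(evens).flatten.compact
end

puts shuffle_pages([1, 3, 5], [2, 4, 6]).inspect  # [1, 2, 3, 4, 5, 6]
```

The evens list here is already in ascending order, which is why the reversing script has to run first.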

References:

  • pdftk – the PDF toolkit. I could have installed it with port install pdftk, but that has a very long build dependency on gcj.
  • Another technique, using Automator, which would work if you didn’t need to reverse pages. And one without Automator (like I do, calling the script directly).
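
For readers on a later pdftk release: its documentation describes a shuffle operation that collates inputs one page at a time, and a Bend-1 range that reads an input backwards, so both steps can be expressed in one command. The sketch below only builds that command; it assumes a pdftk version with shuffle on the PATH, and pdftk_shuffle_command is a hypothetical helper name.

```ruby
# Build a pdftk command that interleaves the odd-page scan (A) with the
# even-page scan (B), reading B from its last page to its first.
# evens_pdf is the even-page scan exactly as it came off the scanner.
def pdftk_shuffle_command(odds_pdf, evens_pdf, output_pdf)
  ["pdftk", "A=#{odds_pdf}", "B=#{evens_pdf}",
   "shuffle", "A", "Bend-1", "output", output_pdf]
end

cmd = pdftk_shuffle_command("odds.pdf", "evens.pdf", "combined.pdf")
puts cmd.join(" ")
# To run it for real (requires pdftk): system(*cmd)
```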

Written by Jonathan

August 5th, 2009 at 8:41 am

Posted in automation

One Response to 'Merging pdf’s on Mac OS X from a non-duplex scanner'

  1. Just thought I might shoot you a quick blog post I’d written trying to do a very similar task. Using Automator to help will make it integrate with Finder a bit better, and it’s a bit easier.

    http://www.scottyob.com/2011/01/16/pdf-automator-in-osx/

    Scott O'Brien

    24 Apr 12 at 10:44 pm
