Merging PDFs on Mac OS X from a non-duplex scanner
Goal: scan hundreds of double-sided documents on a non-duplex scanner and combine them into one PDF in an automated way. Status: it was harder than it should have been, and not that automated, but this works.
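Scan the papers to PDF using a scanner equipped with a paper feed. Scan them right side up, then flip the stack over and scan the other sides. The first PDF will contain the odd pages: 1, 3, 5… The second will contain the even pages 2, 4, 6…, but back to front, because flipping the stack reverses its order.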
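To see why the two remaining steps are needed, here is a minimal Ruby model of the page order. It only shuffles numbers, not PDFs. The method names are made up for illustration, and it assumes the whole stack is flipped at once, so the second pass starts with the back of the last sheet.

```ruby
# Model of the page order produced by two passes through a single-sided feeder.
# Assumption: flipping the whole stack means the second pass starts with the
# back of the last sheet, so the even pages arrive last to first.

# Pages captured on the first pass (front sides), in scan order.
def first_pass(sheet_count)
  (1..sheet_count).map { |sheet| 2 * sheet - 1 }
end

# Pages captured on the second pass (back sides), in scan order.
def second_pass(sheet_count)
  (1..sheet_count).map { |sheet| 2 * sheet }.reverse
end

# Reverse the second pass, then alternate pages starting with the odds.
def collate(odds, evens_as_scanned)
  odds.zip(evens_as_scanned.reverse).flatten
end

odds  = first_pass(3)   # [1, 3, 5]
evens = second_pass(3)  # [6, 4, 2]
p collate(odds, evens)  # [1, 2, 3, 4, 5, 6]
```

The two scripts in this post do the same thing to real files: one reverses the second scan, the other alternates the pages.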
Reverse the even pages.
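    #!/usr/bin/ruby
    # Reverses the page order of a PDF using pdfinfo and pdftk.
    require 'shellwords'

    if __FILE__ == $0
      puts "Run this on Ubuntu or somewhere that pdftk is easy to be had (which isn't OS X)."
      if ARGV.length != 1
        puts "Syntax: #{__FILE__} pdf_to_reverse.pdf"
        exit 1
      end
      pdf = ARGV[0]
      # Only rewrite the extension at the end of the file name.
      reversed_pdf = pdf.sub(/\.pdf\z/i, "_reversed.pdf")
      # Take the first run of digits on the "Pages:" line as the page count.
      page_count = `pdfinfo #{pdf.shellescape} | grep Pages`[/\d+/]
      # A descending range (N-1) makes pdftk write the pages last to first.
      `pdftk #{pdf.shellescape} cat #{page_count}-1 output #{reversed_pdf.shellescape}`
    end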
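The fiddly parts of the reversing step are reading the page count and quoting file names. Here is a minimal sketch of both that can be tried without pdftk installed. The helper names are hypothetical, and it assumes pdfinfo prints a line of the form "Pages:          12".

```ruby
require 'shellwords'

# Pull the page count out of pdfinfo's text output.
# Assumes a line of the form "Pages:          12".
def page_count_from(pdfinfo_output)
  match = pdfinfo_output.match(/^Pages:\s+(\d+)/)
  raise ArgumentError, "no Pages line found" unless match
  match[1].to_i
end

# Build the pdftk command line that writes the pages last to first.
# shelljoin escapes spaces and other shell metacharacters in file names.
def reverse_command(pdf, page_count, reversed_pdf)
  ['pdftk', pdf, 'cat', "#{page_count}-1", 'output', reversed_pdf].shelljoin
end

# Made-up sample of pdfinfo output, for illustration only.
sample = "Title:          scan\nPages:          12\nPage size:      612 x 792 pts (letter)\n"
puts reverse_command('evens.pdf', page_count_from(sample), 'evens_reversed.pdf')
```

Matching on "^Pages:" rather than any digits keeps the "Page size" line from being picked up by mistake.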
Lastly, combine the two PDFs by shuffling them together page by page, starting with the odds. Note that the whole workflow depends on pdftk and pdfinfo for the reversing (both excruciatingly difficult to install on OS X) and on OS X itself for the merging.
    #!/usr/bin/ruby
    # Interleaves two PDFs page by page using the join.py script bundled with Automator.

    if __FILE__ == $0
      puts "Run this on OS X to shuffle two PDFs, where the even pages are already reversed (reverse them with the other script)."
      if ARGV.length != 3
        puts "Syntax: #{__FILE__} odds.pdf reversed_evens.pdf output.pdf"
        exit 1
      end
      odds_pdf = ARGV[0]
      reversed_evens_pdf = ARGV[1]
      output_pdf = ARGV[2]
      # Obviously, this only works on OS X. I didn't see an easy way to combine PDFs
      # in pdftk or the other tools I searched for.
      join_script = '/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py'
      # Passing the arguments separately avoids shell quoting problems with odd file names.
      system('python', join_script, '--output', output_pdf, '--shuffle', odds_pdf, reversed_evens_pdf)
    end
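For what it's worth, the pdftk manual documents a shuffle operation that takes one page at a time from each input, and a page range like end-1 reads a file back to front. This is not the method used above. It is a sketch of an alternative that would do the reversing and the merging in one pdftk call, assuming the installed pdftk is recent enough to include shuffle. The method name and file names are placeholders.

```ruby
# Build a pdftk command that collates the two scans in one step.
# Assumption: the installed pdftk supports the shuffle operation. A and B are
# arbitrary handle names, and "Bend-1" asks for B's pages last to first.
def shuffle_command(odds_pdf, evens_as_scanned_pdf, output_pdf)
  ['pdftk', "A=#{odds_pdf}", "B=#{evens_as_scanned_pdf}",
   'shuffle', 'A', 'Bend-1', 'output', output_pdf]
end

cmd = shuffle_command('odds.pdf', 'evens.pdf', 'combined.pdf')
puts cmd.join(' ')
# To actually run it: system(*cmd)
```

Here evens.pdf is the second scan exactly as it came off the scanner, not the already reversed file.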
References: