May 5, 2009
Scanning in a book
Over the past year, I’ve digitized a bunch of my family’s old photo albums by photographing the pages with a digital camera. This is far faster than scanning them, and the quality is good enough, not to mention infinitely better than having no digitized versions at all.
Now I’m contemplating using the same technique to make a digital copy of my 1978 doctoral dissertation. The object consists of 350 typed, double-spaced 8.5″x11″ pages, bound. At 15 seconds per page, that’s about 1.5 hours of time (= 4 Daily Shows, or 3 SNLs with the dross fast-forwarded).
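For anyone who wants to check my arithmetic, here is a minimal sketch of the estimate in Python. The 15 seconds per page is a guess at my pace rather than something I've timed, and the function name is just made up for illustration.

```python
# Back-of-envelope estimate of total shooting time.
PAGES = 350              # page count of the bound dissertation
SECONDS_PER_PAGE = 15    # assumed pace per page, not a measured figure


def shooting_time_hours(pages: int, seconds_per_page: float) -> float:
    """Return the total time in hours to photograph every page."""
    return pages * seconds_per_page / 3600


total_hours = shooting_time_hours(PAGES, SECONDS_PER_PAGE)
print(f"{total_hours:.2f} hours")  # 350 * 15 s = 5250 s, a bit under 1.5 hours
```

If the real pace turns out to be closer to 20 or 30 seconds per page (turning pages, flattening the binding), changing the one constant gives the revised figure.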
I’d appreciate advice about the digital side of it, given that I’d like the “scans” to be readable online and, ultimately, OCR-able.
1. My camera goes up to 10 megapixels, which I assume is way more than I need for this project. I don’t care about reproducing the pages as physical artifacts. I’m only interested in the text on them. How many mpixies should I be shooting at?
2. What would be the most convenient way to post these from a reader’s point of view? Anything other than PDF? (Google Books lets you submit your books in PDF format, so I’d like to produce a PDF version in any case.)
3. Depending on your answer to #2, do you have any suggestions for tools to use? (I’m doing this on a Mac.)
4. Any other advice?
Thanks!