Google Technology

Google Adds OCR To PDF and Images

Kilrah_il writes "Now you have the option to OCR every PDF and image you upload to Google Docs. 'When you upload files to Google Docs, you'll notice a new option that tells Google to convert the text from PDF and image files to Google Docs documents. ... I've tried to convert an excerpt from the book Rework and the result wasn't great. About 10% of the text has been incorrectly converted and the formatting hasn't been preserved.'"
This discussion has been archived. No new comments can be posted.

  • lolwut? (Score:3, Insightful)

    by Pojut ( 1027544 ) on Tuesday June 22, 2010 @09:00AM (#32651924) Homepage

    I can understand OCR software not working if you are scanning a document, due to dirt over the text or what have you...but OCR failing on a PDF with typed text? WTF?

  • by clone53421 ( 1310749 ) on Tuesday June 22, 2010 @09:36AM (#32652298) Journal

    They really should hide the text underneath the actual scanned image, though, so that what you're actually looking at is the real page, but searchable. That takes care of the issue with layout. And since you aren't actually trying to read the garbled text, the 10% error rate, although still rather high, won't matter as much: you'll only notice it if you try to copy-and-paste, or if you search for something and miss a few of the hits because they were incorrectly OCR'd. Not a huge deal.
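A rough sketch of that idea, under some loud assumptions: the OCR step is taken to return each word with a bounding box on the page image, and the names `OcrWord` and `find_hits` are made up for illustration (this is not how Google Docs does it). The image stays on screen, and the hidden text is only used to locate hits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrWord:
    """One recognized word plus where it sits on the scanned image (hypothetical model)."""
    text: str    # what the OCR engine thinks the word is; may be wrong
    x: int       # left edge of the word, in image pixels
    y: int       # top edge of the word, in image pixels
    width: int
    height: int


def find_hits(words, query):
    """Return the bounding boxes of words whose OCR text equals the query (case-insensitive)."""
    q = query.lower()
    return [(w.x, w.y, w.width, w.height) for w in words if w.text.lower() == q]


# Made-up OCR output for one line of a page; the third word was misread.
page = [
    OcrWord("Rework", 10, 10, 60, 12),
    OcrWord("is", 75, 10, 12, 12),
    OcrWord("Rewark", 92, 10, 60, 12),  # OCR error: the page really says "Rework"
]

# Only the correctly recognized occurrence is found; the misread one is a missed hit.
hits = find_hits(page, "rework")
```

The reader still sees the real page in both places. The only cost of the misread word is that a search highlights one box instead of two.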