Google Engineers Open Source Book Scanner Design
c0lo writes "Engineers from Google's Books team have released the design plans for a comparatively reasonably priced (about $1500) book scanner on Google Code. Built using a scanner, a vacuum cleaner and various other components, the Linear Book Scanner was developed by engineers during the '20 percent time' that Google allocates for personal projects. The license is highly permissive, thus it's possible the design and building costs can be improved. Any takers?"
Adds reader leighklotz: "The Google Tech Talk Video starts with Jeff Breidenbach of the Google Books team, and moves on to Dany Qumsiyeh showing how simple his design is to build. Could it be that the Google Books team has had enough of destroying the library in order to save it? Or maybe they just want to upstage the Internet Archive's Scanning Robot. Disclaimer: I worked with Jeff when we were at Xerox (where he did this awesome hack), but this is more awesome because it saves books."
Re: (Score:2)
/me thinks you weren't really thinking of the book scanner design when you made that comment ;-).
Note that there's absolutely no relation between Book Scanners and Phone Design.
Re: (Score:2)
Re: (Score:2)
it's an open design, dumbass. save yourself some time and don't build the spying part.
False economy (Score:5, Insightful)
If these books are truly unique, you're taking a big risk subjecting them to this contraption.
Re:False economy (Score:5, Funny)
The proper SQL statement would have been "DISTINCT" not a "UNIQUE" index, true.
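For anyone who missed the joke: UNIQUE is a constraint that rejects duplicate rows when they are written, while DISTINCT is a query keyword that filters duplicates out when they are read. A minimal sketch using Python's built-in sqlite3 module; the table and column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DISTINCT: duplicates are stored, but filtered out at query time.
conn.execute("CREATE TABLE shelf (title TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO shelf (title) VALUES (?)",
    [("Foundation",), ("Foundation",), ("Dune",)],
)
distinct_titles = [
    row[0]
    for row in conn.execute("SELECT DISTINCT title FROM shelf ORDER BY title")
]
print(distinct_titles)

# UNIQUE: the duplicate never gets stored in the first place.
conn.execute("CREATE TABLE rare (title TEXT UNIQUE)")  # hypothetical table
conn.execute("INSERT INTO rare (title) VALUES ('Foundation')")
try:
    conn.execute("INSERT INTO rare (title) VALUES ('Foundation')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```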
Re: (Score:2)
You're taking a bigger risk not subjecting them to this contraption.
Re: (Score:3)
He addresses that in the talk. Yes, this machine can fold or tear pages. But they talked to an archivist, and he said that scanning the books in this machine was less risky than not scanning them at all. If they're scanned, the information is preserved, backed up, spread around, and is then widely available. Any library book is subject to risk from the patron tearing or damaging the book, yet they still accept the risk of making them available.
Besides, how much worse is the risk of possibly tearing a pa
Re: (Score:2)
Frankly, I like the idea presented by these guys better:
http://www.diybookscanner.org/ [diybookscanner.org]
They have the book lying down on its spine and supported at a nice 45-ish angle that prevents too much of a tear. However they use ordinary cameras instead of the scanning tech used in a...well...scanner. Though I believe cameras tend to work faster than a scanner, so I don't see a downside.
Re: (Score:2)
The Google guy mentioned them in the presentation. The primary drawback to the other DIY scanners is manual operation. Setup involves adjusting the lights, the cameras, and the hinge point for the platen; not a big deal. But in operation, the human has to lift the platen, flip the page, set the platen down, trigger the cameras, and then repeat for each page. My understanding is that a person can scan a 500 page book in about 20-30 minutes, so it's of a comparable speed to this new page-turning scanner.
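To put a number on that, here is the back-of-the-envelope arithmetic as a quick sketch. The only inputs are the 500 pages and 20-30 minutes quoted above; the function name is just for illustration.

```python
def pages_per_minute(pages: int, minutes: float) -> float:
    """Average scanning rate for one operator."""
    return pages / minutes

BOOK_PAGES = 500  # figure quoted in the comment above

# Fast and slow ends of the quoted 20-30 minute range.
for minutes in (20, 30):
    rate = pages_per_minute(BOOK_PAGES, minutes)
    print(f"{BOOK_PAGES} pages in {minutes} min = {rate:.1f} pages/min")
```

So a practiced operator is turning a page roughly every 2.5 to 3.5 seconds, which is the bar an automatic page-turner has to clear per machine.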
Re: (Score:2)
I agree with your points, and I saw the video, but I was actually referring to the OP's point about handling delicate books.
DIY's system has the book (which is in fragile condition) down, and very properly secured, and the scanning apparatus (which is more able to take the stress from the constant movement) is the one that moves.
I was actually imagining my dad's big-ass collins dictionary from *his* college time, and comparing the state of that to what I might expect the usual state of affair will be of the
Re: (Score:2)
OK, I get what you're saying now. You want to take the mass of the book out of the equation, so that a fragile spine wouldn't be further damaged or even torn in two by the weight of a heavy book straddling a sharp edge, and all the motion of this mechanism. And I agree.
It looks like the high end commercial book scanners are constructed to take that into account too, where the weight of the book is supported by the covers in a cradle, just like the DIY scanners. They use a vacuum mechanism to draw a singl
Re: (Score:2)
Agree with you.
On further thought, I think it would be better if the book stayed still and it was the *scanner* that moved back and forth (in the scanner-top position I described earlier).
That should eliminate worries about size and weight, since the only weight in question is the scanner itself, rather than the book, and that will remain constant.
Also, I think, errors could be reduced by *slowing* down the process, to further minimise pages caught/stuck/torn, since slower and steadier push will allow for m
Very Good Wiki Direction (Score:5, Interesting)
Re: (Score:2)
If you're scanning to save physical space, you don't need this contraption. Just cut off the back of the book and put the pages in a regular scanner with a sheet feeder. (You can get an excellent one for about $400, including OCR software.)
Re: (Score:2)
What's the best way to cut off the back?
Re: (Score:1)
Re: (Score:2)
Harvesting knowledge in case of society collapse (Score:4, Insightful)
We know it can happen. Rome fell, Greece fell, Angkor Wat fell, Easter Island collapsed. Societies die just like we do.
It would be a shame to lose all of the knowledge, art, and literature that we have accumulated during our tenure so far.
Scanning books is a good way to archive much of that information for the next society that can develop digital computing. I suggest we enshrine it all in orbit or on the moon, guaranteeing it relative immortality and making it accessible only to those technologically advanced enough to benefit from it.
For all we know, the ancient Khmer civilization at Angkor Wat [about.com] invented advanced technology, and it has merely been lost to time.
We owe it to future generations to make sure our society does not lose as much when it collapses.
Re:Harvesting knowledge in case of society collaps (Score:5, Insightful)
But the stone and clay slabs of the Sumerians and the papyrus of the Egyptians have survived to this day, while the original data feeds of the Apollo missions are lost forever, because they were trashed when no one had the equipment to read the old data tapes.
Re: (Score:2)
I'm not sure you're making a valid comparison. If I choose any particular piece of Egyptian recorded information then there's a good chance that it is destroyed. The fact that some material survived several millennia is both impressive and interesting, but a great deal of material survives from the '60s even if some has been lost.
I mean how many records of the ancient Egyptian space race survive to this day? I rest my case.
Re: (Score:3)
This wasn't meant as a comparison of better and worse. Just as a set of specific risks for digital archives.
Go and try to read your letters from a 5.25" floppy disc with your VizaWrite files from just a few years ago. Wouldn't have happened with paper printouts.
On the other hand, go to a movie archive and see the first cellulose movies lost due to simply rotting away... wouldn't have happened with DVDs
Then again, if there's no DVD player left....
A form of archiving, that needs special knowledge (file format
Re: (Score:2)
Yes, and you CAN print it out. And you CAN print it on good paper...
but what about the inks that you are using? I don't think those will survive very long. And getting better inks that will work with an existing printer is a real problem.
FWIW, I don't really have a much better answer than an improved clay tablet. And preserving anything that way is so expensive that it won't be done...except on a trivial scale. The original CDs were durable things, but that doesn't apply to the ones that you can burn at
Re: (Score:2)
optical discs are actually made in a near identical process to microfiche.
we could simply etch much much smaller using lasers on current replication hardware. you could probably write a small program that translates text files into an ISO file you could burn yourself that results in a human-readable disc.
hell, i want to try that. that sounds amazing.
Re: (Score:2)
IIUC, current consumer CDs and DVDs write using a phase transition process that changes the reflectivity of the metallic layer written upon. Over time this relaxes back into the low energy configuration. It may be good for a decade or two, but I doubt that it's even good over a century.
Re: (Score:2)
It's not that simple. (Nothing ever is.) Preserving information for the future runs into a lot of issues.
Re: (Score:2)
films that old don't necessarily rot. they either get eaten by fungus or burn on their own once exposed to ambient air. Nitrates were not an ideal material for making precious archival materials from...
Re: (Score:1)
Here ya go: (Score:1)
https://www.google.com/search?q=helicopter+of+abydos&tbm=isch&source=univ&sa=X [google.com]
Ancient Egyptian spacecraft & helicopters!
Re: (Score:1)
Greece fell,
Oh, come on, Greece is still working on securing more loans, it hasn't fallen yet!
Having looked at the design... (Score:3)
...I think it's fundamentally flawed in that it would not take much to have a misaligned page sliced right out of the book. Certainly nothing I'd risk a book of any value over. Sorry, this one appears to be a non-starter (although it is rather novel, pun intended).
Re: (Score:2)
In that case, you could use one of the manual ones at diybookscanner.com [diybookscanner.com] and turn the pages yourself, trading speed for safety.
Re: (Score:2)
In point of fact, for individual scanning, the video even mentions that this linear scanner is SLOWER than a manual scanner such as the diybookscanner. The gains come in that since it's automatic, a single person could keep 8 or 10 of them running at a time.
Yup. Progress in clock speeds has pretty much slowed down, and Google appears to expect future performance enhancements to come in the form of parallelism
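The parallelism argument is easy to sanity-check with a sketch. The manual rate below is the midpoint of the "500 pages in 20-30 minutes" figure quoted elsewhere in this thread; the per-machine rate for the automatic scanner is a made-up placeholder, since the only thing stated here is that it is slower than manual scanning.

```python
import math

def aggregate_rate(per_machine_ppm: float, machines: int) -> float:
    """Total pages per minute when one operator tends several machines."""
    return per_machine_ppm * machines

def break_even_machines(manual_ppm: float, per_machine_ppm: float) -> int:
    """Smallest number of machines whose combined rate matches or beats manual scanning."""
    return math.ceil(manual_ppm / per_machine_ppm)

MANUAL_PPM = 20.0     # midpoint of 500 pages in 20-30 minutes (from this thread)
AUTOMATIC_PPM = 10.0  # ASSUMPTION: placeholder, not a measured figure

for machines in (8, 10):
    total = aggregate_rate(AUTOMATIC_PPM, machines)
    print(f"{machines} machines: {total:.0f} pages/min per operator")

print("machines needed to match manual:",
      break_even_machines(MANUAL_PPM, AUTOMATIC_PPM))
```

Swap in a real per-machine rate if anyone has one. The point is only that per-operator throughput scales with the number of machines being tended, so a slower machine can still win.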
Re: (Score:2)
Clock speed can be quadrupled by switching to a pipeline architecture. See 24:28 [youtube.com] of the video.
Re: (Score:2)
Because the paper itself is more important than the content?
We need more people in this world who understand value.
Re: (Score:2)
If I had a truly unique and special book that must not be damaged, and I wanted to digitize it, I'd bite the bullet and do it very carefully by hand (which you could do, over a long enough time scale, with just about any household USB scanner).
If, however, I wanted to digitize the contents of my personal book collection, which is several hundred books none of which couldn't be replaced via Amazon or eBay, this would be good for the job. So it shreds my 20 year old copy of Asimov's Foundation- I'd be a bit c
Re: (Score:2)
Just as you wouldn't trust your valuable books to this page-turning scanner, you wouldn't scan those same books with the typical household USB scanner, either. Those scanners generally require the books to be opened 180 degrees and pressed flat in order to get the scanning element close enough to the margins, and that can damage the pages and/or the binding.
The prototypical DIY scanner uses a book rest and platen set at a 90 degree angle, which is safe for most books, and as you're manually turning the pag
The missing link (Score:2)
Re: (Score:2)
Yes, it was in my submission but apparently edited for brevity. TL;DW?
Re: (Score:2)
I remember thinking the same thing then.
Re: (Score:2)
And you were right then, just as he is right now.
Google's motivation (Score:5, Insightful)
The summary questions Google's motivations for doing this, but I think it should be clear this isn't a Google project, really. 20% projects can't be totally random, personal things that have no relationship whatsoever with the business or possible business... but the link can be very tenuous, and the cooler the project is, the weaker it can be. All tech managers at Google are engineers themselves and tend to be just as able to geek out about cool stuff as the people they supervise.
Various other bits of obvious Google support for the project are also more incidental than planned. For example, Dany mentions that he built the machine in one of the on-campus workshops. Those workshops are there for "real" work, but they're also available for any employees to use on an as-available basis. Tech talks are also organized by and for the employees for their own interests, with basically zero "corporate" supervision. Most are actually job-related, but far from all. There are plenty of project talks and hobby talks (though this particular hobby/project talk is much cooler than most).
I imagine there was a cursory review required to get permission to publish the talk and the design, but such things tend to be handled on a "is there some really good reason we should say no?" basis. If not... go for it. Publishing cool, geeky things done by Google engineers is pretty positive for Google's brand, and it makes the engineers happy, which is good for employee retention -- especially since the kind of employees who do cool stuff for fun is the kind Google most wants to retain.
Bottom line: It's very unlikely anyone at Google has a corporate strategy built around the release of this information. It's just an engineer doing something he thinks is fun and valuable (to someone) and the company providing generic support for such activities, and otherwise staying out of the way.
Re: (Score:3, Informative)
"Could it be that the Google Books team has had enough of destroying the library in order to save it?"
The Google Books team is not Google. It's a group of people, some of whom built this non-destructive reader. It's quite likely these people, who probably love books, started by wondering if there was a way they could scan their content without damaging them physically, and decided to use their 20% time to figure it out.
As for scanning books, that is most definitely a Google-the-company supported project
Re: (Score:2)
Re: (Score:2)
See archive.org...
Yes, that's in the original submission, as you see above. For the record, Brewster Kahle (who founded Archive.org), Jeff and Dany (who did this project), and I are all MIT alums, and the "Internet Archive scanning robot" is from a company called Kirtas, which also has ties to Xerox.
Shredder scanner (Score:3)
I'm waiting for a reference to the shredder-scanner from Rainbows End to come up.
http://en.wikipedia.org/wiki/Rainbows_End [wikipedia.org] (although the wiki article doesn't mention that piece of the plot, sadly)
What about the patents? (Score:2)
Google has a patent on using structured lighting to determine the shape of the page and correct the image ... is that open too?
Re: (Score:2)
Google has a patent on using structured lighting to determine the shape of the page and correct the image ... is that open too?
The license section on the googlecode [google.com] page (scroll to the bottom):
Additional IP Rights Grant (Patents)
Google hereby grants to you a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, transfer, and otherwise run, modify and propagate this design where such license applies only to those patent claims, both currently owned by Google and acquired in the future, licensable by Google that are necessarily infringed by This design.
Does this answer your question?
Re: (Score:2)