Google Businesses The Almighty Buck

On the Google Book Scanning Project and the Library We Will Never See (theatlantic.com) 165

For a decade, Google's enormous project to create a massive digital library of books was embroiled in litigation with a group of writers who say it was costing them a lot of money in lost revenue. Even as Google notched a victory when a federal appeals court ruled that the company's project was fair use, the company quietly shut down the project. From an article published in April this year: Despite eventually winning Authors Guild v. Google, and having the courts declare that displaying snippets of copyrighted books was fair use, the company all but shut down its scanning operation. It was strange to me, the idea that somewhere at Google there is a database containing 25-million books and nobody is allowed to read them. It's like that scene at the end of the first Indiana Jones movie where they put the Ark of the Covenant back on a shelf somewhere, lost in the chaos of a vast warehouse. It's there. The books are there. People have been trying to build a library like this for ages -- to do so, they've said, would be to erect one of the great humanitarian artifacts of all time -- and here we've done the work to make it real and we were about to give it to the world and now, instead, it's 50 or 60 petabytes on disk, and the only people who can see it are half a dozen engineers on the project who happen to have access because they're the ones responsible for locking it up. But Google appears to be thinking of ways to make use of it. Last month, it added a new feature to its search function that instantly connects you with eBook data from libraries near you. From a report: Now, every time you search for a book through Google, information about your local library rental options will be easily available. Yeah, that's right. Your local library not only still exists, but it has eBooks, which are things you can totally borrow (for free) online!
Before, this perk was hidden somewhere deep within your local library's website -- assuming it had one -- but now these free literary wonders are all yours for the taking.
This discussion has been archived. No new comments can be posted.

  • for free (Score:4, Insightful)

    by supernova87a ( 532540 ) <kepler1@@@hotmail...com> on Friday October 20, 2017 @10:03AM (#55403179)
    Well, actually, isn't the problem that they want to sell it / use it for commercial purposes? If Google simply wanted to put this on the web for absolutely free, with no links to anything else, couldn't they?

    I thought it's only when you're trying to sell something that these issues arise.
    • Re:for free (Score:5, Informative)

      by Chris Mattern ( 191822 ) on Friday October 20, 2017 @10:07AM (#55403203)

      I thought it's only when you're trying to sell something that these issues arise.

      You thought wrong. It's a widely held fallacy about copyright, though. Copyright covers any unauthorized reproduction of a work, whether it's for sale or not. The only exceptions are for parody or fair use (which means such things as small quotes in a review of the work).

      • Well, then in that case, I suppose that 100 years from now when Google dies, some forethinking engineer's plan to release the encryption key of the entire library of the world secretly stored on our phones will be activated, and finally we'll get to read the books.
        • Books schmooks. Who's got time for books these days? Personally, I've been trapped navigating a graph which encompasses all known knowledge, after starting with a single Wikipedia article.

          Information presented in a logical (topological?) order is for n00bs.

      • Isn't there an exception for libraries and archives too? How is the wayback machine legal?
      • by rgmoore ( 133276 )

        Parody is actually a form of fair use; it's legally considered a form of criticism, which is one of the things fair use is intended to protect. Fair use is actually a very complicated legal issue that has to be decided on the particulars of each case. It depends on a balance of four different factors (by statute):

        1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;

        2) the nature of the copyrighted work;

        3) the amount and su

        • That should make people more, not less, likely to buy the original.

          Proof by first derivative. Works every time. No dilemma ever.

          Er, um, hold the presses.

          No Google books: authors control most revenue, no soup for Google.

          Partial Google books: one author rats out the other (economically) by signing up. Author who signs up wins, author who holds out loses. Plenty of canned alphabet soup for Google.

          Full Google books: Google creams almost the whole of the economic surplus due to better consumption matching, a

      • Fair use is far broader than that: 'transformative use' (such as using a book as input to a machine learning algorithm) is one of many additional fair use defenses.

    • Re:for free (Score:5, Insightful)

      by Geoffrey.landis ( 926948 ) on Friday October 20, 2017 @10:27AM (#55403343) Homepage

      As an author, yes, I would like to be paid when my works are distributed.

      The problem is that Google wanted to distribute the work from authors for free.

      I do know that the idea that people should be paid for their work is controversial on /., where many commentators believe that information -- meaning other people's work -- should be free, and authors should be happy to starve, because, hey, it's exposure [theoatmeal.com].

      Well, actually, isn't the problem that they want to sell it / use it for commercial purposes? If Google simply wanted to put this on the web for absolutely free, with no links to anything else, couldn't they?

      Google is the most valuable company in the world [independent.co.uk]. They may want to distribute other people's work for free, but they themselves plan to make a huge profit from doing so.

      It's merely the authors who don't get paid.

      • Re: (Score:3, Insightful)

        by Anonymous Coward

        Copyright length is the main issue, not a differing business model. There's a lot of content out there whose authors are dead, and income is the least of their worries.

        • Copyright length is the main issue, not a differing business model. There's a lot of content out there whose authors are dead, and income is the least of their worries.

          So, Mr. Anonymous Coward, what you're basically saying is that since dead authors don't need to be paid, you think it's ok if living ones don't get paid either.

          Yeah, great.

          • by BronsCon ( 927697 ) <social@bronstrup.com> on Friday October 20, 2017 @12:02PM (#55403973) Journal
            I actually read that as "dead authors don't need to get paid, copyright shouldn't outlive the author". I suppose I could stretch it to imply that copyright should be more limited than that, as well; say, the 14 years it was originally. And remember, when copyright was 14 years, printing and distribution were much slower than what we're capable of today. A book that would have taken a year to go to press and be shipped across the globe can now arrive on everyone's shelf tomorrow; if anything, that should further shorten copyright terms.
          • As an an author, Shirley you can do better than introducing a straw man into the argument? The poster did not make comment about living authors, so it ain't reasonable to criticise him for your unsupported inference.

            • by Joviex ( 976416 )

              As an an author, Shirley you can do better than introducing a straw man into the argument? The poster did not make comment about living authors, so it ain't reasonable to criticise him for your unsupported inference.

              He might, if he could understand your horrible spelling and grammar.

            • As an an author, Shirley you can do better than introducing a straw man into the argument? The poster did not make comment about living authors, so it ain't reasonable to criticise him for your unsupported inference.

              The anonymous coward poster was making a comment on my post, which most explicitly was about living authors. Since my post was about living authors, you should be responding to anonymous coward's post, not mine:
              As an an anonymous coward, Shirley you can do better than introducing a straw man into the argument? The poster you are responding to made a comment about living authors, so it ain't reasonable to criticise him for your unsupported statements about dead authors.

          • Here's my proposal how to fix the major flaws of copyright while ensuring that authors get paid:

            Replace copyright with payright. Here's what it means:

            The author gets a right to a clearly defined slice of revenue (e.g. 20% by default) from every commercial use of their work. If you register your work in a central registry, you get to set the percentage yourself and commercial users will have to contact you. If you don't register it, statutory default applies and commercial users will just need to hold your slic

            • Here's my proposal how to fix the major flaws of copyright while ensuring that authors get paid: Replace copyright with payright. Here's what it means: The author gets a right to a clearly defined slice of revenue (e.g. 20% by default) from every commercial use of their work. If you register your work in a central registry, you get to set the percentage yourself and commercial users will have to contact you. If you don't register it, statutory default applies and commercial users will just need to hold your slice of revenue in escrow until you contact them.

              So, you're saying that I can put up a site that makes the work of all the bestselling authors in America available for free, and the bestselling authors will get nothing. Because in your view they don't own their work, and aren't allowed to decide what their work is worth, or even if it is worth anything at all.

              Why do you think this is good?

              • You are so obsessed with fighting freeloaders that you're completely forgetting about trying to actually make profit. Screw the freeloaders, they're irrelevant anyway. Focus on the best ways to make profit and nothing else. That's why this is good.
            • One thing authors often do is sell some sort of first publication rights to get some cash sooner rather than to wait for the royalties. One thing magazines etc. do is to buy first publication rights so they can ensure that they can publish before anyone else, rather than having the December issue come out with a featured story that the competitor had in their November issue.

              So, this would reduce the desire to publish an author's works and reduce the amount of stuff published, while adding overhead to th

              • One thing authors often do is sell some sort of first publication rights to get some cash sooner rather than to wait for the royalties. One thing magazines etc. do is to buy first publication rights so they can ensure that they can publish before anyone else, rather than having the December issue come out with a featured story that the competitor had in their November issue.

                First publication rights would still exist. The rules I've described above would apply only to works that have already been published w

                • And we're back to the question of how we assure that authors have a good chance to get paid. With free eBooks readily and legally available, who's going to buy a copy when they can wait a few months and get a free one? Either we have a reasonably long period of exclusivity, or we need to find another way to pay authors.

                  • And we're back to the question of how we assure that authors have a good chance to get paid. With free eBooks readily and legally available, who's going to buy a copy when they can wait a few months and get a free one? Either we have a reasonably long period of exclusivity, or we need to find another way to pay authors.

                    I'm all for experimenting with other ways of making money. Most of them are currently blocked by copyright bureaucracy. Why should the law prefer selling ebooks as if they were physical goods over other business models?

                    • I'm in favor of experimenting until we find a better way. I'm not in favor of cutting off the current way we pay authors without finding and implementing something better first. Not just books, but all copyrighted materials are sold as if they were physical.

                      Currently, the system allows the following:

                      • Authors can get paid enough to live on and keep writing. (The two last Hugo-winning novels were written by an author with a day job.)
                      • The payment is set according to objective criteria (how many people w
                    • Take another look at the bigger picture of the current copyright system:

                      • Publishers keep trying to shove inferior products and services down consumers' throats because copyright essentially outlaws direct competition in the media market.
                      • Publishers keep ripping off both consumers and authors.
                      • Copyright enforcement tools provided by online platforms (as required by law) are frequently abused by trolls to silence political speech.
                      • Copyright is increasingly being abused to restrict ownership [boingboing.net] of physical property.
                    • Publishers are not necessary in the copyright system. You can always self-publish. Back when books were all made of dead trees, there were "vanity publishers". Currently, it's really easy to self-publish through Amazon and perhaps Barnes & Noble. The advantage, to an author, of using a regular publisher is that the publisher will provide services like editing and good formatting and proofreading (not necessarily doing it well), publishers have publicity channels ready to go, and publishers will oft

      • No, Google never wanted to distribute those works for free. (Except the public domain ones.) That was the Authors Guild's idea.

        See my comment further down the thread, and the link therein.

      • Re:for free (Score:5, Insightful)

        by 140Mandak262Jamuna ( 970587 ) on Friday October 20, 2017 @10:52AM (#55403535) Journal
        You must be a goblin.

        To a goblin, the rightful and true master of any object is the maker, not the purchaser. All goblin-made objects are, in goblin eyes, rightfully theirs

        "But if it was bought —

        then they would consider it rented by the one who had paid the money. They have, however, great difficulty with the idea of goblin-made objects passing from wizard to wizard. You saw Griphook's face when the tiara passed under his eyes. He disapproves. I believe he thinks, as do the fiercest of his kind, that it ought to have been returned to the goblins once the original purchaser died. They consider our habit of keeping goblin-made objects, passing them from wizard to wizard without further payment, little more than theft.

        • Amusing quote, and what's even more ironic, in the context of this discussion, is that you didn't bother to credit the author:

          J. K. Rowling, Harry Potter and the Deathly Hallows (Chapter 25).

          So, your worldview is apparently that not only should authors not be paid, they shouldn't even be credited.

      • How much do (you believe) I owe you for reading your comment?

      • Re:for free (Score:5, Insightful)

        by careysub ( 976506 ) on Friday October 20, 2017 @11:39AM (#55403833)

        The key piece of this picture that no one (yet, in any of the comments posted thus far) has even mentioned is that what we are talking about are books that are out of print. These are books that you cannot buy (unless you can find an old copy, which may be exorbitantly expensive), and that make the author no money at all. Zero.

        This is about 25 million books. Further it is estimated that half of these books are out of copyright under every iteration and perversion of copyright law and thus are already in the public domain - they belong to the public as is and was the intent of copyright law from the beginning.

        And the Google-Authors Guild deal actually provided a way to provide some revenue to authors of out-of-print books. Nearly all books go out of print after several years, never, ever to even be printed again so nearly all authors face this issue.

        So this is a lose-lose-lose situation (for Google, the public, and authors of out-of-print books).

        That so many books can be in the public domain and yet be unavailable is largely the result of the constant expansion of copyright at the behest and for the benefit of corporations that own publishing rights that has plagued society throughout the Twentieth Century.

        • by H3lldr0p ( 40304 )

          You missed another part of copyright law.

          If you can find and publish a text or song where the provenance of such is in question, you can then claim for yourself the copyright.

          That's a problem in these cases. No one has actually said they have given up their rights to them. We don't know with whom the copyright resides. So if Google were to go in and begin publishing these "abandoned" books, Google could then claim them for themselves without having secured the necessary transfer of the claim.

          This has precedent

        • The key piece of this picture that no one (yet, in any of the comments posted thus far) has even mentioned is that what we are talking about are books that are out of print.

          Completely irrelevant - copyright law doesn't care if the book is out of print or not.

          Further it is estimated that half of these books are out of copyright under every iteration and perversion of copyright law and thus are already in the public domain - they belong to the public as is and was the intent of copyright law from

      • Google is the most valuable company in the world.

        Umm, no. The link you originally included talked about the most valuable *BRAND* in the world, not the most valuable company in the world. (Also, it was from February.)

        An up-to-date list of companies by market valuation, the true definition of "valuable", is at:
        http://dogsofthedow.com/largest-companies-by-market-cap.htm [dogsofthedow.com]

        As I post this, Apple is #1, and Google is #2, about $110 billion lower in market value.

      • Comment removed based on user account deletion
        • I've bought books because of Google Books service that let me look inside a book and see that it's going to be useful for me. Shutting down GB means closing this channel for you as an author. A stupid move, I would say.

          I agree. But it should be your choice to decide what and how much of your work to give away for free, not somebody else's.

          Your work, your decision.

    • No, it's not. If you give it away for free, the authors can't sell it.

      Before anyone descends upon me with their "information wants to be free" meme, the people creating "content" (for lack of a better word) spend time and effort to do it. It's perfectly reasonable that they should be compensated with more than a hale and hearty "thank you". Copyright as currently constituted has all sorts of opportunities for abuse, and is abused all the time.

      But, at the root, it's not a bad idea, i

  • An endeavor of this magnitude must have some type of value. You don't want to just give this away to the world. Corporations are not idealistic or altruistic. Once someone figures out how to extract some of the value from this collection, it'll be back.
  • by mellon ( 7048 ) on Friday October 20, 2017 @10:20AM (#55403269) Homepage

    I saw this go by back in April and was made sad by it. Now I am being made sad by it again. I wonder how hard it would be to crowdsource the same work. Like, just have everybody who thinks this is a tragedy do 10 books, and see how many that adds up to. The Google OCR API is available for use, and I think they may even have open sourced it so you don't have to run it in the cloud.

    • I wonder how hard it would be to crowdsource the same work.

      Project Gutenberg has been at it since the '70s. But they currently only have 54,000 books, not a whole lot compared to Google's 25 million books.

    • Erm, it's a LOT of effort to scan a book on a regular scanner. 99% of people have flatbed scanners, and if you are the 1% who have self feeding scanners you would have to separate all the pages first (destroying the book in the process). That being said people are doing it, there is a place on IRC (internet relay chat) where you can pretty much find any work of fiction produced (google it, I would rather not have the details indexed by google and associated with me). What I have trouble getting my grubby
      • by mellon ( 7048 )

        Take two sheets of glass, tape them into a V shape with cardboard to hold them up, place the book open on the V, take a picture from below with your cell phone camera. Repeat for each pair of pages.

  • AI silly! (Score:3, Interesting)

    by eager_agony ( 621576 ) on Friday October 20, 2017 @10:20AM (#55403273)
    They have a great corpus to train their AI with now. Maybe the best in the world.
  • by Anonymous Coward

    They were able to scan the books and data mine all the text. Why would they want someone else to be able to do the same?

  • Face it (Score:5, Interesting)

    by thegreatbob ( 693104 ) on Friday October 20, 2017 @10:25AM (#55403311) Journal
    I'm sure others will note... Google almost certainly just wanted the data. Why would they need/want anything else out of the arrangement?
  • by Anonymous Coward

    This and many other wrongs have happened because publishers, the RIAA, the MPAA, and especially Disney have been able to bribe lawmakers and buy extremely insanely long extensions of copyright. Works that should have long ago been in the public domain are being kept under copyright to the great detriment of our society. These same entities listed above are also doing everything that they can to eliminate Fair Use, and Right of First Sale. All in the name of price gouging and insane levels of uncontrolled

    • by bws111 ( 1216812 )

      RIAA established 1952
      MPAA established 1922
      Disney Corp founded 1923
      Berne Convention extension of copyright to author's death + 50 years - 1908

      • The Berne Convention was actually signed in 1886. The US became a signatory in 1988, before which US copyright terms were extended to life+50 (75 total for corporate authorship) in 1976. Before 1976, the term was 28 years, with the possibility for another 28 year extension. Currently, it's life+70 or, for corporate-authored works, 120 years from creation or 95 years from publication, whichever is shorter.
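        The current terms quoted above boil down to a little arithmetic. Here is a rough Python sketch of just those rules as stated in this comment; the function name and parameters are made up for illustration, and it deliberately ignores everything else (pre-1978 works, renewals, joint authorship, foreign works), so treat it as a toy rather than a statement of the law.

```python
from typing import Optional


def copyright_end_year(
    *,
    author_death_year: Optional[int] = None,
    creation_year: Optional[int] = None,
    publication_year: Optional[int] = None,
) -> int:
    """Last year of protection under the simplified terms quoted above.

    Hypothetical helper, not a legal tool:
    - individual author: life + 70 years
    - corporate authorship: the shorter of creation + 120 years
      and publication + 95 years
    """
    # Individual authorship: the term runs from the author's death.
    if author_death_year is not None:
        return author_death_year + 70

    # Corporate authorship needs at least a creation year.
    if creation_year is None:
        raise ValueError("need author_death_year or creation_year")

    candidates = [creation_year + 120]
    if publication_year is not None:
        # A published work also gets the 95-year clock.
        candidates.append(publication_year + 95)

    # "Whichever is shorter" means the earlier of the two end years.
    return min(candidates)
```

        For instance, under these assumptions a corporate work created in 2000 and published in 2005 would stay protected through 2100, because publication + 95 ends before creation + 120.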

        In 1922, 1923, and 1952, when those US entities were established, the US had not yet signed the Berne
  • by Lodragandraoidh ( 639696 ) on Friday October 20, 2017 @10:27AM (#55403349) Journal
    I think what happened is they got 1 terabyte in and realized that the data started to repeat over and over...and over.
  • by clickety6 ( 141178 ) on Friday October 20, 2017 @10:35AM (#55403407)
    Hey Google, use some of that vast money stockpile to undo the damage that companies have been doing to Copyright laws. Get some reductions in copyright duration to something more reasonable (15 years!) and then you'll be able to release the vast majority of your scanned books.
    • by Anonymous Coward

      The amount of money available across all the entities that benefit from copyright dwarfs the assets of Alphabet.

    • Brilliant! Google would benefit by being able to make better use of their data, people in general would benefit, and the rent-seekers in the video industry would be defeated. I don't think authors of fiction are getting much income 15 years after release anyway, so it's not as if any but a few authors of "modern classics" assigned in school will be worse off.
    • Why? I mean what's in it for Google? Or do you insist they start a new department called Google Charity?

  • Caliph Omar secured his place in history by ordering the Library of Alexandria to be burned: "If it says what's in the Qur'an, it is unnecessary. Burn it. If it does not say what's in the Qur'an, it is heresy. Burn it." The books and scrolls supplied hot water to the public baths for six months.

    So is it possible Google is shooting to secure a place in history?

  • [...] embroiled in litigation with a group of writers

    Down with the creators seeking to control their creations! How dare they?..

  • by Robotech_Master ( 14247 ) on Friday October 20, 2017 @10:49AM (#55403517) Homepage Journal

    Getting to see the books is not what Google Books is for. It was never what Google Books was for. [teleread.org] You've bought into the fallacy promoted by the Authors Guild, who came in after the fact and tried to wangle their lawsuit against Google Books into an orphaned-works library without actually having any authority to do so. Google shrugged and went along with it, because why not, but it was never what they had intended.

    From the very beginning, Google Books (nee Google Print) was intended to populate a search database so people could search within paper books as easily as they could search within the web. If the book was still in copyright, then finding that book to read was the searcher's problem. (Interlibrary loan works a treat.) Google was very straightforward about that in early blog posts and publicity about the project. Don't blame them for falling short of the Authors Guild's goals. Those goals were never theirs to begin with. See the link in the first paragraph for more information.

  • by kiviQr ( 3443687 ) on Friday October 20, 2017 @11:21AM (#55403717)
    What is stopping Google from operating as a library? For each city, have a pool of ebooks that users can borrow for a week. They could have books that you can borrow for 1 min for search purposes. It should be cheaper than publicly funded libraries.
  • by idji ( 984038 ) on Friday October 20, 2017 @11:24AM (#55403729)
    Google Books helped me find books from 1838 that mentioned ancestors of mine by name and what they were doing. This is priceless to me.
  • The problem is that they want to *give* it to the world, instead of paying writers for their work. The US court has agreed for some weird reason, but foreign courts have not, and rightly so. Writers want to get paid for their work, just like you! They just happen to get paid in royalties, not hourly wages. Google wanted to be the only one to profit (from ads I might add).

    So yes: the library can be available to all, but once Google is willing to pay the writers.
     

  • In a world where every book, every music recording, every movie, tv show, all media is readily available for free somewhere on the internet, there aren't enough hours in the day to read/listen to/watch all of it. And that's just what's in English. It's a noble cause but in the final analysis over 90% of it is not worth the time or effort. The totality of human knowledge is a real mess.
  • by Myself ( 57572 ) on Friday October 20, 2017 @01:08PM (#55404403) Journal

    Meanwhile, archive.org is scanning a thousand new books every day [archive.org] and nobody's writing news stories about it...

  • Back when Google first announced the book scanning project, there was a lot of argument back and forth here on slashdot about how copyright laws would affect the scope and scale of the project. We collectively agreed that snippets of content were fair use long before it was brought to the courts. A big part of the reason for that opinion was the previous experience we had in watching Google make content from Usenet and news organization websites accessible within the search function. There was a lot of deba
  • Most people don't know that there are a LOT of dark archives out there. They're used to back up journals and rare books to ensure that they survive should something happen (publishers go out of business, fires, etc.)

    I saw a talk once about a dark archive for music research. (I think it was at Research Data Access and Preservation, but could've been ASIS&T). They allowed people to submit jobs to run against it, but it was important that the results couldn't be used to recreate the music (possibly in conjunction

  • Google isn't innovating here. Overdrive has been around for quite a while and provides a very nice search interface showing which ebooks are available at your selected libraries. Also considerable integration with local libraries appears to be happening.

  • Imaginary Property is theft. Culture belongs to the People - it is not the personal property of degenerate capitalists.

    It must be very obvious to everyone now, that ownership of ideas makes the whole world needlessly stupider, and should be ended now.

    Until these bad laws are removed we must honor those heroes like Alexandra Elbakyan who are expropriating scientific knowledge from the rich hoarders and freeing it for the enlightenment of the whole people.
