The Internet

Vertical Search Engines and Copyright 62

Posted by kdawson
from the in-the-aggregate dept.
An anonymous reader writes "I am a big fan of Oodle, the online classifieds aggregator. I was disheartened when Craigslist announced that they would block Oodle from their site in late 2005 (old link), as I find their service very handy. I came across this page at the site of an aggregator of freelance job openings that summarizes the arguments around the legality of meta search engines and mashup-like sites and I found myself wondering if Oodle could have avoided the ban. There is an interesting argument there that seems to undermine copyright claims of user-generated content compilations. Are mashups legal? How does this affect sites like Digg or YouTube?"
This discussion has been archived. No new comments can be posted.

  • by blaster151 (874280) * on Tuesday July 10, 2007 @03:26PM (#19817653)
    In content aggregation lies all of my excitement about the future of the web (if people are allowed to continue being innovative and aren't prevented by heel-dragging by legal departments).

    I don't even care if the aggregation happens server-side or browser-side. I want to be able to view a book product page on Amazon and click a "place local library hold" button. I want to be able to view my LiveJournal Friends page and have superimposed queue and "recently watched" displays for those folks who are also my Netflix friends. Or current weather reports for those friends' locations. Fun stuff. I want to be able to stumble across an old news story and have a "there were 117 comments when this story was posted to Slashdot five months ago" notification.

    There is so much potential here for crossover - and it's all data that already exists! Crosslinking through simple knowledge of "which person on one service is which person on another service" - and "which product on one service is which product on another service" - would open so many doors. I hope legal departments don't keep preemptively closing them. To me, this is what would excite me if it were true about "Web 2.0" - beyond just simple pretty, AJAX-enabled user interfaces. Although those are cool, too.
    • by Phroggy (441) <slashdot3NO@SPAMphroggy.com> on Tuesday July 10, 2007 @03:28PM (#19817687) Homepage

      Crosslinking through simple knowledge of "which person on one service is which person on another service" - and "which product on one service is which product on another service" - would open so many doors.
      Wasn't this more or less the dream of Microsoft Passport?
    • by Shimdaddy2 (1110199) on Tuesday July 10, 2007 @03:38PM (#19817817)
      I'm glad you don't care where or how the aggregation happens, but who is going to pay the bills? If you use Amazon to find local books, what does Amazon get out of it? I think the real winner will not be the person who first creates all this aggregation, but the person who does it all in a way that allows profits to be shared.

      But this sharing is where problems arise, as everyone thinks they're entitled to a larger share of the cash than the next person...
      • What bills?

        I can already perform much of the above aggregation myself - manually and for free.

        If you're talking about someone investing development time for a cool browser plug-in or aggregator website that automated it for me, though . . . well, I know that I for one wouldn't mind kicking in some $$$ for something that useful.
        • Re: (Score:3, Insightful)

          by fbartho (840012)
          I think he's saying that Amazon and others get value by pushing their branding, and ads in your face when you use them. Some percentage of users actually generate revenue even though they were only contacted through these free options. Mashups, especially vertical search engines, can cause problems for the providers, because they let a user who currently uses that free stuff and is occasionally swayed by the ads, still get the value (and more) out of the free stuff, without providing any value, AND it lets
      • I'm glad you don't care where or how the aggregation happens, but who is going to pay the bills? If you use Amazon to find local books, what does Amazon get out of it?

        Precisely. Along the same lines, the OP blames the 'legal departments' - he doesn't care about other people's rights, or about their ability to pay their bills. He just wants what he wants, now, Now, NOW!

        Other people and their rights and interests be damned.
      • by raehl (609729)
        but who is going to pay the bills?

        You are. You install a browser plugin that adds the button to the Amazon.com pages that you view - it just takes the ISBN number from the Amazon page and matches it up with the ISBN number at the library and adds the button for you.

        Maybe you have to pay for that plugin, but quite likely it'll be a free plugin just like many plugins currently are.
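
The ISBN matching described above can be sketched roughly as follows. This is only an illustration in Python, not an actual browser plugin: it assumes the product URL carries an ISBN-10 in a `/dp/<id>` path segment, and the catalog address `library.example.org` is a made-up placeholder for whatever search URL a given library offers. The function names are hypothetical.

```python
import re


def is_valid_isbn10(isbn):
    """Return True if the 10-character string passes the ISBN-10 checksum."""
    if not re.fullmatch(r"\d{9}[\dX]", isbn):
        return False
    total = 0
    for position, char in enumerate(isbn):
        value = 10 if char == "X" else int(char)
        total += value * (10 - position)  # weights run 10 down to 1
    return total % 11 == 0


def isbn10_to_isbn13(isbn10):
    """Convert an ISBN-10 to ISBN-13 by prefixing 978 and recomputing the check digit."""
    core = "978" + isbn10[:9]
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    return core + str((10 - total % 10) % 10)


def library_link(product_url,
                 catalog_base="https://library.example.org/search?isbn="):
    """Build a (hypothetical) library search link from a product URL.

    Returns None when no valid ISBN-10 is found in the URL.
    """
    match = re.search(r"/dp/([0-9]{9}[0-9X])(?:[/?]|$)", product_url)
    if not match or not is_valid_isbn10(match.group(1)):
        return None
    return catalog_base + isbn10_to_isbn13(match.group(1))
```

A real plugin would still have to inject the button into the page and cope with products that are not books, which this sketch handles only by returning `None`.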
      • by damelang (1028734)

        I'm glad you don't care where or how the aggregation happens, but who is going to pay the bills?

        As technology (both hardware and software) progresses, these bills might become so low that it won't really matter much any more.

        Perhaps at that point we'll have the whole of social computing running on an open, distributed, p2p-like system so that we all share the "bills" without even thinking about it. Or are we going to continue with this walled-garden approach to user-generated content? "Open APIs" like the Facebook API aren't open enough, IMO, as Facebook is the gatekeeper and still has the final s

    • by Threni (635302)
      > if people are allowed to continue being innovative and aren't prevented by heel-dragging by legal departments

      If they were innovative there wouldn't be a problem. Using other people's copyrighted material is likely to cause problems though, right?

      > There is so much potential here for crossover - and it's all data that already exists!

      Yes, but it's *not your data*!
      • In the sense that it's been served up to me, for free, I consider some of the ingredients of the mashups I described to be "my data" - my Netflix and Blockbuster queues, my friends lists on blogging sites (along with the entries they've written), etc. I'm not suggesting using some backdoor to take stuff merchants want to sell, and make it free.
    • You've been listening to that Berners-Lee fellow, haven't you.
    • by PopeRatzo (965947) *
      The fact that there's even a question about the legality of content aggregation shows just how useless our current intellectual property law has become. The discussion we should be having is how should we replace it? I think it's too late to try to fix it. There are flaws in the underlying model.
  • by RingDev (879105) on Tuesday July 10, 2007 @03:29PM (#19817711) Homepage Journal
    Now, maybe I'm just not keen on the latest batch of synergistical leet speak, but aren't Digg and YouTube user contribution driven aggregators? Isn't the key feature of a Mashup that it uses functionality from different web services to create a new set of functionality? Say like tying CNN's RSS feed to Google Maps to Flickr to get an interactive graphical, geographical, news browsing interface.

    Or am I just out of touch?

    -Rick
    • by dunezone (899268)
      No man, you're way in touch with everything. Therefore you are a witch, prepare to be burned at the stake.
      • Re: (Score:3, Funny)

        by RingDev (879105)
        Damn it. Not again!

        -Rick
        No man, you're way in touch with everything. Therefore you are a witch, prepare to be burned at the stake.

        If he weighs the same as a duck, he's made of wood. And therefore ...

    • by OverlordQ (264228)
      No, calling anything web-related a 'mashup' is a horrible bastardization of the word.
  • by Otter (3800) on Tuesday July 10, 2007 @03:50PM (#19817961) Journal
    I found myself wondering if Oodle could have avoided the ban.

    If I'm understanding correctly, craigslist has terms of service, and Oodle was systematically violating them. That's their right, whether there's a formal copyright violation or not.

    I'd never heard of Oodle, but craigslist is notoriously easygoing and their terms (you can run searches but not mirror the whole damn thing) seem reasonable, so I think the way Oodle could have avoided the ban is by not pissing Craig off.

    • craigslist is notoriously easygoing and their terms (you can run searches but not mirror the whole damn thing) seem reasonable, so I think the way Oodle could have avoided the ban is by not pissing Craig off

      Craig's gone man, sold out, walked away. I believe it was to these [ebay.com] people.

      • Re: (Score:1, Informative)

        by Anonymous Coward
        Craig is still the chairman. Look at their site.
  • mashup's (Score:3, Interesting)

    by jshriverWVU (810740) on Tuesday July 10, 2007 @03:52PM (#19817989)
    I've wondered how mashups would survive. Back in the 90's companies were suing others for just linking to their site, let alone blatant copying of data feeds. It's a tricky situation.

    If it's a site that is funded strictly from ads, then they have a lot to lose by others ripping their content. But at the same time mashups are a wonderful way of getting a lot of similar info together so it's a convenience to the end user.

  • by blueZhift (652272) on Tuesday July 10, 2007 @03:59PM (#19818051) Homepage Journal
    I don't know if mashups are legal in the strictest sense, but I do have an idea how I would want it to work. Academic publications are impossible to produce without citing the work of others. That's how research works. Information that did not originate with the author is attributed to its respective source(s). No muss, no fuss, usually, and there are accepted conventions for how this is done. Right now I don't think the web has any such accepted conventions, but it should. Practically speaking, it would be impossible to close down all aggregation sites anyway, so the best course of action, imho, would be to develop standards for citing information that comes from other sources. While these still can't be enforced 100%, peer pressure should at least give people the idea that citing sources is a good thing.
    • by DGolden (17848)
      Well that sounds fairly sensible. It'll never catch on. ;-)

    • by stubear (130454)
      That only covers half the problem. How do you keep others from profiting from your work? Why should another site be allowed to sell ad space on their site and generate revenue from content I produced without sharing that revenue with me?
  • by Safiire Arrowny (596720) on Tuesday July 10, 2007 @04:32PM (#19818499) Homepage
    I am making something similar to create notifications for posts on craigslist right now. It is written in Ruby, and it basically enters the sections you specify on craigslist, and downloads and stores the last 100 postings into an Sqlite3 database.

    Then, as an obsessive human might do, it checks the section indexes for updates, say every 10 minutes, and incrementally stores new posts.

    The data in sqlite is then indexed by the ferret search engine library, so that it can perform searches on the post content and uses gtk2's libnotify to pop up a notification bubble if it has found anything you previously said you were interested in.

    I have not gotten banned in any way from craigslist, and I don't expect to be, since beyond the initial download of the sections, it behaves no differently than an obsessive human who might be looking at 10 pages every X minutes. With this, I would necessarily be one of the first people to notice anything on the site that I'm interested in.
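
The store-then-notify loop described above might look something like the sketch below. It is not the poster's program: that one is written in Ruby and uses ferret and libnotify, while this is a Python illustration with plain substring matching in place of a search index. Fetching and parsing the section pages is left out entirely; the `posts` dictionaries with `url`, `title` and `body` keys are an assumed shape, and all function names are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the SQLite store that remembers every post seen so far."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "url TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    return conn


def store_new_posts(conn, posts):
    """Insert only posts whose URL is not stored yet; return those new posts."""
    fresh = []
    for post in posts:
        seen = conn.execute(
            "SELECT 1 FROM posts WHERE url = ?", (post["url"],)
        ).fetchone()
        if seen is None:
            conn.execute(
                "INSERT INTO posts (url, title, body) VALUES (?, ?, ?)",
                (post["url"], post["title"], post["body"]),
            )
            fresh.append(post)
    conn.commit()
    return fresh


def matching_posts(posts, keywords):
    """Return posts whose title or body contains any keyword (case-insensitive)."""
    wanted = [keyword.lower() for keyword in keywords]
    return [
        post for post in posts
        if any(k in (post["title"] + " " + post["body"]).lower() for k in wanted)
    ]
```

A timer would call `store_new_posts` with the latest section index every ten minutes or so and pass the result to `matching_posts`; only posts that are both new and interesting would then trigger a notification.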

    I will probably release this on my site for everyone. I'm aware it's against the terms of service to completely mirror the entire site, but does this count as mirroring? Can it be deemed similar to grepping your Firefox cache, or personal mirroring and indexing?

    I know I'm sure as hell going to use it, that's why I made it, but it is an awkward feeling that if I give it away for free and people liked it, that I could get into some kind of trouble.
    • I'd say the best way to find out if it's against craigslist's ToS/rules/w/e is to covertly ask them about the gray areas of their rules. Try and describe your program without saying it's your program. I personally don't see a problem with it because, as I understand it, it doesn't mirror... it searches and finds what you want, then gives it to you. Pulling some items from the list isn't mirroring the whole list. But hey, I'm not Craig...
  • by w0lver (755034) on Tuesday July 10, 2007 @04:38PM (#19818563) Homepage
    I have experience at two companies that did site aggregation: first a company that found travel deals by searching other sites, and then a job site that did the same. Searching and presenting a summary with a link to the real live content is legal. Taking the content and re-purposing it, even with credit, is illegal. So as an example, with a travel site, searching all the airlines, Expedia and so on, and displaying links with prices is valid. However, showing the flights and prices without links and then booking in the background, never displaying the source site, is illegal. A number of companies tried to sue us; we would send over legal opinions and case history on the topic, and the suits would disappear. However, we did have a few sites that blacklisted our IPs, tried to break our scraper, and posted nasty things about us on other sites.
  • I wonder what the rules are surrounding my site..

    Then again, news = current events and current events are not copyrightable..
    • by coaxial (28297)

      Then again, news = current events and current events are not copyrightable.
      Perhaps not, but reports about the current event are, and always have been.
      • by neoform (551705)
        Yes, but the copyright rules around reporting on current events are different than regular copyrighted material.
  • I have found Yahoo Pipes to be an indispensable companion to Craigslist RSS feeds. I can plop in feeds from, say, Fresno and SF Bay, search with positive, negative, and grouped searches, and restuff the results back into a new RSS feed.
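
The merge-and-filter step described above can be approximated in a few lines. This is not how Yahoo Pipes works internally; it is a Python sketch that assumes the feeds have already been parsed into lists of dictionaries with `title` and `link` keys, and the function and parameter names are hypothetical.

```python
def merge_feeds(*feeds):
    """Combine several lists of feed items, dropping repeated links."""
    seen_links = set()
    merged = []
    for feed in feeds:
        for item in feed:
            if item["link"] not in seen_links:
                seen_links.add(item["link"])
                merged.append(item)
    return merged


def filter_items(items, must_have=(), must_not_have=(), any_of=()):
    """Keep items whose title has every must_have term, no must_not_have
    term, and (if any_of is given) at least one term from that group."""
    kept = []
    for item in items:
        title = item["title"].lower()
        if not all(term.lower() in title for term in must_have):
            continue  # a positive term is missing
        if any(term.lower() in title for term in must_not_have):
            continue  # a negative term is present
        if any_of and not any(term.lower() in title for term in any_of):
            continue  # none of the grouped alternatives matched
        kept.append(item)
    return kept
```

Writing the kept items back out as a new RSS feed is a separate serialization step that this sketch does not cover.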
  • As an aside, let me first just say this is a terrible slashdot posting on an interesting subject. The linked article is nothing but an ad (well, an about page) for yet another job search company. Kudos to their marketing team for getting on slashdot though.

    Anyway the comments so far seem to be blurring together several important but very different notions.
    1. The legality of crawling another company's publicly available web site and sucking down their content
    2. The legality of republishing that content in some manner
  • another site: ABCFREE.COM
    used to do the same thing, but they would stick Google Ads in between the actual scraped content so you were more inclined to accidentally click a Google Ad than the Classified that you really wanted to see.
    ABCFREE.COM seems to have lost their Google Ad account because of this, and then I guess it was not worth scraping Craigslist anymore, because the site has had a "down for maintenance" page up for quite some time now.

    How many other sites had a business plan like this based on scraping

  • First thought that came to my mind was artists from clearly different genres of music collaborating on a song, such as Eminem and Elton John, or Nelly and Tim McGraw, or when a producer of a mix-tape samples various older songs from different genres and makes some sort of a dance mix, or even an entirely new song.

    Those would be cool to see more of.
  • Is it just me, or are job boards the worst offenders of "please don't use our content"? This has happened recently in Australasia with the takedown of jobby.co.nz and the legal threats of seek.com.au to myspider.com.au (blogged about at http://www.engageonline.co.nz/blog/?p=84 [engageonline.co.nz]).

    What really miffs me, is how the job boards can say they "own" the content, when actually, it's been posted by other people on these sites and is really their content.
