The Internet

New Search Engine Takes "Dyve" Into the Dark Web

Posted by timothy
from the looking-for-porn dept.
CWmike writes "DeepDyve has launched its free search engine that can be used to access databases, scholarly journals, unstructured information and other data sources in the so-called 'Deep Web' or 'Dark Web,' where traditional search technologies don't work. The company partnered with owners of private technical publications, databases, scholarly publications and unstructured data to gain access to content overlooked by other engines. Google said earlier this month that it was adding the ability to search PDF documents. In April, Google said it was investigating how to index HTML forms such as drop-down boxes and select menus, another part of the Dark Web."
This discussion has been archived. No new comments can be posted.

  • so... (Score:5, Funny)

    by u4ya (1248548) on Wednesday November 12, 2008 @04:53PM (#25738527) Homepage
    this will help me get more porn, how?
  • Pay walls (Score:5, Informative)

    by tepples (727027) <tepples&gmail,com> on Wednesday November 12, 2008 @04:58PM (#25738593) Homepage Journal

    The company partnered with owners of private technical publications, databases, scholarly publications and unstructured data to gain access to content overlooked by other engines.

    I know why the other engines don't index these documents: they're behind pay walls. As the second link points out, Google already indexes (some) PDFs, but that doesn't help if the site doesn't want me to see the PDF. There are a lot of topics, such as disability rehabilitation and linguistics, that I can't search for without Google returning a bunch of results from sites that require a subscription but to which my county library [acpl.info] doesn't subscribe. (A tip-off for these results is that "Cached" doesn't show up.)

    • Re:Pay walls (Score:5, Informative)

      by philspear (1142299) on Wednesday November 12, 2008 @05:03PM (#25738637)

      It appears this website ITSELF requires a subscription: the "beta" is free, the "pro" is not. Signing up for the beta will get you a registration page, followed by this helpful message:

      "Due to the wonderful interest that we have received, we will be sending out your username and password next week.
      We hope you enjoy using DeepDyve, the research engine for the Deep Web!"

      Not impressed so far that they can't let me use the search for a week unless I pay them money. Don't fall for this scam.

      • That must be new. I got a username and password yesterday. Unfortunately neither my test search nor the one test search that I tested among those posted returned any results. I suspect it was a victim of /.ing before even being posted to /.
      • Re:Pay walls (Score:5, Insightful)

        by z0idberg (888892) on Wednesday November 12, 2008 @06:26PM (#25739777)

        If they can't set up a registration system that can get someone registered in under a week then how good is the rest of it?

        And what do they need my street address for?

        Pass.

    • That's a very valid concern (and an astute observation) you raise. Still, I look forward to seeing how much of this dark web can be freely illuminated. I wish them the best of luck in their "DeepDyve" project. But at the moment, I confess my mind is slightly more focused on getting some "DeepFryde". Mmmm, grease.
    • "There are a lot of topics, such as disability rehabilitation and linguistics, that I can't search for without Google returning a bunch of results from sites that require a subscription"

      To me that's a breach of Google's own guidelines.

      Here are Google's guidelines:

      'Make pages primarily for users, not for search engines. Don't deceive your users or present different content to search engines than you display to users, which is commonly referred to as "cloaking."'

      In 2006 they blacklisted BMW for breaching them:

      • by tepples (727027)

        As Google's user, I very rarely want to get search results for content that I can't access.

        If you are still in university, you can access it: e-mail the URL to yourself and access it from a desktop computer inside the university library. I guess Google assumes that anybody who searches for what it thinks are scholarly topics is still in school and therefore has access to the school's subscription to JSTOR, Wiley, Elsevier, SpringerLink, and other closed-access scholarly journal sites that I see spamming the listings. But for those who have left school after finishing a degree, tough droppings. I'

        • by TheLink (130905)
          1) I'm not in university
          2) Which universities are allowed access? I'd find it interesting and amazing if somehow all universities worldwide had access.

          As it is, such results are useless and a big waste of time to me.
          • by tepples (727027)

            Which universities are allowed access?

            Any library that subscribes to journals is allowed access to articles from those journals. In general, underfunded county libraries choose not to spend tax money on them, but libraries affiliated with universities that offer courses in those areas need the subscriptions for their students.

  • This will certainly defeat the practice of obfuscating links with e-mail addresses in them, by using a picture link or "click here."

    If the search engine can read source code, it can certainly parse out an email address.
    • Re: (Score:3, Informative)

      by tepples (727027)

      This will certainly defeat the practice of obfuscating links with e-mail addresses in them, by using a picture link or "click here."

      "Click here" still works: use a web form to send e-mail instead of disclosing an e-mail address that doesn't use a whitelist. And AFB has reported that a picture of an address doesn't work even for legitimate users of speech or braille browsers.
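
The point made in this subthread, that anything able to read a page's source can pull an address out of a "click here" link but not out of a picture, can be illustrated with a minimal Python sketch. The pattern `EMAIL_RE`, the helper `extract_emails`, and the sample `page` are all hypothetical names made up for illustration; the regular expression is deliberately simple and does not cover the full address syntax, and nothing here describes how DeepDyve or any real crawler works.

```python
import re
from html import unescape

# Deliberately simple pattern (an assumption for illustration);
# real e-mail address syntax is much broader than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(html_source):
    """Return the unique addresses found in raw HTML, in first-seen order."""
    # Decode character references such as &#64; (an encoded "@") first.
    text = unescape(html_source)
    found = []
    for match in EMAIL_RE.findall(text):
        if match not in found:
            found.append(match)
    return found


# Hypothetical page: the address is hidden behind "click here" link text,
# and a second address exists only as pixels inside an image.
page = (
    '<a href="mailto:someone@example.com">click here</a> '
    '<img src="addr.png" alt="my address">'
)
print(extract_emails(page))
```

The `mailto:` target is recovered even though the visible text is only "click here", while the image yields nothing without OCR. That matches the reply above: a web form that never exposes the address is the approach that keeps it out of the page source.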

  • I have pondered how or if this information could be made available. Looking good for open access!
  • You have to pay?! (Score:5, Insightful)

    by LingNoi (1066278) on Wednesday November 12, 2008 @05:06PM (#25738679)

    I don't know about you guys but I prefer not to have to sign up or use the "pro" version for my web searching needs.

    In fact why do I have to sign up to web search anything?

    Besides this thing looks like it just gets in your way [deepdyve.com].

    Thanks, but it's not a google killer.

    • by jabithew (1340853)

      Because it searches stuff Google can't access because of pay barriers. I have to be on the college network to get at this stuff.

      BTW, this is not especially revolutionary as Google Scholar from my university can search this stuff. Don't know quite how it works, but it seems to tie in with SFX and/or Metalib. Only it works much better than Metalib.

  • Woo... (Score:2, Insightful)

    by ZekoMal (1404259)
    Just what I needed: 40 million NEW search results to sift through. I already have to deal with the first 5 pages being useful, followed by 60 pages of, let's say, 'pokemon glitch' that is really someone's blog with 500 words slapped on the bottom (nothing quite as useful as finding out that a website that came up in the search says at the very bottom of the blog 'boobs anal pokemon glitch asians etc').

    Good news is I can finally PAY to be annoyed.
  • Riiiight. (Score:4, Funny)

    by He Who Waits (1102491) on Wednesday November 12, 2008 @05:16PM (#25738809)
    It's apparently not working right now. But give it all your personal information now, and they will get back to you.
  • by z-j-y (1056250)

    basically it's like cavity search for the internet.

  • by Anonymous Coward

    Login? to search a "dark net".

    You are fucking kidding?

    I was right about Tesla crashing. I'll make another prediction.

    DeepDyve out of business in 1 year.

    Cheers,
    Kilgore Trout

    P.S. : get the Cyrillic fonts enabled. Russia is invading the U.S.S.A. Finally !!!

    • by RobBebop (947356)
      I know you! For years Anonymous Coward has been making all sorts of predictions. You can't improve your credibility from one specific event to make me believe you! And signing the e-mail with the name of a fictional character from Vonnegut doesn't help either.
  • Try it (or not) (Score:1, Insightful)

    by Anonymous Coward

    I shall certainly try it out.
    BUT, if it is anything like how badly Cuil went on its first week, it will fail.

    Instantly, just by seeing the frontpage, i don't have high hopes.
    You have to sign up?
    Yes, i will try it, when i can be bothered signing up, WHICH would probably be never, as i will probably forget about it until the article is posted here in a month saying how awful it is doing.

  • by harmonica (29841) on Wednesday November 12, 2008 @06:08PM (#25739495)

    The summary is a bit misleading. Google has been indexing the textual parts of PDFs for a long time. According to the article they have now started indexing scans inside of PDF files, which requires OCR.

    Google has been doing that for catalogs [google.com] for a while now, but OCRing large numbers of scans obviously requires a lot more resources.

  • Either this DeepDyve thing is the best search engine ever or they are smoking crack. They have a pro version for $45 a month: http://www.deepdyve.com/why_deepdyve/deepdyve_pro [deepdyve.com]. Those have got to be some pretty good Venn diagrams to be worth $45 a month...
    • Not that it is likely that many people will read this but I got my login to the beta and used it a bit.

      I'm in the humanities so perhaps the experience of people in the sciences will vary. However, I'm not terribly impressed. It has potential but as it is there doesn't seem to be much that it offers that Google Scholar doesn't and it has things that don't seem to be of much use at all.

      Most of my search results came from Sage Publishing which typically show up in Google Scholar results anyway. If the
  • "In April, Google said it was investigating how to index HTML forms such as drop-down boxes and select menus, another part of the Dark Web."

    -Great, now I can have 10,000 times more irrelevant search results to dig through!

  • There is a difference! A "Dark" web (or more properly Dark Net) is designed to be private. The "Deep Web" simply accesses more information that has always been public, just hard to find.

    There is a VERY big difference!
  • Google (Score:3, Insightful)

    by Aggrajag (716041) on Thursday November 13, 2008 @01:14AM (#25743141)
    I'll wait for Google to assimilate DeepDyve before I'll check it out.
