The Internet Science

The Web of Data, Beyond What Google and Yahoo Show

Posted by timothy
from the thought-symantic-was-just-some-company dept.
jccq writes "Both Google and Yahoo have been supporting Semantic Web markup (RDFa, RDF and Microformats) for weeks and months respectively. What they do, at the moment, is use the markup only for visual feedback by returning better looking, more functional 'page snippets.' But how would it look if you could get all these bits and compose them automatically to form a single structured information page about what you're searching for? The folks at the DERI institute have just released Sig.ma, a visual browser and mashup generator that will go all over the web of data and find dozens of sources to combine together when answering a user query. It also comes in API mode to reuse the information Sig.ma finds inside applications. Here are a screencast and a blog post, with semantic-web-geek details."

  • by ionix5891 (1228718) on Sunday July 26, 2009 @07:15PM (#28831147)

    and studied at nearby uni,

    DERI is a money black hole. Most of the people there know that the semantic web has many, many issues and probably will never bear fruit, but choose not to speak up so as not to damage their academic careers and to keep their cushy "research" positions.

    • by Sique (173459)

      Hey, but going sledge riding in the Alps with some of the women from DERI was nice though.

    • by jccq (113245)

      The semantic web has many issues, but these are not denied.

      Quite evidently, however, there are people who put a lot of work into this, finding the right balance between technology and socially sustainable models.

      But go ahead, just diss everybody :-) why not, I mean.

      • The summary calls this a "visual browser" but I don't see any downloadable browser programs??? All I see is a *search engine*. Oh well. I guess that's to be expected in a world where people think Google/Yahoo are browsers.

        I typed my name into this Sig.ma search engine, and it turned up virtually nothing. So yes, I'd say this approach has serious problems. Using my name in Google turns up all kinds of dirt... er, information about myself. I'll stick with Google.

        • P.S.

          Anyone have suggestions on how I can remove the "dirt" off my self-search google results? I've deleted some of the original messages from the 1980s and 90s, but for some reason they keep hanging around in archives.

          Could I claim "copyright" over my own words, and issue a DMCA takedown notice? Hmmm.

    • DERI has produced some good stuff: I particularly like D2R which is a wrapper providing a SPARQL endpoint around a relational database. Both cool and useful (if you know how to use SPARQL queries).

    • I was not involved with them, but I agree with what you're saying!
      ~Ami
      Chicago Web Design [transcendevelopment.com]
  • by ghostis (165022) on Sunday July 26, 2009 @07:24PM (#28831223) Homepage

    The folks at the DERI institute used to have Sig.ma, a visual browser and mashup generator that will go all over the web of data and find dozens of sources to combine together when answering a user query.

    • Re: (Score:3, Interesting)

      by ghostis (165022)

      As I wrote the above I realized that "used to {verb}" is a really odd idiom. Can anyone explain?

      • Re: (Score:1, Informative)

        by djfuq (1151563)

        I used to smoke - now I'm smoking

        • by ghostis (165022)

          I was looking for the origin, actually... :-/

          • Re:Fixed for you... (Score:5, Informative)

            by GigsVT (208848) on Sunday July 26, 2009 @08:58PM (#28831827) Journal

            I was looking for the origin, actually... :-/

            http://www.englishpage.com/verbpage/usedto.html [englishpage.com]

            "Used to" expresses the idea that something was an old habit that stopped in the past. It indicates that something was often repeated in the past, but it is not usually done now.

            I wonder how you could ever tell a semantic search engine that you wanted the history of the idiom itself. Google picked it right up though, just had to search for "used to" quoted.

            Semantic intelligence in the form of incoming links is pretty damned powerful, anyway.

            • Re: (Score:3, Informative)

              by brusk (135896)
              With Google, you can search for "define:$word", which looks in dictionaries. Not perfect but for this kind of task it's helpful.
            • I wonder how you could ever tell a semantic search engine that you wanted the history of the idiom itself.

              Probably much more easily than with Google. If you want to look up the etymology of "used to", a query like:

              "used to" etymology ?

              And it would complete your "sentence" by finding the value of "?".
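
For what it's worth, the "?" idea above is basically triple-pattern matching. Here is a minimal sketch in Python, assuming a made-up in-memory list of (subject, property, value) triples. The `TRIPLES` data and the `match` helper are hypothetical, purely for illustration, and are not anything Sig.ma or a real RDF store exposes:

```python
# Hypothetical in-memory triples: (subject, property, value).
# The data is placeholder text, not real etymology.
TRIPLES = [
    ("used to", "etymology", "placeholder origin text"),
    ("used to", "partOfSpeech", "idiom"),
    ("supernova", "type", "astronomical event"),
]


def match(pattern, triples=TRIPLES):
    """Return the values that fill each '?' slot for every matching triple."""
    results = []
    for triple in triples:
        # A slot matches if it is the wildcard or equals the stored term.
        if all(p == "?" or p == t for p, t in zip(pattern, triple)):
            results.append(tuple(t for p, t in zip(pattern, triple) if p == "?"))
    return results


# "used to" etymology ?
print(match(("used to", "etymology", "?")))
```

The hard part in practice is not the matching but getting everyone to agree on what "etymology" is called, which is the ontology problem discussed elsewhere in this thread.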

      • by logixoul (1046000)
        "I *used to* X" - "In the past I *was used to* doing X" - "In the past I had the habit of doing X".
  • Markup (Score:4, Informative)

    by jefu (53450) on Sunday July 26, 2009 @07:29PM (#28831265) Homepage Journal

    RDF is nice and there are various different syntaxes for it (including various triples formats), and promises, if it can be built, deployed and trusted(!!!) to make the web ever so much more searchable. This will depend though on people writing good ontologies (not easy) and using them correctly (even less easy).

    RDFa and microformats look, on the surface at least, to be nice ways to manage RDF type information in HTML. But I'm a bit more dubious - they don't, in many cases, have careful ontologies built around them - when they do (RDFa, mostly) they seem to be very resource intensive (a heavily RDFa annotated HTML page is likely to balloon to several times the same page without RDFa), and the uses of them I've seen have been less than convincingly correct. This doesn't mean that they're useless, just that they're not doing the job at the moment, or they're doing the job poorly.

    The solution that seems to be favored by the semantic web types is to present RDF pages as an alternative to HTML pages when RDF is requested. This looks, by far, to be the best way to work this, but does require site builders (and CMSs and web frameworks), and content authors, to be able to build correct RDF pages that represent the information presented, often at the same time as they present HTML pages to human readers (and non-RDF search engines). This is going to be a major problem.
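
To make the "RDF when RDF is requested" idea concrete, here is a rough sketch of the server-side decision, assuming the client signals its preference through the HTTP Accept header. The `choose_representation` function and the two-entry `SUPPORTED` table are hypothetical names for illustration, and the parsing is deliberately simplified (it ignores wildcards such as `*/*`):

```python
# Media types this hypothetical site can serve, mapped to a short label.
SUPPORTED = {"text/html": "html", "application/rdf+xml": "rdf"}


def choose_representation(accept_header):
    """Pick 'rdf' or 'html' from a simplified Accept header, honoring q-values."""
    best, best_q = "html", 0.0  # fall back to HTML for human readers
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0  # default quality when no q parameter is given
        for f in fields[1:]:
            if f.startswith("q="):
                try:
                    q = float(f[2:])
                except ValueError:
                    q = 0.0
        if media_type in SUPPORTED and q > best_q:
            best, best_q = SUPPORTED[media_type], q
    return best


print(choose_representation("application/rdf+xml"))
print(choose_representation("text/html,application/xhtml+xml;q=0.9"))
```

The routing is the easy bit. The major problem mentioned above remains: someone still has to produce an RDF page that correctly matches what the HTML page says.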

    • by QuantumG (50515) *

      Why is it that every time someone mentions ontologies I can't help but think of pseudoscience... and that typically makes me think of Scientology? Oh, that's right, because it's all bullshit. Ontological classification is completely arbitrary, and typically only helpful when it is specifically tailored to a particular application.

      • "Foo is bullshit. Foo is completely arbitrary. ... Uh, but Foo is useful when done a certain way..."

        While I get the gist of your comment (assuming you don't actually have a self-contradictingly caricaturized model in your head), it seems to me you could have put it more clearly.

  • It was the future in 2001 [amazon.com]; inspired the masses with its vision of the glorious future in 2003 [amazon.com]; and of course we are presumably right on the cusp of this golden future today.

    • by aharth (412459)
      The field has come a long way since 2001 or 2003.

      The main obstacle to "this golden future" so far has been an insufficient amount of data published online. Many organisations sit on their data like hens sit on their eggs, and publishing data right requires some effort.

      That's slowly changing, especially with more openness and transparency -- voluntarily or forced -- in all kinds of organisations and agencies (data.un.org, data.gov, data.gov.uk... ), more people getting the idea of open data, and the establi
  • Cat got my tongue (Score:4, Interesting)

    by WiFiBro (784621) on Sunday July 26, 2009 @07:33PM (#28831309)

    I don't know why but their presentation pisses me off beyond reason.

    Probably because it's the n-th time somebody is trying to impose some silly standard.

    And pretends it's the best invention since you-know-what.

    I have in real life a fairly common name, there's at least 10 of me worldwide, I recognized that they deliberately picked a unique name to show how well it works.

    Ach we'll see.

    • Re: (Score:1, Interesting)

      by Anonymous Coward

      I don't know why but their presentation pisses me off beyond reason.

      Probably because it's the n-th time somebody is trying to impose some silly standard.

      And pretends it's the best invention since you-know-what.

      I have in real life a fairly common name, there's at least 10 of me worldwide, I recognized that they deliberately picked a unique name to show how well it works.

      Ach we'll see.

      It seems trivial to add a city to go with your name and narrow it down.

    • Re:Cat got my tongue (Score:4, Informative)

      by derGoldstein (1494129) on Sunday July 26, 2009 @08:23PM (#28831597) Homepage

      I managed to try it out while it was posted on the firehose, and the very initial impression was good. Gradually, however, I noticed that it was just dumping data on my lap, and left it up to me to sort it out. It reminded me a bit of Wolfram Alpha [wolframalpha.com], except half of the information was wrong (and if I gave it names, most of the information was wrong).

      Even within the presentation, they point out the flaw of having to sift through the mess and pick out the irrelevant information.

      I don't think it's useless, I mean it does provide you with many links that you'd normally not get on other search engines, at least when you enter something unique as a query. But as far as actually placing relevant information in brackets (location:... history:... personal-information:...), it doesn't do a very good job.

      Also, if something is truly unique, you'll get a better result in Wikipedia anyway (in terms of how it's arranged, at least). And if you want more accurate info dumps, Wolfram Alpha currently does it better.

    • Have you actually ever looked into the idea, or do you like to just read a summary, and then rant about how silly it is?

      Semantic data structures (ontologies) are most likely the ultimate way to structure data. If you think that the table is the advancement of the list. And the tree is one step further. Then the next step, that contains it all, are graphs of semantically structured data.
      Tagging stories on /. is a simplified version of it. File systems with soft-/hardlinks are another. And ultimately, I can't

      • Sure... once somebody gets it to actually work!

        And this is yet another example. I entered in a name that I know to be unique, and know also to have hundreds of listings in Google, for example, and many other sources. Yet Sig.ma came up with exactly nothing, after 5 minutes of grinding away.

        FAIL.
      • by WiFiBro (784621)

        "Have you actually ever looked into the idea, or do you like to just read a summary, and then rant about how silly it is?"
        I've RTFA and watched the presentation. It is the presentation I have the biggest problem with: too much "we'll change the web experience," and pretending it is working while it is clearly in its infancy when you test it.

    • by GigsVT (208848)

      Maybe because web devs struggle more than necessary with simple things like maintainability while the W3C tilts at windmills and lets semantic web nerds run the show?

    • I don't know why but their presentation pisses me off beyond reason.

      Because you're unreasonable? ;)

      Personally, I think it's pretty great. There have been lots of attempts at a semantic search engine, but this one looks usable. Unless it does very unpleasant things, it's going to be my default search engine from now on. The semantic web has been a long time coming, but for me, it's finally arrived.

    • by jccq (113245)

      You're right, name disambiguation is not covered by this release.
      The truth is that, thanks to the semantic descriptions, it will be more and more possible to do disambiguation in a smarter, more precise way, e.g. using any other property you might put in any of your online presence files: homepages, work, interests, etc.

      It just takes work :-) A disambiguating Sig.ma is expected by December.

      Cheers.

      p.s. we're not imposing any standard really.. Google and Yahoo ARE supporting RDF, RDFa and Microformats, and p

    • I tried "Barack Obama" and my own name. Neither showed any results. Yes, I understand it's because there are probably no RDF/RDFa formatted pages with these names yet. But as of now I'm not sure what to do with Sig.ma.
  • My name brought the sig.ma server to its knees.
  • Some people have already suggested that common names will cause problems with this system. The next big thing should be searching by context. I hate searching for "supernova" only to get a long list of songs by some band. The keyword "space" or "star" helps, but that usually results in other false hits, too. Don't even get me started on acronyms, or things that don't have anything to do with computer technology.

    Would there be any way for a search engine to examine a whole bunch of keywords and content i

  • It dies on common names and websites, and seems to find the wrong names most of the time. Its main info source is dbpedia, which is an ad hoc system for turning Wikipedia entries into database items (since Wikipedia isn't very semantic, dbpedia has to do some guessing). Maybe Sig.ma will get usable someday; it isn't now.

    ---

    AI Feed [feeddistiller.com] @ Feed Distiller [feeddistiller.com]

    • by jccq (113245)

      Thanks, i tend to agree with you myself (the poster).

      This is still a demonstrator. The idea is to show that this is possible and that putting markup on your pages is useful, because eventually there will be Sig.ma 2, 3 (or whoever else), the S/N ratio will increase, and it will be possible to reuse it with one simple HTTP call to make any SaaS software (or any software, really) do cool things automatically.

      Giovanni

  • I gave it my first and last name, and it came up with nothing. Then, I gave it my complete name. It took my middle name (David) added "Baltimore" as a random last name and gave me facts about somebody named David Baltimore. Absolute, utter, meaningless gibberish. I am not impressed.
  • by Wee (17189)
    I have a slightly outdated buzzword bingo card, but I think I have a winner even still. So, hold your cards.

    -B
  • The frackin' thing doesn't even work. And when you hit the 'Contact' button to yell at the creators of the site for sucking, all you get is an Apache error. Smoooooooooooth.

