
Untangling Web Information

Posted by CmdrTaco
from the i-don't-understand-you dept.
Ostracus writes "The next big stage in the evolution of the Internet, according to many experts and luminaries, will be the advent of the Semantic Web — that is, technologies that let computers process the meaning of Web pages instead of simply downloading or serving them up blindly. Microsoft's acquisition of the semantic search engine Powerset earlier this year shows faith in this vision. But thus far, little Semantic Web technology has been available to the general public. That's why many eyes will be on Twine, a Web organizer based on semantic technology that launches publicly today."
This discussion has been archived. No new comments can be posted.

  • First, before anything even really started, The Semantic Web was merely a pipe dream [slashdot.org].

    But that was the long long ago, so let's fast forward a few years. When its future looked most bleak, Sir Tim (who can summon fire and explosions at will) told us what to expect [slashdot.org] .... twice [slashdot.org]. And we were happy.

    Then a few years passed and nothing.

    Until the 2006 World Wide Web conference made us suspicious of the Semantic Web [slashdot.org]. We spread rumors about the Semantic Web and told all the cooler technologies that the Semantic Web was just out to rape our privacy. So we challenged the Semantic Web [slashdot.org]. And claimed it would fail [slashdot.org].

    Just when I was expecting Sir Tim to get underneath a blanket & release a sobbing YouTube video of everyone being bastards for attacking The Semantic Web right when she was going through really tough times and that we should all just leave her alone ... the Semantic Web went mainstream [slashdot.org] and started getting real [slashdot.org].

    I've got no problem with people pushing technologies but this one sounds more like a soap opera than anything. Has the Semantic Web changed anything for anyone on Slashdot? I haven't seen anything directly if it has ...
    • Re: (Score:1, Insightful)

      by Anonymous Coward

      semantic web = new buzzword for funding

    • by Anonymous Coward on Monday October 27, 2008 @10:58AM (#25527427)

      First off, that was brilliant.

      I've got no problem with people pushing technologies but this one sounds more like a soap opera than anything. Has the Semantic Web changed anything for anyone on Slashdot? I haven't seen anything directly if it has ...

      My problem is simply this: Assuming a "semantic web" existed right here and now, how would I use it? Google I can understand: Go there, fill in the blank with whatever I can think of, hit "Search" and hope for the best.

      Trying to get a computer to understand the meaning of a web page is, fundamentally, getting machines to do my thinking for me. In my experience, they're pretty bad at it.

      And that's not even considering sites with political spin; how would a machine work out the meaning there? Someone's going to say it's wrong, and if that someone is the user performing the search, then the semantic search is going to take the blame.

      This will also lead to "Semantic Engine Optimizers" figuring out how to polish turd websites into something that even shows up, which makes the semantic web less useful. About as useful as, say, Google is now.

      The "Semantic Web" to me is like the future: everyone has their own idea of how it will be, and the reality will disappoint them. Ideas are cheap, and at this point, nobody should care unless they're actually going to make something out of them.

      When a semantic search engine ships that is actually useful, let me know. Until then, I'll stick with the best semantic search engine I've ever seen: the people I know.

      • Re: (Score:3, Interesting)

        by blahplusplus (757119) *

        "Trying to get a computer to understand the meaning of a web page is, fundamentally, getting machines to do my thinking for me. In my experience, they're pretty bad at it."

        They're pretty bad at it NOW, personally I think the web would get infinitely better if all user tastes and profiles were congregated, as they are at delicious, so people with similar interests are pointed to the results found by others. That's one thing I like about delicious, you can browse the bookmarks of others who have "done the th

        • you can browse the bookmarks of others who have "done the thinking" for you.

          Unless you find that thought itself rather off-putting.

          but it means giving up any kind of privacy

          Yet another strike.
          • "Yet another strike."

            Not really since any act of you being on the net is being in public - i.e. recordable and not private, routers, scripts, ads. Anything that has to be downloaded to your system run and loaded and sends data back is all one needs. One can always find creative ways to expose what would otherwise be data through innocuous means.

            Also the math to reconstruct a person's interests is going to get better and better, so it will make exposing data a moot point since you can reconstruct the pattern

      • Re: (Score:3, Insightful)

        by iangoldby (552781)

        I don't think the computers are supposed to do the thinking for us. My understanding of the Semantic Web is an attempt by humans to encode meaning into the markup.

        I'll believe it works when I can finally do a search on a keyword like "digicam" and choose an option to exclude any website that is trying to sell me something.

        You may respond: "That won't work. Sellers will game the system in some way." Maybe. Or maybe it can be designed in such a way that in order to be able to sell anything they will have to g

    • by squoozer (730327) on Monday October 27, 2008 @11:05AM (#25527515)

      I worked, in a research capacity, on technologies for building the semantic web for a while and to be quite honest with you I can't see how it could ever work in the real world. Just in the department I was in there must have been a dozen different ideas for how to build a semantic web and the only thing that tied them all together was the fact that they all relied on humans doing a lot of work to tell a computer what the content was about.

      I'm sure that some of this semantic web technology will be useful somewhere but it's not going to take the world by storm simply because it doesn't work well enough and it requires too much up front effort for possibly / probably no gain.

      The only way I can really see it working is if we can develop AI to the point where it can actually understand what it is reading without a human having to first develop some huge ontology and join the dots for it. But that's just my opinion.

      • The only way I can really see it working is if we can develop AI to the point where it can actually understand what it is reading without a human having to first develop some huge ontology and join the dots for it

        In which case we wouldn't need it anymore?

      • by quietwalker (969769) <pdughi@gmail.com> on Monday October 27, 2008 @11:54AM (#25528295)

        This about sums up my experience with it as well.

        First we started off with categories, and tags in our searches. Then we switched to no-searching, but a filter-based tree mechanism for reducing the number of hits - instead of a table of contents. Then we switched to a table of contents using "task", "product" +4 other tree 'heads'. Then they started mulling over per-sentence tagging. It kept ballooning because it was obvious that though we had all these tags and a hierarchy and divisions, it didn't help - our customers used google to search for our help doc rather than our internal systems/help application.

        In the end, they decided that they needed to automatically categorize everything. I tried to point out the futility of it, and what that would get them, but no one really wanted to listen. They were very surprised in the end when they got a search engine that looked for keywords.

        Exactly what the help system they started with already did.

        The two biggest problems with semantic-anything are:

        1) It doesn't provide any additional value without an exponentially increasing level of (human) effort,
                      and
        2) Unless someone comes up with a single, agreed-upon, final categorization (an ontology), your markup will always disagree with someone else's, except for the simplest things that would be noted by search engines looking for keywords.

        When I left, the project had been ongoing for 3 years, and they still didn't know what they wanted it to do - they were still searching for purpose and changing the target every day. ... we didn't share an office, did we?

        • by theaveng (1243528)

          >>>When I left, the project had been ongoing for 3 years, and they still didn't know what they wanted it to do

          So basically welfare for white-collar workers. Nothing is actually done, but you still get paid. I like those jobs. ;-) Reminds me of when I was working for the government...

        • by squoozer (730327)

          I'm glad I'm not the only one who has experienced this with the semantic web. In the end I left because it was quite clear that the project I was working on was going nowhere fast and I wanted to actually solve problems. The real problem with the semantic web work was that no one had looked at it from a business point of view. To be adopted, a technology has got to increase profits either by growing the customer base or reducing overhead; the semantic web (currently) does neither.

          Even though I have left tha

      • by copdk4 (712016)

        I worked at a big tech company doing SemWeb, where my experience was exactly the same. Everyone was scratching their head.

        Now I've moved into a Healthcare IT environment, where SemWeb makes perfect sense. It's like the best tool for the job.

        The essential difference is what end of the stick you are picking up. The tech folks who are trying to shoe-horn RDF/OWL onto anything n everything (e.g. search) are failing. On the other hand, Healthcare/Life science folks who have to work with heavy knowledge intensive st

        • by squoozer (730327)

          I agree that there are certain areas where the semantic web is actually pretty useful and I think health care is one of those areas. The important thing to realize though is why it works well in that area: it is a very tightly defined domain, it's a domain that has a lot of money, it has a limited number of highly skilled people interacting with the system, etc, etc. This doesn't describe the real world in general.

    • by CaptainPatent (1087643) on Monday October 27, 2008 @11:07AM (#25527537) Journal
      I don't think it has changed anything yet. However, in order to get a fully functional semantic organizer you have to teach a computer English first, then tell it to search. English is my own native tongue so I personally don't remember learning it; however, I do have several friends who learned it later in life, and it was a bear teaching them grammatical exceptions and expressions that do not have consistent meaning (and there are a ton of them).

      Computers are getting closer [slashdot.org] and can "understand" some basic phrases and grammar, but on the whole remain mostly useless because of everything they miss. I think semantic internet implementations are possible, but the reason it hasn't changed anyone's life is that it's still a long ways off.
    • by ceoyoyo (59147)

      Sure has. Some of those tags are funny. Not very useful, but funny.

    • by MrEkted (764569)
      Here's something:
      There are some queries you can Google today, and it will give you a very thoughtful answer. For instance How tall was Babe Ruth? [google.com]

      It's not a canned response, but an attempt to pull information on the height of that man from sites which purport to know (in a generic way).
      Interestingly, if you look at the source of that page, they actually list his height as 6' 2", but Google parses it incorrectly and calls it 6'.

      I think that's how the Semantic web will evolve - from incrementally devel
    • by ultranova (717540)

      I've got no problem with people pushing technologies but this one sounds more like a soap opera than anything. Has the Semantic Web changed anything for anyone on Slashdot? I haven't seen anything directly if it has ...

      Semantic Web hasn't and won't happen because it would require the content producers to include accurate machine-readable semantic information in the webpages they create, but it's in their best interests to include everything imaginable to get maximum viewership, assuming the

    • I have a dozen web sites, and the couple of dozen search engines trying to index the sites grew unsitely. So I now block all but the top six.

      Here's a list of excluded ones:

      #
      # blocked UAs
      #
      regexp -nocase
      {^Mozilla/4.0$|CorenSearchBot|ActiveX|iPhone|nutch|NaverBot|attributor
      |rarest|spider|DBLBot|Robot|Indy Library|Yandex| obot|ISC Systems|
      OOZBOT|WebDataCentreBot|Twiceler|discobot|SnapPreviewBot|Snapbot
      |Szukaj|BecomeBot|oder so|proximic|scoutjet|mrcarlito|Transcoder|
      Opera Mini|SuperBot|WebAlta} $
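For anyone curious how a case-insensitive blocklist like the one above behaves, here is a rough Python sketch. It is not the poster's server config: the pattern is a shortened subset of the list, and the function name `is_blocked` is made up for illustration. Note that the anchors apply only to the first alternative, so a bare "Mozilla/4.0" is blocked while longer strings that merely start with it are not.

```python
import re

# Shortened subset of the blocklist above (assumption: the full list
# would be handled the same way). The dot in "4.0" is escaped here so
# that it matches a literal dot only.
BLOCKED_UA = re.compile(
    r"^Mozilla/4\.0$|nutch|NaverBot|spider|Yandex|Twiceler|discobot",
    re.IGNORECASE,
)


def is_blocked(user_agent: str) -> bool:
    """Return True when the User-Agent header matches the blocklist."""
    return BLOCKED_UA.search(user_agent) is not None


if __name__ == "__main__":
    for ua in ("Mozilla/4.0", "Mozilla/5.0 (compatible; YandexBot/3.0)"):
        print(ua, "->", "blocked" if is_blocked(ua) else "allowed")
```

Substring entries such as "spider" catch any agent containing that word, which is why a list like this tends to be broad.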
  • my future (Score:5, Funny)

    by nimbius (983462) on Monday October 27, 2008 @10:45AM (#25527241) Homepage
    will include a digital rights management compliant
    cloud based on a service oriented architecture
    that will empower my workgroup over the new semantic web 2.0

    insert license fee here.
  • when will slashdot move to a semantic web model?
    • FTFS:

      that is, technologies that let computers process the meaning of Web pages instead of simply downloading or serving them up blindly.

      Do you really want to figure out the true meaning of most slashdot comment pages? That could be totally enlightening, or universe-ending catastrophic. I'm betting on the latter.

      • Do you really want to figure out the true meaning of most slashdot comment pages?

        "I'm sorry Dave, I can't do that ... "

        Works on a couple of levels.

  • Semantics for search engines? You mean like "hunt" engines, "inquest" engines, or "inquiry" engines? Come see the new and exciting functions inherent in the brand new Microsoft "find stuff" engine!
  • by CodeBuster (516420) on Monday October 27, 2008 @10:52AM (#25527341)
    The advertisers and search engine optimizers have already shown that they have absolutely ZERO qualms about providing false or misleading information to search engine robots in the form of page cloaking, hidden frames, false meta tags, etc so what makes anyone believe that they will not play the same games, possibly with even greater result, against the semantic web? There is money to be made by gaming the system and as long as it is possible for website operators to describe themselves on the semantic web then they will describe themselves in any way they have to to drive traffic to their sites and get ad hits, truth be damned.
    • by maxume (22995)

      What you are driving at is that 'You can trust me' isn't really particularly valuable information.

      The upside of the whole semantic web thing is that you still need some sort of trust metric without it, so publishing structured data where possible (rather than removing/ignoring structure in the publishing process; this seems to be the real world outcome of all of the energy going into the semantic web) ends up making things better, even if it doesn't magically solve the trust or effort problems that are inhe

    • Hello CodeBuster, you have that absolutely right: just like on the web anyone can publish whatever they want on the semantic web.

      There are a lot of good ideas for provenance and general trust propagation, but I have not seen this problem really solved.

      It could be that businesses will simply only accept RDF data sources from companies and individuals that they trust. Sort of like blogs: I read about 10 blogs a day that are written by people who I know and trust (at least to write interesting :-)

    • by 615 (812754)

      It's funny you should say that.

      Have you ever used Google AdWords? AdWords encourages advertisers to select relevant search terms for their ads by progressively delisting ads that perform poorly (low click-to-view ratio). Some ads perform poorly because nobody at all is interested, but I think most do so because they're not targeted well (you can sell anything to the right person). Advertisers who wish to succeed must do the work of correctly matching search terms with interests....

      It's been mentioned in thi

  • by Anonymous Coward

    as an "early adopter" all i can say is this is the most overhyped and pathetic bookmarking site i've seen in a while.

    all it does is let you bookmark URLs (via the amazing tech of "bookmarklet"), and then print them URLs embedded in a lot of tags (awww, yeah, RDF, semantics-schemantics). if that is what the semantic web ought to be, thanks, but how about no.

    i tried to upload a picture via their e-mail system from my phone. it was a jpeg with embedded location data. guess what I got -- I got an "item" classif

  • Works like a charm (Score:1, Insightful)

    by Anonymous Coward

    Looking for "Penn State" returned two "tweens". One for the Golden State Warriors and the other for State Cell Phone Driving Laws. How relevant.

    Oh, and here's to hoping "tween" doesn't catch on as a buzzword... ugh.

  • Baby steps (Score:4, Insightful)

    by truthsearch (249536) on Monday October 27, 2008 @11:01AM (#25527463) Homepage Journal

    Twine seems to be just a generic contextual search engine, as opposed to a pure keyword search engine. While it's a step, it's a very tiny step.

    What I want to see is more about the correlation between topics. For example, if I'm looking into PHP templating and search twine [twine.com], I get a few people's bookmarks on the topic. Nothing especially useful, and definitely nothing I couldn't find elsewhere. With real semantics I'd want to see a list of various templating engines, pro and con articles grouped for each, and maybe other sections on related design patterns and frameworks.

    In other words, I want to see semantics. Context search isn't going to make anyone turn their head.

    • by DerCed (155038)

      So you are essentially looking for this page on wikipedia [wikipedia.org]. And you could even help improve the page by adding your own findings. I would think that our beloved Wikipedia is quite semantic and immensely useful.

      Then there's social networking sites with tons of favorites, link lists and events listed that can actually overwhelm you with semantic information.

      If you think about it, the semantic web is everywhere. Not really defined on any kind of protocol level, but rather as web applications or services themselv

      • Not really. That wikipedia page was not written by a computer. So it has nothing to do with the semantic web where pages are connected with some form of computer "understanding". Those are humans documenting the semantics, not computers interpreting them.

        Social networking links still have no "understanding" for the computer. They just tell you what your friends like, not what they mean.

        The only semantics on the web today are human input with no innovative output.

        • by tylerni7 (944579)
          You're right that wikipedia doesn't really 'understand' the subjects it has, but it does have interconnected links in a computer readable form that link different topics together in logical ways.
          While it is generated by users rather than being automatic, it's still a step in the right direction.

          It isn't necessary to have a computer understand the context of something more than what is useful. If a computer can do what we want, who cares if it really 'knows' what it's doing?
  • by BobMcD (601576) on Monday October 27, 2008 @11:06AM (#25527523)

    ...how are they supposed to teach a machine to infer meaning better than they're able to?

    I'm seriously wanting to know.

    • I'm up for sem antics. Like egging the neighbors heuse, that seunds like sem fun times.

    • by mcgrew (92797) *

      ...how are they supposed to teach a machine to infer meaning better than they're able to?

      I taught my toaster to play dead.

        Ah, but does your toaster understand death? Does it question its place in the universe? Does it ask you why you like your toast extra dark?

        If so, I'd be scared of your toaster.

    • by Joe Snipe (224958)

      Some can't even spell it!

      Seriously, though, this is such a huge impossibility (right now). I've been wondering what the hell the financial experts were smoking when they engineered the real estate boom; apparently they were buying it from the tech forecast experts.

    • It's not. (Score:3, Interesting)

      by Stu Charlton (1311)

      All the semantic web gives you is the ability to layer a logical design over data. It's like a database design, except it's "open world", meaning there can be many different designs, it's up to the agent to pick the one it trusts, and it can't really make assumptions based on what it doesn't know.

      The only inferences made are those that have been imagined by some human designer. And they might be very wrong , if the designer was wrong.

      The "kinds" of inferences available are also pretty limited, like hierar
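To make the "open world" point concrete, here is a toy Python sketch. The triples and names are invented for illustration, and the predicates `type` and `subClassOf` only loosely mirror `rdf:type` and `rdfs:subClassOf`. It applies exactly one rule that a human designer chose, and it answers "unknown" rather than "no" for anything it has not been told.

```python
# Hypothetical facts as (subject, predicate, object) triples.
TRIPLES = {
    ("Smarty", "type", "TemplateEngine"),
    ("TemplateEngine", "subClassOf", "Software"),
    ("Software", "subClassOf", "Artifact"),
}


def infer_types(triples):
    """Apply one designer-chosen rule until nothing new appears:
    (x type C) and (C subClassOf D)  =>  (x type D)."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for (x, p, c) in list(facts):
            if p != "type":
                continue
            for (c2, q, d) in list(facts):
                if q == "subClassOf" and c2 == c and (x, "type", d) not in facts:
                    facts.add((x, "type", d))
                    changed = True
    return facts


def ask(facts, triple):
    """Open world: a missing triple means 'unknown', never 'false'."""
    return "yes" if triple in facts else "unknown"


if __name__ == "__main__":
    facts = infer_types(TRIPLES)
    print(ask(facts, ("Smarty", "type", "Artifact")))
    print(ask(facts, ("Smarty", "type", "Database")))
```

If the designer got the class hierarchy wrong, every derived fact is wrong with it, which is the point made above.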

    • I can't afford a slashvertisement but my pipe dream has always been Naive Bayesian Classification as a prime candidate for the job.

      Also, my pipe dream [sux0r.org] is AGPL. In the long term I want it to be distributed, not a service controlled by a single entity.

      There's a lot to improve, but it's there as a starting point for those less interested in handing over all their personal data to a business, more interested in cleaning up le dépotoir known as the web.
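For readers who have not met the technique, here is a minimal naive Bayes text classifier in Python. It is a sketch only and says nothing about how the linked project is implemented: the class name, the labels, and the training sentences are all made up, and it uses add-one (Laplace) smoothing with whitespace tokenization.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Toy multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, label, text):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # log P(label) + sum of log P(word | label)
            score = math.log(self.doc_counts[label] / total_docs)
            total = sum(self.word_counts[label].values())
            for w in words:
                count = self.word_counts[label][w]
                score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    nb = NaiveBayes()
    nb.train("shop", "buy cheap digicam now")
    nb.train("review", "digicam review image quality")
    print(nb.classify("cheap digicam"))
```

The classifier only counts words, so it learns whatever categories a user trains it on; it does not "understand" the pages in any deeper sense.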

  • This concept of a semantic web is pretty new to me as a general "Web 2.0" type buzzword. There's no question that in general we want our computing experiences to be more thorough and intelligent. But if we're talking about computers analyzing the web, what we are really looking for, it seems, is true artificial intelligence. We want the type of AI that Tony Stark has. And I think 25 years from now perhaps we may start getting systems that come close to that goal, but there still appears to be a lot of work re
  • Yawn. Call me when the Ajax Rich AI Virtual Semantic iWeb 3.0 comes out.

    Until then, I'm sticking with Lynx!

  • In a nutshell, the goal of the Semantic Web is to bring knowledge representation to the Web (using graphs, networks, binary predicates, however you want to call it).

    I've been trying to apply data from the Semantic Web for a few years now.
    I can see two roadblocks to mainstream adoption:

    * Web data is immensely scruffy. If thousands of people contribute to a dataset without any restrictions, you get a mess (e.g. multiple URIs used to denote the same class or individual, which results in fractured data)
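The fractured-data problem can be sketched in a few lines of Python. Everything here is an assumption for illustration: the URIs are invented, the function name `canonicalize` is hypothetical, and the `same_as` pairs stand in for whatever equivalence links (for example `owl:sameAs` statements) a dataset happens to provide. A small union-find picks one representative per group of equivalent URIs and rewrites the triples to use it.

```python
def canonicalize(triples, same_as):
    """Rewrite triples so that URIs declared equivalent share one name."""
    parent = {}

    def find(u):
        parent.setdefault(u, u)
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for a, b in same_as:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the lexicographically smaller URI so results are deterministic.
            lo, hi = sorted((ra, rb))
            parent[hi] = lo

    return {(find(s), p, find(o)) for s, p, o in triples}


if __name__ == "__main__":
    # Hypothetical data: two URIs that denote the same person.
    triples = {
        ("ex:TimBL", "ex:invented", "ex:Web"),
        ("dbp:Tim_Berners-Lee", "ex:knightedIn", "2004"),
    }
    same_as = [("ex:TimBL", "dbp:Tim_Berners-Lee")]
    for t in sorted(canonicalize(triples, same_as)):
        print(t)
```

This only helps when somebody has already asserted the equivalences; discovering them in scruffy data is the hard part described above.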

    • There's one issue to think about when discussing the new web, be it social sites or the semantic web. The quantity and quality of information required for both to work properly vs privacy issues.

    • by Tharos (1097065)
      In my opinion the problem with the current semantic web is the use of ontologies. An ontology is very good at describing knowledge within a certain domain such as health care or art. The resulting knowledge base can be searched in very inventive ways with a lot of relevant results. I have experience with tools such as protege to create such ontologies and they work great within a certain domain. The problem however is that an ontology only describes a certain domain. It is not possible to describe the enti
  • I browsed to the Twine page mentioned in the summary and every single link on the page when clicked takes you to a blank white page. So much for a launch.. :)
  • I think that we will see something like the semantic web, but it will likely develop from grass-roots efforts rather than top-down driven standards.

    That said, I have a chapter on the SW in my latest book (hopefully about to be printed in a few weeks) and my next book project will be about the commercial AllegroGraph SW kit.

    Good open source and commercial tools exist for RDF repositories, SPARQL queries and inference, etc. The problem is that the current round of applications don't really excite me (yet).

    The

  • by Grashnak (1003791) on Monday October 27, 2008 @11:58AM (#25528363)

    Buncha anti-semantists on this site.

  • serving up web pages blindly IS what the 'web' is supposed to do. More than that and you have a new application. I'm always amazed at how many even technical people discuss the web as if it WAS the Internet.
  • Search engines pretty much ignore meta tags, because spammers used them to misrepresent their pages and get more hits, so why do these "experts" expect anything different from tags which try to represent "meaning" in the semantic Web?
  • ...technologies that let computers process the meaning of Web pages...

    Assuming it is buggy enough, it could serve as an automated summary-generator for slashdot.

  • ... or how to launch a site using slashdot and a poorly written summary of vague buzzwords.
  • I typed in "XUL" and it surprised me with results that Google will have trouble surpassing. Just go to Google and compare.

    (I tried even more keywords at once, but I was simplifying it more and more... so I ended up with one keyword to get the best results).
