GUI Data Storage KDE Software The Internet

Nepomuk Brings Semantic Web To the Desktop, Instead 140

Posted by timothy
from the semantic-researchers-play-better-tag dept.
An anonymous reader writes "Technology Review has a story looking at Nepomuk — the semantic tool that is bundled with the latest version of KDE. It seems that some Semantic Web researchers believe the tool will prove a breakthrough for semantic technology. By encouraging people to add semantic meta-data to the information stored on their machines they hope it could succeed where other semantic tools have failed."

  • Horrible name. (Score:3, Insightful)

    by haeger (85819) on Tuesday December 16, 2008 @12:32PM (#26133627)

    NepoMUCK? Anything ending in "MUCK" doesn't sound like a good product. The concept is very interesting but the name isn't the best I've seen.

    I'm glad that they don't prefix everything with K though.

    Yes, I know that Nepomuk means "Networked Environment for Personalized, Ontology-based Management of Unified Knowledge" as stated in the article.

    .haeger

  • by Anonymous Coward on Tuesday December 16, 2008 @12:35PM (#26133665)

    "Semantics" is information about meaning (whereas syntax is information about form). Semantic tools try to provide meaning by describing relationships between information atoms. The goal is to create systems which can answer questions like "how old is the president's oldest child?" with just the age, instead of listing all documents which contain the words "old" "president" "oldest" and "child".

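    To make the idea concrete, here is a minimal sketch, assuming facts are stored as (subject, predicate, object) triples. The names and ages are made-up placeholders, and `objects` and `age_of_oldest_child` are hypothetical helpers, not part of Nepomuk or any real library.

```python
# Hypothetical facts as (subject, predicate, object) triples.
TRIPLES = [
    ("president", "has_child", "child_a"),
    ("president", "has_child", "child_b"),
    ("child_a", "age", 12),
    ("child_b", "age", 9),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def age_of_oldest_child(person, triples=TRIPLES):
    """Follow has_child, then age, and return the largest age found."""
    ages = [
        age
        for child in objects(person, "has_child", triples)
        for age in objects(child, "age", triples)
    ]
    return max(ages) if ages else None


print(age_of_oldest_child("president"))
```

    A keyword search would hand back documents to read; here the relationships are explicit, so the answer is just the age.
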
  • by thermian (1267986) on Tuesday December 16, 2008 @12:51PM (#26133849)

    It describes the ability to add metadata to web content (tags, etc), and you haven't heard of it because web 2.0 is the more popular term. ;)

    Personally I think that metadata/tag based systems are the wrong road for semantic analysis of web pages. As soon as the semantics of a thing is decided by additional information added to describe that thing, it's open to abuse.

    The only advantage is it's faster than what should be done, which is using good old maths to extract the true 'meaning' of a document or object.

    It's not hard. Well, ok, it's a little hard. Oh ok, it's really rather difficult, but there are plenty of places you can get example code or libraries to make things easier.

  • by Dynedain (141758) <slashdot2&anthonymclin,com> on Tuesday December 16, 2008 @12:58PM (#26133915) Homepage

    I've got a better reason why it failed that doesn't require delving into first year philosophy.

    People are lazy. Look at any image database and figure out why it's difficult to find something. Because people don't want to spend 20 minutes filling in tags for a single image they just want to show off to their friends.

    Now expand that to every other form of data type, and it's easy to see why the semantic web never did, and never will, take off without significant AI involvement.

  • by Angostura (703910) on Tuesday December 16, 2008 @01:08PM (#26134021)

    And I'll tell you why.

    The Nepomuk Web site wants to make me chew my own arm off. Now, I'm familiar with the Semantic Web, I'm excited by the idea of semantic organisation. But this site is the epitome of grim, lifeless European research-ese. It completely fails to convey the technological approach, how it works, or why you should give a damn. I get the impression that the team was more interested in the EC funding than in actually developing a disruptive technology.

    Why can't researchers spend 15 minutes thinking about how to convey the importance and excitement of what they are trying to do in terms of practical examples?

    I'm afraid you'll probably have to wait for some enterprising 3rd party to grab the source and build some of the technology into a different product.

  • Redundant (Score:3, Insightful)

    by bjourne (1034822) on Tuesday December 16, 2008 @01:15PM (#26134123) Homepage Journal
    All information is semantic. This slashdot post is information encoded using English semantics. Unfortunately for the machines, the English semantics are way too complicated for them to understand. So they need a simpler set of grammar rules to be able to parse it. But why would anyone want to waste time marking it up just for the benefit of machine readability when google basically can accomplish the same thing without all that metadata markup cruft?
  • Re:Horrible name. (Score:4, Insightful)

    by Znork (31774) on Tuesday December 16, 2008 @01:40PM (#26134505)

    Yep, that was my first thought as well. Quickly followed by wondering if 'into a collaboration environment which supports both the personal information management and the sharing and exchange across social and organizational relations' was some kind of euphemism for, eh, group pr0n of some kind.

    Oh, well, either they have much less dirty minds than mine, or someone's desire for well-indexed pr0n browsing has gotten slightly out of hand.

  • Re:Um, no thanks (Score:3, Insightful)

    by Wonko the Sane (25252) * on Tuesday December 16, 2008 @01:53PM (#26134741) Journal

    Whoosh

    Didn't you get the memo [slashdot.org]?
    That should read:

    "You may have Frontotemporal Dementia [wikipedia.org]. Please see your physician."

  • by flyingfsck (986395) on Tuesday December 16, 2008 @02:30PM (#26135319)
    Everybody and his uncle tries to make systems that will index every piece of crap on your PC and it invariably results in a useless and horrible waste of resources. The biggest annoyance is trying to figure out how to turn these damn things off. Considering that the average user only searches for something once in several years, an on-demand search system makes far more sense.
  • by raddan (519638) on Tuesday December 16, 2008 @03:21PM (#26136025)
    There's actually a pretty good introduction to the semantic web in this month's Communications of the ACM [acm.org]. You're right when you say that the semantic web is, as yet, mostly unrealized. But it has huge potential.

    Relational databases were in the same position in the late 60's/early 70's. We needed ways to combine and extract information automatically with a simple and expressive language. Relational database management systems, combined with SQL, were the result of that, and they were a smashing success. They are now a standard business tool. The key to that success is essentially the role that the database's ontology plays in an RDBMS.

    Having spent a lot of time professionally and academically working with and studying database technologies, I can say that most of the work is in understanding your data. Specifically, building a data model. A well-built data model is essentially an ontology. There are various techniques used to make sure that your data can be handled automatically, mainly by normalization. This requires a tremendous amount of work on the part of the database designer, but the end result is that the end-user can query this data in fairly simple terms and get an enormous richness of data, sometimes in ways that even the database designer did not foresee. I think the success of database systems is what is driving a lot of the work in building the semantic web.

    So you can see: the big problem with the web is not just that data is unstructured, but that there are no standardized ontologies out there. RDF is an attempt to solve some of these problems simply, because you can embed your ontology, but it may be well off. On the other hand, if new tools make structuring data very easy or natural, people may be motivated to do the extra work because they'll personally benefit from it. For example, many people annotate or organize their photo collections naturally, so that they can share them with others. A smart photo gallery software writer may be able to come along and take advantage of that behavior to further enhance the meaning of that data.
  • by hey! (33014) on Tuesday December 16, 2008 @03:29PM (#26136145) Homepage Journal

    Actually, I'd say it's too early to say that the Semantic Web has failed. What has clearly failed for now is the vision for how the technology was to be used.

    For one thing, it turned out that really, really clever textual matching is a lot more powerful than anybody thought possible. Twenty years or so ago, you'd have thought that you'd need to have some kind of sophisticated metadata to do the kinds of stuff we take for granted in Google today. It turns out that a technology that turns a needle in a haystack into a box of needles with some straw mixed in is pretty darned useful. Human intelligence picks the needle of meaning from the straw of superficial matches pretty effectively.

    But what about non-human intelligence?

    Well, here is another failure of the vision. Clearly, a semantic web is much more friendly to non-human agents. However, the whole agent philosophy of software design is extremely failure prone. A project which makes a resource easier to use for people is a safer bet than one which tries to replace human reason.

    That said, you have the wrong end of the stick, philosophically. It is because meaning is not an attribute of data that we need semantic technology. It might be less contentious and pretentious if we simply call it "metadata".

    If I want to find the rate of a certain disease in each county, the numerator is quite easy: I count all the instances of the disease. But the denominator turns out to be tricky, because of what I call the curious case of the dog barking in the night: some counties don't report any cases because they don't have any, others lack the technical capability to detect it.

    Consider a county that can't detect the disease. I ought to exclude that county from the denominator in my rate calculations. On the other hand, a county which can detect ought to be included in the denominator, even if it reports no cases. However, since it found no cases, what we usually have is an absence of data which looks identical to the absence in counties that aren't capable.

    You have to have the metadata to tell these cases apart. You have to have a model saying such and such a lab protocol is capable of detecting such and so set of infectious agents, and then you need metadata linking each data set to the appropriate model. You can do it by hand, manually discarding the data for counties you know you can't use, but this is really quite awkward when you consider that the situation can change from year to year, or even within a year.

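    The denominator problem above can be sketched in a few lines. This is only an illustration under stated assumptions: the counties, populations, case counts and protocol names are invented, and `CAPABLE_PROTOCOLS` stands in for the "model" of which lab protocols can detect the disease.

```python
# Hypothetical model: lab protocols assumed able to detect the disease.
CAPABLE_PROTOCOLS = {"protocol_x"}

# Hypothetical data sets, each linked to its protocol by metadata.
counties = [
    {"name": "A", "population": 1000, "cases": 5, "protocol": "protocol_x"},
    {"name": "B", "population": 2000, "cases": 0, "protocol": "protocol_x"},
    {"name": "C", "population": 3000, "cases": 0, "protocol": "protocol_y"},
]


def rate(counties, capable=CAPABLE_PROTOCOLS):
    """Cases per person, counting only counties that could detect the disease."""
    usable = [c for c in counties if c["protocol"] in capable]
    population = sum(c["population"] for c in usable)
    cases = sum(c["cases"] for c in usable)
    return cases / population if population else None


def naive_rate(counties):
    """Treats every zero as a true zero, which inflates the denominator."""
    return sum(c["cases"] for c in counties) / sum(c["population"] for c in counties)


print(rate(counties), naive_rate(counties))
```

    Counties B and C both report zero cases, but only the protocol metadata says that B's zero is real. Without it the two are indistinguishable and the naive rate comes out too low.
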
    The model aspect presents a considerable can of worms. For any purpose, you want enough model, but no more than that. This is akin to the situation of novice designers who set out to create object frameworks before they have defined the software application. For us to share data we have to have some common model of things (although our terminology may differ). On the other hand it is certain our models disagree with each other; we want enough shared model to work together without forcing our entire model on each other, which is impractical.

    The point is that you can't guess all the kinds of uses that future users as yet unknown might want to put data to, what kind of meaning they might extract from it. That's why search engine technology works so well: you put your stuff on the web and it gets spidered by Google: no guesswork needed. The Semantic Web, on the other hand, requires anticipating how the data will be used, which limits its usefulness. The "limits" here are, however, ones of scope; the Semantic Web can't do everything, it certainly can't take the place of Google. Within the scope of its potential applications, it could be very useful indeed.

  • by eihab (823648) on Tuesday December 16, 2008 @09:19PM (#26140431)

    If web content is readable and meaningful to me then it already has inherent meaning. Semantic tagging duplicates effort.

    Semantic web is also about accessibility. Take a blind person, for example, surfing the web using a screen reader. Do you have any idea how horrible his/her browsing experience is on the web today?

    [Robotic voice]
    Document Title - Slashdot | Nepomuk Brings Semantic ..
    Document Body
    Stories - Anchor link
    Slash boxes - Anchor link
    Comments - Anchor link
    Search
    Form field - Text - Query
    Submit button - Search
    News for nerds, stuff that matters
    Hello eihab! - Link
    Help & Preferences - Link ....

    Click on a different page, and there you go listening to the same headers _again_. It can get very frustrating.

    Without semantics there's no easy way for a screen reader (or other accessibility enabling devices) to successfully translate a document to something intelligible and usable.

    You can see how a photograph (or even worse, an image with text that conveys something) can be completely hidden from a blind user without proper metadata that describes it. Or how a mildly complicated table would be read completely out of order if the reader couldn't distinguish between header rows and content rows, etc. (That's why designing using tables is a horrible idea).

    The example I gave above is solved *cough*hacked*cough* today by adding two anchor links at the top of the page (skip to contents and skip to navigation) then hiding these links from regular browsers using CSS (Note: It happens to also be a valid [not hack] solution to the problem of scrolling past long navigation links on mobile devices).

    I think you can see how a reader could easily identify which parts of the document are important and what should be skipped over or highlighted had it been served a semantic and valid [x]HTML document.

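    As a rough illustration of the table point, here is a minimal sketch of how a reader could use the header/content distinction once the markup provides it. It assumes a simple, well-formed table with one header row; `TableReader`, `speak` and the sample table are hypothetical, and real screen readers are far more involved than this.

```python
from html.parser import HTMLParser

# Made-up sample table with semantic <th> header cells.
SAMPLE = """
<table>
  <tr><th>County</th><th>Cases</th></tr>
  <tr><td>A</td><td>3</td></tr>
  <tr><td>B</td><td>0</td></tr>
</table>
"""


class TableReader(HTMLParser):
    """Collects header cells (<th>) and data cells (<td>) separately."""

    def __init__(self):
        super().__init__()
        self.headers = []
        self.rows = []
        self._cell = None  # tag of the cell currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td"):
            self._cell = tag
            self._text = []

    def handle_data(self, data):
        if self._cell:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell == tag:
            text = "".join(self._text).strip()
            if tag == "th":
                self.headers.append(text)
            else:
                self.rows[-1].append(text)
            self._cell = None


def speak(html):
    """Return one line per data row, pairing each value with its header."""
    reader = TableReader()
    reader.feed(html)
    return [
        ", ".join(f"{h}: {v}" for h, v in zip(reader.headers, row))
        for row in reader.rows
        if row  # the header row contributes no data cells
    ]


for line in speak(SAMPLE):
    print(line)
```

    If every cell were a plain `<td>`, nothing in the markup would tell the reader which row names the columns, and it could only recite cells in source order.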