Nepomuk Brings Semantic Web To the Desktop, Instead
An anonymous reader writes "Technology Review has a story looking at Nepomuk — the semantic tool that is bundled with the latest version of KDE. It seems that some Semantic Web researchers believe the tool will prove a breakthrough for semantic technology. By encouraging people to add semantic meta-data to the information stored on their machines they hope it could succeed where other semantic tools have failed."
Horrible name. (Score:3, Insightful)
NepoMUCK? Anything ending in "MUCK" doesn't sound like a good product. The concept is very interesting but the name isn't the best I've seen.
I'm glad that they don't prefix everything with K though.
Yes, I know that Nepomuk means "Networked Environment for Personalized, Ontology-based Management of Unified Knowledge" as stated in the article.
Re:Care to explain? (Score:2, Insightful)
"Semantics" is information about meaning (whereas syntax is information about form). Semantic tools try to provide meaning by describing relationships between information atoms. The goal is to create systems which can answer questions like "how old is the president's oldest child?" with just the age, instead of listing all documents which contain the words "old" "president" "oldest" and "child".
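The idea above can be sketched with a toy fact store. This is a minimal illustration with invented data (the names `facts` and `query` are hypothetical, not any real semantic-web API): facts stored as subject-predicate-object triples can answer the question directly, where keyword search could only return documents.

```python
# Toy fact store: subject-predicate-object triples (invented data).
facts = [
    ("alice", "holds_office", "president"),
    ("alice", "has_child", "bob"),
    ("alice", "has_child", "carol"),
    ("bob", "age", 12),
    ("carol", "age", 9),
]

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern (None acts as a wildcard)."""
    return [(s, p, o) for (s, p, o) in facts
            if subject in (None, s)
            and predicate in (None, p)
            and obj in (None, o)]

# "How old is the president's oldest child?" -- answered with a number,
# not a pile of documents containing the words "president" and "child".
president = query(predicate="holds_office", obj="president")[0][0]
children = [o for (_, _, o) in query(subject=president, predicate="has_child")]
ages = [query(subject=c, predicate="age")[0][2] for c in children]
answer = max(ages)
print(answer)  # 12
```

Real systems use RDF triples and a query language like SPARQL for this, but the shape of the data and the query is the same.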
Re:Care to explain? (Score:3, Insightful)
It describes the ability to add metadata to web content (tags, etc), and you haven't heard of it because web 2.0 is the more popular term. ;)
Personally, I think that metadata/tag-based systems are the wrong road for semantic analysis of web pages. As soon as the semantics of a thing are decided by additional information added to describe that thing, it's open to abuse.
The only advantage is that it's faster than what should be done, which is using good old maths to extract the true 'meaning' of a document or object.
It's not hard. Well, ok, it's a little hard. Oh ok, it's really rather difficult, but there are plenty of places you can get example code or libraries to make things easier.
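For a taste of the "good old maths" approach, here is a deliberately simple sketch (invented documents and names): bag-of-words vectors compared by cosine similarity, the most basic of the statistical techniques the parent alludes to. Real systems add TF-IDF weighting, stemming, and far more.

```python
# Minimal statistical relevance sketch: count words, compare vectors.
import math
from collections import Counter

def vectorize(text):
    """Turn text into a bag-of-words vector (word -> count)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Invented corpus: no tags or metadata, just the documents' own words.
docs = {
    "kde": "nepomuk adds semantic metadata to the kde desktop",
    "cooking": "add salt to the pan and stir",
}
query = vectorize("semantic desktop metadata")
scores = {name: cosine(query, vectorize(text)) for name, text in docs.items()}
best = max(scores, key=scores.get)
print(best)  # "kde" scores highest; "cooking" shares no words with the query
```

No one had to tag either document; the maths ranks them from their content alone.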
Re:Care to explain? (Score:5, Insightful)
I've got a better reason why it failed that doesn't require delving into first year philosophy.
People are lazy. Look at any image database and figure out why it's difficult to find anything: because people don't want to spend 20 minutes filling in tags for a single image they just want to show off to their friends.
Now expand that to every other data type, and it's easy to see why the semantic web never did, and never will, take off without significant AI involvement.
I doubt it will catch on... (Score:3, Insightful)
And I'll tell you why.
The Nepomuk Web site makes me want to chew my own arm off. Now, I'm familiar with the Semantic Web, and I'm excited by the idea of semantic organisation. But this site is the epitome of grim, lifeless European research-ese. It completely fails to convey the technological approach, how it works, or why you should give a damn. I get the impression that the team was more interested in the EC funding than in actually developing a disruptive technology.
Why, oh why, can't researchers spend 15 minutes thinking about how to convey the importance and excitement of what they are trying to do in terms of practical examples?
I'm afraid you'll probably have to wait for some enterprising 3rd party to grab the source and build some of the technology into a different product.
Redundant (Score:3, Insightful)
Re:Horrible name. (Score:4, Insightful)
Yep, that was my first thought as well. Quickly followed by wondering if 'into a collaboration environment which supports both the personal information management and the sharing and exchange across social and organizational relations' was some kind of euphemism for, eh, group pr0n of some kind.
Oh, well, either they have much less dirty minds than mine, or someone's desire for well-indexed pr0n browsing has gotten slightly out of hand.
Re:Um, no thanks (Score:3, Insightful)
Didn't you get the memo [slashdot.org]?
That should read:
This indexing fad should curl up and die (Score:2, Insightful)
Semantic Web Article in CACM (Score:3, Insightful)
Relational databases were in the same position in the late 60's/early 70's. We needed ways to combine and extract information automatically with a simple and expressive language. Relational database management systems, combined with SQL were the result of that, and they were a smashing success. They are now a standard business tool. The key to that success is essentially the role that the database's ontology plays in an RDBMS.
Having spent a lot of time professionally and academically working with and studying database technologies, I can say that most of the work is in understanding your data: specifically, building a data model. A well-built data model is essentially an ontology. There are various techniques, chiefly normalization, for making sure that your data can be handled automatically. This requires a tremendous amount of work on the part of the database designer, but the end result is that the end-user can query this data in fairly simple terms and get an enormous richness of information, sometimes in ways that even the database designer did not foresee. I think the success of database systems is what is driving a lot of the work in building the semantic web.
So you can see: the big problem with the web is not just that the data is unstructured, but that there are no standardized ontologies out there. RDF is an attempt to solve some of these problems simply, because you can embed your ontology, but it may be a long way off. On the other hand, if new tools make structuring data very easy or natural, people may be motivated to do the extra work because they'll personally benefit from it. For example, many people annotate or organize their photo collections naturally, so that they can share them with others. A smart photo gallery software writer may be able to come along and take advantage of that behavior to further enhance the meaning of that data.
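The photo-tagging example can be made concrete with a tiny normalized schema. This is a hypothetical sketch using SQLite (table and column names invented for illustration): the schema acts as the ontology, and SQL then answers a question the designer never had to anticipate.

```python
# Hypothetical normalized schema: people, photos, and a tag table linking them.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE photo  (id INTEGER PRIMARY KEY, filename TEXT);
CREATE TABLE tag    (photo_id  INTEGER REFERENCES photo(id),
                     person_id INTEGER REFERENCES person(id));
""")
cur.executemany("INSERT INTO person VALUES (?, ?)",
                [(1, "Ana"), (2, "Ben")])
cur.executemany("INSERT INTO photo VALUES (?, ?)",
                [(1, "beach.jpg"), (2, "party.jpg")])
cur.executemany("INSERT INTO tag VALUES (?, ?)",
                [(1, 1), (2, 1), (2, 2)])  # Ana in both photos, Ben in one

# A question the schema designer never foresaw: which photos show Ana?
cur.execute("""
SELECT photo.filename FROM photo
JOIN tag    ON tag.photo_id  = photo.id
JOIN person ON person.id     = tag.person_id
WHERE person.name = 'Ana'
ORDER BY photo.filename
""")
results = [row[0] for row in cur.fetchall()]
print(results)  # ['beach.jpg', 'party.jpg']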
Re:Care to explain? (Score:3, Insightful)
Actually, I'd say it's too early to say that the Semantic Web has failed. What has clearly failed for now is the vision for how the technology was to be used.
For one thing, it turned out that really, really clever textual matching is a lot more powerful than anybody thought possible. Twenty years or so ago, you'd have assumed that you'd need some kind of sophisticated metadata to do the kinds of things we take for granted in Google today. It turns out that a technology that turns a needle in a haystack into a box of needles with some straw mixed in is pretty darned useful. Human intelligence picks the needle of meaning from the straw of superficial matches pretty effectively.
But what about non-human intelligence?
Well, here is another failure of the vision. Clearly, a semantic web is much more friendly to non-human agents. However, the whole agent philosophy of software design is extremely failure prone. A project which makes a resource easier to use for people is a safer bet than one which tries to replace human reason.
That said, you have the wrong end of the stick, philosophically. It is precisely because meaning is not an attribute of the data itself that we need semantic technology. It might be less contentious, and less pretentious, if we simply called it "metadata".
If I want to find the rate of a certain disease in each county, the numerator is quite easy: I count all the reported instances of the disease. But the denominator turns out to be tricky, because of what I call the curious case of the dog that didn't bark in the night: some counties report no cases because they don't have any, while others simply lack the technical capability to detect the disease.
Consider a county that can't detect the disease. I ought to exclude that county from the denominator in my rate calculations. On the other hand, a county which can detect the disease ought to be included in the denominator, even if it reports no cases. However, since it found no cases, what we usually have is an absence of data which looks identical to the absence in counties that aren't capable.
You have to have metadata to tell these cases apart. You have to have a model saying that such-and-such a lab protocol is capable of detecting such-and-such a set of infectious agents, and then you need metadata linking each data set to the appropriate model. You can do it by hand, manually discarding the data for counties you know you can't use, but this is really quite awkward when you consider that the situation can change from year to year, or even within a year.
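A sketch of the distinction (county names and case counts invented for illustration): the capability flag is the metadata, and without it the two kinds of "no cases reported" are indistinguishable.

```python
# Metadata ("can_detect") distinguishes "found zero cases" from "cannot test".
counties = {
    "Adams":  {"cases": 4,    "can_detect": True},
    "Butler": {"cases": 0,    "can_detect": True},   # genuinely zero cases
    "Clarke": {"cases": None, "can_detect": False},  # no lab capability
}

# Only capability-verified counties belong in the denominator.
capable = {name: c for name, c in counties.items() if c["can_detect"]}
numerator = sum(c["cases"] for c in capable.values())
denominator = len(capable)  # Butler counts; Clarke is correctly excluded
rate = numerator / denominator
print(rate)  # 2.0 cases per reporting-capable county
```

Drop the `can_detect` flag and Butler and Clarke look identical: both report nothing, and the rate silently goes wrong.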
The model aspect presents a considerable can of worms. For any purpose, you want enough model, but no more than that. This is akin to the situation of novice designers who set out to create object frameworks before they have defined the software application. For us to share data, we have to have some common model of things (although our terminology may differ). On the other hand, it is certain that our models will disagree with each other in places; we want enough shared model to work together without forcing our entire model on each other, which is impractical.
The point is that you can't guess all the kinds of uses that future users, as yet unknown, might want to put data to, or what kind of meaning they might extract from it. That's why search engine technology works so well: you put your stuff on the web and it gets spidered by Google; no guesswork needed. The Semantic Web, on the other hand, requires anticipating how the data will be used, which limits its usefulness. The "limits" here are, however, ones of scope; the Semantic Web can't do everything, and it certainly can't take the place of Google. But within the scope of its potential applications, it could be very useful indeed.
Re:Care to explain? (Score:3, Insightful)
If web content is readable and meaningful to me, then it already has inherent meaning. Semantic tagging duplicates effort.
The Semantic Web is also about accessibility. Take, for example, a blind person surfing the web with a screen reader: do you have any idea how horrible his/her browsing experience is on the web today?
[Robotic voice] .. ....
Document Title - Slashdot | Nepomuk Brings Semantic
Document Body
Stories - Anchor link
Slash boxes - Anchor link
Comments - Anchor link
Search
Form field - Text - Query
Submit button - Search
News for nerds, stuff that matters
Hello eihab! - Link
Help & Preferences - Link
Click on a different page, and there you go listening to the same headers _again_. It can get very frustrating.
Without semantics there's no easy way for a screen reader (or other accessibility enabling devices) to successfully translate a document to something intelligible and usable.
You can see how a photograph (or even worse, an image containing text that conveys something) can be completely hidden from a blind user without proper metadata describing it. Or how a mildly complicated table would be read completely out of order if the reader couldn't distinguish between header rows and content rows, etc. (That's why doing page layout with tables is a horrible idea.)
The example I gave above is solved *cough*hacked*cough* today by adding two anchor links at the top of the page (skip to contents and skip to navigation) then hiding these links from regular browsers using CSS (Note: It happens to also be a valid [not hack] solution to the problem of scrolling past long navigation links on mobile devices).
I think you can see how a reader could easily identify which parts of the document are important and what should be skipped over or highlighted had it been served a semantic and valid [x]HTML document.
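To make the table point concrete, here is a small sketch using only Python's standard-library HTML parser (the toy table and class name are invented): because the markup uses semantic `<th>` cells, the parser, standing in for a screen reader, can announce each data cell together with its column header. A layout table of bare `<td>`s would give it nothing to work with.

```python
# What semantic <th> markup buys an assistive tool: header/cell pairing.
from html.parser import HTMLParser

semantic_table = """
<table>
  <tr><th>Name</th><th>Age</th></tr>
  <tr><td>Ana</td><td>34</td></tr>
</table>
"""

class TableReader(HTMLParser):
    """Collects header cells (<th>) and data cells (<td>) separately."""
    def __init__(self):
        super().__init__()
        self.headers, self.cells, self.current = [], [], None
    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self.current = tag
    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self.current = None
    def handle_data(self, data):
        text = data.strip()
        if text and self.current is not None:
            (self.headers if self.current == "th" else self.cells).append(text)

reader = TableReader()
reader.feed(semantic_table)
# Announce each cell with its header, the way a screen reader can:
for header, cell in zip(reader.headers, reader.cells):
    print(f"{header}: {cell}")  # Name: Ana / Age: 34
```

Swap the `<th>` cells for `<td>` and `reader.headers` comes back empty: the content is unchanged, but the meaning the assistive tool needed is gone.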