The Internet

Why the Semantic Web Will Fail 179

Posted by kdawson
from the coopetition dept.
Jack Action writes "A researcher at Canada's National Research Council has a provocative post on his personal blog predicting that the Semantic Web will fail. The researcher notes the rising problems with Web 2.0 — MySpace blocking outside widgets, Yahoo ending Flickr identities, rumors Google will turn off its search API — and predicts these will also cripple Web 3.0." From the post: "The Semantic Web will never work because it depends on businesses working together, on them cooperating. There is no way they: (1) would agree on web standards (hah!) (2) would adopt a common vocabulary (you don't say) (3) would reliably expose their APIs so anyone could use them (as if)."

  • Far out! (Score:4, Funny)

    by neonmonk (467567) on Wednesday March 21, 2007 @07:40AM (#18426941)
    Thank God for Web4.1!
    • Re: (Score:3, Funny)

      by Threni (635302)
      > Thank God for Web4.1!

      Software development is getting worse. x.1 of anything is as bad as 1.0 used to be. You'd be advised to wait for Web4.2 or at the very least Web4.1 Service Pack 1.
      • Re: (Score:3, Funny)

        Except Google, where everything is in permanent Beta!
  • by kalidasa (577403) on Wednesday March 21, 2007 @07:41AM (#18426951) Journal
    One of the problems is lack of standardization, and one of the symptoms is Yahoo! normalizing Flickr's user accounts with its own?
    • by CastrTroy (595695)
      Lack of standards isn't always a bad thing. Sometimes it's easier to write your own protocol, then to write a standard protocol that could encompass all possible future uses. Working with things like EDI [wikipedia.org] which attempt to standardize everything can make things more difficult than just working out a method that works for exactly what you need it to do.
      • Making your own protocol is only good for a one-off job with a limited and predetermined set of users; once you have to exchange data with a lot of systems you need to have a standard. When writing software to send data between suppliers and supermarkets I could look at the Tradacoms standard and be sure what fields were mandatory, the type of data they held, the length, etc, and I didn't have to make a new version for each one.
      • I had to read your first sentence several times before I realized that you meant "than" and not "then". A little justification for the spelling Nazis, as this post was somewhat incoherent due to the spelling error. Surprisingly, the sentence was still grammatically correct with the spelling error, which was probably the reason it took me a moment to catch it.

    • Yeah, they've "ended Flickr identities" by making you log in with a non-Flickr Yahoo account instead of your non-Flickr email address, while keeping the same Flickr username, URL, and photos. I know I feel my identity was wrested away from me. Web2.0 is dead, and I'm going to get my revenge by moving all of my crappy photos to a broken site with like 5 users. That'll show 'em!
  • by jrumney (197329) on Wednesday March 21, 2007 @07:42AM (#18426959) Homepage

    The semantic web will fail because it is too complex and no one outside the academic community working on it really understands it. The ad-hoc tagging systems and microformats Web 2.0 has brought are good enough for most people, and much simpler for the casual web developer to understand.

    • by Jellybob (597204)
      Web standards are doing a lot to create a semantic web without people having to think about it. We're fast moving from "this is a big red piece of text" to "this is a heading" thanks to CSS allowing us to state that headings should be big and red.

      I doubt we're ever going to be in a position where every site is marked up with RDF metadata, but a lot of sites are now offering APIs that are good enough to do the job, sure we're unlikely to have a universal API that allows us to query any website on the interne
      • Re: (Score:2, Insightful)

        by crimperman (225941)

        We're fast moving from "this is a big red piece of text" to "this is a heading" thanks to CSS allowing us to state that headings should be big and red.

        Fast? CSS has allowed us to do that since 1996 [w3.org]! :o)

        I doubt we're ever going to be in a position where every site is marked up with RDF metadata, but a lot of sites are now offering APIs that are good enough to do the job, sure we're unlikely to have a universal API that allows us to query any website on the internet and extract the data we're looking for, but

      • by h2g2bob (948006)
        I wouldn't write off RDF completely. It's slowly creeping in (emphasis on slowly :-). But it is creeping in - the creative commons tags come with rdf stuff, which Google and others can pick up on if you do a "creative commons only" search.

        Really I don't understand what people are so worked up with about RDF. It's just the XML version of the meta tags you get at the top of html documents. Some things read those and make decisions based on them, just like RDF.
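
        The comparison with meta tags can be sketched roughly as follows. This is only an illustration: the page URL, the tag names, and the helper functions are hypothetical, and plain Python tuples stand in for real RDF, which uses URIs for predicates and has its own serializations.

```python
# Hypothetical page and metadata, for illustration only.
PAGE = "http://example.org/photo.html"

# What a page might declare in <meta name="..." content="..."> tags.
meta_tags = {
    "author": "Alice",
    "license": "http://creativecommons.org/licenses/by/2.0/",
}


def meta_to_triples(page, tags):
    """Turn name/content meta tags into (subject, predicate, object) triples."""
    return [(page, name, content) for name, content in sorted(tags.items())]


def is_cc_licensed(page, triples):
    """A 'creative commons only' style filter over the triples."""
    return any(
        s == page and p == "license" and o.startswith("http://creativecommons.org/")
        for s, p, o in triples
    )


triples = meta_to_triples(PAGE, meta_tags)
print(is_cc_licensed(PAGE, triples))
```

        The point of the triple form is that the statement names its subject explicitly, so the same kind of claim could also be published somewhere other than the page it describes.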
    • by tbriggs6 (816403) on Wednesday March 21, 2007 @08:38AM (#18427299) Homepage
      Have you ever read the original presentation of work by Codd on relational databases? How about the RFC standards on TCP/IP? How about the original presentation and arguments on the inclusion of Interrupts in a processor? Boy, those were so easy to understand and obvious that they were even published at all. The process of science is to push the state of the art, which by definition is new and novel. This is the job of the computer science researcher. It is left to others to examine the research and reformulate it in terms that mere mortals can understand. If you understand the concepts behind the OSI layers, Lambda expressions, or symmetric multi-processing, thank a computer science educator who abstracted and distilled the hell out of the science and research and packaged it in such a way that you can understand it and maybe even use it. To claim that failure is imminent because the current presentation of the Semantic Web is too complex is nonsense.
      • by Niten (201835) on Wednesday March 21, 2007 @09:36AM (#18427861)

        You're missing the point. It's not that the current "presentation" of the Semantic Web is too complex; the problem is that actually creating the Semantic Web is too complex a task for most Web content creators to be interested in.

        Essentially, the Semantic Web asks users to explicitly state relations between concepts and ideas to make up for our current lack of an AI capable of discerning such things for itself from natural human language. But let's face it, the average Joe writing his weblog or LiveJournal entries - or even a more technical user such as myself - would generally not be interested in performing this time-consuming task, even with the aid of a fancy WordPress plugin or other automated process. This is what the parent meant by saying it's just "too complicated".

        The way to realize the Semantic Web is to advance AI technology to the point where it becomes an automated process. Anything less would require too much manual labor to take off.

        • No, the problem is capitalism, and the incentive it creates to screw people over for profit rather than co-operate for the common good.

          Technology isn't going to overcome that problem. We need a new economic system.
          • And the common good does what in this case?

            If search engines start using semantic tags to prioritize their results then people will start semantically tagging their data. Time is a finite resource and people need incentives to spend their time tagging their site semantically. This has little or nothing to do with a particular economic system.
        • by bbtom (581232)
          That sounds just like a big user experience problem. There is nothing technically wrong with the data model, it just needs to have people design decent, usable interfaces and make the experience of using the system easier. That, combined with motivation - putting incentives in front of people to make data available. Sounds enough to keep the Silicon Valley type startup culture rolling for a few more years... :)
        • Essentially, the Semantic Web asks users to explicitly state relations between concepts and ideas to make up for our current lack of an AI capable of discerning such things for itself from natural human language.

          Well, no, it doesn't.

          While that may be a practical necessity with most existing tools (just as in the earliest days of the web, end-users hand-coding HTML was a practical necessity), what the Semantic Web is about is standardized ways of exchanging such descriptions, just as what the plain-old-web

        • by CodeBuster (516420) on Wednesday March 21, 2007 @01:41PM (#18431413)
          Essentially, the Semantic Web asks users to explicitly state relations between concepts and ideas to make up for our current lack of an AI capable of discerning such things for itself from natural human language.

          The problem here is trust. All of the previous features of the web, whether it is javascript or metadata or something else, have invariably been abused by those seeking to game the system for profit. The semantic web is asking the marketplace to state relations in an unbiased fashion when there are powerful economic incentives to do otherwise (i.e. everything on the semantic web will end up being related to pron whether it actually is or not). Indeed there are entire businesses devoted to "optimizing" search engine results, targeting ads, spamming people to death, and other abuses. The problem was that the people that designed and built the initial web protocols and technologies did not account for the use of their network by the general public and thus did not take steps to technologically limit abuses (their network of distinguished academic colleagues was always collegial after all so there would be no widespread abuses). The semantic web will fail precisely because human nature is deceptive, not because the technology is somehow lacking.

          In fact, this whole discussion is reminiscent of the conversation that Neo has with the Architect in The Matrix Reloaded. The Architect, as you may recall, explains why a system (the Matrix), which was originally designed to be a harmony of mathematical precision, ultimately failed to function, in that form, because the imperfections and flaws inherent in humanity continuously undermined its ability to function as it was intended. The same general principle is at work with the Semantic Web, the perfect system could work in a perfect world, but not in our world because humans are not perfect.
          • Re: (Score:3, Insightful)

            by DragonWriter (970822)

            The problem here is trust.

            I agree on this much.

            The semantic web is asking the marketplace to state relations in an unbiased fashion

            I don't think it really is. It's certainly asking people to make claims about resources, but those claims themselves are resources that may be the subject of metadata making claims about those claims. How people (or automated systems) treat particular claims on the Semantic Web can certainly depend on claims made about those claims by particular other sources of metadata. Trust

        • I think the big issue right now is that the computer industry doesn't even know what they mean by "Web 2.0" and the marketing departments hide their ignorance admirably by repeating buzzwords until people think they understand concepts they don't. Ok, Tim O'Reilly is careful to define such terms when he uses them (good for him!) but few others seem to do the same.

          At least with things like TCP/IP, relational database theory, information theory, and the like, the concepts are well defined, not some mishmash
    • by KjetilK (186133)
      They are two quite different things. Microformats and tagging are for making data available and simple one-data-source applications. And they are very useful for that. The Semantic Web is a consistent data model and more elaborate data access methods for larger things that involve multiple data sources.

      Also, GRDDL [w3.org] has just made microformats a part of the Semantic Web, and I have just created a system to marry taxonomies and folksonomies [opera.com], (i.e. big controlled vocabularies and tags).

      There is no conflict her

  • Web services (Score:4, Insightful)

    by Knutsi (959723) on Wednesday March 21, 2007 @07:48AM (#18426983)
    Doesn't Web 2.0 reach a "critical mass" at some point, where businesses will no longer be able to not cooperate? Of course, it all gets very fragile even then...
    • Exactly. There's a hidden assumption in the question that the Web is now and will continue to be run by businesses. Anyone who's been around long enough knows that most of the trends seen on the Web today were set forth years before any businesses started showing up. The businesses started following the trends then and they will continue to follow the trends set in motion by the pioneers of the Web, as long as they continue to reach critical mass.

    • by Jessta (666101)
      Everything that is old is new again.
      We already solved the interprocess communication issue.
      But now that our processes are being run on many different machines, by many different companies, all of which don't conform to any kind of standard, and the user has no control, we need to solve the issue again.

      It's going to be fun to see the mess.
    • Only in the case where service users are also service providers. If websites out there all use Google's API and Google finds that they are losing money by losing direct traffic, they will truncate their API or drop the service.

      No for-profit business is in the business of providing services for free. What they will do is give you a free lunch in exchange for picking up the dinner bill.
    • by quanticle (843097)
      >>Doesn't Web 2.0 reach a "critical mass" at some point, where businesses will no longer be able to not cooperate?<<

      That could happen. However, in order to reach "critical mass" you need to have some number of businesses cooperating (and profiting from the cooperation) so that other businesses see the advantages of following open standards.

      On the other hand, if "pioneer" companies like Google, Yahoo, and others fail to cooperate, then the odds of Web2.0 reaching critical mass are much reduced.
  • by ilovegeorgebush (923173) on Wednesday March 21, 2007 @07:48AM (#18426985) Homepage
    ...says the guy who's blogging this opinion...
    • Re: (Score:3, Interesting)

      by linvir (970218)

      http://en.wikipedia.org/wiki/Semantic_Web [wikipedia.org]

      http://www.w3.org/DesignIssues/Semantic.html [w3.org]

      http://infomesh.net/2001/swintro/ [infomesh.net]

      Nothing on any of those pages indicated that blogging is an inherent part of the "semantic web". As best as I can tell, the semantic web people want there to be some kind of SQL language for websites, so you can type "SELECT `images` FROM `websites` WHERE `porn` > 0 AND `price` = 0 AND `subject` = 'shitting dick nipples'" instead of going to Google or something.

      I guess it'd be n

      • I thought the "blogging" comment was supposed to be a jab at the idea that some random blog post has merit as a newsworthy article. Most blogs are basically opinions, conjecture and ramblings of a person that is not likely to be an expert on the question at hand.

        I never really thought of blogs inherently being semantic web, it doesn't have to be, and that didn't occur to me when I read the grandparent post.
  • by Anonymous Coward
    The researcher is just annoyed because no one sent him invites to Gmail.
  • by patio11 (857072) on Wednesday March 21, 2007 @07:51AM (#18426999)
    It was created to solve a problem we had when everyone was using Hotbot and Altavista, but people are trying to introduce it into a world where everyone is using Google. (And Wikipedia. And all that Web 2.0 junk.)

    I don't need you to mark "This page is a REVIEW of a CELL PHONE that has the NAME iPhone" anymore. All I need to do is Google "iPhone review" or hop on over to Amazon. Problem pretty freaking solved from my perspective.
    • by Scarblac (122480) <slashdot@gerlich.nl> on Wednesday March 21, 2007 @07:57AM (#18427041) Homepage

      Just not true. For one thing, Google's results are much too noisy. For another, it relies on keywords occurring on pages, and that's rather primitive (it's not always trivial to find good keywords, and even then you might miss the one page you were looking for because they used a synonym or misspelled it).

      But the most important reason is that it would be much cooler to have a web where you could say "give me a list of all the goals scored by Romario" and have it list them for me. I don't care about pages, I want information, answers to questions. That's what the Semantic Web is supposed to be a first mini step for.
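
      A minimal sketch of what "give me a list" could look like once facts are stated as data rather than buried in pages. Everything here is assumed for illustration: the goal and match identifiers, the `scored_by` and `in_match` predicate names, and the `match` helper. It is a toy pattern match over tuples, not any real Semantic Web query language.

```python
# Hypothetical facts; identifiers and predicate names are made up.
TRIPLES = [
    ("goal:1", "scored_by", "Romario"),
    ("goal:1", "in_match", "match:A"),
    ("goal:2", "scored_by", "Romario"),
    ("goal:2", "in_match", "match:B"),
    ("goal:3", "scored_by", "Ronaldo"),
    ("goal:3", "in_match", "match:B"),
]


def match(pattern, triples):
    """Yield triples matching a 3-item pattern; None acts as a wildcard."""
    for triple in triples:
        if all(p is None or p == v for p, v in zip(pattern, triple)):
            yield triple


def goals_scored_by(player, triples):
    """Return the identifiers of all goals attributed to the given player."""
    return sorted(s for s, _, _ in match((None, "scored_by", player), triples))


print(goals_scored_by("Romario", TRIPLES))
```

      The answer is only as good as the stated facts: which goals count, and who asserted them, is decided by whoever publishes the data, not by the query.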

      • by Wah (30840) on Wednesday March 21, 2007 @09:17AM (#18427661) Homepage Journal
        For one thing, Google's results are much too noisy. For another, it relies on keywords occurring on pages, and that's rather primitive

        No it doesn't. The genius of google was that it relies on people linking to pages talking about keywords. And uses various tools to identify and promote good linkers.

        But the most important reason is that it would be much cooler to have a web where you could say "give me a list of all the goals scored by Romario" and have it list them for me.

        That's a curious thing to ask for, since the first google result is a story about how there is a good bit of controversy surrounding Romario's "1,000" goals. The problem is your request is too vague and doesn't define all the words within itself (i.e. does a goal scored as a teenager in a different league count?).

        This goal is quite a bit higher than many realize, as you could get 10 people (5 of them experts) in a room and they wouldn't necessarily be able to agree on the "right" answer.

        To ask, or even demand, that computers do the same task as a background function is ludicrous, IMHO (at least when applied to a universal context).
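
        The link-based ranking mentioned above can be illustrated with the simplified textbook form of PageRank. This is a sketch under stated assumptions, not Google's actual ranking: the three-page link graph, the damping factor of 0.85, and the fixed iteration count are all chosen here for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share.
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # A page splits its damped rank among the pages it links to.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for t in pages:
                    new[t] += damping * rank[p] / n
        rank = new
    return rank


# Hypothetical link graph: "a" and "b" both link to "c"; "c" links back to "a".
LINKS = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(LINKS)
print(max(ranks, key=ranks.get))
```

        In this toy graph the page with the most incoming links ends up ranked highest, and a link from a well-ranked page is worth more than a link from an obscure one.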
        • by MarkGriz (520778)
          >> For one thing, Google's results are much too noisy. For another, it relies on keywords occurring on pages, and that's rather primitive

          > No it doesn't. The genius of google was that it relies on people linking to pages talking about keywords

          Being a gigantic web circle jerk doesn't make the results any less noisy.
          It just makes popular stuff more "popular". If you aren't searching for the
          latest craze all the cool kids are talking about, you can spend a fair amount of time
          crafting keywords to filte
          • Re: (Score:3, Interesting)

            by Jesus_666 (702802)
            How is the Semantic Web supposed to mitigate those facts? As far as I know it still relies on the site telling the world what it is about - and just like I can put "horny schoolgirls viagra playstation ponies" in an invisible <div> I can surely publish an RDF document stating that my website is about sex, naval warfare and Segways. We don't get less junk, we just get machine-readable junk.

            Also, false advertisement aside, when requesting a listing of everything pertaining to, say, "Alice Cooper", how
            • how do you deal with the thirty million hits for websites that offer Alice Cooper lyrics? Of course you can construct complex queries, but that's also possible with Google.

              The difference is in the interaction initiative. With Google you have to construct a complex query from scratch, by combining logical operators (AND, OR) and filters (+, -), and you have to decide which keywords to include.

              A semantic search, on the other hand, will suggest relevant terms related to the main item - this allows to refine th
          • by AuMatar (183847)
            I think you're full of shit. I have never had a search in google not find exactly what I want in the first 5 results. You're either lieing to jump on the anti-google bandwagon, or suck at creating querries.
            • by MarkGriz (520778)
              Just because your google search results are always spot on, doesn't mean I'm wrong, lying, full of shit, or suck at creating queries.
              In fact, I've gotten pretty damn good at creating queries to filter out all kinds of crap (most notably ebay and other shopping sites).

              Don't get me wrong.... I'm not anti google at all. I use Google dozens of times a day and most often it finds what I'm looking for.
              However, for something more esoteric, you might have to retry your query many times over, changing a word here o
            • by Fred_A (10934)
              If you spell your queries right, you do get millions of hits. Your approach is original though.
        • by Thuktun (221615)

          But the most important reason is that it would be much cooler to have a web where you could say "give me a list of all the goals scored by Romario" and have it list them for me.

          That's a curious thing to ask for, since the first google result is a story about how there is a good bit of controversy surrounding Romario's "1,000" goals. The problem is your request is too vague and doesn't define all the words within itself (i.e. does a goal scored as a teenager in a different league count?).

          That's a matter for the questioner to deal with. Humans can sometimes figure out what you /really/ mean when you ask an ambiguous question like that, but the questioner getting the proper result really relies on him/her having the clarity of thought to ask the right questions.

      • Re: (Score:2, Interesting)

        by asninn (1071320)
        Google is actually somewhat fault-tolerant when it comes to spellings - it doesn't just offer suggestions along the lines of "did you mean FOOBAR" when it thinks you mistyped something, it also includes spelling variants in your search results by default. I can't come up with a *good* example right now, but for a bad one, try to search for "head set" (sans quotes) and observe that you also get hits for the word "headset".

        I do agree about noise, but only to the extent that the spam sites and the like you get
      • by snowwrestler (896305) on Wednesday March 21, 2007 @10:01AM (#18428201)
        But there are three ways to get that.

        1) A search service that indexes all of Romario's goals.
        2) A manually built asset that aggregates all of Romario's goals.
        3) A standard system of semantic tags that self-identifies all Romario goal assets.

        #1 is Google. As you point out now it relies primarily on keywords but you oversell the problem in two ways. First of all most video hosting sites already provide author and/or community tagging--thus providing a way for keywords to be assigned. Second, you're comparing a future semantic Web against the Google of today.

        #2 can be provided by commercial video companies now ("1,000 Great Man U Goals," etc). It's also possible that a fan site could do the manual labor to find, upload, and keyword the videos.

        #3 is the "semantic Web" approach, wherein all content providers follow a standard for self-identifying their content in a computer-parsable way.

        The thing that distinguishes 1 and 2 from 3 is the scope of work required. #1 and #2 rely on a small team of dedicated people to accomplish the task. #3 relies on a very broad group of people of varying levels of dedication.

        If you're talking practically about the solution, none of those approaches are going to get to 100%. As others have pointed out there is a real human semantic problem in identifying which goals of Romario to count, how far back to look, etc.

        But the key is that #1 and #2 are approaches of a scope that we know can work. #3 seems unlikely to get the buy-in and effort required.
        • What web 2.0 and the semantic web can do together is better than #1 and #2. Instead of a dedicated team of 20 you get a completely arbitrary and undedicated team of 20 million doing their best to tag content with keywords. When you do a search on a keyword you get a relevance of say 90% for videos with MAN U goals because random people who stumbled upon them tagged them as such. This takes all the work out of it for the distributor. In essence you can put up a video with no meta data.... and let the masses
      • by Wildclaw (15718)
        Exactly. The biggest problem with the generic search engines today is that they don't separate content from page layout.

        A good example of this is a product page that contains the words "add review". If the site in question has a decent page rank, it may appear high up in the google listing when someone searches for 'review "product name"' even though the page in question doesn't contain a single review. This isn't a problem for popular items that have reviews on big review sites that are associated with th
      • I don't care about pages, I want information

        I don't think that everyone would agree with that. I don't believe that the semantic web is about data aggregation as much as it is about context sensitive search.

        If you were correct, then the semantic web would not get adoption until the data aggregation function could honor the syndicator's financial need to display advertising. That would be very tricky. The syndicator would no longer be able to make any promises to the advertiser regarding placement. That would have a chilling effect on the most p

      • I think the problem is that it's not being done well enough to do something like what you want.

        Another problem with the way you want to do it is that the sites that have this information want you to go to them. If you visit them, they get ad impressions, possibly ad clicks and some attention/notoriety/fame/etc. If there's no attention and no money to be made because some other service has slurped your information, then it's often not worth putting up the information in a manner that's easily & automatical
    • by KjetilK (186133)
      Google is an excellent example why this isn't solved. So, my employer wanted to link to some of our partners. We have a really high google rank. But we couldn't do that, because google would punish us because they think we're getting paid to link to our partners. Google is actually broken.

      What is broken about it? The thing is that google thinks that the link has semantics, and the PageRank algorithm is totally based on that flaw. A link doesn't have any semantics.

      So, what have we done about it? RDF is

  • by ooze (307871) on Wednesday March 21, 2007 @07:51AM (#18427001)
    The only way to set an industry standard is to get so big so fast in a new market/technology that everybody has to follow.
    Problem is, when you get so big so fast, there are almost necessarily major flaws in the designs.
    Problem is, you never get rid of them again.
  • Google (Score:2, Insightful)

    by c_fel (927677)
    What are those rumors about Google closing their search API? Are we talking about the boxes we can put on our sites to make a search in Google? I thought the ads shown beside the results were their main revenue: why the hell would they close it?
    • Re: (Score:3, Informative)

      by discord5 (798235)

      What are those rumors about Google who would be closing their search API ? Are we talking about the boxes we can put on our sites to make a search in Google ?

      No, this is about the SOAP API [google.com] being replaced by a less flexible AJAX API [google.com]. Never used either of them to be honest, but that's because I don't have any real need for them. When it comes to the content of my own websites (or rather my customers' websites), I'd much rather rely on my own database than an index google made.

  • Why it will fail (Score:5, Insightful)

    by squoozer (730327) on Wednesday March 21, 2007 @07:56AM (#18427031)

    It might fail for the reasons given (no I've not read the full article yet - naturally) but personally I think it will fail simply because it's too much work for the amount of payback. It would be great if one day magically overnight all our data was semantically marked up but that's not going to happen. The reality of it is that we will have to mark up the majority of content by hand. Even then inter-ontology mappings are so difficult that I'm not sure the system would be much use.

    Perhaps worse than that though is the prospect of semantic spamming. It would be impossible to trust the semantic mark up in a document unless you could actually process the document and understand it. What would be the point in the mark up in that case?

    • It would be impossible to trust the semantic mark up in a document unless you could actually process the document and understand it.

      A major point of the semantic web is that semantic markup about a resource can be provided anywhere else on the (semantic) web. You may not trust the markup in an unknown document about which you know nothing, but you may be able to trust the markup about that document provided by a trusted source. And, if you apply some rules for delegating trust, you may automatically provi

  • What is it anyway? (Score:4, Insightful)

    by mwvdlee (775178) on Wednesday March 21, 2007 @07:58AM (#18427045) Homepage
    So what is this semantic web / web 2.0 thing anyway?

    Sure, we're all seeing community sites, blogs, tagging, etc. But each of those sites is an individual site, and their only connections seem to be plain HTML links. Community sites don't really allow collaboration, blogs are standardized personal web pages and who here uses tags to actually find information? All these things might warrant a "Web 1.0 patch 3283" label, but is it really a new type of web? Is it the type and magnitude of paradigm shift that the first web was? It only seems like people are just becoming more aware of the possibilities of the same web it was 10 years ago.
    • Re: (Score:3, Insightful)

      by AutopsyReport (856852)
      I would agree. The current idea of what constitutes Web 2.0 doesn't fit the label. If I had to propose a new definition for Web 2.0, it would be the beginning shift of desktop applications to the Web. I just can't consider a trend in graphics, tagging, and social networking as a major advancement in the Web. Yeah, it's cool and it can be fun, but you said it best when you called it the "3282 patch". That's a more appropriate title for what's going on.

      What's really cool is the beginning of desktop to web
      • by mgblst (80109)
        This is because you don't really understand it. This is huge, when you start to think about it. It is an agreed upon form of describing data, rather than just blindly putting it up on the web as a blog or whatever. Go read up on it.
        • I don't need to read up on it, I've been using it. And it's not standardized. The concept of tagging data isn't new -- it's just new to the Web and "blogs" and "social networks". Ever heard of metadata? Tagging is shorthand metadata on a global scale. I suggest you be the one to do some research. Here's some links for you: Tags (metadata) [wikipedia.org] and Metadata [wikipedia.org]
    • Re: (Score:2, Interesting)

      by asninn (1071320)
      There might not be a clear revolution, but there certainly is a lot of evolution going on. For example, compare early web pages [w3.org] (written a mere 15 years ago) to, say, Google Maps; I think it's safe to say that more has happened than just a move from "Web 1.0" to "Web 1.0 patch 3283".

      The problem with "web 2.0" is not that the web hasn't changed dramatically, it's that the term is rooted in marketing rather than technology.
    • by KjetilK (186133)
      Indeed, the Web 2.0 thing is really a small, incremental change. Write-web was really in TimBL's ideas from the beginning. Collaboration too. Arguably, the application web is a bigger step up.

      But Web 2.0 is distinctly different from the Semantic Web. The Semantic Web is about structuring data on a global level and allowing queries on it. There is a lot of structured data out there (in backend databases, XML trees, etc), and once it is made available in a consistent data model, the Semantic Web is here.

      The bi

  • I do not think it means what you think it means
    • by radtea (464814) on Wednesday March 21, 2007 @09:26AM (#18427769)

      `When I use a word,' Humpty Dumpty said, in rather a scornful tone, `it means just what I choose it to mean -- neither more nor less.'

      `The question is,' said Alice, `whether you can make words mean so many different things.'

      `The question is,' said Humpty Dumpty, `which is to be master -- that's all.'

      Lewis Carroll had it right [sabian.org], and George Orwell agreed with him [wikipedia.org]: "Which is to be master" is the question that matters.

      In free societies, everyone is master, and our language is conditioned only by the minimal need to communicate approximately with others. Beyond that, we are free to impose whatever semantics we want, and we do this to a far greater extent than most people realize. As a friend who works in GIS once said, "If I send out a bunch of geologists to map a site and collate their data at the end of the day, I can tell you who mapped where, but not what anyone mapped." Individual meanings of terms as simple as "granite" or "schist" are sufficiently variable that even extremely concrete tasks are very difficult.

      Imposing uniform ontologies on any but the most narrowly defined fields is impossible, and even within those fields nominally standard vocabularies will be used differently by rapidly-dividing "cultural" subgroups within the workers in the field.

      The semantic web is doomed to fail because language is far more highly personalized than anyone wants to believe. I think this is a good thing, because the only way to impose standardized meanings on terms would be to impose standardized thinking on people, and if that were possible someone would have done it by now. Whereas we know, despite millennia of attempts, no such standardization is possible, except in very small groups over a very specialized range of concepts.

      • by bbtom (581232)
        So... you let people make their own ontologies. Since the Semantic Web is an open platform with open standards, this is easier than you would think. The only objection to this is "people are stupid". I don't think they are. YMMV.
  • Go to Wikipedia (for example) and look up the definition. Then tell me you understand it.

    See? Not a hope that a concept which includes 'collaborative working groups' as part of its definition can ever succeed.

    I mean these are the people who gave us HTML and CSS, god help us.

    Meaning is derived by humans from the interaction between data, knowledge and dialogue. What the semantic web will give us is:

    1) Data
    2) Limited knowledge to the extent that common, sufficiently rich models of relationships, taxonomies
  • One word: SPAM (Score:5, Insightful)

    by ngunton (460215) on Wednesday March 21, 2007 @08:34AM (#18427247) Homepage
    The thing the academics who push the semantic web fail to consider (most of the time) is that the Real World does not function like their Ideal World. In the Ideal World, everybody cooperates and works together to produce something of value for all mankind. So we get lots of correctly and appropriately marked up pages that give useful information on what's stored therein.

    But in the Real World, any online system that is used by a large enough number of people will eventually become attractive for spammers and scammers to defile and twist to their own purposes. So you'll get a deluge of pages that appear to be useful reviews of digital cameras (and are marked up as such) but in fact simply go to a useless "search" page that has lots of link farm references.

    And if you say "Ok, so we don't trust the author of the page, we have someone else do it"... then who? Who's going to do all the work? Answer: Nobody. AI is nowhere near being smart enough for this. Keyword searching is, unfortunately, here to stay. If you trust the author to do the markup, then the spammers have a field day. If you say "Only trusted authors" then the system will still fail, due to laziness on most people's part - if a system isn't trivial to implement and involves some kind of "authentication" or "authorization" then nobody will use it, period. The Web succeeded in the first place because anybody anywhere could just stick up a Web server and publish pages, and it was immediately visible to the whole world.

    The Semantic Web will fail for the same reason that the "meta" tag failed in HTML: Any system that can be abused by spammers, will be abused.

    So, the Semantic Web, which is all about helping people find stuff, will fail. Not because of any technological shortcomings (it's all very nice in theory), but simply because we as people won't work together to make it work. Well, a small number of people could work together, but as that number grows toward the point of being useful, it will automatically reach the tipping point where it becomes worthwhile for the spammers to jump in and foul it all up.
    • Re: (Score:3, Insightful)

      s/Semantic Web/Wikipedia/g;

      I believe all your arguments have been used to explain why Wikipedia will fail. Well, it hasn't failed yet.
      • Re: (Score:3, Interesting)

        by jlowery (47102)
        I believe all your arguments have been used to explain why Wikipedia will fail. Well, it hasn't failed yet.

        Ummm...

        1) Academia won't allow Wikipedia as a primary reference
        2) Stephen Colbert
        3) Authorities with unverified academic credentials
        4) Reversion wars
        5) Article lock-downs

        Also, Wikipedia relies on many editors working on a single resource, whereas the SW relies on single editors working on many resources. It is hard to corrupt many editors, but easier to have corrupt single editors.
    • by bbtom (581232)

      Whitelisting and search personalization. When your search engine is giving priority to people you know and trust and sites they say they trust then the spam problem is significantly lessened. The OpenID community are already using OpenID as a spam whitelisting mechanism. In the Semantic Web layer cake [w3.org], there is a layer called Trust, which is based on a combination of the RDF stack and document signing and encryption. Even if that isn't the way it's approached, Trust is something that SemWeb people are conce

    • Clearly, you haven't seen anything of what the academics have been doing. Of course, people brought along those lessons learned from the failures of HTML, which was partly because the data model was not clear, partly because it was indeed much too easy to abuse. Academics have of course discussed this at length [l3s.de]. Indeed, some approaches have been rather academic, such as building everything on big PGP-based trust networks, but others are very practical.

      Right now, we're building semweb based trust metrics [w3.org] f

    • But in the Real World, any online system that is used by a large enough number of people will eventually become attractive for spammers and scammers to defile and twist to their own purposes.

      Sure, but the Semantic Web also enables the kind of metametadata that enables automatically ignoring metadata from sources that your circle of trust (has designated as "spammers" or "scammers"), provided that you also have an accountability/trust mechanism, like relying on signed metadata.

      But, really, lots of problems

  • Obvious (Score:3, Insightful)

    by AlXtreme (223728) on Wednesday March 21, 2007 @08:35AM (#18427263) Homepage Journal
    The Semantic Web is a solution in search of a problem.

    No matter how cool your RDF/OWL ontologies are, the real world is perfectly happy with plain XML/CSV. If there isn't an obvious benefit, people won't switch.
    • by bbtom (581232)

      Have you seen GRDDL [w3.org]? It's a way of producing RDF/XML data from "plain XML" (although not plain CSV...), and XHTML too (and HTML if you run stuff through Tidy). It's typically W3C - a big long document to explain 'XML+XSLT=RDF', but it's pretty neat nonetheless. Anything that's even vaguely structured can be turned in to RDF very easily. The W3C made a mistake in thinking that RDF etc. would be picked up straight out of the gate in 1999 (when RDF was standardised) - instead, it's taking a rather bending path

  • Other Market (Score:2, Interesting)

    by Anonymous Coward
    Maybe these things will fail in the public world of free service bureaus with which this guy is familiar, but the concept of webservice API is exploding in the vertical market spaces. In only the last two or three years virtually every single vendor my company works with in the financial industry has launched fully WSE compliant webservices to tie into their products. Previously you would have to work in batch by uploading a file to a secure FTP site and wait for results to appear as another file in that
  • After it would work the academic way. It would be spammed to hell.

    Who do you trust giving away the right semantics for a page?

    Maybe a handful of companies will trust each other. Or google will make them sign something?

    Not a WEB I'm part of I guess.
  • Best essay on the topic I have come across: http://www.well.com/~doctorow/metacrap.htm [well.com]
  • ... but not for the reasons the researcher cited.
  • It's relative (Score:3, Insightful)

    by tbannist (230135) on Wednesday March 21, 2007 @09:51AM (#18428041)
    This is the real world, most things aren't total successes or total failures.

    Most likely the semantic web will fail to achieve all its objectives but achieve some of them, and may eventually rise again after it's failed. This is the nature of progress. Good ideas that fail are usually resurrected later. However, the blogger is probably right: as long as the semantic web is going to be "handed" to us by a group of established corporations, it will most likely never succeed; there's too much incentive for back stabbing in that top-down implementation. For it to succeed it needs to be so obvious that there's more money and power available by playing nice that all but the most black hearted capitalists will play nice. We have to be aware that people like spammers exist, though, and anything that could potentially be used to generate advantage will be abused to death.
  • The more functionality and interactivity you have between what were always envisioned as static documents, the more security holes are opened up. This combined with the Search Engine Optimization Industry, which is dedicated to lying about a site's content and relative importance, will ultimately sink any attempt to bring any trustable semanticness to the Web.
    • by bbtom (581232)
      The flipside to this is that the more 'varied' user agents are, the less susceptible they are to exploits. The Semantic Web - broadly understood - could find solutions to SEO. Search personalization and recommendation systems can go some way to neutralize SEO. My inbox on del.icio.us gets no spam at all, because the only people who can post things to it are people I trust. A web based on 'subscriptions' and public expressions of 'trust' will be a lot harder to spam and SEO.
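The subscription-and-trust idea in the comment above can be sketched in a few lines. This is only an illustration under stated assumptions: the names, the `trusts` table, the two-hop limit, and the example.org URLs are all hypothetical, and no real del.icio.us or OpenID API is involved.

```python
# Hypothetical data: who each person publicly says they trust.
trusts = {
    "me": {"alice", "bob"},
    "alice": {"carol"},
    "bob": set(),
    "spammer": {"spammer2"},
}

def circle_of_trust(start, trusts, max_hops=2):
    """Collect everyone reachable from `start` within `max_hops` trust links."""
    trusted, frontier = {start}, {start}
    for _ in range(max_hops):
        # People trusted by the current frontier that we have not seen yet.
        frontier = {t for person in frontier
                    for t in trusts.get(person, set())} - trusted
        trusted |= frontier
    return trusted

# Hypothetical incoming items as (author, url) pairs.
posts = [
    ("alice", "http://example.org/camera-review"),
    ("spammer", "http://example.org/link-farm"),
    ("carol", "http://example.org/rdf-primer"),
]

circle = circle_of_trust("me", trusts)
# Only items from authors inside the circle reach the inbox.
inbox = [url for author, url in posts if author in circle]
print(inbox)
```

The spammer can trust as many accounts as it likes; what matters is whether anyone inside the circle trusts it back, which is why the link-farm entry never shows up.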
  • I recently read a critique of "weak" SW (the "lower case semantic web") techniques like microformats, etc. The idea was that we need a high level metadata standard.

    Contrary to this opinion:

    I recently wrote in my AI blog [blogspot.com] about my expectations that the SW will develop from the bottom up. I also wrote about this 3 years ago (PDF "Jumpstarting the Semantic Web" [markwatson.com], skip to page 3).

    So, I partially agree with Stephen Downes that cooperation is unlikely, but the SW in some form will happen.
  • by Finin (97295) <[ude.cbmu.sc] [ta] [ninif]> on Wednesday March 21, 2007 @11:23AM (#18429301) Homepage
    Stephen's argument is based on the belief that "The Semantic Web will never work because it depends on businesses working together, on them cooperating." He says:

    "But the big problem is they believed everyone would work together:
    • would agree on web standards (hah!)
    • would adopt a common vocabulary (you don't say)
    • would reliably expose their APIs so anyone could use them (as if)"
    While the argument he makes is grounded in his distrust of corporations, which I share to some degree, his second point above is off the mark, at least for RDF.

    One of the features of the W3C's model [w3.org] (based on RDF) is that it doesn't push the idea that everyone should adopt the same vocabulary (or ontology) for a topic or domain. Instead it offers a way to publish vocabularies with some semantics, including how terms in one vocabulary relate to terms in another. In addition, the framework makes it trivial to publish data in which you mix vocabularies, making statements about a person, for example, using terms drawn from FOAF [xmlns.com], Dublin Core [dublincore.org] and others.
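As a rough illustration of mixing vocabularies, the sketch below describes one person using terms from both FOAF and Dublin Core and prints the statements in an N-Triples-like form. The two namespace URIs are the published ones; the person URI, the literal values, and the `to_ntriples` helper are made up for the example, and plain string handling stands in for a real RDF library.

```python
# Namespace URIs of the two vocabularies being mixed.
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical subject URI for the person being described.
PERSON = "http://example.org/people/alice"

# (subject, predicate, object) statements; objects are already written
# as N-Triples terms: quoted literals or <URI> references.
triples = [
    (PERSON, FOAF + "name", '"Alice Example"'),
    (PERSON, FOAF + "homepage", "<http://example.org/~alice/>"),
    (PERSON, DC + "description", '"Maintains the example.org data sets"'),
]

def to_ntriples(rows):
    """Serialize the statements one per line, N-Triples style."""
    return "\n".join(f"<{s}> <{p}> {o} ." for s, p, o in rows)

print(to_ntriples(triples))
```

Nothing in the data model forces the publisher to pick a single vocabulary: each predicate is just a URI, so terms from different sources sit side by side in the same description.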

    The RDF approach was designed with interoperability and extensibility in mind, unlike many other approaches. RDF is showing increasing adoption, showing up in products by Oracle [oracle.com], Adobe [adobe.com] and Microsoft [dannyayers.com], for example.

    If this approach doesn't continue to flourish and help realize the envisioned "web of data", and it might not after all, it will have left some key concepts, tested and explored, on the table for the next push. IMHO, the 'semantic web' vision -- a web of data for machines and their users -- is inevitable.

  • by KjetilK (186133) <kjetil@@@kjernsmo...net> on Wednesday March 21, 2007 @11:26AM (#18429349) Homepage Journal
    Well, if his first point was correct, the web wouldn't exist at all. Although there are lengthy fights in for example the HTML area, and it took a while to get RDF on a firm footing, semweb standardisation is actually moving pretty quickly now that we have the foundations.

    His second point is just one of the common misconceptions and FAQs [w3.org]. The Semantic Web doesn't require that people do that.

    I have just accepted a position with a consultancy that does a fair amount of work for those cut-throat businesses. And they are interested, very interested, in fact. Which is also why Oracle, IBM, HP, even Microsoft are interested.

    Typical use case for them is: So, you bought your competitor, and each of the companies sits on big valuable databases that are incompatible. You have a huge data integration problem that needs solving fast. So, throw in an RDF model, which is actually a pretty simple model. Use the SPARQL query language. Now all employees have access to the data they need. Problem solved. Lots of money saved. Good.
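A toy version of that integration scenario, as a sketch only: two hypothetical customer tables with incompatible column names are mapped onto shared predicates and queried with one pattern. The table contents, the mappings, and the `match` helper are invented for illustration; `match` is a stand-in for the kind of pattern a SPARQL query expresses, not a SPARQL engine.

```python
# Hypothetical rows from two incompatible customer databases.
company_a = [{"cust_id": 1, "full_name": "Alice Example", "city": "Oslo"}]
company_b = [{"id": "B-7", "name": "Bob Sample", "town": "Bergen"}]

# Per-source mapping from local column names to shared predicate names.
MAP_A = {"full_name": "name", "city": "locatedIn"}
MAP_B = {"name": "name", "town": "locatedIn"}

def to_triples(rows, key, mapping, prefix):
    """Turn table rows into (subject, predicate, object) triples."""
    triples = set()
    for row in rows:
        subject = f"{prefix}:{row[key]}"
        for column, predicate in mapping.items():
            triples.add((subject, predicate, row[column]))
    return triples

# One merged graph: set union is all the "integration" the model needs.
graph = (to_triples(company_a, "cust_id", MAP_A, "a")
         | to_triples(company_b, "id", MAP_B, "b"))

def match(graph, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(t for t in graph
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))

# Roughly what a query like SELECT ?name WHERE { ?s :name ?name } asks for.
names = [o for _, _, o in match(graph, p="name")]
print(names)
```

The per-source mappings carry all the schema knowledge; once both tables are triples, a single query covers both companies' data.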

    But this is not part of the open web, you say? Indeed, you're right. So, Semantic Web technologies have already succeeded, but not on the open web. And since I'm such an idealist, I want it on the open web. So, the blog still has a valid point.

    We need to make compelling reasons why they should put (some) data on the open web. It isn't easy, but then, let TimBL tell you it wasn't easy to get them on the web in the first place. It is not very different, actually. The main approach to this is to capitalise on network effects. There is a lot of public information, and we need to start with that.

    So, partly, that's what I'll do. We have emergent use cases, and that's the evil part of cut-throat business. You don't talk about those before they happen. So, sorry about that. I think it will be very compelling, but it'll take a few years. If you're the risk-averse kinda developer who first and foremost has a family to feed, then I understand that you don't want to risk anything, and you can probably jump on the bandwagon a couple of years from now, having lost relatively little.

    But if you, like me, like to live on the edge, and don't mind taking risks doing things that of course might fail, then I think semweb is one of the most interesting things right now.

    • by bbtom (581232)

      Yes, it's typical "argh! It's not happening!". Well, unless people take an interest and do something, of course it won't happen.

      The adoption problem needn't be. If companies and organisations are unwilling to put data up semantically, someone else will. We see this already with accessibility - in the UK, a replacement train times website [traintimes.org.uk] has been built to replace the crappy National Rail website. I wrote a MySpace screen scraper recently so I wouldn't have to visit the profiles of my friends, and instead I

  • I think the problem is in the author's head. Difficulties always exist between vendors. They are worked out when the benefits of cooperation outweigh the benefits of non-cooperation. I believe what we are calling the semantic web has other features that many consider failure but in reality are inherent to sharing the amount of information we are trying to share, namely, universal uniformity and a single clean interface into human civilization (which is really what the sm is) is impossible and foolish to
  • Google already turned off its Search API for new registrations; only those already w/ accounts can continue using the web services-based search API. I believe their AJAX API, which is less useful, is still open.
