Using the Semantic Web to Enhance Search (150 comments)

Posted by Zonk on Friday May 27, 2005 @10:21AM from the does-whatever-a-spider-can dept.

RobMcCool writes "At Stanford KSL, we really like the Semantic Web. So we've taken many of our favorite web sites, scraped them, and put together a huge pile of RDF, which we'll let you download. We've used that RDF to create a search application, in the spirit of Google Q & A or Microsoft's recently announced MSN Search extensions. Our search can answer simple factual queries like the previously discussed population of Portugal, but can also answer some more complex ones. We also have a smart autocomplete system; type "tom hanks birth" slowly to see it in action (best with Firefox). We're looking for people to be a part of this search system by running their own search sites, and by putting their data on the Semantic Web. Come check it out!"
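
For readers wondering how a pile of RDF can answer something like "population of Portugal", here is a minimal sketch. It is not the Search on TAP implementation; it only assumes facts are stored as (subject, property, value) triples. The data, the property names, and the `answer` helper are all hypothetical, and the population figure is a placeholder rather than a sourced number.

```python
# Hypothetical toy data: (subject, property, value) triples.
# The population value is an illustrative placeholder, not a sourced figure.
TRIPLES = [
    ("Portugal", "population", "10500000"),
    ("Portugal", "capital", "Lisbon"),
    ("Tom Hanks", "birth date", "1956-07-09"),
]


def answer(query, triples=TRIPLES):
    """Return values whose subject and property both appear in the query text."""
    q = query.lower()
    return [
        value
        for subject, prop, value in triples
        if subject.lower() in q and prop.lower() in q
    ]


print(answer("population of Portugal"))
print(answer("tom hanks birth date"))
```

Plain substring matching is the crudest possible way to map words onto subjects and properties; it is used here only to keep the sketch short.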

This discussion has been archived. No new comments can be posted.

Google watch out... (Score:5, Insightful)

by jason718 ( 634659 ) writes: on Friday May 27, 2005 @10:24AM (#12654785)

Semantic-driven search engines have awesome potential. However, it does place a lot of demand on the content provider to provide metadata-rich content - or to be able to provide intelligent mining tools to create metadata from existing sites.

This is definitely one to watch...

- Re:Google watch out... (Score:1, Interesting)
  
  by Anonymous Coward writes:
  
  Note to self. Dreaming about the world tagging all their data isn't going to happen. It takes too much damn time. Semantic-driven search using Google's technique works. Producing an RDF graph is crap. Nothing to watch here.
  - Re:Google watch out... (Score:3, Informative)
    
    by Metasquares ( 555685 ) writes:
    
    As one who has written semantic web pages, it's also rather difficult. OWL is a real pain to write, and most interpreters don't support "OWL Full", which means I'm stuck writing for either "OWL Lite" (now with only half the calories!) or "OWL DL". Forget (X)HTML, too - you need to use XML+RDF to use OWL, which means that if you want content you either need a parser or you need to code two documents for each one: One for human readability, and one that contains the metadata. There used to be a language calle
    - Re:Google watch out... (Score:1)
      
      by ngibbins ( 88512 ) writes:
      
      It's not that most interpreters don't support OWL Full, but that there are no tractable, sound and complete algorithms for subsumption reasoning in the logic that underpins OWL Full. If you write OWL DL there are restrictions on what you can express, but you do then have tractable algorithms. It's the tradeoff between expressivity and complexity, in short.
      
      SHOE was primarily the result of Jeff Heflin's PhD research, and he used his experiences of writing SHOE to good effect on the W3C's Web Ontology Workin
  - - - Re:Bashers watch out... (Score:2, Insightful)
        
        by Dasch ( 832632 ) writes:
        
        So, where do you find the business case that justifies web designers all over the world spending even 10% extra time to specify the information needed by the Semantic Web???
        
        If it would mean that their sites would rank higher in the search results, I'd say that they all would...
- Re:Google watch out... (Score:3, Interesting)
  
  by ShinmaWa ( 449201 ) writes:
  
  However, it does place a lot of demand on the content provider to provide metadata-rich content
  
  This statement is why I was wondering why this was considered such a wonderful thing. For a while now, there's been a research project at IBM called WebFountain [ieee.org] that not only does everything that the Semantic Web attempts to do, but doesn't require any special markup either. Its goal is to work with completely unstructured data of any type, including web pages, PowerPoint documents, Word docs, PDFs, etc.
  - Re:Google watch out... (Score:1)
    
    by ngibbins ( 88512 ) writes:
    
    From the IEEE Spectrum article:
    
    WebFountain works by converting the myriad ways information is presented online into a uniform, structured format that can then be analyzed. The goal is to provide a general-purpose platform that can allow any number of so-called analytic tools to sift the structured data for patterns and trends. Creating the needed structure automatically is WebFountain's big advance, because it requires at least some understanding of what the information actually means.
    WebFountain compl
- Re:Google watch out... (Score:2)
  
  by Azghoul ( 25786 ) writes:
  
  Geographers have been waiting for over a decade for metadata to catch on. Everyone hates building metadata, even when they know it makes their data infinitely easier for other geographers to use.
  
  In the context of GIS data, where metadata can be incredibly useful, creation of metadata is like pulling teeth.
  
  Unfortunately, until and unless there are automated tools (your "intelligent mining tools"), this whole thing will never be more than a curiosity...
- Qualify as Semantic Web ? (Score:1)
  
  by copdk4 ( 712016 ) writes:
  
  The most basic requirement for any application to qualify as a "Semantic Web" app (from the SW Challenge, http://www-agki.tzi.de/swc/swapplication.html [www-agki.tzi.de]) is that the application should use "some formal description of the meaning of the data"! RDF by itself doesn't give any *meaning* or *semantics* to the data. You need to associate your RDF data with RDFS/OWL for that purpose (TAP doesn't have a published OWL ontology: http://tap.stanford.edu/tap/tapkb.html [stanford.edu]).
  
  Also given that you don't have any 'meaning' to nodes and links
  - Re:Qualify as Semantic Web ? (Score:1)
    
    by ngibbins ( 88512 ) writes:
    
    TAP doesn't appear to have an ontology (OWL or RDFS) that's published separately from the RDF data, but the RDF data files do appear to contain class definitions. In my book, that's sufficient meaning to qualify as a SW application under the rules laid down by the SW Challenge. It's certainly about as much meaning as we had in CS AKTive Space [aktors.org] when we won the first SW Challenge in 2003.
    - Re:Qualify as Semantic Web ? (Score:1)
      
      by copdk4 ( 712016 ) writes:
      
      The SW 2003 Challenge was in October; the W3C OWL standard wasn't yet finalized (it was a Candidate Recommendation in Aug 2003, http://www.w3.org/2001/sw/WebOnt/#L151 [w3.org]) and to my knowledge no reasoner (FaCT/Racer) supported full OWL/DAML+OIL reasoning. So I guess the 'semantics' aspect was not a big concern then.
      
      Today OWL is formalized. Several OWL-based APIs/reasoners are in place. Using such 'RDF only' applications misguides people and the community. My only request to you all Semantic Web Gurus is to preach right
Loading... (Score:1)

by DrinkingIllini ( 842502 ) writes:

As soon as you begin to type, it starts loading something, and it keeps loading with each character. I'm guessing it is the autocomplete "feature", but it loads too slowly for me to tell. Anyone else have any luck?
- Re:Loading... (Score:1)
  
  by fa2k ( 881632 ) writes:
  
  It's /.'d
From the check it out link... (Score:2, Funny)

by Anonymous Coward writes:

"Search on TAP was built to answer the following types of queries: There are also two actors named Harrison Ford: the one who played Han Solo, and a silent film star from the 1920's."
That's nice and all, but who shot first, and is there a mash-up of both scenes with crazy alien bar music mixed with '20s sinister piano?
600MB?!?!?!? (Score:1)

by phantasma6 ( 799340 ) writes:

You're linking a 600MB file from slashdot?

(oh, and I'm getting 503's for the searches)
How 'bout... (Score:1)

by Himring ( 646324 ) writes:

http://sp11.stanford.edu:8000/valve?query=it+is+slashdotted&ckb=Aggregated+Data [stanford.edu]
autocomplete (Score:5, Insightful)

by cryptoz ( 878581 ) writes: <jns@jacobsheehy.com> on Friday May 27, 2005 @10:28AM (#12654845) Homepage Journal

Autocomplete is a useless feature that nobody wants to see when they type "a"... and see it load everything that begins with "a". The user is not interested in items starting with "a". Perhaps they're interested in terms beginning with "anon" or something, which has many fewer items to load, therefore making the load time much faster and not annoying the user in the process.

Or, even better, never have any autocomplete turned on automatically. Do a VB-like idea, where if you want to see possibilities at a certain point, hit a specific key that will register for the list to pop down.
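
The "wait for a longer prefix" idea above can be sketched in a few lines. This is only an illustration under assumptions: the vocabulary, the `suggest` function, and the `min_chars` and `limit` thresholds are hypothetical, and nothing here reflects how the Stanford autocomplete is actually built.

```python
import bisect

# Hypothetical vocabulary; a real system would load this from its index.
TERMS = sorted(["anonymous", "anonymity", "apple", "asterisk", "astronaut", "banana"])


def suggest(prefix, terms=TERMS, min_chars=3, limit=20):
    """Return up to `limit` terms starting with `prefix`.

    Nothing is returned until at least `min_chars` characters are typed,
    so a lone "a" never triggers a huge result list. `terms` must be sorted.
    """
    prefix = prefix.lower()
    if len(prefix) < min_chars:
        return []
    # Binary search jumps straight to the first candidate for this prefix.
    start = bisect.bisect_left(terms, prefix)
    matches = []
    for term in terms[start:]:
        if not term.startswith(prefix) or len(matches) >= limit:
            break
        matches.append(term)
    return matches


print(suggest("a"))
print(suggest("anon"))
```

The key-triggered variant would amount to calling `suggest` with `min_chars=0` only when the user presses the chosen key.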

- Re:autocomplete (Score:1)
  
  by coolcold ( 805170 ) writes:
  
  Or maybe limit the autocomplete list to like 20 results?
  
  That is, the autocomplete list would not show up until the matches are down to 20 or fewer.
  
  Of course, this is just another example.
- Re:autocomplete (Score:3, Insightful)
  
  by davegust ( 624570 ) writes:
  
  Have you tried Google Suggest [google.com]? Auto complete is very useful when it doesn't slow down the typing, and when the results are in a useful order.
- Re:autocomplete (Score:2)
  
  by no soup for you ( 607826 ) writes:
  
  Autocomplete is a useless feature that nobody wants to see when they type "a"... and see it load everything that begins with "a".
  
  That's just usability; the concept is sound. Instead of filling in results with "a", fill them in on three letters like "ast", which could have asterisk, astronaut, etc. The idea is to 1) save time by not making them type in an extra 6 letters and 2) cut down on misspellings.
- Useless? I don't think so. (Score:2)
  
  by jxyama ( 821091 ) writes:
  
  I don't agree that it's completely useless. Don't we all tend to type the most important query word first?
  In any case, for Japanese/Chinese/Korean - autocomplete is almost a natural part of using a web search engine, so it's not a "useless feature that nobody wants to see."
  
  Those languages use alphabet-based inputs which are then converted into native text. Why bother converting if you can take the direct alphabetical input and start showing native text autocompletes?
- Re:autocomplete (Score:1)
  
  by RobMcCool ( 886341 ) writes:
  
  Normally I would agree with you, but we added autocomplete for a very real reason.
  
  As a prior post pointed out, the most important problem with the Semantic Web is getting people to generate data. Until that happens on a widespread basis, the data coverage will always be spotty compared to a keyword engine.
  
  We added the Autocomplete dropdown in response to user feedback that they had no idea what was in the system until they hit "enter", and by then it was too late. The dropdown gives immediate feedback abo
Semantic Web? (Score:5, Informative)

by DoctoRoR ( 865873 ) * writes: on Friday May 27, 2005 @10:29AM (#12654854) Homepage
The Stanford research is interesting, but I'm still trying to make up my mind about the Semantic Web, learning about RDF, and whether I need to bake in ways of handling these kinds of assertions in my web app. The Stanford group writes, "Our hope is that our search application spurs development of the Semantic Web, and leads to sites publishing their data in this format so that we don't have to." It obviously takes more work to encode such information and to get user contributions auto-marked for the semantic web. For a counter viewpoint, take a look at some of Clay Shirky's work -- in particular:
- The Semantic Web, Syllogism, and Worldview [shirky.com] written in Nov 2003.
- His excellent talk "Ontology is Overrated" in downloadable spoken audio [itconversations.com]
Will the semantic web be supported by future versions of Drupal, phpBB, and other grass-roots content management web apps? Not sure. Since a lot of the content is visitor generated, you would have to build in ways of providing easy markup. Would be interested to hear /. thoughts on the matter.
- Re:Semantic Web? (Score:2)
  
  by maharg ( 182366 ) writes:
  
  Until truly intelligent semantic classifying engines are available, the semantic web is best suited to things like Wikipedia, where the information is (generally) of a higher quality than what you find on a more general purpose site, like the one you are viewing now!! For example, a slashdot story about a newly discovered type of <crab type="crustacean"/> would soon degenerate into postings about <crab type="venereal disease"/>. Marking quickie (pun intended) posts up semantically would detract f
- Re:Semantic Web? (Score:2)
  
  by daviddennis ( 10926 ) writes:
  
  I have never been fond of articles like this. Slashdot points us to something new (at least to me), and links to horribly long-winded and incomprehensible explanations of what it is. Sure, I could understand them ... if I had an extra hour or two.
  
  Since it's obvious that you do understand, would it be possible for you to come up with a 1-2 paragraph explanation of what the Semantic Web is and does?
  
  I've spent some time on the linked to web site, and read Clay Shirky's essay, and I'm still not sure what it
  - Re:Semantic Web? (Score:1)
    
    by smartdreamer ( 666870 ) writes:
    
    Take a look at the official W3C reference there [w3.org]. Read the header (first paragraphs). That's a very basic introduction.
    In short, the goal of the semantic web is to make the web's semantics understandable to computers (by any means possible). The aim is to enable new possibilities and automation. For this to be possible, we need to make things explicit in a formal manner.
    - Re:Semantic Web? (Score:2)
      
      by daviddennis ( 10926 ) writes:
      
      I guess what I'd like to see, instead of a vague initial paragraph and pages of formal specifications, is a concrete example of how you would code this, and then how it would be used.
      
      Many thanks.
      
      D
      - Re:Semantic Web? (Score:1)
        
        by smartdreamer ( 666870 ) writes:
        
        Okay, you want more than words... I guess you ask too much. ;)
        Semantic web is not something you can think of as a concrete application, nor can we consider it mature. As you surely read, semantic web is an extension of the current web. So I can link you to firefox [mozilla.org] or some HTML editor. Joke aside, it is more complicated than that and if you want to embrace semantic web you should get to know XML, RDF and OWL (in this order). In fact, if you are not working to build sw, you should consider another approach. I
      - Re:Semantic Web? (Score:1)
        
        by Albinofrenchy ( 844079 ) writes:
        
        Alright, so let us check out a sample from OWL (Web Ontology Language): [w3.org]
        
        Wine Rdf [w3.org]
        
        Look through that RDF with emacs/notepad. You will probably not understand all of it, but you can get the gist of it. It attempts to classify things categorically almost, so finding out context for a word is simple. For instance, the owl:Class of "Wine" is a subClass of "PotableLiquid" with a couple restrictions and properties that wine could have in real life.
        
        Why is this useful? It dramatically increases the level at
  - Re:Semantic Web? (Score:1)
    
    by TrappedByMyself ( 861094 ) writes:
    
    I wouldn't worry about it right now. The Semantic Web and RDF and OWL and Ontologies and all that jazz are still mostly in academic circles. I could go one step further and argue that "metadata" is still an academic subject. It's big at universities and in the $$ government contracting world, but the average joe application of it all just isn't here yet.
    
    The smarties of the world know that metadata can be used to do all sorts of great things, but it just hasn't happened yet. The technology and the understan
- Re:Semantic Web? (Score:1)
  
  by miro2 ( 222748 ) writes:
  
  Clay Shirky's objections don't hold water. His examples of faulty logic assume that RDF statements should be reasoned on in isolation. In fact, many systems which pair truth-values with statements are quite capable of avoiding the faulty logic he claims is an inherent consequence of using RDF statements. Look at http://www.cogsci.indiana.edu/farg/peiwang/papers.html [indiana.edu] NARS or probabilistic term logic for example.
slashdotted (Score:3, Funny)

by maharg ( 182366 ) writes: on Friday May 27, 2005 @10:31AM (#12654869) Homepage Journal

faster than a thousand speeding gazelles [stanford.edu]

- Re:slashdotted (Score:1)
  
  by RobMcCool ( 886341 ) writes:
  
  I think I'll lock the door so the IS department can't find me.
  
  There's a coral cache [nyud.net] of the static content, including screenshots, if you can't get through to my melted pile of servers.
Semantic Web Pitfalls (Score:4, Insightful)

by aftk2 ( 556992 ) writes: on Friday May 27, 2005 @10:32AM (#12654883) Homepage Journal

While the idea of the semantic web has been legitimately lambasted [shirky.com], I think it's a bit far from DOA. While I agree that it's not exactly practical, I think that if you get enough sites displaying their content in such a manner, you'll eventually reach a point at which others will do the same.

I mean, think about it this way - while laziness or inertia might initially win out, once someone's competitors start to explore the idea of the semantic web, interest will start to be shown in it, especially once it becomes profitable to do so.

- Re:Semantic Web Pitfalls (Score:2)
  
  by Omnifarious ( 11933 ) writes:
  
  Well, part of Shirky's point is that it is so lacking in usefulness that there will be no advantage to anybody in displaying their content that way. I think he's right. I've watched AI based on this kind of logical rules and semantics stumble along for years without producing anything useful, and then along comes some program that takes little pieces of what other people said and 'mindlessly' strings them together in new ways and it wins a Turing contest.
  
  Logical reasoning of this kind, despite all the hyp
  - Re:Semantic Web Pitfalls (Score:1)
    
    by RobMcCool ( 886341 ) writes:
    
    Logical reasoning is currently primitive and definitely overrated. We don't use OWL. The reasoning we do is very primitive, and is not of the sort that Clay Shirky is talking about. I actually agree with the thrust of his essay, despite the flaws that others have pointed out.
    
    TimBL has talked about the Semantic Web as less a thing of logic and more like a giant database. I think that characterization has some problems also, but it's closer to what Search on TAP is doing.
- Re:Semantic Web Pitfalls (Score:2)
  
  by Eternally optimistic ( 822953 ) writes:
  
  It gets worse: the method relies on the web site content author to know the semantic content, and to honestly report it. How would you check these things? Voting to determine if the earth revolves around the sun?
  - Re:Semantic Web Pitfalls (Score:1)
    
    by RobMcCool ( 886341 ) writes:
    
    We haven't really dealt with the spam problem because it's a problem we'd love to have. Right now there's so little content that we can afford to only pick the highest quality sites.
    
    The automated techniques like those WebFountain uses are susceptible to the same problems, as is Wikipedia, so I'm not convinced that this is necessarily a Semantic Web problem as much as an Internet problem.
    - Re:Semantic Web Pitfalls (Score:2)
      
      by Eternally optimistic ( 822953 ) writes:
      
      As far as spam goes, and mistaking popularity for correctness, yes you are right, and both of these are a big problem already.
      But there remains the problem that this technique does not find semantic connections that the authors don't know about.
This won't work (Score:2, Interesting)

by holyshitholyshit ( 877523 ) writes:

Firstly, scraping is the same as what Google does, which is fine, but only a fool would trust the scraper not to censor their output.
Secondly, scraping doesn't always work and you will surely have low-grade porno and get rich quick schemes/scams littering your semantic data.

But let us suppose that the main benefits of a semantic web are (A) access to reference data [which may be falsified, oops], and (B) access to product availability data [which may be falsified, oops, like mail order companies that pret
A similar project that I worked on at school (Score:1)

by matt_king ( 19018 ) writes:

Check out QuASM (Question Answering using Semi-Structured Meta-data)... we used similar processes and approaches to getting "answers" out of a large (40 TB) collection of .gov, .edu, .org web pages. The demo page is no longer available (we completed work on this in 2002) but you can check out the paper at ACM:

http://portal.acm.org/citation.cfm?id=544220.544228 [acm.org]

It was a really interesting project to be a part of!

Go UMass!
A tale of two technologies.... (Score:3, Interesting)

by Crimson Dragon ( 809806 ) * writes: on Friday May 27, 2005 @10:35AM (#12654913) Homepage

The Semantic Web appears to be a budding server-side solution to the paradigm of information glut online. Social bookmarking appears to be a client-side solution to the paradigm of information glut online.

It is refreshing to see exciting new solutions to the problems we have at present of targeted information retrieval on the internet. I can remember years of stagnation in this field (read: early 90's), and any change from today's google-and-pray searching mentality among the majority of end-users will be welcome.

One more step... (Score:2)

by LegendOfLink ( 574790 ) writes:

...towards the future.
awesome! (Score:3, Funny)

by Anonymous Coward writes: on Friday May 27, 2005 @10:42AM (#12654974)

...now I can finally search for "images of women with breasts larger than 36D"!

- Re:awesome! (Score:2)
  
  by mikefe ( 98074 ) writes:
  
  Idiot.
  
  Just go to any hardcore site for that.
  
  Maybe that's why sites showing non-professionals are becoming more popular. Not everyone likes the big fake ones...
- - Re:awesome! (Score:1)
    
    by RobMcCool ( 886341 ) writes:
    
    I'm not sure our sponsors in the military and intelligence agencies would fund research in breast sizes. They're kind of a sensitive bunch. But maybe I'll stick it in a proposal and see what happens.
    
    I think the comment that semantic web research has focused on logic such as query analysis, comparisons, and groupings is fair for the Semantic Web in general.
    For Search on TAP we don't have a lot of people or resources. Despite that, I spend an awful lot of time generating data. The compressed RDF, which we
RDF? (Score:1)

by Thnikkaman ( 818752 ) writes:

We've used that RDF to create a search application... Steve Jobs is the only one using any RDF to get applications made. Has he finally gotten a distortion field so big that others think they have them? hmmmm....
Might actually help (Score:4, Insightful)

by Artifakt ( 700173 ) writes: on Friday May 27, 2005 @10:50AM (#12655049)

This looks like it will broaden the volume of useful searches. Right now, there are at least two limits that show up when searching:

1. For really popular subjects, the useful links are swamped in the noise of sites trying to make a buck off of getting you to look at their ads before directing you to somewhere else, that might have the actual content or might not.

2. For many less popular subjects, there is some oddity, like an unusual term being borrowed by some other field, so that it is something most people have never heard of, but people in two or more specialties use it frequently, in very different ways, resulting in strangeness (e.g. the search engine throws up 23,003 links for a search on "Sator Resartus": 30% are esoteric literary criticism, 20% relate to apoptosis (cell biology), 20% relate to building moral inhibitions into A.I., 10% to Keith Laumer novels, and the rest are probably noise).

(I'm sure there are more than these two limits. Someone else may want to comment on some others).

This is likely to help with the second case, oddities in the data set grouping. (it could sort links into the larger sub-categories, query the user which one(s) seemed most applicable, and maybe even sort out a small set of links that explain, for the previous example, how a high brow literary term got borrowed by the other fields).
It's not as likely it would help with the first case, though, as sites that don't have actual content are actively duplicitous. Something that is actively trying to fool humans is still likely to be very successful at fooling our tools.
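
The "sort links into the larger sub-categories" step can be sketched as below. It assumes each hit already carries a category label, which is exactly the metadata the Semantic Web is supposed to supply; the hits, the labels, the `group_by_category` function, and the `min_share` threshold are all hypothetical.

```python
from collections import defaultdict

# Hypothetical hits for one ambiguous query: (url, category) pairs.
HITS = [
    ("lit-1", "literary criticism"), ("lit-2", "literary criticism"),
    ("lit-3", "literary criticism"),
    ("bio-1", "cell biology"), ("bio-2", "cell biology"),
    ("ai-1", "A.I. ethics"), ("ai-2", "A.I. ethics"),
    ("nov-1", "Keith Laumer novels"),
    ("junk-1", "ad farm"), ("junk-2", "link spam"),
]


def group_by_category(hits, min_share=0.2):
    """Group hits by category, largest group first.

    Categories holding less than `min_share` of all hits are lumped
    together under "other" so the user only chooses among big groups.
    """
    groups = defaultdict(list)
    for url, category in hits:
        groups[category].append(url)
    total = len(hits)
    result, other = {}, []
    # Sort by descending size, then by name for a stable order.
    for category, urls in sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0])):
        if len(urls) / total >= min_share:
            result[category] = urls
        else:
            other.extend(urls)
    if other:
        result["other"] = other
    return result


for category, urls in group_by_category(HITS).items():
    print(category, len(urls))
```

A search front end could then show the group names and ask which one the user meant before listing any links.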

Semantic Horse shit (Score:1, Interesting)

by Anonymous Coward writes:

I hate to say it, but Semantic Web blows chunks. No business is ever going to tag all their data so that anyone can use it. Businesses prefer to build specific webservices to integrate and charge customers. Get a clue W3C, RDF is fertilizer. So far, all the RDF rule engines out there suck from a scalability and performance perspective. There are two RDF rule engines that claim to implement RETE, but several people have analyzed them and shown that neither Jena 2.2 nor pychinko implements RETE.
The best part is t
- Re:Semantic Horse shit (Score:1)
  
  by matt_king ( 19018 ) writes:
  
  The Internet is used for things other than businesses, or have you forgotten that? The concept of a Semantic Web has huge implications for many research projects, as a way to get better "information" out of all the "data" that is available on the internet today.
  
  Although it would be nice, no one is mandating or asking every website out there to mark up all their pages semantically. But if you want your information to be shared, a good way to start is to mark it up semantically so that more and better inform
  - Re:Semantic Horse shit (Score:2, Insightful)
    
    by Anonymous Coward writes:
    
    Nice straw man argument. How many people making their own personal site are going to dedicate 2/3 of their time to tagging their content? The only people that are going to tag their content are those looking to abuse the system. No sane individual is going to spend 3 months of time to go back and edit all their pages with tags. Even then, you still have the problem of conflicting categories (aka ontologies). There will never be a globally accepted set of ontologies. It's all a pipe dream. Why should users spend h
    - Re:Semantic Horse shit (Score:1)
      
      by matt_king ( 19018 ) writes:
      
      It's going forward that this is important... if you are designing a new site today, it might be worth your while to try and represent the data semantically. Just as real web designers no longer design with nested tables *shudder* and use CSS to separate out presentation logic from content, so too will people start going even deeper, into making their "web data" into "web information".
      
      This is not about arguing over a set of standards over the ontology of how the data should be represented; this is about think
    - Re:Semantic Horse shit (Score:2)
      
      by pfafrich ( 647460 ) writes:
      
      There are quite a few things where people might want to use some semantic mark-up:
      
      Creative Commons: use RDF to specify copyright and licence info about a page; you can now search on this using special pages on Google and Yahoo.
      
      Anyone who wants to sell something will be interested in making their content easy to find. A little bit of semantic mark-up could help them shift units.
      Anything pulled out from a database. Here it's relatively easy to modify the code to add some extra mark-up.
      Tagging this seems to
- Re:Semantic Horse shit (Score:2)
  
  by ultranova ( 717540 ) writes:
  
  I hate to say it, but Semantic Web blows chunks. No business is ever going to tag all their data so that anyone can use it. Business prefer to build specific webservices to integrate and charge customers.
  Fine with me. I don't want their information. In fact I'd like to get rid of their information (banner ads and spam).
  If I want to deal with businesses, I go to my local shop. If I can't find what I want there, I look up the yellow pages of my local phonebook. If I can't find what I want there, I loo
- Re:Semantic Horse shit (Score:2)
  
  by MarkWatson ( 189759 ) writes:
  
  You have a valid point of view, but just one quick clarification:
  
  Rete scales really well as you add rules but scales really poorly with the number of items in working memory.
  
  I believe that rete would be a bad choice for the SW where you would have a very large data set in working memory.
  
  (I used to do a lot of rete hacking: commercial expert system tools for Xerox Lisp Machines and the Mac, and hacking OPS5 to support 'multiple data worlds' for in house use.)
- - Re:Semantic Horse shit (Score:1)
    
    by f00zbll ( 526151 ) writes:
    
    The thing about this is that it appears to be a solution in search of a problem. While one might say that the "problem" is unorganized data, this isn't actually the solution, because it doesn't work until the data is organized. It's like the solution to the problem is "the web is unorganized", these "semantic web" people say "well organize it! There, problem solved!"
    semantic web definitely is a solution in search of a problem. it's probably naive to think organizing data is easy. the original post was
    - Re:Semantic Horse shit (Score:1)
      
      by smartdreamer ( 666870 ) writes:
      
      In fact semantic web is already there in some forms: foaf [foaf-project.org], the mindswap site [mindswap.org], or think of every RSS feed.
      People who don't have a clue about semantic web tend to refer to it as semantic horse shit. It's a pity that those who don't believe in things try to demolish them rather than let them go... or let them perish if they are so sure about their doom.
      - Re:RSS is not Semantic Web (Score:1)
        
        by smartdreamer ( 666870 ) writes:
        
        I agree with you, semantic web is not a reality outside certain circles. But my point is that it won't come like many think it will: in a big Google-like demo. It will come from many little implementations like we see with foaf. We can imagine a big mainstream ISP encouraging users in such a community.
        As for RSS, it is limited, but it took off rapidly. RSS v1.0 introduced RDF. That is another step in the right direction.
        BTW RDF isn't that complicated. Think of it as a triplet : Subject Verb Obje
        
        Re:RSS is not Semantic Web (Score:1, Interesting)
        
        by Anonymous Coward writes:
        
        As for RSS, it is limited, but it took off rapidly. RSS v1.0 introduced RDF. That is another step in the right direction. BTW RDF isn't that complicated. Think of it as a triplet : Subject Verb Object.
        I don't think the evidence on the RDF mailing list supports that opinion. Look at the literature in the bookstores about semantic web. If anything, it is full of confusion and the specification is poorly written compared to the HTML and XML specifications.
        
        Triplet does not equal (Subject verb object). What the R
        
        Re:RSS is not Semantic Web (Score:1)
        
        by smartdreamer ( 666870 ) writes:
        
        I don't think the evidence on RDF mailing list supports that opinion. Look at the literature in the bookstores about semantic web. If anything, it is full of confusion and the specification is poorly written compared to the HTML and XML specification.
        
        I don't know which mailing list you refer to, nor which books but the web is an excellent source of information for that matter. Take a look at links returned by google for RDF : here [xml.com], RDF homepage [w3.org] full spec, RDF primer [w3.org] for some graphs and there [wikipedia.org] or this [oreilly.com] e
        
        Re: poor explanation (Score:1)
        
        by smartdreamer ( 666870 ) writes:
        
        My explanation was poor. According to the RDF spec, RDF consists of two parts: model and graph. The model represents an object, like car, cat, boat, house, etc. The graph represents the relationship between the objects, like honda->car->vehicle. The graph is supposed to allow the system to "infer" facts which are not explicitly stated. In other words, an RDF engine would be able to infer a Honda is a type of vehicle.
        
        I see what you mean. Nonetheless, reasoning over hierarchies uses RDFS since hie
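The honda->car->vehicle example quoted in this sub-thread can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular RDF engine works: the triples are plain tuples, the predicate names `type` and `subClassOf` are simplified stand-ins for the RDF/RDFS vocabulary, and the `my_honda` identifier and both helper functions are made up for the example.

```python
# Each fact is a (subject, predicate, object) triple.
# Predicate names are simplified stand-ins for rdf:type and rdfs:subClassOf.
TRIPLES = {
    ("my_honda", "type", "honda"),      # hypothetical resource
    ("honda", "subClassOf", "car"),
    ("car", "subClassOf", "vehicle"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls by following subClassOf links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def infer_types(resource, triples):
    """Return the stated types of resource plus every inferred superclass."""
    stated = {o for s, p, o in triples if s == resource and p == "type"}
    inferred = set(stated)
    for cls in stated:
        inferred |= superclasses(cls, triples)
    return inferred


if __name__ == "__main__":
    print(sorted(infer_types("my_honda", TRIPLES)))
```

Nothing in the data says "my_honda is a vehicle"; that fact only appears after walking the subclass chain, which is the kind of inference the quoted post describes.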
My question (Score:5, Interesting)

by News for nerds ( 448130 ) writes: on Friday May 27, 2005 @10:55AM (#12655096) Homepage

Does it have a countermeasure against 'semantic spam'?

- Re:My question (Score:2, Interesting)
  
  by smartdreamer ( 666870 ) writes:
  
  There is no such thing as semantic spam. What you refer to is disinformation or information junk. Like the current web, the semantic web is about freedom, openness and accessibility. So, everybody can publish (I don't refer to government laws, repression, etc.). But the semantic web has a solution to this wave of information in a thing called the web of trust, which proposes giving trust rankings to information and introduces inference engines to compute which links/sites may interest you and why. But this is not for to
- Re:My question (Score:1)
  
  by RobMcCool ( 886341 ) writes:
  
  I replied to a lower-scored post asking this question: we haven't had this problem yet, but it's a problem that exists with any technique, whether it's Wikipedia, an automated technique like WebFountain, or the Semantic Web. It's an Internet problem.
  
  A followup to this post mentioned using a web of trust to counteract spam. That's something that Guha has done a lot of work with, and Paulo is working in the lab here on some prototypes based on movie data.
  Spam is a problem I would love to have beca
In Related News (Score:2)

by MrAnnoyanceToYou ( 654053 ) writes:

The average starting salary offer for Stanford graduate students has risen 30% in the last hour, as Microsoft, Google, and Yahoo each fought tooth and nail for their services.

(starts filling in application)
auto-complete (Score:2)

by derubergeek ( 594673 ) * writes:

One wouldn't think this would be particularly newsworthy here in supposed geek-haven, but Google has an auto-complete [google.com] feature as well.
Of course, it's a beta feature at Google Labs. FYI...
Slashdotting Google bomb? (Score:3, Interesting)

by bcmm ( 768152 ) writes: on Friday May 27, 2005 @11:28AM (#12655503)

That second link goes to http://www.google.com/url?sa=U&start=1&q=http://www.w3.org/2001/sw/&e=9707 [google.com]
How is that different to linking to http://www.w3.org/2001/sw/ [w3.org]?

Is Slashdot trying to improve someone's Google ranking?

(Also, did Slashdot always linkify URLs entered as plaintext? I didn't write any "a href" for those two.)

- Re:Slashdotting Google bomb? (Score:1)
  
  by RobMcCool ( 886341 ) writes:
  
  No, I just did a search for "semantic web" and copied and pasted the first result. I didn't realize they were sending people through google.com/url now; it used to just go straight there. When did they start doing that?
  - Re:Slashdotting Google bomb? (Score:2)
    
    by Stauf ( 85247 ) writes:
    
    They always did it, for a random number of links every few queries or so. It's so they can collect data on which sites people thought were relevant to their query. These links seem to have become more and more common though.
    - Re:Slashdotting Google bomb? (Score:2)
      
      by bcmm ( 768152 ) writes:
      
      That's right and proper and everything, because that's part of how they rank pages. Your explanation was nice, because I had been noticing both direct and monitored links and wondering what was going on.
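For anyone wanting to recover the destination from a tracked link like the one quoted at the top of this sub-thread, here is a rough Python sketch. It assumes the target is carried in the `q` query parameter, as it is in that one quoted URL (written here without the stray space); the `redirect_target` helper is a made-up name, and nothing here is a documented Google interface.

```python
from urllib.parse import parse_qs, urlparse


def redirect_target(url):
    """Return the value of the 'q' query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("q")
    return values[0] if values else None


if __name__ == "__main__":
    # The tracked link from the post, with the stray space removed.
    tracked = (
        "http://www.google.com/url?sa=U&start=1"
        "&q=http://www.w3.org/2001/sw/&e=9707"
    )
    print(redirect_target(tracked))
```

A direct link has no `q` parameter, so the helper returns None for it and the caller can fall back to the original URL.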
I'm still trying to figure out... (Score:3, Funny)

by rah1420 ( 234198 ) writes: <rah1420@gmail.com> on Friday May 27, 2005 @11:45AM (#12655708)

...not only what the Semantic Web is about, but more pragmatically why this is in "Hardware." :)

The semantic data is already there (Score:2)

by saddino ( 183491 ) writes:

Although I find the Semantic Web project intriguing, the idea of tagging data to define it is somewhat of a cop-out. The "meaning" of any given page is already there: in the page. Instead of spending so much time tagging pages, how about working on algorithms to derive meaning from the content. Surely those in the field of Computational Linguistics can make a real push at this: "artificial" corpora aren't needed anymore: the web offers more data than you'll ever need.

Shameless promotion: for OS X users,
- You missed the point! (Score:3, Insightful)
  
  by holygoat ( 564732 ) writes:
  
  The Semantic Web is about describing resources, not tagging pages.
  
  Indeed, you might output RDF from your processing of Web pages.
  
  Extracting information from semi-structured text is very different to making logical assertions about resources.
  - Re:You missed the point! (Score:2)
    
    by saddino ( 183491 ) writes:
    
    Yes, that is a valid point. However, considering the (IMHO) substantial barriers to widespread adoption (getting authors to provide semantic descriptions, dealing with SPAM or purposefully misleading descriptions, etc.), I still would like to see more effort in context analysis research. The AI field has been floundering for so long that a catchy phrase such as "Semantic Web" (which has been quite a successful meme) applied towards AI applications in contextual derivation could be helpful in moving things al
  - - Re:You missed the point! (Score:2)
      
      by holygoat ( 564732 ) writes:
      
      "Webservices have no need for semantic web"
      
      Ignoring your grammar, I would reply: tell that to the people trying to develop Web Services standards! Specifically, I'd point you to OWL-S, and its simpler, ad-hoc cousins.
      
      One of the most common uses of the Semantic Web at present is describing PEOPLE (FOAF, as used by LiveJournal and countless others). Do you not see that the Semantic Web goes beyond a Web of human-readable documents into a machine-understandable Web of data? You don't find pages on the S
      - Re:You missed the point! (Score:2)
        
        by holygoat ( 564732 ) writes:
        
        What do you think RSS 1.0 is?
    - Re:You missed the point! (Score:1)
      
      by smartdreamer ( 666870 ) writes:
      
      The most prevalent form of "resource" is a page.
      You live in the past. ;)
      You are referring to URLs, but the semantic web stands on URIs, which are a superset of URLs. If you limit yourself to URLs we had better not talk about the semantic web, because it transcends the current view of resources. This concept is very significant.
      Just to be a little clearer, let's take a simple example. You want to refer to yourself. How can you do this? You can't download yourself on the net (can you? ;). What you can have is a homepage : your page
      - Re:You missed the point! (Score:1)
        
        by smartdreamer ( 666870 ) writes:
        
        No, I'm not confusing URI with URL. The problem is that the world at large uses them interchangeably. URI is abused all over the place. That's human nature. Telling the world "you're mis-understand how to URI" doesn't solve the problem that Semantic Web defines URI as authoritative source. Talk about building a house on a beach with a hurricane coming.
        
        It is normal that people use them interchangeably: the current web uses URLs exclusively, so URI is like a new concept for most. As a matter of fact, they coul
- Re:The semantic data is already there (Score:1)
  
  by RobMcCool ( 886341 ) writes:
  
  An automated technique that could do better than a human tagger would have an additional feature of being able to pass the Turing Test.
  
  I admire your faith in automated techniques, since the ones I've seen have a catastrophic error rate and can't provide particularly rich data. The state of the art there is constantly improving, though, and there's no reason why such algorithms can't generate RDF anyway. The Semantic Web is about file formats and conventions, it doesn't necessarily mean human tagging.
  For
Gathering Metadata from Apple's Filesystem? (Score:1)

by Oori ( 827315 ) writes:

Seems to me there's a useful metadata resource available now due to the way that OSX-Tiger is now allowing metadata to be attached to a file (either as xattribs, or via the Spotlight keyword field). See here. [slashdot.org]
Does anyone know if web crawlers/gatherers (Google, Harvest, Combine, etc.) have the ability to access that information and associate it with the file?
I would love an automatic gatherer extracting my metadata from the filesystem and allowing searches on it, in combination with the full text option.
What!? (Score:1, Offtopic)

by HishamMuhammad ( 553916 ) writes:

In the examples page [stanford.edu], PLO and Al Fatah are listed under "Terror Organizations". This is a horrible misrepresentation.

The PLO is the organization representing the Palestinian people that eventually evolved into the Palestinian Authority. It had observer status in the UN General Assembly and even special permission to participate on Security Council debates (sans voting rights). Al Fatah is a political party which was involved in guerilla activities in the 70s, but that has, since the Oslo Accords, accepte
- - Re:What!? (Score:1)
    
    by smartdreamer ( 666870 ) writes:
    
    categories, as we know, are not conceptualized by necessary and sufficiency relations, but by family resemblance.
    
    Hey, that sounds like a logical (ontology building) reasoning. This thread is not offtopic. :)
  - Re:What!? (Score:2)
    
    by HishamMuhammad ( 553916 ) writes:
    
    Apparently, you missed the point that I was not talking about Al Aqsa Martyr Brigades.
Piggy Bank (by MIT) (Score:1)

by panck ( 69848 ) writes:

I haven't RTFA yet, but I wanted to link to Piggy Bank [mit.edu], which is a Firefox plugin by the Simile MIT group, which seems to be making a large step forward in bringing the usefulness of the semantic web to the users.

It contains an RDF engine, and allows you to install "screen scrapers" for different sites, plus it knows automatically how to read FOAF and some other ontologies that have spread on the net a little bit. When you see the "Semantic web coin" icon in your status bar, you can click on it and it will
How is this different from HTML? (Score:2, Insightful)

by klatty ( 871061 ) writes:

The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.

Isn't this basically what HTML is supposed to do kind of?
- Re:How is this different from HTML? (Score:1)
  
  by TheMESMERIC ( 766636 ) writes:
  Not really.
  
  RDF does not display pictures
  
  RDF does not contain self-styling
  
  RDF does not contain scripting
  
  RDF conforms to XML, HTML doesn't (XHTML does)
  
  The semantics of RDF is highly different from the (much ignored) semantics of HTML.
  
  HTML is a "hyper-textual" document with images, objects and links.
  RDF's prime purpose is for organizing resources and creating catalogues.
metacrap (Score:2)

by braddock ( 78796 ) writes:

Maybe the Semantic Web can work someday, maybe not.

However, anyone who thinks this is a utopia in the making should read the infamous MetaCrap essay by Cory Doctorow:

Metacrap: Putting the torch to seven straw-men of the meta-utopia. [well.com]

After you are done reading, go to eBay and pick yourself up a cheap Palm Pilot. :)

1. Introduction
2. The problems
2.1 People lie
2.2 People are lazy
2.3 People are stupid
2.4 Mission: Impossible -- know thyself
2.5 Schemas aren't neutral
2.6 Metrics influence results
2.7 There's more th
Who'd want to see that? (Score:1)

by imthesponge ( 621107 ) writes:

'type "tom hanks birth" slowly to see it in action' hmm
meaning (Score:1)

by tute666 ( 688551 ) writes:

I wouldn't like anybody, especially a machine, telling me what I mean. The possibility of censorship is enormous.
- Re:best with firefox (Score:5, Insightful)
  
  by Timesprout ( 579035 ) writes: on Friday May 27, 2005 @10:30AM (#12654862)
  
  No, 'works best with Firefox' is just as bad as 'works best with IE'. What would be nice would be to see 'works best with any standards compliant browser'.
  
  - Re:best with firefox (Score:1)
    
    by cloudreader ( 801693 ) writes:
    
    You have a good point here. But in my experience "works best with Firefox" is equivalent to "it follows standards".
  - standards-compliant means (Score:2)
    
    by jbellis ( 142590 ) writes:
    
    "everything but IE"
    
    not entirely, but pretty close -- if you write compliant html/js, it has an excellent chance of working in all of {firefox, opera, safari}
  - Re:best with firefox (Score:2)
    
    by FidelCatsro ( 861135 ) writes:
    
    "works worst on MSIE" that would do
  - Re:best with firefox (Score:1)
    
    by RobMcCool ( 886341 ) writes:
    
    Phrasing it that way, that it works best with any standards compliant browser, doesn't get the point across to those who think IE is a standards compliant browser.
    
    Search on TAP has been tested with Firefox on Linux, Windows, and OS/X, and with IE on Windows. I think Andy might have tried it with Safari. I haven't tested it with Opera. With IE, I had to redo how the dynamic HTML was being generated twice to get around its limitations, and it's still ignoring my alignment tags.
    Saying it works with standar
  - Re:best with firefox (Score:1)
    
    by FlynnMP3 ( 33498 ) writes:
    
    It would be great if people would say their website works with any compliant browser. But much of the world doesn't care. In my opinion that's because standards doesn't carry connotations with anybody besides web/standards geeks.
    
    Now the cute little Firefox plush toy (have you seen it?) - that's what people will remember. As long as you keep the FF designers on the straight and narrow with regards to implementing web standards, then everybody gets what they want.
    
    Course, some will argue that Firefox isn'
- Re:best with firefox (Score:2)
  
  by bcmm ( 768152 ) writes:
  
  Well, maybe Fx can do whatever they need fastest? Maybe they use pipelining or something?
- Re:And the big deal is??? (Score:2)
  
  by Jane_Dozey ( 759010 ) writes:
  
  One word: Context.
  
  Currently keywords are used to search for relevant matches and yes, this seems to work ok for lots of things but imagine if you could add context:
  
  Imagine searching for the title of a piece of music that you heard in a certain film.
  Currently this could involve some digging, but a semantic search engine could very quickly narrow this search. Have a look at this [mspace.fm] (there's a demo somewhere on the site). It's a research project run by Southampton Uni. It's pretty basic but hopefully you'll g
- Re:RDF, RDFS, DAML, OWL?? (Score:1)
  
  by Reverend528 ( 585549 ) writes:
  
  RDFS and OWL are both RDF formats.
- Re:Backwards (Score:1)
  
  by smartdreamer ( 666870 ) writes:
  
  We should be working on being able to use even more unstructured data on the web.
  
  I don't see how it can be more unstructured than it already is! You want to get rid of HTML? You can! Maybe we could forget about protocols and standards? We should all have our own language, now that would be unstructured!
  I really don't know what you mean by this nor how can it be good.
  
  P.S.: Everyone has to "confirm you're not a script", even logged-in users.
- Re:Full disclosure (Score:2)
  
  by wan-fu ( 746576 ) writes:
  
  Actually, I saw a preso for this project a while ago. It was pretty neat, showed a lot of promise, and I see that it's been progressing nicely. Stanford KSL actually DOES like the Semantic Web. Sure, they receive DARPA funding, but that's not why they like the Semantic Web. Also, some of the features/scrapers have been built as requested by the gov't, but it's not like the entire project is for the gov't.
