Using the Semantic Web to Enhance Search
Posted by Zonk on Fri May 27, 2005 09:21 AM
from the does-whatever-a-spider-can dept.
RobMcCool writes "At Stanford KSL, we really like the Semantic Web. So we've taken many of our favorite web sites, scraped them, and put together a huge pile of RDF, which we'll let you download. We've used that RDF to create a search application, in the spirit of Google Q & A or Microsoft's recently announced MSN Search extensions. Our search can answer simple factual queries like the previously discussed population of Portugal, but can also answer some more complex ones. We also have a smart autocomplete system: type "tom hanks birth" slowly to see it in action (best with Firefox). We're looking for people to be a part of this search system by running their own search sites, and by putting their data on the Semantic Web. Come check it out!"
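To make the idea of answering a factual query from RDF concrete, here is a minimal sketch. It is not the Stanford system: the `ex:` identifiers, the `match` helper, and the tiny in-memory list are all assumptions made up for illustration. The point is only that a query such as "tom hanks birth" can reduce to matching a pattern against subject/predicate/object triples.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical triples; the "ex:" names are illustrative, not a real vocabulary.
TRIPLES: List[Triple] = [
    ("ex:Tom_Hanks", "rdf:type", "ex:Actor"),
    ("ex:Tom_Hanks", "ex:birthDate", "1956-07-09"),
    ("ex:Tom_Hanks", "ex:birthPlace", "ex:Concord_California"),
]


def match(s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return every triple that agrees with the given fields; None is a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "tom hanks birth" -> all birth-related facts about one subject
birth_facts = [t for t in match(s="ex:Tom_Hanks") if t[1].startswith("ex:birth")]
```

A real deployment would use a proper RDF store and query language; the list comprehension here only shows the shape of the lookup.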
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Google watch out... (Score:5, Insightful)
This is definitely one to watch...
Re:Google watch out... (Score:3, Interesting)
This statement is why I was wondering why this was considered such a wonderful thing. For a while now, there's been a research project at IBM called WebFountain [ieee.org] that not only does everything that the Semantic Web attempts to do, but doesn't require any special markup either. Its goal is to work with completely unstructured data of any type, including web pages, PowerPoint documents, Word docs, PDFs, etc.
Re:Google watch out... (Score:2)
In the context of GIS data, where metadata can be incredibly useful, creation of metadata is like pulling teeth.
Unfortunately, until and unless there are automated tools (your "intelligent mining tools"), this whole thing will never be more than a curiosity...
Re:Google watch out... (Score:3, Informative)
Re:Bashers watch out... (Score:2, Insightful)
if it would mean that their sites would rank higher in the search results, I'd say that they all would...
From the check it out link... (Score:2, Funny)
That's nice and all, but who shot first, and is there a mashup of both scenes with crazy alien bar music mixed with '20s sinister piano?
autocomplete (Score:5, Insightful)
Or, even better, never have any autocomplete turned on automatically. Do a VB-like idea, where if you want to see possibilities at a certain point, hit a specific key that will register for the list to pop down.
Re:autocomplete (Score:3, Insightful)
Re:autocomplete (Score:2)
That's just usability; the concept is sound. Instead of filling in results with "a", fill them in on three letters like "ast", which could match asterisk, astronaut, etc. The idea is to 1) save time by not making them type in an extra 6 letters and 2) cut down on misspellings.
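A rough sketch of that three-letter threshold, for anyone who wants to play with it. Everything here is an assumption for illustration: the `suggest` function, the `min_chars` and `limit` parameters, and the word list are made up, and this says nothing about how the Stanford autocomplete works. It just does a binary search into a sorted vocabulary and stays silent until enough letters are typed.

```python
from bisect import bisect_left
from typing import List


def suggest(prefix: str, vocabulary_sorted: List[str],
            min_chars: int = 3, limit: int = 5) -> List[str]:
    """Return up to `limit` words starting with `prefix`.

    Returns nothing until at least `min_chars` characters are typed.
    `vocabulary_sorted` must be lowercase and sorted.
    """
    prefix = prefix.lower()
    if len(prefix) < min_chars:
        return []
    # Binary search for the first word that is >= the prefix.
    i = bisect_left(vocabulary_sorted, prefix)
    out: List[str] = []
    while i < len(vocabulary_sorted) and len(out) < limit:
        word = vocabulary_sorted[i]
        if not word.startswith(prefix):
            break  # sorted order: no later word can match either
        out.append(word)
        i += 1
    return out


# Hypothetical vocabulary for the example
WORDS = sorted(["asterisk", "astronaut", "astronomy", "apple", "assert"])
```

With this list, `suggest("a", WORDS)` returns an empty list, while `suggest("ast", WORDS)` returns the three "ast" words.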
Useless? I don't think so. (Score:2)
In any case, for Japanese/Chinese/Korean - autocomplete is almost a natural part of using a web search engine, so it's not a "useless feature that nobody wants to see."
Those languages use alphabet-based inputs which are then converted into native text. Why bother converting if you can take the direct alphabetical input and start showing native text autocompletes?
Semantic Web? (Score:5, Informative)
The Stanford research is interesting, but I'm still trying to make up my mind about the Semantic Web, learning about RDF, and whether I need to bake in ways of handling these kinds of assertions in my web app. The Stanford group writes, "Our hope is that our search application spurs development of the Semantic Web, and leads to sites publishing their data in this format so that we don't have to." It obviously takes more work to encode such information and to get user contributions auto-marked for the Semantic Web. For a counter viewpoint, take a look at some of Clay Shirky's work -- in particular:
Will the semantic web be supported by future versions of Drupal, phpBB, and other grass-roots content management web apps? Not sure. Since a lot of the content is visitor generated, you would have to build in ways of providing easy markup. Would be interested to hear /. thoughts on the matter.
Re:Semantic Web? (Score:2)
For example, a Slashdot story about a newly discovered type of <crab type="crustacean"/> would soon degenerate into postings about <crab type="venereal disease"/>. Marking quickie (pun intended) posts up semantically would detract f
Re:Semantic Web? (Score:2)
Since it's obvious that you do understand, would it be possible for you to come up with a 1-2 paragraph explanation of what the Semantic Web is and does?
I've spent some time on the linked-to web site, and read Clay Shirky's essay, and I'm still not sure what it
Re:Semantic Web? (Score:2)
Many thanks.
D
slashdotted (Score:3, Funny)
Semantic Web Pitfalls (Score:4, Insightful)
I mean, think about it this way - while laziness or inertia might initially win out, once someone's competitors start to explore the idea of the semantic web, interest will start to be shown in it, especially once it becomes profitable to do so.
Re:Semantic Web Pitfalls (Score:2)
Well, part of Shirky's point is that it is so lacking in usefulness that there will be no advantage to anybody in displaying their content that way. I think he's right. I've watched AI based on these kinds of logical rules and semantics stumble along for years without producing anything useful, and then along comes some program that takes little pieces of what other people said and 'mindlessly' strings them together in new ways and it wins a Turing contest.
Logical reasoning of this kind, despite all the hyp
Re:Semantic Web Pitfalls (Score:2)
Re:Semantic Web Pitfalls (Score:2)
But there remains the problem that this technique does not find semantic connections that the authors don't know about.
This won't work (Score:2, Interesting)
Secondly, scraping doesn't always work and you will surely have low-grade porno and get-rich-quick schemes/scams littering your semantic data.
But let us suppose that the main benefits of a semantic web are (A) access to reference data [which may be falsified, oops], and (B) access to product availability data [which may be falsified, oops, like mail order companies that pret
A tale of two technologies.... (Score:3, Interesting)
It is refreshing to see exciting new solutions to the problems we have at present of targeted information retrieval on the internet. I can remember years of stagnation in this field (read: early 90's), and any change from today's google-and-pray searching mentality among the majority of end-users will be welcome.
One more step... (Score:2)
awesome! (Score:3, Funny)
Might actually help (Score:4, Insightful)
1. For really popular subjects, the useful links are swamped in the noise of sites trying to make a buck off of getting you to look at their ads before directing you to somewhere else, that might have the actual content or might not.
2. For many less popular subjects, there is some oddity, like an unusual term being borrowed by some other field, so that it is something most people have never heard of, but people in two or more specialties use it frequently, in very different ways, resulting in strangeness. (E.g. the search engine throws up 23,003 links for a search on "Sartor Resartus". 30% are esoteric literary criticism, 20% relate to apoptosis (cell biology), 20% relate to building moral inhibitions into A.I., 10% to Keith Laumer novels, and the rest are probably noise.)
(I'm sure there are more than these two limits. Someone else may want to comment on some others).
This is likely to help with the second case, oddities in the data set grouping. (It could sort links into the larger sub-categories, ask the user which one(s) seemed most applicable, and maybe even sort out a small set of links that explain, for the previous example, how a highbrow literary term got borrowed by the other fields.)
It's not as likely it would help with the first case, though, as sites that don't have actual content are actively duplicitous. Something that is actively trying to fool humans is still likely to be very successful at fooling our tools.
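The grouping idea for the second case can be sketched in a few lines. This assumes each result already carries a topic label (which is exactly the part semantic markup would have to supply); the `group_by_topic` function, the URLs, and the labels below are hypothetical and only illustrate sorting links into sub-categories, largest first, so the user can be asked which one applies.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_topic(results: List[Tuple[str, str]]) -> List[Tuple[str, List[str]]]:
    """Group (url, topic) pairs by topic, largest group first.

    Ties are broken by topic name so the ordering is deterministic.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for url, topic in results:
        groups[topic].append(url)
    return sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))


# Hypothetical search results with made-up URLs and topic labels
RESULTS = [
    ("http://example.org/lit-1", "literary criticism"),
    ("http://example.org/bio-1", "cell biology"),
    ("http://example.org/lit-2", "literary criticism"),
    ("http://example.org/ai-1", "A.I."),
]

grouped = group_by_topic(RESULTS)
```

A front end could then show the topic names from `grouped` as a short menu before listing any links.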
My question (Score:5, Interesting)
Re:My question (Score:2, Interesting)
In Related News (Score:2)
(starts filling in application)
auto-complete (Score:2)
Of course, it's a beta feature at Google Labs. FYI...
Slashdotting Google bomb? (Score:3, Interesting)
How is that different to linking to http://www.w3.org/2001/sw/ [w3.org]?
Is Slashdot trying to improve someone's Google ranking?
(Also, did Slashdot always linkify URLs entered as plaintext? I didn't write any "a href" for those two.)
Re:Slashdotting Google bomb? (Score:2)
They always did it, for a random number of links every few queries or so. It's so they can collect data on which sites people thought were relevant to their query. These links seem to have become more and more common though.
Re:Slashdotting Google bomb? (Score:2)
I'm still trying to figure out... (Score:3, Funny)
The semantic data is already there (Score:2)
Shameless promotion: for OS X users,
You missed the point! (Score:3, Insightful)
Indeed, you might output RDF from your processing of Web pages.
Extracting information from semi-structured text is very different to making logical assertions about resources.
Re:You missed the point! (Score:2)
Re:You missed the point! (Score:2)
Ignoring your grammar, I would reply: tell that to the people trying to develop Web Services standards! Specifically, I'd point you to OWL-S, and its simpler, ad-hoc cousins.
One of the most common uses of the Semantic Web at present is describing PEOPLE (FOAF, as used by LiveJournal and countless others). Do you not see that the Semantic Web goes beyond a Web of human-readable documents into a machine-understandable Web of data? You don't find pages on the S
Re:You missed the point! (Score:2)
How is this different from HTML? (Score:2, Insightful)
Isn't this basically what HTML is supposed to do, kind of?
metacrap (Score:2)
However, anyone who thinks this is a utopia in the making should read the infamous MetaCrap essay by Cory Doctorow:
Metacrap: Putting the torch to seven straw-men of the meta-utopia. [well.com]
After you are done reading, go to eBay and pick yourself up a cheap Palm Pilot.
1. Introduction
2. The problems
2.1 People lie
2.2 People are lazy
2.3 People are stupid
2.4 Mission: Impossible -- know thyself
2.5 Schemas aren't neutral
2.6 Metrics influence results
2.7 There's more th
Re:best with firefox (Score:5, Insightful)
standards-compliant means (Score:2)
not entirely, but pretty close -- if you write compliant html/js, it has an excellent chance of working in all of {firefox, opera, safari}
Re:best with firefox (Score:2)
Re:best with firefox (Score:2)
Re:Semantic Horse shit (Score:2)
Fine with me. I don't want their information. In fact I'd like to get rid of their information (banner ads and spam).
If I want to deal with businesses, I go to my local shop. If I can't find what I want there, I look up the yellow pages of my local phonebook. If I can't find what I want there, I loo
Re:Semantic Horse shit (Score:2)
Rete scales really well as you add rules but scales really poorly with the number of items in working memory.
I believe that Rete would be a bad choice for the SW, where you would have a very large data set in working memory.
(I used to do a lot of Rete hacking: commercial expert system tools for Xerox Lisp Machines and the Mac, and hacking OPS5 to support 'multiple data worlds' for in-house use.)
Re:Semantic Horse shit (Score:2, Insightful)
Re:Semantic Horse shit (Score:2)
Re:And the big deal is??? (Score:2)
Currently keywords are used to search for relevant matches and yes, this seems to work OK for lots of things, but imagine if you could add context:
Imagine searching for the title of a piece of music that you heard in a certain film.
Currently this could involve some digging, but a semantic search engine could very quickly narrow this search. Have a look at this [mspace.fm] (there's a demo somewhere on the site). It's a research project run by Southampton Uni. It's pretty basic but hopefully you'll g
Re:Full disclosure (Score:2)
Re:What!? (Score:2)