The Need For A Tagging Standard
John Carmichael writes "Tags are everywhere now. Not just blogs, but famous news sites, corporate press bulletins, forums, and even Slashdot. That's why it's such a shame that they're rendered almost entirely useless by the lack of a tagging standard that would let sites and tag aggregators like Technorati and Del.icio.us compare and relate tags to one another.
Depending on where you go and who you ask, tags are implemented differently, and even defined in their own unique way. Even more importantly, tags were meant to be universal and compatible: a medium of sharing and conveying info across the blogosphere, the very embodiment of a semantic web. Unfortunately, they're not. Far from it: tags create more discord and confusion than they resolve.
I have to say, it would be nice to just learn one way of tagging content and use it everywhere."
Re:Automatic tagging (Score:4, Interesting)
Also, anyone trying to make a serious argument containing the word "blogosphere" should really try and get out more. Come on people, it's not world hunger we're solving here. Viz: http://coolestshop.com/headline-blog.html [coolestshop.com]
XSLT for Tags? (Score:3, Interesting)
Re:Don't agree (Score:5, Interesting)
Tags are keywords.
There's a keyword line up in the header that isn't being used for much these days.
If you want to tag your document in a machine-readable way, put the tags in the keyword field. Problem solved.
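To illustrate the keyword-field idea, here is a minimal sketch of how an aggregator might read tags back out of a page header. It assumes the tags sit in a comma-separated `<meta name="keywords">` element; the `KeywordExtractor` class and `extract_keywords` function are names made up for this example, not part of any existing tool.

```python
from html.parser import HTMLParser


class KeywordExtractor(HTMLParser):
    """Collect tags from <meta name="keywords" content="..."> elements."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            # Assumption: keywords are comma-separated, as is conventional.
            self.keywords.extend(
                k.strip() for k in content.split(",") if k.strip()
            )


def extract_keywords(html):
    """Return the list of keywords found in the given HTML text."""
    parser = KeywordExtractor()
    parser.feed(html)
    return parser.keywords


page = (
    '<html><head>'
    '<meta name="keywords" content="dogs, border collies, barking">'
    '</head></html>'
)
print(extract_keywords(page))  # ['dogs', 'border collies', 'barking']
```

This only shows that the field is machine-readable; it says nothing about whether search engines or aggregators would trust it.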
No.. and yes (Score:2, Interesting)
First of all, tags are not exclusive to the blogosphere - they exist on the boardscape (see boardtracker [boardtracker.com] for example) and of course on the many social nets and pretty much everywhere else.
There are already microformats [wikipedia.org] for defining tags which can and should be used.
Tags are for building a folksonomy [wikipedia.org] and are created 'by the people', so they are by their nature, to a certain extent, personalized and flexible... what makes sense to you may make no sense to everyone else, but so what? You made it, it's good for you and that's good enough... however, chances are it will make sense to some other people anyway, no matter what or how you tag, so it's all good.
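As a concrete case of the microformats mentioned above: the rel-tag microformat marks a link with rel="tag" and treats the last path segment of its href as the tag. The sketch below pulls tags out of such links. The `RelTagExtractor` and `extract_rel_tags` names and the sample URLs are made up for illustration, and tolerating a trailing slash is a leniency of this sketch rather than something taken from the spec.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class RelTagExtractor(HTMLParser):
    """Collect tags from <a rel="tag" href=".../tagname"> links."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        href = attr.get("href") or ""
        if "tag" not in rels or not href:
            return
        # The tag is the last path segment of the link target.
        # Assumption: a trailing slash is ignored.
        path = urlparse(href).path.rstrip("/")
        last = path.rsplit("/", 1)[-1]
        if last:
            self.tags.append(unquote(last))


def extract_rel_tags(html):
    """Return the tags named by rel="tag" links in the given HTML text."""
    parser = RelTagExtractor()
    parser.feed(html)
    return parser.tags


sample = (
    '<a href="http://technorati.com/tag/tech" rel="tag">technology</a> '
    '<a href="https://example.com/tags/border%20collies/" rel="tag">dogs</a> '
    '<a href="https://example.com/about">about</a>'
)
print(extract_rel_tags(sample))  # ['tech', 'border collies']
```

Note that the tag comes from the URL, not from the link text, which is why the first link yields "tech" rather than "technology".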
Re:Don't agree (Score:2, Interesting)
I agree with you, and would add:
World's best tagging system (Score:2, Interesting)
A similar system in law is the Westlaw key word system. The New York Times used to have a great keyword index, but I can't find it in the NYT online.
Re:Automatic tagging (Score:4, Interesting)
The automatically generated tags are exactly what I was talking about. I didn't get terribly explicit with my ideas, but you seem to be going in the same direction I was. Getting the software to both tag incoming documents and categorize the semantic webs generated by each is the key to some 'universal' tagging system. This way we have maximally efficient tags along with a standardized definition for each and (perhaps most importantly) an automatic way of tagging all the documents to be processed. No room for the "13 year old cheerleader tags" as someone so eloquently put before.
We still have the problem of naming the 'generic' tag categories generated by the software... The solution for that one is a lot hazier, though important. I don't think anyone will go looking for 'category 12233242' to find 'academic humor'.
the solution... (Score:2, Interesting)
Basically, people are too dumb/lazy/stupid to read a one-line description of how to format their tags. How confusing can it be? You just show people how to do it in the form, e.g.
Tags [ ] (eg dogs, "border collies", barking)
or
Tags [ ] (eg dogs,border_collies,barking)
or
Tags [ ] (eg dogs,borderCollies,barking)
Now, do we need a standard, OR do we need people to be able to read instructions? Note that one of these choices is a specific, set-in-stone piece of information, the other is a general piece of advice that people would do well to follow for most of their lives (although being able to read instructions is no guarantee that following them is a good idea).
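For sites that would rather accept all three of the input styles shown above than argue about them, a small normalizer is one option. This is only a sketch: `normalize_tags` is a hypothetical helper, and the choice of lowercase, space-separated words as the canonical form is an assumption, not a standard.

```python
import csv
import re


def normalize_tags(text):
    """Turn one line of tag-field input into a list of canonical tags.

    Accepts quoted phrases ("border collies"), underscores
    (border_collies) and camelCase (borderCollies).
    """
    # csv handles the commas and the double-quoted phrases.
    fields = next(csv.reader([text], skipinitialspace=True), [])
    tags = []
    for field in fields:
        tag = field.replace("_", " ")
        # Split camelCase at each lower-to-upper boundary.
        tag = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", tag)
        # Assumed canonical form: lowercase, single spaces.
        tag = " ".join(tag.lower().split())
        if tag and tag not in tags:
            tags.append(tag)
    return tags


for line in (
    'dogs, "border collies", barking',
    "dogs,border_collies,barking",
    "dogs,borderCollies,barking",
):
    print(normalize_tags(line))  # ['dogs', 'border collies', 'barking'] each time
```

The camelCase rule will also split tags such as "iPod" into "i pod", so a real form would still want the one-line instructions.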
Old problem, and you're not going to solve it. (Score:1, Interesting)
Here is a book. Where do you put the book in the library, and how do
you classify it so as to make it maximally useful for your [*] patrons?
[*] That's important! Your patrons are distinct from mine!
You can order all the books in the collection
by accession date (when you got the book).
You can order all the books by author's last name.
You can order all the books by title.
You can order all the books by subject. If you do this, you can use
Dewey classification, LC classification, or something else.
Suppose you just stick with LC classification.
Even two libraries that have the same book, and both use LC
classification, say, may classify a book differently. Say you
have an AI book that is *the* seminal text covering how to do
clustering via fuzzy logic. Do you put this with all the AI books?
with all the clustering books? with all the fuzzy logic books?
(and all three sets of books may be in different places.)
Tagging content on the web represents a similar situation. If you
use a 'standard term' to tag a text, different sets of users /
customers / readers may not associate that 'standard term' with
the meaning you intended. A given term or phrase can be 'classified'
(library science term) or placed into different categories of meaning,
depending on context.
I think that the original poster's statement ("Tagging was intended to
be universal and standardized") either shows great naivete or hubris
on the part of the unstated "intenders". Context is the key. Any
one and his dog can come up with a standardized tagging scheme, but
users of it will nonetheless adopt it (or not) based on the scheme's
ability to classify information in a way that works for the adopter.
What prospective adopters want, however, is not a straitjacket that
forces them to classify web pages in a way that the adopter's users
won't understand and won't use.
---a former AI researcher
Re:Automatic tagging (Score:2, Interesting)
This is just a problem I've worked on for a few years and have always had a small fascination with; I'm glad to share it (both in the mundane and fantastic applications).
there is a standard (Score:5, Interesting)
There is a standard, but nobody uses it these days. Even the search engines disavow it now.
Re:Don't agree (Score:3, Interesting)
This is similar to the problem blogging sites have with cross-site scripting. Try to tell a blogger you won't take HTML or bbcode posts (depending on generation of the blogger). Regardless of what you do, there are going to be sites that don't follow the rules, and there will also be ways to screw it up for everybody.
There isn't a standard for many things on the internet, which makes validation nearly impossible. Security researchers complain people don't do input validation, but I've never seen a complete webapp that's an example of security at the time it's written. You can't validate things like addresses, names, e-mail addresses, or long text entries including blogs and content without leaving out characters that should be valid or flat out blocking some people from using your service.
As for the standard, this reminds me of RDF. Had it taken off, we could tag data with properly defined, shared tags. Defining tags in RDF would allow sites to share the information through RDF and thus solve the problem of transmittal. Of course, getting everyone to agree to this is another thing.
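A rough sketch of what shared, RDF-style tags could look like: each tag becomes a URI, and each tagging becomes a subject-predicate-object triple written in N-Triples syntax. Everything under example.org here (the `taggedWith` predicate and the tag namespace) is hypothetical, as are the function names; no real vocabulary is implied.

```python
from urllib.parse import quote

# Hypothetical shared vocabulary, for illustration only.
TAGGED_WITH = "http://example.org/vocab#taggedWith"
TAG_BASE = "http://example.org/tags/"


def tag_uri(label):
    """Map a site-local tag label to a shared tag URI."""
    slug = "-".join(label.lower().split())
    return TAG_BASE + quote(slug)


def to_ntriples(page_url, labels):
    """Return one N-Triples line per tag on the page."""
    return [
        f"<{page_url}> <{TAGGED_WITH}> <{tag_uri(label)}> ."
        for label in labels
    ]


# Two sites spell the tag differently but end up at the same URI.
for line in to_ntriples("http://a.example/post/1", ["Border Collies"]):
    print(line)
for line in to_ntriples("http://b.example/entry/9", ["border  collies"]):
    print(line)
```

The hard part the comment points at, getting everyone to agree on the namespace and predicate, is exactly what the two constants at the top paper over.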
Re:The other option (Score:3, Interesting)
Technorati has a standard... (Score:3, Interesting)
We already have a book for tagging, it's called... (Score:2, Interesting)
A dictionary.
There are people who live and die by tagging their information. They build folders and create lists.
There are people who just go through life serendipitously. They never use the laundry hamper and most people call them slobs.
Between these two groups are the rest of humanity. Sometimes they make lists and sometimes they don't. And just because the word "librarian" strikes a fear of boredom, most people ignore library science. The science of tagging, if it is to be used as a global panacea, must be studied seriously to be feasible and usable over generations.
Re:Hyphens. (Score:1, Interesting)
Look into topic maps (Score:2, Interesting)
I've started to run into this problem myself from using del.icio.us as my primary bookmark source. One of my current issues is not what tags other people are using, but what tags I am using. Currently I have a lot of overlapping tags. I did some cleanup lately so that 'photos' and 'photo' are in a single tag, etc.
I started to look around and found there have been a lot of standardizations of topic maps, although they are intended more for very large systems (think government-sized systems categorizing millions of documents). The UK government has a topic system called the e-Government Metadata Standard (e-GMS) [govtalk.gov.uk]. The schema is browsable online [esd.org.uk]. Another good article is The TAO of Topic Maps [ontopia.net] (also in pdf [coverpages.org]).
I think there should be a basic standard to avoid situations like the photo/photos tags above. But I think that should be as far as it goes. The good thing about tagging on most sites is you are not limited. The bad thing about tagging on most sites is you are not limited.
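The photo/photos cleanup described above can be partly automated. Here is a minimal sketch that groups tags differing only by case or a trailing "s"; the function names are made up, and the plural rule is deliberately naive (it would wrongly fold "news" into "new"), so the output is a list of merge candidates to review, not something to apply blindly.

```python
from collections import defaultdict


def singular_key(tag):
    """Naive grouping key: lowercase, with one trailing 's' dropped."""
    t = tag.strip().lower()
    # Assumption: short tags and '-ss' endings (e.g. 'css') are not plurals.
    if len(t) > 3 and t.endswith("s") and not t.endswith("ss"):
        return t[:-1]
    return t


def group_overlapping(tags):
    """Return {key: [tags]} for every key shared by more than one tag."""
    groups = defaultdict(list)
    for tag in tags:
        groups[singular_key(tag)].append(tag)
    return {key: found for key, found in groups.items() if len(found) > 1}


bookmarks = ["photo", "photos", "linux", "css", "Photos"]
print(group_overlapping(bookmarks))  # {'photo': ['photo', 'photos', 'Photos']}
```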