The Internet IT

The Need For A Tagging Standard 200

John Carmichael writes "Tags are everywhere now. Not just blogs, but famous news sites, corporate press bulletins, forums, and even Slashdot. That's why it's such a shame that they're rendered almost entirely useless by the lack of a tagging standard with which various sites and tag aggregators like Technorati and Del.icio.us can compare and relate tags to one another. Depending on where you go and who you ask, tags are implemented differently, and even defined in their own unique way. Even more importantly, tags were meant to be universal and compatible: a medium of sharing and conveying info across the blogosphere, the very embodiment of a semantic web. Unfortunately, they're not. Far from it: tags create more discord and confusion than they resolve. I have to say, it would be nice to just learn one way of tagging content and use it everywhere."
This discussion has been archived. No new comments can be posted.

  • Re:Automatic tagging (Score:4, Interesting)

    by remmelt ( 837671 ) on Monday January 15, 2007 @10:36AM (#17613184) Homepage
    Tags are probably very community based, so they would only make sense within that community. (!itsatrap wouldn't work so well on iloveponies.co.ae). That said, why make tags which are meaningless to other communities or have vastly different meanings to other people available as a sorting or searching option? Sure, you could make some pretty mean stats proving any point you'd like (bad grammar in tags up 14.8% from last year! tag "yes" used in 87% of all blogs, world population feeling positive!) but I don't see the point.

    Also, anyone trying to make a serious argument containing the word "blogosphere" should really try and get out more. Come on people, it's not world hunger we're solving here. Viz: http://coolestshop.com/headline-blog.html [coolestshop.com]
  • XSLT for Tags? (Score:3, Interesting)

    by null etc. ( 524767 ) on Monday January 15, 2007 @10:36AM (#17613194)
    Similar to how XML uses XSLT to transform XML documents from one application to another, it wouldn't be a half-bad idea to have a Tag Transformation Language. Organizations with a lot of market share can define their own tag standards, and then people can optionally specify the transformation between their own local ontologies and the established tag standards. This has the advantage of being participation-driven.
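A minimal sketch of the transformation idea in the comment above, in Python. Nothing here is an existing standard: the mapping table, the tag names in it, and the function `transform_tags` are all hypothetical, and a real "Tag Transformation Language" would need far more than a lookup table.

```python
# Hypothetical mapping from one site's local tags to a shared vocabulary.
# A value of None means the local tag has no shared equivalent and is dropped.
LOCAL_TO_SHARED = {
    "itsatrap": None,
    "foss": "open-source",
    "oss": "open-source",
}


def transform_tags(local_tags, mapping):
    """Translate local tags into shared ones, keeping order and dropping duplicates."""
    seen = set()
    result = []
    for tag in local_tags:
        key = tag.strip().lower()
        if not key:
            continue
        # Tags absent from the mapping pass through unchanged.
        shared = mapping.get(key, key)
        if shared is None or shared in seen:
            continue
        seen.add(shared)
        result.append(shared)
    return result
```

Each site would publish its own table, so the shared vocabulary only grows where people bother to map to it, which is the participation-driven property the comment describes.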
  • Re:Don't agree (Score:5, Interesting)

    by Anonymous Coward on Monday January 15, 2007 @10:40AM (#17613236)
    Er, guys?

    Tags are keywords.

    There's a keyword line up in the header that isn't being used for much these days.

    If you want to tag your document in a machine-readable way, put the tags in the keyword field. Problem solved.
  • No.. and yes (Score:2, Interesting)

    by slashmojo ( 818930 ) on Monday January 15, 2007 @10:44AM (#17613308)

    First of all tags are not exclusive to the blogosphere - they exist on the boardscape (see boardtracker [boardtracker.com] for example) and of course on the many social nets and pretty much everywhere else.

    There are already microformats [wikipedia.org] for defining tags which can and should be used.

    Tags are for building a folksonomy [wikipedia.org] and are created 'by the people', so they are by their nature, to a certain extent, personalized and flexible. What makes sense to you may make no sense to everyone else, but so what? You made it, it's good for you and that's good enough. However, chances are it will make sense to some other people anyway, no matter what or how you tag, so it's all good.

  • Re:Don't agree (Score:2, Interesting)

    by Anonymous Coward on Monday January 15, 2007 @10:49AM (#17613372)

    I agree with you, and would add:

    1. If you establish a "tagging standard", you practically guarantee nobody's going to follow it. What's in it for them?
    2. What normal person cares about tags anyway? If we want more info on a topic, we use Google.
    3. The blogosphere is for losers anyway. Most of the time, they just sit around blogging about the blogosphere. Case in point: TFA. This garbage dump of anti-content can remain disorganized, for all I care.
  • by nbauman ( 624611 ) on Monday January 15, 2007 @10:51AM (#17613382) Homepage Journal
    I've talked to librarians and information scientists, and they talk about "controlled vocabulary". They told me one of the best systems was Pubmed http://www.ncbi.nlm.nih.gov/entrez/query.fcgi [nih.gov] which is an index of essentially every article published in a peer-reviewed medical journal. Every article is "tagged" with Medical Subject Headings (MeSH) keywords, and you can search the database for those keywords. If they can use "heart" or "cardiac", they have to decide which one to use (they use "cardiac"). They have keywords to separate human studies from animal studies. Here's more explanation http://www.nlm.nih.gov/mesh/meshhome.html [nih.gov] It's basically open source.
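A rough Python sketch of what a controlled vocabulary does, going only by the description in the comment above. The table is a toy: only the heart/cardiac pair comes from the comment, the other entries are invented for illustration, and none of it is taken from the actual MeSH thesaurus. The names `PREFERRED_TERM` and `index_terms` are hypothetical.

```python
# Toy controlled vocabulary: every accepted synonym points at one preferred term.
# Only the heart -> cardiac pair comes from the comment; the rest is made up.
PREFERRED_TERM = {
    "cardiac": "cardiac",
    "heart": "cardiac",
    "human": "human",
    "humans": "human",
    "animal": "animal",
    "animals": "animal",
}


def index_terms(free_keywords):
    """Split free-form keywords into (preferred terms, rejected keywords)."""
    accepted, rejected = [], []
    for word in free_keywords:
        term = PREFERRED_TERM.get(word.strip().lower())
        if term is None:
            # Not in the vocabulary: an indexer would have to pick a listed term.
            rejected.append(word)
        elif term not in accepted:
            accepted.append(term)
    return accepted, rejected
```

The point of the design is that the vocabulary, not the person tagging, decides between synonyms, so two indexers end up with the same term.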

    A similar system in law is the Westlaw key word system. The New York Times used to have a great keyword index, but I can't find it in the NYT online.

  • Re:Automatic tagging (Score:4, Interesting)

    by drcoppersmith ( 1048722 ) on Monday January 15, 2007 @10:58AM (#17613476) Homepage Journal
    There are a lot of instances of manual tagging, and I agree with you that they're just too cumbersome (as does almost an entire field of psycholinguists [if you think you can get all of them to agree on anything you're sorely mistaken. They'll disagree just because they can]).

    The automatically generated tags are exactly what I was talking about. I didn't get terribly explicit with my ideas, but you seem to be going in the same direction I was. Getting the software to both tag incoming documents and categorize the semantic webs generated by each is the key to some 'universal' tagging system. This way we have maximally efficient tags along with a standardized definition for each and (perhaps most importantly) an automatic way of tagging all the documents to be processed. No room for the "13 year old cheerleader tags" as someone so eloquently put before.

    We still have the problem of naming the 'generic' tag categories generated by the software... The solution for that one is a lot hazier, though important. I don't think anyone will go looking for 'category 12233242' to find 'academic humor'.
  • the solution... (Score:2, Interesting)

    by Bazman ( 4849 ) on Monday January 15, 2007 @11:05AM (#17613552) Journal
    I just RTFA and apparently the biggest problem is whether you type your tags as "Windows Vista","Piece Of Crap" or Windows_Vista,Piece_Of_Crap or WindowsVista,PieceOfCrap, so that people who put tags on D.e.li.cio.us might get confused when putting tags on technofarti.com. Spaces? Quotes? Delimiters? Oh my. What shall we do.

      Basically, people are too dumb/lazy/stupid to read a one-line description of how to format their tags. How confusing can it be? You just show people how to do it in the form, e.g.

      Tags [ ] (eg dogs, "border collies", barking)

    or

      Tags [ ] (eg dogs,border_collies,barking)

    or

      Tags [ ] (eg dogs,borderCollies,barking)

    Now, do we need a standard, OR do we need people to be able to read instructions? Note that one of these choices is a specific, set-in-stone piece of information, the other is a general piece of advice that people would do well to follow for most of their lives (although being able to read instructions is no guarantee that following them is a good idea).
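For what it's worth, the three input conventions shown in the comment above can be folded into one form on the receiving side. The Python sketch below is one way to do that, not something any of the sites mentioned are known to do; the function name `normalize_tags` is hypothetical, and the camelCase rule is crude (it would also split a name like "iPod").

```python
import csv
import re

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def normalize_tags(field_text):
    """Parse a comma-separated tag field into lowercase, space-separated tags."""
    # csv copes with the quoted form; skipinitialspace allows a space after commas.
    row = next(csv.reader([field_text], skipinitialspace=True), [])
    tags = []
    for raw in row:
        words = raw.replace("_", " ")            # border_collies -> border collies
        words = _CAMEL_BOUNDARY.sub(" ", words)  # borderCollies  -> border Collies
        tag = " ".join(words.lower().split())    # lowercase, collapse whitespace
        if tag:
            tags.append(tag)
    return tags
```

With this, `dogs, "border collies", barking`, `dogs,border_collies,barking` and `dogs,borderCollies,barking` all come out as the same three tags.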

  • by Anonymous Coward on Monday January 15, 2007 @11:06AM (#17613578)
    Libraries have had a similar problem, for the past hundred years or so.

    Here is a book. Where do you put the book in the library, and how do
    you classify it so as to make it maximally useful for your [*] patrons?

    [*] That's important! Your patrons are distinct from mine!

    You can order all the books in the collection
    by accession date (when you got the book).

    You can order all the books by author's last name.

    You can order all the books by title.

    You can order all the books by subject. If you do this, you can use
    Dewey classification, LC classification, or something else.

    Suppose you just stick with LC classification.

    Even two libraries that have the same book, and both use LC
    classification, say, may classify a book differently. Say you
    have an AI book that is *the* seminal text covering how to do
    clustering via fuzzy logic. Do you put this with all the AI books?
    with all the clustering books? with all the fuzzy logic books?
    (and all three sets of books may be in different places.)

    Tagging content on the web represents a similar situation. If you
    use a 'standard term' to tag a text, different sets of users /
    customers / readers may not associate that 'standard term' with
    the meaning you intended. A given term or phrase can be 'classified'
    (library science term) or placed into different categories of meaning,
    depending on context.

    I think that the original poster's statement ("Tagging was intended to
    be universal and standardized") either shows great naivete or hubris
    on the part of the unstated "intenders". Context is the key. Any
    one and his dog can come up with a standardized tagging scheme, but
    users of it will nonetheless adopt it (or not) based on the scheme's
    ability to classify information in a way that works for the adopter.
    What prospective adopters want, however, is not a straitjacket that
    forces them to classify web pages in a way that the adopter's users
    won't understand and won't use.

    ---a former AI researcher

  • Re:Automatic tagging (Score:2, Interesting)

    by drcoppersmith ( 1048722 ) on Monday January 15, 2007 @11:16AM (#17613700) Homepage Journal
    I agree with your take on tags being community based. I think there's more use for this out there, such as categorizing communities, looking at the underlying semantics of a website, determining the focus of a company, or summarizing the entirety of a body of research (and more interestingly, categorizing what is part of and what is not part of that body of research).

    This is just a problem I've worked on for a few years and have always had a small fascination with, I'm glad to share it (both in the mundane and fantastic applications).
  • there is a standard (Score:5, Interesting)

    by Yonder Way ( 603108 ) on Monday January 15, 2007 @11:17AM (#17613718)

    There is a standard but nobody uses it these days. Even the search engines disavow it anymore.

    <META name="keywords" content="foo, bar, baz"/>
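Reading that keywords element back out takes only the Python standard library. The sketch below is illustrative: the class `KeywordMetaParser` and the helper `extract_keywords` are hypothetical names, and splitting on commas is an assumption about how the content attribute is written.

```python
from html.parser import HTMLParser


class KeywordMetaParser(HTMLParser):
    """Collect tags from <meta name="keywords" content="..."> elements."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META> matches too.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "keywords":
            return
        content = attributes.get("content") or ""
        self.keywords.extend(k.strip() for k in content.split(",") if k.strip())


def extract_keywords(html_text):
    """Return the keyword list found in an HTML document, in document order."""
    parser = KeywordMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.keywords
```

Self-closing elements such as the one quoted above are handled as well, because the default `handle_startendtag` forwards to `handle_starttag`.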
  • Re:Don't agree (Score:3, Interesting)

    by laffer1 ( 701823 ) <luke@@@foolishgames...com> on Monday January 15, 2007 @12:00PM (#17614322) Homepage Journal
    Case is also an issue. Some sites only allow lowercase tags while others don't care about case.

    This is similar to the problem blogging sites have with cross site scripting. Try to tell a blogger you won't take HTML or bbcode posts (depending on generation of the blogger). Regardless of what you do, there's going to be sites that don't follow the rules and there will also be ways to screw it up for everybody.

    There isn't a standard for many things on the internet, which makes validation nearly impossible. Security researchers complain people don't do input validation, but I've never seen a complete webapp that's an example of security at the time it's written. You can't validate things like addresses, names, e-mail addresses, or long text entries including blogs and content without leaving out characters that should be valid or flat out blocking some from using your service.

    As for the standard, this reminds me of RDF. Had it taken off, we could tag data with properly defined, shared tags. Defining tags in RDF would allow sites to share the information through RDF and thus solve the problem of transmittal. Of course getting everyone to agree to this is another thing.
  • Re:The other option (Score:3, Interesting)

    by _Sharp'r_ ( 649297 ) <sharper@@@booksunderreview...com> on Monday January 15, 2007 @12:42PM (#17614972) Homepage Journal
    Speaking of /., it'd be nice if someone would implement negative tags so that the community can remove obviously inappropriate tags. Something like using -itsatrap would work nicely.
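A small Python sketch of how negative tags like `-itsatrap` could be tallied. This is a guess at one workable scheme, not how Slashdot's tagging works; the function `tally_tags` and its `threshold` parameter are hypothetical.

```python
from collections import Counter


def tally_tags(submitted_tags, threshold=1):
    """Net-score tags, treating a leading '-' as a vote against the tag."""
    scores = Counter()
    for raw in submitted_tags:
        tag = raw.strip().lower()
        if tag.startswith("-"):
            name = tag[1:]
            if name:
                scores[name] -= 1
        elif tag:
            scores[tag] += 1
    # Keep tags whose net score reaches the threshold, highest score first.
    kept = [(tag, score) for tag, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda item: (-item[1], item[0]))
```

A tag is shown only while votes for it outnumber votes against it by at least `threshold`, so a handful of readers can bury an inappropriate tag without any moderator stepping in.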
  • by BovineSpirit ( 247170 ) on Monday January 15, 2007 @02:20PM (#17616452) Homepage
    The rel-tag microformat [microformats.org] is an attempt to standardise tagging. It relies on other microformats to define what it is you are tagging. There isn't a 'photo' microformat at the moment, so you can't do a web-wide search for photos tagged 'fireworks' for example. If you're interested in the semantic web it's worth checking out microformats. You can download a plugin [mozilla.org] for firefox that reads microformats. Go and have a look at Flickr with it, or any other site that implements microformats. If people have tagged something with a 'geo' tag giving long. and lat. then it will bring up a Google Map showing the location. If they've included a 'hCard' around their contact details you can add it to your address book.
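To make the rel-tag idea concrete: as I understand the draft, a tag is an ordinary link marked `rel="tag"`, and the tag itself is the last segment of the link's URL path. The Python sketch below pulls tags out of a page on that assumption. The names `RelTagParser` and `extract_rel_tags` are hypothetical, and details such as treating `+` as a space are left out.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit


class RelTagParser(HTMLParser):
    """Collect tags from <a rel="tag" href=".../sometag"> links."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, e.g. "nofollow tag".
        rel_values = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href") or ""
        if "tag" not in rel_values or not href:
            return
        # Assumption: the tag is the last path segment, ignoring a trailing slash.
        path = urlsplit(href).path.rstrip("/")
        last_segment = path.rsplit("/", 1)[-1]
        if last_segment:
            self.tags.append(unquote(last_segment))


def extract_rel_tags(html_text):
    """Return the rel-tag tags found in an HTML document, in document order."""
    parser = RelTagParser()
    parser.feed(html_text)
    parser.close()
    return parser.tags
```

Note that the visible link text plays no part here; only the URL is read.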
  • by monkeyboythom ( 796957 ) on Monday January 15, 2007 @02:55PM (#17616912)

    A dictionary.

    There are people who live and die by tagging their information. They build folders and create lists.

    There are people who just go through life serendipitously. They never use the laundry hamper and most people call them slobs.

    Between these two groups are the rest of humanity. Sometimes they make lists and sometimes they don't. And just because the word, "librarian," strikes a fear of boredom, most people ignore library sciences. The science of tagging, if to be used as a global panacea, must be approached or studied to be feasible and usable over generations.

  • Re:Hyphens. (Score:1, Interesting)

    by Anonymous Coward on Monday January 15, 2007 @03:32PM (#17617358)
    I do not pretend to have a solution to this problem, but I think the situation would be improved if the editors or maybe the /.ers who wrote the top rated comment were the only people allowed to set the tags.
    I think a bigger part of the solution is to stop asking (leading/rhetorical/flamebait) questions in the submission. When you do that all of the discussion gravitates to that idiotic question rather than the meaningful content of the blurb/article.
  • Look into topic maps (Score:2, Interesting)

    by vrillusions ( 794844 ) on Monday January 15, 2007 @04:45PM (#17618514) Homepage

    I've started to run into this problem myself from using del.icio.us as my primary bookmark source. One of my current issues is not what tags other people are using, but what tags I am using. Currently I have a lot of overlapping tags. I did some cleanup lately so that 'photos' and 'photo' are in a single tag, etc.

    I started to look around and found there have been a lot of standardizations of topic maps, although they are intended more for very large systems (think government-sized systems categorizing millions of documents). The UK government has a topic system called the e-Government Metadata Standard (e-GMS) [govtalk.gov.uk]. The schema is browsable online [esd.org.uk]. Another good article is The TAO of Topic Maps [ontopia.net] (also in pdf [coverpages.org]).

    I think there should be a basic standard to avoid situations like the photo/photos tags above. But I think that should be as far as it goes. The good thing about tagging on most sites is you are not limited. The bad thing about tagging on most sites is you are not limited.
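The photo/photos cleanup described in the comment above can be automated up to a point. The Python sketch below folds singular and plural variants into whichever spelling is used more often. It is deliberately naive (a tag like "news" would be folded into "new"), and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def crude_singular(tag):
    """Strip one trailing 's' from longer tags; a deliberately naive rule."""
    tag = tag.strip().lower()
    if len(tag) > 3 and tag.endswith("s") and not tag.endswith("ss"):
        return tag[:-1]
    return tag


def merge_plural_variants(tag_counts):
    """Fold singular/plural variants into the most frequently used spelling."""
    groups = defaultdict(Counter)
    for tag, count in tag_counts.items():
        groups[crude_singular(tag)][tag.strip().lower()] += count
    merged = {}
    for variants in groups.values():
        # The dominant spelling becomes the canonical tag for the whole group.
        canonical, _ = variants.most_common(1)[0]
        merged[canonical] = sum(variants.values())
    return merged
```

Keeping the more popular spelling, rather than always the singular, means the merge follows what the tagger already does most of the time.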
