
Google Tags Content Creators 67

Posted by samzenpus
from the all-due-credit dept.
bizwriter writes "Google announced that it will support authorship HTML tags, a way to associate Web content with the individuals who create it. Suddenly, search engines know when one person was responsible for a body of work, no matter where content appears on the Web. If Google incorporates this into page relevance and ranking, as it is considering, the result could change the balance of power between those who create and those who publish."

  • So you can add these tags that mean google will direct people to the original author rather than your click-through blog - but why would you?
    • by Anonymous Coward

      So you can add these tags that mean google will direct people to the original author rather than your click-through blog - but why would you?

      Because anything that helps put Gawker Media out of business is OK by me.

      More seriously, because if I'm reading your blog's link to an article, it's because I want your commentary on the article. I might want the Fark thread about it, but I certainly don't want Gawker's take on BoingBoing's post about that dude on Reddit who read a NASA press release. If you're ju

  • Article Explained (Score:5, Informative)

    by pinkushun (1467193) * on Thursday June 09, 2011 @07:09AM (#36386320) Journal

    It is made to sound more uncontrolled than it is. This is what really happens:

    The markup uses existing standards such as HTML5 (rel="author") and XFN (rel="me") to enable search engines and other web services to identify works by the same author across the web.

    This is handy, allowing search engines to find content by a specific author. It's not like Google will automatically decide what content links to which author.

    We can't expect Google to give purely weighted search results based on this either. More like they will keep their existing page rankings, and include this extra author meta-data in specialized searches.

    We know that great content comes from great authors, and we’re looking closely at ways this markup could help us highlight authors and rank search results.

    The bnet article seems to overdramatize it, possibly due to a lack of understanding of what this means for content creators.

    Or do I also have the wrong idea?
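For readers who want to see what the rel="author" and rel="me" markup quoted above looks like to a crawler, here is a minimal sketch using only Python's standard library. The sample HTML, the URLs, and the names `RelLinkParser` and `extract_rel_links` are hypothetical illustrations; this is not how Google's crawler is implemented.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect href values of <a>/<link> elements whose rel includes 'author' or 'me'."""

    def __init__(self):
        super().__init__()
        self.links = {"author": [], "me": []}

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel is a space-separated token list; compare tokens case-insensitively
        rel_tokens = (attr.get("rel") or "").lower().split()
        for token in ("author", "me"):
            if token in rel_tokens:
                self.links[token].append(href)


def extract_rel_links(html):
    """Return {'author': [...], 'me': [...]} for the given HTML string."""
    parser = RelLinkParser()
    parser.feed(html)
    return parser.links


# Hypothetical page: one author link, one rel="me" profile link, one plain link
sample = """
<p>Written by <a rel="author" href="/authors/jane">Jane Example</a></p>
<p><a rel="me nofollow" href="http://profiles.example.org/jane">My profile</a></p>
<p><a href="/about">About</a></p>
"""
links = extract_rel_links(sample)
```

The markup only states a claim about who wrote the page. Whether a search engine trusts that claim is a separate question.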

    • by imamac (1083405)
      Parent=better summary. Thank you.
    • I agree that they probably won't use it in search rankings, otherwise everyone will just copy the current number 1 "best author" in their tags.

      • by tepples (727027)
        It's also a crime in some countries to put a fraudulent notice of authorship on a work. For example, in the United States, see 17 USC 1202-1205 [copyright.gov].
        • by Balthisar (649688)

          Yes, but does that apply to the source code or to the displayed content? Copyright law doesn't seem to support HTML tags, whereas a direct statement "Copyright 2011 by Firstname Lastname" passes muster.

          (Note that in the USA we all know you don't need a copyright statement to have the copyright. That's not what this is about.)

          • by drinkypoo (153816)

            Yes, but does that apply to the source code or to the displayed content?

            I just checked, and the answer is in the link provided to you. But I'm not going to tell you what the answer is, because that would be enabling your asshat behavior.

            • by swillden (191260)

              Yes, but does that apply to the source code or to the displayed content?

              I just checked, and the answer is in the link provided to you. But I'm not going to tell you what the answer is, because that would be enabling your asshat behavior.

              By my reading of the law... it makes no distinction between source or displayed content, but I see nothing in the law that would prohibit a copyright holder from claiming that someone else was the author. Perhaps some other law would, particularly if the claim could be construed as defamation, but I don't see anything in copyright law that addresses this issue.

      • Why not GPG sign it?
    • Will this help or hurt? A little before the turn of the century I researched Quake and Quake II console commands, tested them all, and wrote short descriptions of how to use them and what they did. It was copied on dozens of other web sites, word for word, usually with no attribution and usually with someone else's name on it.

      Meta tags were badly misused to spam search engines. And what if you're putting content [slashdot.org] on someone else's site and have no control over the meta tags?

      • Will this help or hurt? A little before the turn of the century I researched Quake and Quake II console commands, tested them all, and wrote short descriptions of how to use them and what they did. It was copied on dozens of other web sites, word for word, usually with no attribution and usually with someone else's name on it.

        I'm not sure that would even be covered by copyright law. You aren't allowed to copyright "facts" or "factual data". Maybe if your "short descriptions" were long enough, or expounded on the command beyond being a simple summary, it could be considered an original work. But for the most part, a simple compilation or list of factual information is not considered a copyrightable work.

        • by jvkjvk (102057)

          I, on the other hand, believe it would be.

          Here's the original authorship:

          wrote short descriptions of how to use them and what they did

          Or are you saying that technical help documentation cannot be copyrighted?

          I imagine there are a few other people who would disagree with that as well.

          Note - this is entirely separate from a discussion on what *should* be able to be copyrighted, much less what goals we wish with the laws and whether they accomplish those goals.

          Regards.

        • by mcgrew (92797) *

          The data can't be copyrighted, but its presentation is. If you write a book about chemistry I can read it, learn from it, and write my own chemistry book using the facts from your book as long as I present those facts in my own words. The plagiarists copied the entire thing whole cloth, even using the same IP address I used in one of the examples. Although my question here is about plagiarism rather than copyright infringement (I had no problem with someone republishing it provided they gave me credit and a l

      • Well then let me thank you for those lists, muchly appreciated! :: Q1 fan

  • The authorship link doesn't work for me so it may answer this, but...what's to stop me from "borrowing" someone's author tag and bumping up my site on the search results?
    • by DZign (200479)

      probably nothing.. as well as another site copying your site can just remove your tag and replace it with theirs, claiming they're the original author..

      • by DZign (200479)

        Replying to myself: seems it has to be reciprocal to work. So that's stopping someone from linking to an official author.

        You need rel=me on both sites linking to each other.
        http://www.google.com/support/webmasters/bin/answer.py?answer=1229920 [google.com]

        Now I wonder - it's an html5 tag. Should I already implement it on my own website which isn't html5, or would google then just ignore it?
        I can already put it on my own site, blog, facebook, .. but if it's going to be ignored then I won't bother..

        • by Tacvek (948259)

          Google's engine does not distinguish between the various versions of HTML. As long as Google successfully detects the page as html (and it is quite good at determining that), you can use any feature from any version and Google could not care.

          For what it is worth, this markup is also valid HTML 4, but HTML 4 simply does not define the meaning of the "me" or "author" values of the rel attribute, while HTML 5 does define the meaning (although I have not actually verified that).
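The reciprocal rel=me requirement described in this subthread can be sketched as follows. This is an illustration under stated assumptions, not Google's verification logic: pages are supplied as an in-memory dict instead of being fetched, URLs are compared as exact strings with no normalization, and the names `rel_me_targets` and `is_reciprocal` as well as all URLs are hypothetical.

```python
from html.parser import HTMLParser


class _RelMeParser(HTMLParser):
    """Collect the href of every <a>/<link> element carrying rel="me"."""

    def __init__(self):
        super().__init__()
        self.targets = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if tag in ("a", "link") and "me" in rel_tokens and attr.get("href"):
            self.targets.add(attr["href"])


def rel_me_targets(html):
    """Return the set of URLs that this HTML links to with rel="me"."""
    parser = _RelMeParser()
    parser.feed(html)
    return parser.targets


def is_reciprocal(pages, url_a, url_b):
    """True only if url_a links to url_b with rel="me" AND url_b links back."""
    a_links = rel_me_targets(pages.get(url_a, ""))
    b_links = rel_me_targets(pages.get(url_b, ""))
    return url_b in a_links and url_a in b_links


# Hypothetical crawl results: URL -> HTML (assumption: absolute URLs only)
BLOG = "http://blog.example.com/about"
PROFILE = "http://profiles.example.org/jane"
SPAM = "http://spam.example.net/"
pages = {
    BLOG: '<a rel="me" href="http://profiles.example.org/jane">profile</a>',
    PROFILE: '<a rel="me" href="http://blog.example.com/about">blog</a>',
    SPAM: '<a rel="me" href="http://profiles.example.org/jane">me, honest</a>',
}

blog_ok = is_reciprocal(pages, BLOG, PROFILE)   # both sides link to each other
spam_ok = is_reciprocal(pages, SPAM, PROFILE)   # profile never links back to spam
```

A one-way link from a third-party page fails the check, because the profile page is controlled by the real author and never links back.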

    • what's to stop me from "borrowing" someone's author tag

      Federal law, as I pointed out in another comment [slashdot.org].

    • The full power of the Copyright SWAT team. Or Slander & Libel.

      Summarizing you, you're talking about putting Respected_Author tags on 4chan posts.

    • by archen (447353)

      Which is exactly what will happen. Current link farms will cross pollinate each other and it will be nearly impossible to tell who really wrote anything. Least likely will be the person who did write the original content.

  • by TaoPhoenix (980487) <TaoPhoenix@yahoo.com> on Thursday June 09, 2011 @07:12AM (#36386346) Journal

    Oh dear me, am I missing something?

    So you can totally spoof random people's names into any webpage? So searches for author=Obama come up with doctored pics of Osama-Obama slash or something?

    • by NNKK (218503)

      Oh dear me, am I missing something?

      So you can totally spoof random people's names into any webpage? So searches for author=Obama come up with doctored pics of Osama-Obama slash or something?

      Thanks for the imagery, but what is it that makes you think you can't _already_ claim any random person wrote something? Do you think the normal non-tag text in an HTML document is under a magic spell that prevents misattribution?

      • Because this is an Author Tag! (Cue the Serious Stern Face.)

        Of course twerps can claim stuff. So far people can just laugh stuff off.

        Now the obvious use of the tag is for the copyright police... they're gonna try to make the author tag a statement almost akin to under oath. So all those tv show clips on youtube that don't have the network=author tag are instant slam-bait.

        But now the more dangerous case is when Da Gov wants to do False Flag cases, and posts pics of Democrats sharing lingerie, and they put "A

    • I pick a respected author, perhaps academic, who writes about similar things as me. I publish my crap whitepaper claiming to be him. It's likely that no human will notice the deception. Depending on my goals, the human-readable text of the whitepaper will claim the author to be him or me.

      • Oh, of course.

        I used a little humor. But yes, you absolutely have a clear case - you submit something in an intelligent style, and the first pass no one notices, until it accidentally gets picked up and then they slam the original creator.

        What for example if that math paper that got hosed last week was *spoofed*? It's bad enough if the original author goofed, but since he got pulverized for "not checking", what if it was a classy defamation attack?

    • Or you could attach the name of your arch nemesis to the goatse picture....
      • by Combatso (1793216)
        I'm waiting to see some of my uglier friends getting tagged as "goatse" by Facebook's face detection
  • If this is implemented via tags in the HTML itself, it can be easily detected and stripped by content thieves, can't it?

    If I copy the entire body of work of, say, the War Nerd, and set up a copycat blog ("the war geek"), how can these tags (which I've already modified) tell this is a blatant rip-off?

    • That's probably true. But if I understood this right, the point is to make the authors more visible on the internet - for example if I find a blog I like, I can easily find more writings by the same author, no matter what site they're on.

      • That's probably true. But if I understood this right, the point is to make the authors more visible on the internet - for example if I find a blog I like, I can easily find more writings by the same author, no matter what site they're on.

        Unless the author has a common name like John Doe...

        The only way a tag like this *might* work would be to make the tag value a public-key signature of the content enclosed inside the tag. Which would allow you to see that content A was signed by key XYZ, as was conten
    • by grumbel (592662)

      Judging from the Google blog this doesn't sound much like a rip protection, but more as a way to allow searches like "Show me everything else the author of this article has written". That said, rip protection should be possible, if they mark the first page that they find with the content as special and then everything with the same content as a copy.

    • They can't. The fact that it is just basic HTML means that detect-and-strip will be downright trivial; but there is nothing (outside of the darkest fantasies of the "trusted computing" set) that could actually stop such activity.

      It seems like this falls into the category of 'potentially useful incremental change'. It isn't resistant to rip-offs (but neither was the status quo) and it makes it somewhat easier for good-faith actors to make a pertinent piece of metadata easily accessible. The metadata dreams
      • by jfengel (409917)

        If you include the host domain in the digital signature, you'd be able to prevent people from re-hosting the work (or at least detect it and ignore copies). You'd still need the priority system you suggested to identify THE author (otherwise, as you say, somebody could rip and re-sign the content for a new host).

        It's probably too much work for the benefit you'd get, but it might be worth the experiment, and Google is exactly the people to do that experiment. It means a vast amount of crunching, possibly t

  • If somehow it's discovered that a particular author earned a high pagerank, what exactly would prevent linkfarms from tagging that author on every one of their pages?

    • by xveg (183289)

      Google is not that dumb, the article is just wrong.

      From google [google.com]

      This tells search engines: "The linked person is an author of this linking page." The rel="author" link must point to an author page on the same site as the content page. For example, the page http://example.com/content/webmaster_tips could have a link to the author page at http://example.com/authors/mattcutts. Google uses a variety of algorithms to determine whether two URLs are part of the same site. For example, http://example.com/content, http://www.example.com/content, and http://news.example.com can all be considered as part of the same site, even though the hostnames are not identical.
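The same-site rule quoted above can be illustrated with a deliberately naive check. Google says it uses "a variety of algorithms" and does not describe them, so the heuristic here (compare the last two hostname labels) is purely an assumption for illustration. It gets multi-part suffixes such as co.uk wrong. The names `naive_site_key` and `same_site` are hypothetical.

```python
from urllib.parse import urlsplit


def naive_site_key(url):
    """Return the last two labels of the hostname, e.g. 'example.com'.

    Assumption: a crude stand-in for a real 'same site' test; it mishandles
    suffixes like 'co.uk' and shared hosting domains.
    """
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:])


def same_site(content_url, author_url):
    """True if both URLs share the same naive site key."""
    key = naive_site_key(content_url)
    return key != "" and key == naive_site_key(author_url)


# URLs taken from the quoted Google help text, plus one hypothetical link farm
on_site = same_site(
    "http://example.com/content/webmaster_tips",
    "http://example.com/authors/mattcutts",
)
subdomains = same_site("http://news.example.com", "http://www.example.com/content")
link_farm = same_site(
    "http://linkfarm.example.net/page",
    "http://example.com/authors/mattcutts",
)
```

Under this rule a link farm cannot point rel="author" at a well-known author page on another site. It can only point at an author page it hosts itself.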

  • Most people add their HTML to a server in one way or another. Isn't that publishing? It isn't like there are private web sites with articles that were written by an author then transferred to HTML to be posted to the web. Oh wait. No. AOL isn't that way any longer.

  • Lots of people have common names. You could be a Michael or a Mary or a Mohammed or a Jennifer or a William.

    • See details here [google.com], where it is explained that all works authored by someone in a domain should be linked to a unique author page at that domain, and that authors can associate/link their author pages between various domains using reciprocal linking.

  • will you prevent publishers from modifying that tag on the fly ? it's just a simple text replacement operation.
  • I was wondering when it would be possible to quote and requote the amazing debate that will change our society as we know it and transform us all into peace loving philanthropists who respect life. Oh wait! that debate happened already in irc chat.
  • We'll get to find out who Goatse REALLY is.

  • that it will be easy to randomize/ spoof/ rip off, and a stupid tag doesn't change anything:

    FIRST APPEARANCE of author tag means something. and no, it doesn't mean i can change the publish date on the file to June 1st, 1896 and always be the first author: when did SEARCH ENGINES first see content XYZ with author tag ABC?

    that's case closed, right there. you can't spoof this system, unless you have a time machine, or you can hack google

    now, if anyone rips off your content, you will be able to point to google'

    • by rwv (1636355)
      In that case, ripped off content of sites that Google scans hourly will get credit while real authors who maintain sites that are scanned less frequently won't get the credit they deserve. The new SEO will be "use blogger" which gets scanned (at least by Google) when you press "Publish". Unless Google can collaborate with other sites which allow users to publish data for the "first published" data? Does WordPress have hooks for such a collaboration? Would such a system be able to track plagiarism that i
      • you're talking about some pretty fringe time cases

        besides, the problem is easily corrected: if you write something valuable to you that you fear someone will rip off, you ACTIVELY submit the page to the search engines, rather than waiting for them to be passively scanned
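The "first appearance" idea in this thread, and the crawl-frequency objection to it, can be sketched with a small in-memory index. Everything here is a hypothetical illustration: the class `FirstSeenIndex`, the whitespace-and-case normalization, the integer crawl times, and the URLs are assumptions, and nothing in the thread says a search engine works this way.

```python
import hashlib


class FirstSeenIndex:
    """Credit each piece of content to the author tag seen at the earliest crawl."""

    def __init__(self):
        # content fingerprint -> (crawl_time, author_url, page_url)
        self._first = {}

    @staticmethod
    def fingerprint(text):
        # Collapse whitespace and case so trivial reformatting keeps the same hash
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def observe(self, text, author_url, page_url, crawl_time):
        """Record one crawl and return the record currently credited.

        crawl_time is when the crawler saw the page, not a date claimed by
        the page, so backdating a file does not help a copier.
        """
        key = self.fingerprint(text)
        current = self._first.get(key)
        if current is None or crawl_time < current[0]:
            self._first[key] = (crawl_time, author_url, page_url)
        return self._first[key]


index = FirstSeenIndex()
article = "How to use the Quake console"

# The objection raised above: the copy sits on a frequently crawled site and
# is seen at t=100, while the original is only crawled at t=500.
index.observe(article, "http://copycat.example.net/me",
              "http://copycat.example.net/post", 100)
credited = index.observe(article, "http://author.example.com/me",
                         "http://author.example.com/post", 500)
```

In this run the copier keeps the credit, which is why the reply suggests actively submitting a page to the search engine as soon as it is published.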

  • If this were used for ranking, then I would expect web masters to attribute articles to Big Names.

    I would hope that Google would have a policy of fingerprinting the articles. Most people's writing style is sufficiently unique that claiming that someone else wrote Foo is fairly obvious on analysis.

    I hope also that there is a search tool so that I can find all articles attributed to me.

    And suppose that Slashdot and phpBB support this tag so that I can find all the posts by a given author.
