
Googlebot and Document.Write

With JavaScript/AJAX being used to place dynamic content in pages, I was wondering how Google indexed web page content that was placed in a page using the JavaScript "document.write" method. I created a page with six unique words in it. Two were in the plain HTML; two were in a script within the page document; and two were in a script that was externally sourced from a different server. The page appeared in the Google index late last night and I just wrote up the results.
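A minimal sketch of the kind of test page described above, assuming placeholder words and a hypothetical external script URL (the post does not list the actual words or servers used). The helper `textWithoutScripts` is also an assumption: it models an indexer that reads markup but never executes scripts, and says nothing about what Googlebot actually did.

```javascript
// Placeholder words: the original post does not list the six words it used.
const words = {
  html: ["zzalpha", "zzbravo"],        // present in the plain HTML
  inline: ["zzcharlie", "zzdelta"],    // written by a script inside the page
  external: ["zzecho", "zzfoxtrot"],   // written by a script on another server
};

// Hypothetical URL standing in for the externally sourced script.
const externalScriptUrl = "http://other-server.example/words.js";

// Builds the HTML of the test page with the three placements.
function buildTestPage() {
  return [
    "<html><body>",
    `<p>${words.html.join(" ")}</p>`,
    `<script>document.write("${words.inline.join(" ")}");</script>`,
    `<script src="${externalScriptUrl}"></script>`,
    "</body></html>",
  ].join("\n");
}

// Body of the external script that would be served from the other server.
function buildExternalScript() {
  return `document.write("${words.external.join(" ")}");`;
}

// Models an indexer that does not run scripts: drop script elements,
// strip the remaining tags, and collapse whitespace.
function textWithoutScripts(html) {
  return html
    .replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

console.log(textWithoutScripts(buildTestPage())); // prints "zzalpha zzbravo"
```

Searching the index for each of the six words would then show which of the three placements were picked up.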
This discussion has been archived. No new comments can be posted.

  • by AnonymousCactus ( 810364 ) on Monday March 12, 2007 @01:18AM (#18312866)

    Google needs to consider scripts if they want high-quality results. Besides the obvious fact that they'll miss content supplied by dynamic page elements, they could also sacrifice page quality. PageRank and the like will get them very far, but an easy way to spam the search engines would be to have pages on a whole host of topics that immediately get rewritten as ads for Viagra as soon as they're downloaded by a JavaScript-aware browser. It would be interesting to know the extent to which they correct for this.

    Of course, there are much more subtle ways of changing content once it's been put out there. One might imagine a script that waits 10 seconds and then removes all relevant content and displays Viagra instead. Who knew web search would be restricted by the halting problem? I wonder how far Google goes...

  • by vidarh ( 309115 ) <vidar@hokstad.com> on Monday March 12, 2007 @07:57AM (#18314545) Homepage Journal
    Because doing so without massive limitations would run into the halting problem. A search engine simply CAN'T determine whether an arbitrary piece of JavaScript will terminate in the general case. In lots of special cases it can (such as when there are no control constructs, or the control constructs can't possibly cause loops or recursion), and it could use timeouts or execute only the first "n" steps in an interpreter. But all of that would essentially cripple the feature.

    And for what? So that some lazy web developer doesn't have to put the content they want indexed in an invisible div and have their JS pick it up from there if they want to do more with it?

    It would also pick up a lot of stuff that people have put in javascript because they don't want the search engines to index it.
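The "first n steps" idea from the comment above can be sketched with a toy interpreter. Everything here is an assumption made for illustration: the two-instruction language (`write`, `jump`) and the function name `runWithStepBudget` are hypothetical, and this is not how any real crawler is known to work. It only shows that a step budget guarantees the indexer stops, at the cost of possibly cutting off a script that would have finished later.

```javascript
// Toy instruction set (hypothetical, for illustration only):
//   ["write", text]  appends text to the rendered output
//   ["jump", index]  continues execution at the given instruction index
function runWithStepBudget(program, maxSteps) {
  const output = [];
  let pc = 0;     // program counter
  let steps = 0;  // instructions executed so far

  while (pc < program.length) {
    if (steps >= maxSteps) {
      // Budget exhausted: give up without knowing whether the
      // program would ever have terminated on its own.
      return { output: output.join(""), finished: false, steps };
    }
    const [op, arg] = program[pc];
    steps += 1;
    if (op === "write") {
      output.push(arg);
      pc += 1;
    } else if (op === "jump") {
      pc = arg;
    } else {
      throw new Error(`unknown instruction: ${op}`);
    }
  }
  return { output: output.join(""), finished: true, steps };
}

// A script that terminates after two steps.
const terminating = [["write", "hello "], ["write", "world"]];

// A script that loops forever, writing "x" on every pass.
const looping = [["write", "x"], ["jump", 0]];

console.log(runWithStepBudget(terminating, 100)); // finished: true
console.log(runWithStepBudget(looping, 5));       // finished: false
```

With a budget of 5, the looping script is cut off after writing "xxx". The indexer gets some output either way, but it cannot tell a slow script from one that never ends, which is the limitation the comment describes.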

  • by e-Trolley ( 771869 ) on Monday March 12, 2007 @08:44AM (#18314859) Homepage
    AJAX is for writing applications, not documents. Why and how should an application be indexed?
  • by Sancho ( 17056 ) * on Monday March 12, 2007 @07:55PM (#18325057) Homepage
    How should 'major' be defined, though? When most Firefox-borked sites were coded, Firefox probably had less than 5% (around what Safari had, last I heard). Is 5% enough to overlook? What about 3%? 1%?

    If you code to the standard, at least you can blame browsers for their broken implementation.
