'Massive Issues' Reported For Google's Indexing of JavaScript Content (searchenginejournal.com) 53

The way Google is indexing JavaScript content is "still a massive issue," reports Search Engine Journal: As much as 60% of the JavaScript content is indexed within the first 24 hours after indexing HTML. But there is also bad news. As much as 32% of the tested pages have unindexed JavaScript content after one month, due to a variety of reasons...

Indexing delays can cause Google to take a lot more time in discovering newly added pages on your news website... If it takes ages for Google to index your JS-dependent product description, your competitors will be taking the top positions for prominent queries....

We also checked a random sample of URLs from popular ecommerce and news websites. On average, JavaScript content is not indexed in Google in 25% of these websites' pages. This is one of the results of the two waves of indexing. The second wave is not guaranteed. Indexing JavaScript can fail due to many reasons, or may not happen at all....

If you are using JavaScript for generating important content, you have to implement it wisely and keep it under control.

This discussion has been archived. No new comments can be posted.

  • If (Score:5, Insightful)

    by Anonymous Coward on Monday December 02, 2019 @12:43AM (#59475488)

    If you are using JavaScript for generating important content

    you are an asshat.

    • by antus ( 6211764 )
      This. So this. Of course it's the new way of the world. But it is also crazy. Now that a good chunk of apps at my employer have moved to the cloud for higher cost and lower performance with less flexibility, it seems workflow changes are now mostly implemented by adding JavaScript to app themes to rewrite content after the page is rendered. We do know it's horrible, but we just can't do it any other way. That of course is not content, more UI hacks.. but still.
    • by ccham ( 162985 )
      This was an issue with Flash sites over a decade ago and with JavaScript sites for a little less time. Google didn't support Flash sites for years and people still had them. I don't think this is an alarmingly new trend, though the journal is a pretty reliable site for search. The number of site developers not fully testing compatibility with common browsers seems to be increasing, never mind checking site crawlability. If anything it is not Google's fault as much as asshat developers and managers with
      • If only there was a way to add meta info to web pages so the search engines didn't have to rely 100% on analyzing the content.

        That, and a special file called "robots.txt" so that they could know what pages to visit without analyzing HTML events and JavaScript to see what happens.
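
A rough sketch of what that file gives a crawler, using Python's standard `urllib.robotparser`. Strictly speaking, robots.txt mostly says which paths to stay out of, and its `Sitemap` line points to a list of pages worth visiting. The robots.txt text, the example.com URLs, and the `load_rules` helper are all made up for illustration; no real site or crawler is being described.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a placeholder site (example.com).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
"""


def load_rules(robots_txt):
    """Parse robots.txt text directly, without any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Ask whether a crawler identifying itself as "Googlebot" may fetch each URL.
product_allowed = rules.can_fetch("Googlebot", "https://example.com/products/widget")
private_allowed = rules.can_fetch("Googlebot", "https://example.com/private/report")

# Sitemap URLs declared in the file (None if there were none); Python 3.8+.
sitemaps = rules.site_maps()
```

None of this requires running a line of the site's JavaScript, which is the point being made above.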

        • It reminds me of the DVD days and trying to build a video app that worked properly with a mouse. Sometimes you'd have a full screen with click boxes, easy.

          Other times each box you'd go to with the DVD remote arrow keys actually took you to an identical new screen but a different thing highlighted that you could then press Ok on. There was no knowledge of the other boxes except on their individual clone screen.

          So if the latter, how do you know all the boxes to enable for a random access mouse click? Belie

        • by AmiMoJo ( 196126 )

          Asking the spammers if the content of their websites is spam isn't a good idea. Metadata has been ignored for at least a decade; any half-decent search engine that isn't 90% spam looks at the actual content of the site.

      • by jeremyp ( 130771 )

        I don't think it's alarming at all. If your web page doesn't appear on Google because you've rendered all the content with Javascript, well boo hoo.

        • The downside would be that if your competitor is using JavaScript but Google was "kind enough" to index it, while they don't index yours....
    • Re:If (Score:5, Interesting)

      by Z00L00K ( 682162 ) on Monday December 02, 2019 @01:42AM (#59475588) Homepage Journal

      In a way you are right - and if search engines don't index JavaScript-generated content then that habit might go away instead. On the other hand, many sites are probably requiring JavaScript-generated content in order to force trackers to work as well.

      • "if search engines don't index javascript generated content then that habit might go away"

        This. The search engines should specifically *not* index any dynamically generated content.

        The whole point of the web (imho) is to present the client with finished content ready to be rendered. Subsequent interactions may change that content (and this may involve JavaScript), but the initial delivery from the server should be complete and readable without JavaScript.

        A related point: Allowing dynamic content generation

    • Re:If (Score:4, Informative)

      by Bite The Pillow ( 3087109 ) on Monday December 02, 2019 @01:42AM (#59475590)

      Feature, not a bug.

      "If you are using JavaScript for generating important content, you have to implement it wisely and keep it under control"

      Graceful fallback has been an idea for a very large amount of JavaScript's lifetime. It shouldn't be a surprise.

      • Yep.

        Also: Google publishes entire pages telling you how to make your website work with their search engine. Even if your site is 100% plain text it's still not a good idea to let the google bot just look at that and try to figure out what the site is about. You have to add extra info for the search engine.

      • by AmiMoJo ( 196126 )

        They don't want graceful fallback because they are using Javascript as a kind of DRM. It makes it harder to scrape the site.

        Now they are annoyed that it's harder for Google to scrape the site.

    • My old workplace's website was raw HTML and CSS, then we paid some cunt to "upgrade" it by adding "web 2.0 features". Bandwidth costs shot through the roof, loading times quintupled, ublock broke the site, noscript broke the site, and the aforementioned cunt tee-heed into the distance with a pile of money and the chief refused to call the guy out on his claims of "improving speed and performance".

      We sold parts and tools for car repair.
    • If you are using JavaScript for generating important content

      you are an asshat.

      It depends on what you are trying to do.

      There are literally accessibility tutorials from WCAG saying that you should generate stuff with JavaScript. Controls, not content, granted, but still.

      Generating all or part of your page with JS is mainstream, and search engines may need to adapt rather than the other way round.

    • I had a website I did for a client many years ago. They moved away from me because they thought they could do better without a developer, so they moved to Wix. Their current website is exactly like this, entirely JS content. So in this case there's nothing I can do about it, but I stopped caring because I tried for years to work with their designer in a partnership and all they wanted to do was take over instead. All I can say is their content is now hard to index, their pages are now 5x the size to load and harder
    • Can't stand not being able to open links in new tabs, or share URLs of what I am viewing, because the entire site is a dynamic JS something.
  • by Futurepower(R) ( 558542 ) on Monday December 02, 2019 @01:01AM (#59475514) Homepage
    When I look at some web sites, I get the impression that the person who built the web site was not caring about good communication, but was wanting to get experience with JavaScript.

    Look at the Los Angeles Times [latimes.com] newspaper using the NoScript browser extension [noscript.net]. It's amazing how many other places to which you are expected to connect besides the Los Angeles Times.

    Many web sites expect you to connect with GoogleAnalytics.com and other Google web sites such as GoogleTagServices.com, so Google follows you.

    It would be very interesting to me to have a Slashdot story in which someone analyzed each additional web site to which the L.A. Times web site connects. Why are there so many?
    • by evanh ( 627108 )

      This. Scripting is a gimmick used mostly for tracking. The term script kiddies comes to mind.

    • Plenty of frameworks such as Angular are built on JavaScript and won't function without it. They really cut down on the work you have to do to build your site as opposed to starting out with vanilla JS and hacking together a home-grown framework. For most people, when you put getting a great looking site up by leveraging existing frameworks and toolkits, but requiring JavaScript, OR making your page less reliant on JavaScript, and requiring more work and time to get a result that is not as good looking, it'

      • Duh! (Score:4, Insightful)

        by evanh ( 627108 ) on Monday December 02, 2019 @01:55AM (#59475604)

        It's the "frameworks" and "toolkits" themselves that need to ditch scripting.

        Same as for the ad industry. Take away the option of tracking and they'll stop using it.

        • by tepples ( 727027 )

          It's the "frameworks" and "toolkits" themselves that need to ditch scripting.

          I can concede this for a static document. But in a web-based whiteboard app, would you prefer to have to click every point along a curve instead of dragging the curve with your mouse, stylus, or finger? Or would you prefer to download a native app only to see "We're sorry! This app is not yet available for your device's operating system"?

      • Just ditch Javascript and frameworks and learn HTML and style sheets. OK, a few things might need scripting, but pure HTML will do most of what you want.

        • Yes, but a few things will need scripting...

          There's nothing wrong with JavaScript in itself, there's many things that HTML doesn't do and plenty of things that can really improve a page if they're dynamic and can respond to events (eg. menu bars).

      • False dichotomy. You can make sites that degrade gracefully but still have bells and whistles. Instead of using a CMS web developers are covering their incompetence with database queries and caching by drawing the page empty and then filling in the content later. I took all the candy coated bullshit out of my Drupal site to reduce memory consumption, but you can have all that crap and still literally fall back to working completely without JavaScript if you swallow your pride and use a CMS that handles it f

    • "When I look at some web sites, I get the impression that the person who built the web site was not caring about good communication, but was wanting to get experience with JavaScript." Exhibit A: https://medium.com/javascript-... [medium.com] The webshit who wrote this article is in his first year of a software "engineering" degree at an Indian University, and he's recommending next year's hottest webshit technologies.
  • documents model (Score:5, Insightful)

    by johnjones ( 14274 ) on Monday December 02, 2019 @01:37AM (#59475580) Homepage Journal

    maybe just maybe you should serve the actual content in the page and THEN style it with CSS and add useful features with javascript
    NOT serve the whole thing as a javascript page with an abstract virtual DOM a la facebook...

    it helps people as well as machines to make things clear....
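
To make the difference concrete, here is a small sketch of what a client that does not execute JavaScript can read from two kinds of page. It uses only Python's standard `html.parser`. Both sample pages, the `VisibleText` class, and `text_without_js` are invented for illustration; this models a plain HTML pass only, not Google's actual pipeline, which per the summary may render the JavaScript later in a second wave.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text a non-JavaScript client would see, skipping script/style bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside <script> and <style>.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def text_without_js(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Content is in the HTML; the script only adds a feature on top.
SERVER_RENDERED = """<html><body>
<h1>Brake pads</h1><p>Ceramic pads for compact cars.</p>
<script>enhanceGallery();</script>
</body></html>"""

# Empty shell; all content would arrive only after the script runs.
JS_SHELL = """<html><body>
<div id="root"></div>
<script>fetch('/api/product/1').then(render);</script>
</body></html>"""

server_text = text_without_js(SERVER_RENDERED)
shell_text = text_without_js(JS_SHELL)
```

The first page yields its heading and paragraph; the second yields an empty string, which is also what a NoScript user or a screen scraper gets.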

  • Ironic (Score:3, Interesting)

    by The MAZZTer ( 911996 ) <megazztNO@SPAMgmail.com> on Monday December 02, 2019 @01:38AM (#59475582) Homepage

    The fact Google is even able to index JavaScript content in the first place is only possible thanks to Google Chrome, which many people are concerned about as Google ships it as the default browser on Android devices and overall it is responsible for over 50% of web traffic now IIRC. However this means sites that design for Chrome will also work properly with Google's indexing tools.

    Of course at the end of the day an indexing tool can't use human reasoning to identify and click on links to see where they take it; if the web developers aren't careful it's easy to make web links that an indexing tool won't follow, for example by not using an anchor tag and navigating via JavaScript. If you don't design your web page to be indexing-friendly, don't be surprised when it's not indexed properly. That's likely what is happening here.
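
A minimal sketch of that failure mode, assuming a simple indexer that only follows `href` attributes on anchor tags. The `LinkCollector` class, `crawlable_links`, and the sample markup are hypothetical, and real crawlers are more sophisticated than this.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect link targets the way a simple crawler would: from <a href> only."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip missing hrefs, javascript: pseudo-links, and same-page fragments.
        if href and not href.startswith(("javascript:", "#")):
            self.links.append(href)


def crawlable_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Three "links" a human can click; only the first is a real anchor with a URL.
PAGE = """
<a href="/parts/filters">Filters</a>
<span onclick="location.href='/parts/belts'">Belts</span>
<a href="javascript:void(0)" onclick="go('/parts/tools')">Tools</a>
"""

found = crawlable_links(PAGE)
```

Only `/parts/filters` is discovered here. The other two destinations exist solely inside event handlers, so an indexer of this kind never learns about them.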

  • by peppepz ( 1311345 ) on Monday December 02, 2019 @01:42AM (#59475586)
    Can't wait for sites to be distributed as .exe instead of .com. :-)
  • Shortcuts have only one destination, back to square one.

    If you don't have time to do it right the first time, what makes you think you have time to do it over?
  • by mwa ( 26272 ) on Monday December 02, 2019 @05:47AM (#59475874)

    Are we pissed at Google for sucking up all information or are we pissed because they're not getting enough or getting it fast enough?

  • No, really. I'd love to see anyone here build something complex like an e-commerce site, but with 100% full page refreshes for every action (because ya know the entire language of JavaScript is bad or something), and have people actually use it.
    • This isn't about AJAX. It is about using Javascript for navigation. Same language; two completely different use cases.
      • by Merk42 ( 1906718 )
        By "here" I meant the comments, which are vilifying the entire language of JavaScript in any use case.
        • web pages suck for software... I can't believe these guys have no idea what SaaS requires. Do the bulk of these guys even develop software any more? I just came back after a few years away, wtf happened to this community? I mean, hating on javascript is fine... but arguing that the web should be static pages... what?

          The browser is where we write client components for distributed software now.

          Also there's a bunch of basement fascists around.

          • by tepples ( 727027 )

            Do the bulk of these guys even develop software any more?

            Many do, but that's no consolation when you cannot use an application that you want to use because it happens to have been made for an operating system other than the one your device runs. JavaScript is the only way I know to provide one client-side executable artifact that runs on all five of X11/Linux, macOS, Windows, Android, and iOS, and is not subject to Apple Inc. censorship.

    • Pretty much every e-commerce site before the early 2000s didn't rely on JavaScript - it took so long for a standard (ECMAScript) to emerge and then just as long for the browser manufacturers to adopt it uniformly. Browser-side scripting is a nice-to-have, but you should never rely on it for content or functionality. The greater issue is that most folks in charge of an organizational web site these days have no idea whether or how their site uses JavaScript. Good web development has gone the way of the standar
    • You mean something like this? [youtu.be] Which is apparently something that JS people came up with, and thought it was a good idea for some weird reasons. Or something like this [youtu.be], which is what sane people use?
  • The problem here is the current ongoing craze in the content management sector, called headless content management [wikipedia.org], also called Decoupled Architecture for Content Management [pantheon.io].

    Under this scheme, the backend is just a RESTful API serving only structured data (usually JSON). The front end is written in a JavaScript framework (AngularJS, Vue, etc.)

    The advantage is that you can hire someone who knows that front end, and knows nothing about the backend (so it can be Drupal or WordPress, or something custom).

    Th
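
For comparison, a sketch of the alternative: taking the same kind of JSON a headless backend would serve and rendering it to HTML on the server, so the first response already contains the content. The payload shape (`title`, `body`) and the `render_article` function are assumptions made up for this example, not the API of Drupal, WordPress, or any particular framework.

```python
import json
from html import escape

# Hypothetical JSON as a headless backend might return it.
API_RESPONSE = '{"title": "Brake pads", "body": "Ceramic pads for <compact> cars."}'


def render_article(payload):
    """Turn the JSON payload into finished HTML before it leaves the server."""
    data = json.loads(payload)
    # Escape the values so text from the API cannot inject markup.
    title = escape(data["title"])
    body = escape(data["body"])
    return f"<article><h1>{title}</h1><p>{body}</p></article>"


html_page = render_article(API_RESPONSE)
```

The decoupling between backend and presentation is kept, but the rendering step runs once on the server instead of in every visitor's (and every crawler's) browser.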

    • It's not necessarily a bad thing.

      With plugins that disable JavaScript by default and explicitly block the worst offenders for JavaScript (tons of ads, GoogleTagManager, Google Analytics, Facebook tags, etc.), there are plenty of sites that also end up blocked because they require JS to load any content. I've discovered that nearly every time they're sites I probably don't want anyway.

      Some examples:

      News sites that are owned by Sinclair use the pattern, and their pages show up blank. This is a win.

      Most

      • by tepples ( 727027 )

        If there is a site that shows up blank and I know I want the content, I can whitelist it. Usually it isn't worth it, so this is a win.

        What criteria do you apply to determine whether a JS-driven web application is worth it? One example is the web-based chat platform Discord (https://discordapp.com/ [discordapp.com].) Another is JSPaint (https://jspaint.app/ [jspaint.app]).

    • by tlhIngan ( 30335 )

      However, AJAX was invented and found to be useful in some cases (updating a block of data without the entire page being loaded). However, the idea was taken to the extreme in this headless thing which is about to take over a large part of the web as we used to know it.

      Being a Drupal contributor, I saw this headless craze building up for years. It has taken over most large enterprise sites, while smaller ones did not bother.

      Which is easily solved if Google simply stopped trying to parse pages of javascript. T

      • by kbahey ( 102895 )

        Being a Drupal contributor, I saw this headless craze building up for years. It has taken over most large enterprise sites, while smaller ones did not bother.

        Which is easily solved if Google simply stopped trying to parse pages of javascript. Those sites would rapidly see their Google rankings fall as competitors with more accessible pages get better indexed.

        Seems like a problem that can solve itself in short order.

        Even the frameworks would adapt.

        In fact, all Google has to do is absolutely nothing - the furt

    • The disadvantages are many and, in my view, outweigh the advantages: high CPU usage on the client, no JS = no content, less security, content indexing issues (like what this article was about), and so on ...

      What's the problem with having a web page generating gateway to the contents?

  • Seems that Google has been so busy trying to influence politics that its search business is beginning to suffer. Good.
