'Massive Issues' Reported For Google's Indexing of JavaScript Content (searchenginejournal.com)
The way Google is indexing JavaScript content is "still a massive issue," reports Search Engine Journal:
As much as 60% of the JavaScript content is indexed within the first 24 hours after indexing HTML. But there is also bad news. As much as 32% of the tested pages have unindexed JavaScript content after one month, due to a variety of reasons...
Indexing delays can cause Google to take a lot more time in discovering newly added pages on your news website... If it takes ages for Google to index your JS-dependent product description, your competitors will be taking the top positions for prominent queries....
We also checked a random sample of URLs from popular ecommerce and news websites. On average, JavaScript content is not indexed in Google in 25% of these websites' pages. This is one of the results of the two waves of indexing. The second wave is not guaranteed. Indexing JavaScript can fail due to many reasons, or may not happen at all....
If you are using JavaScript for generating important content, you have to implement it wisely and keep it under control.
If (Score:5, Insightful)
you are an asshat.
Re: (Score:3)
If only there were a way to add meta info to web pages so the search engines didn't have to rely 100% on analyzing the content.
That, and a special file called "robots.txt" so that they could know what pages to visit without analyzing HTML events and JavaScript to see what happens.
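Both mechanisms the parent alludes to are simple to the point of triviality; a sketch (the domain and paths are placeholders):

```
# robots.txt, served from the site root: tells crawlers what to skip
User-agent: *
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
```

and, in each page's `<head>`:

```html
<meta name="description" content="A short, human-written summary of this page">
```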
Re: (Score:2)
It reminds me of the DVD days and trying to build a video app that worked properly with a mouse. Sometimes you'd have a full screen with click boxes, easy.
Other times each box you'd go to with the DVD remote arrow keys actually took you to an identical new screen but a different thing highlighted that you could then press Ok on. There was no knowledge of the other boxes except on their individual clone screen.
So if the latter, how do you know all the boxes to enable for a random access mouse click? Belie
Re: (Score:2)
Asking the spammers whether the content of their websites is spam isn't a good idea. Metadata has been ignored for at least a decade; any half-decent search engine that isn't 90% spam looks at the actual content of the site.
Re: (Score:2)
I don't think it's alarming at all. If your web page doesn't appear on Google because you've rendered all the content with Javascript, well boo hoo.
Re:If (Score:5, Interesting)
In a way you are right - and if search engines don't index JavaScript-generated content, then that habit might go away instead. On the other hand, many sites are probably forcing JavaScript-generated content in order to make their trackers work as well.
Re: (Score:2)
"if search engines don't index javascript generated content then that habit might go away"
This. The search engines should specifically *not* index any dynamically generated content.
The whole point of the web (imho) is to present the client with finished content ready to be rendered. Subsequent interactions may change that content (and this may involve JavaScript), but the initial delivery from the server should be complete and readable without JavaScript.
A related point: Allowing dynamic content generation
Re:If (Score:4, Informative)
Feature, not a bug.
"If you are using JavaScript for generating important content, you have to implement it wisely and keep it under control"
Graceful fallback has been an idea for most of JavaScript's lifetime. It shouldn't be a surprise.
Re: (Score:2)
Yep.
Also: Google publishes entire pages telling you how to make your website work with their search engine. Even if your site is 100% plain text, it's still not a good idea to let the Google bot just look at that and try to figure out what the site is about. You have to add extra info for the search engine.
Re: (Score:2)
They don't want graceful fallback because they are using Javascript as a kind of DRM. It makes it harder to scrape the site.
Now they are annoyed that it's harder for Google to scrape the site.
Re: (Score:2)
We sold parts and tools for car repair.
Re: (Score:2)
you are an asshat.
It depends on what you are trying to do.
There are literally accessibility tutorials from WCAG saying that you should generate stuff with JavaScript. Controls, not content, granted, but still.
Generating all or part of your page with JS is mainstream, and search engines may need to adapt rather than the other way round.
Minimize the use of JavaScript. (Score:5, Insightful)
Look at the Los Angeles Times [latimes.com] newspaper using the NoScript browser extension [noscript.net]. It's amazing how many other places to which you are expected to connect besides the Los Angeles Times.
Many web sites expect you to connect with GoogleAnalytics.com and other Google web sites such as GoogleTagServices.com, so Google follows you.
It would be very interesting to me to have a Slashdot story in which someone analyzed each additional web site to which the L.A. Times web site connects. Why are there so many?
Re: (Score:2)
This. Scripting is a gimmick used mostly for tracking. The term script kiddies comes to mind.
Re: (Score:3)
Plenty of frameworks such as Angular are built on JavaScript and won't function without it. They really cut down on the work you have to do to build your site, as opposed to starting out with vanilla JS and hacking together a home-grown framework. For most people, when you weigh getting a great-looking site up by leveraging existing frameworks and toolkits, but requiring JavaScript, against making your page less reliant on JavaScript, and requiring more work and time to get a result that is not as good-looking, it'
Duh! (Score:4, Insightful)
It's the "frameworks" and "toolkits" themselves that need to ditch scripting.
Same as for the ad industry. Take away the option of tracking and they'll stop using it.
Re: (Score:2)
It's the "frameworks" and "toolkits" themselves that need to ditch scripting.
I can concede this for a static document. But in a web-based whiteboard app, would you prefer to have to click every point along a curve instead of dragging the curve with your mouse, stylus, or finger? Or would you prefer to download a native app only to see "We're sorry! This app is not yet available for your device's operating system"?
Re:Minimize the use of JavaScript. -try HTML (Score:1)
Just ditch Javascript and frameworks and learn HTML and style sheets. OK, a few things might need scripting, but pure HTML will do most of what you want.
Re: (Score:2)
Yes, but a few things will need scripting...
There's nothing wrong with JavaScript in itself; there are many things that HTML doesn't do, and plenty of things that can really improve a page if they're dynamic and can respond to events (e.g. menu bars).
Re: (Score:2)
False dichotomy. You can make sites that degrade gracefully but still have bells and whistles. Instead of using a CMS, web developers are covering up their incompetence with database queries and caching by drawing the page empty and then filling in the content later. I took all the candy-coated bullshit out of my Drupal site to reduce memory consumption, but you can have all that crap and still literally fall back to working completely without JavaScript if you swallow your pride and use a CMS that handles it f
documents model (Score:5, Insightful)
maybe just maybe you should serve the actual content in the page and THEN style it with CSS and add useful features with javascript
NOT serve the whole thing as a javascript page with an abstract virtual DOM a la facebook...
it helps people as well as machines to make things clear....
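The "content first, then enhance" pattern the parent describes is plain progressive enhancement; a minimal sketch of the idea (the markup and class name here are illustrative, not from any particular site):

```html
<!-- The content is in the initial HTML, readable with JavaScript disabled -->
<ul id="menu">
  <li><a href="/news">News</a></li>
  <li><a href="/reviews">Reviews</a></li>
</ul>

<script>
  // Enhancement only: make the menu collapsible. If this script never
  // runs (or fails), the plain list above still works and still indexes.
  document.getElementById("menu").classList.add("collapsible");
</script>
```

A crawler that doesn't execute the script still sees the links; the script only improves presentation.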
Ironic (Score:3, Interesting)
The fact that Google is even able to index JavaScript content in the first place is only possible thanks to Google Chrome, which many people are concerned about, as Google ships it as the default browser on Android devices and, IIRC, it is responsible for over 50% of web traffic now. However, this means sites designed for Chrome will also work properly with Google's indexing tools.
Of course at the end of the day an indexing tool can't use human reasoning to identify and click on links to see where they take it; if the web developers aren't careful it's easy to make web links that an indexing tool won't follow, for example by not using an anchor tag and navigating via JavaScript. If you don't design your web page to be indexing-friendly, don't be surprised when it's not indexed properly. That's likely what is happening here.
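For what it's worth, whether a bot can follow a link usually comes down to whether there's a real anchor with an href to discover; a sketch (URLs are placeholders):

```html
<!-- Crawlable: a plain anchor; the indexer can read the URL directly -->
<a href="/products/widget-42">Widget 42</a>

<!-- Hard to crawl: navigation exists only inside script, so there is
     no href for the indexer to extract without executing the page -->
<span onclick="window.location = '/products/widget-42'">Widget 42</span>
```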
All of this will improve with webasm (Score:5, Funny)
Shortcuts (Score:2)
If you don't have time to do it right the first time, what makes you think you have time to do it over?
What day is it? (Score:5, Funny)
Are we pissed at Google for sucking up all information or are we pissed because they're not getting enough or getting it fast enough?
Whole lotta luddites here (Score:2)
they don't get it (Score:1)
web pages suck for software... I can't believe these guys have no idea what SaaS requires. Do the bulk of these guys even develop software any more? I just came back after a few years away, wtf happened to this community? I mean, hating on javascript is fine... but arguing that the web should be static pages... what?
The browser is where we write client components for distributed software now.
Also there's a bunch of basement fascists around.
Re: (Score:2)
Do the bulk of these guys even develop software any more?
Many do, but that's no consolation when you cannot use an application that you want to use because it happens to have been made for an operating system other than the one your device runs. JavaScript is the only way I know to provide one client-side executable artifact that runs on all five of X11/Linux, macOS, Windows, Android, and iOS, and is not subject to Apple Inc. censorship.
Current craze of Headless Content Management (Score:2)
The problem here is the current ongoing craze in the content management sector, called headless content management [wikipedia.org], also called Decoupled Architecture for Content Management [pantheon.io].
Under this scheme, the backend is just a RESTful API serving only structured data (usually JSON). The front end is written in a JavaScript framework (AngularJS, Vue, etc.).
The advantage is that you can hire someone who knows that front end, and knows nothing about the backend (so it can be Drupal or WordPress, or something custom).
Th
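A rough sketch of the decoupled pattern described above, assuming a hypothetical JSON endpoint returning `{ title, body }` (real frameworks do far more, but the shape is the same):

```javascript
// Pure render step: turn the API's JSON into markup. Kept as a plain
// function so it has no dependency on any particular framework.
function renderArticle(data) {
  // Escape HTML special characters so text from the API can't inject tags.
  const esc = (s) =>
    String(s).replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
  return `<article><h1>${esc(data.title)}</h1><p>${esc(data.body)}</p></article>`;
}

// In the browser, the front end would do roughly this (endpoint is hypothetical):
//   fetch("/api/articles/42")
//     .then((r) => r.json())
//     .then((data) => {
//       document.getElementById("app").innerHTML = renderArticle(data);
//     });

console.log(renderArticle({ title: "Hello", body: "A & B" }));
```

Until the fetch and render have run, the page body is empty, which is exactly why the indexing problems described in the article show up.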
Re: (Score:2)
It's not necessarily a bad thing.
With plugins that disable JavaScript by default and explicitly block the worst offenders (tons of ads, GoogleTagManager, Google Analytics, Facebook tags, etc.), there are plenty of sites that also end up blocked because they require JS to load any content. I've discovered that nearly every time, they're sites I probably don't want anyway.
Some examples:
News sites that are owned by Sinclair use the pattern, and their pages show up blank. This is a win.
Most
Re: (Score:2)
If there is a site that shows up blank and I know I want the content, I can whitelist it. Usually it isn't worth it, so this is a win.
What criteria do you apply to determine whether a JS-driven web application is worth it? One example is the web-based chat platform Discord (https://discordapp.com/ [discordapp.com].) Another is JSPaint (https://jspaint.app/ [jspaint.app]).
Re: (Score:2)
Which would be easily solved if Google simply stopped trying to parse pages of JavaScript. T
Re: (Score:2)
The disadvantages are many but, in my view, outweigh the advantages: high CPU usage on the client, no JS = no content, less security, content indexing issues (like what this article was about), and so on ...
What's the problem with having a web page generating gateway to the contents?
Hmmmmn (Score:1)