Google: 60% Of The Internet Is Duplicate (seroundtable.com) 79
Google says "60% of the internet is duplicate." Gary Illyes from Google posted this slide at the Google Search Central Live in Singapore the other day.
80% of the internet is porn (Score:1)
So what does that leave?
Re: (Score:1)
Which, I guess, is why it is a story on
Re: (Score:2)
Slashdot is responsible for a big % of that. More, when we include slashdot duping itself.
Re: (Score:2)
There are a ton of sites that are literally duplicates in content; only the wrappers around them are different. E.g., ask a question that is perhaps suited for one site, such as stackoverflow, and you will get the original site in the top 10 hits, but the other 9 out of 10 hits have exactly the same question and the same answer word for word. You've got to get past the top ten before you find an alternative answer.
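For what it's worth, here is a minimal sketch of how "same content, different wrapper" pages could be counted. It is only an illustration, not how Google or any search engine actually measures duplication. The function names, the example URLs, and the crude normalization (lowercase, collapse everything that isn't a letter or digit) are all assumptions made up for this example.

```python
import hashlib
import re


def content_fingerprint(text):
    """Hash the text after lowercasing and collapsing non-alphanumeric runs.

    Hypothetical helper: a real crawler would strip page templates first.
    """
    normalized = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def duplicate_fraction(pages):
    """Fraction of pages whose fingerprint already appeared on an earlier page."""
    seen = set()
    dupes = 0
    for body in pages.values():
        fingerprint = content_fingerprint(body)
        if fingerprint in seen:
            dupes += 1
        else:
            seen.add(fingerprint)
    return dupes / len(pages) if pages else 0.0


# Made-up pages: one original, three scrapes of it, one unrelated question.
pages = {
    "https://original.example/q1": "How do I reverse a list? Use reversed().",
    "https://scraper-a.example/q1": "How do I reverse a list?   Use REVERSED().",
    "https://scraper-b.example/q1": "how do i reverse a list? use reversed()",
    "https://scraper-c.example/q1": "How do I reverse a list? -- Use reversed()!",
    "https://other.example/q2": "How do I sort a dict by value?",
}

print(duplicate_fraction(pages))
```

Exact hashing like this only catches copies that differ in case, spacing, or punctuation. Pages that reword or reorder the text would need a near-duplicate technique instead.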
Re: (Score:2)
I don't worry too much about that and if you are smart you'd realize that a lot of information is duplicated and you'll have to dig deeper.
The hard thing for users on the net is actually when the information you look for exists on only one single site.
Re: (Score:2)
Slashdot is now on second spot for me when I google 60% Of The Internet Is Duplicate [letmegooglethat.com].
(Excluding paid links)
Recursion win?
Except for cats. (Score:5, Funny)
Re: (Score:1, Troll)
Re: Except for cats. (Score:2)
Whereas humans are beneficial to all wildlife wherever they go, sans cats.
Re: (Score:2)
Re: Except for cats. (Score:1)
Indoor cats don't ravage wildlife, and barn cats are useful on farms for snake and mouse control. The problem is when there are outdoor neighborhood cats that perpetually breed unchecked. A well kept domestic cat will live a long healthy life without destroying local wildlife.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: Except for cats. (Score:2)
Re: (Score:2)
Only 60%? (Score:5, Funny)
/. hits its redundancy limit ... (Score:5, Insightful)
TFT: "60% Of The Internet Is Duplicate"
TFS: "60% Of The Internet Is Duplicate"
TFA: "60% Of The Internet Is Duplicate" (Tweet *and* photo)
Re: (Score:3)
Just wait until tomorrow, then your mind will truly be blown.
Re: (Score:2)
Just wait until tomorrow, then your mind will truly be blown.
When 60% of the front page will be duplicate?
Re: (Score:2)
Just wait until tomorrow, then your mind will truly be blown.
When 60% of the front page will be duplicate?
We already have sites where 60% of the articles some days are about the latest serial fuck-ups of Elon Musk.
Musk + Crypto Bros == George Carlin was right - 90% of all stuff is shit. Some days I think he was a bloody optimist.
Re:/. hits its redundancy limit ... (Score:4, Interesting)
We already have sites where 60% of the articles some days are about the latest serial fuck-ups of Elon Musk.
Except for here. Not a single story in the past 2 - 3 weeks on the shitshow which is Twitter after Musk blew $44 billion on it. For the record, it appears half of the biggest advertisers [cbsnews.com] have stopped advertising on the site since that pedo guy took over.
Re: (Score:1)
To be fair they're 4 different stories (Score:2)
Re: (Score:1)
Musk is screwing the pooch that hard that there's 4 distinct, newsworthy fuckups in 1 day. That has to be a Guinness record or somethin'.
I'm sure he's going "Hold my bong - I can do better!"
Eventually PETA is going to go after him for screwing the pooch so much.
He wants revenue from users to be equal to revenue from advertisers - and the way he's scaring off advertisers, he should be careful what he wishes for, because he looks like he's going to achieve that goal.
One upside - so-called "journalists" will have to do more than just scan twitter for their latest stories.
Re: (Score:2)
Re: (Score:2)
Re: (Score:1)
The linked [mediamatters.org] original listed article had a headline that says half, but a list of 49 advertisers, of which 7 have "issued a statement or was publicly reported as stopping its ads". Your post is a lie based on a lie based on a lie.
Re: (Score:1)
You should tweet that!
Re:/. hits its redundancy limit ... (Score:4, Interesting)
Except for here. Not a single story in the past 2 - 3 weeks on the shitshow which is Twitter after Musk blew $44 billion on it.
The current owners of Slashdot are cryptocurrency shills, and Musk is a crypto hero (with all the sarcasm that entails.) Just based on what does and doesn't get posted here, and what sources they use, not to mention what words have been put into the word filter, I suspect they are also right wingers. They literally have blocked both "Nazi" and "reich" in the past, and the name of their company is still in the filter.
Re: (Score:2)
That's why we need personal AI running locally. The model can be trained to be precise and civilised unlike the web. It will filter the web and give you only the good parts without the bad parts.
Re: (Score:1)
> George Carlin was right - 90% of all stuff is shit That's why we need personal AI running locally. The model can be trained to be precise and civilised unlike the web. It will filter the web and give you only the good parts without the bad parts.
I would hate that. You know the old saying - keep your friends close, and your enemies closer. I prefer to be aware of what the nutters are doing so when I encounter them in real life I can deal with them - usually with my variant of that saying - "keep your friends close, tell your enemies to fuck off."
Re: (Score:2)
When BeauHD doesn't realise msmash already posted this.
Re: (Score:2)
One random slide and lots of repetition (Score:2, Insightful)
As fluff pieces go, this is one of the fluffier ones.
But why post this here? Whitewashing the endless dupes because not only will these editors not edit, but they won't keep an eye on each other's postings either?
Also, failure to discern "internet" and "world-wide web". Google likes to pretend they're identical, and facebook likes to pretend they, one website, comprise the entire useful internet (witness "internet.org free basics"), but that doesn't make it so.
It just makes those parties all the more ar
Duplicate post? (Score:1)
Does posting this make you feel better about the constant duplicate posts here?
Re: (Score:2)
Does posting this make you feel better about the constant duplicate posts here?
Actually, yes. Not even half of slashdot articles are dupes, and if it’s not low information third party scrapes of legitimate news, two days late, and duped, I demand my money back.
Re: (Score:2)
Slashdot is responsible for 71% of the duplication :/
Re: (Score:2)
SEO spam (Score:5, Insightful)
--
We will soon have the option to harvest our farts, so we can post & comment on stats about them.
I've been saying that for years (Score:5, Interesting)
And what about getting AI generated pages that match search criteria but are pages of endless gibberish... nonsensical but syntactically correct?
Re: (Score:1)
Just look up phone install instructions. There are exact dupe pages where only the address and graphics are different... who copied who? And these guys reached the holy grail of SEO: fooling the algorithm into filling the top 10 results with their dupes and slurping up those delicious advertising dollars.
And what about getting AI generated pages that match search criteria but are pages of endless gibberish... nonsensical but syntactically correct?
If you don't buy, then all you're doing is causing them to waste money. And after a while, it's obvious what results are spam without even clicking on them. Unfortunately, it's obvious that plenty of people are still stupid enough to click on them, or Google would de-prioritize them and run stuff that DOES get action.
Re: (Score:2)
Figure 3.13 : https://arxiv.org/pdf/2005.141... [arxiv.org]
And (Score:3)
Re:And (Score:5, Funny)
And 90% of it is caused by Slashdot.
Re: (Score:2)
Re: (Score:1)
You just made 50%, not 90% of post duplication.
And (Score:4)
60% of google’s search results are spam or link farms. It really pisses me off when they return a link that’s just my terms but fed into Amazon or some other retailer.
Re: (Score:1)
60% of google’s search results are spam or link farms. It really pisses me off when they return a link that’s just my terms but fed into Amazon or some other retailer.
Remember the ebay search results for "white slaves"? [marketwatch.com] I warned the boss back then NOT to buy results for any and all terms really cheap, and showed him what happened with our ad feed when people typed in "buy white slaves." People ended up seeing "Buy white slaves now!" "White slaves for sale!" "Best deals on white slaves."
But nope, didn't put in filters, cost him the $800/day ebay account.
Google and ebay still do stupid shit - try searching google for "buy used tampon". ebay has 5,300 results.
Repost (Score:1)
Or, as the old saying goes: Every repost is a repost of a repost.
Why would this be seen as a problem? (Score:2)
Why would this be seen as a problem?
If website X disappears and the content is still available on website Y that's a good thing and the next guy looking for that content will be happy to find it there.
Fix it (Score:4, Funny)
I know, let's put the internet on the Blockchain!
Then there's archive.org (Score:4, Insightful)
So, given archive.org's charter, they should at some point account for ~50% of the internet just from the Wayback machine, and growing well beyond that as it archives old versions of no longer active pages (considering that many of its snapshots aren't that much different than earlier ones). Add to that its quasi-library function that most likely includes large overlaps with things like Youtube videos, and this hardly seems surprising. I'm actually surprised that the duplication isn't a lot more than 60%.
And I certainly wouldn't call this wasteful.
This story posted somewhere? (Score:2)
It's Like Deja Vu... (Score:1)
dead internet theory (Score:4, Insightful)
I think there's a new thing going on. Look up something absurd like
"eating rocks during christmas". This was the first hit on my list
https://www.kidsacookin.org/the-dangers-of-eating-rocks/
There are some great lines in there. Like
Eating rocks can have some negative consequences
and
We seem to have a tendency in the United States to place blame on others rather than ourselves. Due to this, there has never been an ice cream man in My Town for quite some time. I don’t think it’s a good idea to give a kid a baby oil bottle. If we want to keep babies safe, we might want to tattoo warning labels on them as soon as they leave the hospital.
Seriously?
Is the whole website prebuilt using GPT-3? Regardless, it's not really a website. It's an advertising trap, built with pure nonsense put together by an AI/program but targeted to capture search edge cases. Absurd searches have been monetized. Clever way to fool Google and Bing.
Re: (Score:2)
So, let's start with... (Score:2)
AMP and regular HTML pages for the same name.
Hmmm… (Score:2)
Makes no sense over 50% (Score:2)
10% must be a triplicate.
LMGTFY (Score:2)
No seriously, just Google it.
https://www.google.com/search?... [google.com]
Stop scanning the Internet Archive (Score:1)
Tech blogs (Score:2)
Does that mean Slashdot will repeat (Score:2)
It's called Wikipedia (Score:2)
WP is basically a huge cut and paste from other websites, generally without permission. No wonder Google likes it so much.
Goes both ways (Score:2)
Huh? Only 60%? (Score:2)
I'm not saying that this statement is true or not. (Score:2)
Duplicate (Score:1)
Even this article is a duplicate
blame search engine optimization (Score:2)
A million sites duplicating wikipedia content is trash designed to draw a few clicks. Basically half the internet is trash. Maybe stop interacting with trash and people will stop making so much of it.
Crime, fix, back up? (Score:2)
Yes, some stuff is stolen. There are a ton of very good, perfectly legal reasons to copy information, not just backup for data retention. How much of it is people de-sliding stuff, removing junk and keeping the good stuff? Or news articles that are licensed/sold to multiple 'news services' that may or may not be merely re-skinned local versions?
When Crypto drops like a lead stone do not be surprised if every single stock related website shows a YTD graph with them down 40%.
Google (Score:2)
Are 60% of Google results duplicates?
60% is probably low (Score:1)
I can't imagine an internet where anything was only available on a single site.
retweet, quote, excerpts, curated content.
my gut tells me 60% is a low figure.