Cloudflare Says It Blocked 416 Billion AI Scraping Requests In 5 Months
Cloudflare says it blocked 416 billion AI scraping attempts in five months and warns that AI is reshaping the internet's economic model -- with Google's combined crawler creating a monopoly-style dilemma where opting out of AI means disappearing from search altogether. Tom's Hardware reports: "The business model of the internet has always been to generate content that drives traffic and then sell either things, subscriptions, or ads," [Cloudflare CEO Matthew Prince] told Wired. "What I think people don't realize, though, is that AI is a platform shift. The business model of the internet is about to change dramatically. I don't know what it's going to change to, but it's what I'm spending almost every waking hour thinking about."
While Cloudflare blocks almost all AI crawlers, there's one particular bot it cannot block without affecting its customers' online presence -- Google. The search giant combined its search and AI crawler into one, meaning users who opt out of Google's AI crawler won't be indexed in Google search results. "You can't opt out of one without opting out of both, which is a real challenge -- it's crazy," Prince continued. "It shouldn't be that you can use your monopoly position of yesterday in order to leverage and have a monopoly position in the market of tomorrow."
fuck this guy (Score:2)
Re: (Score:3, Informative)
Oh you sweet summer child. The internet had a business model the second it was accessible outside of ARPANET.
Also no. I'm not sure if you stole your dad's Slashdot account, but no the best internet ever created was not preserved in academic wonderland. The internet was far more useful in the early 2010s than at any time in the past and long after it was commercialised. Is it going downhill? Yep. But does that mean it was better in its infancy? Hell no. It was fucking useless back then. There's far more info
Re:fuck this guy (Score:4, Interesting)
And 2010 coincides with the ascendency of the smartphone as the primary device most people use to access the internet. It's not the sole reason, but it does play a big part.
Re:fuck this guy (Score:4, Interesting)
There's far more information available now than there ever was.
You didn't have gopher?
The Internet was, and still is, more than http/https. But that's only for people who know their way around it. The rest of you are like the fat kid on a boy scout camp-out when the bear (Google) arrives. As long as you keep it fed, we'll be safe.
Re: (Score:1)
You didn't have gopher?
I did. You know who didn't? Most people. Which meant the Internet was largely a source of information by a handful of nerds for a handful of nerds covering little more than a handful of nerd topics.
Yeah, great, you can find out how to compile a new Linux kernel. Whoop-de-fucking-do. Where's a Dutch language video guide to how to replumb your shower? I could find great information to help my hobby of amateur radio, but how does it help the wife who has an interest in creating fancy cakes? It doesn't, because
Re: (Score:2, Insightful)
Re: (Score:2)
The internet didn't always have a business model. Started out as an academic wonderland.
The internet was created by the military to allow university researchers to collaborate on weapon research.
Prediction (Score:2)
Google will get fined in the EU in 2026.
Re: (Score:2)
if you can't innovate, regulate
Anti-trust violations are not innovation. But thanks for writing the most stupid thing I've read this year, which is impressive given how low the bar already was. I see why you post AC.
AI scrapers should be illegal (Score:5, Interesting)
Re: (Score:2, Insightful)
Agreed. If someone wants a movie or music or software, they can request it from the owners and can only use the item they were specifically given permission to use. Otherwise, they have to pay for it.
Re: (Score:3)
Re: AI scrapers should be illegal (Score:2)
Re:AI scrapers should be illegal (probably) (Score:3)
Re: (Score:2)
Re: (Score:2)
Re: AI scrapers should be illegal (probably) (Score:2)
Perhaps the ownership-over-information ship has sailed?
Re: (Score:1)
Perhaps the ownership-over-information ship has sailed?
I think, on balance, copyright protects the little guy more than the big guy. Without it I can see even more concentration of power in shameless marketing conglomerates.
Re: (Score:2)
Just block Google entirely (Score:2)
Re: Just block Google entirely (Score:2)
China copes just fine without Google.
Is this only Google? (Score:3)
The search giant combined its search and AI crawler into one, meaning users who opt out of Google's AI crawler won't be indexed in Google search results
I can think of another company which has a search engine and is also one of the "big boys" when it comes to AI. Do Microsoft do their own AI crawling, or is it sufficiently separated from Bing's that Cloudflare can tell them apart (so far)?
A second angle is that Google's search engine is their original business. How tolerant would their user base be if Cloudflare blocked their crawler worldwide? That war would hurt the sites depending on Google for their traffic, but it would hurt Google as well. The DOJ was already going after the company (not sure if that's still a "thing" under Trump), but their machinations take years anyway. As thegarbz points out [slashdot.org], the EU is more likely to get involved, and they sometimes move faster.
Re: (Score:2)
Do Microsoft do their own AI crawling
I thought Microsoft just shovel money at OpenAI. Do Microsoft even do any of their own model training at all?
416e9, really? (Score:2)
Were all of those really AI bots?
What about all those times that I went to a web site, got a stupid captcha, said fuck it, and went to a different site?
Re: (Score:2)
One of the browsers I use cannot handle Cloudflare captchas. I'm not sure if it's down to browser sniffing or some incompatibility, but at that point I either say fuck it or copy the link into another browser.
Re: (Score:2)
Does your browser advertise itself as a web crawler? But let's math this shit!
You are a human. I suspect when you browse it takes you a good 10-20 seconds to make a connection, wait, receive a captcha, re-evaluate and context switch, and attempt to switch to another website. That's conservatively 6 attempts per minute. I assume you need to eat, drink, shower, shit and sleep so let's say you spend every other moment constantly battling Cloudflare like a lunatic even professionally while you work. That's 16 h
Re: (Score:2)
The Cloudflare captchas just loop back to themselves for me on over 90% of sites that use them. But a few work.
I believe they're selling an anti-adblock service to sites, and pretending everybody is blocked for a different reason, through creative categorization.
I'll bet some of those (Score:2)
Aren't scraping requests. I keep getting blocked by Cloudflare and it sucks; it happens on probably 1 out of 50 sites, on multiple machines. They still can't distinguish between power users and AI.
Why not create a standard to deliver to AI (Score:1)
Re: (Score:3)
That's great. You get all my competitors together to follow this standard with a minimal amount of opt-in data and I'll just keep hoovering up the entire internet. You see why my competitors may not be so keen on your standard idea?
\o/ (Score:1)
Hello, which planet are YOU from?
No it didn't (Score:1)
It blocked me from accessing half the web with "are you a human" loops.
Blocks users too! (Score:2)
... And it's f'ing annoying!
Bogus claims. (Score:1)