
Cloudflare Flips AI Scraping Model With Pay-Per-Crawl System For Publishers (cloudflare.com)

Cloudflare today announced a "Pay Per Crawl" program that allows website owners to charge AI companies for accessing their content, a potential revenue stream for publishers whose work is increasingly being scraped to train AI models. The system uses HTTP response code 402 to enable content creators to set per-request prices across their sites. Publishers can choose to allow free access, require payment at a configured rate, or block crawlers entirely.

When an AI crawler requests paid content, it either presents payment intent via request headers for successful access or receives a "402 Payment Required" response with pricing information. Cloudflare acts as the merchant of record and handles the underlying technical infrastructure. The company aggregates billing events, charges crawlers, and distributes earnings to publishers.
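
The exchange described above can be sketched as a toy simulation with no network involved. This is only an illustration under stated assumptions, not Cloudflare's implementation: the header names, the `POLICY` table, and the `serve` and `crawl` helpers are all hypothetical. It shows the three publisher choices (free, paid, blocked) and the two ways a crawler can signal payment intent: retrying after a 402 quote, or declaring a ceiling up front.

```python
from decimal import Decimal

# Hypothetical header names, chosen for illustration only.
PRICE = "crawler-price"              # sent by the server with a 402
EXACT_PRICE = "crawler-exact-price"  # crawler accepts a quoted price
MAX_PRICE = "crawler-max-price"      # crawler declares a ceiling up front
CHARGED = "crawler-charged"          # server confirms what was billed

# Hypothetical per-path policy: a price string, "free", or "block".
POLICY = {"/news": "0.01", "/about": "free", "/private": "block"}


def serve(path, headers):
    """Toy origin: returns (status, response_headers, body)."""
    rule = POLICY.get(path, "free")
    if rule == "free":
        return 200, {}, "content of " + path
    if rule == "block":
        return 403, {}, ""
    price = Decimal(rule)
    exact = headers.get(EXACT_PRICE)
    ceiling = headers.get(MAX_PRICE)
    # Payment intent is valid if it matches the price exactly
    # or if the declared ceiling covers the price.
    agreed = (exact is not None and Decimal(exact) == price) or (
        ceiling is not None and Decimal(ceiling) >= price
    )
    if agreed:
        return 200, {CHARGED: str(price)}, "content of " + path
    # No valid payment intent: quote the price with a 402.
    return 402, {PRICE: str(price)}, ""


def crawl(path, budget):
    """Reactive flow: request, read the 402 quote, retry if affordable."""
    status, resp, body = serve(path, {})
    if status == 402:
        quote = Decimal(resp[PRICE])
        if quote > Decimal(budget):
            return status, Decimal("0"), ""  # too expensive, give up
        status, resp, body = serve(path, {EXACT_PRICE: str(quote)})
    return status, Decimal(resp.get(CHARGED, "0")), body
```

In this sketch, `crawl("/news", "0.05")` pays the quoted price after one extra round trip, while a crawler that sends the ceiling header on its first request skips the 402 entirely. Prices are handled as `Decimal` values rather than floats so that comparisons of small amounts stay exact.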

Alongside Pay Per Crawl, Cloudflare has switched to blocking AI crawlers by default for its customers, becoming the first major internet infrastructure provider to require explicit permission for AI access. The company handles traffic for 20% of the web and more than one million customers have already activated its AI-blocking tools since their September 2024 launch, it wrote in a blog post.

  • by xack ( 5304745 ) on Tuesday July 01, 2025 @02:10PM (#65489294)
    All because you couldn't behave, now we have the equivalent of the TSA for the internet. Expect browsers like Pale Moon, Ladybird and Seamonkey to get put on the wrong side of the wall again. Expect adblocker users to get hit by this soon too, as now they have the technology.
    • I did see a lot more Cloudflare prompts lately on sites which had no issue with Pale Moon before.

    • This is a good thing. The internet is nowhere near the net benefit to humanity that we all thought it would be 30 years ago. At this point it's turned out to be a slight negative.

      Anything that cuts down on excessive usage globally is worth a small amount of pain. Think of it like going to the gym. It's not gonna kill you to go outside and enjoy the fresh air.

  • Someone already developed AI tar pit code for websites. When it detects bot traffic or robots.txt violations, it keeps creating more and more dummy pages in an infinite chain and slows the server's response time to something like 750ms. It traps the bots there forever without using much bandwidth.
  • No doubt Meta[stasize], Google, OpenAI and all other major AI shops will whine about having to pay for anything and conjure up some reasoning why this system is illegal because reasons and sue Cloudflare to tie them up in litigation - so my question is: when is that happening?

    • I actually think that's the way to go. There needs to be a law about whether AI is allowed to learn from content on the same terms as a living individual, or not. Then there also need to be technical means to enact whatever policy is legislated, which is where this Cloudflare technology could fit in.
      • There needs to be a law about whether AI is allowed to learn from content on the same terms as a living individual, or not.

        No. There needs to be a law about whether AI companies can suck up content without paying for it, for commercial purposes. The word "learning" is mathematically meaningless.

    • will whine about having to pay for anything

      You mean like people on here and elsewhere who brag about stealing music/movies/software because they don't want to pay?

      If it's okay for you to steal someone else's work, why is it not acceptable for these companies to scrape available content?
      • by Sebby ( 238625 )

        will whine about having to pay for anything You mean like people on here and elsewhere who brag about stealing music/movies/software because they don't want to pay?

        "brag" != "whine"

        If it's okay for you to steal someone else's work

        WTF here said that in this discussion. Provide citations.

        why is not acceptable for these companies to scrape available content?

        Comparing apples to oranges.

        Multi-billion$ AI companies scrape content, then repeatedly sell access to services that use that content at scale without compensation to the creators, without whose content those companies would have nothing to offer in the first place.

        Quite different than some individual "stealing" a song for their own use (sure there's some level of deprivation of funding to the creator, but they're not making money of

      • We're moving on from that argument. The industrial speed and scale of data harvesting is the difference.
  • I'd imagine most of the AI crawlers out there have already ingested most of the freely available content. I don't imagine there are many places with new content that haven't already struck up some agreement with the scrapers. For instance, Ars Technica has.

    Had this existed three years ago, it might've been interesting.

    That said, there's something missing. There are two ways a crawler can work. Either they request content, get told a price and have to reconnect and agree to the price OR they can declare in ad
    • by allo ( 1728082 )

      That's an interesting and dangerous idea. Do you know that when you allow ad scripts, different advertisers "bid" for the space in your browser? They try to determine who you are and what you may buy (that's why there is so much tracking), and then a high-frequency bidding system decides who is willing to pay the most for the ad space.

      Now imagine a system in which site and bot bid on the price for the access. "You want to be included on the user's result page? For 3 cents you go there, I also get 100 pages

  • how is big tech supposed to profit from stealing your content if they have to pay for it?
  • Formerly it was like: AI might hurt itself as it gets more and more AI-generated input.
    Now it's like: AI will only get AI-generated content.

    • But not the spam behind cloudflare...

      • by allo ( 1728082 )

        But especially the spam behind Cloudflare.

        If a site can charge per access without revealing beforehand what the content is, what will they do? They will create clickbait for bots. Generate a page that makes the outgoing links look worth paying for, then hope the bot pays for access.
        Any bot owner who agrees to pay for accessing content makes himself a target for such spam/scams.

  • The reason we have so much malicious content generation and spam is that there is no cost to generate and send.

    Complete lack of friction leads to where we are: an unusable unsearchable internet floating in sewage.

    I'm warming up to a system of anonymous micro payments to access whatever. Email should have a nominal fee, like a stamp on an envelope, and spam would disappear overnight.

    The obvious flaws in that are needing a new protocol probably, and that the micropayment system itself is subject to funny busi
  • Any idea how Cloudflare distinguishes between search engine crawlers and AI crawlers?
    Presumably there are sites that want to be scanned by the former but not the latter.

    • by ipX ( 197591 )

      Any idea how Cloudflare distinguishes between search engine crawlers and AI crawlers?

      Yes, and it's extremely concerning to say the least. I read everything I could about this today and it's called web bot auth: https://developers.cloudflare.... [cloudflare.com]
