
Wikipedia Signs AI Licensing Deals On Its 25th Birthday (apnews.com) 51

Wikipedia turns 25 today, and the online encyclopedia is celebrating that with an announcement that it has signed new licensing deals with a slate of major AI companies -- Amazon, Microsoft, Meta Platforms, Perplexity and Mistral AI. The deals allow these companies to access Wikipedia content "at a volume and speed designed specifically for their needs." The Wikimedia Foundation did not disclose financial terms.

Google had already signed on as one of the first enterprise customers back in 2022. The agreements follow the Wikimedia Foundation's push last year for AI developers to pay for access through its enterprise platform. The foundation said human traffic had fallen 8% while bot visits -- sometimes disguised to evade detection -- were heavily taxing its servers.

Wikipedia founder Jimmy Wales said he welcomes AI training on the site's human-curated content but that companies "should probably chip in and pay for your fair share of the cost that you're putting on us." The site remains the ninth most visited on the internet, hosting more than 65 million articles in 300 languages maintained by some 250,000 volunteer editors.


  • The people who donated time and money to this half-assery are chumps. And that's before talking about Wikia.
    • Re:Sellouts (Score:5, Insightful)

      by thoriumbr ( 1152281 ) on Thursday January 15, 2026 @11:58AM (#65926380) Homepage
      Servers are expensive, bandwidth is expensive, and Wikipedia draws massive amounts of human and bot traffic. They are a non-profit organization and provide their services for free. AI companies and their bots are hammering their servers all the time, increasing costs. Why should they not charge?

      I donated and keep donating here and there, and I don't mind their new policy. Unless they allow AI bots to pollute Wikipedia with slop, I am fine.
      • by Sloppy ( 14984 )

        My understanding is that the entirety of Wikipedia is only about 60 GB and is conveniently downloadable. Anyone ought to be able to download a local mirror to use, instead of hammering Wikipedia's servers, and doing so might be faster for the consumer, anyway.

        And in a world where hundreds of millions of mainstream users stream video, I'm not sure bandwidth really is expensive anymore. To us old-timers, the numbers today are just astonishing. I almost can't believe I used to worry so much about efficiency ..

        • by narcc ( 412956 )

          I almost can't believe I used to worry so much about efficiency .. of .. anything.

          That attitude is one of the primary reasons why software is so bad these days. It's not even about optimization; you can get significantly better performance just by not doing stupid things.

        • My understanding is that the entirety of Wikipedia is only about 60 GB and is conveniently downloadable. Anyone ought to be able to download a local mirror to use, instead of hammering wikipedia's servers, and doing so might be faster for the consumer, anyway.

          The size of Wikipedia **text** is 156 GB [wikipedia.org] but increases to 26 TB when all revision history is included. When Wikimedia Commons is also considered, the size balloons to 585 TB [wikipedia.org]. So the complete repository is far too large for a home user to store.
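
          That said, the text dumps really are practical to process locally. A minimal sketch of streaming page titles out of a MediaWiki XML dump without loading it into memory (the dump file name is illustrative, and the parser deliberately ignores whatever XML namespace the export declares):

```python
# Sketch: stream page titles from a MediaWiki XML export without
# loading the whole dump into memory at once.
import bz2
import io
import xml.etree.ElementTree as ET

def iter_page_titles(fileobj):
    """Yield the text of each <title> element, ignoring XML namespaces."""
    for _, elem in ET.iterparse(fileobj, events=("end",)):
        # Tags arrive as "{namespace}title"; strip the namespace prefix.
        if elem.tag.rsplit("}", 1)[-1] == "title":
            yield elem.text
            elem.clear()  # free memory as we go

# Against a real dump it would look something like (file name illustrative):
# with bz2.open("enwiki-latest-pages-articles.xml.bz2", "rb") as f:
#     for title in iter_page_titles(f):
#         ...
```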

          AI bot traffic is estimated to be around 65% of all traffic now. However, that traffic is not just incremental, i.e., it doesn't just triple the hardware costs. The vast majority of the repository is in cold storage because it is normally never viewed. However, the AI bot

    • Is Wikipedia not of benefit for the public good? If you donated time (or money) to stroke your own ego that's fine too, but anyone that decides to take their ball and go home after this announcement obviously wasn't in it for the public interest to begin with - to those I say good riddance.
      • by wed128 ( 722152 )
        This seems predicated on the assertion that AI is a public good, and not another means of centralizing power and reducing the workforce. These companies do not have your best interest at heart.
      • I'm not sure why you think your comment has anything to do with AI using Wikipedia's content to help in its mission to essentially destroy the web as an information medium.

        • My comment doesn't apply to AI destruction at all. I was addressing human contributors to Wikipedia who will be offended that their 'work' will now be used by AI and will want to take their ball (undo their contributions) and go home.
    • Wikipedia is charging AI companies a fee for access, as opposed to the slop companies abusing the servers for free. I really don't see the problem.

  • Wonder how well that will go down with the editors.
    • by allo ( 1728082 )

      The editors know the license they need to use when writing for Wikipedia.

      This is also most likely not about articles. You can download a Wikipedia dump if you want to train on it, and datasets prepared from these dumps are already on the usual sites. This is about Wikimedia Commons, which is A LOT more data. Better to have a company pay for a direct download than have an inefficient bot crawling the same content. The license allows both, but the crawler causes more load on the server than allowing a direct download would.

  • by sabbede ( 2678435 ) on Thursday January 15, 2026 @11:34AM (#65926282)
    If the AI companies need to keep checking wikipedia, why not just use some of their massive storage to cache the damn thing and stop hammering the servers? What's with this attitude of "we'd rather check the server a thousand times a second than remember what we just read"?
    • Because that would make their blatant theft of human knowledge be even more obvious copyright infringement, I guess... At least now they're paying for it.

      • Well, it's not theft if it's already free to everyone. But maybe copyrights are the reason they don't cache instead. Since the hosts are annoyed about being hammered by the AI companies, wouldn't it make more sense for the hosts to tell them, "you can cache, but you cannot constantly scrape"?
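
      The "you can cache, but you cannot constantly scrape" arrangement is essentially what HTTP conditional requests already provide. A minimal sketch of the revalidation idea (the fetch callable here is a stand-in for a real HTTP client, not any crawler's actual API):

```python
# Sketch: a revalidating cache. Instead of re-downloading a page on every
# check, resend the ETag seen last time; a 304 reply means "unchanged,
# serve your cached copy", which costs the host almost nothing.
class RevalidatingCache:
    def __init__(self, fetch):
        # fetch(url, etag) -> (status, etag, body or None); HTTP stand-in.
        self._fetch = fetch
        self._store = {}  # url -> (etag, body)

    def get(self, url):
        etag, body = self._store.get(url, (None, None))
        status, new_etag, new_body = self._fetch(url, etag)
        if status == 304:  # not modified: reuse the cached copy
            return body
        self._store[url] = (new_etag, new_body)
        return new_body
```

      A well-behaved crawler doing this would serve the vast majority of checks from its own cache, since most articles change rarely.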
    • by gweihir ( 88907 )

      Bandwidth is cheaper than storage. Apparently.

      • Well, for the AI companies maybe. They don't bear the costs it places on the hosts. I would think the hosts would have the leverage to force the AI companies to stop if they allowed them to cache instead.
        • by gweihir ( 88907 )

          Indeed. These people are completely without any concern for what damage they do to others. Extreme greed at work or, worse, some "messiah complex".

    • by allo ( 1728082 )

      This is likely about Wikimedia commons (images, videos, etc.) which is a lot more data than the text dumps.

  • Reasonable (Score:5, Insightful)

    by gurps_npc ( 621217 ) on Thursday January 15, 2026 @11:36AM (#65926288) Homepage

    Wikipedia has terms of service that mean you give up limited rights when you contribute to it. As such, they decide whether AI gets access to it.

    Being a free service to the general public, it is totally reasonable to charge special users for access.

    This is a great way to fund the general public's use, especially considering how the AI community has generally disregarded authors' rights. Better to charge them up front.

    • by gweihir ( 88907 )

      Agreed. And without that deal, the slop-makers would just steal everything anyways.

      • by evanh ( 627108 )

        Which the slop-makers already had done, of course, with the resulting crawlers creating a significant financial burden along the way.

        My guess is the Wiki is now being piped direct as changes happen. Like a rolling live simulcast. Then those crawlers go away and the burden of bandwidth and server costs lift.

        • by gweihir ( 88907 )

          Yes. Because the crawlers completely ignore all standards of good Internet behavior, and the legal system is incapable of stopping their crap, this is basically self-preservation: at least it gets rid of the network load.

      Being a free service to the general public, it is totally reasonable to charge special users for access.

      More importantly, AI companies are going to scrape the site whether it's allowed or not. What this does is give them firm legal standing to show that companies doing so are causing them financial losses.

      • Thank you.
        I'm sure they scraped most of Wikipedia already; anybody has been able to get an offline copy of Wikipedia for many years. It would be the first dataset I'd use to train something that large. I've already used it simply to get a list of the most commonly used English words. This is about new content, being hammered by bots, money, and legal issues.

    • Generally speaking, no, it's not a great idea to make a deal with organizations that intend to ensure nobody interacts with your website ever again.

      Wikipedia works because its readers are also its editors, and because its editors expect to have an impact. When you put an AI wall between your content and the readers, nobody visits, nobody has any reason to update a page, and the website dies a death.

      Again, I still don't understand why Slashdotters, of all people, have such difficulty understanding that peopl

  • It's finally happening, people! Surely they'll fork Wikipedia THIS time!
  • by gweihir ( 88907 ) on Thursday January 15, 2026 @11:41AM (#65926302)

    Otherwise the AI pushers would just steal everything and cause damage to availability on top of that.

  • Wikipedia is completely unbiased! It’s crowd-sourced! And the crowd only uses approved sources:

    https://en.wikipedia.org/wiki/... [wikipedia.org]

    With Wikipedia, our AIs are guaranteed to be benign overlords!

    Sleep well.

    • A fan of "doing your own research" I take it.

      • A fan of "doing your own research" I take it.

        Is that sarcasm? Not a fan of “doing your own research”. We can’t trust the plebes!

        I spent thirty years in high tech - much with direct Bell Labs lineage - and the “do your own research” crowd? Morons.

        Be better!

  • by Meneth ( 872868 ) on Thursday January 15, 2026 @11:53AM (#65926366)
    This seems to be a violation of the CC-BY-SA license granted to Wikipedia by its editors, since the AI companies do not give proper attribution in the output of their models. (Given the vast volume of sources each model has ingested, it would be impractical to do so.)
  • Wikipedia is going to look like Reddit soon if they aren't diligent. Wikipedia has a lot of power-tripping gatekeepers.
  • Aren't we all just sitting around waiting for a handful of people to own every fucking thing on this planet.

  • So, your brain has rotted to the point where you cannot use a browser? Then use AI, so it can rot further. : P
  • by xack ( 5304745 ) on Thursday January 15, 2026 @01:15PM (#65926586)
    I came across Wikipedia very early on, when it had barely 40,000 articles. I was a regular contributor for about two years, from 2004 to 2006, before I gave up because of disputes and became a vandal. Wikipedia is still growing, even though notability rules limit true growth, and that non-notable content is routinely monetized by Fandom and Knowyourmeme. Even though I am banned from Wikipedia and am not sorry for vandalising it, I'm still impressed by what they have built. Wikipedia has made freely available what academic journals and newspapers lock behind paywalls, and I respect that.

    Wikipedia's content is traceable through its history, while AI just spits out whatever the LLM computes. Elon Musk's disaster of Grokipedia shows what can go wrong when AI tries to build an encyclopedia from a Nazi point of view; Wikipedia remains inherently more trustworthy.
