Open Source Registries Join Linux Foundation Working Group to Address Machine-Generated Traffic (zdnet.com)
Under the nonprofit Linux Foundation, "a new Sustaining Package Registries Working Group will seek to identify concrete funding, governance, and security practices," reports ZDNet, "to keep code flowing as download counts grow.... Because software builds, continuous integration pipelines, and AI systems hammer registries at machine speed rather than human speed, the sites can't keep up.
"That growth has brought a surge in bot traffic, automated publishing, security reports, and outright abuse, exposing what the working group bluntly calls a 'sustainability gap'." Sonatype CTO Brian Fox, who oversees the Maven Central Java registry, estimates open-source registries saw 10 trillion downloads in 2025. And "The same pattern is appearing across ecosystems. More machine traffic. More automation. More scanning. More expectations around uptime, integrity, provenance, and policy enforcement. More cost. More support burden. More dependency on infrastructure that the industry still talks about as though it runs on goodwill and spare time."
ZDNet reports that "To tackle that, Sonatype has teamed up with the Linux Foundation and other package registry leaders, including Alpha-Omega, Eclipse Foundation (OpenVSX), OpenJS Foundation, OpenSSF, Packagist, Python Software Foundation, Ruby Central (RubyGems), and the Rust Foundation (Crates)." The idea is to give operators a neutral forum to discuss money, governance, and shared operational burdens openly. Once that's dealt with, they'll coordinate how to explain those realities back to companies and organizations that have long assumed registries are "free." No, they're not. They never were. As the Linux Foundation pointed out, "Registries today run primarily on two things: (1) infrastructure donations and credits; and (2) heroic efforts from small paid teams (themselves funded by donations and grants) and unpaid volunteers that operate and maintain registry services. The bulk of donations and grants comes from a small set of donors and doesn't scale with demands on the registry."
The working group is explicitly positioned as a venue where registry leaders and ecosystem stakeholders can align on "practical, community-minded" ways to sustain that infrastructure, rather than each operator improvising its own survival plan in isolation.
ZDNet says the group will also coordinate security practices and information, and craft frameworks "that make it politically and legally possible to introduce sustainable funding models without fracturing communities." And they will also "align messaging and educational content so developers, companies, and policymakers finally understand what it costs to run these services."
"That growth has brought a surge in bot traffic, automated publishing, security reports, and outright abuse, exposing what the working group bluntly calls a 'sustainability gap'." Sonatype CTO Brian Fox, who oversees the Maven Central Java registry, estimates open-source registries saw 10 trillion downloads in 2025. And "The same pattern is appearing across ecosystems. More machine traffic. More automation. More scanning. More expectations around uptime, integrity, provenance, and policy enforcement. More cost. More support burden. More dependency on infrastructure that the industry still talks about as though it runs on goodwill and spare time."
ZDNet reports that "To tackle that, Sonatype has teamed up with the Linux Foundation and other package registry leaders, including Alpha-Omega, Eclipse Foundation (OpenVSX), OpenJS Foundation, OpenSSF, Packagist, Python Software Foundation, Ruby Central (RubyGems), and the Rust Foundation (Crates)." The idea is to give operators a neutral forum to discuss money, governance, and shared operational burdens openly. Once that's dealt with, they'll coordinate how to explain those realities back to companies and organizations that have long assumed registries are "free." No, they're not. They never were. As the Linux Foundation pointed out, "Registries today run primarily on two things: (1) infrastructure donations and credits; and (2) heroic efforts from small paid teams (themselves funded by donations and grants) and unpaid volunteers that operate and maintain registry services. The bulk of donations and grants comes from a small set of donors and doesn't scale with demands on the registry."
The working group is explicitly positioned as a venue where registry leaders and ecosystem stakeholders can align on "practical, community-minded" ways to sustain that infrastructure, rather than each operator improvising its own survival plan in isolation.
ZDNet says the group will also coordinate security practices and information, and craft frameworks "that make it politically and legally possible to introduce sustainable funding models without fracturing communities." And they will also "align messaging and educational content so developers, companies, and policymakers finally understand what it costs to run these services."
I'm curious what the response will be. (Score:1)
Re: (Score:2)
I say: Let them freak out. What are they gonna do about it?
Re: (Score:2)
Re: (Score:2)
CI without local repositories (ones that are not automatically updated) is pure insanity. These people are asking to get hacked via a supply-chain attack. Looks like a lot of software creation is still done "cheaper than possible".
Let them freak out. It is not your problem that they use unsound practices and (mentally) lazy approaches.
Re: (Score:2)
I never understood how people let CI pipelines re-download the packages every single time. Shouldn't one just bundle the dependencies for the CI pipeline so it doesn't even need network access?
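In that spirit, here is a minimal sketch of what a vendored-dependency lookup could look like: the CI job resolves every pinned package from a pre-populated local mirror and fails fast if anything is missing, instead of reaching for the network at build time. The `name-version.tar.gz` layout and the `resolve_from_mirror` helper are illustrative assumptions, not any real registry's API.

```python
import pathlib

def resolve_from_mirror(name: str, version: str, mirror: pathlib.Path) -> pathlib.Path:
    """Return the locally vendored archive for a pinned dependency.

    Raises FileNotFoundError instead of falling back to the network,
    so a CI run against an incomplete mirror fails fast and loudly.
    """
    candidate = mirror / f"{name}-{version}.tar.gz"
    if not candidate.is_file():
        raise FileNotFoundError(
            f"{name} {version} is not vendored in {mirror}; "
            "refresh the mirror deliberately rather than fetching at build time"
        )
    return candidate
```

The point of raising instead of fetching is that the mirror is only ever updated on purpose, which is also what keeps a compromised upstream release from silently entering every build.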
They oughta just torrent it. (Score:5, Interesting)
It feels like it'd be in the best self-interest of all the agentic "developers" to mirror all the open source sources and documentation in a decentralized, peer-to-peer manner. It should be pretty trivial to get an identical "security" guarantee by just validating checksums of whatever you download against the authoritative hosts, at a fraction of the cost to them, while potentially saving everyone a lot of bandwidth and time; it's pretty likely that half the time the agents would just be downloading the sources from the bazillion other agents fetching the same libraries within the same datacenter.
With how bleak things look with GitHub, it feels like something decentralized to host FOSS will be needed sooner rather than later anyway, outside of the infinite needs of our infinite monkeys.
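The checksum idea is easy to sketch: only the small published hash has to come from the authoritative registry, while the bulk bytes can come from any untrusted peer. A minimal illustration (the `verify_artifact` name is made up for the example):

```python
import hashlib

def verify_artifact(payload: bytes, expected_sha256: str) -> bool:
    """Check bytes fetched from an untrusted peer against the hash
    published by the authoritative registry.

    If the digest matches, the payload is byte-identical to what the
    registry would have served, regardless of who actually served it.
    """
    return hashlib.sha256(payload).hexdigest() == expected_sha256.lower()
```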
Re: (Score:2)
Re: (Score:3)
I mean, why not?
Re: (Score:3)
I think it's a great idea. It was a technically reasonable solution to sharing the costs of hosting and serving content when the web was small. It got run over by spam and trolls and warez eventually but we've learned a lot about content moderation and filtering in the last 25 years.
The main issue is that companies feel they can't monetize their own content if they have no way to control distribution servers, but that should not be a consideration for open source provided it's the kind of open source that
Re: (Score:2, Funny)
I had to shut down automated access (Score:4, Interesting)
I have a few open-source packages I wrote and maintain and I had to block downloads of one of them behind a form that required entering the answer to a question. CI systems from all over the world were just hammering my system.
I think this is the future: No more automated downloads. If you want automated access to packages, you'll have to download them once by hand and make your own mirror.
I've also had to password-protect my forgejo instance to block AI bots. The password is given right on the welcome page, but so far bots are not smart enough to recognize and use it.
Re:I had to shut down automated access (Score:5, Interesting)
Do you feel at least a little bit of an urge to make a honeypot version that no human would ever download by accident, but which CIs would grab? One that simply fails unpredictably, maybe with error messages that would be extremely clear to a human but contain some safety-guardrail-breaking verbiage that sends an LLM into a lengthy thinking-token loop?
Re: (Score:2)
No, I don't. While I don't want people abusing my server, I'm also not anti-social.
Re: (Score:1)
That's the correct solution if you don't want people to find your stuff. Other folks are thinking bigger than you. Your comment adds nothing of value here because it's literally addressed in the summary: "...rather than each operator improvising its own survival plan in isolation."
Re:I had to shut down automated access (Score:5, Insightful)
Re: I had to shut down automated access (Score:2)
Counterpoint: a unique solution also means only one workaround to find.
Re: (Score:2)
People can and do find my stuff just fine. The project pages are visible to everyone. Only the downloads are protected.
And I think password protection, or filling in a form in order to download, is a perfectly acceptable technique for anyone to use.
Re: (Score:2)
Or you could put them on npm / PyPI like a normal person.
Re: (Score:2)
Re: (Score:2)
My projects are not written in JS or Python. Perl and C. Some of the Perl modules are on CPAN where I don't have to worry about downloads, but the C ones are hosted on my server.
New model: Free and Free (Score:2)
If hammering is an issue, randomly reject 95% of requests with HTTP 429. Then, as an alternative, allow people to buy an API key for 1,000 downloads costing 1€.
Then patient individuals can always download for free. Big companies / CI / AI will want to pay or make their own mirror.
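A toy sketch of that policy, assuming a 95% reject rate for keyless requests (the `handle_request` function and the rate constant are invented for illustration, not any registry's actual behavior):

```python
import random

FREE_REJECT_RATE = 0.95  # hypothetical: drop 95% of keyless requests

def handle_request(has_valid_key: bool, rng: random.Random) -> int:
    """Return an HTTP status code for one download request.

    Paid keys always pass; anonymous requests get 429 most of the
    time, so a patient human gets through on retry while bulk CI/AI
    traffic is nudged toward buying a key or running its own mirror.
    """
    if has_valid_key:
        return 200
    return 429 if rng.random() < FREE_REJECT_RATE else 200
```

Clients that honor 429 with exponential backoff would still succeed eventually for free; only sustained machine-speed hammering becomes expensive.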
Re: (Score:2)
"Machine-Generated Traffic" ? (Score:1)
Like DNS ?
Rsync in a Cron job ?
NTP queries ?
How awful.
Re: (Score:2)
You're being disingenuous; you know exactly what the issue is.
Re: (Score:2)