Is Mastodon's Link-Previewing Overloading Servers? (itsfoss.com) 39
The blog It's FOSS has 15,000 followers on its Mastodon account, which they think is causing problems:
When you share a link on Mastodon, a link preview is generated for it, right? With Mastodon being a federated platform (a part of the Fediverse), the request to generate a link preview is not generated by just one Mastodon instance. There are many instances connected to it who also initiate requests for the content almost immediately. And, this "fediverse effect" increases the load on the website's server in a big way.
Sure, some websites may not get overwhelmed with the requests, but Mastodon does generate numerous hits, increasing the load on the server. Especially, if the link reaches a profile with more followers (and a broader network of instances)... We tried it on our Mastodon profile, and every time we shared a link, we were able to successfully make our website unresponsive or slow to load.
Slashdot reader nunojsilva is skeptical that "blurbs with a thumbnail and description" could create the issue (rather than, say, poorly optimized web content). But the It's FOSS blog says they found three GitHub issues about the same problem: one from 2017, and two more from 2023. And other blogs also reported the same issue over a year ago, including software developer Michael Nordmeyer and legendary Netscape programmer Jamie Zawinski.
And back in 2022, security engineer Chris Partridge wrote: [A] single roughly ~3KB POST to Mastodon caused servers to pull a bit of HTML and... an image. In total, 114.7 MB of data was requested from my site in just under five minutes — making for a traffic amplification of 36704:1. [Not counting the image.]
It's FOSS reports Mastodon's official position that the issue has been "moved as a milestone for a future 4.4.0 release. As things stand now, the 4.4.0 release could take a year or more (who knows?)."
They also state their opinion that the issue "should have been prioritized for a faster fix... Don't you think as a community-powered, open-source project, it should be possible to attend to a long-standing bug, as serious as this one?"
First time? (Score:3)
> They also state their opinion that the issue "should have been prioritized for a faster fix... Don't you think as a community-powered, open-source project, it should be possible to attend to a long-standing bug, as serious as this one?"
They should check out the open bugs for Firefox or KDE sometime. Some of them are years old.
Re: First time? (Score:2)
A hiccup in the performance of a website isn't what I would categorize as serious. Especially as it's only an issue with particularly large distribution groups.
It should be fixed. But I don't see why it should hold up a release if it's a long-standing "bug".
Re: First time? (Score:2)
But it doesn't affect them, it affects the sites.
And mostly only sites that don't normally get huge traffic.
And how to fix it? Any fix would let you make fake previews that spread in the network.
Re: (Score:3)
> Some of them are years old.
Decades, even.
DOS Attacks! (Score:2)
So this is a self-triggered DOS attack, impressive, and they do need to fix this smartish.
Re: (Score:3)
You can trigger it against somebody else, you just won't be able to measure the bandwidth amplification factor if you do.
Speaking of which, 115 MB over 300 seconds is less than 400 KB/sec. If that is a significant burden on his server, at least one of three things is true: he's got way too many tiny assets in his web page design, he's running his web server on some tiny processor (like, microcontroller size), or he's using a terribly inefficient web server.
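For anyone who wants to check that arithmetic, here is a quick sketch using only the figures quoted in the summary (114.7 MB, just under five minutes, 36704:1). Treating "MB" as 10^6 bytes and "just under five minutes" as a flat 300 seconds are assumptions made here, not something the original post states.

```python
# Back-of-the-envelope check of the figures quoted in the summary.
# Assumptions: "MB" means 10**6 bytes, and "just under five minutes" is 300 s.
total_bytes = 114.7 * 10**6
window_seconds = 300
amplification = 36704

# Average transfer rate over the whole window, in KB/s (1 KB = 1000 bytes).
avg_rate_kb_s = total_bytes / window_seconds / 1000

# Size of the triggering POST implied by the quoted amplification ratio.
implied_post_bytes = total_bytes / amplification

print(f"average rate: {avg_rate_kb_s:.0f} KB/s")        # 382 KB/s
print(f"implied POST size: {implied_post_bytes:.0f} B")  # 3125 B
```

Note that this is an average over the whole window. It says nothing about how bursty the requests were or how expensive each one was to serve, which may be what actually hurts a dynamically rendered site.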
Re: (Score:3)
> ...115 MB over 300 seconds is less than 400 KB/sec. If that is a significant burden on his server....
That's for a single post. That would be multiplied by the numbers of posts, so it wouldn't take much to become a massive resource drain. And now imagine you're hosting on a cloud provider. You can say goodbye to eating for that month.
Re: (Score:2)
Hopefully these servers are caching previews based on the target URL, not the story that embeds the link. Then the load scales only with the number of different URLs that point to the web server, not with how many posts reference the URLs.
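A minimal sketch of what "cache by target URL" could look like on a single instance. This is not Mastodon's actual code: the class name `PreviewCache`, the one-hour TTL, and the `fake_fetch` stand-in are all hypothetical and exist only for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class PreviewCache:
    """Cache link previews keyed by target URL (hypothetical sketch)."""

    def __init__(self, fetch: Callable[[str], dict], ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # function that actually hits the remote site
        self._ttl = ttl_seconds      # how long a cached preview stays valid
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, url: str) -> dict:
        now = self._clock()
        hit = self._entries.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]            # fresh entry: no request to the site
        preview = self._fetch(url)   # missing or expired: fetch once and store
        self._entries[url] = (now, preview)
        return preview


calls = []

def fake_fetch(url: str) -> dict:
    # Stand-in for the real HTTP fetch; just records that it was called.
    calls.append(url)
    return {"url": url, "title": "placeholder"}

cache = PreviewCache(fake_fetch)
for _post in range(1000):            # 1000 posts all linking the same article
    cache.get("https://example.com/article")
print(len(calls))                    # 1
```

This only bounds the load per instance. Every instance that sees the post still fetches the URL once, so the total is still proportional to the number of instances.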
If you're hosting something on a cloud service without a budget cap, that's a problem to be discussed between you and your wallet.
Bug or Pull Request? (Score:3)
> community-powered, open-source project
Has someone written a feature to make this better and it's being ignored?
Or is this a matter of insisting someone else do something?
TBH it sounds like we need a standard for pre-rendered blurb and thumbs.
Extended OpenGraph for thumbs? Or does that exist?
Social media is moving to distributed no matter who dislikes that idea, so make it as efficient as possible, yes, but don't bitch about reality.
Re: (Score:3, Funny)
> Has someone written a feature to make this better and it's being ignored?
Yes in fact, a certain Mr. Jia Tan fixed the issue. New version coming soon on his home page, due to obstinate Mastodon devs.
Re: (Score:3)
It seems like the sensible fix is to make it possible to pull that data from one Mastodon server to the next, whether this creates a standard or not is irrelevant to the particular problem. With my complete ignorance of the codebase this seems relatively trivial as tasks go given that they already have syndication features in the platform, but as there are many things I don't know about it perhaps there's some reason why this is difficult.
Re: (Score:2)
Maybe have this generated once on the posting server?
I'd actually enjoy if I could (as a user) configure whether or not the "blurb" (or worse, a big image, which might not even be related) appears in a mastodon post that has a web address. Sometimes, I'd be happier with just the linkified address...
It might also be possible to try to delay any fetching from Mastodon instances? If other servers want to check that the preview is accurate, that could always be an opt-in setting, and could be delayed a bit too?
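A rough sketch of the "delay it a bit" idea, assuming each instance independently picks a random delay inside some window before fetching. The function names, the instance count, and the window length are made up for illustration; this is not a description of what Mastodon does.

```python
import random


def fetch_delay(window_seconds: float, rng: random.Random) -> float:
    """Pick a random delay in [0, window_seconds] before fetching the preview."""
    return rng.uniform(0.0, window_seconds)


def peak_per_second(num_instances: int, window_seconds: float,
                    rng: random.Random) -> int:
    """Simulate the busiest one-second bucket seen by the target site."""
    buckets = {}
    for _ in range(num_instances):
        second = int(fetch_delay(window_seconds, rng))
        buckets[second] = buckets.get(second, 0) + 1
    return max(buckets.values())


rng = random.Random(0)
# No delay: every instance fetches in the same second.
print(peak_per_second(5000, 0.0, rng))
# Delays spread over ten minutes: the same total, but a much lower peak.
print(peak_per_second(5000, 600.0, rng))
```

The total number of requests does not change. Only the peak is flattened, at the cost of previews showing up later on remote instances.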
Re: Bug or Pull Request? (Score:2)
Then every in-between server can change, inspect or filter the content. The problem has been solved a long time ago, it's called BitTorrent and IPFS, perhaps they should build a true de-centralized platform on it.
Here's one example of CMS functionality (Score:2)
Of course this is a solved problem. I can give a detailed example using the Drupal Content Management System because I know it well, but no doubt other CMS' function similarly.
If the Drupal website is developed correctly, a user can simply upload an image attached to an article, ideally in a large, high resolution format, and the user is done. Drupal knows to render the large original to specification everywhere it is used on the website.
For example, a major newspaper's Sports section links to many complete
All those words and no description. (Score:2)
That is poorly written such that it can DOS itself? Cute. But I don't do Social.
Re: (Score:2)
> But I don't do Social.
Slashdot is social media.
Poorly optimized servers (Score:2)
What's old is new again (Score:4)
Isn't this just a new version of the Slashdot effect?
Re:What's old is new again (Score:4, Insightful)
Except it seems anyone can create a slashdot effect by spamming multiple links from the same site in Mastodon posts.
That sounds like a possible DDoS on demand system.
Re: (Score:2)
Not quite. This is more like a DDoS amplification attack (a way of magnifying an attack against a target by getting unwitting participants to generate a large amount of automated traffic in response to a relatively small input) than the Slashdot effect (organic, viral, human-driven traffic). In this case, orders upon orders of magnitude more traffic can be generated without any human intervention, allowing someone to target a server for very low cost.
Re: (Score:1)
> Isn't this just a new version of the Slashdot effect?
The slashdot effect used to be millions of people loading a URL.
Mastodon is a hundred loads of that URL followed by a few hundred thousand people loading the URL.
If anything it was Twitter that became the new slashdot effect, as there can be millions of people loading a URL that has gone viral.
I think the problem is more that "ItsFOSS" returns megabytes per page view that is causing this problem for them.
Not just them of course, many websites are serving excessive bloat these days.
But the need to blame that
Caching issue (Score:4, Interesting)
Re: (Score:2)
Yeah. They stated in their article that they're behind Cloudflare as well, so this should be getting cached. I'm wondering if there's some setting in their server or in the Cloudflare page rules that's disabling the Cloudflare cache. Even if there's several MB of JavaScript files being downloaded, these *should* be static and cached, reducing the load.
It looks like their Cf-Cache-Status header on their page is showing as "DYNAMIC", not "STATIC". If that's the case, CloudFlare isn't doing anything and is pas
distracted by a shiny object (Score:3)
They were reported as a malware infection vector in Signal. I turned them off. My life went on.
You don't need them.
It's like when you get a divorce and the wife demands to keep the dog. You could fight, or you could let her keep the dog. Here's an idea. Let her keep the dog. You can get another dog, and now she has no power over you.
It's the same thing with software. Don't let software control YOU.
Stop giving away your agency, or stfu. There are alternatives.
As I have been yelling lately: Convenience has led to fanatical avoidance of thinking.
Try thinking for yourself.
Re:distracted by a shiny object (Score:5, Interesting)
> Stop giving away your agency, or stfu. There are alternatives.
I so wish folks would take this to heart. Whenever I talk about the times I've said "No, boss, that's a stupid idea and we shouldn't do it," people immediately start making excuses for their own lack of spine. Far too often I hear "Well, you don't understand the Enterprise software business" or "Microsoft has too much momentum here, we cannot hope to challenge it." They rarely even try.
> Convenience has led to fanatical avoidance of thinking.
Probably a lost cause / waste of time to remind folks of this. It seems like it's kind of baked into most people's personality. Bertrand Russell weighed in on this:
We all have a tendency to think that the world must conform to our prejudices. The opposite view involves some effort of thought, and most people would die sooner than think—in fact, they do so. But the fact that a spherical universe seems odd to people who have been brought up on Euclidean prejudices is no evidence that it is impossible.
Add the preview when posting initially. (Score:2)
Have the initial server get the preview and include it with the post. Then punish anyone who lies (server wide). Would need a mechanism to update the preview if the destination changes too...
I don't know enough about HTTP to know if you can check date modified without getting the whole file, but if not... it should be added. Or to SPDY or whatever the newer incarnation is. Needs change, and so should our methods of sharing typical internet traffic.
If-Modified-Since and If-None-Match (Score:2)
HTTP has the "If-Modified-Since" request header [mozilla.org], which instructs a server to process a request only if the requested document has changed since the provided date. It also has the "If-None-Match" request header [mozilla.org], which does the same thing for "ETag" values.
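Here is a small sketch of the server-side decision those two headers drive: answer 304 Not Modified (no body) when the client's cached copy is still good, otherwise 200 with the full document. The function name `conditional_status` is made up for this example, the header lookup is case-sensitive for brevity, and ETag matching uses the weak comparison. When both headers are present, If-None-Match takes precedence and If-Modified-Since is ignored.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def _strip_weak(tag: str) -> str:
    # Weak comparison: "W/" prefix is ignored when matching ETags.
    return tag[2:] if tag.startswith("W/") else tag


def conditional_status(request_headers: dict, etag: str,
                       last_modified: datetime) -> int:
    """Return 304 if the client's cached copy is still valid, else 200."""
    inm = request_headers.get("If-None-Match")
    if inm is not None:
        candidates = [t.strip() for t in inm.split(",")]
        if "*" in candidates or _strip_weak(etag) in map(_strip_weak, candidates):
            return 304
        return 200  # If-None-Match present: If-Modified-Since is ignored

    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        try:
            since = parsedate_to_datetime(ims)
        except (TypeError, ValueError):
            return 200  # unparseable date: behave as if the header were absent
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        if last_modified.replace(microsecond=0) <= since:
            return 304
    return 200


modified = datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
print(conditional_status({"If-None-Match": '"abc"'}, '"abc"', modified))  # 304
print(conditional_status(
    {"If-Modified-Since": format_datetime(modified, usegmt=True)},
    '"abc"', modified))                                                    # 304
print(conditional_status({}, '"abc"', modified))                           # 200
```

This only helps a client that already holds a copy. An instance seeing a URL for the first time has nothing to revalidate, so the very first fetch from each instance still costs a full response.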
(I did not use the <code> element in this post where the HTML spec states that I should have because Slashdot issued a diagnostic "Filter error: Invalid HTML tag usage".)
DDoS army in combination with other vulnerability? (Score:2)
There's bound to be something else in Mastodon which allows for this to be turned into an exploit, turning Mastodon into a DDoS army up for grabs.
Mastowhat? (Score:1)
No, they are not (Score:2)
They also disabled the Cloudflare cache themselves: https://infosec.space/@siguza/... [infosec.space]