Social Networks

Is Mastodon's Link-Previewing Overloading Servers? (itsfoss.com) 39

The blog It's FOSS has 15,000 followers for its Mastodon account — which they think is causing problems: When you share a link on Mastodon, a link preview is generated for it, right? With Mastodon being a federated platform (a part of the Fediverse), the request to generate a link preview is not made by just one Mastodon instance. There are many instances connected to it that also initiate requests for the content almost immediately. And this "fediverse effect" increases the load on the website's server in a big way.

Sure, some websites may not get overwhelmed with the requests, but Mastodon does generate numerous hits, increasing the load on the server. Especially if the link reaches a profile with more followers (and a broader network of instances)... We tried it on our Mastodon profile, and every time we shared a link, we were able to successfully make our website unresponsive or slow to load.
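The "fediverse effect" the blog describes can be put in rough numbers. A toy model (the instance count and payload sizes below are illustrative assumptions, not measurements from the article):

```python
# Toy model of the "fediverse effect": when a link is shared, every
# federated instance that sees the post fetches the page (and often a
# preview image) to build its own link preview, all within a short
# window. The numbers below are illustrative assumptions only.

def preview_burst_bytes(instances: int, page_kb: float, image_kb: float) -> float:
    """Total bytes pulled from the origin server for one shared link."""
    return instances * (page_kb + image_kb) * 1024

# Suppose a post federates to 1,000 instances, each fetching a 500 KB
# page plus a 200 KB preview image:
total = preview_burst_bytes(1000, 500, 200)
print(f"{total / 1e6:.0f} MB requested in one burst")  # ~717 MB
```

The point is that the load scales with the number of federated instances, not the number of human readers, which is why even a modest account can knock over its own site.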

Slashdot reader nunojsilva is skeptical that "blurbs with a thumbnail and description" could create the issue (rather than, say, poorly-optimized web content). But the It's FOSS blog says they found three GitHub issues about the same problem — one from 2017, and two more from 2023. And other blogs also reported the same issue over a year ago — including software developer Michael Nordmeyer and legendary Netscape programmer Jamie Zawinski.

And back in 2022, security engineer Chris Partridge wrote: [A] single roughly ~3KB POST to Mastodon caused servers to pull a bit of HTML and... an image. In total, 114.7 MB of data was requested from my site in just under five minutes — making for a traffic amplification of 36704:1. [Not counting the image.]
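Partridge's amplification figure is internally consistent; working backwards from the stated ratio recovers the size of the triggering POST:

```python
# Sanity-checking the reported amplification: 114.7 MB of responses
# triggered by one roughly ~3KB POST. Working backwards from the
# stated 36704:1 ratio gives the POST size that was measured.
total_bytes = 114.7e6      # data pulled from the site in just under 5 minutes
amplification = 36704      # reported traffic amplification ratio
post_bytes = total_bytes / amplification
print(f"POST size: {post_bytes:.0f} bytes (~3 KB)")
```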
It's FOSS reports Mastodon's official position that the issue has been "moved as a milestone for a future 4.4.0 release. As things stand now, the 4.4.0 release could take a year or more (who knows?)."

They also state their opinion that the issue "should have been prioritized for a faster fix... Don't you think as a community-powered, open-source project, it should be possible to attend to a long-standing bug, as serious as this one?"


Comments Filter:
  • by drinkypoo ( 153816 ) <drink@hyperlogos.org> on Sunday May 05, 2024 @11:41AM (#64449326) Homepage Journal

    They also state their opinion that the issue "should have been prioritized for a faster fix... Don't you think as a community-powered, open-source project, it should be possible to attend to a long-standing bug, as serious as this one?"

    They should check out the open bugs for Firefox or KDE sometime. Some of them are years old.

    • by Moochman ( 54872 )
      The word *serious* is the key here. Apparently the people in charge of triaging bugs didn't take it seriously enough, though.
      • A hiccup in the performance of a website isn't what I would categorize as serious. Especially as it's only an issue with particularly large distribution groups.

        It should be fixed. But I don't see why it should hold up a release if it's a long standing "bug".

      • But it doesn't affect them, it affects the sites.

        And mostly only sites that don't normally get huge traffic.

        And how to fix it? Any fix would let you make fake previews that spread in the network.

    • > Some of them are years old.

      Decades, even.

    • by xack ( 5304745 )
Then there are web devs who are still forced to support Internet Explorer because of corps too lazy to upgrade from Windows 7 or even XP. Even though it's less than 1 percent of traffic, it makes up a lot of revenue from big corp budgets.
  • a traffic amplification of 36704:1. [Not counting the image.]

    So this is a self-triggered DOS attack, impressive, and they do need to fix this smartish.

    • by Entrope ( 68843 )

      You can trigger it against somebody else, you just won't be able to measure the bandwidth amplification factor if you do.

      Speaking of which, 115 MB over 300 seconds is less than 400 KB/sec. If that is a significant burden on his server, at least one of three things is true: he's got way too many tiny assets in his web page design, he's running his web server on some tiny processor (like, microcontroller size), or he's using a terribly inefficient web server.

      • ...115 MB over 300 seconds is less than 400 KB/sec. If that is a significant burden on his server....

That's for a single post. That would be multiplied by the number of posts, so it wouldn't take much to become a massive resource drain. And now imagine you're hosting on a cloud provider. You can say goodbye to eating for that month.

        • by Entrope ( 68843 )

Hopefully these servers are caching previews based on the target URL, not the story that embeds the link. Then the load scales only with the number of different URLs that point to the web server, not with how many posts reference those URLs.

          If you're hosting something on a cloud service without a budget cap, that's a problem to be discussed between you and your wallet.
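Caching previews by target URL, as the parent suggests, might look something like this minimal sketch (the TTL and cache shape are assumptions for illustration, not Mastodon's actual implementation):

```python
import time

# Minimal preview cache keyed by the *target URL*, so N posts linking
# to the same page cause one upstream fetch, not N. The TTL value is
# an arbitrary assumption.
class PreviewCache:
    def __init__(self, ttl_seconds: float = 3600):
        self.ttl = ttl_seconds
        self._store = {}  # url -> (expires_at, preview)

    def get_preview(self, url: str, fetch) -> str:
        now = time.monotonic()
        hit = self._store.get(url)
        if hit and hit[0] > now:
            return hit[1]                      # cache hit: no upstream request
        preview = fetch(url)                   # cache miss: one upstream fetch
        self._store[url] = (now + self.ttl, preview)
        return preview

# Demonstrate that repeated posts of the same link hit upstream once:
fetches = []
def fetch(url):
    fetches.append(url)
    return f"preview of {url}"

cache = PreviewCache()
cache.get_preview("https://example.com/a", fetch)
cache.get_preview("https://example.com/a", fetch)  # served from cache
print(len(fetches))  # 1
```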

    • Yes, it's serious. It's sure to impact all 27 Mastodon users, lol.
  • by bill_mcgonigle ( 4333 ) * on Sunday May 05, 2024 @11:54AM (#64449366) Homepage Journal

    > community-powered, open-source project

    Has someone written a feature to make this better and it's being ignored?

    Or is this a matter of insisting someone else do something?

    TBH it sounds like we need a standard for pre-rendered blurb and thumbs.

    Extended OpenGraph for thumbs? Or does that exist?

    Social media is moving to distributed no matter who dislikes that idea, so make it as efficient as possible, yes, but don't bitch about reality.

    • Re: (Score:3, Funny)

      by vbdasc ( 146051 )

      Has someone written a feature to make this better and it's being ignored?

      Yes in fact, a certain Mr. Jia Tan fixed the issue. New version coming soon on his home page, due to obstinate Mastodon devs.

It seems like the sensible fix is to make it possible to pull that data from one Mastodon server to the next; whether this creates a standard or not is irrelevant to the particular problem. Speaking from complete ignorance of the codebase, this seems relatively trivial as tasks go, given that they already have syndication features in the platform. But there are many things I don't know about it, so perhaps there's some reason why this is difficult.

      • Maybe have this generated once on the posting server?

        I'd actually enjoy if I could (as a user) configure whether or not the "blurb" (or worse, a big image, which might not even be related) appears in a mastodon post that has a web address. Sometimes, I'd be happier with just the linkified address...

        It might also be possible to try to delay any fetching from Mastodon instances? If other servers want to check that the preview is accurate, that could always be an opt-in setting, and could be delayed a bit too?
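Delaying the fetches, as suggested above, would spread the burst over a window instead of hitting the origin all at once. A sketch of jittered fetch scheduling (the window length is an arbitrary assumption):

```python
import random

# Spread preview fetches over a window instead of firing them all the
# moment the post federates. Each instance picks a random delay, so the
# origin sees a trickle rather than a spike. The 300-second window is
# an arbitrary assumption.
def fetch_delay(window_seconds: float = 300, rng=random) -> float:
    return rng.uniform(0, window_seconds)

rng = random.Random(42)  # seeded for reproducibility
delays = [fetch_delay(300, rng) for _ in range(1000)]
assert all(0 <= d <= 300 for d in delays)
# Peak load drops roughly from N simultaneous requests to N / window
# requests per second, at the cost of previews appearing a bit later.
```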

Then every in-between server can change, inspect, or filter the content. The problem was solved a long time ago; it's called BitTorrent and IPFS. Perhaps they should build a truly decentralized platform on it.

    • Of course this is a solved problem. I can give a detailed example using the Drupal Content Management System because I know it well, but no doubt other CMS' function similarly.

      If the Drupal website is developed correctly, a user can simply upload an image attached to an article, ideally in a large, high resolution format, and the user is done. Drupal knows to render the large original to specification everywhere it is used on the website.

      For example, a major newspaper's Sports section links to many complete

  • Of what it is. Apparently it is a self-hosted social media platform. [wikipedia.org]

    That is poorly written such that it can DOS itself? Cute. But I don't do Social.
  • I'll go with poorly optimized servers. If they didn't send megabytes of JS and other crap, most likely they could handle the load.
  • by Calydor ( 739835 ) on Sunday May 05, 2024 @12:24PM (#64449454)

    Isn't this just a new version of the Slashdot effect?

    • by Deal In One ( 6459326 ) on Sunday May 05, 2024 @01:27PM (#64449600)

      Except it seems anyone can create a slashdot effect by spamming multiple links from the same site in Mastodon posts.

      That sounds like a possible DDoS on demand system.

Not quite. This is more like a DDoS amplification attack (magnifying an attack against a target by getting unwitting participants to generate a large volume of automated traffic from a relatively small input) than the Slashdot effect, which is organic, viral, human-driven traffic. In this case, orders upon orders of magnitude more traffic can be generated without any human intervention, allowing someone to target a server at very low cost.

    • by Anonymous Coward

      Isn't this just a new version of the Slashdot effect?

      The slashdot effect used to be millions of people loading a URL.
      Mastodon is a hundred loads of that URL followed by a few hundred thousand people loading the URL.

      If anything it was Twitter that became the new slashdot effect, as there can be millions of people loading a URL that has gone viral.

I think the problem is more that "ItsFOSS" serves megabytes per page view, which is what is causing this problem for them.
      Not just them of course, many websites are serving excessive bloat these days.

      But the need to blame that

  • Caching issue (Score:4, Interesting)

    by bubblyceiling ( 7940768 ) on Sunday May 05, 2024 @01:29PM (#64449604)
    That sounds more like a cache issue than anything else
Yeah. They stated in their article that they're behind Cloudflare as well, so this should be getting cached. I'm wondering if there's some setting in their server or in the Cloudflare page rules that's disabling the Cloudflare cache. Even if there are several MB of JavaScript files being downloaded, these *should* be static and cached, reducing the load.

      It looks like their Cf-Cache-Status header on their page is showing as "DYNAMIC", not "STATIC". If that's the case, CloudFlare isn't doing anything and is pas

  • by Big Hairy Gorilla ( 9839972 ) on Sunday May 05, 2024 @02:08PM (#64449656)
    Did you know you don't need link previews?

    They were reported as a malware infection vector in Signal. I turned them off. My life went on.
    You don't need them.

    It's like when you get a divorce and the wife demands to keep the dog. You could fight, or you could let her keep the dog. Here's an idea. Let her keep the dog. You can get another dog, and now she has no power over you.
    It's the same thing with software. Don't let software control YOU.

    Stop giving away your agency, or stfu. There are alternatives.
As I have been yelling lately: Convenience has led to fanatical avoidance of thinking.
    Try thinking for yourself.
    • by Seven Spirals ( 4924941 ) on Sunday May 05, 2024 @02:29PM (#64449702)

      Stop giving away your agency, or stfu. There are alternatives.

I so wish folks would take this to heart. Whenever I talk about the times I've said "No, boss, that's a stupid idea and we shouldn't do it," people immediately start making excuses for their own lack of spine. Far too often I hear "Well, you don't understand the Enterprise software business" or "Microsoft has too much momentum here, we cannot hope to challenge it." They rarely even try.

Convenience has led to fanatical avoidance of thinking.

      Probably a lost cause / waste of time to remind folks of this. It seems like it's kind of baked into most people's personality. Bertrand Russell weighed in on this:

      We all have a tendency to think that the world must conform to our prejudices. The opposite view involves some effort of thought, and most people would die sooner than think—in fact, they do so. But the fact that a spherical universe seems odd to people who have been brought up on Euclidean prejudices is no evidence that it is impossible.

Have the initial server get the preview and include it with the post. Then punish anyone who lies (server-wide). Would need a mechanism to update the preview if the destination changes, too...

    I don't know enough about HTTP to know if you can check date modified without getting the whole file, but if not... it should be added. Or to SPDY or whatever the newer incarnation is. Needs change, and so should our methods of sharing typical internet traffic.
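For what it's worth, HTTP does already support this: a HEAD request returns only the headers (including Last-Modified), and a conditional GET with If-Modified-Since gets back an empty 304 Not Modified when nothing changed. A sketch with Python's standard library (the URL is a placeholder):

```python
import urllib.request

# A HEAD request fetches headers only (including Last-Modified),
# never the response body.
head = urllib.request.Request("https://example.com/article", method="HEAD")

# A conditional GET: the server replies "304 Not Modified" with an
# empty body if the resource hasn't changed since the given timestamp.
cond = urllib.request.Request(
    "https://example.com/article",
    headers={"If-Modified-Since": "Sun, 05 May 2024 00:00:00 GMT"},
)

print(head.get_method())  # HEAD
```

(SPDY's successor is HTTP/2, but conditional requests have been in HTTP since 1.0/1.1, so no protocol change is needed.)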

There's bound to be something else in Mastodon which allows this to be turned into an exploit, turning Mastodon into a DDoS army up for grabs.

Oh yeah, I remember those three months when people thought Mastodon might catch on. Didn't realize it still existed.
  • Here is an insightful response to their post: https://infosec.space/@siguza/... [infosec.space]

    Pulling up this site in a browser with no privacy/sanity plugins installed, it made a total of 3740 requests within 4 minutes, which amounted to 267.22 MB transferred.
    (...)

    Your website is an insult to the internet.

    They also disabled the Cloudflare cache themselves: https://infosec.space/@siguza/... [infosec.space]

    I mean, in a first step they could just bump their "cache-control: max-age=0" to something like 5 minutes for static assets, so
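The suggested first step amounts to rewriting the Cache-Control header so a shared cache like Cloudflare can absorb the burst. A sketch of what that change means (the 5-minute TTL is the commenter's suggestion; the helper function is illustrative, not real server code):

```python
# Sketch: bumping "cache-control: max-age=0" to a 5-minute TTL for
# static assets, which lets Cloudflare (or any shared cache) absorb
# the preview-fetch burst instead of the origin server. The helper
# below is purely illustrative.
def bump_max_age(cache_control: str, seconds: int = 300) -> str:
    """Replace any max-age directive in a Cache-Control value."""
    parts = [p.strip() for p in cache_control.split(",")]
    parts = [p for p in parts if not p.startswith("max-age=")]
    parts.append(f"max-age={seconds}")
    return ", ".join(parts)

print(bump_max_age("max-age=0"))          # max-age=300
print(bump_max_age("public, max-age=0"))  # public, max-age=300
```

With max-age=0, every preview fetch from every instance goes to the origin; with even a short positive TTL, Cloudflare serves the burst from its edge cache.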
