Communications Government IT News

State Dept E-mail Crash After "Reply-All" Storm

Posted by timothy
from the forward-this-story-to-all-your-friends-in-triplicate dept.
twistah writes "It seems that a recent 'reply-all storm' at the State Department caused the entire e-mail infrastructure to crash. A notice sent to all State Department employees warned of disciplinary action for users who 'reply-all' to lists with a large number of users. Apparently, the problem was compounded not only by angry replies asking to be taken off the errant list, but also by the e-mail recall function, which generated further e-mail traffic. One has to wonder if capacity planning was performed correctly: should an e-mail system be able to handle this type of traffic, or is it an unreasonable task for even the best system?"

  • Bedlam... (Score:5, Interesting)

    by ghostis (165022) on Saturday January 10, 2009 @10:44PM (#26404285) Homepage
    • Re:Bedlam... (Score:4, Insightful)

      by DeadPixels (1391907) on Saturday January 10, 2009 @10:54PM (#26404365)
      Sounds like nearly the exact same situation. The problem here is that the average user is just going to click the first "reply" button he sees, and if that happens to be Reply All, nothing's going to stop him. Perhaps the mail client should have a feature enabled by default that warns if an exceptionally large number of messages are being sent and allow the option to cancel.
      • Re: (Score:3, Informative)

        by Anonymous Coward

        read bedlam. in annoying pathological cases, the user(agent) can't know who's on the dl or how big it is.

        for some cases, it's probably possible for the user agent to do something slightly more intelligent. but it's a hard problem.

      • Re:Bedlam... (Score:4, Insightful)

        by jamesh (87723) on Saturday January 10, 2009 @11:16PM (#26404549)

        Perhaps the mail client should have a feature enabled by default that warns if an exceptionally large number of messages are being sent and allow the option to cancel.

        Change that to 'that warns if an exceptionally large number of messages are being sent and smack the user over the head with a LART if they don't click cancel' and I'll agree with you.

        A large company should have an internal mailing list and/or intranet system that individual users can post messages to. Letting individual users send email to more than a few thousand users in one hit is madness. Especially if they are anything like our customers where they think it is a good idea to send a 10MB attachment to 500 users...

        • by The Dobber (576407) on Saturday January 10, 2009 @11:23PM (#26404609)

          I remember 10 or so years ago a disgruntled employee managed to send a heartfelt "Fuck You" to the entire 27,000+ employees as he was being given the heave ho.

          That one tied up the network for some period of time. I always wonder who the bright star was who had composed the distribution list for the entire company directory.

          • by MichaelSmith (789609) on Saturday January 10, 2009 @11:53PM (#26404805) Homepage Journal

            That one tied up the network for some period of time.

            That's why I always use qmail for my Fuck You messages.

          • Re:Bedlam... (Score:4, Insightful)

            by TheLink (130905) on Sunday January 11, 2009 @04:38AM (#26405999) Journal
            There was at least one employee who actually spammed everyone for his direct marketing stuff... He got everyone which included the bosses ;).

            That said, I think there actually should be a distribution list for the entire company - it can be useful for some stuff.

            However the actual name should be hard to guess, and secret.

            Then you set up the "everyone" list for people to send to which actually goes to a moderator.

            If the moderator thinks the email should go out, it is sent out via the "secret-real-everyone-list", otherwise it isn't.

            If the email indicates that the sender has significant lack of discretion or intelligence, the moderator may wish to pass it to the Bosses concerned so that they can take necessary measures.

            In one of the places I worked for "everyone" actually went to the Big Boss(es), and I think it worked reasonably well.
            • Re:Bedlam... (Score:5, Insightful)

              by walt-sjc (145127) on Sunday January 11, 2009 @06:55AM (#26406389)

              It doesn't need to be secret if there are controls on who can send messages to the list. It is so trivial to do for any competent email admin no matter what software they use.
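A minimal sketch of the posting controls discussed in the two comments above, assuming a hypothetical allow-list keyed by list address. The addresses, the `ALLOWED_POSTERS` table, and the function names are illustrative only and are not taken from any particular mail server. Unauthorized posts are held for a moderator rather than fanned out to everyone.

```python
from email.utils import parseaddr

# Hypothetical list configuration: list address -> senders allowed to post.
ALLOWED_POSTERS = {
    "everyone@example.com": {"hr@example.com", "it-notices@example.com"},
}


def may_post(list_address: str, from_header: str) -> bool:
    """Return True only if the sender is on the list's allow-list."""
    _, sender = parseaddr(from_header)
    allowed = ALLOWED_POSTERS.get(list_address.lower(), set())
    return sender.lower() in allowed


def route(list_address: str, from_header: str) -> str:
    """Decide what the list server does with an incoming post."""
    if may_post(list_address, from_header):
        return "deliver"
    # Unauthorized posts are held for a moderator instead of fanning out.
    return "hold-for-moderator"
```

A From: header is easy to forge, so a real deployment would presumably check an authenticated sender identity rather than the header alone.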

            • Re: (Score:3, Informative)

              by Skrynesaver (994435)

              A properly designed mail server would accept mails to named distribution groups and just drop the mails into each of the associated mailboxes for internal mailing lists. I know this is how it works with our mailserver, and yes, the physical IMAP machines are on different continents, but each IMAP server receives only one copy of the mail over the network.

        • Re: (Score:3, Informative)

          by LinuxDon (925232)

          Actually, this wouldn't have mattered so much if they were using Novell Groupwise.

          Groupwise would store the message only once in the database and then put a pointer in every user's mailbox referring to that message. If you'd recall the message it'd just remove the pointers in mailboxes where the message has not yet been read, in order to reflect the current situation.

          One of the reasons I avoided Exchange like the plague is that Microsoft implements stuff like a hack job instead of doing things properly.
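The single-copy storage described in the comment above can be sketched roughly as follows. This is a toy in-memory model, not GroupWise's actual design: the class name, fields, and methods are all hypothetical. The body is stored once, each mailbox holds only a pointer with a read flag, and a recall removes the pointers that are still unread.

```python
from dataclasses import dataclass, field


@dataclass
class SingleInstanceStore:
    """Toy store: one body per message, one pointer per recipient mailbox."""
    bodies: dict = field(default_factory=dict)      # message_id -> body
    mailboxes: dict = field(default_factory=dict)   # user -> {message_id: read?}

    def deliver(self, message_id, body, recipients):
        self.bodies[message_id] = body              # stored exactly once
        for user in recipients:
            self.mailboxes.setdefault(user, {})[message_id] = False

    def read(self, user, message_id):
        self.mailboxes[user][message_id] = True
        return self.bodies[message_id]

    def recall(self, message_id):
        """Remove pointers from mailboxes where the message is still unread."""
        removed = 0
        for pointers in self.mailboxes.values():
            if pointers.get(message_id) is False:
                del pointers[message_id]
                removed += 1
        return removed
```

In this model a recall generates no new messages at all, which is the contrast the comment draws with a recall function that sends further e-mail.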

      • Re:Bedlam... (Score:5, Informative)

        by JWSmythe (446288) * <jwsmythe@jws[ ]he.com ['myt' in gap]> on Saturday January 10, 2009 @11:33PM (#26404661) Homepage Journal

        This is a configuration error, not a newsworthy event.

            For sendmail [sendmail.org], it would be a configuration directive in their sendmail.mc (or whatever theirs is called):

        define(`confMAX_RCPTS_PER_MESSAGE', `100')dnl ... or a modified line in sendmail.cf:

        O MaxRecipientsPerMessage=100

            In MSExchange [microsoft.com] it would be a registry change

        HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MSExchangeIS\ParametersSystem\Max Recipients on Submit

        DWORD Value 100
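As a hedged illustration of what such a limit does, and of the objection raised in the reply below that a distribution list counts as a single recipient, here is a small sketch that expands lists before counting. The `DISTRIBUTION_LISTS` directory, the addresses, and the function names are hypothetical; real servers differ on whether the limit applies before or after expansion.

```python
# Hypothetical directory: distribution list address -> member addresses.
DISTRIBUTION_LISTS = {
    "all-staff@example.com": [f"user{i}@example.com" for i in range(5000)],
}

MAX_RECIPIENTS = 100  # same illustrative limit as in the comment above


def expand(recipients):
    """Replace each distribution list with its members, dropping duplicates."""
    expanded = set()
    for address in recipients:
        members = DISTRIBUTION_LISTS.get(address.lower())
        if members is None:
            expanded.add(address.lower())
        else:
            expanded.update(members)
    return expanded


def accept(recipients, limit=MAX_RECIPIENTS):
    """Accept a message only if the expanded recipient count is within limit."""
    return len(expand(recipients)) <= limit
```

Counting after expansion is a design choice made for this sketch: a limit that counts only envelope recipients would let a single list address through regardless of how many members it has.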

        • Re:Bedlam... (Score:5, Interesting)

          by Achromatic1978 (916097) <robert&chromablue,net> on Sunday January 11, 2009 @01:34AM (#26405309)
          So you know enough about Exchange to know the Registry Key for configuring a max recipient count, but not enough to think that they were using DLs, which count as one recipient?
          • Re: (Score:3, Interesting)

            by pfleming (683342)

            So you know enough about Exchange to know the Registry Key for configuring a max recipient count, but not enough to think that they were using DLs, which count as one recipient?

            A DL would only be reply. The problem is with reply-all, meaning there was a list of addresses in the CC: or To: fields. Otherwise, we would not even be discussing this. If it were just a DL then reply and reply-all are essentially the same function, no?
            So it sounds like they need to install decent mailing list software, not just an "everyone" address.

      • by calmofthestorm (1344385) on Sunday January 11, 2009 @02:02AM (#26405437)

        You have tried to send an email. Do you wish to allow or deny?

    • by Anonymous Coward on Saturday January 10, 2009 @10:57PM (#26404403)

      Haha, sex change team.

    • by russotto (537200) on Saturday January 10, 2009 @11:01PM (#26404423) Journal

      http://msexchangeteam.com/archive/2004/04/08/109626.aspx

      What's the M Sex Change Team? People who still haven't gotten over Judi Dench playing M? Come on, folks, M is a title, not a person; it's not a sex change!

    • Re: (Score:3, Funny)

      by vmxeo (173325)
      Me too!
    • Re:Bedlam... (Score:5, Insightful)

      by Snowblindeye (1085701) on Saturday January 10, 2009 @11:46PM (#26404765)
      So they create large distribution lists (which is normal), but they don't secure them in any way or lock them down where only certain users can use them.

      And then they threaten disciplinary action if someone uses them the wrong way. Wouldn't it be so much easier to just lock them down? It's what most companies do.

  • Like rain on your wedding day.

  • sigh (Score:5, Interesting)

    by wizardforce (1005805) on Saturday January 10, 2009 @10:53PM (#26404355) Journal

    What an irony that they decided to mass mail when they've warned their employees not to do so. What they should have done if they were concerned about their load [which evidently they should have] was to warn their employees in blocks, perhaps 10% at a time with space between to take care of the massive response... However, judging by the nature of their work [it is the state department after all] I don't believe it unreasonable that there could be events in their future requiring such mass mailings again and having the whole system crash under the load would be no doubt very bad in emergencies.

    • by Robin47 (1379745) on Saturday January 10, 2009 @10:58PM (#26404411)
      Of the reply all button. Please do not respond with the reply all button. What they need is a reply some button.
    • Re:sigh (Score:5, Insightful)

      by AngryElmo (848385) on Saturday January 10, 2009 @11:11PM (#26404503)
      Maybe someone could introduce them to the concept of a BCC.
    • No, just use BCC for mass mailings.

      A simple and complete fix.

    • Re:sigh (Score:5, Insightful)

      by Just Some Guy (3352) <kirk+slashdot@strauser.com> on Saturday January 10, 2009 @11:38PM (#26404701) Homepage Journal

      What they should have done if they were concerned about their load [which evidently they should have] was to warn their employees in blocks, perhaps 10% at a time with space between to take care of the massive response...

      No. What they should have done was installed a mailing list manager, created a read-only list called "employees", and posted to it. Voila - n-thousand workers get announcements with no ability to reply to the whole list. Problem solved.

    • Re: (Score:3, Informative)

      by aolsheepdog (239764)

      You assumed that they mass emailed the notice and are incorrect.

      As the article states, the notice was sent by "cable" which is the old telegram system and still the only official means of communication between the Department and US Missions overseas.

      The cable system is on a completely separate classified network.

      As the unfortunate recipient of the mail storm emails I will say that many people included information in their replies that referenced the cable (and subsequent Department Notice) telling people to

    • Re: (Score:3, Informative)

      by drew (2081)

      The summary is a little misleading, but from the article, the "notice" was in response to the reply-all's taking down their server, not the cause of it. And it doesn't sound like the notice was sent via email. TFA describes it as a "cable".

      A cable sent last week to all employees at the department's Washington headquarters and overseas missions warns of unspecified "disciplinary actions" for using the "reply to all" function on e-mail with large distribution lists.

  • I worked at a college using the groupwise e-mail system and the same thing happened. Someone sent out an information email to all students and instead of BCCing the entire list of addresses, they were all plopped into the "To" field. It bounced around forever and everyone was completely confused.

    Luckily it wasn't my department and I didn't have a student email account, so I was immune.

    Long story short, the system did survive unscathed....

    • by grahamsz (150076) on Sunday January 11, 2009 @12:13AM (#26404907) Homepage Journal

      I saw a weird variant on that back in university.

      One of the engineering departments had a room full of (at the time) fairly high end sun workstations, and these were used both interactively and for people running longer compute jobs overnight.

      To facilitate overnight jobs, the admins had set up a round robin dns alias that updated every couple of seconds to point to the machine reporting the lowest load average.

      One of the students in my class had the bright idea of "If I put 'ssh lowest' in my bashrc file, every time I open a terminal window it'll automatically pick the least loaded machine".

      Fast forward a few minutes and we've got 80 sun workstations which have all systematically ssh'd to each other and none of which will accept any new connections...

  • Incorrect Headline (Score:5, Insightful)

    by Anonymous Coward on Saturday January 10, 2009 @11:00PM (#26404419)

    Whoever wrote the headline for this summary needs to have their slashdot editor privileges revoked.

    TFA states "an e-mail storm nearly knocked out one of the State Department's main electronic communications systems", and "a major interruption in departmental e-mail". The problem is clearly spelled out as "e-mail queues, especially between posts, back up while processing the extra volume of e-mails".

    This is simply the queues backing up, not the servers crashing. Nowhere does TFA state anything to suggest that there was a "State Dept E-mail Crash", which the summary's headline boasts. The proper headline should read "Large E-mail Queues at State Dept After Reply-All Storm".

    No, I'm not new here. That's why I'm fed up with the sensationalist "journalism" that is getting worse and worse here.

  • by vawarayer (1035638) on Saturday January 10, 2009 @11:01PM (#26404421)

    I remember my first year of college when I wanted to send Xmas greetings to 'everyone'. I remember, the IT director of the college running from computer lab to computer lab looking for student number xxyz.

    Fun times.

    • by Toonol (1057698) on Sunday January 11, 2009 @12:02AM (#26404855)
      Back in, oh, probably '90, the company I was working for had dumb terminals everywhere connected to a mainframe. They had just added a messaging feature, and one supervisor was messing around with it. She tried to send a message to her group, but accidentally sent it company-wide. The message was "IF YOU CAN READ THIS, RAISE YOUR HAND."

      I was supervising the call center at the time, and I saw hundreds of hands tentatively raising. The message probably went to two thousand people.
  • Two questions: (Score:5, Interesting)

    by drolli (522659) on Saturday January 10, 2009 @11:03PM (#26404441) Journal

    a) Maintaining a large list by copying all recipients into the header is a fucked up idea at best (because there is no way this list will be kept updated), and an information leak at worst (because somebody added earlier to a non-updated list may get information which he should not get - e.g. former employees). Why do governmental institutions still use it?

    b) Why in the world do modern e-mail clients still allow reply-all to hundreds of recipients without an additional safety question? I would expect my program to warn me before sending an email to thousands of people.
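A rough sketch of the kind of safety question asked for in (b), using only Python's standard `email` package. The threshold value and the function names are assumptions for illustration, not the behaviour of any existing mail client.

```python
from email.message import EmailMessage
from email.utils import getaddresses

WARN_THRESHOLD = 50  # hypothetical cutoff; a real client would make this configurable


def reply_all_recipients(original: EmailMessage, me: str) -> list:
    """Collect the addresses a reply-all would go to, excluding the replier."""
    headers = (original.get_all("From", []) + original.get_all("To", [])
               + original.get_all("Cc", []))
    seen, result = set(), []
    for _, address in getaddresses(headers):
        key = address.lower()
        if key and key != me.lower() and key not in seen:
            seen.add(key)
            result.append(address)
    return result


def needs_confirmation(original: EmailMessage, me: str) -> bool:
    """True when the client should ask before sending the reply-all."""
    return len(reply_all_recipients(original, me)) > WARN_THRESHOLD
```

As an earlier comment points out, this only helps when the addresses are visible in the headers; if the message went to a distribution list, the client sees one address and cannot know how many people are behind it.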
     

    • Re:Two questions: (Score:5, Insightful)

      by bugs2squash (1132591) on Sunday January 11, 2009 @01:15AM (#26405233)
      I have direct experience that whenever a popup reading something like

      Are you sure you want to do this stupid thing?

      pops up, people universally click "OK" without a second thought.

      People have just been blasted by too many of these warnings to take any proper note any more.
  • by MEsSWorks (544458) on Saturday January 10, 2009 @11:04PM (#26404449) Homepage

    Dear state department

    I'm sorry to hear about your recent trouble

    There is a brand new invention on the internet which has the ability to ease the strain on your mailservers. It is called a mailing list manager. One is called Mailman and can be found at: http://www.gnu.org/software/mailman [gnu.org]

    There are several others, some free, and some non free, but they exist for most server platforms. If you don't have the expertise in house to set it up correctly, you can get any number of consultancy companies to help you out.

    Yours faithfully
    Almost anonymous coward

  • by Cassini2 (956052) on Saturday January 10, 2009 @11:08PM (#26404485)
    No good ever came from the Reply All button. It is like adding "Press this button to be fired" function to your corporate email system. You know someone is going to press the button, you know trouble will ensue, so why create the button?

    To all the mods, please don't destroy all my Karma. I really do hate that Reply All button.

    • Re: (Score:2, Insightful)

      by Scrameustache (459504)

      It is like adding "Press this button to be fired" function to your corporate email system. You know someone is going to press the button

      Yes: The guy who wants to quit but doesn't because he'll only get unemployment benefits if he's fired :)

      • by SuperBanana (662181) on Sunday January 11, 2009 @01:53AM (#26405401)

        The guy who wants to quit but doesn't because he'll only get unemployment benefits if he's fired :)

        Um...which goes to show how little you know about unemployment. At least in MA, you don't get shit if it is "termination with cause", ie fired. If you're laid off, great- but even then, your employer gets a phone call from the unemployment department asking whether you were fired or laid off. Nothing stops them from lying and saying you were fired with cause- and then you've got a legal battle on your hands, which you can't afford.

        Other fun facts about unemployment in MA: you don't get paid for two full weeks after you FILED- not after you were laid off, but after you FILED. You get a pittance compared to your normal salary; you'd be lucky to make rent on a studio apartment in Boston based off an entire month's unemployment checks.

        Any income is deducted from your UA check. Say for example you find a 2-3 hour consulting thing on CL and make $150 helping someone fix their computer. Guess what? Your unemployment check for that week will be $150 smaller. This basically means that you have no incentive to find any kind of income while you're on UA.

        Last but certainly not least: you have to pay taxes, medicare, medicaid, etc on your unemployment benefits. It's not bad enough that you're basically on welfare- you have to fork over a portion of the money the government is giving you, BACK to the government. Cute, eh?

    • I use the reply-all button frequently, for ad-hoc small group discussions. If I have a document I want two people to review, I send it to both of them, and they send their comments back to both me and the other person I sent it to with reply-all, so we're all on the same page.

      If the same group of people is frequently collaborating you can set up a mailing list, but it's a real pain in the ass to set up a mailing list every time you want a group of 3 or 4 people to exchange 5-10 emails.

    • Re: (Score:3, Insightful)

      by evanbd (210358)
      Let's suppose a [coworker|friend|colleague] sends an email to me, ccing three other people. I want to respond and CC those same people. Exactly what button should I press, if not reply all? Or are you one of those people that think forcing me to do things the hard way and copy those addresses manually to the CC line is a feature, because you don't know how to set up a mailing list so this doesn't happen?
  • One of the cool things that software can do is enforce policy. This is the one way that the software pays for itself. For instance, in accounting the software can be programmed to keep audit trails and prevent records from being erased, thus reducing the dependence on accountants. Likewise, email software is often programmed, at least at the enterprise level, to automatically set appointments and send confirmation emails.

    It is interesting to me how the computer is used less and less to enforce policies

  • No email system should ever "crash" under any reasonable load. Back in the late 90's, I was involved in designing and implementing email systems for some of the largest (at the time) ISPs as a consultant for a company that an NDA forbids me to mention. One of the things we did was limit the number of simultaneous connections, such that a "reply storm" (or, more often, a DoS attack) would hit a speed bump fairly quickly. Sendmail has done this for 25 years, by cutting off acceptance of new messages when

  • "should an e-mail system be able to handle this type of traffic...?"

    Any system should be designed in such a way that a mere clueless user should not be able to bring it down accidentally. If an e-mail system can't handle "reply-to-all" when used carelessly, then it shouldn't have that function.
  • Reply All Insanity (Score:3, Insightful)

    by HangingChad (677530) on Saturday January 10, 2009 @11:39PM (#26404709) Homepage

    I've seen it so many times over the years. I wonder why it's so hard to add an administrative setting that limits Reply All to a certain number of users? Set at 100, it would only send the first 100, then ask the user if they wanted to send the next 100. Or 300 or 400 or whatever.
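The batching idea above (send the first 100, then ask before the next 100) might look something like the sketch below. The `send` and `confirm` hooks are hypothetical callables supplied by the caller; they do not correspond to any real mail client or server API.

```python
def batches(recipients, size=100):
    """Yield successive groups of at most `size` recipients."""
    for start in range(0, len(recipients), size):
        yield recipients[start:start + size]


def send_in_batches(recipients, send, confirm, size=100):
    """Send batch by batch, asking `confirm` before every batch after the first.

    `send` and `confirm` are caller-supplied callables (hypothetical hooks,
    not part of any real mail client API). Returns how many recipients were sent to.
    """
    sent = 0
    for index, group in enumerate(batches(recipients, size)):
        if index > 0 and not confirm(sent, len(recipients)):
            break
        send(group)
        sent += len(group)
    return sent
```

The first batch goes out without a prompt, so ordinary small replies are unaffected; only a reply that spills past the batch size triggers the question.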

    I can't count the number of people sending a hasty and blistering reply to thousands of people. Not only committing public suicide but accounting for who knows how many unproductive man hours while the entire organization stopped to read their spew. It's just crazy.

  • A modern email system really should be able to handle this. High performance messaging systems will store one copy of the message, with n number of pointers to it per back end store. Sending a message to 10k users results in one store insert event and 9,999 cheap pointer operations. The MTA will have to perform directory look ups for the recipients, but should use LMTP to insert them into the store and prevent redundant directory queries, etc... Sun's big mail server will even "relink" duplicate messa

  • by n0dna (939092) on Saturday January 10, 2009 @11:42PM (#26404727)

    http://www.hanselman.com/blog/HowToEasilyDisableReplyToAllAndForwardInOutlook.aspx [hanselman.com]

    2 simple lines that you can include in your Outlook client to prevent this action internally on your exchange server.

    Note this does not include any macros in the email.

  • Seeing that we've established that this was OpenNet which uses public-available systems and in this case that means Exchange, wouldn't it be reasonable to assume that as we're approaching the end of the error.. er.. era of the Bush admin there would be an uptick in "Goodbye, here's where to reach me" mails to entire address books? From there, it'd take no time to hit the hard limits in Exchange for file storage... talk about ungraceful failures that we've known about for years. (Wait, that's another Bush re

  • by taustin (171655) on Saturday January 10, 2009 @11:57PM (#26404827) Homepage Journal

    The problem is the message replied to having - RTFA - several thousand addresses in the To: and CC: fields. This is what BCC is for. Allowing people to put several thousand addresses into the headers will eventually result in a mail storm, whether someone hits Reply To All or not. The first time someone opens a virus laden attachment that goes through their (archived by law, this being a federal agency) emails, it will send itself out to thousands of equally clueless people. One of them will run the attachment, which will send another copy to several thousand people. And so on. This happened where I work once, by people who should have known better. Before it was done, I was getting two hundred copies of the virus per day.
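To make the BCC point concrete, here is a small sketch using Python's standard `email` package. The function names and addresses are made up for illustration. The audience is placed only in the Bcc field, so the headers a recipient can reply-all to contain nothing but the sender. (Python's `smtplib.SMTP.send_message` is documented not to transmit Bcc headers, so the list does not leak on send.)

```python
from email.message import EmailMessage


def build_announcement(sender, recipients, subject, body):
    """Build a message whose recipient list travels only in the Bcc field."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = sender                    # visible To: is just the sender
    msg["Bcc"] = ", ".join(recipients)    # real audience stays hidden
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def visible_reply_all_targets(msg):
    """Addresses a recipient could see (and reply-all to) once Bcc is stripped."""
    visible = []
    for header in ("From", "To", "Cc"):
        for value in msg.get_all(header, []):
            visible.extend(a.strip() for a in str(value).split(","))
    return sorted(set(visible))
```

With this layout a reply-all from any of the hidden recipients reaches only the sender, so a storm cannot build on the original message.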

    Whoever sent out the message replied to should be fired and criminally prosecuted for deliberately sabotaging the State Department's email system. But since the article doesn't mention this at all, I'm assuming it was some dumbass boss somewhere who is immune to any form of discipline for anything, up to and including murder.

    • by Anonymous Coward on Sunday January 11, 2009 @01:51AM (#26405381)

      Having been a witness to the incident in question, here's what happened:

      1) Around December 30th a blank e-mail (with receipt request) went out to almost all users. Apparently it was from a single user with some malware etc. (we didn't get any further details).

      2) The next day, the same blank message was sent out again (from the same user).

      3) As people came back from vacations, we got a few "Please remove me from this list", and "What is this message" sent as reply-all.

      4) Then, followed with a bunch of "Me Too".

      5) Then, a bunch of "Please, don't reply all" (sent, of course, reply-all).

      6) Followed by a bunch of "remove me from this list".

      and so on, and so forth, with no end in sight...

      The initial message didn't have any virus or other "payload"; just a blank message that caused a bunch of confusion. The whole incident was actually pretty hilarious to watch.

  • by rfc1394 (155777) <Paul@paul-robinson.us> on Sunday January 11, 2009 @02:09AM (#26405485) Homepage Journal
    I have my own, for lack of a better name, "Reply All" incident.
    I am on a list of bidders for potential contracts with the Washington Metropolitan Transit Authority, which operates the Metrobus and Metrorail for Washington, DC and the nearby suburbs in Maryland and Virginia. The annual budget for the Authority is in excess of a billion dollars; it's larger than the budget of the entire State of Montana, for example.
    One time I got a message with more than 25 recipients on it regarding a change in the way they were operating their procurement website. Well, I suspected that it was some spammer pretending to be from the Authority, because one of the "red flag" signs of being spammed is more than 10 recipients on the same message. But I discovered that it really was from the Transit Authority, it was simply an ordinary announcement with no url links and nothing but the announcement. But instead of simply either making the recipients BCC recipients, and sending it to an internal transit authority e-mail address as To:, or sending individual messages to each potential supplier, the contracting agent had simply sent it out To: listing all persons who were registered as bidders with the authority.

    My e-mail address was one of these potential suppliers along with a few other people.

    1,627 other people to be precise. This was the longest To: list on an e-mail message I have ever seen on a piece of e-mail that wasn't spam; 1,628 contacts. No, I didn't reply all, but I couldn't think of a way to refer to this incident as a "Send All" message and tie into this story. The other half of this incident was that the procurement agent had also just given all potential suppliers to the Authority, every other supplier's e-mail address, too.

    • In the same vein... (Score:5, Interesting)

      by microcars (708223) on Sunday January 11, 2009 @02:59AM (#26405709) Homepage
      and of course, off-topic from TFA, I signed up with a Product Testing Place. They email me once every six months and see if I want to test some new gadget or something and I get paid $75.

      I signed a confidentiality agreement with them.
      I am not allowed to discuss ANYTHING about the product or reveal I am testing it or anything. I was never there, I am nobody.

      Last year I got an email - From The President of The Testing Company - personally thanking me for all the help in the last year.
      He also thanked everyone else who "helped" last year as well and I could see who they were because apparently the President (or the secretary) just put all our emails into the TO: field and let it fly.
      Lots of Identifiable people on the list because they used their WORK email, like john.doe@largecorporation.com So it was easy to see who else was part of that big Butt Plug testing program.

      I did a REPLY to ONLY the President and laid into him about the confidentiality agreement and told him if he didn't know how to use email to stay away from the computer.

      Later that day we all got another email from the President, this time apologizing for revealing all our personal emails, never happen again etc etc. And apparently he figured out how to use BCC!

      So yelling at someone does seem to work to change behaviour.
      Also- this is a dupe comment, I posted this once before on Slashdot someplace, but since this is Slashdot I didn't think a dupe would be a problem.
  • Microsoft (Score:3, Insightful)

    by 1s44c (552956) on Sunday January 11, 2009 @12:39PM (#26407707)

    They used exchange and got screwed, just like -everyone- who uses exchange.

    This happens all the time just most companies cover up stuff like this because it's not good for the share price.

  • by Pepebuho (167300) on Sunday January 11, 2009 @12:42PM (#26407717) Homepage

    Am I the only one to think that it is quite peculiar that it is happening 9 days before the Government turns over? I mean, how difficult could it be to say that some sensitive/embarrassing mails got lost during this crash? I think this should be looked into in more detail and make double sure that no mail was "lost".
