Ten Technology Disasters

Ant writes "What do a 17th-century Swedish warship, an opulent Chicago theater and a Kansas City hotel "skyway" have in common? All met catastrophic ends and they have important lessons to teach today's innovators."
  • by SaxMaster ( 95691 ) on Friday May 17, 2002 @08:31PM (#3540949)
    ... Submitting your page to Slashdot, technology disaster number 11 :)
    • Doing a quick Google search [google.com] (technology, disaster history), I found this story [buzzle.com] with this headline:
      Twenty-five years ago, the greatest disaster in airline history killed 538 people, in part because of a "heterodyne" radio glitch that still hasn't been fixed.

      It's a good lengthy article. Worth the read
      • by Gordonjcp ( 186804 ) on Saturday May 18, 2002 @04:44AM (#3542011) Homepage
        A couple of things about the article:

        Firstly, that's not really what "heterodyne" means. Heterodyning is when you mix two signals to produce another at a different frequency. This is how pretty much all radio receivers work (yes, I know there are other ways. Go in a shop and buy a commercial super-regen radio, and I'll change that sentence). It's not a "glitch", it's more a constant physical property.
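        To make the mixing point concrete, here is a minimal sketch of the product-to-sum identity behind heterodyning: multiplying two sinusoids gives components at the sum and difference of their frequencies. The function names and the example frequencies are illustrative assumptions, not anything taken from the article.

```python
import math

def mixed(f1, f2, t):
    """Product of two unit-amplitude cosines at f1 and f2 (Hz) at time t (s)."""
    return math.cos(2 * math.pi * f1 * t) * math.cos(2 * math.pi * f2 * t)

def sum_and_difference(f1, f2, t):
    """The same signal written as half-amplitude tones at f1 - f2 and f1 + f2."""
    return 0.5 * (math.cos(2 * math.pi * (f1 - f2) * t)
                  + math.cos(2 * math.pi * (f1 + f2) * t))

def max_mismatch(f1, f2, samples=1000, rate=10_000_000.0):
    """Largest gap between the two forms over a run of sample times."""
    return max(abs(mixed(f1, f2, n / rate) - sum_and_difference(f1, f2, n / rate))
               for n in range(samples))
```

        If the identity holds, max_mismatch should come out at floating-point noise for any pair of frequencies, which is the sense in which mixing is a constant physical property rather than a glitch.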

        Also, the problem was not directly caused by the radio equipment, but by what was said. Yup, it's an unpopular view to take, but it was just plain human error. No blaming the machines here. Why? Well, it goes like this...

        The day of the accident, there was very heavy fog around Tenerife. Visibility was extremely poor, and it was impossible to see the opposite end of the runway. Another factor was that normally, you only take off from one end of the runway, depending on wind direction. If the surface winds are calm, it's the tower's call as to which runway is in use (denoted by the heading you're facing when taking off, in 10-degree steps, i.e. Runway 25/Runway 07). *Both* runways were in use, so aircraft could line up at both holding points, to help reduce queueing.
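        The runway naming rule described above can be sketched in a few lines. This is a rough illustration only: the function names are made up here, and the rounding is the simple nearest-10-degrees convention.

```python
def runway_number(heading_deg):
    """Runway designator for a heading: nearest 10 degrees, numbered 01 to 36."""
    number = int((heading_deg % 360 + 5) // 10) % 36
    return 36 if number == 0 else number

def reciprocal(number):
    """Designator of the same pavement when used in the opposite direction."""
    return (number + 17) % 36 + 1

def label(number):
    """Two-digit form as painted on the runway, e.g. 7 becomes '07'."""
    return f"{number:02d}"
```

        For the example in the comment, a heading of about 250 degrees gives Runway 25, and its reciprocal works out to Runway 07.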

        Now, the Pan-Am pilot was first out, so lined up at the takeoff point, and began his takeoff run. There was some confusion about whether or not the KLM aircraft was to taxi from the hold to the takeoff point, due to both the controller and the Dutch pilot having English as a second language. This wouldn't have been a problem for the most part, because even if the KLM had been at the takeoff point, the Pan-Am would have cleared it with plenty of room, even though it shouldn't have been on the runway.

        The key is in what the Dutch pilot said - "We are now at takeoff". This is indeed a common phrase, generally meaning that the aircraft is sitting at the takeoff point and awaiting clearance. However, in Dutch, the prefix "at-" is equivalent to the English "-ing" suffix - the pilot had just effectively said "I am now taking off". It's an easy mistake to make if you speak more than one language. Even a language you don't often use creeps into things you say in your first language. Just watch it doesn't have consequences this serious!
        • "We are now at takeoff". This is indeed a common phrase, generally meaning that the aircraft is sitting at the takeoff point and awaiting clearance

          Thanks for the description. This story may be behind why I've never once heard "We're at takeoff" at major airports in many years. I've always heard the tower instruction as "taxi into position and hold" with the instruction repeated by the pilot. I guess because of this incident, they like to stay away from the word "takeoff" unless it's an instruction to actually head down the runway. Interesting.
    • or Halifax. (Score:5, Interesting)

      by s20451 ( 410424 ) on Friday May 17, 2002 @09:27PM (#3541109) Journal
      In 1917 a collision between two ships in Halifax harbor -- one carrying close to 3000 tons of high explosive -- resulted in an explosion [halifax.ns.ca] which levelled much of the city and killed 2000 people, in what was one of the largest non-nuclear manmade explosions in history.
      • or Texas City (Score:3, Interesting)

        It wasn't just Halifax; ports can be dangerous places. In 1947, there was a huge explosion [essortment.com] when a freighter loaded with fertilizer blew up in Texas City, near Galveston. I knew about it because my father's ship left port hours before the explosion. His mother got a letter he posted from there just before the ship left, and she thought he was dead for several months, until she got a letter from the next port of call.

        There are some pictures [texas-city-tx.org] on this page. It seems that over 600 people died; or at least they recovered that many bodies. There may have been some who simply disappeared. There was a tidal wave which swept 150 feet inland (NOT 150 feet high, but that far away from the beach). Since the ship was at the dock, it started fires in the town, and at a chemical plant near the docks. It set fire to another ship which was nearby. That ship blew up the next morning with even more force, and did even more damage. There are more pictures here [chron.com] and here [chron.com], which give some idea of just how big the explosions were.

    • Uh, the Texas City explosion wasn't really a technology problem. More of an industrial-safety issue. Proper labeling, pre-planned procedures, and all that. A very low-tech incident, to be sure.
  • disasters course (Score:2, Interesting)

    by EricBoyd ( 532608 )
    The engineering undergraduate program at Queen's University actually has a disasters course as one of the non-technical electives. Basically, it involves dividing the class up into small teams, each of which then picks an engineering disaster to analyse in great detail. Presentations and written reports are submitted at the end of the semester.

    Supposedly this engenders a greater sense of responsibility in the engineers-to-be. I think it worked for me :-)

    Websurfing done Right! StumbleUpon [stumbleupon.com]
    • by Tumbleweed ( 3706 ) on Friday May 17, 2002 @09:51PM (#3541171)
      > The engineering undergraduate program at Queen's University actually
      > has a disasters course as one of the non-technical electives.
      ...
      > Supposedly this engenders a greater sense of responsibility into the
      > engineers to be.

      Perhaps, then, this should be a required class instead of an elective one. *shrug*
    • From the story:

      "In assembling this list of exemplary technological disasters, we've omitted the most familiar--those whose names have entered into the language, like Bhopal, Chernobyl, Three Mile Island, Titanic and Challenger--in favor of some with fresher tales to tell and lessons to impart."

    • Did you actually read the story? Quoting the second paragraph:

      "In assembling this list of exemplary technological disasters, we've omitted the most familiar--those whose names have entered into the language, like Bhopal, Chernobyl, Three Mile Island, Titanic and Challenger--in favor of some with fresher tales to tell and lessons to impart."

      What a shameless and pathetic attempt at karma whoring...

  • Concorde? (Score:4, Interesting)

    by reaper20 ( 23396 ) on Friday May 17, 2002 @08:42PM (#3540987) Homepage
    It took just one more little mishap to make a disaster: a titanium "wear strip" fell off a Continental DC-10 in the path of an Air France Concorde leaving Paris. When the Concorde's tire hit the strip, a chunk of rubber tore off and smashed into the wing, punching a 600-square-centimeter hole in its skin and causing fuel to leak and ignite.

    Disclaimer: I know nothing about airplane safety or testing, but this one set off my common sense alarm.

    So, the tires on Concordes need to be changed a lot - a chunk of titanium breaks off of another plane, and hits a tire on a Concorde, causing the accident - anyone else think, "Well gee, I don't think any kind of tire is designed to withstand titanium chunks slamming into them"? Considering the condition of some of the commercial jets I've flown in, I'll take my chances with the Concorde. I'm sure there is more to it than just this, I thought it odd though.

    Though not a "disaster" per se - the Navy's dead Windows NT ship is tops for the funniest in my book.

    • Navy's Dead ship (Score:5, Informative)

      by reflexreaction ( 526215 ) on Friday May 17, 2002 @09:25PM (#3541104) Homepage
      An article on the NT problem is available here [info-sec.com].

      From the article
      The Yorktown lost control of its propulsion system because its computers were unable to divide by the number zero, the memo said. The Yorktown's Standard Monitoring Control System administrator entered zero into the data field for the Remote Data Base Manager program. That caused the database to overflow and crash all LAN consoles and miniature remote terminal units, the memo said.
      And a little bit later in the article
      "If you understand computers, you know that a computer normally is immune to the character of the data it processes," he wrote in the June U.S. Naval Institute's Proceedings Magazine. "Your $2.95 calculator, for example, gives you a zero when you try to divide a number by zero, and does not stop executing the next set of instructions. It seems that the computers on the Yorktown were not designed to tolerate such a simple failure."

      GO ARMY!!!!!!!
      • by ergo98 ( 9391 )
        So the code threw an exception when it divided by zero: That's a _wanted_ thing (because technically dividing by zero is an error state. You don't want to just skip over something like that when it could be guiding a missile or steering the ship). From everything I've heard about that Navy ship, the fault had absolutely zero to do with "Windows NT", and everything to do with a proprietary application that didn't wrap a non-deterministic calculation in a try/except : Hardly extraordinary. Unfortunate, yes. Fodder for anti-MSitism, hardly.
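        As a minimal sketch of the try/except wrapping mentioned above: the division still raises on a zero, but the caller reports the bad input and keeps the last known-good value instead of dying. The function names and the fallback policy are hypothetical, and nothing here reflects the actual Yorktown software.

```python
def rate_per_unit(total, count):
    """Plain division; raises ZeroDivisionError when count is zero."""
    return total / count

def safe_rate(total, count, last_good):
    """Return (value, error). On a zero divisor, keep the last known-good value."""
    try:
        return rate_per_unit(total, count), None
    except ZeroDivisionError:
        return last_good, f"rejected count={count!r}: cannot divide by zero"
```

        The error is not skipped over: it comes back to the caller as a message, so an operator can be told the entry was rejected while the rest of the program carries on.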
        • Except that there's no way in hell an APPLICATION should be allowed to crash the OS.
          • I've never seen anyone state that the OS crashed - just that their proprietary application crashed (which would be enough to cripple the ship).

            Or did I miss something?
            • My understanding was that the database crashed, causing a chain reaction on the network. I also seem to recall from the GCN article that there were BSOD's.
              • No, the original GCN article may have had vague comments on NT's ability to blue screen, but these were from different, unspecified incidents. In the incident actually described, a client app accepted bad input and a server app corrupted its database; this data was needed by other client apps that controlled the ship. These latter clients were LAN consoles. The LAN consoles crashed, not the LAN itself. The client and server apps created the mess, and they would have done so regardless of OS. The Chief Engineer on the ship at the time and the developer of the software have both said it was not NT.

                http://www.sciam.com/1998/1198issue/1198techbus2.html

                Also, the publisher of the original GCN article backed away from the article a little characterizing some of the content as "early speculation" or something like that.
                • Also, the publisher of the original GCN article backed away from the article a little characterizing some of the content as "early speculation" or something like that.

                  Well, at least that way the ship was paid for ;-)

        • From everything I've heard about that Navy ship, the fault had absolutely zero to do with "Windows NT", and everything to do with a proprietary application that didn't wrap a non-deterministic calculation in a try/except : Hardly extraordinary. Unfortunate, yes. Fodder for anti-MSitism, hardly.

          Yes, but the operating system (Windows NT) should have caught a divide by zero exception, and terminated/restarted the offending application. The operator should have then been able to restart the application and proceed as normal. This bug should not have brought down the entire system!
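          The containment idea above (a crashing application gets terminated and restarted without taking down whatever is watching it) can be sketched with an ordinary parent process. This is only an illustration under assumptions: the helper names are hypothetical, the "application" is a one-line Python snippet, and a real supervisor would add logging and back-off.

```python
import subprocess
import sys

def run_once(code):
    """Run a snippet in a separate process and return its exit status."""
    return subprocess.run([sys.executable, "-c", code],
                          stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL).returncode

def supervise(code, max_restarts=3):
    """Restart a crashing child up to max_restarts times; return all exit codes."""
    codes = []
    for _ in range(max_restarts + 1):
        status = run_once(code)
        codes.append(status)
        if status == 0:
            break
    return codes
```

          A child that divides by zero exits with a non-zero status each time, while the supervising process itself keeps running and can decide what to do next.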

      • by AHumbleOpinion ( 546848 ) on Saturday May 18, 2002 @12:30AM (#3541617) Homepage
        http://www.sciam.com/1998/1198issue/1198techbus2.html

        "Others insist that NT was not the culprit. According to Lieutenant Commander Roderick Fraser, who was the chief engineer on board the ship at the time of the incident, the fault was with certain applications that were developed by CAE Electronics in Leesburg, Va. As Harvey McKelvey, former director of navy programs for CAE, admits, "If you want to put a stick in anybody's eye, it should be in ours." But McKelvey adds that the crash would not have happened if the navy had been using a production version of the CAE software, which he asserts has safeguards to prevent the type of failure that occurred."
      • "Your $2.95 calculator, for example, gives you a zero when you try to divide a number by zero, and does not stop executing the next set of instructions. ..."

        What kind of calculator have they been using? All the ones I've seen give you 'E' and refuse to do anything until you clear it.
    • Re:Concorde? (Score:3, Interesting)

      by iabervon ( 1971 )
      The titanium strip was just sitting on the runway, having fallen off the other plane. Of course, the plane was presumably going pretty fast at the time, but airplane tires should be able to withstand this sort of thing, or at least fail somewhat more gracefully.

      On the other hand, the failure in this case required 2 failures on the Concorde and bad luck with the fire, as well as hitting something that shouldn't have been there. There's a reason it took as long as it did for a Concorde to crash. I'm not sure exactly why this is in the list: in the other cases, the problem was that the makers were over-confident. The Concorde was supposed to be nearly indestructible, and it turned out that it could be destroyed once in a million times. So they fixed both of the things which contributed to that time. It's not the sort of thing you could say was just waiting to happen, or that you could say they should have found in simulation or testing.
    • funny? (Score:5, Interesting)

      by passion ( 84900 ) on Friday May 17, 2002 @10:52PM (#3541326)

      the Navy's dead Windows NT ship is tops for the funniest in my book.

      Many psychologists have suggested that the emotion of humor has evolved as expressing relief from danger.

      I find it truly frightening.

  • by Mulletproof ( 513805 ) on Friday May 17, 2002 @08:46PM (#3540997) Homepage Journal
    You can't breed out stupidity or rule out nasty ass-bad luck. This article seems to imply you can do both.
  • by FortranDragon ( 98478 ) on Friday May 17, 2002 @08:47PM (#3540998)
    I live near KC and I remember when the skywalks collapsed. As the story unfolded after the tragedy, it became readily apparent that everyone just assumed everyone else was doing what they thought they should be doing or that their shortcuts were fine with everyone else. :-( Communication and checking up on how things are actually progressing versus the plans can be a real matter of life or death.

    Next time as a programmer you bitch about checking up on QA (assuming you are lucky to have a QA department) or on the users, just remember that your mistakes very rarely kill people. You've got it _easy_.

    Also, on a side note, the local KC TV news organizations try hard to prevent people from getting to their archives of what happened. They don't want to present Kansas City in a "bad light". This is also very stupid. If we can't easily learn from our mistakes we are going to make more of them. 'Protecting' KC's reputation just makes Kansas Citians look more retarded than the screwup that was the Hyatt Regency skywalks. :sigh: Yeah, mistakes were made, so let's own up to them and learn something so we don't do it again.

    • by K8Fan ( 37875 ) on Friday May 17, 2002 @10:43PM (#3541297) Journal
      I live near KC and I remember when the skywalks collapsed. As the story unfolded after the tragedy, it became readily apparent that everyone just assumed everyone else was doing what they thought they should be doing or that their shortcuts were fine with everyone else. :-( Communication and checking up on how things are actually progressing versus the plans can be a real matter of life or death.

      I lived in KC at the time, and I recall that there were more screw-ups than this short summary mentioned. The metal fabricator also changed the design of the beams. As designed, they were to be made of two "U" shaped channels welded together with a seam on the left and right sides of the beam. They didn't have those bits in stock, so they used two shallower "U" shaped pieces and welded them together at the top and bottom of the beam...and then drilled the holes for the threaded rod right through the welds!

      Everyone involved was criminally culpable...and (to my knowledge) went to prison.

      Also, on a side note, the local KC TV news organizations try hard to prevent people from getting to their archives of what happened.

      A good friend of mine was the first emergency physician on the scene at the Hyatt and performed the triage. He was recently interviewed by the BBC for a documentary about the Hyatt. They supplied footage to the BBC, but no...they don't have any reason to supply footage to random people.

    • I live in KC, and remember thinking that the guys who designed the skywalks got a bum rap. They were designed for people to walk from one side to the other, perhaps to pause and check out the view for a few moments before continuing on their way, but not for a huge crowd to fill them, swaying in unison in rhythm to the music. I have a great deal of sympathy for the people on the lower skywalk and those underneath them both, but the ones on the upper skywalk contributed to their own injuries. I never saw any acknowledgment of this distinction.
      • but not for a huge crowd to fill them, swaying in unison in rhythm to the music

        Read the article. It specifically says that dancing-induced resonance was ruled out pretty early as an explanation for the disaster:

        speculation first fixed on the patrons who'd been dancing on them: perhaps their high-stepping had set off a harmonic wave that made the sky bridges buckle and crumble.

        The truth proved more prosaic. The hotel's engineers had originally designed two of the three walkways to hang on common, vertical metal rods. But the metal fabricator took a fatal shortcut, substituting shorter rods hanging from one level to the next.
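        Why the shorter rods mattered can be shown with simplified statics. The sketch below assumes each walkway puts the same load on a hanger and ignores everything else; the function name and the unit-free numbers are illustrative, not figures from the investigation.

```python
def connection_loads(walkway_load, continuous_rod):
    """Load on each beam-to-rod connection, per hanger, for the two rod layouts.

    walkway_load: load one walkway puts on one hanger (any consistent unit).
    continuous_rod: True for one rod running from the ceiling through both
                    levels, False for a lower rod hung from the upper beam.
    """
    lower = walkway_load
    upper = walkway_load if continuous_rod else 2 * walkway_load
    return {"upper": upper, "lower": lower}
```

        With a continuous rod, each connection carries only its own walkway. With the lower walkway hung from the upper one, the upper connection has to carry both, so its load doubles under these assumptions.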

    • "Yeah, mistakes were made, so let's own up to them and learn something so we don't do it again."


      By admitting any wrongdoing people can open themselves up to enormous lawsuits; that's why many times the injured parties or those seeking redress have to seek the truth on their own, with little to no assistance from the accused. Look at Enron and Andersen for a good example of this.

      The Enron and Andersen officials aren't being unhelpful because they want to be a pain in the ass; they are being unhelpful because they risk jail time and possibly enormous fines. By not admitting to anything, they make the job of bringing civil and/or criminal charges against them that much tougher.

      Like it or not, it's unconstitutional to force people to incriminate themselves.
  • by vkg ( 158234 ) on Friday May 17, 2002 @08:48PM (#3541005) Homepage
    Seriously: ten catastrophic goofs, but I don't see anything which really ties them together!

    Am I missing something?

    Yeah, sure "Don't cut corners" and "Don't trust management who would like to cut corners", but that's pretty obvious and we all still do it, right?

    There's also some stuff like "Watch when retrofitting parts of an old system with new technology" and "pay attention to boundary conditions", but really I think this is just a laundry list.

    So does anybody know of a good reference work out there which actually has some worthwhile analysis on stuff like this? Didn't Feynman write something up after Challenger?
    • by Registered Coward v2 ( 447531 ) on Friday May 17, 2002 @09:15PM (#3541078)
      So does anybody know of a good reference work out there which actually has some worthwhile analysis on stuff like this? Didn't Feynman write something up after Challenger?


      Yes, it appeared as an appendix to the Rogers Commission report. He also discussed it in his autobiography, either "Surely You're Joking..." or "What Do You Care...", I can't remember which. The appendix is a good read, and can be found here:
      http://www.ralentz.com/old/space/feynman-report.html
      or any of a number of other googleable links.

    • "There's also some stuff like "Watch when retrofitting parts of an old system with new technology"

      Tennessee is just about to do something similar with a nuclear power plant. [nytimes.com] This plant has been mothballed since 1985 but they want to bring it back online. Oh yeah, they also want to overclock it by 30%; it was originally designed for 1000 megawatts production but they are going to crank it up to 1300 megawatts.

      The plant had caught fire in 1975, causing a series of problems leading to the shutdown in 1985. Now they want to extend its original 40-year design life by another 20 years. A nuclear-safety engineer for the Union of Concerned Scientists figures that a new plant would be safer and cheaper. From an engineering point of view, "It's like trying to dust off an eight-track tape player rather than buying a DVD system..."

      First Three Mile Island. Then Chernobyl. Is Tennessee next?
      • by Melantha_Bacchae ( 232402 ) on Saturday May 18, 2002 @01:14AM (#3541699)
        Phrogger wrote:

        > First Three Mile Island. Then Chernobyl. Is Tennessee next?

        Sorry, Tennessee would have to get in line. One of the most spectacular examples of stupidity causing a nuclear accident was at a plant in Tokai-mura on September 30th 1999, and it is the greatest nuclear plant accident in Japan's history. Basically, they dumped all the safety precautions and mixed themselves up a batch of acidic nuclear soup in a big steel bucket and stirred. Instant hot fission! You can read the World Nuclear Association's writeup here (it has a nifty table of different levels of nuclear catastrophe that is a must read):

        http://www.world-nuclear.org/info/inf37print.htm

        The interesting thing is, Toho was filming on location at the Tokai plants for a Godzilla attack in the then upcoming "Godzilla 2000 Millennium". They were probably done with filming by the time the accident actually occurred. In December 1999, the movie opened, with Godzilla heading over to attack the plants.

        This wasn't the first one of Toho's monster movies to "come true", only one in a long history. Here are two other famous ones:

        "Gojira" 1984: the Russians have a nuclear accident in the movie (in the original Japanese version, US version makes it a deliberate act). In 1986, the Russians had a real accident: Chernobyl.

        "Mosura 3: King Ghidora Raisu" 1998: the King of Terror (King Ghidora) begins his attack on Tokyo by flying through the twin towers of a skyscraper. Office workers flee while talking on cell phones. The US version ... well there was no US version, except the real life one on September 11th, 2001. Tristar, why was "Rebirth of Mothra 3" never released so we could have been warned as Mothra clearly intended?

        Sonora:"New Godzilla reading. He's moving inward toward Tokai."
        Shinoda: "The nuclear plants, I knew it."
        Sonora: "Afraid so."
        Yuki: "Well, that's just lovely. Another Chernobyl."
        "Godzilla 2000" (US version dialog)
      • In an ideal world they would build a new one, but it would be impossible in today's climate. No new nuclear power plant has been built in the US since the '80s (I believe; it might be a little earlier or later). It causes too much of an uproar: NIMBY. Plus, you get wacky SUV-driving soccer moms who complain about how much nuclear plants 'pollute.' Sigh.
  • by btempleton ( 149110 ) on Friday May 17, 2002 @08:49PM (#3541006) Homepage
    A story that claims to be reporting on the greatest tech disasters, in particular the lesser known ones, and it fails to mention Banqiao and Shimantan in 1975?

    I mean, not only was this the greatest technological disaster in human history with 80,000 to 230,000 dead depending on whose numbers you believe, but it also is sufficiently unknown that the author of an article on disasters doesn't appear to know of it!
    • by nels_tomlinson ( 106413 ) on Friday May 17, 2002 @11:16PM (#3541407) Homepage
      A story that claims to be reporting on the greatest tech disasters, in particular the lesser known ones, and it fails to mention Banqiao and Shimantan in 1975?

      Since the original post mentioned this as if we should be familiar with it, here're [sjsu.edu] the details: A big dam in China failed, in large part because the Communist ideologues over-ruled the hydrologists. Many thousands died, but of course that's all right because the houses of the Party cadre were built on high ground. Click on that link for the fine print.

    • The point of this article was, all those disasters were due to people being careless|cheap|stupid|etc and could have been easily prevented (which is not always true). Does that apply to the disaster you mention?

      For that matter, do you have any more information on it? I've never heard of this one either.
  • by DaveWood ( 101146 ) on Friday May 17, 2002 @09:08PM (#3541060) Homepage
    No discussion of the topic could be complete without mentioning RISKS. The RISKS Digest [ncl.ac.uk] has been discussing risk factors associated with technology and engineering (and to some extent generally) on the internet since 1986.

    Every engineer should spend time reading there. Any _good_ engineer should subscribe.

    -David
  • K-Boat (Score:3, Insightful)

    by Al Al Cool J ( 234559 ) on Friday May 17, 2002 @09:12PM (#3541068)
    If you want to talk about disastrous naval design flaws, then the British K-Boat probably takes the cake. A WWI steam-powered submarine, the K-Boat suffered from numerous flaws in design and engineering and as a consequence fell victim to many dozens of accidents and mishaps, including the so-called "Battle of May Island" in which a flotilla of K-Boats was decimated by a string of collisions during night-time fleet training maneuvers. The K-Boats killed many hundreds of their crew, without ever inflicting damage on the enemy.


    See http://www.brisray.co.uk/misc/mind.htm [brisray.co.uk] (scroll down) for more info.

    • Another naval design that became a disaster waiting to happen was the Quebec class Soviet submarines built in the early 1950's.

      Imagine a closed-cycle internal combustion engine with a big oxygen tank nearby: one oxygen leak and, if a fire breaks out, the result would be a horrible disaster. In fact, that's exactly what happened in (I believe) 1956, when a large number of crewmen were killed by a fire onboard such a sub, and there would have been many more deaths had the captain not got the sub surfaced and managed to get a number of crewmen off the sub.
  • This is what happens when you have a system that allows the corporation to run amuck.

    The lowest bidder cannot be trusted to create products that are safe.

    In these cases, it is good to still have some government oversight.

    • That is quite a typical knee-jerk response.

      "the lowest bidder cannot be trusted to create products that are safe."

      Crap! If the lowest bid is for an unsafe product, then it isn't a bid for the project... If someone accepts a bid for what is essentially something other than the project for which they requested bids (i.e., an unsafe version of the goal) then they are foolish; corporations running amuck have nothing to do with it.

      It's easy to associate low price = low quality, but that simply is too simple. After all, many of the greatest foulups are when a nonlow bid is chosen for 'political' reasons.
  • by ewhac ( 5844 ) on Friday May 17, 2002 @09:20PM (#3541090) Homepage Journal

    Even if you never get near embedded systems of this type, you can't call yourself a responsible software engineer until you read and learn from An Investigation of the Therac-25 Accidents [vt.edu].

    Executive Summary: Company introduces next-generation radiation therapy machine, replacing hardware-based overdosage safety interlocks with software-based mechanisms. Software fails. People are killed.

    Schwab

  • by efuseekay ( 138418 ) on Friday May 17, 2002 @09:40PM (#3541147)
    Every engineer has their own stories of how they SNAFU-ed. I have mine (one of the reasons why I wuss-ed out and now do theoretical physics instead :)).

    Usually, the problem is :

    (a) Pushing Envelope without prior analysis (Vasa)
    (b) Not exercising Due Diligence in design (Tacoma Narrows)
    (c) Failure of communication between departments (Mars Climate Orbiter : remember the units SNAFU?)
    (d) Insufficient redundancy design (Iroquois Fire)
    (e) Failure to recognize likely failure modes (Concorde, Titanic)

    and others of course.
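    For item (c), one common defence against a units mix-up is to never pass a bare number across an interface. The sketch below is a minimal illustration, assuming the pound-force-seconds versus newton-seconds mismatch usually cited for Mars Climate Orbiter; the class and its unit strings are hypothetical, not any real flight software API.

```python
from dataclasses import dataclass

LBF_S_TO_N_S = 4.4482216152605  # newton-seconds per pound-force-second

@dataclass(frozen=True)
class Impulse:
    """An impulse value that always carries its unit."""
    value: float
    unit: str  # "N*s" or "lbf*s"

    def to_newton_seconds(self):
        """Convert to SI, refusing any unit this sketch does not know."""
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        raise ValueError(f"unknown unit: {self.unit!r}")
```

    The point of the design is that the receiving department cannot read the number without also stating which unit it expects, so a mismatch becomes a conversion or an error rather than a silent factor of about 4.45.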

    I once fucked up an expensive spacecraft component because of (c). I worked on the mechanical design of the component housing; some electronics guy worked on the electronics detector sitting inside my housing. We had an innovative design whereby some of my mechanical supports were designed to keep some of his electronics ICs in place without the PCB board. The SNAFU: both of us thought the other was supposed to apply anti-vibration gel (layman's term here, we call it RTD...).

    So the part was fab-ed, electronics put in, and the whole thing was sent to a vibration table for testing..

    Result : a loose IC, clanking around the housing for 2 minutes at about 600Hz. The whole thing was toast.

    • And *THAT* is why you do the test. Imagine if the part hadn't been tested. It's a hell of a lot cheaper to replace/repair a satellite part destroyed in a test lab, as compared to doing it in orbit.
    • by statusbar ( 314703 ) <jeffk@statusbar.com> on Saturday May 18, 2002 @12:04AM (#3541552) Homepage Journal
      Those are all good points.

      Another problem I have seen was where TWO different bugs mostly functionally cancelled each other out causing new intermittent problems.

      I made a realization regarding strict-type checking languages versus dynamic typed languages.

      Typically, people who are used to Java and C++ complain about languages like Python, saying that the compiler should catch static type problems at compile time and that languages that do not do this are inherently unsafe.

      Then I realized that ALL of these people must not be running any real tests on their code! If they were running real tests on all their code (every line must be executed in the tests), then these dynamic typing errors would be easily caught! Those would be the easiest bugs to find.

      Too often I have seen C and C++ coders compile their project.... No errors! Ship it! :-)
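      A small sketch of the claim above: a dynamic type error sits on one branch and stays invisible until a test actually executes that branch. The function and the test runner are hypothetical examples written for this illustration.

```python
def describe(count):
    """Hypothetical function with a type bug hiding on one branch."""
    if count == 0:
        return "no items"
    return "items: " + count  # bug: str + int raises TypeError at run time

def run_branch_tests():
    """Execute every branch and collect the names of any that blow up."""
    failures = []
    for name, arg in [("zero", 0), ("nonzero", 3)]:
        try:
            describe(arg)
        except TypeError:
            failures.append(name)
    return failures
```

      A test suite that only ever calls describe(0) passes; one that covers every line reports the "nonzero" branch immediately.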

      Another issue I have been thinking about is the relationship between code reuse and unexpected behaviours. Code reuse (and object class reuse) is fine as long as all of the functionality and limitations of the object/code are known.

      However, for more complex class hierarchies I have seen people say, "I'll just inherit from this class publicly and change the public interface to match what I need for this project," and then they are surprised when other pre-written code interacts funny with it. I'm not saying object-oriented is bad; I'm saying it is so common for programmers to break the basic concepts of OOP.
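      Here is a minimal sketch of that failure, using made-up classes: a subclass quietly changes what an inherited public method means, and client code written against the base class breaks.

```python
class Stack:
    """Base class: push adds one item, pop removes the most recent one."""
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)

class DedupStack(Stack):
    """Subclass that changes push: duplicates are silently ignored."""
    def push(self, item):
        if item not in self._items:
            super().push(item)

def push_twice_pop_twice(stack, item):
    """Pre-written client code that relies on the base-class contract."""
    stack.push(item)
    stack.push(item)
    stack.pop()
    stack.pop()  # raises IndexError if the second push was dropped
    return len(stack)
```

      The client works with Stack and fails with DedupStack, even though DedupStack "is a" Stack as far as the type hierarchy is concerned.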

      I had one manager who was adamant that for any medium sized project there ought to be NO time spent on making the code re-usable. Every line of code should be directly related to specific aspects of the customer's requirements/specification document. At first I thought he was crazy.

      But after I saw some projects expand into massive class hierarchies just for the sake of the illusion of increasing the reusability of the code in other projects, I am starting to side with him a bit more.

      Extreme Programming has at least some very good points about it, e.g. don't add features until you know you need them. Otherwise they probably won't be tested properly and won't be a good match for the new use. You can't predict every environment that the code may be reused in. It is harder to do than it sounds.

      So for high-reliability systems I think one should have simple, non-abstracted code that can be measured and prodded, and that is always predictable. Then you can fashion your unit tests accordingly.

      --jeff++

      P.S.: scary thought/rant for today: How much C++ code do you see that is striving to be exception safe so that out-of-memory errors will be caught properly? How many C++ coders understand that the default Linux kernels and libraries will almost NEVER cause malloc() to return 0 and will almost NEVER cause operator new() to throw? Only virtual memory space is allocated. Real memory pages are only allocated as they are being used. Once all physical and swap pages are used, blammo goes your app (and possibly other apps on your system). In semi-critical systems, this is a real problem that is often overlooked.

      Where is the real problem in this case? Part of the problem is that the C++ environment running on the default Linux kernel does not conform to the standard.

      The other part of the problem is that it is little known. If it were commonly known, people would be able to design around it (or change the kernel options). So people rely on what the documentation says, instead of properly testing the software limits.
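
      For anyone who wants to check which policy their own machine uses, here is a minimal Python sketch. It assumes a Linux system that exposes the setting at /proc/sys/vm/overcommit_memory; the function names are invented for illustration, and the descriptions are a rough summary rather than the kernel's exact wording.

```python
# Rough summary of the three overcommit policies (wording is illustrative).
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (the usual default)",
    1: "always overcommit",
    2: "strict accounting; allocations can fail up front",
}


def describe_overcommit(raw):
    """Map the raw text of the overcommit setting to a short description."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, "unknown mode")


def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Return a description of the current policy, or None if unreadable."""
    try:
        with open(path) as f:
            return describe_overcommit(f.read())
    except OSError:
        # Not Linux, or the file is not accessible.
        return None
```

      Under mode 2 an allocation request is more likely to fail at the point of the call, which is the behaviour that exception-safe code is written to expect.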
      • Analogies are dangerous, but consider a tail light assembly. Other than something like a bumper clamp-on type of thingy, you have almost no chance of being able to reuse it from one model of car to another. Your manager is right that no time should be spent on making the code reusable. It is worthwhile making the code a bit more general than necessary, but the crux is in making the code match the edge conditions that exist in the customer's requirements. Those create subtle little distinctions that do NOT transfer well.
      • I had one manager who was adamant that for any medium sized project there ought to be NO time spent on making the code re-usable. Every line of code should be directly related to specific aspects of the customer's requirements/specification document. At first I thought he was crazy.

        I had a guy who thought dynamic memory allocation should be avoided at all costs, and you should never use a data structure more complex than an array.

        I still think he's crazy, but now I see his point. I mean, he was terrible for global variables and giant functions, but his programs never leaked memory and very rarely wrote to bad pointers. If you don't need dynamic memory allocation, you shouldn't use it, and when you do need it, you should only have one malloc and one free (or equivalent) for every dynamic data structure. Often, you only need one or two, even in a relatively large and featureful program. That way, I can write a good page of error handling code and comments on memory consumption for each dynamic memory access, and it saves me a lot of grief.

        I don't like reusing code, either, unless you can make a good case for it being a part of the underlying system. I like the analogy of an architect stapling someone else's blueprint of a fully-equipped foundry and machine shop to his design because the inhabitants will need a screwdriver. Reuse means bloat, and bloat is bad. Every extra line you add is another place for a bug to hide.
    • The Swedish warship, Vasa, also failed due to unrealistic timescales and lack of requirements validation. Many of these technology failures are really process/project management failures, of course.

      I saw the Vasa in its museum the other week in Stockholm - they retrieved the ship from the bottom of the harbour and it is now on display, with very interesting exhibits about how it was built. Worth a visit if you are ever in Stockholm.


    • So some gel is supposed to hold the thing together? I hope it was a VCR or something and not a jumbo jet.
  • I can't believe they didn't put the Tacoma Narrows Bridge on there!
    • The two Quebec City bridge collapses would have been good. Hundreds of workmen were killed.
    • Tacoma Narrows is exactly the sort of disaster he wasn't putting on the list. Just about everybody has seen the film, and in fact it's mostly well known because of the film.

      But in fact, I don't believe anybody was killed or injured, making it trivial compared to many other bridge collapses and disasters. The bridge was new so it didn't even disrupt life that much. It's an interesting failure and a cool film, but it was not a disaster, nor is it unknown.

      Things that are much greater omissions include things in this thread, like Halifax, the Chinese dams, Tenerife etc.
  • Imagine if DigiScents hadn't run out of money.

    At least the air freshener industry would benefit for the next 20 years as we attempt to de-stink the world
  • In case anyone is interested this story is in the current issue of the dead-tree edition of the magazine. Really interesting stuff!
  • OK, maybe the number of deaths wasn't a record, but the Space Shuttle Challenger disaster should rank up there as a technological disaster (anyone remember Feynman's presentation about the O-rings?)
    • They mentioned Challenger. The reason they didn't explore it in depth is that it's a well-known event that has been discussed at great length recently. Same with Titanic. The article chose to focus on events that are not as well known as the more popular ones. And they succeeded, since other than the AT&T incident, I wasn't aware of any of them.

      -Restil
  • Concorde (Score:3, Interesting)

    by Wyatt Earp ( 1029 ) on Friday May 17, 2002 @10:06PM (#3541213)
    I was at work, and when I walked by a radio I caught something about Concorde. I yelled to my boss "The Concorde crashed I think!". He said. "No way, it can't crash, it's the Concorde."

    For me, an aerospace buff, that crash was as big as the Challenger.

    I remember when the transcripts from the Concorde crash were released. It was really chilling, thinking about those pilots, knowing something bad is happening, and trying with all their might to abort to Le Bourget, and that big Delta is stalling and Christian Marty can only say "Too late".
  • Good summary (Score:3, Insightful)

    by bubblegoose ( 473320 ) <bubblegoose@@@gmail...com> on Friday May 17, 2002 @10:41PM (#3541292) Homepage Journal
    It is necessary for us to learn from others' mistakes. You will not live long enough to make them all yourself.
    - Admiral Hyman G. Rickover
  • AFAIK every single large software project undertaken by the UK Government has been a huge unmitigated disaster, from the Inland Revenue to Social Security to Air Traffic Control. They're not just massively delayed, with costs spiralling way over budget, but they don't even work when finally delivered. The one highlight is that Andersen (sorry, Accenture) had to pay millions in penalty payments, unlike others such as EDS. And you know when the large tenders come again there will only be the usual suspects on the list. I think the UK deserves a special "Persevering Towards Failure" award.

    Phillip.
  • DMCA (Score:2, Insightful)

    by racerx509 ( 204322 )
    would the DMCA count?


  • Saw a rather interesting documentary on the Triangle Shirtwaist Factory fire in New York (I think) near the turn of the century. Essentially, a sweatshop went up in flames, and the owners had padlocked all the emergency exits. Whoever didn't burn to death plunged to the ground below, diving out of windows.

    A couple of people have probably mentioned the Hindenburg. The Hindenburg didn't crash because of sabotage, because of any engineering errors, or even because it was filled with hydrogen. None of those is a valid reason, especially the hydrogen theory. The hydrogen gas inside the airship was doped with a substance that smelled like garlic, so the engineers and crew could smell hydrogen leaks if they occurred. None were reported. An airship like the Hindenburg contained pure hydrogen. Pure hydrogen by itself will NOT burn -- it has to be mixed with enough oxygen before it can ignite, and that mixture wasn't present inside the ship. Besides, the footage of the accident clearly shows that there was no explosion -- it was only the outer skin that caught fire. The outer skin of the Hindenburg was coated with a combination paint and sealant that was both highly flammable AND electrically conductive. The prevailing theory on why the Hindenburg crashed is that the airship collected so much static electricity during its descent into New Jersey (in a brief window in between thunderstorms, even) that the charge eventually arced and ignited the outer skin of the craft. The Hindenburg crashed to earth not because of fire, but because of hydrogen loss... all because of a poorly chosen paint job, oddly enough.

    Cheers,

  • From the article: The result was the most lavishly appointed and heavily armed warship of its day, but one too long and too tall for its beam and ballast--a matchless array of features on an unstable platform.

    That's like Windows, right?
    • No, Windows would be a ship of the same size with _one_ cannon that fired non-standard sized balls, sometimes fast enough to do damage, but more often than not so slowly that they would just plop out. It would still be unstable, but they wouldn't even be able to get 3 sailors on board before it started rocking. It would be launched anyway, and even though it would sink after 5 minutes, 500 more ships of the same specs would be built.
  • MS Outlook (Score:2, Flamebait)

    by captaineo ( 87164 )
    Speaking of technology disasters: what about Microsoft Outlook, whose many unfixed security flaws have brought about waves of email-borne viruses, costing millions of dollars in lost data and productivity?
  • The Vasa - that's the Swedish warship that sank at the start of its maiden voyage - was raised from the seabed in 1961 and is now on display in a museum in Stockholm. I saw it in the late 1970s when the fragile timber was still being sprayed with a solution of polyethylene glycol to give it enough strength to bear its own weight as it was gradually dried out.

    It's now a massive visitor attraction. However, that's not without its own unfortunate side effects: I heard a report a few weeks back on the BBC that the wood is now rotting again in places due to the humidity in the air from the visitors' breath, perspiration, damp outer clothes on rainy days, etc.

    More information at the Vasa Museum [vasamuseet.se].

  • and now makes for a good museum [vasamuseet.se] in Stockholm, where you can learn the history and see the warship Vasa.
  • by Lethyos ( 408045 ) on Saturday May 18, 2002 @11:03AM (#3542640) Journal
    Does Slashdot fall into this so-called "technology failures/disasters" category?

"It takes all sorts of in & out-door schooling to get adapted to my kind of fooling" - R. Frost
