Microsoft Technology

The London Stock Exchange Goes Down For Whole Day 792

Colin Smith writes "TradElect, the Microsoft .Net based trading platform for the London Stock Exchange, was offline for about seven hours, meaning that their 5-nines SLAs are shot for approximately the next 100 years. The TradElect system was launched back in June of 2007 and was designed for increased speed and system capacity."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward on Monday September 08, 2008 @04:25PM (#24924597)

    ...now if only my wife would do that! /rimshot!

    • by east coast ( 590680 ) on Monday September 08, 2008 @04:32PM (#24924749)
      Oh, she does... just not with you.

      nudge nudge, wink wink.
    • by MrJerryNormandinSir ( 197432 ) on Monday September 08, 2008 @09:24PM (#24928015)

      That was their first mistake. What were they thinking? You need three highly available Unix clusters with three SANs. You need three to elect a quorum. If you don't know what a quorum is, you shouldn't be attempting to design a system that is supposed to deliver on a 5-nines SLA. Each geographic location should include one cluster and one SAN, with all three locations networked with dark fiber. Fiber routing should be set up so that a cluster can fail over to a SAN in another location. As far as hardware is concerned, I would go with a cluster of IBM P6-570s and use an EMC Symmetrix DMX SAN at each site.
      Who the heck designed this? .Net trading platform.. I have to laugh! Microsoft .net = 5.none SLA! .Net is only good for people who would like to create a light duty website. Under a load it breaks. The London Stock Exchange proves my point.
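
The "you need three to elect a quorum" point above can be sketched with a strict-majority rule. This is only an illustration: the function names are hypothetical, and real cluster managers use more elaborate voting schemes (weighted votes, tie-breaker disks, and so on).

```python
def has_quorum(reachable_sites: int, total_sites: int) -> bool:
    """Strict majority rule: more than half of all sites must be reachable."""
    return reachable_sites > total_sites // 2


def survives_single_site_loss(total_sites: int) -> bool:
    """True if the remaining sites still form a majority after one site is lost."""
    return has_quorum(total_sites - 1, total_sites)


if __name__ == "__main__":
    for sites in (2, 3):
        print(sites, "sites, one lost, quorum kept:", survives_single_site_loss(sites))
```

With two sites, losing either one leaves one vote out of two, which is not a majority, so neither half can safely carry on alone. With three sites, the two survivors still outvote the missing one.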

      • by hughk ( 248126 ) on Tuesday September 09, 2008 @03:06AM (#24929919) Journal

        WTF did a moderator mark this as flamebait? The poster was right, HA is a) hard and b) expensive.

        I designed some of the HA stuff many years ago for Eurex [eurexexchange.com]. We used OpenVMS and had two clusters (over 40 km apart) for the main and standby, with the standby system also being used for development. With a flick of a switch the standby cluster could take over production. We had no SANs in those days but used Digital's Hierarchical Storage Controllers. These days it runs with SANs, but the host systems still run VMS and there are now product-specific clusters.

        The next level down there are access points containing communications servers providing connectivity to member systems and routing to the hosts which are scattered around the globe. A member normally has connectivity to two access points. The only single point of failure for a member is where both lines come together for the last few metres into their building and some idiot digs a hole in the road.

      • by Aceticon ( 140883 ) on Tuesday September 09, 2008 @03:53AM (#24930119)

        I work in London as a freelancer in IT in Investment Banking. My professional experience was mostly with IT Products/Services companies.

        Although I haven't worked in the LSE, from the places I've worked in around here I came out with the impression that most people in IT in this industry are amateurs (and that includes those in other geographical locations).

        Any kind of more advanced IT concepts such as technical analysis, software/hardware architecture, or iterative software development processes are pretty much either not done or done by people who don't have a clue about what they're doing.

        I'm hardly surprised with what happened in the LSE.

        • by Capt James McCarthy ( 860294 ) on Tuesday September 09, 2008 @06:53AM (#24930797) Journal

          I have a feeling that the 'normal' IT situation was to blame for this.

          Preamble: Technical Expertise provided a wonderful architecture that was HA and robust, fast, and scalable.

          Bean Counters looked at the cost and said "You Tech guys spend too much money."

          IT architects: "How much is your data worth?"

          Bean Counters: "Not this much. Look we don't really need all of these systems. My home system has been working for 4 years with no problems. And I've talked with Microsoft Execs and they will cut us a deal for their platform. Now go away, I've just decided how the architecture will be done. Why did we hire you anyways?"

  • That's okay (Score:5, Funny)

    by sokoban ( 142301 ) on Monday September 08, 2008 @04:25PM (#24924607) Homepage

    most of the american stock exchanges have been going down all year.

  • by xmas2003 ( 739875 ) * on Monday September 08, 2008 @04:26PM (#24924615) Homepage
    Assuming 8.5 hour trading day (0700-1530) and 250 trading days/year. Maybe a squirrel caused the problem ... ;-) [komar.org]
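
A quick sketch of the arithmetic behind a five-nines downtime budget, using the figures quoted in this thread (a seven-hour outage, an 8.5-hour trading day, 250 trading days a year). The function names are made up for illustration, and the round-the-clock case is added here only for comparison.

```python
def allowed_downtime_hours(service_hours_per_year: float,
                           availability: float = 0.99999) -> float:
    """Hours of downtime per year that still meet the availability target."""
    return service_hours_per_year * (1.0 - availability)


def years_to_absorb(outage_hours: float,
                    service_hours_per_year: float,
                    availability: float = 0.99999) -> float:
    """Years of otherwise perfect service needed to average out one outage."""
    return outage_hours / allowed_downtime_hours(service_hours_per_year, availability)


if __name__ == "__main__":
    trading_hours = 8.5 * 250   # trading-hours-only basis from the comment above
    clock_hours = 24 * 365      # round-the-clock basis, assumed for comparison
    print(round(years_to_absorb(7, trading_hours), 1))
    print(round(years_to_absorb(7, clock_hours), 1))
```

Counting only trading hours, the yearly budget is about 76.5 seconds, so a seven-hour outage takes roughly 329 years to average out. On a round-the-clock basis it is roughly 80 years, which is closer to the summary's "approximately 100 years".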
  • Oh, my. (Score:4, Interesting)

    by cp.tar ( 871488 ) <cp.tar.bz2@gmail.com> on Monday September 08, 2008 @04:26PM (#24924621) Journal

    So what happens when this happens again?

    • Re:Oh, my. (Score:5, Funny)

      by gnick ( 1211984 ) on Monday September 08, 2008 @04:46PM (#24924991) Homepage

      The same thing that happened this time?

    • Re:Oh, my. (Score:5, Funny)

      by Eudial ( 590661 ) on Monday September 08, 2008 @04:47PM (#24925005)

      So what happens when this happens again?

      Well, first "Have you tried turning it off and on again?"
      Otherwise, "Are you sure it's plugged in?"

    • Re:Oh, my. (Score:5, Interesting)

      by im_thatoneguy ( 819432 ) on Monday September 08, 2008 @04:49PM (#24925031)

      Actually this is "again".

      The LSE used to run on HP-NonStop (w/ Cobol and C as far as I can find) but still managed to take itself down for 8 hours in 2000.

      If they're going to go down for a day every 7-8 years it might as well be cheaper and faster. (Articles quote the CTO as citing 10x performance increases).

      (All based on a quick google search)

      So before the hounds descend upon Microsoft, it would seem the LSE has a history of managing to bring down whatever system they run on.

      • Re:Oh, my. (Score:5, Funny)

        by Darkness404 ( 1287218 ) on Monday September 08, 2008 @04:56PM (#24925121)
        But new computer systems should make things more reliable, along with more experienced coders and better languages.
        • Re:Oh, my. (Score:5, Informative)

          by im_thatoneguy ( 819432 ) on Monday September 08, 2008 @05:03PM (#24925213)

          Which from the sounds of this article http://www.computerweekly.com/Articles/2008/06/12/231031/agile-trading-software-critical-to-london-stock-exchange.htm [computerweekly.com] was the intent.

          One very interesting note is at the end of the article:

          Timeline for Tradelect upgrades

          18 June 2007: Tradelect launched, reducing the time taken to process trades from 140 milliseconds to 10 milliseconds. Capacity increased from 593 to 2,500 orders a second.

          November 2007: Version 2 upgrade. Trading time reduced from 10 milliseconds to about 6 milliseconds. Capacity increased by 70% from 2,500 to 4,200 orders a second. Introduced full suite of Mifid-compliant services.

          September 2008: Planned migration of Italian trades to Tradelect platform.

          September 2008: Tradelect Version 2 to launch. Plans to double trading capacity to 10,000 continuous messages per second. Aims to cut average time taken to complete a trade by half from 6 milliseconds to 3 milliseconds.

          Coincidence that this month was when they intended to release a new version?

        • Re:Oh, my. (Score:5, Insightful)

          by GigaplexNZ ( 1233886 ) on Monday September 08, 2008 @05:07PM (#24925295)
          These "better languages" are easier to use which allows for less experienced coders to perform the tasks. This is not an ideal world we live in.
          • Re:Oh, my. (Score:5, Insightful)

            by mrjb ( 547783 ) on Monday September 08, 2008 @05:47PM (#24925855)

            These "better languages" are easier to use which allows for less experienced coders to perform the tasks.

            I couldn't disagree more. Although automatic garbage collection is nice, this doesn't mean that you'll get "five nines uptime" systems by working with "less experienced" coders.

            If you're building a system that must guarantee 99.999% uptime, you wait until your best professionals become available, because it doesn't only involve code. You DON'T give the job to the less experienced ones, no matter how great the programming language. Five nines uptime requires a very robust design and very solid code quality running on a very solid platform which is running on a very solid OS on a very solid infrastructure. You'll want everything to be tested by unit tests, integration tests, regression tests, and whatnot. That involves a whole lot more than 'just' coders, but whoever works on it, they'd better be good at it.

  • Ugly Day (Score:5, Informative)

    by pyite ( 140350 ) * on Monday September 08, 2008 @04:26PM (#24924623)

    It was an ugly day of finger-pointing and near-fixes, but in the end, it just left all the financial firms standing there staring at the Exchange. Definitely was a big deal--and it seemed like a lot of volume spilled over to US markets, creating volume related issues here.

  • by 3seas ( 184403 ) on Monday September 08, 2008 @04:26PM (#24924629) Homepage Journal

    .... a method of controlling the market.

  • by caluml ( 551744 ) <slashdot@spamgoe ... minus herbivore> on Monday September 08, 2008 @04:26PM (#24924631) Homepage
    But Patch Tuesday is tomorrow?
  • by R2.0 ( 532027 ) on Monday September 08, 2008 @04:26PM (#24924639)

    Looks like someone needs to brush up on their buzzwords, specifically "mission critical" and "services no longer required".

  • single page (Score:5, Informative)

    by Anonymous Coward on Monday September 08, 2008 @04:30PM (#24924705)

    I wish people would get into the habit of linking to the single page version of the FA [reuters.com].

  • Misleading summary (Score:5, Informative)

    by denoir ( 960304 ) on Monday September 08, 2008 @04:31PM (#24924723)
    The summary implies that TradElect was responsible for the shutdown, but according to the stock exchange itself, it wasn't [itworld.com] the case. They say instead it was a network problem.
    • by tgatliff ( 311583 ) on Monday September 08, 2008 @04:41PM (#24924909)

      Why the heck they were using MS Windows for this type of environment is stunning... Transactional processing, which is the bulk of this type of setup, is where Solaris and Linux excel. Any company that builds a system like that on .Net should be thrown out on the street.

      In short.. Not to rock on Windows, but different platforms always offer different strengths..

      • by japhering ( 564929 ) on Monday September 08, 2008 @04:52PM (#24925061)

        As is normally the case, M$ threw lots of money at the exchange to get it to switch from a Unix/Linux base to Windows .NET so that M$ can tout that a major exchange is running Windows.

        There were full-page ads touting the switch, and the reasons they cited were better throughput and better uptime.

        They even had ads touting it here on /.

    • by Hyppy ( 74366 ) on Monday September 08, 2008 @04:45PM (#24924971)
      If it was a network problem, then they're in more trouble than the summary implies. It's relatively simple to get 100% uptime (minus a dropped packet or two) in a network. The key here is redundancy. If you throw enough hardware at it, yes, it will not break.

      Internal? Dual(+) homed servers, redundant switches, redundant AC, redundant power.
      External? BGP on 2 or more transits on separate physical runs.

      What, you say that you need to account for natural disasters? Then get a second site, at least a few hundred miles away, and repeat.

      Virtual 100% uptime is a solved problem in the networking world.
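
The redundancy argument above can be put in numbers with the usual parallel-availability formula. This is a rough sketch and it assumes the redundant copies fail independently. That assumption is doing a lot of work, since shared software, configuration, or a bad upgrade can take out every copy at once. The function name and the 99% figure are illustrative.

```python
def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components when any one suffices.

    Assumes failures are independent, so all copies are down together
    with probability (1 - a) ** copies.
    """
    if copies < 1:
        raise ValueError("need at least one component")
    return 1.0 - (1.0 - component_availability) ** copies


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "copies:", parallel_availability(0.99, n))
```

Under that assumption, each extra copy of a 99%-available link or switch adds about two more nines: two copies give roughly 99.99%, three give roughly 99.9999%.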
    • by caluml ( 551744 ) <slashdot@spamgoe ... minus herbivore> on Monday September 08, 2008 @05:05PM (#24925271) Homepage
      Although:

      The Johannesburg Stock Exchange, which uses the LSE's trading platform TradElect, also suspended trading.

      Hmm. Smells like a new version to me.

  • by markana ( 152984 ) on Monday September 08, 2008 @04:33PM (#24924765)

    "and was designed for increased speed and system capacity"

    and see - it went down far faster and more completely than the previous system would have been able to. So that's progress. It's all in how you present it.

  • 5 nines? (Score:5, Funny)

    by andreyvul ( 1176115 ) <[andrey.vul] [at] [gmail.com]> on Monday September 08, 2008 @04:38PM (#24924847)

    So their 9.9999% uptime is screwed?

  • by heroine ( 1220 ) on Monday September 08, 2008 @04:42PM (#24924921) Homepage

    After the malfunction, TradElect was immediately bought by the UK's government for $200 billion and all its debts waived. In an unrelated story, medicare tax was raised yet again because of an unexpected shortfall.

  • by Bert64 ( 520050 ) <bert@[ ]shdot.fi ... m ['sla' in gap]> on Monday September 08, 2008 @04:45PM (#24924981) Homepage

    Does anyone else remember the "The london stock exchange chose windows 2003 for reliability, they didn't choose linux" ad banners that used to run all over the place, including slashdot if i remember?
    Funny how it's all come crashing down...

    "The london stock exchange chose windows, but after 7 hours of downtime wishes they had chosen linux".

  • 5-nines SLA (Score:5, Informative)

    by skeeto ( 1138903 ) on Monday September 08, 2008 @04:46PM (#24924987)

    "5-nines SLA"

    I had to look this up, so I imagine other people didn't know it either (I thought it was a stock exchange term). First Google search result reveals the answer:

    The Battle With "3 Nines" and The Goal of "5 Nines" [cubiccompass.com]

  • ketan (Score:5, Interesting)

    by ketan324 ( 1085019 ) on Monday September 08, 2008 @04:52PM (#24925059)
    The LSE going down is a big deal. The US exchanges have been trying very hard to displace LSE's stronghold in the EUROPEAN markets. With the merger of NYSE/Euronext and NASDAQ/OMX, this cuts market share and faith in LSE as every day passes. Additionally, with continued tech issues, NASDAQ could reinvigorate their bid for LSE again! I work for a major data vendor, and I know from experience the NYSE and NASDAQ are much more reliable than their European counterparts. Also, LSE going down today is huge, considering the news on Fannie/Freddie, WAMU, Lehman, and the WRONG news on United Airlines. Many arbitrage opportunities were lost for LSE traders.
  • Quote .NET (Score:4, Funny)

    by Legion_SB ( 1300215 ) on Monday September 08, 2008 @04:53PM (#24925081) Homepage
    .NET garbage collector: "Oops, that wasn't garbage!"
  • It's official. (Score:4, Insightful)

    by RightSaidFred99 ( 874576 ) on Monday September 08, 2008 @04:56PM (#24925119)
    Most of you are morons. Let me get this straight. TradElect is .NET based. TradElect failed. Ergo, Windo$e sucks, M$ sucks. and .N$T sucks, etc... You'd think you were technically illiterate morons or something who think that all or even most system failures are caused by the platform or programming language.

    Let me explain computers to you. See, the developer uses a set of platforms, languages, integration components, etc.. to deliver his functionality to the end user. A failure at any level can cause the application to fail. It could be application logic, network issues, hardware issues, integration with third party systems, a dipship systems administrator, etc...

    And yet the 90-105 IQ SlashDweeb set comes out in numbers with no data and says "lolz Windoze! .NET haha!". Crikey.

    • Re:It's official. (Score:4, Insightful)

      by Alioth ( 221270 ) <no@spam> on Monday September 08, 2008 @06:43PM (#24926513) Journal

      Well, no, it's just that Microsoft shouted long and hard about how reliable the LSE would be now it was running on Windows Server System 2003. So it's deliciously ironic that after all this trumpet tooting, it still fell flat on its face, regardless of the reason...since Microsoft's ads were obviously to get everyone to believe that the system would be highly reliable.

  • by reverseengineer ( 580922 ) on Monday September 08, 2008 @05:16PM (#24925411)

    President of Exchange: [Randolph Duke has just collapsed with shock] Mortimer, your brother is not well. We better call an ambulance.

    Mortimer Duke: Fuck him! Now, you listen to me! I want trading reopened right now. Get those brokers back in here! Turn those machines back on!

    [shouts - it echoes pathetically throughout the trading hall]

    Mortimer Duke: Turn those machines back on!

  • Get The Facts (Score:4, Informative)

    by moderators_are_w*nke ( 571920 ) on Monday September 08, 2008 @05:24PM (#24925553) Journal

    "In the past six years, there have been no production outages at the London Stock Exchange, and the new systems running on Microsoft technologies are critical to maintaining this 100 per cent reliability record."

    http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=200042 [microsoft.com]

    • Re:Get The Facts (Score:5, Insightful)

      by kesuki ( 321456 ) on Monday September 08, 2008 @07:15PM (#24926863) Journal

      Right from your article "and be cheaper to manage"

      sounds like the LSE fired expensive, knowledgeable admins and went for 'cheaper' ones, there is your problem right there. windows server isn't perfect, but clearly they had good hardware, were running mission critical apps, but went with cheaper less experienced admins.

      also, your fine article specified there were 'no production outages', they don't claim the system ran 24/7/365 with no reboots or glitches, but that there was no production outages for six years. there is quite a bit of difference. the former states that admins and hardware were able to offer the specific services needed at the time it was needed for 6 years, but not on the amount of redundant hardware, etc required to accomplish everything.

      so given everything i've read here, under experienced windows admin approves an under tested system upgrade that epic fails, and takes down the production server for the first time in 6 years. no shock here, they wanted to cut corners on admin costs, they brought the epic fail on themselves.

    • Re:Get The Facts (Score:5, Interesting)

      by narcberry ( 1328009 ) on Monday September 08, 2008 @07:43PM (#24927135) Journal

      Interesting since they haven't been "running on Microsoft technologies" for "the past six years"...

  • Bad upgrade (Score:5, Informative)

    by JShadow21 ( 871404 ) on Monday September 08, 2008 @05:42PM (#24925777)
    The article here [computerweekly.com] blames it on some sort of botched upgrade.
  • by alexmin ( 938677 ) on Monday September 08, 2008 @06:08PM (#24926095)
    Here: http://www.londonstockexchange.com/en-gb/products/membershiptrading/tradingservices/Incident/LIVE [londonstockexchange.com]
    Notice that there were several unsuccessful attempts to bring it back up.
    What's really pitiful, LSE has just a fraction of data/trade volume of major US exchanges like Nasdaq or NYSE and still, their systems are regularly getting hosed, albeit not as much as today's meltdown.
    Hopefully in coming years LSE will lose market share to Nasdaq/Europe, BATS/Europe, Chi-X and other electronic markets - that should teach them well.
  • by synthespian ( 563437 ) on Monday September 08, 2008 @07:48PM (#24927185)

    IIRC, Brazil Bovespa had a small glitch last month or two.

    Back in the day when Wall Street and financial markets ran on Solaris systems (AFAIK), this shit wasn't common.

    Now it's probably going to become *acceptable* for stock exchanges and aviation reservation software to crash.

    Apparently, there's a new generation of a-holes on the system administration markets who grew up with Windows and the Blue Screen of Death, that thinks it's acceptable for operating systems to crash, once in a while. Is it evolution?
