The London Stock Exchange Goes Down For Whole Day
Colin Smith writes "TradElect, the Microsoft .Net based trading platform for the London Stock Exchange, was offline for about seven hours, meaning that their 5-nines SLAs are shot for approximately the next 100 years. The TradElect system was launched back in June of 2007 and was designed for increased speed and system capacity."
99.9967% Uptime if up the next 100 years (Score:5, Informative)
Ugly Day (Score:5, Informative)
It was an ugly day of finger-pointing and near-fixes, but in the end, it just left all the financial firms standing there staring at the Exchange. Definitely was a big deal--and it seemed like a lot of volume spilled over to US markets, creating volume related issues here.
single page (Score:5, Informative)
I wish people would get into the habit of linking to the single page version of the FA [reuters.com].
Misleading summary (Score:5, Informative)
Re:Misleading summary (Score:5, Informative)
Internal? Dual(+) homed servers, redundant switches, redundant AC, redundant power.
External? BGP on 2 or more transits on separate physical runs.
What, you say that you need to account for natural disasters? Then get a second site, at least a few hundred miles away, and repeat.
Virtual 100% uptime is a solved problem in the networking world.
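The redundancy argument above can be put in rough numbers. This is a minimal sketch, assuming every redundant component fails independently and has the same availability; real deployments only approximate that (a shared power feed or a bad config push breaks the model), and the 99% starting figure is purely illustrative, not a measurement of any real network.

```python
def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components when any single one suffices.

    Assumes independent failures, which is an idealisation.
    """
    unavailability = 1.0 - component_availability
    return 1.0 - unavailability ** copies


# Illustrative input only: one link that is up 99% of the time.
single = 0.99
dual = parallel_availability(single, 2)      # two independent links at one site
two_sites = parallel_availability(dual, 2)   # the same setup repeated at a second site

print(f"single link: {single:.8f}")
print(f"dual homed:  {dual:.8f}")
print(f"two sites:   {two_sites:.8f}")
```

Under these assumptions each doubling squares the unavailability, which is why the poster treats network uptime as solved; it says nothing about software faults that hit every replica at once.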
5-nines SLA (Score:5, Informative)
"5-nines SLA"
I had to look this up, so I imagine other people didn't know it either (I thought it was a stock exchange term). The first Google search result reveals the answer:
The Battle With "3 Nines" and The Goal of "5 Nines" [cubiccompass.com]
Re:How many failures before.. (Score:5, Informative)
Also he said support was crucial for his company. If something went down, he wanted to be able to call someone immediately. He couldn't afford to just post a question on a message board and hope someone replies. He wanted contracts with 3rd party support that had experience with similar huge enterprise systems that he had.
When I said there were companies who could provide excellent Linux support, he said his ass was on the line if something broke, so he wanted to be able to justify his software choice to the C-level guys. And those guys knew the name Microsoft. So he didn't see anything else as an option.
Re:let me be the first to say (Score:2, Informative)
bollocks
Some info on the system itself (Score:3, Informative)
http://www.computerweekly.com/Articles/2006/09/26/218637/city-prepares-to-test-new-trading-platform.htm [computerweekly.com]
I bet the fingers are pointing today - Accenture (formerly Andersen Consulting) India vs HP vs Microsoft.
Re:Oh, my. (Score:5, Informative)
Which from the sounds of this article http://www.computerweekly.com/Articles/2008/06/12/231031/agile-trading-software-critical-to-london-stock-exchange.htm [computerweekly.com] was the intent.
One very interesting note is at the end of the article:
Timeline for Tradelect upgrades
18 June 2007: Tradelect launched, reducing the time taken to process trades from 140 milliseconds to 10 milliseconds. Capacity increased from 593 to 2,500 orders a second.
November 2007: Version 2 upgrade. Trading time reduced from 10 milliseconds to about 6 milliseconds. Capacity increased by 70% from 2,500 to 4,200 orders a second. Introduced full suite of Mifid-compliant services.
September 2008: Planned migration of Italian trades to Tradelect platform.
September 2008: Tradelect Version 2 to launch. Plans to double trading capacity to 10,000 continuous messages per second. Aims to cut average time taken to complete a trade by half from 6 milliseconds to 3 milliseconds.
Coincidence that this month was when they intended to release a new version?
Re:100 years? (Score:5, Informative)
5 nines does not mean what you think it means.
No, you're right. By my calculation, the actual figure is more like 360 years.
(Remember, this is a system that only operates 7.5 hours per day, 250 days per year)
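The arithmetic behind these figures can be sketched as follows. This assumes "five nines" means 99.999% availability measured only over scheduled operating hours, and that the system runs without any further outage. The function name is hypothetical, and the 6.75-hour outage is only an assumption that happens to reproduce the 360-year figure; the summary says "about seven hours".

```python
def years_to_recover(outage_hours: float, target_availability: float,
                     hours_per_day: float, days_per_year: float) -> float:
    """Years of otherwise perfect operation before the long-run average meets the target."""
    allowed_downtime_fraction = 1.0 - target_availability
    required_operating_hours = outage_hours / allowed_downtime_fraction
    return required_operating_hours / (hours_per_day * days_per_year)


FIVE_NINES = 0.99999

around_the_clock = years_to_recover(7.0, FIVE_NINES, 24.0, 365.0)
trading_hours = years_to_recover(7.0, FIVE_NINES, 7.5, 250.0)
shorter_outage = years_to_recover(6.75, FIVE_NINES, 7.5, 250.0)

print(f"24x365 operation, 7 h outage:     {around_the_clock:.0f} years")
print(f"7.5 h x 250 days, 7 h outage:     {trading_hours:.0f} years")
print(f"7.5 h x 250 days, 6.75 h outage:  {shorter_outage:.0f} years")
```

Counting around-the-clock operation gives roughly 80 years, in line with the summary's "approximately 100"; counting only trading hours gives roughly 360 to 375 years, depending on the exact outage length.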
Re:What, no ads? (Score:1, Informative)
"The london stock exchange chose windows, but after 7 hours of downtime wishes they had chosen linux".
Link is here:
http://www.microsoft.com/india/getthefacts/lse.aspx
Re:In other NEWS... (Score:5, Informative)
No, he'd waggle his arse.
A fanny would be a vagina in Britain.
Come on +5 informative!
Get The Facts (Score:4, Informative)
"In the past six years, there have been no production outages at the London Stock Exchange, and the new systems running on Microsoft technologies are critical to maintaining this 100 per cent reliability record."
http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=200042 [microsoft.com]
Re:Misleading summary (Score:5, Informative)
You have no clue. When people mention Linux in these environments they mean Linux running on one of these [ibm.com], not a home-brew distro running on a $150 PC.
Re:Misleading summary (Score:5, Informative)
Bad upgrade (Score:5, Informative)
It could be .. but wasn't :) (Score:3, Informative)
But it wasn't any of the above. The Stock Exchange failed after a failed upgrade of the Microsoft
Re:Still don't know why... (Score:5, Informative)
To be fair (Score:5, Informative)
Of course it is very unlikely that MS achieves five 9s on any installation, let alone as an average.
Why is Microsoft getting dragged into this discuss (Score:3, Informative)
Accenture built the Tradelect platform in India between late 2004 and March this year
Talk Like a Pirate Day (Score:3, Informative)
Is it "Talk like an Ass-Pirate Day" already?
No, TLAP Day is next week.
Link to incident status page (Score:5, Informative)
Notice that there were several unsuccessful attempts to bring it back up.
What's really pitiful is that LSE has just a fraction of the data/trade volume of major US exchanges like Nasdaq or NYSE, and still their systems regularly get hosed, albeit not as badly as in today's meltdown.
Hopefully in coming years LSE will lose market share to Nasdaq/Europe, BATS/Europe, Chi-X and other electronic markets - that should teach them well.
Re:It appears high load/usage crippled the system. (Score:4, Informative)
I'm not sure I understand the distinction you're trying to draw,
Latency versus throughput. If the new system processed those serially while the old could handle 130 in parallel, then the old system would be 10x faster even though the new was 10x quicker.
but total transaction capacity of the system increased along the same lines.
Yes, after throwing massive amounts of hardware at the problem.
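The latency-versus-throughput point can be illustrated with Little's law (average work in flight = throughput x latency). This is only a sketch: the 130-way parallelism is the poster's hypothetical, and feeding the published capacity and latency figures into the formula assumes the system runs at capacity with the quoted average latency, which the source does not claim.

```python
def throughput_per_second(latency_ms: float, in_flight: float) -> float:
    """Orders completed per second when `in_flight` orders are processed concurrently."""
    return in_flight * 1000.0 / latency_ms


def implied_in_flight(orders_per_second: float, latency_ms: float) -> float:
    """Little's law: average number of orders in the system at once."""
    return orders_per_second * latency_ms / 1000.0


# The poster's hypothetical: slow but parallel versus quick but serial.
old_hypothetical = throughput_per_second(latency_ms=140.0, in_flight=130)
new_hypothetical = throughput_per_second(latency_ms=10.0, in_flight=1)
ratio = old_hypothetical / new_hypothetical

# The published figures (593 orders/s at 140 ms, 2,500 orders/s at 10 ms).
old_published = implied_in_flight(593, 140.0)
new_published = implied_in_flight(2500, 10.0)

print(f"hypothetical old vs new throughput: {old_hypothetical:.0f} vs {new_hypothetical:.0f} orders/s")
print(f"ratio: {ratio:.1f}x")
print(f"implied orders in flight, old vs new: {old_published:.0f} vs {new_published:.0f}")
```

In the hypothetical, the system with 14 times the latency still delivers about 9 times the throughput, which is the distinction being drawn: a lower per-order time does not by itself imply higher capacity.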
Re:How many failures before.. (Score:5, Informative)
No, but I can point to the New York Stock Exchange, which uses AIX and Linux [techtarget.com].
Vietnam outperforms London (Score:2, Informative)
7 hours? Is that all you can do? We managed a 3 day outage earlier this year at the Ho Chi Minh City stock exchange.
http://www.bloomberg.com/apps/news?pid=20601087&sid=aCTlooFV6H0Y&refer=home [bloomberg.com]
Re:Potentially misleading summary (Score:5, Informative)
Well, the Reuters article does say that trading started normally, but some traders were unable to connect, so the whole exchange was brought down to avoid unfair advantage/disadvantage occurring, so actually both stories are consistent.
Re:The London Stock Exchange Goes Down For Whole D (Score:1, Informative)
Re:Misleading summary (Score:3, Informative)
Well another poster [slashdot.org] pointed out this story [computerweekly.com] with a juicy quote:
Re:Good lord, they're running on Windows? Why? (Score:1, Informative)
http://searchdatacenter.techtarget.com/news/article/0,289142,sid80_gci1254860,00.html [techtarget.com]
http://www.itjungle.com/big/big052008-story01.html [itjungle.com]
Though, to be fair, the NYSE also had a huge, embarrassing outage of its own in 2006 IIRC (not to mention a well-documented outage in 2001 caused by a software bug pushed to their mainframes) - I guess there's no such thing as 100% uptime...
Re:How many failures before.. (Score:3, Informative)
IBM z/OS and AIX and Sun Solaris are at such a level that they would alert the respective vendors before anything catastrophic actually happens. While the systems keep running on a Parallel Sysplex backup without any employee even noticing, IBM/Sun engineers would be fixing the issue. Ask any serious bank why they keep buying/using mainframes.
The term is "Autonomic computing" http://en.wikipedia.org/wiki/Autonomic_Computing [wikipedia.org]
I would investigate the decision maker who had the genius idea of running a financial, time-critical process on the Windows 2003 and .NET platform.
Re:Oh, my. (Score:3, Informative)
unless they are union in which case it would be the last hire.
Re:The London Stock Exchange Goes Down For Whole D (Score:3, Informative)
/rimshot!
Here you go. [instantrimshot.com]
status page (Score:2, Informative)
Here is their status page.
http://www.londonstockexchange.com/en-gb/products/membershiptrading/tradingservices/Incident/LIVE [londonstockexchange.com]
same below, but may not render properly
Incident Updates
Time Market Status Exchange Action
Client Impact
Client action
6.43pm Market Closed
The Exchange regrets the earlier interruption to trading and is conducting further investigations. It is in the process of confirming all of the steps necessary to ensure trading can commence as scheduled tomorrow.
Further updates this evening will be published on this website.
Monitor this Website.
4.49pm Market Closed
Closing auction has now finished. Closing prices, where relevant, have been disseminated.
4.21pm Closing Auction
This is to inform you that due to on-going connectivity issues to resume a fair and stable market the Closing Auction will commence from 16:21 onwards. The Closing Auction will uncross as scheduled at 16:35 onwards (subject to a 30 second random period).
4.00pm Continuous trading
Standard trading schedule will be followed for the remainder of the day.
3.45pm Auction
The auction will uncross at 16:00 BST (subject to a 30 second random period) at which time continuous trading will resume.
There will be no further change to the remainder of the trading day. Therefore, the Closing Auction will commence as scheduled at 16.30 and uncross at 16:35 (subject to a 30 second random period).
From this time market maker quotes in both quote and order driven markets will be firm.
Prepare to resume trading
3.30pm Auction
The International Order Book and International Bulletin Board will NOT be available for automatic execution for the rest of today.
No closing prices will be issued in these trading segments (IOB, IOBU, ITBB and ITBU) today.
The remaining Trading Segments will remain in an auction phase. A further update will be provided.
3.11pm Auction
We will be re-enabling connectivity from 3.15pm
Connectivity will be phased and following completion all order book segments will remain in an auction phase.
Once connectivity is established orders can be entered and deleted, but no electronic execution will occur until the uncrossing and commencement of continuous trading.
2.38pm Auction
To ensure consistent connectivity we are suspending connectivity to trading for a short period from 2.45pm
Once connectivity is established orders can be entered and deleted, but no electronic execution will occur until the uncrossing and commencement of continuous trading.
During this time customers are required to reset their log on connection status to ensure legitimate connections can be established once connectivity is re-enabled
2.20pm Auction
We are continuing to establish connectivity with our customers. This process is taking longer than expected.
A further update will be provided.
Once connectivity is established orders can be entered and deleted, but no electronic execution will occur until the uncrossing and commencement of continuous trading.
1.13pm Auction
We are continuing to establish connectivity with our customers.
A further update will be provided shortly.
Once connectivity is established orders can be entered and deleted, but no electronic execution will occur until the uncrossing and commencement of continuous trading.
12.30 Auction
We are continuing to establish connectivity with our customers.
Continuous trading will re-commence at the end of the auction period. We will provide at least 15 minutes notice of when we plan to end the
Re:Tee Hee (Score:3, Informative)
Certain languages have features that eliminate large classes of errors. Whilst it's possible that programmers will find other ways to screw up, I'd have thought that reducing the set of errors that are actually possible would go some way to improving reliability.
With a general purpose programming language, the number of ways to screw up is effectively infinite. If you take another infinite set, say, the integers, and eliminate a large subset, say the even integers, you still have an infinite set left over. The GP is simply pointing out that there will always be programmers who screw up in ways that haven't been eliminated.
Using Microsoft for a 5-nines SLA? Is that a joke? (Score:5, Informative)
That was their first mistake. What were they thinking? You need 3 highly available Unix clusters with three SANs. You need three to elect a quorum. If you don't know what a quorum is, you shouldn't be attempting to design a system that is supposed to deliver on a 5-nines SLA. Each geographic location should include 1 cluster and 1 SAN, with all three locations networked by dark fiber. Fiber routing should be set up so that a cluster can fail over to a SAN in another location. As far as hardware is concerned, I would go with a cluster of IBM P6-570s and an EMC Symmetrix DMX SAN at each site. A .Net trading platform... I have to laugh! Microsoft .net = 5.none SLA! .Net is only good for people who would like to create a light-duty website. Under load it breaks. The London Stock Exchange proves my point.
Who the heck designed this?
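For readers who do not know what a quorum is, the "you need three" claim can be sketched with a simple majority rule. This assumes a plain majority quorum with one vote per site; real cluster managers may add tie-breaker disks or weighted votes, and the function names here are hypothetical.

```python
def quorum_size(members: int) -> int:
    """Smallest group that forms a strict majority of `members` voters."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many voters can be lost while a majority can still be formed."""
    return members - quorum_size(members)


def has_quorum(members: int, reachable: int) -> bool:
    """True if the reachable voters form a strict majority."""
    return reachable >= quorum_size(members)


for sites in (2, 3):
    print(f"{sites} sites: quorum of {quorum_size(sites)}, "
          f"survives {tolerated_failures(sites)} site failure(s)")
```

With two sites the loss of either one leaves no majority, so neither side can safely take over; with three, any single site can fail (or be cut off) and the remaining two still agree on who is in charge.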
Re:How many failures before.. (Score:4, Informative)
Off the top of my head, I know that all the LiffeConnect-based systems (London Financial Futures Exchange, EuroNext, Amsterdam, CBOT Metals Complex, Tokyo Futures Exchange, probably a couple of others) run on Linux (a relatively recent change from Sun boxen). NYSE now owns that codebase, and I'm pretty sure that the NYSE uses Linux and AIX on its own platform.
The Chicago Mercantile Exchange's GLOBEX trading engine (running CME, CBOT non-Metals, NYMEX plus a couple smaller exchanges like Minneapolis and Kansas City) platform runs on Linux. They migrated from Solaris to Red Hat back in 2004.
The Intercontinental Exchange's WebICE platform is written in Java and I believe it's running on Linux, but there may be some Solaris still around.
The CBOEdirect system is Java but runs mostly on Sun Enterprise hardware. There is some Linux in the mix, and they certainly use it on some of their other trading systems.
In the (futures and options) trading world, running on Windows servers is considered to be a sure sign of being bush-league. Demand for UNIX/Linux is huge. And I'm not saying this as a Java/UNIX/Linux snob - most of the systems I've written were Microsoft-based (for a variety of reasons - most started out as technology demonstrations that grew way beyond their intended lifespan - "the client's always right").
Re:That's okay (Score:3, Informative)
I heard it was more like 37 times.
Re:Using Microsoft for a 5-nines SLA? Is that a jo (Score:5, Informative)
I work in London as a freelancer in IT in Investment Banking. My professional experience was mostly with IT Products/Services companies.
Although I haven't worked in the LSE, from the places I've worked in around here I came out with the impression that most people in IT in this industry are amateurs (and that includes those in other geographical locations).
More advanced IT concepts such as technical analysis, software/hardware architecture and iterative software development processes are pretty much either not done or done by people who don't have a clue about what they're doing.
I'm hardly surprised with what happened in the LSE.
Re:Latency is kinda pointless for this kind of stu (Score:3, Informative)
It is not inserting into a trade, it is creating a trade when a difference in value is noticed. Arbitrage levels out prices between different exchanges, allowing people to trade on either without worrying if they would be getting a better price elsewhere. It is parasitic in the same sense that the oil pump is parasitic in a car - it doesn't add any power, but things turn much more freely with it in place, and it exacts a charge for doing so. Essentially, an arbitrageur is constantly shopping around for bargains and evening them out, meaning that ordinary traders don't need to do so.
Re:Oh, my. (Score:3, Informative)
"Complete" is an industry term. They use "complete", "cross" and "trade" in the same way.
What it means is:
* that there is currently a set of offers and asks on the exchange. Other people have submitted those already
* you want to buy/sell one of these as appropriate
* you send down your legally binding request for selling/buying that amount
* the matching algorithm sees that your sell/buy matches the buy/sell on the exchange
* that particular offer is then "yours" and no-one else can have it.
* you are notified of the success of your "trade"/"cross" which is "complete"
So the "match" occurs in 3ms. This is important, because otherwise you have to wait longer to know if you need to look elsewhere. There's more to it of course, this is just a mega-simple case.
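The steps listed above can be sketched as a toy matcher. This is an illustration of the "mega-simple case" only, not how Tradelect or any real exchange works: the `Order` type and `match` function are hypothetical, partial fills are ignored, and ties are broken by position in the book.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Order:
    owner: str
    side: str        # "buy" or "sell"
    price: float
    quantity: int


def match(incoming: Order, book: List[Order]) -> Optional[Order]:
    """Return the resting order that `incoming` trades against and remove it
    from the book, or None if nothing on the book matches."""
    opposite = "sell" if incoming.side == "buy" else "buy"
    candidates = [
        o for o in book
        if o.side == opposite
        and o.quantity == incoming.quantity          # simplification: no partial fills
        and (o.price <= incoming.price if incoming.side == "buy"
             else o.price >= incoming.price)
    ]
    if not candidates:
        return None
    # Best price from the incoming party's point of view; earliest wins a tie.
    if incoming.side == "buy":
        best = min(candidates, key=lambda o: o.price)
    else:
        best = max(candidates, key=lambda o: o.price)
    book.remove(best)    # the offer is now "yours" and no one else can have it
    return best


book = [
    Order("alice", "sell", 10.05, 100),
    Order("bob", "sell", 10.02, 100),
    Order("carol", "buy", 9.98, 100),
]
first = match(Order("you", "buy", 10.03, 100), book)
second = match(Order("you", "buy", 10.00, 100), book)
print(first)
print(second)
```

The second request finds nothing at its price, which is the case where a fast answer matters: the sooner you hear "no match", the sooner you can look elsewhere.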
Re:Latency is kinda pointless for this kind of stu (Score:2, Informative)
No, arbitrage is a necessary component of orderly markets. A perfectly balanced market has 0 arbitrage opportunity.
Example: ETF Conversion/Redemption. The ETF is priced in real-time based on the price of its components. Fast systems are able to detect small discrepancies in the price of some of the components and the basket as a whole and are able to execute trades ( say BUY ) on the components and the contra trade ( SELL ) on the ETF and then redeem the ETF from the components to pair off the contra trade.
This arbitrage always works to keep the components perfectly in line with the ETF itself. I think that is a good thing.
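The ETF example can be sketched as a comparison between the ETF's quoted price and the value of its component basket. The tickers, holdings, prices and cost threshold below are all made up for illustration, and a real desk would also account for fees, spreads and execution risk.

```python
from typing import Dict


def basket_value(prices: Dict[str, float], holdings: Dict[str, float]) -> float:
    """Value of the components backing one ETF share."""
    return sum(prices[symbol] * holdings[symbol] for symbol in holdings)


def arbitrage_signal(etf_price: float, prices: Dict[str, float],
                     holdings: Dict[str, float], cost_per_share: float) -> str:
    """Decide which side to trade when the ETF drifts from its basket value."""
    gap = etf_price - basket_value(prices, holdings)
    if gap > cost_per_share:
        return "buy components, sell ETF"    # ETF is rich relative to the basket
    if gap < -cost_per_share:
        return "buy ETF, sell components"    # ETF is cheap relative to the basket
    return "no trade"


# Hypothetical ETF: each share is backed by 0.5 AAA and 0.25 BBB.
holdings = {"AAA": 0.5, "BBB": 0.25}
prices = {"AAA": 40.0, "BBB": 80.0}          # basket is worth 40.00 per share

for quote in (40.10, 39.90, 40.02):
    print(quote, arbitrage_signal(quote, prices, holdings, cost_per_share=0.05))
```

Each trade the signal triggers pushes the ETF and its components back toward each other, which is the levelling effect the comment describes.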