Software News

Software Upgrade Crashes UK Air Traffic Control System

pitpe writes "Earlier today the computer system controlling most of the UK's airspace failed, after tests in preparation for an upgrade failed. The original failure occurred at the West Drayton centre, which is an old (70's) system, as opposed to the new system at Swanwick, which has had its own problems. A system-wide reboot to fix the system resulted in the entire system being taken down temporarily."

  • Re:More problems... (Score:5, Interesting)

    by Mr_Silver ( 213637 ) on Thursday June 03, 2004 @10:14AM (#9325166)
    "If you want to know what is wrong with transport in this country it is that over decades successive governments did not spend enough on the infrastructure and air traffic control is no different," Darling told BBC radio."

    A Dutch friend of mine once remarked that she didn't understand the mentality of the British. "You," she said, "have an amazing tendency to run things into the ground and then get around to fixing them rather than spending money on continually maintaining them so they never fall apart."

    It's a very good point.

  • by b06r011 ( 763282 ) on Thursday June 03, 2004 @10:17AM (#9325200)
    at least only the computers crashed

    as for the system crashing in the first place, it's unfortunate, but a good thing that they were able to cope and keep everyone safe - that's the main thing, right? (it's certainly my main concern)

    and as for the software not being up to the job, it may well not be. after all, air traffic has increased ever so slightly since the 1970's - is it reasonable to expect a program presumably designed for 70's hardware, and 70's air traffic loads to cope with heathrow in 2004?

  • Yes, I think that the software structure of a critical realtime system like ATC is much more important than which OS or language it's written in. It should be built like a strange composite stranded cable, with different strands of simple structure that can survive sporadic (even systemic) failure of its parts. In such a system, there should be no such thing as a system-wide reboot, since the only thing that is truly system-wide is the data.

    Without this structure, Linux would probably fail at an unacceptable rate too.
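
One way to picture the "stranded cable" idea above is the minimal sketch below. It assumes hypothetical `Strand` and `Cable` classes (neither comes from any real ATC system): each strand handles work independently, a failing strand is restarted on its own, and the only thing shared system-wide is the data.

```python
from typing import Callable, Dict, List


class Strand:
    """One independent worker that can fail and be restarted on its own."""

    def __init__(self, name: str, handler: Callable[[str], str]) -> None:
        self.name = name
        self.handler = handler
        self.restarts = 0

    def process(self, item: str) -> str:
        return self.handler(item)

    def restart(self) -> None:
        # Only this strand is reset; the shared data is left untouched.
        self.restarts += 1


class Cable:
    """Routes work across strands; a failing strand is restarted alone."""

    def __init__(self, strands: List[Strand]) -> None:
        self.strands = strands
        self.results: Dict[str, str] = {}  # the only system-wide thing: data

    def submit(self, key: str, item: str) -> bool:
        for strand in self.strands:
            try:
                self.results[key] = strand.process(item)
                return True
            except Exception:
                # Local recovery only, then fall through to the next strand.
                strand.restart()
        return False  # every strand failed; still no global reboot
```

In this toy version there is simply no operation that restarts everything at once, which is the property the comment argues for. A real system would also need health checks, state replication and timeouts, none of which are shown here.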
  • by grub ( 11606 ) <slashdot@grub.net> on Thursday June 03, 2004 @10:30AM (#9325341) Homepage Journal

    Yeah but the Y2K problem was "discovered" way back in the 70s. Banks doing 25 year mortgages in 1975 would extrapolate into 2000 and "whoops!" Any place which had Y2K problems gets no sympathy from me. :P
  • Hang on a second... (Score:3, Interesting)

    by Gordon Bennett ( 752106 ) on Thursday June 03, 2004 @10:41AM (#9325460)
    To quote from the NATS (National Air Traffic Services) press release:

    "The FDP was being tested overnight for a future upgrade. The system was successfully returned to service but at 06.03 errors were detected in the distribution of flight data between Centres. As a precaution, we decided to restart the FDP (known as a cold restart) causing an interruption to full service. The data processing system was restored at 06.42 and declared fully operational at 07.03. Flight capacity restrictions were lifted at 08.05. The system is now fully operational and we are confident that it is stable.

    Through the response team at West Drayton, we have been working with airports and airlines to clear the delayed departures, and expect the backlog to be cleared quickly.

    Our investigation into the cause of the problem is continuing."

    Let me get this straight: they ran a test on the FDP. The FDP glitched. They rebooted the FDP. They are still investigating the problem.
    Now, unless I am mistaken, I can only infer from their statement above that they are now running the FDP which is still susceptible to the problems highlighted by the test.
  • by YrWrstNtmr ( 564987 ) on Thursday June 03, 2004 @10:56AM (#9325643)
    Yes they did, and no, using a cell phone is not a certainty to cause problems.

    It does, however, carry the potential to introduce errors in various systems.
    Would you want the altimeter to read 200 feet too high, or have an uncommanded left turn, because some numbnuts is yakking on the cellphone?

    "DC-9 flight crew experienced an involuntary turn [cio.com] by the autopilot during cruise. Autopilot reacted normally after the captain asked passengers to turn off any personal electronic devices. Crew later learned that a cell phone in an overhead bin was heard during the time of the autopilot problem."
  • Re:More problems... (Score:3, Interesting)

    by plopez ( 54068 ) on Thursday June 03, 2004 @11:36AM (#9326168) Journal
    The same is true in the private sector. No money for plant maintenance until something breaks and threatens a lucrative contract.

    Or the management mentality of 'Oh, security is too expensive right now, we'll ship it and fix it later'.

    Politicians only look to the next election and managers only look to the next quarter. It is a typical attempt by non-technical types to ignore entropy, expressed quite nicely in the old saying 'rust never sleeps.' If you want a bridge to last, paint it today, not after it has rusted out and collapsed. The analogy holds in many ways in many different areas.

    And we as voters and consumers let them get away with it.
  • by orbitalia ( 470425 ) on Thursday June 03, 2004 @12:14PM (#9326700) Homepage
    Oops I lost my link there - Jovial Lives! [af.mil]
  • Re:More problems... (Score:3, Interesting)

    by Silburn_Luke ( 672738 ) on Thursday June 03, 2004 @02:10PM (#9327948)
    To be fair (not that I hold any affection for Mrs T. in my heart) the rot stretches back a lot further than the 70s.

    I'd say the UK has been letting the infrastructure maintenance slide since at least WW2, maybe earlier. We inherited a fantastic installed base from the Victorians - the fact that it took 50 years of neglect to rot away is a tribute to how well they built - but the sad fact is this stuff was put together by a world-spanning Empire at the top of its game. What with paying for a couple of world wars and then trying to keep up Great Power appearances in the postwar world, we didn't have enough cash to keep this installed base up to scratch or replaced in anything like a timely fashion.

    Unfortunately what has taken 50 years to fall to pieces is likely to take about as long to put back together again and (I have it on very good authority) *that* is the real reason why Blair and Brown are so keen on PFIs, despite them being such a poor deal for UK plc in the long term. It's not because they are a cunning dodge to keep spending off the Treasury books and plump up the bottom line numbers for the current electoral cycle (although that's a handy side-effect); it's because they know that they or their like-minded successors cannot stay in charge for the decades that a full infrastructure overhaul is going to take and they want to make damned sure that nobody raids the infrastructure warchest after their watch.

    What one government gives another can take away after all, so from their perspective it's no good kicking off a massive overhaul project now if a Conservative government is able to come along in a few years' time and gut it for tax handouts just when it's about to pay off. What handing out those juicy multi-decade PFI contracts does is lock in a powerful City-based constituency who will scream bloody murder if a future Chancellor tries to raid those revenue streams for a quick handout.

    It doesn't make much fiscal sense, but politically it's quite astute.

    Regards
    Luke
  • Re:Golden rules.. (Score:3, Interesting)

    by hughk ( 248126 ) on Thursday June 03, 2004 @02:57PM (#9328438) Journal
    A big bank did this, only they thought it was in UAT, however it registered itself with production as wanting to collect equity trades. It did, and very well too. They realised by the end of the day that the production backoffice was only seeing a fraction of the number of trades expected. Some poor bastard then had to trawl through the UAT database pulling out trades that were really intended to go to production and put them in the right place. I heard it took a couple of weeks. This is a shame because trades usually must be settled two business days after trading.
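
The failure described above (a UAT component registering itself with production) is the kind of thing an explicit environment check at registration time can catch. A minimal sketch, assuming a hypothetical `TradeFeedRegistry` class and made-up environment labels `"PROD"` and `"UAT"`; nothing here reflects how that bank's systems actually worked:

```python
from typing import List, Tuple


class EnvironmentMismatch(Exception):
    """Raised when a component tries to register outside its own environment."""


class TradeFeedRegistry:
    """Hypothetical registry of subscribers for one environment's trade feed."""

    def __init__(self, environment: str) -> None:
        self.environment = environment
        self.subscribers: List[Tuple[str, str]] = []

    def register(self, subscriber: str, subscriber_env: str) -> None:
        # Refuse cross-environment registration instead of silently accepting it.
        if subscriber_env != self.environment:
            raise EnvironmentMismatch(
                f"{subscriber} is configured for {subscriber_env}, "
                f"not {self.environment}"
            )
        self.subscribers.append((subscriber, subscriber_env))
```

Failing loudly at registration is the design choice here: a rejected UAT collector is an annoyance on day one, not weeks spent moving trades back to where they belonged.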
