Software Upgrade Crashes UK Air Traffic Control System

pitpe writes "Earlier today the computer system controlling most of the UK's airspace failed, after tests in preparation for an upgrade failed. The original failure occurred at the West Drayton centre, which is an old (70's) system, as opposed to the new system at Swanage, which has had its own problems. A system wide reboot to fix the system resulted in the entire system being taken down temporarily."
This discussion has been archived. No new comments can be posted.

  • Lucky in the US... (Score:5, Informative)

    by Kredal ( 566494 ) on Thursday June 03, 2004 @09:59AM (#9325014) Homepage Journal
    Considering that up until about 2000, all of the major Air Traffic Control centers in the US were running on vacuum tubes, we were lucky nothing like this ever happened here. Sure, there were glitches at regional centers that required controllers to do everything by hand, but nothing that required a full reboot of the entire country's ATC system.

    Hopefully the UK will get the new system tested and online before it causes more problems!
  • More problems... (Score:5, Informative)

    by Mz6 ( 741941 ) * on Thursday June 03, 2004 @10:01AM (#9325035) Journal
    I found a similar article on MSNBC [msn.com].

    It seems they have been having problems with their computer systems since 2001, when the service was "privatized".

    "The air traffic service has been beset by problems since it was partially privatized in 2001. A $484 million center at Swanwick in southern England opened five years late in 2002.

    The opening was delayed by problems with computer software, and the glitches continued for months afterward, as controllers misread aircraft altitudes and destinations because of hard-to-decipher computer screens. In at least one case, controllers mistook the Scottish city of Glasgow for Cardiff in Wales."

    Now, that seems like a pretty big mistake to me, especially for an air traffic controller to make. However, the article later states that:

    "Transport Secretary Alistair Darling said Thursday's problem did not lie at Swanwick but at the older West Drayton center, which is due to be closed by 2007."

    Thank goodness the old one is closing; however, it doesn't sound like its replacement is doing any better!

    "If you want to know what is wrong with transport in this country it is that over decades successive governments did not spend enough on the infrastructure and air traffic control is no different," Darling told BBC radio.

    Excellent quote! While terrorism is on everyone's mind, we sometimes forget that transportation safety should be just as high a priority. I couldn't imagine pilots relying on themselves to fly airplanes amid the thousands of others without the aid of traffic controllers and their computers.

  • by Kredal ( 566494 ) on Thursday June 03, 2004 @10:03AM (#9325050) Homepage Journal
    Because it IS old hardware.
  • Links for reference (Score:4, Informative)

    by matthew.thompson ( 44814 ) <matt@acERDOStuality.co.uk minus math_god> on Thursday June 03, 2004 @10:04AM (#9325064) Journal
    National Air Traffic Services http://www.nats.co.uk/services/index.html are the outfit responsible for this.

    They have a press release http://www.nats.co.uk/news/news_stories/2004_06_03.html which explains quite nicely what they did and why.
  • by gowen ( 141411 ) <gwowen@gmail.com> on Thursday June 03, 2004 @10:07AM (#9325089) Homepage Journal
    It's already back up and running (and has been since this morning, BST). Now the only delays are caused by clearing the backlog of grounded flights.
  • Same in Ireland! (Score:5, Informative)

    by pixelbeat ( 31557 ) <P@draigBrady.com> on Thursday June 03, 2004 @10:09AM (#9325120) Homepage
    Much the same thing happened last week in Dublin [ireland.com]
  • by boschmorden ( 610937 ) on Thursday June 03, 2004 @10:11AM (#9325134)
    http://abclocal.go.com/ktrk/news/050404_local_airport.html
  • A string of failures (Score:0, Informative)

    by Anonymous Coward on Thursday June 03, 2004 @10:12AM (#9325145)
    It's not too surprising. After all, when the system was developed it was re-tendered 2 or 3 times because of gross failures; I think it was something like 8 years overdue and 20M over budget.

    Hurray another British triumph!
  • by perly-king-69 ( 580000 ) on Thursday June 03, 2004 @10:17AM (#9325205)
    The new centre is at Swanwick in Hampshire, not Swanage in Dorset!!
  • by Xilman ( 191715 ) on Thursday June 03, 2004 @10:22AM (#9325259) Homepage Journal
    The new system is at Swanwick near Southampton, not Swanage as posted here.

    Swanage is a pleasant little seaside resort. I know it well and stayed there a few nights when on my honeymoon.

    Finding Swanwick and Swanage on a map of southern England is left as an exercise. Hint: Mapquest [mapquest.co.uk] may be a good place to start.

    Paul

  • by aldoman ( 670791 ) on Thursday June 03, 2004 @10:33AM (#9325373) Homepage
    There are 2 ATC centers in the UK. The first is West Drayton, which is for the 4 major London airports only (Heathrow, Stansted, Gatwick and London City). This is a 70s system and is due to be replaced by 2006. This is the one that crashed, but because a large percentage of UK air traffic is destined for London, it brought the other one to a standstill too.

    The other one at Swanage handles the ATC for everywhere else. This was replaced with a new system in 2002.

    But, by 2006 hopefully all ATC in the UK will be running on new systems.
  • Vacuum Tubes (Score:1, Informative)

    by Anonymous Coward on Thursday June 03, 2004 @10:54AM (#9325615)
    Considering that up until about 2000, all of the major Air Traffic Control centers in the US were running on vacuum tubes

    I got news for you. All the air traffic control centers in the US are *still* running on vacuum tubes. What do you think the CRT displays in all the radars and computers are?

    In fact the computer I'm sitting at right now depends upon a vacuum tube as one of its most important parts, without which it would be rather worthless :-)
  • by Shimbo ( 100005 ) on Thursday June 03, 2004 @10:54AM (#9325626)
    "which is an old (70's) system". As long as it's not 30-year-old hardware then the software should still be fine. Why does everyone think that simply because software was written in the past it is bad?

    Sadly, it really is running on ~30 year old hardware, at least in part. I've spoken to some of the service engineers.
  • Re:Three fingers (Score:4, Informative)

    by arkanes ( 521690 ) <arkanes@NoSPam.gmail.com> on Thursday June 03, 2004 @11:01AM (#9325704) Homepage
    Amazingly, it was also true. That was an SGI file browser running on Irix. See http://www.sgi.com/fun/freeware/3d_navigator.html [sgi.com]
  • by Anonymous Coward on Thursday June 03, 2004 @11:06AM (#9325753)
    Your regional centers cover about the same airspace as the West Drayton centre, so it is hardly fair to compare a US regional TRACON to the London FIR (comparing big regions to small countries....). Also, it wasn't a reboot of the entire country's ATC system; it was a reboot of the host computer system at West Drayton, responsible for the printing of flight strips and processing of flight information at the Drayton centre. The radar at West Drayton was still functional and airports still have their own individual radar services, and this outage was only one FIR. The Scottish FIR was not technically affected by this crash; most of the effects were simply knock-on, because many flights pass into the London FIR or are destined for airports within that region, so you can hardly call this the "entire country's ATC system".

    Again, delays have been increased due to the route rotation most low cost carriers in the UK (and indeed the world) have been using; rather than have one aircraft doing one route, an aircraft may operate many different sectors during the day. Delays caused at one sector can cause other sectors and entirely different flights to be screwed up due to the lack of aircraft because of previous delays. You'll find that the majority of "mainstream" carriers' delayed flights are clearing up quicker than those of the lo-co airlines.

    The only way this could be described as the entire country's ATC system is when you consider that London D&D (Distress and Diversion) is [in]conveniently located at West Drayton, and would be down during this outage. London D&D is responsible for monitoring 121.5 ("guard" frequency) across the whole country (Scottish and possibly Shannon FIR included) and becomes responsible for handling just about any aircraft emergency before handoff to local ATCOs.
    Whilst D&D wouldn't be affected technically, regulations require that the information provided by the HCS at West Drayton is always available to D&D, thus forcing a D&D shutdown in the event of the HCS crash. There is enough contingency to negate the use of D&D in an emergency, and handoff to local controllers immediately instead, but I don't think NATS and the CAA were willing to operate without D&D due to regulations and risk of litigation. Many GA flights (even those operating locally and un-filed under VFR) were also grounded due to the lack of D&D. It's a stupid system where one system needlessly relies on another then gets scared into submission when the other goes tits up.
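
To make the rotation point above concrete, here is a rough Python sketch of how one early delay can ripple through every later sector flown by the same aircraft. Everything in it is an illustrative assumption rather than anything from NATS or an airline: the `Sector` class, the 25-minute minimum turnaround, the timetable values, and the simplification that flight times stay unchanged.

```python
from dataclasses import dataclass


@dataclass
class Sector:
    name: str        # hypothetical sector label
    sched_dep: int   # scheduled departure, minutes after midnight
    sched_arr: int   # scheduled arrival, minutes after midnight


def propagate_delay(sectors, initial_delay, min_turnaround=25):
    """Return the departure delay (in minutes) of each sector flown by one aircraft.

    Assumes flight time is unchanged by the delay and that the aircraft needs
    at least `min_turnaround` minutes on the ground between sectors.
    """
    delays = []
    prev_actual_arr = None
    for i, s in enumerate(sectors):
        if i == 0:
            actual_dep = s.sched_dep + initial_delay
        else:
            # Cannot leave before the aircraft has arrived and been turned around.
            actual_dep = max(s.sched_dep, prev_actual_arr + min_turnaround)
        delay = actual_dep - s.sched_dep
        delays.append(delay)
        prev_actual_arr = s.sched_arr + delay
    return delays


# Tight rotation: only the minimum turnaround between sectors.
tight = [Sector("A-B", 360, 450), Sector("B-A", 475, 565), Sector("A-C", 590, 680)]
# Slack rotation: 90 minutes on the ground between sectors.
slack = [Sector("A-B", 360, 450), Sector("B-A", 540, 630), Sector("A-C", 720, 810)]

print(propagate_delay(tight, 60))  # delay carries through the whole day
print(propagate_delay(slack, 60))  # ground time absorbs the delay
```

With a tight schedule the first hour of delay is never recovered, while a schedule with spare ground time absorbs it after one sector, which is one plausible reading of why the two kinds of carrier recover at different speeds.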
  • Re:So what? (Score:4, Informative)

    by Scooter ( 8281 ) <owen@ann[ ]ova.force9.net ['icn' in gap]> on Thursday June 03, 2004 @11:28AM (#9326069)
    Hmm, I don't think there are those humans around. Certainly not in the quantities that would be required to manually guide the 1200 flights a day. We get dependent on the systems. We put the systems in because the load increases beyond the economic viability of an army of ATC guys, not to mention the communication overhead and possibility of error or miscommunication. So we build a computer system to deal with it instead. That in turn allows us to up the load by an order of magnitude again. 30 years later, take the system away, and there's nothing.

    In scenarios like this, where load has increased whilst the computer systems were in place, we *are* reliant on them.

    Think of banks - time was when you had to almost plead on your knees to get a bank account, and they charged you for running it. This was because every account was written down manually in a book, and any calculations were performed by hordes of clerks. Then - computers. Now your new account is just one more record in a table somewhere, so the banks give out accounts to anyone who wants one, and do it for free. If for some reason your bank's computer system goes AWOL, there is no way they can process a month's interest calculations on the millions of balances and transactions - not to mention actually applying the transactions that would now come in on bits of paper.

    I do agree that in a lot of cases, there remains a perfectly useable manual method, but where the computer system has enabled geometric increases in capacity over the manual system (which has been taken up) then, if you'll excuse the pun, it won't fly.

    You're right about the Y2K thing - I worked on a contract for a railway maintenance company in 1999 and the Y2K coordinator guy was tearing his hair out at the thousands of questions he got monthly such as "so, these nails, are they Y2K compliant?" He actually had solid steel track components called "chairs" that the rails sit on that had Y2K compliance stickers on them from the manufacturer. Presumably, they got fed up explaining it too, and decided it was easier to just stick the stickers on everything they made...

  • by Prendeghast ( 658024 ) on Thursday June 03, 2004 @11:32AM (#9326121) Homepage

    The 'fridge size boxes are 70's vintage (I suspect bits have been replaced over the years). The CPUs are only about five years old. The system consists of two identical computers for hot failover, and they had to get two custom CPUs made by the original manufacturer (IBM, I think) to deal with Y2K.
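
For anyone unfamiliar with the "two identical computers for hot failover" idea, a minimal Python sketch of the general technique follows. It is not how the West Drayton host actually works; the `Node` and `HotStandbyPair` classes and the flight identifiers are hypothetical names made up for illustration.

```python
class Node:
    """One of two identical machines; holds a copy of the flight data."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.state = {}

    def apply(self, key, value):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        self.state[key] = value


class HotStandbyPair:
    """Mirror every update to both nodes so the standby can take over at once."""

    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby

    def update(self, key, value):
        try:
            self.active.apply(key, value)
        except RuntimeError:
            # Active node failed: promote the standby and retry the update there.
            self.failover()
            self.active.apply(key, value)
            return
        if self.standby.healthy:
            self.standby.apply(key, value)

    def failover(self):
        if not self.standby.healthy:
            raise RuntimeError("both nodes are down")
        self.active, self.standby = self.standby, self.active


a, b = Node("A"), Node("B")
pair = HotStandbyPair(a, b)
pair.update("BA123", "filed")   # hypothetical flight identifier
a.healthy = False               # simulate a hardware fault on the active node
pair.update("EI456", "filed")   # served by B without losing earlier data
print(pair.active.name, pair.active.state)
```

Note that a pair like this only guards against one machine failing. If both machines run the same software and hit the same bug, the standby offers no protection.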

    As for the software? Written in some weird language called Jovial, and continually repatched - never rewritten.

    BTW, where the heck is Swanage? The new NATS center is in Swanwick!

  • by Anonymous Coward on Thursday June 03, 2004 @11:42AM (#9326244)
    This, like so many of today's reports (including the ones from the BBC), is inaccurate. England and Wales have 2 en-route ATC centres - that is, centres that handle high-level traffic, outside the control of the terminal control areas round the airports. These are West Drayton and Swanwick (not Swanage). The plan is that Swanwick will handle all en-route traffic once flight data processing has moved from West Drayton. Swanwick currently handles all en-route traffic control and routing, but if there is no flight data handling nothing can move.

    Swanwick (formerly known as New EnRoute Centre - NERC) has just gone live after a very painful birth. This was largely because the system was based on the US AAS project which was cancelled before NERC was started.
  • by orbitalia ( 470425 ) on Thursday June 03, 2004 @12:04PM (#9326569) Homepage
    Hi, I worked on exactly this system for 4 years.

    The hardware is an IBM 9020 family mainframe, the application is written in Jovial (one of, if not THE, first algebraic languages), and BAL assembler (for the monitor mostly). The monitor is the operating system, so it is effectively a custom-written operating system for this application.

    MVS is also used for testing, though. The I/O capabilities of the mainframe are superb, which means it can handle 2000+ flights with only 14 Megs of RAM (if I remember rightly).

    I believe the NAS application came as a freebie from IBM when the UK purchased the hardware and was the same NAS (national airspace system) application used all over the US. It has been continuously developed since then (no mean feat when you consider that all variables are global in Jovial, it uses Holleriths instead of ASCII, and you are limited to 5 or 6 characters per variable name). The hardware has also been upgraded several times over its lifetime.

    It doesn't often go down; the last time was sometime in 2002, and you can tell how important it is because everyone screams when it does go down. The people I worked with are extremely dedicated to their job, but one cannot test a system like this for absolutely every eventuality. No doubt some patch was applied and some special case came up that caused a FLOP (functional loss of operation). It happens. Radar is usually unaffected, so the safety implications are not large, but flow is affected.

    The UK approach to handling NAS is much different from the US one: the US tends not to touch the NAS software and develops external systems that enhance the usage of airspace, whereas the UK tends to delve into NAS and improve things directly in NAS. Jovial is a very interesting language; it has been used heavily by the US military and exists in such applications as cruise missiles and many other aircraft and missile systems. Read about Jovial here if you are interested.

    I can't say too much about it for various NDA reasons (OSA), but I think most of the above is in the public domain.

    HTH.

  • by AlecC ( 512609 ) <aleccawley@gmail.com> on Thursday June 03, 2004 @12:09PM (#9326626)
    It's an IBM 360/370 class mainframe: not sure what model. Somebody up the line said that the software was written in Jovial, which strikes me as very likely. Jovial was an Algol variant popular in defence/high reliability circles at about the time this lot was written.

    I think the system which crashed was only responsible for admitting new flight plans to the whole complex. Any flightplan already filed could carry on; it is just that no-one could file a new plan for the next flight.
  • Re:More problems... (Score:2, Informative)

    by Anonymous Coward on Thursday June 03, 2004 @12:32PM (#9326909)
    Maggie is a paragon of Socialist Virtue.
    Care to tell us where all that North Sea Oil revenue has gone, then? Since 1997, public expenditure on health has risen by 50.6%. Private spending has risen by 22.9%. Which means Thatcher was more reliant on private money. Transport spending has risen by a similar amount.
    I doubt if Thatcher's government would have blown nearly a billion quid on the Millennium Dome
    Well, no. That was Major's idea.
  • by general_re ( 8883 ) on Thursday June 03, 2004 @01:19PM (#9327387) Homepage
    ...Written in some weird language called Jovial...

    Muahaha. Languages from the stone-age. Jovial is an ancient semi-descendant of Algol, originally written especially for avionics systems. I'm not nearly old enough to have worked with it myself - Jovial's heyday was the mid-'70s or so - but I used to work with a couple of DoD greybeards who had done so, although even they hadn't touched the thing in years, as it's mostly been supplanted by Ada these days. The USAF can tell you a bit more about Jovial [af.mil] if you're having a slow day today ;)
