Why OldTech Keeps Kicking 339
Hugh Pickens writes "In 1991 Stewart Alsop, the editor of InfoWorld, predicted that the last mainframe computer would be unplugged by 1996. Just last month, IBM introduced the latest version of its mainframe, and technologies from the golden age of big-box computing continue to be vital components in modern infrastructure. The New York Times explores why old technology is still around, using radio and the mainframe as perfect examples. 'The mainframe is the classic survivor technology, and it owes its longevity to sound business decisions. I.B.M. overhauled the insides of the mainframe, using low-cost microprocessors as the computing engine. The company invested in and updated the mainframe software, so that banks, corporations and government agencies could still rely on the mainframe as the rock-solid reliable and secure computer for vital transactions and data, while allowing it to take on new chores like running Web-based programs.'"
because it works! (Score:4, Informative)
Old Technologies that are still kicking... (Score:5, Informative)
The QWERTY keyboard
SATA (yes, folks, a serial version of ATA, the disk interface that dates back to the old IBM PC/AT!)
Drive letters, DOS devices
Does anyone actually use the tar program for its original purpose anymore?
Re:Just like analog television (Score:3, Informative)
Hit the nail on the head. (Score:4, Informative)
- Reliability
- Availability
- Capacity (including compatibility across upgrades)
in that order.
Reliability is the absolute must. Dropping pennies through the cracks adds up to big bucks in lost coinage and much BIGGER bucks in legal trouble from the people whose pennies got lost. Consistently total the bill wrong and you face class action suits, too.
Mainframes don't make errors, period. The internal components DO make errors, and the mainframe fixes those errors so the result is correct (though it may be delayed by milliseconds when a bit drops internally). They do this in a number of ways: error detection with bus logging and stop/fix/restart, redundant components with voting, redundant components with comparison (a special case of error detection), and error-correcting codes, to name just a few.
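One of the schemes above, redundancy with voting, can be sketched in a few lines. This is a toy illustration (the function name and shape are mine, not how mainframe hardware actually wires it): three copies of a computation run, and a single faulty result is outvoted by the other two.

```python
def vote(a, b, c):
    """Majority vote across three redundant results.
    A single faulty component is masked by the other two;
    a triple mismatch means stop-and-repair."""
    if a == b or a == c:
        return a
    if b == c:
        return b
    raise RuntimeError("triple mismatch: halt for repair")

# one flaky component produces 41, but the majority answer wins
assert vote(42, 41, 42) == 42
```

Real hardware does this in silicon on every cycle, not in software after the fact, but the logic is the same.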
Redundant collections of less reliable machines don't cut it. Businesses solve the "distributed update problem" by avoiding it: Transactions are processed on a single, ultra-reliable, server. The data is backed up (offsite and often dynamically via a network) so that, in case of disaster, they can switch to ANOTHER single, ultra-reliable, server. But spreading the work over multiple flakey machines is not an option. (They know how to do it with people. But they don't want to go there with computers when there's a better option.)
- Availability is right up there.
Drop the real-time logging of phone calls for a reboot and a Baby Bell's long-distance phone lines are free. That's in the million-bucks-an-hour range. But it's a drop in the bucket compared to the cost of an outage in the trading support systems of a major brokerage.
- Capacity must continue to be "enough" as a business grows.
Throttling a growing business because the IT department can't crunch the extra transactions kills shareholder value. And this includes compatibility: Thrashing the applications and inducing delays and bugs, just to port to a machine of the necessary capacity, also isn't an option. A business-critical legacy application has to "just work" if the system must be upgraded for higher capacity. The source may be long lost and the programmer long dead, so even recompilation (or reASSEMBLY) may not be a practical option. (Even if the source code ISN'T lost, it may be in a language that's no longer supported and/or with no experts available.)
===
Makers of non-mainframe computers and their components and operating systems still haven't "gotten it" on these issues. The hardware designs are almost totally composed of "single points of failure" and flake out from time to time. OS crashes are a way of life (especially with the "dominant desktop OS" - which is what business decision-makers see).
The chip makers blew it with things like Weitek's floating-point accelerator that didn't do denormals and Intel's Pentium FDIV bug. (Those little numbers are VERY important for things like interest calculations.) In particular, Intel could have recovered from that by immediately replacing the chips with fixed ones and giving business customers priority. Instead they fought it and claimed that the errors didn't matter for anybody but users of "high-end games". GAMES? What does THAT look like to a guy in a business suit in the executive suite of a Fortune 500 corporation?
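The denormal issue is easy to demonstrate with ordinary IEEE 754 doubles (a minimal Python sketch; the variable names are mine). Subnormal values fill the gap between zero and the smallest normal number, so hardware that flushes them to zero silently turns tiny-but-real quantities into nothing:

```python
import sys

tiny = sys.float_info.min   # smallest positive *normal* double, ~2.225e-308
sub = tiny / 4              # a denormal (subnormal) value below that threshold

assert 0 < sub < tiny       # gradual underflow keeps it nonzero
assert sub * 4 == tiny      # and it still participates in arithmetic exactly
# hardware that flushes denormals to zero would yield sub == 0 here,
# quietly discarding the value instead of preserving it
```

That silent collapse to zero, rather than a loud error, is exactly why accountants and chip designers ended up in the same meeting.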
Imperfect computers can work for the desktops that support the imperfect people who handle the day-to-day operation. The infrastructure is already in place for distributing the load across them and recovering from their errors. And they can work for the core of a network - where protocols can repeat dropped packets and machines can route around failed peers and cables. But like the EDGE of a network (where a customer's lines funnel through a single box, which must have telephone-switch-like reliability), the core of corporations' information processing is already built on and optimized for near-perfectly-operating machines. Despite their cost they're FAR cheaper and less risky than switching to, and running on, something less.
Prius pretty much does this... (Score:3, Informative)
I'm not sure I see what's wrong with the steering wheel as an input device for turning a car. However, there's no real reason why the wheel couldn't just be turning a potentiometer that controls the steering. The original reason for a steering wheel was mechanical advantage (thus the reason trucks had bigger steering wheels). Perhaps we should go back to the tiller, which is what some of the original cars had.
Re:can be argued for other things too (Score:5, Informative)
It's not that things with electricity break while things without electricity don't; it's that things with software break while things without software don't. Software, because of its discrete nature, is inherently harder to judge safe. A bridge rated for 10,000 pounds will easily carry 1000, but a piece of software that works with input 10,000 cannot automatically be guaranteed to work with input 1000. Any "drive by wire" system will need software (at least for the motor controllers that transform the steering wheel input into steering motion), and therefore consumers are understandably leery of it.
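The bridge analogy can be made concrete with a contrived example (mine, not from the post): a routine with a batch-boundary bug that happens to be correct at one input size and silently wrong at a smaller one. No amount of testing at 10,000 tells you anything about 1,000:

```python
def total_in_batches(values, batch=400):
    """Sum a list in fixed-size batches.
    BUGGY on purpose: the range stops before a final partial
    batch, so any tail shorter than `batch` is silently dropped."""
    total = 0
    for i in range(0, len(values) - batch + 1, batch):
        total += sum(values[i:i + batch])
    return total

assert total_in_batches([1] * 10000) == 10000   # 10,000 is a multiple of 400: correct
assert total_in_batches([1] * 1000) == 800      # 1,000 is not: 200 items vanish
```

A load test at the larger size passes with flying colors; the smaller input loses data without raising any error, which is precisely the discontinuity the bridge never has.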
The other consideration is tactile feedback. A mechanical steering system provides lots of tactile feedback, since you're directly connected to the steering system via a mechanical linkage. Therefore, if there's something wrong you're liable to feel it (i.e. the car pulls to one side, or becomes difficult to steer), allowing you to detect problems before they become catastrophic. Without that mechanical linkage, you're dependent on the software designers to judge how much feedback the system provides. If there's a problem that the designers haven't anticipated, the system will not warn you, and small anomalies will grow to catastrophic proportions simply because the warning signs were filtered out from the driver's perception.
Worse yet, the two problems are interrelated. Increasing the amount of tactile feedback increases the amount of software needed, since you've got two output devices (steering wheel for tactile feedback, and steering mechanism for actual steering) and you need code to modulate output to both of them. This necessarily increases code complexity, making the job of making sure the code is bug-free even more difficult.
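A toy sketch of that dual-output loop (entirely hypothetical names and gains, just to show the shape of the problem): every control cycle must now compute both a steering command and a synthetic feedback torque, and each output is its own code path that can harbor its own bugs.

```python
def control_step(wheel_angle_deg, road_load):
    """One cycle of a (grossly simplified) steer-by-wire loop.
    Two outputs per cycle: the actuator command AND the synthetic
    haptic torque sent back to the wheel -- doubling the code that
    must be verified compared to a one-way command path."""
    steer_cmd = 0.5 * wheel_angle_deg     # command to the steering actuator
    feedback_torque = -road_load / 10     # synthetic resistance at the wheel
    return steer_cmd, feedback_torque
```

In a mechanical linkage the "feedback path" is free, exact, and cannot crash; here it is another piece of software to get wrong.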
Finally, for those who are going to make an analogy with fighter jets' fly-by-wire systems, I must remind you that an aircraft has far more room to maneuver. And, even then, there were problems with the early fly-by-wire systems. The F-14, for example, had some serious issues with the flight control systems becoming confused and adjusting the wings inappropriately, leading to stalling and loss of control. These issues were eventually worked out, but the process took years. This is OK for a highly specialized system where your operators are specially selected and highly trained, but it is definitely not appropriate for any consumer grade system.
Re:Mainframe engineering is better. (Score:3, Informative)
Re:Old Technologies that are still kicking... (Score:3, Informative)
Um, we have these things called hard and symbolic links, and have had them for a few decades now.
IBM's real product: Peace of Mind (Score:3, Informative)
There's an old saying in the industry that still holds true: "Nobody gets fired for buying IBM." They provide the customer service businesses trust, and that's what closes the deal in large-scale business systems (and brings in a large, ongoing revenue stream). Look at their name: International Business Machines. Their reputation came from getting the job done year after year, from protecting the money spent on applications, development, and client data from instant obsolescence.
Companies remember that IBM mainframes give them years of faithful service, with on-site support a phone call away. Compare that to your PC experience!
When Something Goes Wrong... (Score:5, Informative)
I love this "single point of failure" argument. It's a fallacy. The only single point of failure with a single mainframe is the building it physically sits in. A single mainframe is internally redundant in every possible respect you can think of (and several you didn't think of). It is that cluster you talk about fondly, except there's no (error-prone) self-assembly and no particular management burden required. It. Just. Works.
But if you're concerned about a building failure -- fire, flood, whatever -- you can buy a second machine. IBM will sell that second machine to you at a lower price. You can put the second machine in a second building, you can run fiber (preferably with two separate physical paths) between the two machines, keep them many tens of kilometers apart, and run them as a single, seamless cluster (called a Geographically Dispersed Parallel Sysplex). And, as a programmer, you have absolutely zero coding responsibility to make that all work. If anything bad happens all your transactions instantly flip over to the other site, in-flight, real-time. And you don't lose a single byte or a single customer, and you can prove you didn't. You can also service any element of that cluster -- any element, from software to hardware to network to whatever -- without any interruption in business service. Yes, you can upgrade your database engine version while everybody's credit cards keep working. Neat party trick, that, but it's business-as-usual for mainframes.
Scalable? Each machine contains up to 64 main processors (and a minimum of two spares!) running at 4.4 GHz with more cache (and more cache levels, including copious shared cache) than anything else. (Even the clock speed argument is gone. It's a faster clock speed than X86.) Plus scores of secondary processors -- the main processors only do real work, not encryption or I/O. They don't even handle clustering -- there are dedicated processors for that. You can stuff 1.5 TB of RAM in each frame. And you can have a single cluster -- which behaves like a single logical machine from a programmer's point of view -- containing up to 32 of these machines. That's a single "machine" with 2048 main processors and hundreds (thousands?) of assist processors. Beyond that you can still do everything an Intel cluster can, like conventional load-balancing (e.g. HTTP spraying) across multiple 2048-CPU clusters. But no one has yet invented a core banking system, for example, that exceeds even a couple of these 64-way machines for a large Chinese bank, to give you some perspective.
No, this stuff is in a different league. Please read up on it sometime before dismissing it out of hand. I don't dismiss the value of X86 blades for certain applications, but this mainframe stuff is very different and has important roles. Telecom switching, maybe, maybe not. Telecom billing, you bet.
Re:Is it really "old" tech? (Score:2, Informative)
Re:So, where's my pocket mainframe? (Score:3, Informative)