IBM Hardware Technology

IBM Announces Chip Morphing Technology 118

Posted by CowboyNeal
from the self-fixing-problems dept.
An anonymous reader writes "IBM has announced that it is now capable of producing self-healing chips. From the article: 'eFUSE works by combining software algorithms and microscopic electrical fuses, opposed to laser fuses, to produce chips that can regulate and adapt their own actions in response to changing conditions and system demands.' It goes on to say that the IBM system is more robust than previous methods, and that the chips are already in production. The future is here!"

  • Think about it... overheat a chip, it heals itself.
    • Think about it... overheat a chip, it heals itself.

      Kinda reminds me of the remote control automotive screws discussion. Overheat a chip, destroy its ability to heal itself too, perhaps? D'oh!
    • by dnoyeb (547705) on Saturday July 31, 2004 @08:26AM (#9851765) Homepage Journal
      Heal, lol. What did I miss? A fuse is something that interrupts a circuit permanently. Akin to gnawing off a leg.

      Reading their article, the big improvement is the leg has no chance to grow back.

      Sounds like total spin to claim that destruction of circuits is a healing process. I smell DRM all over this.
      • I agree.

        Using fuses seems best suited for small runs where your design is pretty fixed and you don't want to foot the bill for a custom chip mask. Like programmable logic arrays, etc...

        So if conditions change with the environment these chips are in, they blow some fuses to respond. If conditions change back to where they were before the chip blew fuses, oh well. Some sort of nonvolatile RAM seems more in order for "adaptive" technology; heck, regular PC CMOS adapts handily to new hard disks, for instance.

      • "Sounds like total spin to claim that destruction of circuits is a healing process. I smell DRM all over this."

        So DRM smells like fried electronics?

      • We would do better to call them self-amputating chips.

        I doubt that DRM would be a driving factor, but I could see where a software security vulnerability might be exploitable to cause damage to the CPU.

        I could also see where a kernelmode DRM driver might seek to destroy CPU's used to rip CD's etc without permission... Many questions arise from this and how technology and content providers will reach a compromise. My own personal view is that such a compromise is becoming less and less possible.

        I think
    • by vrmlknight (309019) on Saturday July 31, 2004 @08:45AM (#9851834) Homepage
      You can do that once... The main thing this helps with is a single failure in a production server: when it happens, you are able to schedule downtime and then replace that component. It's like when you have redundant hard drives: one goes, and you can replace it when you get a chance (hopefully soon, before you have another failure).
      • I think this is the fairest analogy I have seen yet after reading the article. It sounds like the chip is just taking the defective area out of the circuit and working around the problem. It seems like a degraded mode on a disk array to me: sure, your RAID 5 array keeps running with the loss of one drive, but the performance is not great.
    • T2 is coming!!! Just everybody wait!!!
  • Sounds good to me (Score:1, Interesting)

    by no1here (467578)
    So what does this really mean for computers? And why design a chip? Can't it design itself. You give it all the resources it will need, tell it what to do, and it determines how to best configure itself. And then it could reconfigure itself to better adapt.

    P.S. First post. :)
  • Is this the basis for the PowerTune [eweek.com] technology either used in the 970FX or to be used in it? It supposedly automatically adjusts power consumption and processor speed based on how processor intensive the current operations are.

    This seems much different from the current speed stepping technologies as it doesn't scale down to a fixed MHz rating. That is, it isn't always 2.0GHz during intensive operations and 1.2GHz for non-intensive operations. /got nothing
  • by rob.sharp (215152) on Saturday July 31, 2004 @08:16AM (#9851738) Homepage
    "I know you and Frank were planning to disconnect me, and I'm afraid that's something I cannot allow to happen."
    • by Pharmboy (216950) on Saturday July 31, 2004 @08:38AM (#9851813) Journal
      How applicable. Nothing beats a technology that brings up images of either "2001" or "The Matrix". (insert eFuse Overlord joke here)

      But on a more serious note, while this sounds pretty cool, it still breaks down to this: If a portion of the chip is screwed up, eFuse will bypass it. If you bypass part of the chip, you will have lower performance. I can see where this would be good in enterprise computing *IF* the chip also *TELLS* you that it is messed up, so if a portion of the chip becomes defective, it will still operate until it can be replaced. This would be great for uptimes and in mission critical systems, but for an overclocked desktop system this seems pretty useless. Here is why:

      Take a 2GHz chip. Overclock to 2.5GHz. Blow two eFuses (oops). Now the chip at 2.5GHz functions as fast as a 2GHz chip. Clock back down, and it performs as fast as a 1.5GHz chip. Sell chip or system on eBay to someone without telling them eFuses are blown, screwing them over.

      Unless there is a way to test if the eFuses are blown, I can see some real problems on the used market for this kind of chip. This would also apply to "why is this server performing like crap?" situations. Of course, as long as the eFuses are not blown, but are instead just reordering its own logic for specific uses (web server only, database server, etc), this would be majorly kick ass, offering a quasi-specific purpose system on the fly. Especially once you have a kernel module that can talk to it and tell it what kinds of changes in routing would be best for a given platform, telling it "this computer is used for $x only, route logic accordingly".
      • How about if it recognized that you were running say, SETI@Home, and it optimized itself to execute that algorithm faster?

        I think that's the real advantage of these kinds of things. Now you won't have just an all-purpose processor, but an all-purpose processor that can specialize in the task that you are currently working on.

        Wouldn't this make it seem like you have a PC designed just for what you are doing, with everything you do?
        • How about if it recognized that you were running say, SETI@Home, and it optimized itself to execute that algorithm faster?

          It would seem to me that this is exactly what the article seems to indicate. If what you are doing is not i/o, disk or memory intensive, but instead 98% cpu cycles (like seti@home or other distributed computing) then it would adjust. If you are rendering frames in a SGI fashion, it would change. If you are using it for streaming media box, adjust. The big questions are: How long d
      • "If you bypass part of the chip, you will have lower performance." It ain't necessarily so! As long as we are blue skying these capabilities, the small pattern sizes allow for extra hardware to be included on chips that may not even be used until placed into service by a process such as this for healing or even when purchasing extra/future performance boosts!
        • As long as we are blue skying these capabilities, the small pattern sizes allow for extra hardware to be included on chips that may not even be used until placed into service by a process such as this for healing or even when purchasing extra/future performance boosts!

          This is too tasty for the marketers... It is inevitable... After this is widespread, you will only be able to purchase base model chips. You want performance, purchase these 3 performance packs (which activate circuitry on your existing

  • Awesome (Score:2, Insightful)

    by Stevyn (691306)
    This sounds like an innovation above and beyond upping the clock speed and making a bigger heatsink. Take that pentium!
    • Yes, it is a really cool innovation, but really, what does it do? Personally, I have never broken a CPU, and I doubt many others have. The only real benefit I see is for the producer, being able to sell flawed chips that still work. People will end up paying more to ensure the eFuses on their chips are not blown, much like buying an LCD with guaranteed no dead pixels.



  • I'm running to the Court House RIGHT NOW!
    Changing my name to John Connor!
  • by tklive (755607) on Saturday July 31, 2004 @08:25AM (#9851762)
    While it does sound like a big step forward, especially considering "eFUSE is technology independent, does not require introduction of new materials, tools or processes", how exactly is it self-healing?

    Nothing is mentioned about the redundancy required for the reroutings... it's obvious not all kinds of faults can be handled this way. So do they try to predict possible faults and build in workarounds, or do they just use the natural design to handle whatever can be handled? How does this affect the way they design circuits... make more generic blocks, etc.? And maybe I didn't really understand the article, but isn't it more of a self-correcting rather than a self-healing feature?

    wish the article had more info...
    • by Anonymous Coward
      That's just what I was thinking. It sounds like this can only be used to incorporate some redundancy.

      Self-healing would be something completely different, imho -- the ability to rebuild damaged circuitry from some kind of schematic or remaining information, or maybe the ability to fall back to general instructions on the main CPU if a specialist unit like a GPU failed.
  • "eFUSE reroutes chip logic, much the way highway traffic patterns can be altered by opening and closing new lanes," said Bernard Meyerson...

    ...And much like the neurons in the brain? Doesn't this have rather large significance to AI, or artificial life, for that matter? If the IBM solution is part software, who is to say that the software cannot be intelligent?
    • That depends on your definition of intelligence. The thing just has a way to "heal" itself. Neurons in the brain are much more complicated and have much better algorithms for pattern recognition etc.
      • From younger days, I remember people speaking a lot about brain cells dying because they were not in use. I never devoted the time to figure out how this could be a Good Thing, but if a chip can do it too then surely it takes a step closer towards acting like the brain does. Maybe the brain works at a lower level than this IBM solution, and builds up its logical circuits by nuking selected cells?
  • With a limit? (Score:5, Insightful)

    by usefool (798755) on Saturday July 31, 2004 @08:30AM (#9851780) Homepage
    Surely a chip cannot keep self-healing indefinitely, can it?

    If it's capable of re-routing certain paths when something goes wrong, it'll eventually run out of alternative paths, or the performance will be degraded to next to useless.

    However it's certainly a good pre-emptive tool for mission critical machines, provided it has a way of informing the admin that it's dying, rather than secretly degrading.
  • I'd seen a few articles about other dynamically reconfigurable chips, such as this [emediawire.com], before now. In what respect is IBM's different from the others? Making a single chip autonomic in itself is only about packaging, I suppose.
  • On-Chip Sparing (Score:3, Insightful)

    by Detritus (11846) on Saturday July 31, 2004 @08:37AM (#9851811) Homepage
    Sounds like it is most useful for permanently reconfiguring a chip to use spare functional units after problems are detected with the currently selected functional units.
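
    The sparing idea described above can be sketched as a toy model. This is only an illustration, not IBM's eFUSE design: the class name `SparedChip`, its methods, and the rule that each remap is permanent and consumes one spare are all assumptions made for the example.

```python
class SparedChip:
    """Toy model (hypothetical): logical units map to physical units,
    and a detected fault permanently remaps a logical unit to a spare."""

    def __init__(self, active_units, spare_units):
        # Logical unit i initially uses physical unit i.
        self.mapping = {i: i for i in range(active_units)}
        # Physical spares are numbered after the active units.
        self.spares = list(range(active_units, active_units + spare_units))
        # Physical units that have been fused out; never reused.
        self.fused_out = []

    def report_fault(self, logical_unit):
        """Remap a faulty logical unit to a spare.

        Returns True if a spare was available, False if the chip has
        run out of spares and can no longer work around the fault.
        """
        if not self.spares:
            return False
        failed_physical = self.mapping[logical_unit]
        self.mapping[logical_unit] = self.spares.pop(0)
        self.fused_out.append(failed_physical)
        return True

    def spares_left(self):
        """Number of spares still available, e.g. for admin reporting."""
        return len(self.spares)


chip = SparedChip(active_units=4, spare_units=2)
chip.report_fault(1)   # logical unit 1 now runs on physical spare 4
chip.report_fault(3)   # logical unit 3 now runs on physical spare 5
chip.report_fault(0)   # no spares left, returns False
```

    A `spares_left()` style query is the kind of thing other commenters ask for: some way for an admin to learn that the chip is degrading rather than have it happen silently.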
  • Default Color Link (Score:4, Informative)

    by Anonymous Coward on Saturday July 31, 2004 @08:39AM (#9851816)
  • by 3seas (184403) on Saturday July 31, 2004 @08:44AM (#9851828) Journal
    What this sounds like is a chip production success/failure rate improvement, as well as providing a bit more flexibility in going from the drawing board (design/theory) to production (testing/reality).

    I think it is very interesting that they are using something that was considered to be bad in chip reality (electromigration) as a positive thing.

    This is, by analogy, like how our bodies exist symbiotically with many different germs and such, for without them we'd die a lot sooner.

    I don't think what the article is talking about is anything like reprogrammable chips (FPGAs), as some may think by reading the article, but rather something automatically used once between the chip production line and its actual ongoing system use to auto test and correct any production anomalies per chip. (is this where we say bye bye Neo?)

  • Oh yeah, PCBs go bad all the time. Wait, processors and PCBs are probably the most reliable things in all of our electronics. When a processor or PCB breaks, it's due to something that these chips would not be able to 'self-heal', like horrible electrical damage or overheating. How about a *true* self healing HD or optical media (CD/DVD).

    Maybe in a much larger scale, perhaps a motherboard that has reprogrammable chips, so that when your modem burns out from a power surge, it can reprogram some other mod
    • True for PCs, but an IBM server isn't anything close to a PC. Considering almost all IBM i & p series servers are shipped with multiple deactivated processors as well as separate processor cards to handle raw IO processing, this is more of a RAID type thing. The purpose of RAID isn't to prevent the drive from failing but to allow you to limp along until you can swap the drive. Also, remember that the IBM "big iron" has hot-spare, hot-swappable EVERYTHING...CPU, RAM, PCI cards, controllers, disks...
    • I tend to agree for the most part.

      But let's imagine you have a cluster with 1024 diskless nodes. At that scale, you need a ridiculously high per-node mtbf just to get anything done before your cluster breaks down. This might be a lot simpler and cheaper than trying to manage redundancy at higher levels.

      Or maybe you're building a chip to control antilock braking, or for that matter an airliner or a space ship. Even if the odds of breaking something handled by this mechanism are fairly low, it might st
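
      The cluster point above can be made concrete with a little arithmetic. This is a rough sketch under a simplifying assumption that is not in the original comment: node failures are independent with exponentially distributed lifetimes, in which case the expected time to the first failure among N nodes is the per-node MTBF divided by N. The function name and the 100,000-hour figure are illustrative only.

```python
def cluster_mtbf_hours(node_mtbf_hours, node_count):
    """Expected time until the first node failure in a cluster.

    Assumes independent nodes with exponentially distributed
    lifetimes, so the failure rates simply add up.
    """
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return node_mtbf_hours / node_count


# Hypothetical numbers: a node MTBF of 100,000 hours (over 11 years)
# across 1024 nodes leaves about 97.7 hours, roughly four days,
# between failures somewhere in the cluster.
hours = cluster_mtbf_hours(100_000, 1024)
```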

  • wow, now they've invented the 'eFUSE' maybe they could invent the 'eLAMP' and 'eDIODE' and 'eTRANSISTOR' - amazing 'e' components that can be controlled electronically!!

    I know on-chip fuses (PROM?) have been around before and this seems to basically just be the same thing but more reliable and with 'e' on the end, which I'm guessing stands for electromigration, which AFAIK is a problem with very small paths on chips that get screwed up by the flow of electrons and some sort of ionic-bondage-thingy interactio
    • Damn skimming, didn't see that electromigration was used _in_ the fuse. OK, e-fuse makes sense, eFUSE is stupid, and arguing about capitalisation is also a bit stupid.
  • Sounds to me we're heading to customizable chips in the future. Flawed designed chips (remember floating point calculations in the early pentium age) can be updated with a better design.
    • Excuse me, but programmable circuits are not a new thing: for example, remember Altera PLAs, etc. The big problem with the "good old known" programmable circuits is that they require a lot more space than non-programmable ones, i.e. you could need 100 million transistors to "program" an i386 with no cache, and of course it would run at a much lower clock than the main core (think about clock propagation in a programmable circuit, about 10x slower).
      Anyway, thinking about talented IBM people, may be they ha
  • by wamatt (782485) on Saturday July 31, 2004 @08:56AM (#9851871)
    Let's hope IBM has the foresight to ensure that the eFuse feature cannot be controlled by software.

    Think about the latest worm going around taking your nice new 3200MHz processor to an effective 100MHz by blowing all the fuses and crippling it.

    I would guess though, because of the high R&D costs involved, this will only ever see its way into high-end servers.

    • Think about the latest worm going around taking your nice new 3200MHz processor to an effective 100MHz by blowing all the fuses and crippling it.

      I'm guessing that won't happen. Chances are this feature was designed to work around problems in a high end server. Trying to keep a mainframe at 99.999% uptime requires the ability to adapt to hardware failure. Thus, this would be a part of the hardware, and the software would only know about it enough to send the message to your IBM support person to come f

    • I would guess though, because of the high R&D costs involved, this will only ever see its way into high-end servers.

      You have a poor grasp on basic economics. If something cost a lot in R&D, that is a good reason to mass produce it to spread the R&D costs over a lot of units. The only reason why something is limited to high-end products is that the MANUFACTURING costs are high, and the article explicitly states that the fuses are added at no additional cost. So the only logical thing for IBM
      • You have a poor grasp on basic economics.
        Yeah, you know me well.

        Why do you think companies purposely disable features with a ZERO manufacturing cost? It's called product differentiation, look it up in your ecos textbook, sunshine. They MAY decide to license it for the mass market, then again it may be more profitable as a drawcard for the high-end segment.

        • The fact that some companies do it doesn't mean IBM should/will. And in any case, your argument is a logical fallacy, because marketing considerations are irrelevant to a discussion about R&D costs. There might be other factors too, like logistics, manufacturing capacity, security/reliability preferences of the customers, compatibility, impact of the technology on expected lifetime of the chip, etc. But with all else equal, high R&D costs force companies to use the result in the mass product. And ove
  • Argh! (Score:2, Informative)

    by PKC Jess (797453)
    I know this is slightly offtopic, but how many people have ever actually BEEN to the IT section of slashdot? It hurts the eyes! On a more ontopic note, I am tres excited about this technology and I for one will keep a sharp eye on it. Go self healing technology!
    • The color scheme is terrible, yeah.

      A friend of mine pointed out that if you change the "it." in the URL to say... "games." it works fine, and is readable.
  • by shokk (187512) <ernieoporto@ y a h o o .com> on Saturday July 31, 2004 @09:10AM (#9851910) Homepage Journal
    Nothing new here. Virage Logic Corporation [viragelogic.com] has had these designs on the shelf for their Self-Test and Repair (STAR) Memory System for some time now [google.com]. It has been licensed to quite a few parties already for use in the various fabs so this is already being done.

    Look through the website. IBM is even a customer.
  • Who benefits really? (Score:5, Interesting)

    by hashwolf (520572) on Saturday July 31, 2004 @09:11AM (#9851912)
    When batches of silicon chips are made a number of them are always defective.

    This technology is more beneficial for IBM than for us because it will allow IBM to SELL defective-but-self-repairable chips instead of SCRAPPING them. Because of this, it is highly probable that end users will have no way to find out to what extent the chip has already repaired itself.

    If this is the case IBM will probably take one of the following roads:
    1) Continue with the current manufacturing standards - this would yield chips with more longevity.
    2) Manufacture chips with less stringent (and hence cheaper) manufacturing standards - although this would yield more defective chips, these won't be thrown away since they can self repair; they will be sold instead!

    I really hope it's not option #2 they choose.

    • by Detritus (11846)
      They already do #2 with DRAM chips, to keep the yields at reasonable levels. Although I think they have to be tested and repaired before they are shipped from the factory.
    • This has been done for quite some time, but in software. It was supposed to be something done once when the chip came out of manufacturing, and forgotten about, but the "fuses" kept growing back over time and causing errors. A map of these was kept associated with each chip (and since this applies mainly to processors, an EEPROM containing this data is shipped with each processor) and software handled the routing.

      Without this ability, the chips would be extremely more expensive, and probably not even via
    • Here's an idea:

      Instead of making lots of different ranges of chips, make one chip type for each architecture. With normal manufacture there are a lot of failed chips. But with this you can sell most of those chips at a lower price. You could then have a much improved success rate, the reduced function chips sold at a discount instead of tossed away.

      Depending on the numbers, this could reduce the overall price of chips.

  • The liquid metal chip.
    Overclocking makes the chip kill you!

  • by iansmith (444117) on Saturday July 31, 2004 @09:47AM (#9852026) Homepage
    The first thought that entered my head when I read this was, "Great... now we can have hardware that can be designed to self-destruct on demand." Imagine you get sold a CPU with an expiry date... software licences for hardware, the old you don't own the chip but are just renting it.

    IBM better be REAL careful with this too. If it's possible to fool the chip into blowing these fuses, a virus could potentially damage millions of computers in a day of spreading.

    As others mentioned, it is a neat trick, but a solution in search of a problem. CPU's just don't fail all that often to need something like this.
    • Older geeks will remember all the stories back in the 70's about people who paid big bucks for some fantastic new feature in IBM's cpus, and watched the IBM guy come over and "install" it by clipping a jumper wire or two on a board.

      We're probably going to be hearing a lot more of those stories in the future as a result of this development. Except that the IBM guy won't have to actually come over and clip anything. They'll be able to do it across the Net by asking you to download an Install program, which
    • a virus could potentially damage millions of computers in a day of spreading.

      We already live in that world. Viruses can already in theory toast BIOSes by flashing them with crap, or (equivalently for most people) destroying the OS. This new tech really wouldn't change anything (and BIOS destruction is likely to be "lower hanging fruit" for a while yet).
  • by mangu (126918) on Saturday July 31, 2004 @09:48AM (#9852033)
    So, will we see a day when your computer catches a virus that transforms that gazillion GHz CPU into a 2 MHz 6502?
  • Memory - not logic (Score:2, Insightful)

    by Anonymous Coward
    Efuse and laser fuse are technologies for repairing memory defects, not for repairing logic defects.

    From the article, it appears this innovation applies to the embedded memory on a logic chip:
    "...all 90 nanometer custom chips, including those designed with IBM's advanced embedded DRAM technology"
  • what happens if they realise that they have to roll back to the prev chip config ??
  • How long before an emergent intelligence develops?
  • Self-healing chips. OK, we can built a T1000 now. The "rerouting" scene is a classic in SCI-FI, but what if...

    Some stupid worm uses a backdoor to start a haywire self-healing sequence?

    Dave... Dave... Nah. More like... FZzzzzttt...
  • The man, whose initials (obviously, coincidentally?) were DNA, must've been some sort of prophet. Remember, Deep Thought was the SECOND best computer. And when it came up with the answer (you all know it, admit it!), and then determined that it was the correct QUESTION that we really needed, set to work on building a BETTER computer. I mean, the Hitchhiker's Guide was the prototype for the World Wide Web, and Deep Thought was the ultimate self-healing--in fact self-UPGRADING--computer. Maybe he was the Secon
  • Radiation hardness? (Score:2, Interesting)

    by kievit (303920)
    Maybe this technology could be useful to make chips which can survive in radioactive environments like particle detectors in accelerator laboratories or in satellites? (And if that is so then the military is probably also interested, to use them in battlefield drones.)

    • Why do people have to think radioactive?

      While satellites may exist in a highly ionizing environment, I think it is inappropriate to call it radioactive.

      It would also be useful for interplanetary probes - I would wager that a lot of mass can be removed if the hardware was more resilient.
      (Although, the support callout would be prohibitively expensive)

      Didn't the old Pioneer/Voyager probes have processors built with transistors in such a way that they can degrade gracefully? I seem to recall reading that the engi
  • Seems to me this will lead to lazy production practices, not better-built chips. Maybe it's needed in order to keep pushing the Moore's Law marketing vision.
  • FPGAs? (Score:2, Interesting)

    by andreyw (798182)
    I was thinking about blowing away some money on a large FPGA and associated hardware and software.

    I think it shouldn't be that much of an issue to program some part of the FPGA with the logic to reprogram the rest?

    And start from there. Damn, this sounds so uber-cool. Retargetable and reprogrammable logic really blurs the line between software and hardware.
    • I think it shouldn't be that much of an issue to program some part of the FPGA with the logic to reprogram the rest?

      We do it already. The latest Xilinx FPGAs have an internal reconfiguration port, so the FPGA logic can reprogram itself.

      We published a paper earlier this year about running Linux on an FPGA processor, with this reconfiguration port mapped into /dev. Basically we can partially reconfigure the FPGA under OS control while the rest of the FPGA (incl. the CPU with linux) keeps going. See m

      • So I'm not the only one with the urge to have multiple quad-core processors that reconfigure dynamically from OS-based controls... Hurrah for Linux... Just thinking of using Windows to do this gives me chest pains.
        • So I'm not the only one with the urge to have multiple quad-core processors that reconfigure dynamically from OS-based controls... Hurrah for Linux...

          Absolutely. Oh the fun you can have when you can modify the hardware as easily as you can the OS...

          Just thinking of using Windows to do this gives me chest pains.

          Indeed. Also since this is publicly funded research, I feel there is an ethical responsibility to ensure that the outcomes benefit the community in general (a la open source), rather than a s

  • Does this mean in the future I'll just download a new design off the internet for my processor and install it and suddenly have a more heat-efficient processor, or say one more specific to my needs? Or perhaps have a few different layouts on disk that I can change to? That sounds pretty far out, but a neat idea.
  • Wonder Processor Powers: ACTIVATE!
    Shape of: A Laser Fuse!
    Form of: A Morphing Microchip, uh, made out of ICE!
  • Self-healing chips are cool, but this color scheme sucks.

    Maybe we can direct IBM's research toward self-healing color schemes.

  • Heck, couple this with IBM's LPAR hypervisor on a POWER5 machine and you get so much redundancy and flexibility it boggles the untrained mind! From: http://www-1.ibm.com/servers/aix/whitepapers/aix_support.pdf "Dynamic Processor De-allocation enables defective processors to be taken offline automatically, before they fail. This is visible to applications, since the number of online logical processors is decremented. An application that is attached to the defective processor can prevent the operation from
  • by Dachannien (617929) on Saturday July 31, 2004 @12:34PM (#9852790)
    The future is here!

    Dark Helmet: "What happened to then?"
    Col. Sanders: "We passed it."
    Dark Helmet: "When?"
    Col. Sanders: "Just now. We're at now, now."
    Dark Helmet: "Go back to then."
    Col. Sanders: "When?"
    Dark Helmet: "Now."
    Col. Sanders: "Now?"
    Dark Helmet: "Now!"
    Col. Sanders: "I can't."
    Dark Helmet: "Why?"
    Col. Sanders: "We missed it."
    Dark Helmet: "When?"
    Col. Sanders: "Just now."
    Dark Helmet: "When will then be now?"
    Col. Sanders: "Soon."

  • The future is here!

    No, it's not! It won't be here for another three... oh, never mind. Now it is here.

    steve
  • Mighty Morphing Power PCs?
  • Imagine the fun we'll have when viruses can actually alter the router, switch and firewall hardware designed to protect us from the viruses.

I cannot conceive that anybody will require multiplications at the rate of 40,000 or even 4,000 per hour ... -- F. H. Wales (1936)
