News for nerds, stuff that matters

Google Reveals "Secret" Server Designs

Posted by CmdrTaco on Thu Apr 02, 2009 11:06 AM
from the so-secret-everybody-knew dept.
Hugh Pickens writes "Most companies buy servers from the likes of Dell, Hewlett-Packard, IBM or Sun Microsystems, but Google, which has hundreds of thousands of servers and considers running them part of its core expertise, designs and builds its own. For the first time, Google revealed the hardware at the core of its Internet might at a conference this week about data center efficiency. Google's big surprise: each server has its own 12-volt battery to supply power if there's a problem with the main source of electricity. 'This is much cheaper than huge centralized UPS,' says Google server designer Ben Jai. 'Therefore no wasted capacity.' Efficiency is a major financial factor. Large UPSs can reach 92 to 95 percent efficiency, meaning that a large amount of power is squandered. The server-mounted batteries do better, Jai said: 'We were able to measure our actual usage to greater than 99.9 percent efficiency.' Google has patents on the built-in battery design, 'but I think we'd be willing to license them to vendors,' says Urs Hoelzle, Google's vice president of operations. Google has an obsessive focus on energy efficiency. 'Early on, there was an emphasis on the dollar per (search) query,' says Hoelzle. 'We were forced to focus. Revenue per query is very low.'"
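To put the quoted efficiency figures in perspective, here is a minimal back-of-the-envelope sketch. The 92 to 95 percent and 99.9 percent figures come from the summary above; the 1,000 kW load is a made-up number for illustration, and the sketch assumes "efficiency" simply means power delivered divided by power drawn.

```python
def wasted_kw(delivered_kw: float, efficiency: float) -> float:
    """Power lost in the backup stage for a given delivered load.

    Assumes efficiency = delivered / drawn, so drawn = delivered / efficiency.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return delivered_kw / efficiency - delivered_kw


DELIVERED_KW = 1000.0  # hypothetical load, not a figure from the article

for label, eff in [
    ("central UPS at 92%", 0.92),
    ("central UPS at 95%", 0.95),
    ("on-board battery at 99.9%", 0.999),
]:
    print(f"{label}: {wasted_kw(DELIVERED_KW, eff):.1f} kW lost")
```

Under these assumptions the central UPS loses tens of kilowatts per megawatt delivered, while the on-board battery loses roughly one.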

Related Stories

[+] IT: Facebook Putting Batteries On-Board Its Servers 155 comments
1sockchuck writes "The data center of the future may have no central UPS units, and be filled with servers with on-board batteries. Facebook says it will adopt a new power distribution design that shifts the UPS and battery backup functions from the data center into the cabinet by adding a 12-volt battery to each server power supply, an approach pioneered by Google. Facebook says the move will slash its power bill and save millions in capital expenses on UPS systems and PDUs. Facebook acknowledged that these types of custom designs are limited to large companies, but called on server vendors and data center builders to adapt their offerings to make them available to smaller companies."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • The New Mainframe (Score:5, Insightful)

    by AKAImBatman (238306) * <akaimbatman.gmail@com> on Thursday April 02 2009, @11:07AM (#27431717) Homepage Journal

    Most people buy computers one at a time, but Google thinks on a very different scale. Jimmy Clidaras revealed that the core of the company's data centers are composed of standard 1AAA shipping containers packed with 1,160 servers each, with many containers in each data center.

    Mainstream servers with x86 processors were the only option, he added. "Ten years ago...it was clear the only way to make (search) work as free product was to run on relatively cheap hardware. You can't run it on a mainframe. The margins just don't work out," he said.

    I think Google may be selling themselves short. Once you start building standardized data centers in shipping containers with singular hookups between the container and the outside world, you've stopped building individual rack-mounted machines. Instead, you've begun building a much larger machine with thousands of networked components. In effect, Google is building the mainframes of the 21st century. No longer are we talking about dozens of mainboards hooked up via multi-gigabit backplanes. We're talking about complete computing elements wired up via a self-contained, high speed network with a combined computing power that far exceeds anything currently identified as a mainframe.

    The industry needs to stop thinking of these systems as portable data centers, and start recognizing them for what they are: Incredibly advanced machines with massive, distributed computing power. And since high-end computing has been headed toward multiprocessing for some time now, the market is ripe for these sorts of solutions. It's not a "cloud". It's the new mainframe.

    • Re:The New Mainframe (Score:5, Interesting)

      by spiffmastercow (1001386) on Thursday April 02 2009, @11:14AM (#27431867)
      But wasn't the mainframe just the old cloud? I seem to remember there was a reason we moved away from doing all the processing on the server back in the 80s.. If only I could remember what it was.
      • Re:The New Mainframe (Score:5, Interesting)

        by AKAImBatman (238306) * <akaimbatman.gmail@com> on Thursday April 02 2009, @11:22AM (#27431987) Homepage Journal

        I don't know which 80's you lived through, but mainframe processing was alive and well in the 80's I lived through. Minicomputers were a joke back then, and were seen as mostly a way to play video games. (With a smattering of spreadsheet and word processing here and there.) In the 90's, PCs started to take hold. They took over the word processing and spreadsheet functionality of the mainframe helper systems. (Anybody here remember BTOS? No? Damn. I'm getting old.)

        Note that this didn't retire the mainframe despite public impressions. It only caused a number of bridge solutions to pop up. It was the rise of the World Wide Web that led to a general shift toward PC server systems over mainframes. All we're doing now is reinventing the mainframe concept in a more modern fashion that supports multimedia and interactivity.

        Welcome to Web 2.0. It's not thin-client, it's rich terminal. The mainframe is sitting in a cargo container somewhere far away and we're all communicating with it over a worldwide telecom infrastructure known as the "internet". MULTICS, eat your heart out.

        • by AKAImBatman (238306) * <akaimbatman.gmail@com> on Thursday April 02 2009, @11:28AM (#27432119) Homepage Journal

          Derr... minicomputers should say microcomputers. My old brain is failing me. Help! Help! Help! He-- wait. What was I screaming for help for again?

          • by networkBoy (774728) on Thursday April 02 2009, @01:34PM (#27434309) Homepage Journal

            It's ok.
            We have a nice table with an integrated NEC 8000 for you to sit at. We even sprung for the sound dampening box for the daisy wheel printer for you.

            • by AKAImBatman (238306) * <akaimbatman.gmail@com> on Thursday April 02 2009, @02:01PM (#27434757) Homepage Journal

              We even sprung for the sound dampening box for the daisy wheel printer for you.

              When I first skimmed your post, I saw the words "daisy wheel printer" and my first reaction was, "Put it in the other room! Those fuckers are LOUD!" But it seems you've thought of everything.

              And that's what I'm talking about! WWII levels of efficiency. Not this namby, pamby, "I didn't know that slotting DIMMs of different sizes into the motherboard would disable dual-channel access" BS. Somebody give this boy a raise!

              /me goes off to play with the switches on the front of the computer

        • by divisionbyzero (300681) on Thursday April 02 2009, @12:32PM (#27433273)

          Not quite. While these server farms in a box are fault-tolerant they are not fault-tolerant in the same way as at least some mainframes where the calculations are duplicated. With mainframes you'd have wasted resources (doing every calculation twice) with lower latency. With server farms in a box you get, arguably, better resource utilization (route around something that is broken but wait till it breaks before doing so) but higher latency. The difference is incorporating the way the internet works into "mainframe" design.

          • Re:The New Mainframe (Score:5, Informative)

            by es330td (964170) on Thursday April 02 2009, @03:34PM (#27436125)
            You forget that fault tolerance is not of utmost importance to Google. I read an article somewhere that said, in essence, that since these are search results, and not financial transactions it is okay if some parts of the overall network don't know everything that every network knows. Having access to 95% (or 99%) of the data is still acceptable in the search world.
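The "route around what is broken and accept partial data" idea from the two comments above can be sketched roughly as follows. This is not Google's design, just an illustration: the shard callables, the `search_shards` name, and the 95% coverage threshold are all assumptions made up for the example.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical: a shard is anything that takes a query and returns hits.
Shard = Callable[[str], List[str]]


def search_shards(
    query: str, shards: Sequence[Shard], min_coverage: float = 0.95
) -> Tuple[List[str], float]:
    """Query every shard, skip the ones that fail, report coverage."""
    if not shards:
        raise ValueError("need at least one shard")
    hits: List[str] = []
    answered = 0
    for shard in shards:
        try:
            shard_hits = shard(query)
        except Exception:
            continue  # route around the broken shard instead of failing
        hits.extend(shard_hits)
        answered += 1
    coverage = answered / len(shards)
    if coverage < min_coverage:
        raise RuntimeError(f"only {coverage:.0%} of shards answered")
    return hits, coverage
```

A search front end could return `hits` as long as coverage stays above the threshold. A financial system could not make the same trade, which is the point es330td is making.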
      • Re:The New Mainframe (Score:5, Interesting)

        by jellomizer (103300) on Thursday April 02 2009, @11:55AM (#27432585)

        Technology sways back and forth, and there is nothing wrong with that.

        1980s: 2400/9600 bps serial connections displayed the data people wanted fast enough for them to get their work done. The computer had enough processing power to handle a lot of people doing such simple tasks. And computers were expensive; heck, it was a few thousand bucks for a VT terminal.

        1990s: More graphics-intensive programs came out, along with color displays. Serial didn't cut it; it was way too slow. Cheaper hardware made it possible for people to have individual computers, and networks were mostly for file sharing. So you were better off processing locally, which allowed more load per demand.

        2000s: Now people have high-speed networks across wide distances. Security and stability issues begin to appear, so it is better to have your data and a lot of the processing done in one spot. So we go back to the thin client and server, where the client still does a lot of work but the server does too, to give us the correct data.

    • Re:The New Mainframe (Score:4, Informative)

      by DerekLyons (302214) <fairwater AT gmail DOT com> on Thursday April 02 2009, @11:16AM (#27431897) Homepage

      We're talking about complete computing elements wired up via a self-contained, high speed network with a combined computing power that far exceeds anything currently identified as a mainframe.

      By some measurements they exceed the computing power of a mainframe, by others they don't.

      • by AKAImBatman (238306) * <akaimbatman.gmail@com> on Thursday April 02 2009, @11:26AM (#27432065) Homepage Journal

        By some measurements they exceed the computing power of a mainframe, by others they don't.

        A fair point. However, I should probably point out that mainframe systems are always purpose built with a specific goal in mind. No one invests in a hugely expensive machine unless they already have clear and specific intentions for its usage. When used for the purpose this machine was built for, these cargo containers outperform a traditional mainframe tasked for the same purpose.

      • by Znork (31774) on Thursday April 02 2009, @03:56PM (#27436431)

        by others they don't.

        Seriously, I've fairly recently gone through every single benchmark, comparison, inference, etc, that I've been able to find on the subject (they're not exactly sprinkled all over the place) and I can't find any indications anywhere that mainframe hardware can surpass modern commodity hardware on any measurement. On price/performance variants it's not rare to see it outclassed more than an order of magnitude, and in absolute performance, well, there's very little magic hardware in the mainframe either anymore, it's pretty much the same silicon as anywhere else; Power CPU's, DDR infiniband, CPU to SC bandwidth almost equivalent to Hypertransport, same SAN as is used anywhere else, and as far as I can tell, to my horror, DDR2 533 memory(??). Please, correct me if I'm wrong and I very well may be, because actual specs aren't exactly flaunted. I mean, it's nice enough, but it's hardly magic.

        Sure, there's the old trick of moving system and IO load into extra dedicated CPUs, but that's becoming less and less relevant as pretty much any significant IO load has long since moved to dedicated ASICs that do DMA on their own without any CPU cost, and things like encryption accelerators aren't that hard to find. And it's not like you're not paying for the assist processors.

        Two or three years ago it might have been conceivable that it could have had at least a possibility of being superior in consolidation capabilities like being able to have the most unused OS instances running at a time, but with paravirtualized xen-derived tech commodity x86 hardware can accomplish the same or higher density. I can't say I've tried running 1500 instances, but for fun I did try running 100 instances on 5 years old junked x86 hardware which went fine until I ran out of memory at 6GB on the (like I said, junk) hardware in question. No significant performance degradation in relation to load versus what could be expected of the hardware, all 100 instances fully loaded both IO and CPU for a week to test for any throughput issues or over-time degradation, but that worked as well.

        I.e., no practical limit for any non-contrived consolidation situation, and I have no doubt that it scales fine up to 1500 instances on reasonably modern hardware as well as it did on that hardware (and if you need higher density than that you should seriously be considering why you're using that number of OS instances that don't appear to actually be doing anything, or consider moving to system-level virtualization like vserver or openvz).

        So have you found any measurements that I couldn't find that you could point out that demonstrate lingering categories in which a mainframe might consistently outperform commodity hardware (ie, any measurement that is or can be compared to another at least somewhat related measurement on commodity hardware which demonstrates an advantage for the mainframe)?

        Outside pure performance there is the in-system redundancy which is nice in theory but which in practice seems to rarely result in higher actual uptime (mainframes appear to require an inordinate amount of scheduled service time and admins often engage in a disturbingly high IPL frequency).

        There is also the consistent load levels they tend to get (which seems to be largely due to culture, load selection and ROI requirements, rather than any inherent capacity), but beyond that it seems that the remaining aura of capability doesn't have much basis in reality anymore.
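The consolidation test in this comment (100 instances before running out of memory at 6GB) suggests memory, not CPU, was the limit. A rough sketch of what that implies for 1500 instances follows. It assumes every instance gets an equal share of RAM and that 6GB means 6 GiB, neither of which the comment states.

```python
def mib_per_instance(total_gib: float, instances: int) -> float:
    """Equal-share memory per instance, in MiB (an assumption)."""
    return total_gib * 1024 / instances


def gib_needed(instances: int, mib_each: float) -> float:
    """Total memory needed for a number of equally sized instances."""
    return instances * mib_each / 1024


per_instance = mib_per_instance(6, 100)   # from the test described above
target = gib_needed(1500, per_instance)   # the 1500-instance scenario

print(f"{per_instance:.2f} MiB per instance")
print(f"{target:.1f} GiB for 1500 instances at the same footprint")
```

On these assumptions each instance had about 61 MiB, and 1500 of them would need about 90 GiB.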

  • Patents & Catch-22 (Score:5, Informative)

    by eldavojohn (898314) * <my/.username@@@gmail.com> on Thursday April 02 2009, @11:09AM (#27431769) Homepage Journal
    From 2007 [slashdot.org], the modular data center patent [uspto.gov] (where the bottommost image of the article comes from). There's no lack of patents [uspto.gov] revealing piece by piece how their power management setup works.

    Ah, the catch--22 of the patent--being forced to reveal your hand in order to protect it while underpaid workers at Baidu figure out how to integrate your ideas into their hardware.
    • by Shivetya (243324) <shivetya@noSpAM.archonon.com> on Thursday April 02 2009, @11:16AM (#27431903) Homepage

      considering some of the minis I worked on had similar setups in addition to an external UPS.

      then again, we achieved all sorts of power, cooling, and reliability gains when we consolidated many "pc" style servers into minis which do the same work. (the heat change alone was staggering)

    • by dfenstrate (202098) <dfenstrate&gmail,com> on Thursday April 02 2009, @11:38AM (#27432273)

      Ah, the catch--22 of the patent--being forced to reveal your hand in order to protect it while underpaid workers at Baidu figure out how to integrate your ideas into their hardware.

      That's not a catch-22, that's the point. In exchange for everyone learning from what you've done, you get society's protection for a limited number of years.

      Also, the workers at Baidu are not underpaid; if they were, they'd leave for better opportunities. The workers in question have obviously decided they're better off making stuff for google. They don't need your 'superior' judgement to tell them they should go back to subsistence farming or melting hazardous materials for precious metals in their homes.

      A decision to work, or not to work, and to hire, or not to hire, are based on realistic alternatives, not what some westerner sitting at a keyboard 9,000 miles away thinks is best.

  • Kidding Me? (Score:5, Funny)

    by wtbname (926051) on Thursday April 02 2009, @11:10AM (#27431771)

    he said. "I worked 14-hour days for two and a half years,"

    Get that man a beer.

  • by Thanshin (1188877) on Thursday April 02 2009, @11:14AM (#27431855)

    We all know the searches are actually being done by a large amount of people in suspended animation, being fed the corpses of the previous people.

    The thing about each server having its own battery is a cruel joke.

    • by hansamurai (907719) <hansamurai@gmail.com> on Thursday April 02 2009, @11:32AM (#27432195) Homepage Journal

      Don't you remember in the Matrix where Morpheus holds up the Duracell battery to describe what the people are being used for? Google just managed to actually do it.

    • by neo (4625) on Thursday April 02 2009, @11:37AM (#27432253) Homepage

      I'm working on a solution. If only I can contact Oracle.

      • by Tumbleweed (3706) * on Thursday April 02 2009, @11:56AM (#27432601) Homepage

        I'm working on a solution. If only I can contact Oracle.

        "Thank you for calling Oracle. For English press 1, para en Español marque el numero dos.

        *beep*

        You have reached the Oracle Help Line. Please hold for the current Oracle. All calls are answered in the order received. There are currently [1,983,457] callers ahead of you. Estimated wait time is [5,347,987] minutes.

        Have you tried knowing thyself? Try checking our website at thereisnospoon.oracle.com.

        Thank you for holding."

  • Onboard UPS not new (Score:5, Informative)

    by Y2K is bogus (7647) on Thursday April 02 2009, @11:14AM (#27431863)

    The in-computer onboard UPS is not a new idea. I don't see how they could have gotten any patents on it, since I used to have one of these (my dad might still). The device I saw had a gel cell mounted on a full-length 8-bit ISA card. It had +5/12V pass-through connectors for powering the drives, and it powered the computer through the main bus. There was more logic to it, as it had some monitoring capabilities too.

    What's next, patenting a hard drive on a plugin board? Been there, it was called the Hard Card and put a 20mb HDD in an 8 bit full length ISA slot, a truly neat idea for upgrading old XT computers back in the day. You could make them work with AT computers too by putting a regular disk controller, without a drive connected, on the bus too and the BIOS would see the XT controller and boot from it.

  • No shit? (Score:5, Funny)

    by LordKaT (619540) on Thursday April 02 2009, @11:22AM (#27432001) Homepage Journal

    When the weather gets warmer, Google notices is that it's harder to keep servers cool.

    Brilliant journalistic work there.

  • by nebulus4 (799015) on Thursday April 02 2009, @11:29AM (#27432137)
    look at the date the article was published.
  • 99.9% efficiency (Score:4, Insightful)

    by Anonymous Coward on Thursday April 02 2009, @11:30AM (#27432159)

    This is a questionable number. The best DC-DC conversion is around 95% so they aren't including voltage conversions from the battery to what the system is actually using.

  • by David Gerard (12369) <slashdotNO@SPAMdavidgerard.co.uk> on Thursday April 02 2009, @11:35AM (#27432245) Homepage

    Many data centres expressly forbid UPSes or batteries bigger than a CMOS battery in installed systems - because when the fire department hits the Big Red Button, the power is meant to go OFF. IMMEDIATELY.

    So while this is a nice idea, applying it outside Google may produce interesting negotiation problems ...

    • by rotide (1015173) on Thursday April 02 2009, @11:46AM (#27432427)
      Isn't the red button for safety of the employees? As in, I'm under the floor and somehow the sheathing on a power feed to the rack next to me gets stripped? I start to light up and someone notices and hits the "candy red button" to save me?

      Pretty sure if the fire department is coming in to throw water lines around, they are going to cut the power to the building and not to just the circuit on the datacenter floor.

      I could be mistaken, but I don't think a 12 volt battery backup in these applications is going to pose much of a "life" risk. Obviously you don't want to put your tongue on the terminals, but I don't think they pose the same threat that the power lines under the floor do.

  • by Khopesh (112447) on Thursday April 02 2009, @11:47AM (#27432459) Homepage Journal

    This is composed purely of commodity parts. The power supply is the same thing you'd buy for your desktop, those are SATA disks (not SAS), and that looks like a desktop motherboard (see the profile view where all the ports on the "back" are lined up in the same manner they would need for a standard desktop enclosure).

    Only the battery is custom (or even non-consumer grade), and you can note that since the power goes through the PSU first, that's DC power. DC is significantly better than AC here, since with an AC backup the PSU has to convert AC to DC (which wastes power and generates needless heat). While you can get DC battery supplies for server-grade systems, these are not server-grade systems. Built-in DC battery backup therefore affords them the ability to keep the motherboards cheaper. Very smart.

    Also, if you recall from a few months ago, Google has applied pressure on its suppliers (I'm not sure why Dell comes to mind...) to develop servers that can tolerate a significantly higher operating temperature (IIRC, they wanted a 20 degree (Fahrenheit?) boost). I wouldn't be surprised if the higher temperature cuts down on operating expenses more than smarter battery placement.

    • by erpbridge (64037) <steveNO@SPAMerpbridge.com> on Thursday April 02 2009, @12:22PM (#27433093) Journal

      Actually, looking at the battery, it looks like the same exact type of battery as you'd find in a small APC (450-800VA) UPS. We also used the same batteries for emergency power in our door access systems to power the controller when I was managing those at a small college. That type of battery is widely used to compensate for short-term power outages.

      I presume, given the amount of hardware shown (2 drives, 2 processors, motherboard, RAM), that the battery would probably last that given system about 7-10 minutes... plenty of time for the electric system to fail over to the generator farm (you know they have more than 2 for redundancy).
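A runtime guess like the 7-10 minutes above can be sanity-checked with a simple energy calculation. Every number in this sketch is an assumption for illustration, not a measurement: a 12 V, 7 Ah sealed lead-acid battery (a size common in small UPS units), a 250 W server load, and only half the rated capacity being usable at such a high discharge rate.

```python
def runtime_minutes(
    capacity_ah: float, voltage_v: float, load_w: float, usable_fraction: float
) -> float:
    """Rough runtime: usable stored energy divided by load.

    usable_fraction is a crude stand-in for the capacity lead-acid
    batteries lose at high discharge rates (an assumption, not a spec).
    """
    if load_w <= 0:
        raise ValueError("load must be positive")
    energy_wh = capacity_ah * voltage_v * usable_fraction
    return energy_wh / load_w * 60


# All four inputs are hypothetical.
print(f"{runtime_minutes(7, 12, 250, 0.5):.1f} minutes")
```

With these made-up inputs the result lands near the top of the 7-10 minute range; a heavier load or a more pessimistic derating brings it down quickly.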

      As to the lifetime on those batteries... I was replacing them every 3-3.5 years, maybe 4 if I was lucky. It's a standard generic battery, and the failure rate on them is quite low.

      I'd echo another user... If Google wanted to be smart, they wouldn't bother repairing a server when a component fails. Server obsolescence at a company that can afford it is about 3-4 years... pretty close to the lifetime of these batteries. They'd probably just pull the main power on it, and when a threshold of servers is "dead" in the container, they pull it offline for renovation... either to repair the bad servers, or just retire everything.

  • by wonkavader (605434) on Thursday April 02 2009, @12:00PM (#27432691)

    I'm a little surprised by the keyboard and mouse port and the two USB ports. If it uses USB, why not just use that for the keyboard and mouse? And why the second USB port? I suspect the second port doesn't consume extra energy directly, but it causes air resistance where they'd like a clear path to drag air across the RAM and CPUs.

    And why the slots which will never get used? In quantities like Google buys, you'd think those would be left off.

    Maybe they don't make any demands on Gigabyte (the manufacturer) and just buy a commodity board? When they're buying this many, you'd think Gigabyte would be happy to make a simpler board for them. On a trivial search, I don't see the ga-9ivdp for sale anywhere, but maybe it's just old.

  • by 1sockchuck (826398) on Thursday April 02 2009, @12:11PM (#27432907) Homepage
    Data Center Knowledge has videos of the secret server [datacenterknowledge.com] and a tour of one of the container data centers [datacenterknowledge.com].
  • by gweihir (88907) on Thursday April 02 2009, @12:49PM (#27433577)

    I could design this PSU configuration, and I do electronics only as a hobby.

    First, your main PSU delivers 12V in this scheme. Then this is stepped down to 5V and 3.3V for mainboard use, a design that is already employed by some Enermax PSUs, for example. For the 12V line, remember that +/-10% is acceptable. The lead-acid battery delivers up to 14V, so you need a step-down converter to 12V. In fact, you can design a switching regulator that steps the input voltage down to 13.2V (12V+10%) if it is larger, and just passes it through for 13.2V...10.8V with very, very low losses. A similar design can be done for 5% tolerances. Modern switching FETs go down to 4 milliohms per transistor, and you can do the transition from switching mode to pass-through mode very easily, e.g. with a small microcontroller that can then also do numerous monitoring and safety things. I had actually considered such a design (purely analog though) for a lower-power, 12V external supply system myself some years ago, but a single UPS was so cheap that I did not go through with it.

    I do not mean to belittle what the Google folks do, though. The real ingenuity is realizing you can do it this way on a datacenter scale when nobody else does it. The engineering is then not too demanding, at least for folks that know what they are doing.
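The regulator scheme described in this comment boils down to a small decision rule plus an I²R loss estimate for the pass transistor. The sketch below is only an illustration of that logic. The thresholds and the 4 milliohm on-resistance come from the comment; the undervoltage cut-off behaviour and the 20 A load current are assumptions added for the example.

```python
V_MAX = 13.2  # 12 V + 10%, from the comment
V_MIN = 10.8  # 12 V - 10%, from the comment


def regulator_mode(v_in: float) -> str:
    """Pick the operating mode for a given battery/input voltage."""
    if v_in > V_MAX:
        return "step-down"      # switch the input down to V_MAX
    if v_in >= V_MIN:
        return "pass-through"   # FET fully on, input goes straight out
    return "undervoltage"       # assumption: treat as a fault condition


def output_voltage(v_in: float) -> float:
    """Voltage seen by the load in each mode."""
    mode = regulator_mode(v_in)
    if mode == "step-down":
        return V_MAX
    if mode == "pass-through":
        return v_in
    return 0.0  # assumption: the load is cut off below V_MIN


def pass_fet_loss_w(current_a: float, r_on_ohm: float = 0.004) -> float:
    """Conduction loss in the pass FET: P = I^2 * R."""
    return current_a ** 2 * r_on_ohm


print(regulator_mode(14.0), output_voltage(14.0))
print(regulator_mode(12.5), output_voltage(12.5))
print(f"{pass_fet_loss_w(20.0):.2f} W lost at a hypothetical 20 A")
```

At 20 A the pass-through loss is under 2 W on a load of roughly 240 W, which is why this mode can plausibly reach the very high efficiency figures discussed in the story.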

    • by Bill, Shooter of Bul (629286) on Thursday April 02 2009, @11:19AM (#27431947) Journal
      Google claims they did the math and found it was cheaper with commodity hardware. I advise everyone else to do the same and run the calculations for themselves to determine the optimal hardware for their particular load. Without the specifics of their situation, it's difficult to criticize in an intelligent fashion, other than a more generalized statement expressing surprise at their configuration.
    • by EvilMonkeySlayer (826044) on Thursday April 02 2009, @11:26AM (#27432075) Journal
      I've a few questions, if the data centre is built in the desert don't you have a number of issues?

      * Latency, if you have all your data centres located in essentially a single part of the USA (let's ignore the rest of the world for this.. regardless that there are no deserts in Europe for example) won't that increase latency quite a bit to the farther-away places that want the search results?
      * Bandwidth/redundancy, if you have all your eggs in one basket as it were aren't you going to have to pay extra to have lots of extra fibre laid down to be able to handle all that extra traffic? What about natural disasters, if you have all your data centres in a single location then surely you run the risk of things going pear shaped if it burns down, suffers earthquakes, aliens destroy the building etc.
      * Cooling, because it's in the desert isn't a lot of the electricity that is generated going to be cooling not only the building because of the outside heat, but also the heat generated by the servers? Surely it makes more logical sense to build in a colder climate say further north and use hydroelectricity? (if you're talking of using exclusively non active polluting (and non radioactive) natural electricity solutions)
    • by TheSunborn (68004) <tiller AT daimi DOT au DOT dk> on Thursday April 02 2009, @12:06PM (#27432785)

      A google mainframe would be stupid.

      If you take the price of a mainframe, and compare that to what google can get for the same money using their current solution, their current solution offers at least 10 times as much cpu performance, and much much more aggregate io(Both hard disk and memory) bandwidth.

      There are only 2 reasons to use mainframes now.

      1: Development cost. Building software that can scale on commodity hardware is expensive and difficult. It requires top notch software developers and project managers. It makes sense for Google to do it, because they use so much hardware (>100000 computers at last count).

      2: Legacy support.

    • Re:No way (Score:4, Insightful)

      by Anonymous Coward on Thursday April 02 2009, @11:28AM (#27432121)

      Greater than 99.9% efficiency? They likely made a mistake in their measurements.

      Maybe they measured 99.92% efficiency.

      That is greater than 99.9% efficiency and they aren't breaking any laws of thermodynamics.

      • Re:No way (Score:4, Informative)

        by mftb (1522365) on Thursday April 02 2009, @12:06PM (#27432789) Homepage
        They'd still have a computer there that is staggeringly efficient, especially since a computer's output energy is entirely heat - information is not energy, computers are all 0% efficient. Still, this isn't what they meant and the 99.9% figure probably comes from battery in/out figures.
    • by WPIDalamar (122110) on Thursday April 02 2009, @11:52AM (#27432553) Homepage

      Or maybe they think bigger...

      They're deploying containers of servers. Maybe when a container gets to a certain age or a certain failure rate, they replace/refurbish the entire container.

      I doubt they care if some of their nodes go down in a power outage as long as some percentage of them stay up.

    • by mlwmohawk (801821) on Thursday April 02 2009, @11:57AM (#27432635)

      Hundreds of thousands of servers == thousands of dead batteries each month, since those batteries don't last more than a few years.

      I would imagine that the battery replacement schedule mimics the server obsolescence perfectly.

      LOL, when the battery catches fire, time to replace the server.
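The "thousands of dead batteries each month" estimate quoted at the top of this comment is easy to reproduce as a steady-state calculation. The sketch assumes one battery per server, evenly staggered installation dates, and a fleet of 200,000 servers, a made-up figure standing in for "hundreds of thousands". The 3 to 4 year battery life is the range mentioned elsewhere in the thread.

```python
def monthly_replacements(servers: int, battery_life_years: float) -> float:
    """Steady-state batteries reaching end of life per month.

    Assumes one battery per server and evenly staggered install dates.
    """
    if battery_life_years <= 0:
        raise ValueError("battery life must be positive")
    return servers / (battery_life_years * 12)


FLEET = 200_000  # hypothetical fleet size

for years in (3, 4):
    print(f"{years}-year life: {monthly_replacements(FLEET, years):,.0f} per month")
```

If the servers themselves are retired on the same 3 to 4 year schedule, as the reply suggests, most of those batteries leave with the server rather than needing a separate swap.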