Technology

Google Doubles Server Farm

Mitch Wagner writes "Here's our followup story on Google's colossal server farm. When we first wrote about Google last spring, they had 4,000 Linux servers; now they run 8,000. Last year we focused on the Linux angle; this year we thought it was more interesting to go into the hardware, giving a little detail about some of the things Google has to do to build and run a server farm that big." Impressive. I always think our 8 boxes are cool, until I see this kinda thing.
  • by Anonymous Coward
    Go look at who has the SPECweb top slots on 1, 2, 4 and 8 CPU boxes.

    What the fuck is a "multithreaded TCP/IP stack"? The IP stack runs in both process context and interrupt context; there are no threads there, and it'd be stupid to use them. Perhaps you mean "fine-grained locking," but you just don't know what you're talking about.
  • I'm not sure how processor-intensive the Google software is, though. Certainly, an S/390 has a lot of internal bandwidth, but I don't think it has the processing power of many PCs. If the searches are mostly just disk-intensive, it could work. Of course, note that Google is using mostly IDE drives. Getting the same amount of SCSI storage would be, what? 3x the cost? Yeesh.
    --
  • Actually, the ranking system is equivalent to finding the principal eigenvector of a matrix with a billion rows and columns. Fortunately, there is a nice iterative algorithm to do this. Each iteration performs a multiplication between a vector and a matrix, so it is at least n^2, and probably something like O(n^2 log n).

    For the curious: PageRank does not depend on your query; it is a global property of the link structure of the web. So Google does a normal keyword search and combines a keyword similarity value with the PageRank value, and sorts on this magic value.
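
    A toy sketch of that iteration in Python (the four-page link graph and the 0.85 damping factor here are invented for illustration; the real matrix has a row and column per URL):

    import numpy as np

    links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}    # page -> pages it links to
    n, damping = 4, 0.85

    # Column-stochastic link matrix: M[j, i] = 1/outdegree(i) if page i links to page j
    M = np.zeros((n, n))
    for src, dsts in links.items():
        for dst in dsts:
            M[dst, src] = 1.0 / len(dsts)

    rank = np.full(n, 1.0 / n)
    for _ in range(100):                              # power iteration
        new_rank = (1 - damping) / n + damping * (M @ rank)
        converged = np.abs(new_rank - rank).sum() < 1e-9
        rank = new_rank
        if converged:
            break

    print(rank)    # approximates the principal eigenvector, i.e. the PageRank vector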

  • I wonder how much it costs to get rid of all the heat billowing up from the farm. I imagine that place is popular in January.
  • Here's the interesting part of the traceroute I ran from my workstation here at, well work :)

    9 284.ATM7-0.XR2.DCA1.ALTER.NET (152.63.33.41) 5.685 ms 13.112 ms 4.145 ms
    10 194.ATM7-0.GW3.DCA1.ALTER.NET (146.188.161.77) 5.545 ms 7.685 ms 4.475 ms
    11 abovenet-dca1.ALTER.NET (157.130.37.254) 5.327 ms 6.011 ms 5.987 ms
    12 core5-core1-oc48.iad1.above.net (208.185.0.146) 6.132 ms 5.715 ms 6.948 ms
    13 core2-iad1-oc48.iad4.above.net (208.185.0.134) 5.818 ms 5.785 ms 6.011 ms
    14 main1colo1-core2-oc12.iad4.above.net (208.185.0.66) 7.527 ms 5.400 ms 4.853 ms
    15 64.124.113.173.available.google.com (64.124.113.173) 6.160 ms 5.705 ms 8.736 ms


    It appears to be co-lo'd at above.net. This was run against the www server.
    Secret windows code
  • I'm unclear about how many of their 8000 boxes are indexing at any one time and how many are answering queries. Anyone know?

    --

  • This is why such sites are usually powered by a moderate (typical site) to huge (Amazon, eBay) database with an enormous redundancy built in.

    Well, I'd think that eBay would split things up, as should Amazon, if they don't already.

    Sure, if the Computer section of eBay goes south the computer bidders are pissed, but it doesn't affect the Beanie Baby contingent.

    I think that the real reason that eBay/Amazon/Things 'N' Stuff aren't doing massive clustering (if, indeed they aren't) is that it takes quite a bit of planning and design to get something like that set up, and Amazon and eBay couldn't take the time. You have to be fast if you want to "build a brand"! Plus, to a greater or lesser extent, Google runs a single algorithm. Amazon runs a thousand of 'em, sometimes 4 or 5 a page.
    "Beware by whom you are called sane."

  • Exodus also generally has a limit on how much power they'll pull into your cage because of heat-density concerns.

    I think there's a great Ask Slashdot lurking in here about how they built and manage this stuff.
  • I'm sure the density would be a lot better with the DL360, a 2xCPU SCSI machine in the same box as the 320. The Rackable stuff is 33% higher density than standard 1U machines, and the cabling is easier to manage.
  • just scanning these posts, i can see that:

    * google uses redhat

    * they customise it extensively

    * they have arrived at workable solutions to problems of massive parallelism in several fields, eg load-balancing, tcp/ip optimisation, efficient segmentation of a huge database and the associated routing of queries, and presumably heat dissipation too.

    * in short, they have rolled their own into a system that even the /. beowulf fan club must admire

    * they make enough money to run 8000 pizza boxes and buy state of the art furniture by selling this combination of technologies to corporations who want to improve the efficiency of their knowledge workers.

    * they have contributed a total of, say, $3000 to redhat over the counter at Fry's.

    Now I'm not sure that counts as good OSS citizenship.

    Overall I'm inclined to think that they're in credit, just because Google is so fscking good that it has replaced my bookmark file. I'd say that their public service, especially given the /linux branch and their flagship role, is enough to outweigh the fact that rather than returning _any_ of their code to the community they sell it privately to the worst kind of suit. I haven't even seen an educational or non-profit version (but I'd love to be corrected).

    It's hard to call, especially as I am a user of rather than a contributor to Linux and therefore benefit without being made use of, so I'm surprised not to see it being debated here. Just _using_ Linux really doesn't deserve accolades any more. As they say in the article, it's an economic and practical decision, not an ideological one.

  • Even if they used the DL320 from Compaq (A 1U, 1 proc IDE server) or similar, they would still fill just a bit over 190 racks.

    And I thought some of the SAN setups here looked impressive.
  • Um, 6*80=240, which is only 3% of 8000. That would seem inconsistent with the claim that Exodus was one of three coloc locations for 8000 servers.


  • I've always understood that you place half your servers on the west coast and half on the east, so that if there's a net split (i.e., a construction worker who didn't call before he dug) you won't suffer the consequences. With all their servers in DC, how will they prepare for this?
  • Sun boxes are expensive, but way fewer would be needed and that would save money.

    Did you actually read the article? Because the guy in charge of this stuff said that they were saving money by doing it this way. Considering the amount of money Google would be out if he were just lying through his teeth as part of the Linux Zealot Conspiracy (c), I really doubt that he's making that up. But if you'd like to point out all of the Google-sized sites that you're running, maybe we could talk.

    He also mentioned that using a freely-modifiable commodity OS on commodity hardware kept them free of any vendor pressure, which I imagine would be somewhat of a problem with Solaris, et al. No forced upgrades for Google!


    P.S. There is no Linux Zealot Conspiracy, of course, but you wouldn't know it by reading /. :P

    Caution: contents may be quarrelsome and meticulous!

  • As I said, I had forgotten something (mostly for the sake of simplicity). My point was: TCP is not simple, and parallelizing it is not pointless, nor does everybody do it. For instance, AFAIK FreeBSD has one of the most efficient TCP/IP stacks around, but it is not completely deserialized, and thus doesn't scale as well as it could on MP systems.

    About serializing: sure. But you can also tell that to the Java guys (in Java-ese, "serializing" means transforming an object's internal state into a byte stream that can be transferred over the network to some peer where, given the object's class code and the serialized data, an identical instance of the object can be created).

  • New York had the highest energy costs in the nation.

    New York State is not at all homogeneous. NY City and Long Island have horrendously high rates, while central and western NY are quite low.

    There is always a political tug of war regarding distribution of cheap hydro power from the St. Lawrence to the rest of the state, but you could always count on upstate being relatively cheap.

    When I moved from upstate NY to NJ my power rates tripled.

  • That's not necessarily the case. Even though they're using (reasonably cheap) IDE drives, they can still RAID-1 mirror them to prevent loss of data from hard drive failures. They would, however, have to suffer a half hour of downtime to replace the blown disk, but, despite what e-commerce consultants tell you ("if JimsGardenHoseEmporium doesn't get five-nines availability, it'll lose all its customers!"), most applications could afford a 30-minute period of inaccessibility for 1% of their data at a time. The hard thing is designing a resilient app that can operate well if a portion of its storage just disappears and then reappears sometime later.
    --JRZ
  • I wonder which gives them the higher electric bill, the servers themselves or the air conditioning required to cool them?

    You know, you raise a funny point. When relocating our company, we looked at the cost of bandwidth and electricity, knowing that it was a cost of business. But when you've got 8,000 servers, you've got to think that electricity becomes a huge issue in picking your location. You almost want to move further up North, just to cut your air conditioning bills.
  • Ask and ye shall receive. Click on FAQ [slashdot.org] on the left side of your screen, and you will discover the hardware behind the dot.
  • As of today, Google makes its complete archive of Usenet messages available [google.com] (since 1995, over a terabyte of data).
  • In that MP3 stream linked somewhere in this thread, it is mentioned how Alan Cox helped solve a problem with the kernel. So Google gives feedback and they profit from the open development model used for Linux (in this case from a patch that Alan provided within 4 hrs ;-)).

  • No need to load anything from a "master DB"; they stated in the article that there are several hundred copies of the index. That means that if any one server goes out, there are still several hundred servers serving the same data. The point is, if an e-commerce site WERE set up like this, it would still be perfectly functional. However, that would be quite an impressive setup for an e-commerce site.

    -Restil
  • If you hadn't put in your disclaimer, I would have marked you as a troll. Those 8000 boxes are administered automatically, via monitoring software. I don't know what they use, but there are programs to do that. Also, Google doesn't go in and maintain those boxes every day; perhaps once a month or once every two months they pull out all the boxes that are down or giving trouble and replace them with bare boxes. All they have to do is tell the box what index range to pull down and store; I bet everything is very automated. Anyway, for what Google is doing, you have to consider where they are coming from: they need I/O! Are you not impressed when you search Google and get a reply in 0.01 seconds? I am! Please don't compare with Hotmail; Google has never been down! Hotmail, on the other hand, ahem, ahem...

  • Try

    for server in $serverlist; do
        scp patchNNN.tar.gz "$server":                # copy the patch into the remote home dir
        ssh "$server" 'gunzip patchNNN.tar.gz; tar xf patchNNN.tar; ./install-patchNNN.sh'
    done

    It's not that hard to automate such a thing. Those 8000 servers are NOT managed individually -- that gets to be a real big pain, real fast.
  • AND they managed the upgrade without interrupting services. That is one of the benefits of using many individual smallish servers instead of a few large ones; that way you don't get stuff like this, which was on Yahoo today: "Whoops! We cannot process that request. We are presently performing system upgrades. During this time, some areas of the site may be unreachable." Yeah, I know Yahoo uses Google for their searches, but they don't use it for other services on their site.
    ----------------------
  • As my 3 Node IBM PS/2 Beowulf cluster
  • I actually didn't even design the shirts; it was a design from the "open source" t-shirts site geekshirts.sourceforge.net. I created the CafePress site mainly because my friends and I wanted one, and decided to leave it up afterwards. If you have a complaint about the origins of the quote, I'd suggest contacting the designer who submitted it to geekshirts.sourceforge.net.
  • Or it could be the fact that they are serving up bazillions of pages a day, each involving searches through a petabyte database. Google's code is insanely good; they just happen to be one of the most heavily accessed sites on the Internet, performing a very computationally intense operation (database searches).
  • For the cost of re-outfitting those machines with SCSI, you could probably add another 8000 servers
  • It's irony; I was playing off the long-time Slashdot "I wish I had a Beowulf cluster of these" tradition.

    I agree it's not terribly witty, but I found it slightly amusing.
  • There is a GPL project to be found at http://www.aspseek.org

    It is a deep crawler that works well; I compiled the current stable version under SUSE 7.0 and got it running together with MySQL.

    ASPseek is not Google, but I would say that it imitates Google a little bit. You can give it a try. I guess you do not need 4 PCs; crawling/searching on my Celeron 333 server with 160 MB RAM and an IDE hard disk did not stress the machine. I don't know what happens if you have lots of pages.

    The ASPseek people say that their baby has 4 million pages indexed.
  • As it's theorized that there aren't even 10^100 atoms in the universe, or electrons for that matter (obviously), we're going to have to REALLY shrink our die sizes down to get there...

    The New Pentium XXI running on a -.0001mu core.

    Justin Dubs
  • What's most amazing about that is that the storage is spread across 8000 computers, instead of concentrated in a few monstrous racks. As someone working in the storage industry, I find that approach quite surprising... I would have thought it would be cheaper and far easier to manage, say, 1000 servers and a dozen massive disk arrays than to have 8000 points of failure to worry about.

    ----
  • All true, but are they really making money? I rarely see an ad there (not banner ads, mind you, but their own form of search-related targeted ads). So are they still going off of VC money, or do the few ads I see cover the bills?

    I really like to hear that companies that do so much for so little are doing well, such as Google or Trolltech. I just worry about their actual business and the talented developers they employ...

    I guess they're doing OK if they added 4000 machines...


    --
  • by Anonymous Coward on Friday April 27, 2001 @10:23AM (#261364)
    I want to see pictures.
  • by Anonymous Coward on Friday April 27, 2001 @10:34AM (#261365)
    and without whoring themselves

    I have to say it's so nice not having a giant animated "Punch the monkey for $20" at the top of the screen. With Google, you actually have to look for the ads to see if there are any. It would be nice if a few other major sites learned something from this. What would that lesson be? Giant flashing ads only annoy people and do not bring in new customers.
  • by Bill Currie ( 487 ) on Friday April 27, 2001 @11:09AM (#261366) Homepage
    IMO, it's not the CPU power they're after (though it doesn't hurt), it's the io bandwidth. Think of it as a giant RAID array. Assuming their systems can pull 20MB/s off the hdds, that's 160000MB/s (or 156.25GB/s) total bandwidth (ignoring overheads).

    Bill - aka taniwha
    --

  • by Brigadier ( 12956 ) on Friday April 27, 2001 @10:34AM (#261367)


    I'm curious whether or not the optimizations made by Google are readily available to the public, i.e., GNU/GPL-style.
  • by cpeterso ( 19082 ) on Friday April 27, 2001 @10:30AM (#261368) Homepage

    "Google downloads Red Hat for free, taking advantage of the company's open source distribution. And Linux's open source nature allowed Google to make extensive modifications to the OS to meet its own needs, for remote management, security and to boost performance."

    I'm sure Red Hat is upset that they are missing out on the sale of 8000+ Linux licenses!! :-) Maybe they should block downloads from the *.google.com domain.

  • by the eric conspiracy ( 20178 ) on Friday April 27, 2001 @10:29AM (#261369)
    Buffalo, NY would have to be the ideal location for this. Cold as hell, and right next to the Niagara hydro plant for cheap power.

  • by Chewie ( 24912 ) on Friday April 27, 2001 @11:15AM (#261370)
    Well, Google has recently added paid links near the top of searches (but, thankfully, they've taken pains to identify them as such). Also, they make a metric buttwad of money licensing out their search engine to other sites (Yahoo!(TM) anyone?).
  • by crimoid ( 27373 ) on Friday April 27, 2001 @11:04AM (#261371)
    With all those machines you could just pull the dead ones out of service and leave them there until you wanted to do periodic maintenance (at which time you simply yank out the dead ones, replace them, flip on the power switch and walk away). Assuming you've got some clever auto-assimilation software you may not even need to configure the box manually.
  • As part of the infrastructure expansion, Google is consolidating. The company is moving out of datacenters in the San Francisco Bay and Washington D.C. areas, and consolidating in a new facility in the D.C. area. That means Google is moving from five to four datacenters--this, after adding three datacenters in the past year or so.

    I wonder if they really need that many servers, or whether they doubled their size in order to have a seamless transition during the move? I.e., get the new site up and running and handling load, and then take down the old site. Maybe they will sell off the old computers instead of moving them. This could just be PR spin to say "we doubled our size." Just devil's-advocate conjecture, but they are probably moving to DC from SF to save money on space - so this is more of a cost-cutting thing than anything else.

    Don't get me wrong, I love Google and use it every day, but I don't see any reason they would suddenly double their capacity.
  • by harmonica ( 29841 ) on Friday April 27, 2001 @04:14PM (#261373)
    See its Freshmeat entry [freshmeat.net].
  • by DonkPunch ( 30957 ) on Friday April 27, 2001 @03:42PM (#261374) Homepage Journal
    Interesting slogan on those shirts.

    http://www.elj.com/elj-quotes/elj-quotes-1999.html [elj.com]
  • by segmond ( 34052 ) on Friday April 27, 2001 @10:41AM (#261375)
    4 copies of Microsoft Windows 2100.
  • by MustardMan ( 52102 ) on Friday April 27, 2001 @10:27AM (#261376)
    A bit of a correction to my own point, it's not a petabyte database, that petabyte of storage contains several hundred copies of the database. It's still a friggin LOT of data.
  • by eric17 ( 53263 ) on Friday April 27, 2001 @12:26PM (#261377)
    Well, $120 per license is a pretty good deal. Maybe the government should get the same deal for us citizens. For 150 million copies, the discount should be down to say, $100 a copy. That's only $15 billion, just a drop in the bucket for rich old uncle sam, and just a bit more than half of M$'s yearly revenues, so it won't hurt them either, but OMG--think of the savings!
  • by Tackhead ( 54550 ) on Friday April 27, 2001 @01:45PM (#261378)
    > petabyte == 1million gigabytes
    > can you just imaging how much _______ (insert your choice: mp3s, pr0n, divX;), etc) you could store! damn. *drool*

    A full USENET feed (including binaries) is about 250GB per day (yes, about an OC-3 saturated), and growing at 50-60% per year.

    One petabyte works out to only four more years of future USENET, give or take 50%.

    Scary, ain't it?
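
    For what it's worth, that figure roughly checks out. A quick sketch using the numbers above (250 GB/day, ~55% annual growth, 10^6 GB in a petabyte):

    daily_gb, yearly_growth, petabyte_gb = 250.0, 1.55, 1_000_000.0
    stored_gb, years = 0.0, 0
    while stored_gb < petabyte_gb:
        stored_gb += daily_gb * 365      # another year of the feed
        daily_gb *= yearly_growth        # feed volume grows ~50-60% per year
        years += 1
    print(years)                         # -> 5, i.e. roughly four to five years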

  • by dopolon ( 88100 ) <<david.opolon> <at> <wanadoo.fr>> on Friday April 27, 2001 @11:27AM (#261379)
    They actually use some compression algorithm (gzip, I think) to compress the pages of the cache, because it would be silly to keep a complete uncompressed mirror of the cache, since it's a feature that's probably used by only 20% of users.
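
    A quick illustration of why compressing cached pages pays off (the sample page below is just a repetitive stand-in, not a real cached document):

    import gzip

    page = (b"<html><head><title>example</title></head><body>"
            + b"<p>the quick brown fox jumps over the lazy dog</p>" * 200
            + b"</body></html>")
    compressed = gzip.compress(page)
    print(len(page), len(compressed))    # this toy page shrinks dramatically; real HTML
                                         # typically compresses to a fraction of its size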
  • by turbodog42 ( 122173 ) on Friday April 27, 2001 @11:41AM (#261380)
    Well, when was the last time you searched on Google? It has a stunning number of sites indexed. I can search for just about anything, and Google always finds more accurate hits, faster, than any other search engine. (Don't turn this into a search engine flame war, either.) They have to constantly refresh their indexes, and they have to turn around fast answers.

    Yeah, 1.3 billion pages indexed is stunning. But even more stunning is the fact that the total number of "pages" (an overly broad term, I concede) on the Internet is at least 100, if not 500 times [brightplanet.com] that size. Basically Google is behind on indexing by 2 to 3 orders of magnitude.

    It's true that they constantly refresh their index. But it takes them about 2 months to do it. That ain't fast no matter how you look at it. As evidence, take a look at the date on the cached CNN.com home page [google.com]
  • by HerrGlock ( 141750 ) on Friday April 27, 2001 @10:23AM (#261381) Homepage
    I wonder which gives them the higher electric bill, the servers themselves or the air conditioning required to cool them?

    I'd just give up and get a handful of S/390s and do the same thing.

    DanH
    Cav Pilot's Reference Page [cavalrypilot.com]
  • by ichimunki ( 194887 ) on Friday April 27, 2001 @11:18AM (#261382)
    What good would open source search engine code do? Unless you wrote it in such a way that it ran on some sort of distributed basis, only your direct competitors would have the hardware to run it. I mean, Google is in the business of providing search results. If they give away the software that does this, anyone with a server farm can build the same engine. Now if they were a not-for-profit company (you know, a charity) or a volunteer effort like DMOZ, then I could see it, but I expect the stakeholders at Google prefer black ink on their bottom line.

    Free software makes all kinds of sense when users demand it, especially when it comes to operating systems, programming languages, and "productivity" applications. But it makes zero sense for a company who has not only written the software, but has the only machine running that software, to give away the software.
  • by kinnunen ( 197981 ) on Friday April 27, 2001 @01:11PM (#261383)
    http://www.google.com/corporate/index.html [google.com] (under business model).

    Also, do a search for "porn". Ads.

    --

  • by update() ( 217397 ) on Friday April 27, 2001 @11:04AM (#261384) Homepage
    Disclaimer: I don't know anything about enterprise-scale IT. If I'm saying something ridiculous, let me know!

    That said, I'm surprised by the positive slant on this story. 8000 boxes that have to be separately administered? This is cost-effective (and environmentally sound) compared to a small number of heavy-hitter Solaris, AIX or Tru64 systems? I have to say I was a lot more impressed by hearing what cdrom.com does with a single FreeBSD system than by how many Linux boxes Google has had to cobble together.

    I've got to wonder - if this were a story about 8000 W2K servers powering Hotmail, would it get the same spin?

    Unsettling MOTD at my ISP.

  • by rabtech ( 223758 ) on Friday April 27, 2001 @10:31AM (#261385) Homepage
    Why bother to put together 8,000 Linux boxes, when one could obtain high-powered 64-bit computers to accomplish the same task?

    You can always go with Tru64, W2K Datacenter, AIX, et al.

    It would be interesting to figure out how much high-powered hardware would be required to replace those 8,000 boxen and the software to run it, and see if it comes out less or more than running the 8k separate Linux boxes.
    -------
    -- russ

    "You want people to think logically? ACK! Turn in your UID, you traitor!"
  • by SpaceLifeForm ( 228190 ) on Friday April 27, 2001 @03:08PM (#261386)
    If you want to really know how it works:

    http://www-db.stanford.edu/~backrub/google.html
    Note: the document was written in 1998.
    Two snippets:
    6.3 Scalable Architecture

    Aside from the quality of search, Google is designed to scale. It must be efficient in both space and time, and constant factors are very important when dealing with the entire Web. In implementing Google, we have seen bottlenecks in CPU, memory access, memory capacity, disk seeks, disk throughput, disk capacity, and network IO. Google has evolved to overcome a number of these bottlenecks during various operations. Google's major data structures make efficient use of available storage space. Furthermore, the crawling, indexing, and sorting operations are efficient enough to be able to build an index of a substantial portion of the web -- 24 million pages, in less than one week. We expect to be able to build an index of 100 million pages in less than a month.

    9.1 Scalability of Google

    We have designed Google to be scalable in the near term to a goal of 100 million web pages. We have just received disk and machines to handle roughly that amount. All of the time consuming parts of the system are parallelizable and roughly linear time. These include things like the crawlers, indexers, and sorters. We also think that most of the data structures will deal gracefully with the expansion. However, at 100 million web pages we will be up against all sorts of operating system limits in the common operating systems (currently we run on both Solaris and Linux). These include things like addressable memory, number of open file descriptors, network sockets and bandwidth, and many others. We believe expanding to a lot more than 100 million pages would greatly increase the complexity of our system.
  • Anybody out there have more nitty gritty details on the specs of the latest boxes added? I am interested in CPU speeds, gigabit ethernet, RAM. 8000 of these things! The mind boggles...

    Evidently, they shun multiprocessor boxes, use big & fast IDE drives (2 per PC, one on each IDE channel), and, from last year's article [internetweek.com], use 100 Mbps links on the racks, with gigabit links between the racks. Last year's article also quotes "256 megabytes of memory and 80 gigabytes of storage", though I imagine it's closer to 512MB (at least) and 180 GB per server now. It also says that they pack them in 1U on each side of a rack.

    But, here's the kicker, "Many of the systems are based on Intel Celeron processors, the same chips in cheap consumer PCs."!

  • by chris_mahan ( 256577 ) <chris.mahan@gmail.com> on Friday April 27, 2001 @12:27PM (#261388) Homepage
    The point of failure thing is a good point. If 10% of their servers fail (800) they still have 7200 that work fine, and they can probably handle things just fine.

    If 50% of their servers fail, then they would be slow, but still work fine.

    If 90 percent of their servers failed, they would still have 800 up. It would be very slow, but might still handle the load.

    If you had 1000 servers with disk array and your system failed, then ouch!

    On the other hand, they probably have half a dozen burned CDs of their implementation of Linux (depending on the HW configuration), so if a server fails, they take it offline, put another one in there, load the OS already preconfigured from the CD (with all the conf and stuff done already) and bring it online.

    One tech can probably put 10 servers online a day.

    So 30 techs can probably put up 300 servers a day.

    Assuming each Linux box operates without admin intervention for 90 days, there would be about 89 boxes that need to be fixed each day (about 1%), and so 9 techs could handle it (a quick check of this arithmetic is sketched below).

    They probably have more than that.

    And since the technology is not hard to understand because it's a dual pentium PC, they don't have to call the IBM mainframe guy over. Also, they probably have a few dozen servers already configured, ready to be popped into the rack.
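
    A quick check of the failure-rate arithmetic above (the 90-day interval and 10 repairs per tech per day are the parent's assumptions, not known figures):

    import math

    servers = 8000
    days_between_interventions = 90      # assumed: each box needs attention every 90 days
    repairs_per_tech_per_day = 10        # assumed: one tech can put 10 servers online a day

    boxes_per_day = servers / days_between_interventions
    techs_needed = math.ceil(boxes_per_day / repairs_per_tech_per_day)
    print(round(boxes_per_day), techs_needed)    # -> about 89 boxes a day, 9 techs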

  • by dougel ( 447369 ) on Friday April 27, 2001 @11:01AM (#261389) Homepage
    I mean, why not... Really: Windows 2000 Server OEM at $642.60 times 8,000 PCs is only $5,140,800. Now, for the peace of mind that comes with a crash-proof Windows box, why would Linux even be an alternative? The worst part about this post is that there are MCSEs reading it and saying "right on, my brainwashed friend!" =-=-=- Doug
  • by Anonymous Coward on Friday April 27, 2001 @11:06AM (#261390)
    Funny story. Google got into the Virginia facility when GlobalCenter owned the datacenter. Before Google, the sales people would only sell "floor space". Google's one and a half cages, jammed full of 1U Linux boxes, pulled so much power that it rendered 6 surrounding cages unsellable. After that, the sales people began selling "amp-capped floor space" rather than just square feet.
  • by ch-chuck ( 9622 ) on Friday April 27, 2001 @11:01AM (#261391) Homepage
    Microsoft would be woefully inefficient in that environment

    ... 8000 MSFT boxen is probably getting to the point where you'd need 3 shifts of MCSEs full time just to reboot the damn things - kinda like the days they made computers with so many vacuum tubes that the failure rate caught up with them, and a machine would barely run before another tube needed replacing.
  • by pivo ( 11957 ) on Friday April 27, 2001 @11:11AM (#261392)
    Considering that they're not necessarily Linux advocates, I'd imagine they did that calculation *before* buying all those machines.

    In any case, they'd have done it at some point along the line before the 8000th server arrived, and if they found they were making a mistake I can't see why they wouldn't have switched by now. Especially since if they thought NT would somehow be so much better they could have just removed Linux and installed NT and not have had to buy more hardware.

    Sounds like Linux is working out pretty well for them.

  • by kinkie ( 15482 ) on Friday April 27, 2001 @12:57PM (#261393) Homepage

    Let's recap how a single packet has to be handled (and I've probably forgotten something):
    You get the ethernet interrupt, you have to DMA the frame off the board, check which protocol it belongs to (if it's not IP, drop it), checksum it, check if you have to do any reassembly, check what protocol it is (it might not be TCP after all), check that the packet makes sense given the connection's history (i.e., sequence numbers and various other bits here and there), identify the process waiting for the packet, copy to userspace, and signal the process.
    A multithreaded TCP/IP stack means that more than one packet can be in the pipeline at the same time. It makes no difference on a UP system really, but on an N-processor box it can multiply your throughput by N (at least theoretically), just as a multithreaded app can increase throughput on a multiprocessor system.
    Of course, to be feasible, as many parts of the stack as possible must be reentrant, or you'll have to do locking and thus (in MS-ese) "serialize".
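
    A rough userspace sketch of that idea in Python (this only illustrates fine-grained, per-connection locking with several packets in flight at once; it is not how any real kernel stack is written, and all names here are invented):

    import threading, queue, zlib

    NUM_WORKERS, NUM_LOCKS = 4, 16
    pkt_queue = queue.Queue()
    conn_locks = [threading.Lock() for _ in range(NUM_LOCKS)]   # hash buckets of locks
    connections = {}                                            # rather than one big lock

    def process_packet(conn_id, payload):
        checksum = zlib.crc32(payload)           # stands in for checksum/protocol checks,
                                                 # which need no lock and run in parallel
        with conn_locks[conn_id % NUM_LOCKS]:    # only per-connection state is locked
            connections.setdefault(conn_id, []).append((checksum, payload))

    def worker():
        while True:
            pkt = pkt_queue.get()
            if pkt is None:                      # shutdown sentinel
                break
            process_packet(*pkt)

    threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
    for t in threads:
        t.start()
    for i in range(1000):
        pkt_queue.put((i % 32, b"payload"))      # 32 fake "connections"
    for _ in threads:
        pkt_queue.put(None)
    for t in threads:
        t.join()
    print(sum(len(v) for v in connections.values()))    # -> 1000 packets handled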
  • by travisd ( 35242 ) <`travisd' `at' `tubas.net'> on Friday April 27, 2001 @10:26AM (#261394) Homepage
    I've seen their cage out at Exodus in Virginia. Pretty cool.. They have like 6 racks of servers there - each rack is 80 servers I believe. They use systems from Rackable [rackable.com]. Generally in a hosting facility you pay per rackspace and bandwidth -- more servers/rack means less cost/month in space.
  • by Ender Ryan ( 79406 ) on Friday April 27, 2001 @10:36AM (#261395) Journal
    I thought I was really cool with my 100 gigs of storage at home filled with DivX ; ) movies and MP3s. 1 million gigabytes, that's insane.

    Ok, new poll

    What do you think is stored at Google?
    1. Huge search engine index
    2. Pr0n
    3. MP3s
    4. DivX ; ) Movies
    5. DivX ; ) Pr0n
    6. Marketing data collected with satellites and video cameras attached to flies... just like MLB
    7. Cowboyneal's transporter pattern buffer

    note: I own _MOST_ of the mp3's and divx movies I have...

  • by GreyyGuy ( 91753 ) on Friday April 27, 2001 @10:27AM (#261396)
    Just think how much it would cost to license 8000 servers with win2k and whatever database they would use. Would Google even be able to do this on M$?
  • by clink ( 148395 ) on Friday April 27, 2001 @10:23AM (#261397)
    I hope these people aren't located in California. Otherwise I think we've located the source of the electricity crunch.
  • The scalability of many small servers is great, but I would think they would run into a wall eventually due to the effort required to maintain all those machines. I mean, even if the failure rate is very low on a per-machine, per-unit-time basis, if you have enough machines you're going to wind up replacing multiple hard drives, cards, mobos, etc. every day. Their system is redundant enough that this doesn't affect performance, but there is a cost associated with the manpower required to do all that maintenance.

    I just gotta wonder at what point they would get better overall efficiency by replacing all those little boxes with a couple of big iron mainframes.

  • by sumengen ( 230420 ) on Friday April 27, 2001 @03:49PM (#261399)
    I listened to a talk by a Google senior engineer about 10 months ago. They are really good at load balancing and should become a good example for other companies. Interesting points I remember:

    - The number of websites is increasing exponentially, so the number of computers (or required CPU cycles) increases exponentially too. On the other hand, the price per CPU MHz also decreases exponentially (Moore's law?). That is the key to scalability; at least the cost problem is not exponential.
    - As mentioned in this article, they were running Celeron 500s with 256 MB RAM and 2x 40 GB hard disks back then. When a computer fails, it is easy to replace because the hardware is cheap.
    - Buy systems with as many parts as possible integrated onto the main board (NIC, etc.); it is supposedly more reliable.
    - They are not running Linux because it is cheaper. I have seen headlines claiming this, including on Slashdot, but it is not true. They don't deny that they saved a lot of money because of it, but when they started Google that wasn't the issue. He mentioned that they could have gotten a good deal from Sun for Solaris; the reason was the openness of the source code and the other reasons mentioned in the article. By the way, he mentioned that TCP stack issues were also considered when the decision was made; it looks like they are confident that they can fix any problems in house.
    Google wants to design all the software they run at Google. They don't want to use third-party software because it introduces instability and makes bugs difficult to fix.
    - They are not running Apache; using Linux doesn't mean running Apache. They designed their own web server, which is as simple as possible and therefore fast. They don't need a complicated web server: all the computation is done in the background on the Linux servers, and the web server only needs to send the query to the query server and display the results (see the sketch after this list).
    - Google's job is easier than people might think. Their database is not dynamic; it only gets updated once a month, and updating means replacing the old files with the new ones, which is an offline process. Compare this with an e-commerce site displaying real-time statistics and you can see that Google has an advantage that makes things easier for them.
    - Let's say spidering and crawling is done in one datacenter. You need to copy those terabytes of data over to the other datacenters and then replicate them to multiple server farms in each datacenter. You have to do this fast and without any errors; you don't want to use ordinary OS file-copy functions for that.
    - They rent multi-gigabit bandwidth for off-peak hours when there is not much traffic, of course for a very, very cheap price. They use this bandwidth to copy data files from the west coast to the east coast. We are talking about many terabytes.
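
    A minimal sketch of that thin-front-end idea in Python (the backend host name, port, and URL layout here are invented for illustration; this is not Google's actual server):

    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.parse import urlparse, parse_qs, quote
    from urllib.request import urlopen

    QUERY_BACKEND = "http://query-server.internal:9000/search?q="   # hypothetical backend

    class FrontEnd(BaseHTTPRequestHandler):
        def do_GET(self):
            # pull the user's query out of /search?q=...
            q = parse_qs(urlparse(self.path).query).get("q", [""])[0]
            try:
                body = urlopen(QUERY_BACKEND + quote(q), timeout=2).read()
            except OSError:
                self.send_error(502, "query backend unavailable")
                return
            # relay the backend's answer unchanged; the front end does no other work
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8080), FrontEnd).serve_forever()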
  • by V50 ( 248015 ) on Friday April 27, 2001 @10:30AM (#261400) Journal

    They are still NOWHERE near a googol of servers, as their name suggests... Humph...


    --Volrath50

  • however I haven't seen that many testimonials/reviews from sites that use it.

    http://slashdot.org/article.pl?sid=01/04/26/0339219 [slashdot.org]

    Anandtech.com [anandtech.com] is using it.

    --

  • by Anonymous Coward on Friday April 27, 2001 @10:23AM (#261402)
    This is what you can tell people when they tell you that linux is a toy. The best search engine in the world is *not* a toy.
  • by Precision ( 1410 ) on Friday April 27, 2001 @10:55AM (#261403) Homepage
    We have been using LVS on SourceForge, Linux.com and Themes.org, and I have nothing but good things to say about it. I have yet to have any real problems. We have 2 firewalls with automagic failover using heartbeat. We also use keepalived to automagically remove webservers from the queue if they go down. All in all it's been a great piece of software.
  • by ethereal ( 13958 ) on Friday April 27, 2001 @01:11PM (#261404) Journal

    Totally not the case - they've made their OS what they want, and they can change it if they want to. Don't confuse the cost of rolling out changes to 8000 machines with the cost of forcing a proprietary OS vendor to make the changes you need - you can roll out 8000 machines on a rolling basis in a week, assuming a conservative 1 hour automatic install 80 at a time (1% unavailability). You may never be able to get Sun or Microsoft to make the changes you need in an OS, if it isn't in their best interest to do so. Google's only "locked in" to RH in the sense that they can only achieve sufficient flexibility with an open source OS, and it sounds like they just went with RH because it's easier to hire admins. I bet they could run on any other flavor of Linux pretty easily, and *BSD without too much pain if they had to.

    Moderators, the above was only insightful if you don't care to think very hard...

    Caution: contents may be quarrelsome and meticulous!

  • by blinx_ ( 16376 ) on Friday April 27, 2001 @10:25AM (#261405)
    In recent months I've been trying to read everything I can find about load-balancing large web sites, and Google sure makes an interesting example.
    My company is in the process of moving from one big server to several smaller ones to allow for greater scalability; there is just a limit to how much CPU and memory you can put in a single box. Our future site will probably use Linux Virtual Server, which seems quite nice; however I haven't seen that many testimonials/reviews from sites that use it. The company I work for creates online image-manipulation services, and part of the process is rendering large, high-quality images. The hard part seems to be shared storage of these images (SCSI over TCP/IP seems very interesting); load balancing with static pages seems easy enough. Anyway, Google's way of using many small machines is an inspiration.

  • by leperjuice ( 18261 ) on Friday April 27, 2001 @10:48AM (#261406)
    Google's applications are unique, requiring far more extensive load-balancing, computing, and input-output bandwidth than other enterprise applications.

    The question that should be asked here is whether they are sharing the results of their work. I bet they're probably lifting some of their techniques hot and fresh off of research papers, and they may be the first to actually use them in an enterprise environment.

    Note that I personally believe that closed source is not necessarily a bad thing. But if Google has made radical changes to these enterprise-grade tools, it would be nice to see them trickle down into the mainstream distros. While we as home users would probably never need them, it would certainly put to rest some of the pro-Microsoft arguments against Linux as a server-grade OS.

    Of course, for all I know, they could be actively working with Cox et al to incorporate their findings into the kernel and related tools.

    Either way, a very impressive job done with a operating system that "is simply a fad that has been generated by the media and is destined to fall by the wayside in time." [microsoft.com]

    Note that I use Windows and Linux, so I'm no bigot... (some of my best friends are Microsoft programmers!)

  • by ottffssent ( 18387 ) on Friday April 27, 2001 @11:05AM (#261407)
    "And no, Linux on IBM/390 WILL NOT help them because it is just an emulation, and disk arrays of this one huge computer will get swamped by the billions of read requests (the same way they will get swamped on Starfire or the same S390 under OS390)"

    Exactly. Even at ~1M/s per IDE drive (lots of random reads), that's 1M/s * 8000 machines * 2 drives/machine (yeah, some have 4, but the article doesn't say how many) = 16GB/sec. It would take a hell of a SCSI setup to equal that bandwidth, let alone the massive numbers of IOs.

    Further, even if the boxen only have 2G memory each, that's 16TB of memory, which you could put in one big server, but no single memory system is going to provide the throughput that 8000 SDRAM channels will.
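
    The same back-of-the-envelope numbers in code form (the per-drive and per-box figures are the parent's assumptions, not measurements):

    servers, drives_per_box = 8000, 2
    mb_per_sec_per_drive = 1          # ~1 MB/s of random reads per IDE drive, per the parent
    gb_ram_per_box = 2                # the parent's "only 2G memory each"

    disk_gb_per_sec = servers * drives_per_box * mb_per_sec_per_drive / 1024
    total_ram_tb = servers * gb_ram_per_box / 1024
    print(f"{disk_gb_per_sec:.1f} GB/s of reads, {total_ram_tb:.1f} TB of RAM")
    # -> 15.6 GB/s and 15.6 TB, which the post rounds to 16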
  • by Chewie ( 24912 ) on Friday April 27, 2001 @10:45AM (#261408)
    Several points here: W2K DC doesn't run 64-bit, at least not until Itanium is released. Second, for something like this, there are two reasons to do a large server farm: scalability and throughput. They said that they do not have one monolithic storage system, but instead partition the database up into small segments in the servers themselves. This means that they can handle many more I/Os per second than one (or several) big iron boxes could do. Also, those big 64-bit boxes are damn expensive (both hardware and software). For the price of one of those, you can get cheap servers and cluster them together. The big iron boxes are great for large databases that can't be split up among several servers/storage systems, but if you can split the database up (as they have done), a farm of small servers will always provide better scalability and throughput than one big box. And aren't those two things the secret behind the web game?
  • what in gods name do you need 8000 linux servers for? quake? I cant figure out what google could possibly use all that power for... if they really *need* all that power, they're obviously doing something wrong with their code.

    Well, when was the last time you searched on Google? It has a stunning number of sites indexed. I can search for just about anything, and Google always finds more accurate hits, faster, than any other search engine. (Don't turn this into a search engine flame war, either.) They have to constantly refresh their indexes, and they have to turn around fast answers.

    Yahoo even uses them for their search engine. I can't imagine being able to service Yahoo's search needs with anything less than a full-fledged data center split across two cities.
  • by revscat ( 35618 ) on Friday April 27, 2001 @10:24AM (#261410) Journal

    This is only tangentially related to the story at hand, but I would just like to compliment Google on a job done extremely well. They have successfully built the fastest search engine out there, using open methodologies and without whoring themselves out like any number of other search engines. They continue to add interesting (and [gasp!] useful) features, such as searching PDF documents and their translation engine. They have really helped the Open Directory Project along, as well.

    There are successful .coms out there, but I think their business practices are so foreign to the "regular" business community that they aren't quite sure how to handle it.

    BTW: Anyone else see a philosophical relationship between Google and ArsDigita?

  • by Phrogz ( 43803 ) <!@phrogz.net> on Friday April 27, 2001 @11:03AM (#261411) Homepage
    Google indexes 1.3 billion Web pages on over a petabyte of storage--that's more than a million gigabytes. "That's not to say that the index takes up a petabyte..."

    And what takes up all that size? You know it--pr0n. The storage size says it all...it's not a petabyte they've got there, but a pedobyte. Sick google bastards. :)

  • by Tackhead ( 54550 ) on Friday April 27, 2001 @01:38PM (#261412)
    > All true, but are they really making money? I rarely see an ad there (not banner ads, mind you, but their own form of search-related targeted ads). So are they still going off of VC money, or do the few ads I see cover the bills?

    Actually, I think they're being smart about it.

    If the typical query returns one USENET post - maybe 2-3 kilobytes of text - why would you want to (as Deja did) spend money sending 20-30 kilobytes of HTML for the associated frames and banners and other ad support?

    The user's gonna see one ad. Google's bandwidth and I/O costs are gonna explode if the HTML wrapped around each ad takes up 10 times as much space as each query's results.

    By going with text-based ads and a non-frames approach, they not only make the site more user-friendly (thereby adding value), they cut their own costs by a sizable fraction.

    With lower bandwidth costs and I/O requirements, Google can make money with less ads, not more. That's where (IMHO) Deja went wrong - the more they needed the ad-revenue, the more they escalated the cost of serving the ads, in a vicious circle that consumed them.

    It's also where (IMHO) Google is doing it right.

  • by supabeast! ( 84658 ) on Friday April 27, 2001 @10:26AM (#261413)
    I have seen some of Google's stuff in Northern Virginia. Those guys really know how to do high-density racks. They have double-sided racks of 1U servers, with what I believe is 47 servers per side. The cabling alone is gorgeous. The bright red and shiny steel racks full of hundreds of flashing LEDs look like something out of a rave.
  • by StoryMan ( 130421 ) on Friday April 27, 2001 @10:41AM (#261414)
    What they should do is utilize the heat escaping from that chimney of theirs to power steam turbines.

    Then use the turbines to drive generators.

    Then send the power from those generators to the western United States.

    Now -- follow me here -- this would be a self-sustaining system, no?

    Users use google to search the web and read their embarrassing usenet posts from 1995. Power is generated. That power is funneled back to the user so that his or her computer stays on, the lights stay on, and they don't have to worry about getting stuck in an elevator during a rolling blackout.

    Users are happy, nuclear opponents don't have to worry about radioactive leaks into the environment from improperly sealed cooling tanks and leaking water, and google remains up and active, chugging away ad infinitum.

    Simple.

    Tomorrow, I'll work on my plan for cold fusion. Maybe a couple of Guinness glasses filled with tapwater, a couple of batteries, and a Beowulf cluster ...

  • by BMazurek ( 137285 ) on Friday April 27, 2001 @11:09AM (#261415)
    Now -- follow me here -- this would be a self-sustaining system, no?

    "Lisa! In this house we obey the laws of thermodynamics" -- Homer Simpson

  • by bellings ( 137948 ) on Friday April 27, 2001 @02:20PM (#261416)
    8000 boxes that have to be separately administered?

    Why would 8,000 identical boxes be difficult to administer? The guys who develop the monitoring software and the install and upgrade processes are probably pretty smart cookies. But the actual maintenance of the machines could probably be handled by monkeys.

    Think about it: the instructions for handling a hardware failure in one of these machines is probably:
    1. Identify bad part
    2. Replace bad part with any of the two dozen exactly identical parts we keep in the spare parts closet.
    3. Put system recovery CD in drive.
    4. Reboot.
    5. Remove the system recovery CD when it automatically ejects at the end of the recovery process.
    6. If this doesn't work, call our system engineer, at 555-1212
    The spare parts closet probably just has boxes with labels like: "This box contains 80GB Maxtor hard drives -- exact match for every hard drive in rack 5, 7, and 8." Another box might be labeled: "AMI A571 motherboards -- exact match for all motherboards in rack 1, 2, 3, 4, and 7."

    Another box in the closet is probably labeled "Empty, pre-labeled Fed-Ex shipping boxes that are exactly the right size for our rack mounted hardware. Use to ship any badly broken machines back to our system engineer. Call first!"
  • by Poligraf ( 146965 ) on Friday April 27, 2001 @10:44AM (#261417)
    The point is that their information and the cost of failure are not critical. If one of Google's servers (or hard drives) dies, they can just find out what pages were stored there (from the master DB) and reload them into the storage on a new PC (and I'm sure they have some PCs with identical data).

    Now imagine an e-commerce site built like that. Loss of any part of user list or merchandise catalog is a major failure. This is why such sites are usually powered by a moderate (typical site) to huge (Amazon, eBay) database with an enormous redundancy built in.

    And no, Linux on IBM/390 WILL NOT help them because it is just an emulation, and disk arrays of this one huge computer will get swamped by the billions of read requests (the same way they will get swamped on Starfire or the same S390 under OS390). The entire idea of the setup is that you have a lot of independent disk channels.

    Another interesting insight is that they have made some improvements to administering all of these machines remotely. Otherwise they would blow all their money on paying sysadmins ;-)
  • by Matthew Luckie ( 173043 ) on Friday April 27, 2001 @11:14AM (#261418)
    "It doesn't look like Google got the e-mail that the dotcom boom is over"

    three possible explanations:

    1. they have a spam filter in place
    2. they have a microsoft exchange server somewhere
    3. they were too busy going through everyone else's embarrassing usenet postings to read their own email
    my guess is the third one

  • by Sinjun ( 176671 ) on Friday April 27, 2001 @10:22AM (#261419)
    I wonder what kind of information Google has about the deficiencies of the Linux TCP/IP stack? Certainly with 8,000 servers they could have some input on how the lack of multi-threading affects performance on a major site. I know that the most recent kernels and Apache versions were supposed to have dealt with this issue, but has anyone seen such a large-scale experiment?
  • by stype ( 179072 ) on Friday April 27, 2001 @10:47AM (#261420) Homepage
    Go to google, click on preferences and change your language to "bork, bork, bork." From now on the site is completely in Swedish Chef (no joke).
    -Stype
  • by vslashg ( 209560 ) on Friday April 27, 2001 @10:55AM (#261421)
    "That's not to say that the index takes up a petabyte. We have several hundred copies of the index," Felton said. "Most of the servers are serving up some fraction of the index." The index is partitioned into individual segments, and queries are routed to the appropriate server based on which segment is likely to hold the answer.
    An interesting metric that they don't go into in this article:
    • 4,718 of the servers index pr0n
    • 2,148 of the servers index warez
    • 1,634 of the servers index MP3 sites
    • 1,139 of the servers index various "ate my balls", "all your base", and other joke-of-the-month sites
    • 278 of the servers index content
  • by SirChive ( 229195 ) on Friday April 27, 2001 @10:41AM (#261422)
    Google is wonderful. But I'm left wondering where they get their financing and what their long term goals are.

    The Google site features minimal advertising. So they are most likely funded with VC money. This means that they must have a plan for making money at some point. What is it and when will it kick in?
  • I'm sure Red Hat is upset that they are missing out on the sale of 8000+ Linux licenses!! :-) Maybe they should block downloads from the *.google.com domain

    I imagine they only download it once, then distribute via LAN. Besides, from last year's coverage, "Google actually paid for only about 50 copies of Red Hat, and those purchases were more of a goodwill gesture. "I feel like I should be nice, so when I go to Fry's I pick up a copy," Brin said."

  • by epiphani ( 254981 ) <epiphani@@@dal...net> on Friday April 27, 2001 @10:30AM (#261424)
    Not necessarily commenting on the multi-threading issue, but the 2.4.x kernels have substantially better socket handling... there were articles floating around on Slashdot and linux.com a while back about a DALnet server breaking 38,000 simultaneous active open sockets at one time. Linux has done wonders with the 2.4.x TCP/IP stack; until recently, nobody even considered Linux's stack worthy of an attempt at an IRC server of any reasonable size.
  • by daveym ( 258550 ) on Friday April 27, 2001 @11:00AM (#261425)
    "The Google site features minimal advertising. So they are most likely funded with VC money. This means that they must have a plan for making money at some point. What is it and when will it kick in?" Ummm...If you go to google and read about their company, you will learn that most of their income comes from licensing their awesome search engine for internal use by other companies. NOT from advertising. With everyone just now learning that advertising on the web sucks balls, this looks like a pretty shrewd move on the part of Google....
  • by nrozema ( 317031 ) on Friday April 27, 2001 @10:35AM (#261426)
    I just spent all of yesterday afternoon installing a 63-node rack from Rackable. The build quality of these units is excellent... amazingly dense and efficient. According to the installers, in addition to Google, their systems are also used extensively by Yahoo and Hotmail.
  • by dhamsaic ( 410174 ) on Friday April 27, 2001 @11:54AM (#261427)
    "Has Google bragged about how much electricity they are consuming to run 8,000 electrical heaters? Have they boasted about how much pollution their power consumption generates?" - They haven't bragged about *anything*; an article was simply written by an outside source which gave some details of their setup. They also note that they have hundreds of copies of the index, so the redundancy is there - if one server goes down, another hops back up. Google *is* a business, and they need to be reliable. They're out to a) provide a useful service and b) make money. It's not useful if you can't get to it.

    They're using 8,000 computers to accomplish a pretty amazing feat, and they're doing this instead of buying a pretty huge farm of larger and faster computers anyway. Sometimes more smaller parts are better - you don't have one big machine that fails, separate parts are replaceable (say 10 or 20 machines instead of a few larger servers).

    You don't build a house starting with a large block of concrete - you use bricks. Google is doing the same thing. Cut them some slack.

  • by Magumbo ( 414471 ) on Friday April 27, 2001 @10:52AM (#261428)
    "they direct heat to a central chimney which is blown up to a high-powered fan"

    And these high powered fans then blow the blisteringly hot air along a complex series of ducts which lead to facilities which:

    a) generate electricity for the wall-o-lava-lamps [google.com]
    b) are used to fill state-of-the-art, floating, hot-air furniture [google.com]
    c) keep folks warm-n-toasty in the sauna [google.com]
    d) make you hot [google.com] and thirsty [google.com]

    --
