How Many Google Machines, Really? 476
BoneThugND writes "I found this article on TNL.NET. It takes information from the S-1 Filing to reverse engineer how many machines Google has (hint: a lot more than 10,000).
'According to calculations by the IEEE, in a paper about the Google cluster, a rack with 88 dual-CPU machines used to cost about $278,000. If you divide the $250 million figure from the S-1 filing by $278,000, you end up with a bit over 899 racks. Assuming that each rack holds 88 machines, you end up with about 79,000 machines.'" An anonymous source claims over 100,000.
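The summary's arithmetic is easy to verify; here is a quick sketch using exactly the figures quoted above (the result is a rough estimate, not a headcount):

```python
# Back-of-the-envelope check of the article's estimate.
capex = 250_000_000      # dollars of hardware spend, per the S-1 filing
cost_per_rack = 278_000  # dollars per 88-machine rack, per the IEEE paper
machines_per_rack = 88   # dual-CPU machines in each rack

racks = capex / cost_per_rack
machines = racks * machines_per_rack
print(f"~{racks:.0f} racks, ~{machines:,.0f} machines")
```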
What is that as a percentage ... (Score:3, Interesting)
* of servers in the USA
* of servers running Linux
IPO changes things (Score:5, Interesting)
Assumptions? (Score:5, Interesting)
Um, don't you think if you were buying 899 racks you might actually, you know, negotiate for a better price?
This isn't the only assumption in your analysis, and the problems with them will be compounded. What's the point of this, really?
Cheap hardware (Score:1, Interesting)
So they probably don't use "racks," but if they did, that price works out to only about 12-15 desktop machines (single proc) per rack. That's a whole lot less than the 42 1U rackmounts it would take to fill the rack.
Re:$278k ?? (Score:1, Interesting)
This is actually useful (Score:3, Interesting)
where do you go to buy 80,000 hard drives?
88 machines per rack? hardly. (Score:3, Interesting)
Google hosting (Score:5, Interesting)
Re:What a waste (Score:5, Interesting)
Sunny Dubey
15 Megawatts (Score:5, Interesting)
...assuming 200W per server, which is probably low, but that likely compensates for 79,000 itself being an overestimate. However, it doesn't even begin to account for the energy used to keep all that hardware cool.
Anyone know how many trees per second that would be? Conversion to clubbed-baby-seals-per-sec optional.
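For what it's worth, the headline figure checks out under the parent's own assumptions (both the 200W draw and the 79,000 server count are guesses from this thread):

```python
# Power estimate behind the "15 Megawatts" headline.
servers = 79_000     # the article's estimate (likely high)
watts_each = 200     # per-server draw; a guess, probably low
total_mw = servers * watts_each / 1_000_000
print(f"~{total_mw:.1f} MW compute load (cooling overhead not included)")
```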
Re:Which brings up an interesting question... (Score:4, Interesting)
This is how it should be, since knowing the size of Google's hardware capacity is a very, very strategic bit of information, and the kind of thing that would allow Yahoo/MSN/whoever to get a feel for how much capital would be necessary to duplicate or improve upon it.
Re:88 machines per rack? hardly. (Score:3, Interesting)
inside information (Score:5, Interesting)
Interesting People 2004/05 [interesting-people.org]:
I know for a FACT they passed 100,000 last November. One thing the Louis calculation may have missed is Google's obsession with low cost. For example, read the company's technical white paper on the Google file system. It was designed so that Google could purchase the cheapest disks possible, expecting them to have a high failure rate. What happens when you factor cost obsession into his equation?
You're not factoring in Google's culture (Score:4, Interesting)
Re:$278k ?? (Score:5, Interesting)
Scary... DDOS? (Score:3, Interesting)
Someone mentioned that they have enough bandwidth/processing power to saturate a T1000 line. Scary...
Re:Environmental impact: power to 68,000 homes (Score:3, Interesting)
Whereas Slashdot uses nothing but solar power.
But his low end number are Wrong... (Score:3, Interesting)
Re:Nobody has 88 systems in a rack (Score:5, Interesting)
Re:lego? (Score:3, Interesting)
Re:Google hosting (Score:5, Interesting)
Web in memory (Score:2, Interesting)
Of course, once you consider that they keep thumbnails of all the images they index, things get tight very quickly. Plus, we can't forget the actual INDEX from words to documents -- that's in memory, too. And Orkut (which is probably pretty small, come to think of it).
GMail is another story altogether. 1 GB per user for 100K users would saturate their cluster. Plus indexes for searching mail. It seems unlikely that we'll have all-memory mail accounts anytime soon.
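The tension the parent describes is easy to put numbers on. A sketch, where every figure is an assumption for illustration (Google never disclosed per-machine RAM; 100K users is the parent's example; 1 GB is the announced Gmail quota):

```python
# Rough sizing of the "web in memory" vs. Gmail tension.
machines = 79_000
ram_gb_each = 2                    # guess for 2004-era commodity nodes
cluster_ram_tb = machines * ram_gb_each / 1024

gmail_users = 100_000              # the parent's example figure
gmail_tb = gmail_users * 1 / 1024  # 1 GB quota per account
print(f"cluster RAM ~{cluster_ram_tb:.0f} TB; "
      f"Gmail at full quota ~{gmail_tb:.0f} TB")
```

Even with generous RAM assumptions, mail at full quota would eat most of the cluster's memory, which is the parent's point.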
I think they include infrastructure & air cool (Score:3, Interesting)
Re:Google hosting (Score:4, Interesting)
Re:$278k ?? (Score:5, Interesting)
You've never worked in a medical field have you? You'd think that that would be a big deal and in theory data integrity is a very high priority but in reality...
I used to work as the IT Manager for a diagnostic imaging and cancer treatment center (and still do contract work with them because my replacement is kind of a noob). While losing studies isn't exactly a "no big deal" situation, it's still far more common than patients will ever realize. The server that stores and processes all of the digital images from the scanning equipment is a single-CPU home-rolled P4 using some shitty onboard IDE RAID controller (doesn't even do RAID5!) running Windows 2K. The most money I could get for setting up a backup solution was the $200 an external FireWire drive cost. Somehow we never managed to lose a study once it reached my network in the 9 months I worked there, but I know three or four were deleted from the cameras themselves before being sent properly, so whoops, it's gone, gotta reschedule (and bill their insurance or Medicare again!). Two weeks ago one of the drives in that 0+1 array failed, and despite my pleadings they still haven't ordered a replacement yet...
Now it's tempting to think that this place is just a special case of cheapness and sloppiness, but from talking to the diagnostic techs (the people who operate the cameras), that's not so. That clinic is a little worse than average in terms of losing patient information, but by no means the worst some of them have seen, heard of, or worked at in their careers. It's worse in general at small facilities, but even large hospitals often suffer from the same unprofessionalism.
Your bank and the phone company keep much better track of your calls or your ATM transactions than most hospitals do with your CT or MRI scans...
Re:hardcore (Score:3, Interesting)
Actually, that's pretty close to the number of copies of Red Hat Google actually paid for in 2000.
From here [internetwk.com]
Redundancy (Score:4, Interesting)
Some of the reasons these techniques aren't used in enterprise computing:
Since I've seen it up close a few times, I can say that the standard "enterprise way" (Oracle/Sun/EMC) delivers very poor bang for the buck. If Google wanted to, they could deliver a modified GFS with any desired level of reliability by increasing the redundancy. And even after that bloating, it would still deliver greater bang for the buck than the conventional solutions.
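The trade the parent describes (cheap disks plus extra replicas instead of expensive "enterprise" storage) can be illustrated with a toy model. The failure rate below is made up, and real disk failures are correlated, so this flatters replication; it only sketches why adding replicas buys reliability:

```python
# Toy reliability model: a chunk is lost only if every replica's disk fails.
# Assumes independent failures -- a sketch, not a real failure model.
def p_chunk_loss(p_disk_failure: float, replicas: int) -> float:
    return p_disk_failure ** replicas

for r in (1, 2, 3):                  # GFS replicates each chunk 3x by default
    print(r, p_chunk_loss(0.05, r))  # 5% per-window failure rate (made up)
```

Each extra replica multiplies the loss probability by the per-disk failure rate, which is why a cheap-disk cluster can be tuned to any desired reliability level.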
Re:Redundancy (Score:3, Interesting)
Hmm, yes. The really bright programmers are living in their parents' basement and working for IBM for free. The dumb ones are getting paid a pile of money to code up forms and reports in fancy code-generation tools, then clocking off at 5 and enjoying themselves.
The system can only respond quickly to a finite set of transactions that was known at design time.
Those dumb business programmers left that paradigm behind in the 80s. The tech to do it (the relational database) was developed in the 70s.
Since I've seen it up close a few times, I can say that the standard "enterprise way" (Oracle/Sun/EMC) delivers very poor bang for the buck.
You've "seen it up close"; I've set it up, kid. Once you've been around the block a few times, you'll drop your tech-snobbery and just choose the right tool for the job.