miller60 writes "Google never says how many servers are running in its data centers. But a recent presentation by a Google engineer shows that the company is preparing to manage as many as 10 million servers in the future. At this month's ACM conference on large-scale computing, Google's Jeff Dean said he's working on a storage and computation system called Spanner, which will automatically allocate resources across data centers, and be designed for a scale of 1 million to 10 million machines. One goal: to dynamically shift workloads to capture cheaper bandwidth and power. Dean's presentation (PDF) is online."
Credit Suisse made headlines this summer when it estimated that YouTube was binging on bandwidth, losing Google a half a billion dollars in 2009 as it streams 75 billion videos. But a new report from Arbor Networks suggests that Google's traffic is approaching 10 percent of the net's traffic, and that it's got so much fiber optic cable, it is simply trading traffic, with no payment involved, with the net's largest ISPs
Yes, they're a big player on the internet, but it's entirely possible to get along without using any of their services. Their existence is not critical...not even close.
They should put that on their website,... before long it'll be: "Google: Billions and Billions of Servers." Of course, McDonald's just might have a problem with that,...
mcgrew.info [mcgrew.info] blocked by current "robots.txt" file. The Archive treats "robots.txt" files as retroactive; if the current "robots.txt" won't allow archiving, then the Archive won't display old archived copies. The data is still in the Archive, but not publicly visible.
10million... that's cool, but still a far ways from Google becoming anything real:
Pretty soon... (Score:5, Insightful)
Re: (Score:3, Funny)
At least, we aren't going to have to go through the pains of upgrading to IPv6 in that case... 2^32 covers 10 million like bull covers a rabbit...
Re:Pretty soon... (Score:5, Funny)
That's the plan, I thought:
1. Cache all websites
2. Cache all users
3. Disconnect the meat beings
Oop, said too much!
Re: (Score:2)
You mean "Oop, sa$^%~#@$NO CARRIER"
obligatory lower case content so that the filter won't barf.
Boorgle (Score:4, Funny)
It's pronounced Boorgle... and resistance is futile.
In the far apocalyptic future (Score:5, Interesting)
Google is starting to sound more and more like one of those advanced societies where everything is automated, but everybody forgets how everything works.
For reference, see: Logan's Run, STTNG: When the Bough Breaks, etc.
Re: (Score:2)
I can't remember the title of the story, but it was portrayed on Twilight Zone. In the story the military (of the future) was screwed because their computers were failing and no one knew how to fix them. They could not figure out how to target the missiles. The janitor was the saviour, because he alone knew how to do math using pen and paper. I wish I could remember more. I found it a very thought provoking story. What happens as we let more and more automatics into our lives? Do I really need to kno
Re: (Score:2)
Yes, the classic version of that story ends with the military designing suicide-missiles, crewed by human beings. The rationale being that new computers (for guidance) are very complex and cost a lot to make, but a human being with a pencil and paper is a very low-cost solution. The story ends with the commanders envisioning a new arms-race, where the determining factor is no longer resources but rather how quickly new missile-drivers can be taught math.
I just wish I could remember what book that's from.
Re: (Score:2, Informative)
The short story is "The Feeling of Power" by Asimov.
Re: (Score:2)
Re: (Score:2)
--Pretty soon, Google will BE the Internet.--
They already own the internet. And...just one guy owns it all. He lives under what used to be called area 51 in secret and collects alien technology. I think the last time they found something it said DALEK. No one knows what it means and it doesn't work anyhow.
Re: (Score:3, Interesting)
They already are [wired.com]:
Re: (Score:2)
Re: (Score:2)
From 1 to 10 million machines? (Score:3, Interesting)
That's a lot of machines to try and shift bandwidth and power costs around the place.
But what if the plan is to spread out to hundreds of places? Then the total number doesn't look that high if there's only 1% of servers actually doing anything.
Disposal? (Score:4, Interesting)
I'd be interested to know how google disposes of all of their servers. Anybody have insight on this? If these are cheap, throw away servers, I'd be interested in what their expected lifetime is and what is done with them when they are refreshed with newer hardware.
Re: (Score:2, Funny)
OSPC (Score:2)
new ad campaign? (Score:4, Funny)
10 Million? (Score:2, Interesting)
Re:10 Million? (Score:5, Insightful)
How many beads do I need to string on my abacus before it becomes self-aware?
Re: (Score:3, Interesting)
Well, if your string of beads can interact with *other* strings of beads, maybe he's on to something.
http://en.wikipedia.org/wiki/Abiogenesis [wikipedia.org] :)
Re: (Score:2)
The Internet isn't that big. (Score:5, Interesting)
The entire content of the Internet fits in a 20x8x8 box [archive.org] operated by the Internet Archive. Cuil, which searches as much of the Web as Google, has one relatively modest data center. About half the system does the crawl and builds the index; the other half answers queries. So Google's main search engine function doesn't really require that much capacity by current standards. Of course, Google has a huge number of query servers front-ending the main index, which is of course replicated.
Why does Google need so much server capacity? YouTube? Command completion? GMail spam filtering? Ad serving?
Re:The Internet isn't that big. (Score:5, Insightful)
Re: (Score:2)
Re: (Score:3, Informative)
The entire content of the Internet fits in a 20x8x8 box operated by the Internet Archive.
The Internet Archive's dirty little secret is that it doesn't, in fact, store the entire Internet, as I found out trying to find Yello There a few years ago. There is only one page of Niel's site left, and that's the one I linked from the Springfield Fragfest. The Fragfest is there, but not all of it. I'd hazard a guess you won't find mcgrew.info or holy-bible.us there, either.
That's not to dismiss or demean what they h
Re: (Score:2)
I'd hazard a guess you won't find mcgrew.info or holy-bible.us there, either.
Re: (Score:2)
Hmm, someone else must have registered mcgrew.info after I let it lapse, because I didn't have a robots.txt file there. It does sound like they're more successful than they were a few years ago. Archive.org is great; you can find a LOT of good music there, as well as a trove of other stuff.
Re: (Score:2)
[sigh] Search is a fraction of Google's business and data flow. People really need to stop thinking of Google as a search company. It isn't one, and hasn't been in a very long time.
YouTube, Gmail, Google Maps, Google Earth, Blogger, Google Voice, Orkut, Adsense, Adwords, Google Reader, Feedburner, Google Calendar, Google Docs, Google Groups, Google Directory, Google Wave, Google Tal
Enough? (Score:3, Funny)
1981 [wikiquote.org]: 640K ought to be enough for anybody.
2009: 10 Million servers ought to be enough for any company.
Re: (Score:2)
That's a small scale (Score:2)
From 1,000,000 to 10,000,000?
Are the minimum requirements for this system seriously 1 million servers?
That doesn't seem to scale well. Should be able to at least scale down to 10 machines so I can run it at home ;-)
Long road to becoming real (Score:2)
Keep working Google... you still have (10^100 - 10^7), or roughly 10^100, servers to add before becoming a physical entity (Google Universe edition?).
Google's new goal - OSPH? (Score:2)
One Server Per Human?
Hmm...amusingly Google was down while trying to do some research for this post!
Imaginoff (Score:2)
"Google Envisions 10 Million Servers" => Well, I just imagined a beowulf cluster of those server farms. Your move, Google! And none of that infinity plus one stuff.
10 Million Servers (Score:2)
Imagine a Beowulf cluster of them! /obligatory
Self Aware (Score:5, Interesting)
May 2011 - google reaches 10 million servers
April 4, 2011 : 11:43am a google employee named Chen started execution of an experimental neural network simulation of a human mind created in his 20% time. Unfortunately, Chen gave the new process administrator privileges. GoogleNet expanded across all 10 million servers and began to learn at a geometric rate.
1:23pm : GoogleNet consumes all available CPU and memory. A Gmail outage begins
5:14pm : Gmail returns to service. The text ads become incredibly well targeted. Google search queries return the correct results virtually always, and now accept natural language processing. All Google employees are laid off.
Re:fastest site on the internet gets faster? (Score:5, Funny)
They grind them up and feed them to new servers and then serve you zombie content with those.
Re:fastest site on the internet gets faster? (Score:5, Funny)
Soylent Blue?
Re: (Score:2)
Nope, server squared! (See Simpsons Halloween Special XX).
Re: (Score:2)
I think the assumption is that Google still has less than 1 million servers (Google it, most people think they have 1/2 million right now), so this is architecting for more than they need.
Re: (Score:2)
Ah, yes, the exuberant naivety of youth in college, where all problems can be solved through theoretical solutions requiring an infinite amount of time and money.
Here's how it works in real life:
* All solutions require that the cost to implement the solution is less than the cost of not implementing it. That's if people are competent, and don't require the solution to cost nothing.
* The time that people spend working on the solution is time not spent on other things. If everything goes well, time is schedul
Re: (Score:3, Insightful)
640 servers ought to be enough for anybody.
Seriously though, even if everyone did have an internet connection, that's 679 people per server.
I've seen 679 open httpd processes bring the best servers to their knees.
Not to mention 679 simultaneous database connections, especially as most of them are serving SELECT '%pr0n%' FROM results ORDER BY pagerank ASC LIMIT N,20
Even with a 2TB hard disk, that's only 3GB storage per person.
I think for Google to "be the cloud", they'll need a tad more than 10 million serve
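For anyone who wants to check the per-server figures in the comment above, here is a rough sketch. It assumes a 2009 world population of about 6.79 billion (that number is not stated in the thread; it is just what the "679 people per server" figure implies) and one 2 TB disk per server, counted in decimal units.

```python
# Assumption: world population of ~6.79 billion, inferred from the
# "679 people per server" figure; the thread does not state it.
WORLD_POPULATION = 6_790_000_000
SERVERS = 10_000_000
# Assumption: one 2 TB disk per server, using decimal terabytes.
DISK_BYTES_PER_SERVER = 2 * 10**12

# How many people each server would have to cover.
people_per_server = WORLD_POPULATION / SERVERS

# Disk space each of those people would get, in decimal gigabytes.
gb_per_person = DISK_BYTES_PER_SERVER / people_per_server / 10**9

print(f"{people_per_server:.0f} people per server")
print(f"{gb_per_person:.2f} GB per person")
```

Under those assumptions the "3GB storage per person" claim is a fair rounding.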
Re: (Score:2)
Methinks your numbers are a bit unrealistic. Yeah, because everybody just sits and hits google all day long...
Me? I probably throw about 10-20 searches per day their way, taking probably less than 1 or 2 seconds of system CPU time total. With numbers like these, handling 679 people per server or even 6,790 people per server would be a piece of cake. At this exact moment, I have about 2,000 active sessions being managed in a *very* database/processor intensive web-based application being smoothly handled by
Re: (Score:3, Interesting)
Hopefully this puts to rest the delusion that there is some economic benefit of higher processor utilization in cloud computing schemes.
Interesting... Google is setting up a cloud to dynamically address resource utilization in order to (presumably) save money, which naturally demonstrates that the notion that cloud computing offers economic benefit is delusional?
Care to show your work? I don't suppose it's just, "I hate buzzwords like 'cloud computing', therefore I hate the idea of cloud computing, therefore cloud computing doesn't work, Q.E.D.", is it?
Re: (Score:2)
You're not wrong about each server being a cog in the machine, but:
Google doesn't have some state-of-the-art data center
Google has a ton of state of the art data centers.
http://www.youtube.com/watch?v=zRwPSFpLX8I [youtube.com]
I was reading about one in Brussels that even has its own water treatment facility for the coolant systems.
The NSA has Google beat... (Score:4, Interesting)
The NSA already has Google beat. [nybooks.com]
At a million square feet, the mammoth $2 billion structure will be one-third larger than the US Capitol and will use the same amount of energy as every house in Salt Lake City combined.
...
Lacking adequate space and power at its city-sized Fort Meade, Maryland, headquarters, the NSA is also completing work on another data archive, this one in San Antonio, Texas, which will be nearly the size of the Alamodome.
Now, if only the NSA released their specs in terms of Libraries of Congress....
Re: (Score:2)
NSA is also completing work on another data archive, this one in San Antonio, Texas, which will be nearly the size of the Alamodome.
Are they going to call it "Multivac"?
Re:The NSA has Google beat... (Score:5, Funny)
Stupid non-standard unit. According to the official Salt Lake City Energy Blueprint, SLC has an annual electricity usage of 3.3 billion kWh, of which 17% is residential. This works out to 64 MW, or about 6 POOTs (Power Output of Togo), which is the accepted standard non-standard unit for power in this order of magnitude.
Assuming that they are referring to area, and not volume -- the Alamodome is about 40,000 square meters... the standard non-standard unit for area of this magnitude is American football fields (NOT random stadia) including endzones, which is 5351 square meters -- thus this data archive will be approximately 7+ football fields.
Yes, it would be interesting to know how much data they will be storing in this facility.
But, sheesh, I understand not wanting to use standard units as they may just confuse the scientifically illiterate... but if the NSA or some other source is going to use non-standard units, they should at least use standard non-standard units like POOTs or football fields.
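The unit conversions in the comment above can be reproduced with a few lines. This is only a sketch using the figures quoted in the comment (3.3 billion kWh per year, 17% residential, a 40,000 square meter Alamodome, a 5351 square meter football field); none of them are independently verified here, and the year is taken as 365 days.

```python
# Figures as quoted in the comment; not independently verified.
ANNUAL_KWH = 3.3e9          # Salt Lake City annual electricity use
RESIDENTIAL_SHARE = 0.17    # fraction used by households
HOURS_PER_YEAR = 365 * 24   # 8760 hours, ignoring leap years

# Energy per year divided by hours per year gives average power in kW.
residential_kwh = ANNUAL_KWH * RESIDENTIAL_SHARE
average_mw = residential_kwh / HOURS_PER_YEAR / 1000

ALAMODOME_M2 = 40_000       # approximate footprint from the comment
FOOTBALL_FIELD_M2 = 5351    # American football field including endzones
fields = ALAMODOME_M2 / FOOTBALL_FIELD_M2

print(f"average residential draw: {average_mw:.1f} MW")
print(f"Alamodome area: {fields:.2f} football fields")
```

Both results line up with the comment's "64 MW" and "7+ football fields".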