Social Networks IT

Inside Facebook's Infrastructure

miller60 writes "Facebook served up 690 billion page views to its 540 million users in August, according to data from Google's DoubleClick. How does it manage that massive amount of traffic? Data Center Knowledge has put together a guide to the infrastructure powering Facebook, with details on the size and location of its data centers, its use of open source software, and its dispute with Greenpeace over energy sourcing for its newest server farm. There are also links to technical presentations by Facebook staff, including a 2009 technical presentation on memcached by CEO Mark Zuckerberg."
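The memcached presentation linked above covers Facebook's well-known look-aside caching layer in front of its databases. As a rough illustration of that cache-aside pattern (not Facebook's actual code), here is a minimal Python sketch assuming the pymemcache client and a hypothetical fetch_user_from_db() helper:

    # Minimal cache-aside sketch: try memcached first, fall back to the
    # database on a miss, then populate the cache. Purely illustrative;
    # fetch_user_from_db() is a hypothetical stand-in for a real query.
    import json
    from pymemcache.client.base import Client

    cache = Client(("localhost", 11211))  # assumes a local memcached instance

    def fetch_user_from_db(user_id):
        return {"id": user_id, "name": "example"}  # placeholder lookup

    def get_user(user_id, ttl=300):
        key = f"user:{user_id}"
        cached = cache.get(key)              # 1. check the cache
        if cached is not None:
            return json.loads(cached)        # hit: no database round trip
        user = fetch_user_from_db(user_id)   # 2. miss: query the database
        cache.set(key, json.dumps(user), expire=ttl)  # 3. repopulate
        return user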

Comments Filter:
  • Environmentalist (Score:3, Interesting)

    by AnonymousClown ( 1788472 ) on Thursday September 30, 2010 @08:51AM (#33745812)
    I support environmental causes (the Sierra Club and others), but I for one will not support Greenpeace, and I don't think they are credible. They use violence to get their message out, and their founder is now a corporate consultant who shows companies how to get around environmental laws and pollute.

    That's all.

  • by francium de neobie ( 590783 ) on Thursday September 30, 2010 @09:13AM (#33745972)
    It links to Facebook's "wrong browser" page. The real link may be here: http://www.facebook.com/video/video.php?v=631826881803 [facebook.com]
  • Re:Facebook ID (Score:4, Interesting)

    by rtaylor ( 70602 ) on Thursday September 30, 2010 @09:29AM (#33746104) Homepage

    Facebook knows anything about you that third parties (friends, family, etc.) might tell them, too.

    I didn't create an account or provide any information to Facebook, yet there are bits and pieces of information about me on it.

  • by mlts ( 1038732 ) * on Thursday September 30, 2010 @10:33AM (#33746868)

    Call me dense, but with all the racks of 1U x86 equipment FB uses, wouldn't they be far better served by machines built from the ground up to handle the TPM and I/O needs?

    Instead of trying to get so many x86 machines working together, why not go with upper-end Oracle or IBM hardware like a pSeries 795, or even zSeries hardware? FB's needs are exactly what mainframes are built to handle (random database access, high I/O levels), and they do the job 24/7/365 with five-nines uptime.

    To boot, the latest EMC, Oracle and IBM product lines are good at saving energy. The EMC SANs will automatically move data and spin down drives not in use to save power. The CPUs on top-of-the-line equipment not only power down the parts that are not in use; wise use of LPARs or LDoms would also help with energy costs simply by having fewer machines.

  • Re:Facebook ID (Score:3, Interesting)

    by tophermeyer ( 1573841 ) on Thursday September 30, 2010 @10:46AM (#33747048)

    One of the reasons keeping me from deleting my Facebook account is that having it active allows me to untag myself from all the pictures that I wish my friends would stop making public. If I didn't have an account they could link to, my name would just sit on the pictures for anyone to see.

  • by Comboman ( 895500 ) on Thursday September 30, 2010 @11:03AM (#33747358)
    "690 billion page views to its 540 million users in August"? Good lord, that's 1278 page views PER USER in just one month! That's (on average) 41 page views per user, per day, every single day! The mind boggles.
  • by peter303 ( 12292 ) on Thursday September 30, 2010 @11:44AM (#33747914)
    It's interesting how FB is open about its data center infrastructure while places like Google and Microsoft are very secretive. It is a competitive advantage for Google to shave every tenth of a second it can off a search through clever software and hardware. They are an "on-ramp" to the Information Superhighway, not a destination like FB. And because Google runs one of the largest data-serving operations on the planet, even small efficiency increases translate into multi-million-dollar savings.
  • by Cylix ( 55374 ) * on Thursday September 30, 2010 @11:51AM (#33748008) Homepage Journal

    The latest x86 architecture lines are moving far more in the direction of mainframe-type units in terms of density and bandwidth. This is a hardware type from several years back and would not really compare to the denser offerings being explored today. However, the reasoning behind commodity hardware is not just the ability to switch from one platform to another; it also keeps costs down through vendor competition. One design can be produced by multiple vendors competing for the lowest bid. There are several other advantages as well to a commodity or generic design.

    With commodity hardware that is not designed for five nines, there is an expectation that the application can fail away. The need for the application to fail away gracefully is actually more fundamental than at the server level. When considering application resiliency you want to target the datacenter level, so that you are not locked to a specific region. To build something as large as Facebook, they are no longer load balancing at the router but at the datacenter level itself. With this concept the datacenter becomes a bucket entity with the ability to service X traffic, and if it should fail you simply move services away. With a sufficiently advanced version of this very generic, very hardware-abstracted model it becomes possible to distribute load to third-party farms via cloud infrastructure.

    Still, the world is not black and white, and even within these models there will be small clusters of special-purpose hardware for things like data warehousing and reporting. I find the larger systems far more typically in industries where there can be no possible downtime or where data loss cannot occur.
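As a toy illustration of the "datacenter as a bucket of capacity" idea in the comment above, the sketch below weights traffic toward healthy sites and simply drains a failed one. All names and capacities are hypothetical; this shows the general concept, not Facebook's actual traffic-routing system.

    # Toy datacenter-level load shifting: each site is a capacity "bucket";
    # traffic for a failed site is redistributed to the healthy ones.
    # Names and capacities below are made up for illustration.
    import random

    datacenters = {
        "dc-west": {"capacity": 100, "healthy": True},
        "dc-east": {"capacity": 80, "healthy": True},
        "dc-eu":   {"capacity": 60, "healthy": True},
    }

    def pick_datacenter():
        healthy = {name: dc for name, dc in datacenters.items() if dc["healthy"]}
        if not healthy:
            raise RuntimeError("no healthy datacenters left")
        names = list(healthy)
        weights = [healthy[n]["capacity"] for n in names]
        # Weighted choice: a bigger site absorbs proportionally more traffic.
        return random.choices(names, weights=weights, k=1)[0]

    # "Fail away" from a site: mark it unhealthy and new requests avoid it.
    datacenters["dc-east"]["healthy"] = False
    print(pick_datacenter())  # now only ever returns dc-west or dc-eu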

  • by peter303 ( 12292 ) on Thursday September 30, 2010 @11:56AM (#33748090)
    When these data centers start showing up as measurable consumers of the national power grid and components of the GDP, you might consider them, metaphorically, as the power plants of the information industry.

    In his book on the modern energy industry, "The Bottomless Well," author Peter Huber places commodity computing near the top of his "energy pyramid." Huber's thesis is that modern technology has transformed energy into ever more sophisticated and useful forms; he calls this "energy refining." At the base of his pyramid are relatively raw forms of energy like biomass and coal. Then come electricity, computing, optics, etc. I think it's interesting to view computing as a refined form of energy.
  • by mlts ( 1038732 ) * on Thursday September 30, 2010 @02:58PM (#33750944)

    Actually, neither. It's just that, to an observer like me, FB is trying to reinvent the wheel on a problem that has already been solved.

    Obviously, IBM is not cheap. Nor is Oracle/Sun hardware. However, the time and money spent developing a large-scale framework at the application layer is not a trivial expense either. The time FB puts into deploying something uncharted like this may cost them more in the long run.

  • by Overzeetop ( 214511 ) on Thursday September 30, 2010 @03:31PM (#33751410) Journal

    Have you seen how often Facebook crashes or has problems? You have to constantly reload the thing to get anything done. Thank goodness Google Calendar doesn't have that problem, or I'd probably have a thousand hits a day to my calendar page alone.

    Also, FB pages tend to be pretty content-sparse. It's not uncommon for me to hit a dozen pages in 2-3 minutes if I check Facebook.

  • by RajivSLK ( 398494 ) on Thursday September 30, 2010 @04:04PM (#33751920)

    Well, we do the same thing as Facebook, but on a much smaller scale... Our "commodity hardware" (mostly Supermicro motherboards with generic cases, memory, etc.) has pretty much the same uptime and performance as vendor servers. For example, we have a quad-CPU database server that has been up for three years. If I remember correctly, it cost about half as much as a server with equivalent specs from a vendor.

    The system basically works like this: buy 5 or so (or 500 if you are Facebook) servers at once with identical specs and hardware. If a server fails (not very often), there are basically 4 common reasons:

    1) Power supply or fan failure -- very easy to identify.
        Solution: Leave the server down until maintenance day (or whenever you have a chance), then swap in a new power supply (total time: 15 min, less time than calling the vendor's tech support).

    2) Hard drive failure -- usually easy to identify.
        Solution: Leave the server down until maintenance day (or whenever you have a chance), then swap in a new hard drive (total time: 15 min, less time than calling the vendor's tech support). When the server reboots it will automatically be set up by various autoconfig methods (BOOTP or whatever). I suspect that Facebook doesn't even have HDs in most servers.

    3) RAM failure -- can be hard to identify.
        Solution: Leave the server down until maintenance day (or whenever you have a chance), then swap in new RAM (total time: 15 min, less time than calling the vendor's tech support).

    4) Motherboard failure (almost never happens) -- can be hard to identify.
        Solution: Replace the entire server -- keep the old server for spare parts (RAM, power supply, whatever).

    I don't really see what a vendor adds besides inefficiency. If you have to call a telephone agent, who then has to call a tech guy from the vendor, who then has to drive across town at a moment's notice to spend 10 minutes swapping out your RAM, it's going to cost you. At a place like Facebook, why not just hire your own guy?
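A rough sketch of the defer-to-maintenance-day triage described in the comment above; the categories and actions mirror that list, and the code itself is purely illustrative:

    # Illustrative triage table for the commodity-hardware approach above:
    # almost every failure is "power the box down and swap the part on
    # maintenance day"; only a dead motherboard retires the whole server.
    from dataclasses import dataclass

    ACTIONS = {
        "psu_or_fan":  "leave down until maintenance day; swap power supply/fan (~15 min)",
        "hard_drive":  "leave down until maintenance day; swap drive, reprovision via netboot",
        "ram":         "leave down until maintenance day; swap DIMMs (~15 min)",
        "motherboard": "replace entire server; keep the old one for spare parts",
    }

    @dataclass
    class FailedServer:
        hostname: str
        failure: str  # one of the keys in ACTIONS

    def triage(server: FailedServer) -> str:
        action = ACTIONS.get(server.failure, "diagnose further before maintenance day")
        return f"{server.hostname}: {action}"

    print(triage(FailedServer("db-07", "hard_drive")))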

  • by TheSunborn ( 68004 ) <mtilsted.gmail@com> on Thursday September 30, 2010 @09:17PM (#33754932)

    The problem is that for any specific budget* the x86-64 solution will give you more aggregate I/O and more processor hardware than the mainframe. The argument for the mainframe, then, is that the software might be easier to write, but no mainframe exists that can serve even 1/10 of Facebook, so you would need to cluster them anyway. And if you need that special clustering magic, you might as well have x86-64.

    And IBM will not promise you 99.999% uptime if you buy a single mainframe. If you need that kind of uptime, you need to buy multiple mainframes and cluster them.

    *Counting either rack space used or money paid for hardware.
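A back-of-the-envelope version of the budget argument above. Every number below is a made-up placeholder (hypothetical prices and I/O figures), shown only to illustrate the shape of the comparison, not as benchmark data.

    # Hypothetical comparison of aggregate I/O per dollar: many commodity
    # x86-64 boxes vs. one mainframe-class machine. All numbers are
    # placeholders, not measurements.
    def io_per_dollar(units, io_per_unit, price_per_unit):
        return (units * io_per_unit) / (units * price_per_unit)

    commodity = io_per_dollar(units=500, io_per_unit=50_000, price_per_unit=5_000)
    mainframe = io_per_dollar(units=1, io_per_unit=5_000_000, price_per_unit=2_000_000)

    print(f"commodity x86-64: {commodity:.1f} I/O ops per dollar")  # 10.0 (placeholder)
    print(f"mainframe-class:  {mainframe:.1f} I/O ops per dollar")  #  2.5 (placeholder)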
