Huge Traffic On Wikipedia's Non-Profit Budget
Posted by
timothy
on Tuesday June 24, @01:11PM
from the optimizing-smartitude dept.
miller60 writes "'As a non-profit running one of the world's busiest web destinations, Wikipedia provides an unusual case study of a high-performance site. In an era when Google and Microsoft can spend $500 million on one of their global data center projects, Wikipedia's infrastructure runs on fewer than 300 servers housed in a single data center in Tampa, Fla.' Domas Mituzas of MySQL/Sun gave a presentation Monday at the Velocity conference that provided an inside look at the technology behind Wikipedia, which he calls an 'operations underdog.'"
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.

Impressive (Score:5, Insightful)
Given that their topic sites are generally in the top three for any search engine query, the volume of traffic they're dealing with (and the budget that they have!) is very impressive. I always thought that they had much beefier infrastructure than the article says.
Re:Impressive (Score:5, Funny)
Wikipedia = much more traffic than slashdot (Score:5, Interesting)
Slashdot does... what? 40 Mbit/s of traffic at peak? Wikipedia is roughly 100 times larger [nedworks.org]. (And WP has three datacenters, not one.)
Slashdot traffic hasn't created noticeable blips on Wikipedia's radar for years.
OTOH, if Wikipedia linked Slashdot on every page, Slashdot would go down, if due to nothing else but bandwidth exhaustion.
Re:Impressive (Score:5, Informative)
The power of low standards (Score:5, Insightful)
Our organization's databases (also a non-profit) get several thousand writes per second. Losing 'a few seconds' would mean potentially hundreds of users' record changes were lost. If that happened here, it would be a huge deal. If it happened regularly, it would destroy the business.
Re:The power of low standards (Score:5, Insightful)
Okay. So pay attention to the sentence before the one you quoted which read, "I'm not suggesting you should follow how we do it."
Re:The power of low standards (Score:5, Insightful)
Don't be too harsh -- the standards are dependent on the application. Your application, by the nature of the information and its purposes, requires a different standard of reliability than Wikipedia does. You're certainly entitled to be proud of yourself for maintaining that standard.
But don't let that turn into being derogatory about the Wikipedia operation. Wikipedia has identified the correct standard for their application, and by doing so they have successfully avoided the costs and hassle of over-engineering. To each his own...
Re:The power of low standards (Score:5, Informative)
A bank requires "six nines" of performance (i.e., right 99.9999% of the time) and probably wants even better than that.
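To put a number on "six nines", here is a rough back-of-the-envelope sketch. It assumes the figure is read as an availability or error budget over a 365-day year; the function names are made up for illustration and are not from the presentation or any library.

```python
# Rough arithmetic for an "N nines" reliability target.
# Assumption: the target is treated as a simple failure budget.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ignores leap years


def failure_fraction(nines: int) -> float:
    """Fraction of time (or operations) allowed to fail, e.g. 6 -> 0.000001."""
    return 10.0 ** -nines


def downtime_seconds_per_year(nines: int) -> float:
    """Seconds of downtime per year permitted by an N-nines target."""
    return SECONDS_PER_YEAR * failure_fraction(nines)


def allowed_failures(nines: int, operations: int) -> float:
    """How many of `operations` may go wrong under an N-nines target."""
    return operations * failure_fraction(nines)


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, "nines:", round(downtime_seconds_per_year(n), 1), "s/year")
```

Under those assumptions six nines leaves roughly half a minute of downtime per year, or about one bad operation per million, which is a very different target from one where losing a few seconds of edits is tolerable.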
Easy to Increase the budget or add servers (Score:5, Funny)
How hard can it be to increase the budget or add more servers?
Just go to the Wikipedia page with those numbers and change them. You don't even need to have an account.
Note to self (Score:5, Funny)
If you ever find yourself in a flamewar on Wikipedia you cannot win, bomb Tampa, Florida out of existence.
Re:Note to self (Score:5, Funny)
Re:Note to self (Score:5, Interesting)
Or do a hurricane dance, and let nature do its thing...
Having all their servers in Tampa, FL (of all places, given the hurricanes, frequent lightning, flooding, etc. there) doesn't seem too smart. I would have thought, given Wikipedia's popularity, their servers would be geographically spread out in multiple locations.
Though doing that adds a level of complexity and cost that even many for-profit ventures, such as Slashdot, likely can't afford or justify; Slashdot's servers are in one place: Chicago. To digress a bit, I notice this site's accessibility has been spotty lately (i.e., more page-not-found errors and timeouts) since the server move.
Ron
Re:Note to self (Score:5, Informative)
They're not all in Tampa; they have a bunch in the Netherlands and a few more in South Korea.
More importantly (Score:5, Interesting)
Simplicity (Score:5, Interesting)
Although much of the MediaWiki software is a hideous twitching blob of PHP Hell, the base functionality is fairly simple and can run perpetually and scale massively as long as you don't mess with it.
What spoils a lot of projects like this is the constant need for customization. MediaWiki essentially can't be customized (except through plugins, obviously, which you install at your own peril), and that is a big reason why it scales so massively.
As for Wikipedia itself, I suspect it is massively weighted in favor of reads. That simplifies circumstances a lot.
Off-topic, I know, but...what about /.'s hardware? (Score:5, Interesting)
I.e. the promised follow-up to this story [slashdot.org] about moving to the new Chicago datacenter? You know, the one where Mr. Taco promised a follow-up story "in a few days" about the "ridiculously overpowered new hardware".
I was quite looking forward to that, but it never eventuated, unless I missed it. It's certainly not filed under Topics->Slashdot.
Works great because it's not "Web 2.0" (Score:5, Insightful)
Most of Wikipedia is a collection of static pages. Most users of Wikipedia are just reading the latest version of an article, to which they were taken by a non-Wikipedia search engine. So all Wikipedia has to do for them is serve a static page. No database work or page generation is required.
Older revisions of pages come from the database, as do the versions one sees during editing and previewing, the history information, and such. Those operations involve the MySQL databases. There are only about 10-20 updates per second taking place in the editing end of the system. When a page is updated, static copies are propagated out to the static page servers after a few tens of seconds.
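The read path described here can be sketched as a rendered-page cache sitting in front of the database. This is a toy illustration, not Wikipedia's actual setup: `ArticleStore`, `PageCache`, and `render` are hypothetical names, and the sketch purges the cached copy immediately on edit, whereas the comment says real propagation takes a few tens of seconds.

```python
# Toy model: serve rendered pages from a cache, touch the database only on a miss.
# All names are hypothetical; purge-on-edit is a simplification.


def render(wikitext: str) -> str:
    """Stand-in for page generation (the expensive step)."""
    return "<html><body>" + wikitext + "</body></html>"


class ArticleStore:
    """Stand-in for the database holding the latest text of each article."""

    def __init__(self):
        self.rows = {}
        self.reads = 0  # counts how often the "database" is actually hit

    def load(self, title: str) -> str:
        self.reads += 1
        return self.rows[title]

    def save(self, title: str, text: str) -> None:
        self.rows[title] = text


class PageCache:
    """Keeps rendered copies so most readers never reach the database."""

    def __init__(self, store: ArticleStore):
        self.store = store
        self.rendered = {}

    def get(self, title: str) -> str:
        if title not in self.rendered:  # cache miss: render once
            self.rendered[title] = render(self.store.load(title))
        return self.rendered[title]

    def edit(self, title: str, text: str) -> None:
        self.store.save(title, text)
        self.rendered.pop(title, None)  # drop the stale copy


if __name__ == "__main__":
    store = ArticleStore()
    cache = PageCache(store)
    cache.edit("Tampa", "A city in Florida.")
    for _ in range(1000):
        cache.get("Tampa")
    print("database reads for 1000 page views:", store.reads)
```

The point of the sketch is only the ratio: with a read-heavy workload, the database sees one read per edit rather than one per page view.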
Article editing is a check-out/check-in system. When you start editing a page, you get a version token, and when you update the page, the token has to match the latest revision or you get an edit conflict. It's all standard form requests; there's no need for frantic XMLHttpRequest processing while you're working on a page.
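A minimal sketch of that token check, assuming the token is simply the revision number seen at check-out. The `Article` class and `EditConflict` exception are hypothetical names for illustration and are not MediaWiki's API.

```python
# Optimistic check-out/check-in: an edit is accepted only if the token
# still matches the latest revision. Names are hypothetical.


class EditConflict(Exception):
    """Raised when someone else saved a newer revision in the meantime."""


class Article:
    def __init__(self, text: str = ""):
        self.text = text
        self.revision = 1

    def check_out(self):
        """Start editing: return the current revision as the token, plus the text."""
        return self.revision, self.text

    def check_in(self, token: int, new_text: str) -> int:
        """Save an edit; reject it if the token is stale."""
        if token != self.revision:
            raise EditConflict(
                f"edit based on revision {token}, latest is {self.revision}"
            )
        self.text = new_text
        self.revision += 1
        return self.revision


if __name__ == "__main__":
    article = Article("first draft")
    token_a, _ = article.check_out()
    token_b, _ = article.check_out()  # second editor starts from the same revision
    article.check_in(token_a, "edit by A")
    try:
        article.check_in(token_b, "edit by B")
    except EditConflict as err:
        print("conflict:", err)
```

No lock is held while the page is open in the editor; the only server-side work happens when the form is submitted.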
Because there are no ads, there's no overhead associated with inserting variable ad info into the pages. No need for ad rotators, ad trackers, "beacons" or similar overhead.
Nonsense. Wikipedia is THE web 2.0 (Score:5, Insightful)
Web 2.0 is not just about flashy Ajax or whatnot; it's about user-generated dynamic content. WP's "everything is a wiki" architecture might /look/ a bit archaic compared to fancy schmancy dynamic rotating animated gradient-filled forums, but it's much more powerful.
Moreover, WP is not a collection of static pages. If you're logged in, at least, every page is dynamically generated, and every page's history is updated within a few seconds.
Confused by the title (Score:5, Insightful)
What does "Non-Profit Budget" mean, anyway? There are non-profits bigger than the company I work for. Non-profit isn't the same as poorly financed.
Link to wikipedia? (Score:5, Funny)
The summary was wrong to include a link to the Wikipedia homepage without a Wikipedia link about Wikipedia [wikipedia.org] in case you don't know what Wikipedia is. I myself had to Google Wikipedia to find out what Wikipedia was so I am providing the Wikipedia link about Wikipedia in case others were likewise in the dark regarding Wikipedia.
-l
P.s., Wikipedia.
Re:Some thoughts (Score:5, Insightful)
This is so true; I've always said, "you get what you pay for."
Do you want to pay for software, or do you want to pay for people?
Only one can create the other.
Re:Some thoughts (Score:5, Funny)
Re:What is the role of Open Source (Score:5, Interesting)
Re:I've always wondered... (Score:5, Funny)
"It would be neat to have a deeper look at their budget to see how I can save money and boost performance at work."
Since they are using LAMP, obviously they could save money by following Microsoft's "Get The Facts" advice!
Re:What amazes me... (Score:5, Interesting)
Slashdot is great at taking down sites on crappy shared hosting, but anything with a decently configured dedicated server will likely survive just fine.
Wikipedia's probably getting hit with hundreds of times the traffic Slashdot is at all times.