Twitter Not Rocket Science, but Still a Work in Progress
While it may not be rocket science, the Twitter team has been making a concerted effort to communicate better with their community at large. Recently they were set upon by a barrage of technical and related questions, and the resulting answers are actually somewhat interesting. "Before we share our answers, it's important to note one very big piece of information: We are currently taking a new approach to the way Twitter functions technically with the help of a recently enhanced staff of amazing systems engineers formerly of Google, IBM, and other high-profile technology companies added to our core team. Our answers below refer to how Twitter has worked historically--we know it is not correct and we're changing that."
No mention of Ruby/Rails? (Score:2, Interesting)
i wonder what they mean by.. (Score:5, Interesting)
I wonder what the disadvantages would be of setting up a front end to an email system and converting incoming tweets to actual email. On the retrieval side you just read the mailbox, convert each message back to the tweet format, then send it on to the destination.
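To make the idea concrete, here is a rough sketch of the round trip using Python's standard `email` package. The `tweets.example` domain, the fixed `Subject` header, and the function names are all made up for illustration; nothing here reflects how Twitter actually stores messages.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Hypothetical domain used to turn user names into mail addresses.
TWEET_DOMAIN = "tweets.example"


def tweet_to_email(sender: str, recipient: str, text: str) -> EmailMessage:
    """Wrap one tweet in an ordinary email message."""
    msg = EmailMessage()
    msg["From"] = f"{sender}@{TWEET_DOMAIN}"
    msg["To"] = f"{recipient}@{TWEET_DOMAIN}"
    msg["Subject"] = "tweet"  # placeholder subject, an assumption
    msg.set_content(text)
    return msg


def email_to_tweet(raw: bytes) -> dict:
    """Parse a stored message back into a tweet-like dict."""
    msg = message_from_bytes(raw, policy=policy.default)
    return {
        "sender": str(msg["From"]).split("@", 1)[0],
        "recipient": str(msg["To"]).split("@", 1)[0],
        # set_content() appends a trailing newline, so strip it again.
        "text": msg.get_content().rstrip("\n"),
    }


if __name__ == "__main__":
    stored = tweet_to_email("alice", "bob", "hello from the mailbox").as_bytes()
    print(email_to_tweet(stored))
```

The obvious cost is that every tweet carries full mail headers, and fan-out to many followers becomes many deliveries, which is probably where the disadvantages start.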
Twitter Google App Engine (Score:4, Interesting)
Re:twitter hate... (Score:5, Interesting)
I do have two concerns about the service, apart from the supposed outages.
One is that they don't seem to have any plan that can make money on their service. I suppose they are trying to build up a service that they can sell to Microsoft, Yahoo or Google, but how much of the user base is going to tolerate ads if any of them bought it as a platform to serve ads?
Another is that they are very quick to cancel passwords due to inactivity, requiring a request to get it reset. I don't know if it's one week or one month, but it's as if they don't want you at all if you don't log in at least once a week. However often they do it, I've not seen any other service do it so quickly.
Re:Twitter Google App Engine (Score:3, Interesting)
Re:Twitter Google App Engine (Score:5, Interesting)
http://eric.themoritzfamily.com/2008/05/20/appengine-vs-twitter/ [themoritzfamily.com]
I haven't tried it tho.
Plurk seems pretty stable so far (Score:5, Interesting)
Plurk [is.gd] has been gaining popularity in the past 24 hours, and it's handling scalability rather well so far (after having been mentioned by Leo Laporte, Robert Scoble, TechCrunch, and others). I'm very curious to see how well it would hold up if it had the same number of users as Twitter, though.
Re:i wonder what they mean by.. (Score:4, Interesting)
We initially used a simple file based approach for user data (on vanilla Linux commodity boxes). It worked well.
As we grew, there was some pressure to move to a database approach, so we switched to Sybase (free on Linux). It worked well, and scaled us through a lot of growth.
Eventually, though, the database bogged down and no amount of tuning would help. Rather than clustering, we looked at the nature of our data: a user's data was self-contained and generally not related to any other user's data, so a massive relational database of hundreds of millions of records wasn't really necessary. So we went back to the file-based approach (with a good central "locking daemon" to ensure atomicity of writes), and gained a lot of performance (and simplicity). Even with thousands of bits of information or transactions for a user, a flat file is pretty darn manageable.
Generally people jump at databases as a default way to store data. If a single user's data is fairly manageable, and you have millions of users, there are times when plain old files suffice. (Doing some smart things, like ensuring that directories don't grow arbitrarily, also helps, but that stuff is generally a lot easier than db design and maintenance.)
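For anyone curious what that can look like, here is a minimal sketch, not the system described above. It assumes one JSON file per user, directories sharded by the first hex digits of a hash of the user id so no single directory grows without bound, and a write-to-temp-then-rename step standing in for the locking daemon. The names `user_path`, `load_user`, and `append_record` are made up, and user ids are assumed to be safe to use as file names.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


def user_path(root: Path, user_id: str) -> Path:
    """Map a user id to root/ab/cd/<user_id>.json (two levels of 256 buckets)."""
    digest = hashlib.sha1(user_id.encode("utf-8")).hexdigest()
    return root / digest[:2] / digest[2:4] / f"{user_id}.json"


def load_user(root: Path, user_id: str) -> list:
    """Return the user's records, or an empty list if no file exists yet."""
    path = user_path(root, user_id)
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def append_record(root: Path, user_id: str, record: dict) -> None:
    """Append one record and replace the user's file atomically.

    os.replace() means readers never see a half-written file. It does NOT
    serialize two writers doing read-modify-write at the same time; that is
    the job the central locking daemon did in the setup described above.
    """
    path = user_path(root, user_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    records = load_user(root, user_id)
    records.append(record)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(records, tmp)
    os.replace(tmp_name, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        root = Path(d)
        append_record(root, "alice", {"event": "login"})
        append_record(root, "alice", {"event": "post"})
        print(user_path(root, "alice").relative_to(root))
        print(load_user(root, "alice"))
```

Rewriting the whole file on every append is fine while a user's data stays small, which is the premise here; past that point an append-only log per user would be the next thing to try.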
It sounds like Twitter didn't have well-thought-out foundations, and they're reworking some of that. Good for them. (I've actually found some good consulting work in helping companies like them deal with scalability issues, from my experience with such things...)
Re:It's the algorithm, stupid (Score:3, Interesting)
That's the crappiest, most long-winded apology for poor performance I've seen. Yes, everything you said about O() notation is more or less correct. No, you can't wish it to be applicable just by squinting really hard and hoping.
At some point, that constant does start to matter. Suppose your O(n) algorithm written in $fast_language can support the world's population logged in simultaneously. Further suppose that you wrote prototype #1 in $spiffy_language that can support about 100 users on the same hardware as the $fast_language version. Sure, both are O(n), but that doesn't magically make them equally good solutions.
No, I don't think Ruby is 60 million times slower than the fastest language, but it very well might be 100 times slower [debian.org] for their algorithms. Do you honestly think that wouldn't make any difference?
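To put toy numbers on it: the sketch below assumes a fixed per-second work budget for one box and a linear cost per user. The budget and per-user costs are invented for illustration, and only the 100x ratio comes from the argument above.

```python
def max_users(budget_ops_per_sec: int, cost_per_user: int) -> int:
    """Users supported when total work is cost_per_user * n, i.e. O(n)."""
    return budget_ops_per_sec // cost_per_user


BUDGET = 1_000_000  # hypothetical operations per second on one machine

fast = max_users(BUDGET, 1)    # $fast_language: 1 op per user (assumed)
slow = max_users(BUDGET, 100)  # $spiffy_language: 100x the constant (assumed)

if __name__ == "__main__":
    print(fast, slow)
```

Both functions are O(n), yet on the same hardware one tops out at a hundredth of the users of the other, so you need a hundred times the machines to catch up.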