Twitter Not Rocket Science, but Still a Work in Progress
While it may not be rocket science, the Twitter team has been making a concerted effort to effect better communication with their community at large. Recently they were set upon by a barrage of technical and related questions, and the resulting answers are actually somewhat interesting. "Before we share our answers, it's important to note one very big piece of information: We are currently taking a new approach to the way Twitter functions technically with the help of a recently enhanced staff of amazing systems engineers formerly of Google, IBM, and other high-profile technology companies added to our core team. Our answers below refer to how Twitter has worked historically--we know it is not correct and we're changing that."
twitter hate... (Score:5, Funny)
Re: (Score:1, Funny)
Re:twitter hate... (Score:5, Interesting)
I do have two concerns about the service, apart from the supposed outages.
One is that they don't seem to have any plan that can make money on their service. I suppose they are trying to build up a service that they can sell to Microsoft, Yahoo or Google, but how much of the user base is going to tolerate ads if any of them bought it as a platform to serve ads?
Another is that they are very quick to cancel passwords due to inactivity, requiring a request to get it reset. I don't know if it's one week or one month, but it's as if they don't want you at all if you don't log in at least once a week. However often they do it, I've not seen any other service do it so quickly.
Re: (Score:2, Informative)
Re: (Score:3, Funny)
Of course, now it's not funny anymore.
twit, twitter, twittest? (Score:2)
Re: (Score:2)
Dude, if you can't be bothered to tell people what you had for breakfast each day maybe twitter's not for you.
Re: (Score:1)
Who cares when you've still got venture capital to burn through?
Re:twitter hate... (Score:4, Funny)
I DO hang out at places other than
Now that I have seen the twitter weblog, I'm still tempted mentally to associate the two, and want to spam it with goatse.cx links to somehow balance the shite. (just kidding-sort of)
I could not get an answer from the twitter weblog about how many sock puppet accounts you could register though, so I guess I will pass on the whole thing, and attempt to just filter 'anything twitter' from my web experience, as any more in my mind twitter==mindless rant.
And before some twitter (weblog) fanboy tries to bust my chops, yes I know I could be 'missing out on something' with that attitude, well that also goes for a million and one other websites that I currently don't know about. Save it!
Re: (Score:3, Funny)
So I went to the site and found I was correct. It's an AI-based troll that talks up products. Sadly there is a glitch where it also talks about toilet-related obsessions.
Re: (Score:2)
Then again, I guess someone could set up a twitter-like slashdot posting service....
And Twitter is... (Score:2)
Re: (Score:2)
Re: (Score:3, Insightful)
Re: (Score:2)
Seriously, I'm liking it, because I get very isolated working, and like to feel slightly more like I'm in some sort of context more intimate than the far side of the moon.
Also, I find that if I post just what I'm up to, it motivates me to have that be something productive, and that's useful to me.
Seems to me it's a good fit for anybody entrepreneurial, because it lets you sort of 'connect' a bit while stopping you from wasting any serious ti
Re:And Twitter is... (Score:5, Informative)
Re: (Score:2)
Alright, poop-time (Score:5, Funny)
This is correct (Score:1, Funny)
Re: (Score:1, Funny)
That's "effect", not "affect" (Score:5, Informative)
Obligatory xkcd (Score:5, Insightful)
Re: (Score:2, Insightful)
Re: (Score:1)
Re: (Score:2)
Re: (Score:1)
Re: (Score:1)
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
No mention of Ruby/Rails? (Score:2, Interesting)
Re: (Score:1, Insightful)
I use neither Twitter nor Ruby, but seems like the talk has been that Rails wasn't scaling well.
So you're basically flamebait. Rails might have been a factor, but this is not what it's about. It's not as easy as, rewrite it all in java or php and it will work blazing fast. Twitter needs to find a better way to reorganize or perhaps rewrite their code and it may still be in ruby as far as I am concerned. Dare I also mention that their biggest overhead is not the rails framework, nor the ruby language, but their database, which led to their downtime in the first place. Nevertheless, big players such as IBM
It's the algorithm, stupid (Score:5, Informative)
When I started irately writing this post, I wrote it in a tone that would have gotten me modded into oblivion. But then I realized that ignorance, not idiocy, drives the particular myth I'm debunking. Let me educate, not flame, those of you who haven't formally studied computer science.
It's become fashionable to blame Ruby for Twitter's problems, but that's wrong. The particular choice of language doesn't matter a bit when you talk about scalability, no matter what the language or the problem.
First, sending a twitter message is an algorithm. An algorithm is just a recipe for doing something to some data. Although most computer science literature deals with more abstract and general algorithms, like those for sorting and searching, the same principles apply to even the most mundane processes, like what rm foo does to a file system, or how a database engine runs an INSERT.
One way we can talk about algorithms is to use something called Big-O Notation [wikipedia.org], which describes the relationship between how much stuff an algorithm processes and how long it takes to run.*
It's easier to see things with examples. Say we have an algorithm and we give it three sets of data, D1, D2, and D3, each twice as large as the last, so that D2 is twice as large as D1, and D3 is four times as large as D1.
If we call the algorithm O(1), it will take the same amount of time to process D1, D2, and D3. If we instead say it's O(N), D2 will take twice as long to run as D1, and D3 will take four times as long.
If N represents the number of users for a web application, and we want to double N, twice as many users, we'd need twice as many web servers if the bottleneck algorithms are O(N). If the database is the bottleneck, we'd need a twice-beefier database server, or some partitioning.
Things start to get interesting with O(N^2). In that case, D2 takes four times longer to run than D1, and D3 takes four times longer than D2, which is sixteen times longer than D1.
That means that if we want to support twice as many users, we need four times as many web servers, or more likely, a four-times beefier database server.
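The doubling argument above can be checked by counting steps instead of measuring time. This is only a rough sketch: the function names and the sizes 100, 200, and 400 (standing in for D1, D2, and D3) are made up for illustration.

```python
def constant_ops(n):
    """O(1): one step regardless of input size."""
    return 1


def linear_ops(n):
    """O(N): one step per item."""
    return sum(1 for _ in range(n))


def quadratic_ops(n):
    """O(N^2): one step per ordered pair of items."""
    return sum(1 for _ in range(n) for _ in range(n))


# Hypothetical sizes for D1, D2, D3: each twice as large as the last.
sizes = [100, 200, 400]

for name, fn in [("O(1)", constant_ops),
                 ("O(N)", linear_ops),
                 ("O(N^2)", quadratic_ops)]:
    counts = [fn(n) for n in sizes]
    # Express each count relative to the smallest data set, D1.
    ratios = [c // counts[0] for c in counts]
    print(name, ratios)
```

The ratios relative to D1 come out as 1, 1, 1 for the constant case, 1, 2, 4 for the linear case, and 1, 4, 16 for the quadratic case, matching the figures in the comment.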
It can get a lot worse than O(N^2) too, especially if you're not paying attention to complexity. For example, many graph (think social networking) algorithms can easily become O(2^N), which is a lot worse than N to a constant power.
When you try to scale a poorly-designed algorithm (pretty much anything worse than O(N)), you start running out of cores, rack space, electricity, and atoms in the universe.
One useful bit about big-O notation is that it lets us ignore piddly details that don't matter. Say we had an O(2N) version of the O(N) algorithm. Sure, the O(2N) algorithm might take twice as long to run, but it can still handle double the data with double the capacity or double the time. Even if it's O(10N), you don't start boiling the oceans to cool your data center when you want to increase your visit capacity a thousandfold.
This observation is why the choice of language doesn't matter. If a language implementation is slow, all it does is add a constant factor to any algorithms written in that language. A Python application might be ten times slower than one written in C, but its big-O complexity will be the same.
At the worst, that means you'll need ten times as many servers as with the C web application. The increase in development efficiency writing in Python (or Ruby on Rails, or Lisp, or anything else) might make the trade-off worth it. You can deal with a constant factor slowdown.
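To put rough numbers on the constant-factor point, here is a small sketch. It assumes total work grows linearly with users; `servers_needed`, `CAPACITY`, and the per-user costs of 1 and 10 are invented for the example and are not measurements of any real language.

```python
import math


def servers_needed(users, work_per_user, server_capacity):
    """Servers required when total work grows linearly with the user count."""
    return math.ceil(users * work_per_user / server_capacity)


CAPACITY = 1000  # hypothetical work units one server can handle

for users in (10_000, 20_000, 40_000):
    fast = servers_needed(users, 1, CAPACITY)   # the "fast" runtime
    slow = servers_needed(users, 10, CAPACITY)  # a runtime ten times slower
    print(users, fast, slow, slow // fast)
```

Under these assumptions the slower runtime always needs ten times the servers, and that ratio stays fixed as the user count doubles. That is the sense in which the slowdown is a constant factor.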
If on the other hand, you code a wicked fast implementation of an O(N^3) algorithm in C, no amount of hardware will save you. You'll hit a number of users beyond which your servers slow to a crawl and you lose blagosphereic karma. Even if you double your capacity, or buy a four-times-beefier database server, that
Re: (Score:1, Redundant)
Re: (Score:3, Insightful)
Re: (Score:2)
Re: (Score:3, Insightful)
Good scalability is not about how fast something processes, it's about how much the speed degrades as the load increases. It sounds simple - but it's NOT.
Re: (Score:3, Insightful)
The language you choose may affect your ability to scale when you take its concurrency model (or lack thereof in some cases) into account. For instance, I can have an O(1) algorithm, using a hashmap, but that doesn't mean that I'll be able to have the runtime performance of constant time. For a solid example, let's use Java (Java 6, with java.util.concurrent,
Re:It's the algorithm, stupid (Score:4, Insightful)
If your whole web application is bottlenecked by one hashmap, you're going to run into scalability problems as soon as you need more than one machine anyway. On the other hand, if the performance of the web application as a whole does not depend on the hashmap, then your argument is irrelevant to the scalability of the application as a whole.
I concede that a more efficient runtime environment might make better use of the same hardware, supporting, say, 70 clients instead of 50 per machine. But that's not the kind of scalability I'm talking about. Even a platform that achieved only one client per machine that scaled linearly would be better than one that handled 70 clients per machine, except that you were limited to one machine.
And yes, on one machine, a bad choice of data structure can affect scalability. But the blame for that rests on the data structure itself, not the language in which it is implemented. As an associative array, a Python hash table (dict) will scale far better than a C linked list. Why? Because one's a hash table and one is a linked list!
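A rough way to see the hash table versus linked list difference is to count key comparisons per lookup. In this sketch a Python list of (key, value) pairs stands in for the linked list, and `CountingKey`, `list_lookup`, and `count_comparisons` are helper names invented for the illustration.

```python
class CountingKey:
    """Key that counts how many equality comparisons it takes part in."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        CountingKey.comparisons += 1
        return self.value == other.value


def list_lookup(pairs, key):
    """Linear scan over (key, value) pairs, like walking a linked list."""
    for k, v in pairs:
        if k == key:
            return v
    raise KeyError(key)


def count_comparisons(lookup, container, key):
    """Run one lookup and report how many key comparisons it needed."""
    CountingKey.comparisons = 0
    lookup(container, key)
    return CountingKey.comparisons


for n in (1000, 2000, 4000):
    pairs = [(CountingKey(i), i) for i in range(n)]
    table = dict(pairs)
    target = CountingKey(n - 1)  # worst case for the scan: the last entry
    scan = count_comparisons(list_lookup, pairs, target)
    hashed = count_comparisons(lambda d, k: d[k], table, target)
    print(n, scan, hashed)
```

The scan needs one comparison per entry, so its count doubles with the data, while the dict lookup needs only a handful at most no matter how large the table gets.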
Which data structures are available in which language might factor into the choice of language, but it's only a convenience: you can always create your own data structure implementations.
Creating a scalable application means being able to throw hardware at the problem.
Let's assume you've gotten your application to scale beyond one machine anyway. That's a prerequisite for this section.
Now, if the machines don't communicate and users don't care, you automatically win O(N) scalability.
If your machines must communicate, they do so over some kind of network. The way this communication is achieved determines the scalability of the application. While some environments might have more intuitive network facilities than others (think Erlang), ultimately one can use any approach to networking with any language.
Again, we're reduced to choice of data structures and algorithms, not language, as the marker of scalability.
The choice of language does not dictate the data structure the designer of the application uses, and so the language is not a serious barrier to scalability. I concede it may be more difficult to implement efficient protocols in some languages than in others, but we're dealing with Turing-complete languages here, aren't we?
I should note that languages typically thought of as "slower" are often more expressive. It often takes less effort to write efficient algorithms in expressive languages.
(Returning to our previous example, since writing a hash table is more complex than writing a naive linked list in C, a C programmer is more likely to use a linked list at the expense of scalability. In Python, using a hash table is as simple as writing {}, so an equally-skilled programmer is more likely to use the more efficient data structure, resulting in better performance in a "slower" language.)
The bottom line is that if communication between nodes is required, complexity must be > O(N). And if complexity is greater than O(N), then as N increases without bound, the communication overhead approaches infinity anyway. The key is to make that growth as slow as possible.
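As one illustration of communication growing faster than the node count, consider the case where every node talks directly to every other node. That topology is an assumption made for this sketch, not something the comment specifies, and `full_mesh_links` is a made-up name.

```python
def full_mesh_links(nodes):
    """Number of distinct node-to-node links when every pair is connected."""
    return nodes * (nodes - 1) // 2


for n in (10, 20, 40):
    print(n, full_mesh_links(n))
```

Doubling the nodes roughly quadruples the links (45, 190, 780 here). Slowing that growth is a matter of designing the communication pattern so that fewer pairs need to talk.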
The tools and techniques used to slow that growth --- thinking about the problem, designing efficient algorithms --- are features of the human mind, and not any particular language.
Saying that one language is better at scaling than another is like arguing that one human language is better for building cars than another!
Re: (Score:2)
Re: (Score:3, Interesting)
This observation is why the choice of language doesn't matter. If a language implementation is slow, all it does is add a constant factor to any algorithms written in that language.
That's the crappiest, most long-winded apology for poor performance I've seen. Yes, everything you said about O() notation is more or less correct. No, you can't wish it to be applicable just by squinting really hard and hoping.
At some point, that constant does start to matter. Suppose your O(n) algorithm written in $fast_language can support the world's population logged in simultaneously. Further suppose that you wrote prototype #1 in $spiffy_language that can support about 100 users on the same h
i wonder what they mean by.. (Score:5, Interesting)
I wonder what the disadvantages are of setting up a front end to an email system and converting incoming tweets to actual email. On the retrieval side you just read the mailbox and convert back to the tweet format then send them on to the destination.
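The round trip described above might look something like this, using Python's standard `email` package. It is only a sketch of the conversion step: the addresses, the fixed "tweet" subject line, and the function names are assumptions for the example, and nothing here touches a real mail server or Twitter's actual format.

```python
from email import message_from_string, policy
from email.message import EmailMessage


def tweet_to_email(sender, recipient, text):
    """Wrap a tweet's text in a plain email message and serialize it."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "tweet"  # hypothetical marker, not a real convention
    msg.set_content(text)
    return msg.as_string()


def email_to_tweet(raw):
    """Recover (sender, text) from a serialized message."""
    msg = message_from_string(raw, policy=policy.default)
    return str(msg["From"]), msg.get_content().strip()


raw = tweet_to_email("alice@example.com", "bob@example.com", "having toast")
print(email_to_tweet(raw))
```

The conversion itself is cheap. Whether a mail system delivers quickly enough, and how it copes with one sender who has many followers, are separate questions this sketch does not address.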
Re: (Score:2)
Re: (Score:2, Informative)
Re:i wonder what they mean by.. (Score:4, Interesting)
We initially used a simple file based approach for user data (on vanilla Linux commodity boxes). It worked well.
As we grew, there was some pressure to move to a database approach, so we switched to Sybase (free on Linux). It worked well, and scaled us through a lot of growth.
However, eventually, when the database bogged down and no amount of tuning would help, rather than clustering, we looked at the nature of our data (a user's data was self-contained, generally not related to any other user's data), so having a massive relational database of hundreds of millions of records wasn't really necessary. So we went back to the file-based approach (with a good central "locking daemon" to ensure atomicity of writes), and gained a lot of performance (and simplicity). Even with thousands of bits of information or transactions for a user, a flat file is pretty darn manageable.
Generally people jump at databases as a default way to store data. If a single user's data is fairly manageable, and you have millions of users, there are times when plain old files suffice. (Doing some smart things like ensuring that directories don't grow arbitrarily and such also help, but that stuff is generally a lot easier than db design and maintenance.)
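A bare-bones sketch of the flat-file idea might look like this. It is not the poster's actual system: the JSON-lines format, the MD5-based directory sharding, and every function name are assumptions, and a single in-process lock stands in for the central "locking daemon" mentioned above.

```python
import hashlib
import json
import os
import tempfile
import threading

# Stand-in for the "locking daemon": one in-process lock.
# A real multi-machine setup would need a shared lock service instead.
_write_lock = threading.Lock()


def user_path(root, user_id):
    """Spread users over up to 256 subdirectories so none grows unbounded."""
    shard = hashlib.md5(user_id.encode("utf-8")).hexdigest()[:2]
    return os.path.join(root, shard, user_id + ".jsonl")


def append_record(root, user_id, record):
    """Append one JSON record to the user's own flat file."""
    path = user_path(root, user_id)
    with _write_lock:
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")


def read_records(root, user_id):
    """Return all records for one user; other users' files are not touched."""
    path = user_path(root, user_id)
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]


root = tempfile.mkdtemp()
append_record(root, "alice", {"event": "login"})
append_record(root, "alice", {"event": "post", "text": "hello"})
print(read_records(root, "alice"))
```

Because each user's data is self-contained, a read only ever opens one small file, which is the property the comment relies on.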
It sounds like Twitter didn't have well-thought out foundations, and they're reworking some of that. Good for them. (I've actually found some good consulting work in helping companies like them deal with scalability issues, from my experience with such things...)
Re: (Score:2)
Re: (Score:2)
Faster? Than a properly designed, maintained, backed-up, optimized, licensed, Oracle database? Probably not.
But it's fast enough, scalable, an order of magnitude simpler, doesn't require a six or seven-figure licensing fee once you get big, doesn't require major hardware to run on, and doesn't require a $100k/year Oracle specialist to maintain it. I've a
Re: (Score:2)
Re: (Score:1)
Twitter Google App Engine (Score:4, Interesting)
Re: (Score:3, Interesting)
Re:Twitter Google App Engine (Score:5, Interesting)
http://eric.themoritzfamily.com/2008/05/20/appengine-vs-twitter/ [themoritzfamily.com]
I haven't tried it tho.
Re: (Score:2)
They wouldn't have Twitter's killer feature, which is free SMS updates. That's the only reason we don't already have 100 Twitter clones.
Re: (Score:3, Informative)
And the biggest irony is... (Score:2, Funny)
Too little too late. (Score:2)
From what I've heard the only affect twitter's had on communication isn't one worth bragging about.
Re: (Score:1)
I find it odd that this article is tagged effect. (Score:3, Funny)
Re: (Score:2)
You, also, failed to get the joke the original poster was making. Allow me to explain:
It seems to be meant to suggest that the article's use of "affect" is incorrect. Surely this is mistaken. If suggesting that twitter has anything to do with better communication isn't an affectation, I don't know what is.
This is a joke. The person who wrote it is well aware of the difference between "affect" and "effect". He deliberately used the word "affectation" (which looks related to "affect" but really means conspicuously artificial [reference.com] or unnatural speech or conduct [merriam-webster.com]). The WHOOSH was not a dismissal of the response; the WHOOSH was intended to be the sound of the joke passing over the respondent's head.
Whether the joke is funn
Re: (Score:2)
You mean with a direct object, right? When saying "Bob effects X", X is the direct object.
Re: (Score:2)
Also, try replacing "effect" with "create". "Bob creates X" is correct as well.
Lore Sjöberg on Twitter (Score:2)
ergh Twitter (Score:1, Flamebait)
Plurk seems pretty stable so far (Score:5, Interesting)
Plurk [is.gd] has been gaining popularity in the past 24 hours, and it's handling scalability rather well so far (after having been mentioned by Leo Laporte, Robert Scoble, TechCrunch, and others). I'm very curious to see how well it would hold up if it had the same number of users as Twitter, though.
Re: (Score:3, Insightful)
Nice try... (Score:3, Funny)
Re: (Score:2)
grammar nazi (Score:1)
Re: (Score:2)
"Our biggest mistake... (Score:1)
I doubt they'll come out and say that. But look at how flaky twitter is (try using it for a while). The two biggest sites on the internet that are built on RoR - Penny Arcade and twitter - are flaky as hell. I've met the PA coder, and I'm not willing to believe the twitter guys are incompetent, so it's obviously not their fault.
RoR is known for obscenely high resource usage. I can't believe it'd be a good choice for large sites like twitter, long term.
'team' (Score:3, Informative)
Psst, there's actually no "Twitter team." It's just one guy with like ten accounts.
Not rocket science? (Score:2)
Re: (Score:2)
Re:Big Brother(s) (Score:5, Insightful)
Re: (Score:2)
And am aware that people who formerly worked at places such as Google (especially) and IBM would most likely have a better understanding of how a large-scale system like this works.
But Google itself (along with a lot of other major internet/software/hardware companies) was started in a 'grass roots'/garage/basement method.
It seems to me that more often than not, bringing in the help of (formerly employees of) large corporations tends to con
Re:Big Brother(s) (Score:4, Insightful)
Like the AC said, I think you're wildly exaggerating how ideological workplaces are, particularly from the point of view of a server monkey.
Re: (Score:3, Insightful)
Re: (Score:1)
Re: (Score:1)
Would you choose Anthony Anderson or Kevin Smith?
Re: (Score:2)
If you mean, someone who has experience directing movies, as opposed to someone who (seems) to only have experience being directed, and probably has "lots of ideas" on how to direct.
I'd be willing to take advice from either with an equal amount of salt.
Unless of course I was planning on directing, or writing a movie similar to one of Kevin Smith's, in which case, the answer is obvious.
However, as for "working with", I would probably be more com
Re: (Score:2)
Re: (Score:2)
Fixed it for you.... (Score:2, Funny)
Granted, they seem to be more directed at the physical performance of the system, but I see this as somewhat of an inroad to Twitter actually working, thereby requiring me to acknowledge its existence. And I would much rather have some no-name website, where people living in their mother's basements come to tell a
Re: (Score:2)
Earthquake In China (Twitter Related)
http://news.slashdot.org/article.pl?sid=08/05/13/0245240 [slashdot.org]
YouTube Refuses To Remove Terrorist Videos
http://tech.slashdot.org/article.pl?sid=08/05/20/224218 [slashdot.org]
YouTube Fires Back At Viacom
http://tech.slashdot.org/article.pl?sid=08/05/27/2350225 [slashdot.org]
And when they become "Big" they fall prey to that greed, censorship in favor of profit
Re: (Score:2)