Twitter Not Rocket Science, but Still a Work in Progress

While it may not be rocket science, the Twitter team has been making a concerted effort to effect better communication with their community at large. Recently they were set upon by a barrage of technical and related questions, and the resulting answers are actually somewhat interesting. "Before we share our answers, it's important to note one very big piece of information: We are currently taking a new approach to the way Twitter functions technically with the help of a recently enhanced staff of amazing systems engineers formerly of Google, IBM, and other high-profile technology companies added to our core team. Our answers below refer to how Twitter has worked historically--we know it is not correct and we're changing that."

Comments:
  • by Pope ( 17780 ) on Monday June 02, 2008 @04:34PM (#23631095)
    Will "a rather pointless waste of time" fit on a postcard? I guess I'm not 'hip' enough to care about Twitter, it all seems a bit pointless for anyone who's not some interweb celebrity.
  • Obligatory xkcd (Score:5, Insightful)

    by Krishnoid ( 984597 ) on Monday June 02, 2008 @04:39PM (#23631193) Journal
    Grammar [xkcd.com] can be fun!
  • by devotedlhasa ( 1298843 ) on Monday June 02, 2008 @04:42PM (#23631223)
    affect better communication.... negatively.
  • Re:Big Brother(s) (Score:5, Insightful)

    by Anonymous Coward on Monday June 02, 2008 @04:43PM (#23631229)
    Hiring folks who used to work at IBM or Google is not the same thing as "large companies control[ling] how Twitter works." Some day, you'll have a job and you'll understand that. [Sorry to be an asshole about this, but your comment just shouts "teenage kid who's never had a serious job."] People with experience with large-scale applications may already know solutions to some of the problems Twitter is seeing. Those solutions aren't always in the textbooks, and if they were trivial and obvious, then such applications would be much more common.
  • Re:Big Brother(s) (Score:4, Insightful)

    by Otter ( 3800 ) on Monday June 02, 2008 @05:10PM (#23631563) Journal
    To argue against myself on your behalf, though: having someone from a larger corporation can be beneficial as well, because they may have been fired or quit because the larger corporation wasn't listening to them, or was doing something in a way that went against their ideology, thereby possibly preventing said smaller company from becoming similar to the larger one.

    Like the AC said, I think you're wildly exaggerating how ideological workplaces are, particularly from the point of view of a server monkey.

  • by Anonymous Coward on Monday June 02, 2008 @05:12PM (#23631587)

    I use neither Twitter nor Ruby, but it seems like the talk has been that Rails wasn't scaling well.
    So you're basically flamebait. Rails might have been a factor, but that is not what this is about. It's not as easy as rewriting it all in Java or PHP and having it work blazingly fast. Twitter needs to find a better way to reorganize, or perhaps rewrite, their code, and it may well still be in Ruby as far as I am concerned. Dare I also mention that their biggest overhead is not the Rails framework, nor the Ruby language, but their database, which led to their downtime in the first place. Nevertheless, big players such as IBM or Google have already faced these issues, and that is what this article is about.
  • by Bill, Shooter of Bul ( 629286 ) on Monday June 02, 2008 @06:49PM (#23632549) Journal
    That is interesting. Sounds like they were still running on in-house, development-level servers when the load hit -- ouch. If I do end up using such a site, I'd rather say I'm Plurking than tweeting. It sounds much cooler, like something to do after having too many beers, rather than trying to imitate a bird.
  • Re:Big Brother(s) (Score:3, Insightful)

    by Antique Geekmeister ( 740220 ) on Monday June 02, 2008 @08:21PM (#23633341)
    Oh, ideology affects cage monkeys. The use of open source versus closed source, purchasing enough licenses for the software you use, making '99.999%' uptime actually mean that instead of simply hiding downtime, and forcing people to spend more time documenting how much time they spent on a task than actually doing the task are all policies I've seen affect server work. Those may not be ideologies per se, but they certainly arise from philosophies about how things should be done.
  • by mcrbids ( 148650 ) on Tuesday June 03, 2008 @02:17AM (#23635243) Journal
    Well written post! It's amazing how few people really understand the difference between performance and scalability. Getting good performance isn't all that hard. Getting good scalability is much harder.

    Good scalability is not about how fast something processes, it's about how much the speed degrades as the load increases. It sounds simple - but it's NOT.
  • by QuoteMstr ( 55051 ) <dan.colascione@gmail.com> on Tuesday June 03, 2008 @02:36AM (#23635307)
    That's the beauty of the web. At a basic level, it's stateless. Web frameworks don't have any concept of communicating with other clients. The highest level they'll work at is the individual session. You only see complexities worse than O(N) when clients communicate with each other, and that communication must be an entirely application-specified thing. The web framework and language have nothing to do with this communication between clients, and so have nothing to do with scalability writ large.
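    To make the statelessness point concrete, here's a minimal sketch using the JDK's built-in com.sun.net.httpserver server (just an illustration, not anyone's production code): each request is answered from the request alone, and no in-process state is shared between clients, so you can run any number of identical copies behind a load balancer.

        import com.sun.net.httpserver.HttpServer;
        import java.io.OutputStream;
        import java.net.InetSocketAddress;
        import java.nio.charset.StandardCharsets;

        // Minimal stateless handler: everything needed to answer the request is
        // in the request itself (plus whatever a backing store would return).
        // Nothing here refers to any other client or any other server process.
        public class StatelessToy {
            public static void main(String[] args) throws Exception {
                HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
                server.createContext("/hello", exchange -> {
                    String query = exchange.getRequestURI().getQuery();   // e.g. "user=alice"
                    byte[] body = ("hello, " + (query == null ? "anonymous" : query))
                            .getBytes(StandardCharsets.UTF_8);
                    exchange.sendResponseHeaders(200, body.length);
                    try (OutputStream out = exchange.getResponseBody()) {
                        out.write(body);
                    }
                });
                server.start();   // any number of these processes can serve the same URL space
            }
        }

    Because no request depends on in-memory state from another client, capacity scales by simply adding identical machines.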
  • by Gazzonyx ( 982402 ) <scott,lovenberg&gmail,com> on Tuesday June 03, 2008 @03:14AM (#23635415)
    First let me say, that was a very well thought out and informative post. I only would like to bring up a counter point for the sake of discussion.

    The language you choose may affect your ability to scale when you take its concurrency model (or lack thereof, in some cases) into account. For instance, I can have an O(1) algorithm, using a hashmap, but that doesn't mean I'll actually get constant-time performance at runtime. For a solid example, let's use Java (Java 6, with java.util.concurrent, as this is the concurrency framework I'm most familiar with).

    I can have a ConcurrentHashMap (not a plain HashMap behind a single lock, but a map that stripes its internal locks) with constant-time access (get and put). However, when many threads modify it at once, writers touching the same part of the structure contend for the same locks and block one another. That means I've got my optimal algorithm, but due to competition for the shared resource, its effective throughput can degrade badly if everyone wants to write to that data structure, because access ends up partially serialized (a toy sketch of this contention is at the end of this comment).

    Granted, there are ways around this, but you can't just throw hardware at the problem and pretend it doesn't exist. In reality, more hardware raises the communication overhead for the process. Your language of choice may or may not scale at the same pace as others when you take into account your concurrency needs and perhaps the backend you are given to interface with. All that being said, I'm not sure how the runtime characteristics of various languages (say, a JIT-compiled runtime versus native machine code) scale with respect to each other. I would suspect that a JIT would have much more overhead once you layer it on top of an OS's concurrency model (handling locks within your own code, the JIT handling resources, and the OS doing the same, all at the same time... probably with a database and filesystem doing so to varying degrees). Of course, I could be wrong.
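    Here's that toy contention demo in Java (a sketch only, not a rigorous benchmark, and it uses Java 8 lambdas for brevity even though I was talking about Java 6 above): several threads hammer one hot key in a shared ConcurrentHashMap, versus each thread keeping a private counter that gets merged once at the end. The per-operation work is O(1) either way; the difference is contention on the shared structure.

        import java.util.concurrent.ConcurrentHashMap;

        // Toy contention demo: the shared-map run makes every thread fight over
        // the same key, while the per-thread run touches no shared state until a
        // single merge at the end. Numbers vary by machine; the gap is the point.
        public class ContentionToy {
            static final int THREADS = 8;
            static final int OPS_PER_THREAD = 1_000_000;

            public static void main(String[] args) throws InterruptedException {
                System.out.println("shared map : " + run(true) + " ms");
                System.out.println("per-thread : " + run(false) + " ms");
            }

            static long run(boolean shared) throws InterruptedException {
                ConcurrentHashMap<String, Long> map = new ConcurrentHashMap<>();
                long[] privateCounts = new long[THREADS];
                Thread[] workers = new Thread[THREADS];

                long start = System.nanoTime();
                for (int t = 0; t < THREADS; t++) {
                    final int id = t;
                    workers[t] = new Thread(() -> {
                        for (int i = 0; i < OPS_PER_THREAD; i++) {
                            if (shared) {
                                map.merge("hot-key", 1L, Long::sum); // all threads contend here
                            } else {
                                privateCounts[id]++;                 // no shared state touched
                            }
                        }
                    });
                    workers[t].start();
                }
                for (Thread w : workers) w.join();
                if (!shared) {
                    long total = 0;
                    for (long c : privateCounts) total += c;
                    map.put("hot-key", total);                       // merge once at the end
                }
                System.out.println("  count = " + map.get("hot-key"));
                return (System.nanoTime() - start) / 1_000_000;
            }
        }

    On most machines the shared-map run is noticeably slower, which is the whole point: the algorithmic cost didn't change, only the contention did.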
  • by QuoteMstr ( 55051 ) <dan.colascione@gmail.com> on Tuesday June 03, 2008 @04:00AM (#23635563)
    Thank you for your polite reply, but I feel like I haven't adequately communicated my point.

    The language you choose may affect your ability to scale when you take its concurrency model (or lack thereof, in some cases) into account. For instance, I can have an O(1) algorithm, using a hashmap, but that doesn't mean I'll actually get constant-time performance at runtime.


    If your whole web application is bottlenecked by one hashmap, you're going to run into scalability problems as soon as you need more than one machine anyway. On the other hand, if the performance of the web application as a whole does not depend on the hashmap, then your argument is irrelevant to the scalability of the application as a whole.

    I concede that a more efficient runtime environment might make better use of the same hardware, supporting, say, 70 clients per machine instead of 50. But that's not the kind of scalability I'm talking about. Even a platform that handled only one client per machine but scaled linearly would be better than one that handled 70 clients per machine but was limited to that one machine.

    And yes, on one machine, a bad choice of data structure can affect scalability. But the blame for that rests on the data structure itself, not the language in which it is implemented. As an associative array, a Python hash table (dict) will scale far better than a C linked list. Why? Because one's a hash table and one is a linked list!

    Which data structures are available in which language might factor into the choice of language, but it's only a convenience: you can always create your own data structure implementations.
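    To illustrate (same idea as the Python dict versus C linked list, but keeping both structures in one language so only the data structure varies; a rough sketch, not a careful benchmark): a hash table and a naive list of key/value pairs doing the same associative-array job.

        import java.util.ArrayList;
        import java.util.HashMap;
        import java.util.List;
        import java.util.Map;

        // Same associative-array job, two data structures in the same language.
        // Only the structure differs, and that decides how lookups degrade as
        // the data grows.
        public class LookupToy {
            static class Pair {
                final String key;
                final int value;
                Pair(String key, int value) { this.key = key; this.value = value; }
            }

            public static void main(String[] args) {
                for (int n = 1_000; n <= 1_000_000; n *= 10) {
                    Map<String, Integer> table = new HashMap<>();
                    List<Pair> list = new ArrayList<>();
                    for (int i = 0; i < n; i++) {
                        table.put("user" + i, i);
                        list.add(new Pair("user" + i, i));
                    }
                    String needle = "user" + (n - 1);     // worst case for the list scan

                    long t0 = System.nanoTime();
                    int fromTable = table.get(needle);    // expected O(1)
                    long t1 = System.nanoTime();
                    int fromList = scan(list, needle);    // O(n) scan
                    long t2 = System.nanoTime();

                    System.out.printf("n=%,d  hash: %,d ns  list scan: %,d ns  (values %d, %d)%n",
                            n, t1 - t0, t2 - t1, fromTable, fromList);
                }
            }

            static int scan(List<Pair> list, String key) {
                for (Pair p : list) {
                    if (p.key.equals(key)) return p.value;
                }
                return -1;
            }
        }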

    Granted, there are ways around this, but you can't just throw hardware at the problem and pretend it doesn't exist.


    Creating a scalable application means being able to throw hardware at the problem.

    Let's assume you've gotten your application to scale beyond one machine anyway. That's a prerequisite for this section.

    Now, if the machines don't communicate and users don't care, you automatically win O(N) scalability.
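    A sketch of what that looks like in practice (the shard names here are made up; a real deployment would map them to hosts): every request is routed to exactly one shard by hashing the user id, so adding a shard adds capacity without adding any cross-machine chatter.

        import java.util.List;

        // Each user maps to exactly one shard; shards never need to talk to
        // each other to serve a request.
        public class ShardRouter {
            private final List<String> shards;

            ShardRouter(List<String> shards) {
                this.shards = shards;
            }

            String shardFor(String userId) {
                int bucket = Math.floorMod(userId.hashCode(), shards.size());
                return shards.get(bucket);
            }

            public static void main(String[] args) {
                ShardRouter router = new ShardRouter(List.of("db-a", "db-b", "db-c", "db-d"));
                for (String user : List.of("alice", "bob", "carol", "dave")) {
                    System.out.println(user + " -> " + router.shardFor(user));
                }
            }
        }

    (Naive modulo routing like this reshuffles users when the shard count changes; consistent hashing is the usual fix, but it doesn't change the scaling argument.)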

    If your machines must communicate, they do so over some kind of network. The way this communication is achieved determines the scalability of the application. While some environments might have more intuitive network facilities than others (think Erlang), ultimately one can use any approach to networking with any language.

    Again, we're reduced to choice of data structures and algorithms, not language, as the marker of scalability.

    The choice of language does not dictate the data structure the designer of the application uses, and so the language is not a serious barrier to scalability. I concede it may be more difficult to implement efficient protocols in some languages than in others, but we're dealing with Turing-complete languages here, aren't we?

    I should note that languages typically thought of as "slower" are often more expressive. It often takes less effort to write efficient algorithms in expressive languages.

    (Returning to our previous example, since writing a hash table is more complex than writing a naive linked list in C, a C programmer is more likely to use a linked list at the expense of scalability. In Python, using a hash table is as simple as writing {}, so an equally-skilled programmer is more likely to use the more efficient data structure, resulting in better performance in a "slower" language.)

    The bottom line is that if communication between nodes is required, complexity must be > O(N). And if complexity is greater than O(N), then as N increases without bound, the communication overhead approaches infinity anyway. The key is to make that growth as slow as possible.
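    A tiny worked example of that growth (just arithmetic, nothing clever): if every node must notify every other node about an update, total messages grow roughly as N squared; if updates flow through a tree where each node talks only to its parent, they grow roughly linearly. Both involve communication, but one degrades far more slowly as you add machines.

        // Message-count growth for two communication patterns as the node count rises.
        public class MessageGrowth {
            public static void main(String[] args) {
                System.out.printf("%10s %20s %15s%n", "nodes", "all-to-all msgs", "tree msgs");
                for (long n = 10; n <= 100_000; n *= 10) {
                    long allToAll = n * (n - 1);   // every node notifies every other node
                    long tree = 2 * (n - 1);       // up to the root and back down once
                    System.out.printf("%10d %20d %15d%n", n, allToAll, tree);
                }
            }
        }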

    The tools and techniques used to slow that growth --- thinking about the problem, designing efficient algorithms --- are features of the human mind, and not any particular language.

    Saying that one language is better at scaling than another is like arguing that one human language is better for building cars than another!
