Technology

Beowulf In Business

"Cnet has a story on how businesses are starting to use Beowulf for those heavy duty tasks," writes NIB. The story notes, "Beowulf wouldn't be good for a program that executes large numbers of transactions, such as an airline reservation system. It would, however, work for business tasks such as deciding how to design an assembly line or which mix of currencies to buy, said International Data Corporation analyst Dan Kusnetzky."
  • by Zurk ( 37028 )
    MOSIX can do what Beowulf cannot (transparent process migration) and it's GPLed too... they should have at least commented on it...

  • Investors look for experience in both technology and business. What is yours?

    My background includes a couple of years at college spent studying management science, several years working with unix systems and TCP/IP networks, both LAN and WAN, and stints as a technical sales manager and, later, project manager, at a telco. I'm currently working as a systems engineer with mid-range unix servers (e.g. Sun e450 up to e4500's, SGI Origin 2k's, etc.) including relevant technologies such as FCAL and clustering.

    BTW, there is plenty of direction: talk to Paralogic or Alta Tech.

    Both are cool companies, in my opinion, but they're not in quite the same market I want to be in. As regards the technology, obviously, I'm not privy to the R&D that's going on at these companies - it's very possible that they're working on ideas which are similar, or indeed identical, to mine, in which case I may well have to tear up my plan and wait for the next idea to pop into my head.

    My main skill, you see, isn't a technical or administrative one - it's my creative ability to come up with new ideas, to look at things in a different way to others, to learn about things I don't know about, and to solve problems.

    D.
    ..is for defunct.

  • Yeah, that was one of the suggestions, but we need the source for a chess program to recompile for the project and then, as you point out, we need someone to actually challenge it.

    This was to be an ad hoc "Stone Soupercomputer" style configuration built out of machines brought to the conference by attendees.
  • ...I don't think it has all that many applications outside of scientific research...

    One big one is simulations for financial calculations. One such is roughly this: the price of some class of security is sensitive to interest rates. So you want to see what happens if, at several time steps, the interest rate goes up or down some small amount. Evaluating the different 'paths' of interest rates over time lends itself to parallel processing.
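
    To make the parallelism concrete, here is a minimal sketch of evaluating many independent interest-rate paths at once, assuming Python and its standard multiprocessing module; the rate model, step sizes, and payoff are purely illustrative and not taken from any real pricing system.

        import random
        from multiprocessing import Pool

        STEPS = 12        # time steps per path
        PATHS = 100_000   # independent interest-rate paths
        R0 = 0.05         # starting rate (illustrative)
        SHIFT = 0.0025    # size of an up/down move per step

        def discounted_payoff(_):
            """Walk one rate path and discount a fixed 100-unit cash flow along it."""
            rate, discount = R0, 1.0
            for _ in range(STEPS):
                rate += SHIFT if random.random() < 0.5 else -SHIFT
                discount /= (1.0 + rate / STEPS)
            return 100.0 * discount

        if __name__ == "__main__":
            # Every path is independent, so the work spreads trivially
            # across local workers (or, on a cluster, across nodes).
            with Pool() as pool:
                values = pool.map(discounted_payoff, range(PATHS))
            print("estimated price:", sum(values) / PATHS)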


  • I can only assume that you're referring to the fact that there are many nodes in a cluster, as opposed to one large server.

    If this is the case, and you're saying that this increases the chances of the entire system crashing, then you're wrong and I would even go so far as to say that you don't have a clue what you're talking about.

    Small wonder you're posting as an AC. I'd be embarrassed too if I were that stupid. :-)

    D.
    ..is for dunce.

  • True, but this is merely a design hurdle. The thing to note here is that transaction speed is not really mission-critical; if it takes half a second to complete a transaction, that is fine. A bunch of dedicated file servers with partial databases, set up to use a network hash table (certain database keys on certain systems) on a very high bandwidth backbone, could drive quite a lot of transactions. While it would be a "modified Beowulf" or some such, a scheme like this would work quite well.

    As long as there was not one huge shared database, of course. And to dump old databases to semi-offline storage, a second backbone could be installed in these file servers to push the data onto backup servers, which would merge the databases again, and write them out. A lot of investment in hardware, but much less than a similar proprietary system.
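
    As a rough illustration of the "network hash table" idea above (a sketch only, with hypothetical node names, not a description of any real product), each record key can be hashed to the node that owns it, so every front end agrees on the owner without a central lookup:

        import hashlib

        NODES = ["db-node-0", "db-node-1", "db-node-2", "db-node-3"]

        def owner(key: str) -> str:
            """Map a record key to the node responsible for storing it."""
            digest = hashlib.md5(key.encode()).digest()
            return NODES[int.from_bytes(digest[:4], "big") % len(NODES)]

        # Deterministic, so any node or front end computes the same answer.
        print(owner("reservation:AA1234:2000-09-01"))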

  • There are Linux clusters that, if not technically Beowulfs, are at least very close to being Beowulfs, and they employ a shared-disk scheme. Sadly, these schemes are just as impractical for many large data applications as the shared-nothing approaches.

    There are going to be some interesting developments in this area for Linux soon, even if it means I'm going to have to start them myself.

  • Besides being very poorly worded, your argument that "The definition of a Beowulf requires an 'open source' OS" is simply incorrect. If you look back at why/when Beowulf was created (http://www.beowulf.org/intro.html), you'll find that it grew from "their [the creators of Beowulf] idea of providing COTS (Commodity off the shelf) base systems to satisfy specific computational requirements."

    Whether or not they used an open-source operating system is not the point. The goal was to provide an MPP system for as little cost as possible. If a collection of Tru64 UNIX workstations operating in a Beowulf cluster provides more computing performance at less cost than a similar system from a major MPP vendor, then it seems to meet the criteria for why Beowulf began. Sure, it might cost more than the same Alpha workstations running Linux, but it is also likely to perform better. Life's little tradeoffs are everywhere, aren't they?

    Cheers,
    David Hull
    david.hull@england.com
  • With new distributed computing software (MOSIX, anyone?), more and more people are going to write software for clusters. There are definitely issues with DB coherency, record locking, etc., but solutions will be implemented; after all, a cluster is pretty much the only way to increase throughput if an SMP box is not fast enough for you...
  • I have set up a few Beowulf machines for S&G. I used PVM [freshmeat.net], RH Linux 6.0/5.2, a 10/100 switch, and about 6 boxen. It worked quite well, except that it took a few days to get it operating how I wanted. I wrote a couple of applications to crunch numbers across the cluster, tested throughput, etc. For even more S&G I used MP3PVM [freshmeat.net] to rip a few CDs real fast. Fun!

    Now this is all well and good, but wouldn't it be great if we could have a transparent virtual machine that runs across all the nodes? Something on which you could use "/bin/bash" as your command shell.

    Now, I am not sure how this would be accomplished -- for instance, how you would efficiently share memory across machines or decide how to break up tasks (breaking on threads would be one way); this is just to open up conversation.

    Imagine: lower your SETI@Home WU time to mere seconds :) (is it fair to run a distributed computer under a distributed computer?)


    -AP

  • Strange, they said that it wouldn't be good for large volume transactions. Isn't this exactly the sort of task that works really well concurrently? It seems to be the perfect candidate: lots of non-interconnected tasks, perfect for parallel execution.
  • Large volume OLTP has a huge amount of inter-connection at the data level. You have to lock and unlock all the records to maintain ACID properties. Beowulf is a shared-nothing approach and doesn't have facilities for sharing all the disks with appropriate concurrency control.

    The largest airline systems all use IBM's Transaction Processing Facility (TPF), which is a specialized real-time OS for mainframes. TPF shares disks amongst all processors in the cluster and pushes the locking down to the individual disk controllers (specialized microcode). Where I work, we get about 200,000 physical I/Os per second to the disk farm, using TPF on eight mainframes and several TB of disk.

    Still, there's nothing about large-volume OLTP that Linux couldn't do; it's just a matter of programming. I, for one, would like to see it happen.
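
    To illustrate the record-level locking being discussed, here is a toy, single-machine sketch in Python; a TPF-style system pushes this down into disk controllers and spans many processors, which this does not attempt to model, and all names here are hypothetical.

        import threading
        from collections import defaultdict

        record_locks = defaultdict(threading.Lock)   # one lock per record key

        def move_passenger(flight: str, from_seat: str, to_seat: str) -> None:
            """Update two seat records atomically with respect to other threads."""
            assert from_seat != to_seat
            # Acquire the locks in a fixed (sorted) order to avoid deadlock.
            first, second = sorted((f"{flight}:{from_seat}", f"{flight}:{to_seat}"))
            with record_locks[first], record_locks[second]:
                pass  # read, modify, and write both seat records here

        move_passenger("AA1234", "12C", "14A")
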
  • It may have something to do with the fact that databases need to remain "consistent", such that only one operation is performed on a record at once. The problem might be in making sure that only one box "owns" a record at once. You would need to make that record unavailable to all the other nodes, or let them know not to use it. If the network is high latency, it could be a problem.

    of course, with gigabit ethernet... :)
    "Subtle mind control? Why do all these HTML buttons say 'Submit' ?"
  • I think the problem with the transaction systems is that they top out on a different bottleneck; CPU isn't the major gating factor. Multithreaded applications will take great advantage of this type of system. One application that I worked on in a previous life was a credit card limit verification system for a major player. They had a 1-second transaction turnaround specification. We ended up setting it up with discrete machines with a failure rollover mechanism involved. Much of the coordination we had to design would have been far easier in a coupled system like Beowulf.

    There's a movement on to put together a large Beowulf cluster for the Boston Geekfest in October. One of the things we're trying to come up with is a good demo that actually shows something to the crowd. We've had ideas ranging from realtime rendering of POV scenes to decryption (yeah, right, watch it hum for 20 hours and then spit out the true key) but haven't come up with a "killer demo app". Email me if you have any ideas.
  • Gigabit ethernet just increases the bandwidth; it doesn't do much for the latency. Most of the latency is in software (TCP/IP etc.) that shared memory systems avoid.

  • "Cluster", "Linux" and "Beowulf" are popular buzzwords at the moment, and I can see a bandwagon developing. However, it's a bandwagon which isn't going anywhere yet, because it doesn't have direction.

    Beowulf is an interesting technology, but I don't think it has all that many applications outside of scientific research. For Linux clustering to achieve credibility as a viable means of replacing mainframes and high-end servers, a more balanced architecture, providing for high availability and ease of administration, needs to be developed.

    Luckily, I have the answer, am currently preparing a business plan and intend to begin seeking out interested parties in a month or two. If there are any venture capitalists out there who are interested in investing in a venture with more than hype and PR behind it, let me know. :-)

    Meanwhile, some guy at Dell UK is inviting people to participate in building a large Beowulf [beowulf.org] at Dell's "Proof of Concept Lab" in Limerick, Ireland, at the beginning of September.

    By an amazing coincidence, I'd already booked those two weeks off to go home and catch up with the family. A trip down to Limerick is on the cards, methinks...

    D.
    ..is for Dastardly.

  • What does it matter? I thought most airlines overbooked seats anyway. ;)

  • The problem might be in making sure that only one box "owns" a record at once.

    Not a problem. Whilst I don't intend to tell you how I solved this one, I will give you a hint - don't think of it as a programming issue; think of it as an architectural issue.

    D.
    ..is for Devious.

  • I think that the problem is the performance of a distributed DB, like the one you need in this kind of environment... but a distributed FS may solve the problem :))
  • Now all I need to do is get ahold of about 100 of those POWER4 IBM chips when they are released, build 25 quad-processor 1 GHz machines with a 500 MHz bus and 1 GB of memory, and throw 'em all in a cluster. Add 2-10 terabytes of secondary storage and multiple OC-3 or faster connections, and start leasing space on the fastest machine in the world. Handles 156E+10^8 hits/sec while doing recursive database lookups.

    heh... If Only...

  • by Anonymous Coward
    Seti is a good example of the type of data which lends itself well to being handled by a cluster. However, in this case you could just run the seti@home client on each of the 386's and get the same result. It's not a cluster, but seti@home is a great example of distributed computing.
  • by Anonymous Coward

    I just read the article. As a manufacturer of "turn-key" Beowulf systems, here was my reply to the author:

    Stephen,

    I just read your story about Beowulf systems. While the story was well written and informative, there are some points that you have missed.

    1) The definition of a Beowulf requires an "open source" OS (see "How to Build a Beowulf" by Sterling, Becker, Salmon, Savarese). Therefore, systems built from Tru64 are NOT Beowulf systems.

    2) You missed my company, Paralogic Inc. We sell turnkey Beowulf systems. In fact, rather than "several" as reported by IBM, we have several dozen installed production systems at companies like Lucent, Amerada Hess, Conoco, and Procter and Gamble, government sites like NASA, NRL, and the Air Force, and many universities. (see www.xtreme-machines.com [xtreme-machines.com])

    3) There is a rather huge barrier to entry because of the technical nature of these machines. As far as I know, we are the only company that will offer support for Beowulf clusters. Without support, the market can never enter the mainstream.

    4) Quite a few people other than IBM and VA Linux have contributed a great deal of effort to Beowulf technology. Although all contributions are welcome, these guys are a little late to the party, and we hope they stay.

    Sincerely,

    Douglas Eadline, Ph.D.
    President

    Paralogic, Inc. [plogic.com]
    PEAK PARALLEL PERFORMANCE

  • hehe.. no joke.. I have a stack of 5 386's w/ 8 MB (I think) that I've been threatening to turn into a cluster. But what would I use it for? Maybe get a seti@home client running? :)

  • One big one is simulations for financial calculations.

    You're absolutely correct - my oversight, and quite a large one, considering I've done this sort of modelling in the past - multiple regression analysis is one of the things I picked up whilst studying operational research.

    It actually comes in kinda useful for dealing with performance issues on large systems with lots of users, and I've spec'd systems and laid out upgrade paths based on the results of MRA...

    But, I digress....

    D.
    ..is for Deviant.

  • by male ( 71469 )
    I just got three P120s with an unknown amount of RAM and hard disk space, but it will probably be at least 8 megs and a gig...
    Other than learning a new technology, I don't have a real use for a parallel processing machine; I just do basic PHP -> MySQL stuff with small DBs...
    However, it sounds like a really cool thing to set up, and I want to learn. A lot of these sites talk about Beowulf a lot, but don't give an explanation of how to set one up! Either I'm looking at the wrong sites or I just don't know how to do it..
    Can anyone point me in the right direction or give me some tips on doing this? And responses like *just give me those three PCs* are appreciated but will be ignored :)

    Thanks for the help
    jc

    --yep. i'm a NEWBIE!!!
  • In follow up you may want to look at the following books:

    How to Build A Beowulf : A Guide to the Implementation and Application of PC Clusters;
    Sterling, Thomas L. / Becker, Donald J. / et al.

    http://www1.fatbrain.com/asp/bookinfo/bookinfo.asp?theisbn=026269218X

    High Performance Cluster Computing : Architectures and Systems, Volume I; Buyya, Rajkumar

    http://www1.fatbrain.com/asp/bookinfo/bookinfo.asp?theisbn=0130137847

    High Performance Cluster Computing : Volume 2, Programming and Applications

    http://www1.fatbrain.com/asp/bookinfo/bookinfo.asp?theisbn=0130137855

  • Beowulf is best for CPU-intensive tasks which can be broken up easily, don't require a lot of inter-node communication, can deal with relatively high latency on the inter-node communication, and can deal with single node failures easily.

    This is a relatively large domain of problems, but it doesn't work for everything. A lot of business applications require high reliability and availability. If you use Beowulf, you have to implement these features for your application on your own.

    The simulations that businesses are running on these things aren't really in the same league. For the most part, they aren't time critical, and if a failure occurs that invalidates a test run, it can usually be rolled back to some midpoint and started again without a significant loss of time.

    Beowulf isn't just useful for CPU intensive tasks though. All those processors also provide significant amounts of memory bandwidth and all those machines provide potentially large amounts of disk storage and bandwidth, but again, you need memory or disk intensive tasks that can easily be split out to many loosely coupled nodes.
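
    For the kind of task described above, here is a minimal sketch (Python with the standard multiprocessing module; the computation itself is just a placeholder) of splitting CPU-bound work into independent chunks that need no inter-chunk communication:

        from multiprocessing import Pool

        def crunch(chunk):
            """Stand-in for an expensive, self-contained computation on one chunk."""
            return sum(i * i for i in chunk)

        if __name__ == "__main__":
            data = list(range(1_000_000))
            # Each chunk is independent; a failed chunk can simply be re-run elsewhere.
            chunks = [data[i:i + 10_000] for i in range(0, len(data), 10_000)]
            with Pool() as pool:
                partials = pool.map(crunch, chunks)
            print(sum(partials))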

  • Can anyone point me in the right direction...

    The Beowulf Underground [beowulf-underground.org] site is a good starting place.

    D.
    ..is for Deranged.

  • Gigabit ethernet increases bandwidth, which has a minimal impact on latency, because much network latency is the result of processing overhead, with a notable contribution from context switching as the data moves from hardware driver to network stack to usermode process. Cutting out these middlemen will help a lot.
  • Maybe you could set it up as a chess engine? It really doesn't need to be Beowulf, but...

    The problem is that there will probably not be anyone there who could even beat a single computer |).

    LINUX stands for: Linux Inux Nux Ux X
  • Quite a lot of manufacturing design utilizes DES (discrete event simulation) models, and in my experience (which is fairly extensive in the semiconductor industry) these models contain vast amounts of detail. As a result they run REALLY REALLY SLOWLY. We're talking DAYS for a single replication of an experiment (you have to vary your random seeds and average across replications to eliminate the effects of "pseudorandom" numbers). What's worse, since these models are strictly time-based, they're not parallelizable to any degree. So a Beowulf cluster buys you....nothing. You just need many CPUs and TONS of RAM to run the experiments in parallel...but you can't parallelize a single run.

    OTOH, if you're doing linear or integer programming, those are parallelizable if you're doing branch & bound stuff. But LP/IP doesn't always give you the granularity you need to make decisions as accurately as DES does (and can't account for the stochastic and dynamic nature of manufacturing processes).
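
    A minimal sketch of the workaround described above, assuming Python: a single run cannot be parallelized, but independent replications with different random seeds can be farmed out side by side. The "model" here is a trivial placeholder, not a real discrete event simulation.

        import random
        from multiprocessing import Pool

        def one_replication(seed: int) -> float:
            """Run one replication of a toy queue model with its own random seed."""
            rng = random.Random(seed)
            queue, total = 0, 0.0
            for _ in range(100_000):   # stand-in for a long, strictly sequential run
                queue = max(0, queue + (1 if rng.random() < 0.5 else -1))
                total += queue
            return total / 100_000

        if __name__ == "__main__":
            with Pool() as pool:
                results = pool.map(one_replication, range(10))   # ten replications at once
            print("average across replications:", sum(results) / len(results))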
