Stories
Slash Boxes
Comments

News for nerds, stuff that matters

Slashdot Log In

Log In

Create Account  |  Retrieve Password

The Subtle Tyranny Of Spreadsheets

Posted by timothy on Wed Mar 31, 2004 05:20 AM
from the step-into-my-elaborate-lair dept.
pipingguy writes "I found this link on a CAD-related mailing list which questioned the current state of spreadsheet usage. Since using spreadsheets is often only one step away from PowerPoint mastery, I thought it worthy of submission." An excerpt: "The second distortion caused by conventional spreadsheets is more subtle. It's described in a 1980s paper, written by university researcher Jeffrey Kottemann and others concerning what they called 'Performance, Beliefs, and the Illusion of Control.' The paper described an experiment in which subjects were asked to perform a planning task using different tools, some of them with elaborate what-if capability and others without it." Yup, it's a ZD/Yahoo link, but it raises good questions."
+ -
story
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More
Loading... please wait.
  • back in 1997 when I was a physics exchange student in Glasgow, they made me solve a *quantum machanics* problem using excel! it was ridiculous. I kept the spreadsheet just for its absurdity (it's the only .xls file on my entire harddrive)

  • A manager at a company I worked for was presenting figures for the last year. He showed the financial breakdown for each division, with the profit being calculated as a percentage for each division. At the bottom, there was a summary line showing the total figures for the company and including the "average profit" for the company.

    Which he had calculated by summing the profit column and dividing by the number of divisions.

    I mentioned that this was producing a somewhat unrealistic figure, with a couple of small divisions showing very good profit margins and the largest department showing a slight loss. "No, that's the mathematical definition of 'median'," he answered.
  • by Sanity (1431) on Wednesday March 31 2004, @05:37AM (#8723700) Homepage Journal
    ...is that most people don't really understand statistics, and tools like spreadsheets help people to forget this reality by blinding them with lots of authorative looking numbers.

    The question is whether a tool can ever be a substitute for a good understanding of statistics and probability - or whether it will always be a case of monkeys playing with ever more sophisticated typewriters...?

    • by kidgenius (704962) on Wednesday March 31 2004, @05:59AM (#8723800)
      No. A good statistics knowledge is fundamental in making sure that the data you are putting in is valid. As the old adage goes, "Garbage In, Garbage Out."
      If you are just mindlessly putting stuff together and say "I think a median/mean/standard dev would go good here" then it's obvious that you shouldn't be doing statistical analysis. Also, after the numbers have been calculated, you need to understand what the significance of them are. I work in a highly statistical field (Reliability Engineering) and I will say that at times it really is a black art. Things may at first look good/bad, but until you sit down, and think about what it all means, you will have way of knowing whether what you just got out of your analysis is "correct."
        • by mst76 (629405) on Wednesday March 31 2004, @09:56AM (#8725109)
          Do you really know what the standard deviation is? For example, you know that most standard tools for calculating standard deviation it assume the data has a Gaussian distribution. But what if your data poisson distributed, or hypergeometric, or maxwell-boltzmann... [...]
          That is not correct. The standard deviation of a random variable is the square root of its variance. The variance is the squared expectation of the centralized variable (variable minus its mean). Calculating the variance of a random variable involves integrating (or summing) the probability density function p(x) times x^2. If it's a well known distribution, one would usually look it up in a book, or try to solve the sum or integral (by hand or with Maple or Mathematica). Note that the only thing you need to calculate the standard deviation is the distribution or density function, no actual data is involved.

          The things listed as mean and standard deviation in Excel are sample means and sample standard deviations. If you have a list of numbers, and you assume that they were drawn from some distribution with finite expectation and variance, you can calculate the sample mean (simple average), which is an estimate of the expectation. Then you substract this mean from all your numbers and take their squares. The average of that is an estimate of the variance of the distribution, take the square root for the standard deviation. The nice thing is that these estimates converge to the expectation and variance regardless of distribution. If you do make the additional assumption of Gaussianity, you can also say something about the rate of convergence and the distribution of the estimate. But the basic formulas in Excel and such for sample mean and variance work for any distribution that has a mean and variance.
          • by misterpies (632880) on Wednesday March 31 2004, @06:50AM (#8723966)
            >> And what the *fsck* does it mean, even if the data has a Gaussian distribution?

            that's what wikipedia is for :)

            it's a measure of the width of the distribution. Given a gaussian distribution, a random measurement will occur within one standard deviation of the mean with a probability of around 68%. Or to put it the other way round, if you have data and are trying to calculate the distribution, there's about a 68% chance that the true value of the mean falls within 1 s.d. of the value you calculated.

            If you don't have a gaussian distribution, you can still calculate a standard deviation but it will not have the same meaning with respect to the probability of you having got the right mean.

            This is of particular relevance to spreadsheets, since they're often used to do calculations on financial data such as stock prices and most financial data is not gaussian--it's 'log normal', meaning that the logarithm of the data values are gaussian, but not the data itself. So most people doing standard deviation calculations on such data are probably completely misinterpreting the results...

  • by farghen (759198) on Wednesday March 31 2004, @05:37AM (#8723701)
    "The expected Year 1 profit is $1 million, but there's a 30 percent chance of losses for the first two years."

    Unfortunately or not, this is not what the bosses want to hear. They want to know that profits will be $1 million. Perhaps the spreadsheets have not adapted to uncertainties for a reason.
  • by pubjames (468013) on Wednesday March 31 2004, @05:38AM (#8723705)

    This reminds me of something a successful businessman told me about accountants: "Accountants know the cost of everything, and the value of nothing".

    A problem occurs when people look at a spreadsheet of accounts and think it represents a business. It doesn't. A classic illustration of this is Marks & Spencer's returns policy. If you buy a pair of trousers from Marks & Spencers and then once you've got them home decide they don't fit or whatever, you can return them, no questions asked. To an accountant, this is just a cost. There is no identifiable figure in the accounts that you can point to and say, there's the benefit of that cost. And yet many people shop there because of the policy.
  • by rufusdufus (450462) on Wednesday March 31 2004, @05:43AM (#8723718)
    Of course this is actually an advertisement for a specific software package. But whats funny is that the story undercuts itself: It explains that people are wasting their time doing detailed future predictions with spreadsheets. Then it goes on to push this particular product as a way of doing detailed future predictions using statistics. But they never make the case that making predictions is good anyway, while they do provide evidence that its a waste of time!

    I dont know anyone who uses their spreadsheets for doing any kind of predictions. Everyone I know uses it just like the old-fashioned pen-and-paper..spreadsheet! Its a way of accounting for the here-and-now. How many businessmen don't understand their business prospects better than a garbage-in-garbage out number crunching computer?
  • by Technician (215283) on Wednesday March 31 2004, @05:44AM (#8723721)
    A spread sheet is not a stastics program. However if your office bundle includes a hammer, everyting starts to look like a nail. Excell does math, It's the hammer that makes stastics look like a spreadsheet problem. Enough said? Hammer - nail, Excell - spredsheetable data. For stastics programs look here for a list of some real stastics programs. They are not spreadsheets.

    http://www.wch.org.au/CEBU/software.htm

    I guess it's kind of like trying to write HTML with MS Notepad. It can be done, however other tools make the job easer.
  • by shic (309152) on Wednesday March 31 2004, @05:44AM (#8723725)
    Spreadsheets have been and will always continue to be an extraordinarily powerful ad-hoc tool for those wishing to tabulate data with automated calculations. They are worse than useless if, for whatever reason, the user has no savvy approach to the problem at hand, or if the model which requires manipulation has no concrete representation.

    After many years with little use for a spreadsheet (previously having used Supercalc and Lotus 123) I was shocked by corporate state of the art. Specifically, I was disturbed by the type system employed to represent cell values and by the way in which formatting settings can so easily obscure the values actually being processed. The way in which Excel handles dates seems particularly horrific... and OO-Spreadsheet just mimics the same mistakes. I was also amazed that modern spreadsheets haven't started to use extensible libraries to represent new data types. It seems a no-brainer for a spreadsheet to make use of pluggable C# or Java classes to allow domain specific types to be manipulated in the context of a spreadsheet environment. Am I missing something - or have we not only failed to advance the art (as suggested by the article) but actually taken several steps backwards?

  • Financial Planning (Score:5, Insightful)

    by awol (98751) on Wednesday March 31 2004, @05:45AM (#8723729) Journal
    The scope of the article is really limited to the use of spreadsheets in financial planning (forecasting). For which the criticisms of the author and the material he cites are pretty valid. Indeed we all have our pet hates when it comes to how the tool is used (you have no idea how much of the financial world is ruled by this spreadsheet or the other driving trading decisions!) however, the tabular representation of data is not inherently broken and it behooves the computer scientists amongst us to ask why this form has usurped the database for the representation of simple datasets and all to frequently complex ones.
  • "When you're holding Excel, everything looks like a spreadsheet" [yoz.com] by Yoz Grahame

    I particularly enjoyed it, and it made me wonder why I've always hated Excel. maybe it's time to forgive...

    (I always used to like Pipdream on the Archimedes though. That was a combined spreadsheet and word processor).

  • by joonasl (527630) <joonas DOT lyytinen AT iki DOT fi> on Wednesday March 31 2004, @05:47AM (#8723742) Homepage
    I think that the root of the problem is that many people who are not IT professionals are thought only how to use spreansheets (MS Excel) and word processors (MS Word) in college/university. Since their "toolkit" is so limited, they tend to do all possible tasks using those programs, even if they are not the best possible choices. I currently work in technology solutions branch one of the big consulting companies, and you can't belive what the business major managers use Excel here for.

    So far I have seen Excell used for issue mangement, system requirement repository, time tracking, time estimation, code dependency tracking, system reference data and configuration data repository, ...
    ..and in 99% of the cases the spreadsheets don't even use the SUM function.

  • by beacher (82033) on Wednesday March 31 2004, @05:49AM (#8723756) Homepage
    To anyone that has Excel '97 - On a new Worksheet, Press F5. Type X97:L97 and hit enter. Press the tab key. Hold Ctrl-Shift. Click on the Chart Wizard toolbar button. Use mouse to fly around - Right button forward/ Left button reverse.

    Excel 2000? Under file menu, do 'Save as Web Page'. Say 'Publish Sheet' and 'Add Interactivity'. Save to some htm page on your drive. Load the htm page with IE (don't give me any grief over this one- you're already screwing around with Excel so I don't want to hear it ). You should have Excel in the middle of the page. Scroll to row 2000, column WC. Select row 2000, and tab so that WC is the active column. Hold down Shift+Crtl+Alt nad click the Office logo in the upper-left. If you have DirectX, you will be playing what looks like spy hunter. Use the arrow keys to drive, space to fire, O to drop oil slicks, and when it gets dark, use H for your headlights. -B
  • Suspect citation (Score:5, Insightful)

    by dtmos (447842) on Wednesday March 31 2004, @05:57AM (#8723788)

    A quick google search reveals evidence of only one paper (but not the paper itself, unfortunately) entitled, "Performance, Beliefs, and the Illusion of Control", see, e.g., here [umd.edu]:

    Kottemann, J.E., Davis, F.D., & Remus, W.R. (1994). Computer-assisted decision making: Performance, beliefs, and the illusion of control. Organizational Behavior and Human Decision Processes, 57, 26-37.

    Note that this paper was published in 1994; it's not a "1980s paper" as cited in the article. Careless errors like this make one wonder what else in the author's train of thought is similarly researched. Perhaps he's just incorporating incertainty into his references, too--or, maybe he considers 1994 to be statistically similar to the 1980s?

  • by Anonymous Coward on Wednesday March 31 2004, @05:58AM (#8723794)
    "Enhancements" of spreadsheets over the last few years have not involved any substantive improvements in functionality, but have primarily just involved enhancing their "typesetting" capabilities, that is, the ability to change fonts, insert special formatting, and to otherwise make tables look "pretty."

    I put "enhancements" in quotes because I am skeptical that this actually represents a true improvement of either the quality of the information or user efficiency in finding and using information.

    These so-called improvements gloss over the continuing problems that plague spreadsheet users:
    • Spreadsheet models encourage the use of "spaghetti" logic, where cells point to cells that point to cells, and can grow into random networks of calculation logic;
    • They permit lots of easy off-by-one errors;
    • They generally are difficult to verify/audit;
    • They do not provide good tools for managing data either in terms of consolidation or searching for specific detail;
    • Perhaps most importantly, despite their convenience, spreadsheets are not a robust repository for information.
    I have seen one multinational enterprise that (believe it or not) built a budgeting system atop sets of dozens of departmental spreadsheets that they would roll up into a master budget; while it's a neat extension of the technology, only a fool would try to use this to run a large enterprise. One bad link in one subsheet, and the whole house of cards could fall down. (And the "top" vendor these days, Microsoft, isn't noted for building products that are of industrial grade robustness.)

    The last few points point towards where I would like to see spreadsheets go. They have been, and are very good at producing ad-hoc, one-off reports. This is a proper use of spreadsheets.

    They are often being used instead as repositories for information that really ought to be managed by a database management system of some sort.

    What spreadsheets should do is to allow, nay encourage, the use of data extracts from external sources, notably relational databases. The use of named ranges (which are a venerable feature from at least as early as Lotus 123 v2.01) is of assistance; Lotus Improv was a rather complex-to-use test platform for improved "modelling" whose functionality included database extraction.

    Using external repositories permits the benefits of:
    • A single repository that can be kept correct, rather than a multitude of mutually incompatible data stores;
    • Data synchronization (a restatement of the last);
    • All the good RDBMS "stuff" like:
      • Field validation,
      • Maintaining field relationships,
      • Transaction logging,
      • Centralized backups,
      and perhaps even more sophisticated things such as
      • Data modelling and
      • Stored Procedures/Triggers
    In effect, the real point I would propose is that the task of building a spreadsheet should involve some data modelling, with thought not just about the report at hand, but also about where the data comes from and perhaps should go to.
  • by Anonymous Coward on Wednesday March 31 2004, @06:00AM (#8723804)
    Spreadsheets suffer from programming flaws that we've ruthlessly stamped out in programming languages.

    Some of these flaws are :
    - Cryptic names for fields
    - No comments
    - No obvious flow of control
    - No modularisation
    - No capability to test spreadsheet sub-components in isolation
    - No capability to do a diff to see what's changed between versions

    Spreadsheets also add flaws of their own, such as unlocalised references.

    If we had to design the worst possible "programming language" we'd be wise to look at spreadsheets for an example of what to include.
  • by Trurl's Machine (651488) on Wednesday March 31 2004, @06:04AM (#8723814) Journal
    I was recently on the market for a new car (hoorrray!). I shortlisted three vehicles for me to consider and I asked the salespeople of the respective companies to mail me data on service plan, warranty, replacement part prices etc. on all the three vehicles. I got two replies with Excel documents and one with a printer-friendly PDF.

    I am all for open standards in communication, but what shall I do? Send a reply to the salesman "you f*ing Microserf moron, I don't want your car if you force me to buy a bloody spreadsheet just to read how much do you charge for a goddamned air filter?" But is it wise to choose a car just because of the software that a salesman uses?

    Finally I picked the one that was described in PDF. It was a coincidence - a decisive factor was actually that the make of that car constantly tops in the consumer surveys, while the other two are just about average. But then I started to think - maybe that's not a coincidence after all? Maybe this make tops in surveys just because it's policy is to make all stages of customer experience as convenient as possible and they ask themselves the question that other car salesmen don't ask - "what if my prospective client does not use Microsoft Excel(TM) or Microsoft Word(TM)?".

    Maybe it is possible for us to vote with our wallets against proprietary, closed standards?
  • by Anonymous Coward on Wednesday March 31 2004, @06:24AM (#8723887)
    At university, I am taking a course in business modelling. We use Simul8 s/ware to generate thousands of monti-carlo 'runs', then analyse the results as if they were real data.

    But it's not real data! It's completely determistic, even with a pseudo-random generator. The only things we deal with are simple supply-chain networks, which are just malkov-chains with a few probability distributions. We're using 2000 pounds worth of s/ware to solve high-school statistics problems :-/

    You'd get the same results, and have real justifications for the numbers, by using an HP Calculator and a pencil. Alarmingly our lecturers have yet to explain what any of the distributions mean, but they keep using words like 'proof' and 'verify'.

    I'm back to linearly regressing my calculated data. It's insane, they're all insane, one day the sane people will rule, wibble ...
  • by gomel (527311) on Wednesday March 31 2004, @06:39AM (#8723934) Homepage Journal
    From the article:

    The first distortion is the use of point values and simple arithmetic instead of probability distributions and statistical measures. So far as I know, there's no off-the-shelf spreadsheet product--certainly none in common use--that provides for input of numbers as uncertain quantities, even though almost all of our decisions rest on forecasts or on speculations.

    I am a student of this university : http://www.sgh.waw.pl/
    Currently I am having a course in the use of Excel for prediction purposes. We do a lot of different case studies. We use Monte Carlo simulations, statistical tests, Markov chains and so on. We always discuss risk (variance, value-at-risk and so on). Excel is our basic tool and it is fine. We use different tools for specific purposes: Best-Fit for distribution fitting.

    It is not a flaw of the tool, it is a flaw of the user. As someone said, give a monkey a PC instead of a type writer and you will get digital bullshit. I can only demand that people without proper education are not allowed to deliver multi-million business forcasts.
  • "Powerpoint Mastery" (Score:5, Informative)

    by foobsr (693224) on Wednesday March 31 2004, @06:50AM (#8723967) Homepage Journal
    Since using spreadsheets is often only one step away from PowerPoint mastery.

    Erm .. for "Powerpoint Mastery" have a look at Tufte [edwardtufte.com] "The Cognitive Style of PowerPoint".

    Yes, I know it was discussed here before (as I guess), but still - it is worth a mention.

    CC.
  • by kahei (466208) on Wednesday March 31 2004, @06:51AM (#8723972) Homepage

    1 -- the article is a content-free advert for Whitebirch's financial toolkit

    2 -- Excel is an incredibly powerful and important piece of software which many if not most large corps can't do without. There is no alternative to it. The fact that it's unpleasant to use is beside the point -- nobody has been able to come up with a better (or even comparable) replacement. In my experience, there is a large segment of the IT community that is pathologically unable to focus on business needs enough to understand this.

  • A little skeptical (Score:5, Interesting)

    by astrashe (7452) on Wednesday March 31 2004, @06:52AM (#8723974) Journal
    I saw my first spreadsheet on an old Osborne computer. My dad knew a guy who bought small banks, and he had the Osborne and VisiCalc.

    Before this guy could buy a bank, he had to value them, and his valuations were always based on a few guesses (predictions) -- what interest rates would be, or whatever (I don't know exactly how he did it).

    He told me that when he started doing this stuff with a normal calculator, a pencil, and paper, changing a guess took him a couple of days. Then he got a programmable calculator, and managed to cut it down to about 5 hours. With VisiCalc, it took a few seconds.

    The point being that both the programmable calculator and the spreadsheet software gave him an edge in his work -- they made him better at buying banks. They paid for themselves.

    *If* no one is using the sorts of software described in this article, and *if* the software really does make you better at making decisions, people should be able to use it to buy banks (or whatever) and do a better job than their competitors. It should give you a leg up in the market place.

    That's exactly what happened with spreadsheets. That's why they're popular. A lot of dumb people have started to misuse them, apparently (that sounds plausible to me), but there's no denying that they have provided and continue to provide enormous value to users.

    If this new stuff is better, then why isn't Warren Buffet using it? If the answer is "because he's too dumb", why doesn't someone else start using it, and outperform Buffet?

  • by Anonymous Coward on Wednesday March 31 2004, @07:01AM (#8723999)
    You know, whenever we're talking about software like P2P file sharing, or freeware DVD drivers, or software that opens Adobe files for backups, the Slashdot crowd tends to be firmly in the "don't punish the technology for abuse by the users" camp.


    And then we have these PowerPoint, Excel, yada yada threads where the Slashdot crowd tends to be firmly in the "don't punish the users, it's the fault of these evil software applications" camp.


    What's up with that?

  • by Anonymous Coward on Wednesday March 31 2004, @07:06AM (#8724014)
    Having worked as a front-office developer in a very large bank i can give a good example of how spreadsheets can be misused Excel spreadsheets were used by all traders on the desk i was supporting. They did not want to move to any other tool because only spreadsheets gave them the flexibility they wanted. The spreadsheets were absolutely HUGE, think direction 20 or more tabs, all with hundreds of DDE Links to Reuters RICS - complicated formulas hanging off these links producing tables of data each time a DDE link updated (about once a second on average). We had to install gigabytes of ram and dual CPU's desktops for them just so they could run their spreadsheets. Sure excel would crash every now and then, but not often enough to switch to a new solution.
    IT tried to introduce new more stable trading tools without success, not flexible enough-did not calculate "their" prices correctly-blahblah. Controlling tried to impose new tools on them to get a grip on their price calculation- all very difficult when the only data source is a "spreadsheet".
    The most insane thing that we tried was to write a spreadsheet parser that would traverse all cells, build a dependency graph, reparse the formulas inside to translate this to another programming language. Needless to say this failed.
  • Please! (Score:5, Insightful)

    by pkaral (104322) on Wednesday March 31 2004, @08:04AM (#8724249)
    The classical Slashdot debate features something-stupid-done-or-said-by-non-IT-savvy-gene ral-managers, and then the appropriate bashing by IT-savvy Slashdotters. If there were a similar forum where my profession were in majority, they would probably be bashing this very thread right now (I am an economist and business manager).

    Just like, say, PERL or Java, spreadsheets can be used well, and they can be used poorly. Furthermore, people with good "technical" Excel skills can produce lousy spreadsheets with little analytical value, and vice versa. I have seen some fantastic spreadsheets which have totally revolutionized the way people saw a problem. At an insurance company I worked with, they used a huge spreadsheet to do a simulation of the effects on every single customer of a planned, dramatic price increase. The result: They realized that the price increase would have much less impact than they feared. Thus, the product was kept and the employees kept their jobs. The thing with the spreadsheet was that it was developed in fast trial-and-error loops, which meant that their run-once-per-night SAS tools were not an option (this was 7 years ago).

    (I have, by the way, also seen people spend 3 months on developing a mega-spreadsheet for assessing the value of a company, only to use the wrong assumption for a critical value and thereby introducing an error of about 40% in the valuation [that critical value being the discount rate]).

    I can assure all the concerned citizens of this forum that there is indeed a lot of excellent, first-rate Excel usage out there. Analytical power beyond our wildest dreams is at the fingertips of people without skills in programming at any lower level. This, believe it or not, is a good thing, because anyone who has dedicated himself to becoming great at programming is probably less skilled in disciplines such as financial analysis.

    Sure, there is "bad code". Sure, people get a false sense of control. Sure, this new tool puts too much options in the hands of people who do not know how to use them. But how would that be untrue of other IT tools or programming environments? What does it matter that they use Excel as a database, as long as it gets their work done easier than getting an SQL education and then doing it "right"?

    Biases are part of all decision-making (as even economists are realizing [economist.com]). So what if that is the case in Spreadsheet World, too?
    • Re:please everybody (Score:5, Interesting)

      by zyridium (676524) on Wednesday March 31 2004, @05:26AM (#8723659)
      That makes absolutely no sense at all.

      Excel is perfect for creating lists of things, and being used as a way of storing simple data...

      If you want to use that data for other purposes or it is at all complex, then sure, don't use excel.

      What is a set of numbers, what about a list of data with associated figures, get real...
      • by REBloomfield (550182) <rebloomfield@gmail.com> on Wednesday March 31 2004, @05:27AM (#8723661)
        Problem being, that the data isn't 'related' in any sense, and when a user manages to move the data in one column, and then it's out of sync with everything else, they call me up and whine....
          • Re:please everybody (Score:5, Interesting)

            by smitty_one_each (243267) on Wednesday March 31 2004, @07:16AM (#8724042) Homepage Journal
            I've written a couple of applications that use .xls files as an interface.
            Idea being that you gan query some relational store, put lists of default values on a back tab, set named ranges to those lists, and then, on a front tab, use data validation to constrain the users to putting Correct Stuff in data rows.
            Oh, and there is no macro code in the .xls, so we don't run afoul of security settings.
            This is a back-to-the-future batch system. Blank forms go out as email attachments, and come back as email attachments. They are saved to a folder inexplicably named "inbox". When the time is right, we crack them open in turn and read them into our RDBMS, and then do reporting.
            If the .xls form is simple enough, in MS Access, you can have an .xls link table stub, and 'mount' each response in turn, and excecute straight SQL to read it in. Very fast and secure.
            More complicated stuff might require MS Access to instantiate Excel and open each .xls explicitely to map the response to the database.
            I've opened some of these .xls forms under GNUMeric with great results.
            Also, languages like Perl and Python can script COM objects like Access and Excel.
            Furthermore, as this is very stand-alone, you could use SQLite without concurrency issues.
            The biggest advantage of all is that you've blown off the whole web server mess. Obviously our problem domain is non-real-time, batch-able applications. But there are a lot of those. HTTP is great at what it does, but for shedule requests and what-I-did-this-week inputs (the two applications I've done in this mode), here is a way to do them that doesn't require much that isn't generally available and desktop-runnable.
            The other key is that most business people are fairly cozy with a spreadsheet interface, and die rapidly confronted with an .mdb similar. So the fear factor is reduced.
        • by B'Trey (111263) on Wednesday March 31 2004, @08:45AM (#8724493)
          This is an example of one person who does not understand how to organize information in a useful manner. It's a valid point, and quite frustrating if you're the poor sod who has to try to make sense out of his mess, but it doesn't invalidate the idea of using spreadsheets as simple databases. I've had to work with real databases that were designed by people who quite obviously had no clue what they were doing and had never heard of the concept of normalization. That isn't a strike against databases, it's just an indication of users in severe need of training.

          Excel as a simple database has a number of advantages. It's portable - most business users have access to Excel, and OpenOffice imports Excel sheets quite well. It requires limited knowledge to get some use out of it. Even unsophisticated users can usually manage a simpe search for the data they're looking for, and can update records. A well designed Access database can be easy to use, but it's much less portable. Not all users have Access, and I'm not aware of any Linux app which will open an Access database directly. (You can export the database from Access and then import it into MySQL or other database server, but that's obviously a great deal more work.) A database run on any of various servers with a custom front end, either web based or not, can be easy to use but is obviously much more difficult to attach to an email.

          If you need the power and robustness of a relational database, then use one. But for simple data collection functionality, particularly when portability is an issue, a spreadsheet works well.
    • by baryon351 (626717) on Wednesday March 31 2004, @05:33AM (#8723684)
      I've ranted about something similar to this before, but occasionally in the print business I worked at we would get Excel documents to print.

      No, they didn't want printed spreadsheets - people would lay out flyers, leaflets, posters and small booklets in Excel.

      I can only guess their creative genius had to be instantly addressed and they picked the first app they could think of to lay it out on, and excel was just sitting there loaded at the time.
    • Re:please everybody (Score:5, Interesting)

      by biobogonics (513416) on Wednesday March 31 2004, @05:50AM (#8723760)
      stop misusing spreadsheets/excel as databases- They are for calculating numbers, not creating lists of things

      1. Blame AppleWorks first. Before excel it made spreadsheets like databases.

      2. If you look at the history of the spreadsheet, you will see that VisiCalc was designed for "What If?" not large scale calculating work. I was taught that spreadsheets are for the display of information - not calculation.

      3. Of course I don't even need a database for storing some kinds of information. An ordinary text file is actually good enough. For example my address book is a text file.

      4. I think the greatest misuse of spreadsheets is in using them to consolidate financial data. It's seductive. You get to see what you are doing, you get visual feedback, but

      a. data is not protected against alteration
      b. formulas are not protected against alteration
      c. there is no audit trail
      d. you are using explicit formulas instead of looping over data files

      5. Lastly, you can say to yourself when you use a spreadsheet, "Look Mom, I'm not programming." Pretty soon you are using Macros, then Word Basic then Visual Basic for Applications. Pretty soon you have a maintenance nightmare since you have spent more time getting immediate answers than you have spent in thinking about design.

      6. Yet the usual database products are a disease in themselves. I think that relational databases are not the best for transaction processing. I prefer to use programming languages with built in database support.

      7. Last, using a computer gives you the illusion that numbers are real. Printed numbers assume god like authority. But of course projections are not facts or reality, except perhaps in government or the business world!

      • by 1010011010 (53039) on Wednesday March 31 2004, @07:17AM (#8724044) Homepage
        5. Lastly, you can say to yourself when you use a spreadsheet, "Look Mom, I'm not programming." Pretty soon you are using Macros, then Word Basic then Visual Basic for Applications. Pretty soon you have a maintenance nightmare since you have spent more time getting immediate answers than you have spent in thinking about design.

        This is a common thing, in my (corporate) experience. Not much thought is put into how the business fundamentally goes about its tasks, but there is a lot of time spent, e.g., masturbating with time sheet data for salaried employees, etc.

        Making things worse, Microsoft's tools encourage instant gratification over design: VBA, Office Macros, ASP and Visual Basic lend themselves not to rapid application devlopment, but stupid application development. It's so easy to tweak and reload that the "right" answer often ends up being the "easy" answer. It's development by instant gratification. The resulting "solutions" are often fragile and difficult to maintain. It's like Powerpoint for Programmers (referencing Tufte), in that the cognitive model of the tools distorts the outcome as much or more than it helps produce it. I'm not convinced that these convenience tools result in less time spent in development, either; quite the opposite. I think any amount of time spent in design and planning will be outweighed by all of the re-work that will usually have to be done because of the mindset the tools engender. This is overlooked because planning isn't a source of instant gratification (it seems to drag on forever, as it requires actual thinking) -- whereas development with tools like these is a source of instant gratification, thus masking their own consumption of your time.
      • by Awful Truth (766991) on Wednesday March 31 2004, @08:29AM (#8724380)
        Your point #5 brushes with the real problem. I work in a large -- very large -- financial organization, and we often see users sneaking business code into various 'documents.' Their favorite is, of course, Excel.

        When we ask our users, "why?" the answer is always "it's too much trouble to deal with you technology folks." They're willing to forgo robustness, auditing, data validation, etc. in order to escape the technology bureaucracy: Getting budgets and resources, all those damn planning meetings, dealing with System Administrators, and so on. They generally know the risks and limitations of using Excel but feel the advantage of getting quick results is entirely worth it.
    • by TelcusFreshbreeze (601347) on Wednesday March 31 2004, @06:06AM (#8723823)
      Not that I condone using Excel for data handling but consider this. When most PC's come with Standard MSOffice (which includes Excel but not Access, which comes in Professional), what application are users gonna be doing all database type scenarios?

      Microsoft have basically forced Excel to be the packhorse of data manipulation for the masses.

    • by supergiovane (606385) <arturo.digioiaNO@SPAMing.unitn.it> on Wednesday March 31 2004, @06:18AM (#8723867)
      Excel as a database [neopoleon.com].
    • by zakezuke (229119) on Wednesday March 31 2004, @06:23AM (#8723881)
      I find excel a wonderful powerful intermediary program because of it's ease of use to take a list of information that's delimited by a field, import it, export it delimited to another format.

      I don't understand the problem using excel as a small database. If you outgrow it, just export the whole shabang... delimited by whatever your database software supports. Heck, there have been times I reccomended using excel when getting groups of 10 or more people together doing manual data entry. The data gets entered, it's organized, and easily incorperated.

      No, excel is not a database, but a spreadsheet can be used for more thens then calculating numbers.
    • by Anonymous Coward on Wednesday March 31 2004, @06:41AM (#8723944)
      I wish I could persuade my boss to give me data in an Excel spreadsheet rather than a PDF produced by Word. At least I can save a spreadsheet as CVS and parse it for entry into a database.
      • by Threni (635302) on Wednesday March 31 2004, @07:42AM (#8724126)
        ---
        There are already a lot of posts berating the use of Excel as a database. Yet, I have not seen a single clear argument why this is a Bad Thing. The closest someone has gotten to is saying how users might inadvertently delete columns or add unwanted formatting, etc.
        ---

        Some people are too hung up on what something was designed for, and overlook what it could be used for. Presumably they're against the Wright brothers use of bicycle parts for the construction of the first plane also.
      • by linuxtelephony (141049) on Wednesday March 31 2004, @07:42AM (#8724127) Homepage
        One of the biggest reasons is the sort function combined with [l]user error.

        If a spreadsheet has more columns that fit on the screen, and is used by more than one person, at some point you can almost count on someone highlighting some, not all, of the columns and then sorting the highlighted columns, and saving the file. When that happens, the highlighted columns are sorted, the rest are left as is. Worst, the next person to use the file doesn't always realize the corruption has occurred.

        This was a problem in Office 97 and earlier. I think it was a problem in Office 2k, but I don't remember. I have not tested this on Office XP or 2003.
      • Re:please everybody (Score:5, Informative)

        by ChiaBen (160517) on Wednesday March 31 2004, @07:47AM (#8724158) Homepage
        You shouldn't be forced to use SQL for manipulating data, you should be restrained from using Excel. ;) The reality of the differences between a spreadsheet and a database is that a spreadsheet lacks the data constraints (relationships) necessary to keep a user from entering bad data. A database can control this (data integrity) to a large degree (depending on your datamodel design).

        An example I fight with daily is product attributes. I maintain a n ecommerce database with about 180,000 products, each of which would have, say, a color. The problem is that if I import data from a spreadsheet it might randomly insert spaces in the data (i.e. "Black " or " Black" instead of "Black"), whereas if I get the data entered through our tools, the user selects from a list of colors, and only if the choice doesn't exist do they add a new one.

        You mention how people are doing a knee-jerk that 'DB's are sacred'. Yes, they are. So are spreadsheets, the problem is that people bastard-ize their use and end up confused about why they both exist, and how to use them.

        Database = Data storage, data consistency, ease of data maintenance
        Spreadsheet = Data analysis, data redundancy, lack of data integrity.

        That's how I see it, anyhow.
    • What's annoying..? (Score:5, Interesting)

      by manavendra (688020) on Wednesday March 31 2004, @05:42AM (#8723714) Homepage Journal
      I really fail to see the point in these posts about a spreadsheet program (be it Excel), not being a database.

      Maybe there is a genuine need for a database program (and I use this term here loosely) that provides an interace as easy to use as spreadsheet? Not every user is a programmer, and the vagaries of the any DBMS are well known. Besides, no end-user wants to meddle with software administration.

      Maybe the users use it as a database, simply because it provides an easy means of storage and manipulation of trivial data? Not ever user (not in every case, at least) has a million records to work on.

      Yes, spreadsheet tools may not have capabilities such as porbability distributions or statistical measures. How many naive users need them? Oh, the average executive might need them to project forecasts, but then, is there a tool that allows this? Conversely, if this limitation has been identified (and I'm sure this must have been identified in the past and by others, as well), why do we not see this being incorporated in any mainstream spreadsheet? (hint: there probably is not enough critical mass of users demanding such a feature).

      The other point listed in the article - "the worst nightmare of those who justify IT's return on investment - spending extra money on a more time-consuming product that yields absolutely no measurable improvement?". Well then, perhaps in that given scenario, the need wasn't evaluated correctly? Or maybe such a complexity wasn't required after all?

      It's easy to point out the missing features/capabilities from any software, but if it's not asked for by average/most users, it will take a long while to be incorporated (if at all). Yes, this however leaves the issue of errors introduced by the use of such spreadsheets, whether tacit or implicit. In either cases, it would be due to the user being unable to find the right tool to model the problem, or not being able to understand the problem correctly and hence not taking into account as many (if not all) parameters involved.
    • Re:Kill them all (Score:5, Insightful)

      by wllf (627835) on Wednesday March 31 2004, @05:54AM (#8723776) Homepage
      What about people or (even worse) companies who are determined to send data in Excel sheets and it has to be processed automatically. Columns being deleted because the data typist thinks it is no longer needed, adding columns because there is more 'important' info to add, align the zipcode into its column using spaces after the address (hard one to spot!) and of course the very popular extra comments at the bottom of the data breaking the import routines.
      • Re:Kill them all (Score:5, Insightful)

        by Zocalo (252965) on Wednesday March 31 2004, @06:05AM (#8723817) Homepage
        OK, hand up. I've done this. I had a Word document that contained maybe two - three pages of text followed by a series of tables of data - total size of maybe 5MB. I wanted the document on my Clie for reference, but not at the 3MB Docs2Go was rendering it at, and PDF was even worse. So I dumped it into an Excel workbook, one worksheet per table for fast access and used odd numbered rows on the first page for the text, one paragraph per cell. With a decent left justified cell width, word wrap enabled and the grid turned off it looks fine. New file size: 300kB. Conclusion: Word's table markup is sub-optimal to say the least...

        Similarly, I don't have any problems with using Excel as a basic flatfile database (never relational though, I'm not that insane) where the visual layout of the data is more important than the flexibilty of querying. That said, on a basic flat-file database you can actually perform some quite sophisticated filters using Excel's auto-filter function.

        I don't think the problem is with using a spreadsheet as a word processor, database, or any of the other uses it can be shoehorned into. The problem is simply that people correctly see a spreadsheet as a jack of all trades, but forget that this implies it's a master of none, with the possible exception of what it was designed for: crunching tabulated numbers.

      • Re:Kill them all (Score:5, Interesting)

        by fucksl4shd0t (630000) on Wednesday March 31 2004, @06:18AM (#8723866) Homepage Journal

        Well, for really small lists, it's an easy way to store it. Especially when someone with very little computer knowledge couldn't even begin to create an SQL statement.

        How's this?

        I created a spreadsheet in KSpread (awesome program, that) to plan my menu for Texas Brand Barbecue. So I used it to estimate all of my costs and my gross sales. I'm a perfect example of who this article is about, but I think I'm above the sort of planning the article is talking about. :) (I intentionally went conservative on sales and liberal on costs. I could be wrong in the end, but if I'm going to err, I prefer to err where my error makes profit rather than loss)

        Then I needed a list of equipment to start up and to estimate the cost of all this equipment. So I switch to another sheet in the same workbook and create this list. A quick little formula gives me a total.

        Aha, so now I wanted to keep all my data in one place, and the next few pieces of data were tabular in nature, but no formula attached. I needed a list of local area farmer's markets, locations, dates and times, market coordinator, and contact phone number (website and email if available). So what did I do? Well, I made a new sheet in the workbook and put my table there. Now I refer to it whenever I need to call someone on the list, or if a market falls through (that process is over, now) I can easily find another market for that day.

        Spreadsheets, as another poster put it, are for the presentation of data. For my purpose, I could've taken a couple of weeks to write a program that would have less than half of the functionality of my spreadsheet, but why bother? The spreadsheet is there to do the job.

        This isn't saying that I wouldn't like to have something better, and I intend to home-grow a better solution. First I have to write a driver that will let me download transactions from my cash register to my database, though. The program gets complex after that, but the intent is to replace my spreadsheet with it. (And release it as open source, of course. Doesn't give me a competitive edge worthy of note, and others could benefit by it)