Math Software Programming The Almighty Buck

Why You Shouldn't Use Spreadsheets For Important Work 422

Posted by Soulskill
from the they'll-throw-you-in-a-cell dept.
An anonymous reader writes "Computer science professor Daniel Lemire explains why spreadsheets shouldn't be used for important work, especially where dedicated software could do a better job. His post comes in response to evaluations of a new economics tome by Thomas Piketty, a book that is likely to be influential for years to come. Lemire writes, 'Unfortunately, like too many people, Piketty used spreadsheets instead of writing sane software. On the plus side, he published his code ... on the negative side, it appears that Piketty's code contains mistakes, fudging and other problems. ... Simply put, spreadsheets are good for quick and dirty work, but they are not designed for serious and reliable work. ... Spreadsheets make code review difficult. The code is hidden away in dozens if not hundreds of little cells. If you are not reviewing your code carefully and if you make it difficult for others to review it, how do you expect it to be reliable?'"

  • by Anonymous Coward on Tuesday May 27, 2014 @08:02PM (#47103557)

    "I don't know how to use spread sheets properly."

  • by Anonymous Coward on Tuesday May 27, 2014 @08:03PM (#47103571)

    To be fair, neither do the vast majority of people who use spreadsheets for important work.

  • by Anonymous Coward on Tuesday May 27, 2014 @08:10PM (#47103629)

    Disagree. I think what he's really saying is "I've had to maintain and develop tools made by people that don't know how to use spreadsheets properly, and I'm fucking sick of it."

  • ugh...so anger! always with the nomenclature distinctions...this is a stupid approach to a real problem

    a spreadsheet is a computer program

    that's it...

    to criticize the act of entering data and performing computations on that data using computer software is the height of ignorance

    I don't know if he's right or not, but this guy's real criticism, once you fight through his ignorance of the issue, is that in his view Piketty didn't show enough of how he got his figures...or more accurately, the TFA author had to look at the spreadsheet cell to see what formula it used (gasp!)

    The code is hidden away in dozens if not hundreds of little cells. If you are not reviewing your code carefully and if you make it difficult for others to review it, how do you expect it to be reliable?

    so he probably doesn't know how to use the interface of a spreadsheet very well, which makes the act of checking a formula tedious...

    then he writes some dumbass article inventing a problem to vent his frustration and reinforce his self-image...

    all the while missing the real problem with economics "research" (not Piketty but others do this...) it's called "P-hacking"

    P-hacking is the problem in social science/economics research, not using 'spreadsheets'

    gah!

  • by LordLucless (582312) on Tuesday May 27, 2014 @08:19PM (#47103703)

    It's not "spreadsheets shouldn't be used for important work", it's "spreadsheets should not be used for work that's not suitable for spreadsheets". Tools for the job, and all that.

  • by muhula (621678) on Tuesday May 27, 2014 @08:20PM (#47103711)
    If the inability to code review spreadsheets was a real issue, it wouldn't be too hard to convert spreadsheet functions into a functional language. For non-programmers, a spreadsheet lowers the barrier to entry. This allows people to do something useful and productive who couldn't do so otherwise. That's a good thing.
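A minimal sketch of the conversion idea above, in Python rather than a purely functional language: each cell becomes a small pure function, so every formula sits in one place where it can be read and reviewed. The cell names, the `cells` table, and the `evaluate` helper are all hypothetical, and this is not how any real spreadsheet exports its formulas.

```python
# Hypothetical four-cell sheet: A1 and A2 are inputs,
# A3 = A1 + A2, and A4 = A3 * 2.
# Each formula receives a `get` function for reading other cells.
cells = {
    "A1": lambda get: 5,
    "A2": lambda get: 7,
    "A3": lambda get: get("A1") + get("A2"),
    "A4": lambda get: get("A3") * 2,
}


def evaluate(name, sheet, _active=()):
    """Evaluate one cell, recursing into the cells its formula reads."""
    if name in _active:
        # The cell is already being evaluated further up the call chain.
        chain = " -> ".join(_active + (name,))
        raise ValueError(f"circular reference: {chain}")
    return sheet[name](lambda ref: evaluate(ref, sheet, _active + (name,)))


if __name__ == "__main__":
    print(evaluate("A4", cells))
```

Because the whole sheet is now an ordinary data structure, it can be diffed, reviewed, and tested like any other code.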
  • by Jonathan Mann (3481921) on Tuesday May 27, 2014 @08:23PM (#47103739)
    Another major issue with spreadsheets is that they don't handle data typing very well. For example, if you try to add a list of numbers, and somewhere in the list you have a number encoded as text, instead of throwing an error, it won't be included in the sum. Errors should never pass silently. Unless explicitly silenced.
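To make the complaint above concrete, here is a small Python sketch contrasting the two behaviours. `spreadsheet_sum` only imitates the silent-skip behaviour the comment describes (it is not any spreadsheet's actual implementation), and `strict_sum` is a hypothetical alternative that refuses to guess.

```python
from numbers import Number


def is_number(cell):
    """True for real numeric values; bool is excluded on purpose."""
    return isinstance(cell, Number) and not isinstance(cell, bool)


def spreadsheet_sum(cells):
    """Imitate the behaviour described above: non-numeric cells are skipped."""
    return sum(cell for cell in cells if is_number(cell))


def strict_sum(cells):
    """Raise on the first cell that is not a number instead of skipping it."""
    total = 0
    for index, cell in enumerate(cells):
        if not is_number(cell):
            raise TypeError(f"cell {index} holds {cell!r}, which is not a number")
        total += cell
    return total


# The third value is a number stored as text.
column = [10, 20, "30", 40]

if __name__ == "__main__":
    print(spreadsheet_sum(column))  # the text cell is silently left out
    try:
        strict_sum(column)
    except TypeError as error:
        print("strict_sum refused:", error)
```

The strict version is the "errors should never pass silently" option: the bad cell has to be fixed or explicitly converted before a total comes out.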
  • by Virtucon (127420) on Tuesday May 27, 2014 @08:23PM (#47103741)

    You're doing it wrong.

  • Maybe you should read it again?
    His real criticism is that spreadsheet software is horrible for any high end work, or with anything you want to share, and he is correct.

    "so he probably doesn't know how to use the interface of a spreadsheet very well, which makes the act of checking a formula tedious..."
    it is tedious, even if you are an expert and even if the user uses good practices.

    "P-hacking is the problem in social science/economics research, not using 'spreadsheets'"
    I don't think you know what P-Hacking is.

  • by geekoid (135745) <dadinportland @ y a hoo.com> on Tuesday May 27, 2014 @08:28PM (#47103783) Homepage Journal

    For non-programmers, modern spreadsheets give the user rope, with a noose already premade and a map on where to put your head.

  • by geekoid (135745) <dadinportland @ y a hoo.com> on Tuesday May 27, 2014 @08:30PM (#47103797) Homepage Journal

    It needs restating because people forget it all the time.

  • by matbury (3458347) on Tuesday May 27, 2014 @08:31PM (#47103801) Homepage

    The fact that Piketty's work is a damning indictment of the USA's most cherished concept - free market capitalism - means that thousands of neo-liberal economists will pore over every single digit and operator in his spreadsheets looking for anything to negate the findings. If they can't find anything, they'll attack him. When you hear of character attacks against Piketty or some other diversionary tactic, you'll know his data is correct.

  • by preaction (1526109) on Tuesday May 27, 2014 @08:41PM (#47103865)

    Tell that to the entire finance and insurance industry.

  • by swm (171547) * <swmcd@world.std.com> on Tuesday May 27, 2014 @08:43PM (#47103877) Homepage

    I figured this out twenty-mumble years ago.
    I was doing data analysis in spreadsheets, and realized that I had no way to audit them.
    The data and the analysis were all just...there...in the spreadsheet.

    As soon as I got a grip on my data, I changed over to C programs that I could test, and document, and validate, and run at any time to demonstrate that input X generated output Y.
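A rough illustration of the workflow described above, sketched in Python rather than the C the commenter used. The `average_growth` function and its sample figures are hypothetical; the point is only that the analysis lives in a named, documented function with a test that can be rerun at any time to show that input X produces output Y.

```python
def average_growth(values):
    """Mean period-over-period growth rate of a series of positive values."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    rates = [(later - earlier) / earlier
             for earlier, later in zip(values, values[1:])]
    return sum(rates) / len(rates)


def test_average_growth():
    # Known input: a series growing 10% per period.
    # Known output: an average growth rate of 0.10 (within float tolerance).
    assert abs(average_growth([100, 110, 121]) - 0.10) < 1e-12


if __name__ == "__main__":
    test_average_growth()
    print("average_growth reproduces the expected result")
```

The data stays in its own file or list, the logic stays in the function, and the test documents what the logic is supposed to do.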

  • by Anonymous Coward on Tuesday May 27, 2014 @09:12PM (#47104031)

    There's already tons of documentation of Piketty's many mistakes. And his "mistakes" all seem to support his thesis, which is a pretty big pill to swallow.

    http://blogs.ft.com/money-supply/2014/05/23/data-problems-with-capital-in-the-21st-century/?Authorised=false

    His bigger issue is that his data does not support his eventual conclusion that a wealth tax is necessary. It's a big jump from his data to that extreme solution.

    My biggest issue is this: If R > G really trumped individual effort, why are the three richest men in the world (Gates, Buffett, Slim) all self made billionaires? If R > G was such a big deal, I would expect the richest person in the world to be a Rockefeller, or an heir from one of the other 19th century robber barons. Returns on investment have outpaced economic growth for a long time. It's clear that this is not the factor that Piketty makes it out to be. It's one thing to say that more and more will go to the top 1%. But if the top 1% changes every generation (and this is exactly what happens), is that as big of a problem as Piketty and other liberals make it out to be?

  • by Anonymous Coward on Tuesday May 27, 2014 @09:25PM (#47104107)

    Most people have no idea how to use a relational database.

  • by jd2112 (1535857) on Tuesday May 27, 2014 @09:36PM (#47104171)

    "I don't know how to use spread sheets properly."

    Or, I realize that just because I have a hammer not all problems are nails.

  • by jythie (914043) on Tuesday May 27, 2014 @09:40PM (#47104201)
    Eh, I think it can be legitimately argued that spreadsheets are a bad place to do complex things. Even people who are skilled at setting them up produce work that is difficult to examine and track. In many ways it is a technology that is still stuck in the 80s, even though they keep throwing in more and more complex functionality, but the method of storing and organizing the logic is dated in a bad (rather than proven) way.

    Even teaching students matlab would probably be an improvement, but excel is what they default to teaching anyone outside math and CS, building all the coursework around it.
  • by Anonymous Coward on Tuesday May 27, 2014 @09:47PM (#47104239)

    you can send one to anyone and not have to worry about what they have installed

    Except that they need to be running Windows or Mac, with Microsoft Office installed.

  • by CriminalNerd (882826) on Tuesday May 27, 2014 @09:48PM (#47104247)

    "I never worked in a company with normal people."

    I'm guessing you haven't had the pleasure of working in the typical firm where the company's years-old ENTIRE lifetime of work and data is passed around e-mail as an 80MB Excel attachment.

  • by labnet (457441) on Tuesday May 27, 2014 @10:17PM (#47104425)

    Wow!
    If I was in my early 20's, I'd probably think I was 'leet'
    Now in my mid 40's, I'd probably fire whomever wrote it.

  • by Baloroth (2370816) on Tuesday May 27, 2014 @10:28PM (#47104507)

    if it can execute the operation needed for the research then it is acceptable...if not, then no

    You could probably write this computational code in a shell script, too. But it would still be a terrible idea. Why? Because it's the wrong tool for the job. Simple as that. It doesn't matter what you can and cannot do, it matters what you should do, and you shouldn't use spreadsheets for anything complicated. It's simply too easy to make stupid mistakes that are difficult to trace and correct (or even notice).

    you can't blame a spreadsheet for a poorly devised experiment...you *can* blame a researcher for using an inappropriate statistical model...you *cannot* criticize the method of analysis as long as it is physically capable of the computation

    TFA isn't blaming the spreadsheets, he's blaming the people who use them for using them. It's not acceptable to use a tool that works poorly and is highly susceptible to mistakes, and no one should listen to anyone who does so unless that person is damned good at that tool: yes, it is possible that someone is so fantastically good with spreadsheets they can use them for massive data analysis with no problems. They are, however, the exception, and I would generally be inclined to disbelieve the results from anyone who does large work with spreadsheets (simply because of the possibility for errors and the lack of concern for accuracy that using spreadsheets demonstrates). So, the conclusion is that you shouldn't use spreadsheets for important work. You absolutely can criticize an analysis if it uses a tool that is highly likely to introduce errors, and that's fundamentally the point (and it's underscored by the fact that that is precisely what happened in Piketty’s case).

  • by timeOday (582209) on Tuesday May 27, 2014 @10:36PM (#47104557)
    The question is whether having the logic squirreled away in code or a DB would have made it more correct, which is a big assumption!

    I really think Piketty deserves a lot of credit for releasing his "source" spreadsheets on such a substantive and controversial work. Most authors do not. If the critiques turn out to be substantial and extensive, I plan on waiting for a second edition with corrections before investing time in reading it.

  • by ClickOnThis (137803) on Tuesday May 27, 2014 @11:14PM (#47104803) Journal

    you can send one to anyone and not have to worry about what they have installed

    Except that they need to be running Windows or Mac, with Microsoft Office installed.

    Actually, LibreOffice/OpenOffice are pretty good at importing and exporting .xls and .xlsx. And considering how incredibly obfuscated^H^H^H^H^H^H^H^H^H^H complicated the MS OOXML standard is, I'd say that's quite an accomplishment.

    You can even import .ods in MS Excel, if you have the relevant plugins installed.

    That said, I agree with TFA: don't go overboard with fancy spreadsheets. Keep them simple, for the sake of your own mental health and that of your co-workers.

  • by Coeurderoy (717228) on Wednesday May 28, 2014 @01:47AM (#47105383)

    No, what he is saying is that it is easy to "write" sloppy code for excel and hard to write good code.
    And even harder to review it.

    It's similar to the reason a) people moved away from basic, and b) basic evolved to be (duck, please no flame) almost usable (I still do not like it, but recognize that it is possible to write usable code in visual basic).

    If you want to criticize him, picking on Piketty is VERY political, "excel" errors are galore in neocon publications, but of course the FT did not find anything not to love there, but saying that just maybe having a small group of people siphoning off all the cash from society is not sustainable forever does make them nervous and very desirous to find some scab to pick at...

    Nevertheless he is right, it would be very good if decision makers would be able to "read the numbers" and not just "massage the numbers".
    Something like R or ADaMSoft would drive you to test ideas on datasets and learn from them whereas excel (or calc :)) have a tendency to get you to fiddle the numbers until the taxman aherm the reader sees what you would like them to see...

  • Spread sheets are such awesome tools that they allow non-programmers to create the same problems that noob programmers do while writing code.

  • by lonecrow (931585) on Wednesday May 28, 2014 @12:36PM (#47109975)
    Spreadsheets are just a part of the Darwinism of applications. Some sharp fellow within an organization thinks it's important to start tracking some data point or another. Maybe it gets ignored and forgotten. Other times it grows as other people see its utility and start making requests to track related data points. Eventually you get a multi-worksheet or even multi-workbook spreadsheet masquerading as an application. At some point it becomes far too hard to maintain or understand so they contract out someone like me who moves it to a relational database with a web front end. Everyone is happy!

    This work forms a major part of my workload, so don't fuck with it!

    Also, it is appropriate. It would be inefficient to develop a proper relational database application on the whim that some set of data points might be useful. Spreadsheets are a proving ground, an important stage in the life cycle of an application.
