Donald Knuth Rips On Unit Tests and More
eldavojohn writes "You may be familiar with Donald Knuth from his famous Art of Computer Programming books, but he's also the father of TeX and, arguably, one of the founders of open source. There's an interesting interview where he says a lot of stuff I wouldn't have predicted. One of the first surprises to me was that he didn't seem to be a huge proponent of unit tests. I use JUnit to test parts of my projects maybe 200 times a day, but Knuth calls that kind of practice a 'waste of time' and claims 'nothing needs to be "mocked up."' He also states that methods for writing software to take advantage of parallel hardware (like the multi-core systems that we've discussed) are too difficult for him to tackle due to ever-changing hardware. He even goes so far as to vent about his unhappiness toward chipmakers for forcing us into the multicore realm. He pitches his idea of 'literate programming', which I must admit I'd never heard of but find intriguing. At the end, he even remarks on his adage that young people shouldn't do things just because they're trendy. Whether you love him or hate him, he sure has some interesting/flame-bait things to say."
Re:Did anyone claim the bug prize on TeX? (Score:5, Informative)
Re:Literate programming... (Score:5, Informative)
Literate programming interleaves the documentation (written in TeX, naturally) and the code in a single document. You then run that (WEB) document through one of two processors, Tangle or Weave, to produce code or documentation respectively. The code is then compiled, and the documentation is built with your TeX distribution. The documentation includes the nicely formatted source code.
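To make the two-processor idea concrete, here is a toy sketch in Python. It is not WEB syntax and not how Tangle or Weave are implemented: the "> " code marker and the function names are made up for illustration, and the real Weave emits TeX rather than indented plain text.

```python
# Toy illustration only: this is NOT WEB/CWEB syntax.
# Assumption: lines starting with "> " are code; everything else is prose.
CODE_PREFIX = "> "


def tangle(source: str) -> str:
    """Keep only the code lines, producing text for a compiler or interpreter."""
    code_lines = [
        line[len(CODE_PREFIX):]
        for line in source.splitlines()
        if line.startswith(CODE_PREFIX)
    ]
    return "\n".join(code_lines)


def weave(source: str) -> str:
    """Keep prose and code together, indenting the code so it stands out."""
    woven = []
    for line in source.splitlines():
        if line.startswith(CODE_PREFIX):
            woven.append("    " + line[len(CODE_PREFIX):])
        else:
            woven.append(line)
    return "\n".join(woven)


# A tiny made-up "literate" document in the toy format.
DOCUMENT = """We first compute the total.
> total = sum([1, 2, 3])
Then we report it.
> print(total)"""

if __name__ == "__main__":
    print(tangle(DOCUMENT))
    print("---")
    print(weave(DOCUMENT))
```

Both outputs come from the same source text, which is the point: one file, two views.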
You can use literate programming in any language you want. I even wrote rules for Microsoft C 7.0's Programmer's Workbench to use it within the MSC environment.
I've frequently thought about going back. Javadoc and/or Sandcastle are poor alternatives.
and, arguably, one of the founders of open source? (Score:2, Informative)
TeX got started in 1977, after Unix (1974), well after SPICE (1973), and about even with BSD.
Documentation is the source (Score:5, Informative)
The snippets have markup to indicate when some snippet needs to come textually before another to keep a compiler happy, but mostly this is figured out automatically. But in general, the resulting C code is in a different order than it appears in the source documentation. For instance, the core algorithm might come first, with all the declarations and other housekeeping at the end. (With documentation about why you're using this supporting library and not that, of course.)
Re:Literate programming... (Score:2, Informative)
The Summary Exaggerates the Interview (Score:5, Informative)
Re:Literate programming... (Score:3, Informative)
Doxygen looks like it just extracts properly formatted comments in code in order to generate documentation. Web extracts properly formatted bits of code in order to generate a semantically correct C file.
Re:Literate programming... (Score:3, Informative)
Re:Literate programming... (Score:3, Informative)
And the philosophy is different: literate programming is essentially embedding the code in the documentation, while Doxygen is more about embedding documentation in the code.
So Doxygen gives you fancy comments and a way of generating documentation from them and from the code structure itself. CWEB lets you write the documentation and put the code in it, deriving the code structure from the documentation. A sample CWEB program: http://www-cs-faculty.stanford.edu/~knuth/programs/prime-sieve.w [stanford.edu]
Literate programming is better suited to "dense" programs, which, surprise, surprise, is the type of stuff Knuth does.
Re:he's from another era (Score:3, Informative)
Yes and no. Yes, the physical punch cards are gone, but they live on in financial institutions in the form of Automated Clearing House (ACH) debits and credits which use the 96 column IBM punch card format. So, the next time you use your credit card, ATM card, e-check or pay a bill online through some company's web site, on the backend they are probably using ACH upload files (aka NACHA format) which was based on IBM's 96 column punch card to transfer financial data. Magnetic tape may be used on a contingency basis but it has to have an additional header record, be EBCDIC encoded and use 9 track tape. The IRS and many state tax agencies use ACH transfers, as an option, to refund personal income taxes instead of sending taxpayers a physical check.
It's the same philosophy that K&R impart... (Score:3, Informative)
Re:Literate programming... (Score:4, Informative)
The really cool idea with LP is that the code snippets you use in the documentation are then woven together to generate the "real" code of your program. So an LP document is BOTH the documentation and the code. A code snippet can contain references ("includes") to other code snippets, and you can add stuff to an existing code snippet.
Let me show you an example in simple (invented) syntax:
{my program}
  {title}My super program{/title}
  Blablabla we'd need to have the code organized in the following loop:
  {main}:
    {for all inputs}:
      {filter inputs}
      {apply processing on the filtered inputs}
    {/}
  {/}
  The {for all inputs} consists of the following actions:
  {for all inputs}:
    some code
  {/}
  The filtering first removes all blue inputs:
  {filter inputs}:
    remove all blue inputs
  {/}
  {filter inputs}+:
    remove all green inputs
  {/}
  etc.
{/}
The above is purely to illustrate the idea; the actual CWEB syntax is a bit different. But you can see how, starting with a single source document, you could generate both the code and its documentation, and how you can introduce and explain your code gradually, in whichever order makes the most sense (bottom-up, top-down, or a mix of the two).
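For anyone who wants to see the snippet mechanics actually run, here is a rough Python sketch of the expansion step. Everything in it is an assumption for illustration: the define/expand helpers are hypothetical, the chunks are registered by hand rather than parsed out of a document, and the chunk bodies are Python so that the generated program can be executed. It is not how CWEB works internally.

```python
import re

# A line consisting only of "{chunk name}" (optionally indented) is a reference.
REFERENCE = re.compile(r"^(\s*)\{(.+)\}\s*$")


def define(chunks, name, body, append=False):
    """Store a chunk; append=True mimics the "{name}+:" form that extends one."""
    if append:
        chunks.setdefault(name, []).extend(body)
    else:
        chunks[name] = list(body)


def expand(chunks, name, indent=""):
    """Return the lines of a chunk with every reference replaced recursively.

    A reference to an undefined chunk raises KeyError.
    """
    result = []
    for line in chunks[name]:
        match = REFERENCE.match(line)
        if match:
            # Nested chunks inherit the indentation of the referencing line.
            result.extend(expand(chunks, match.group(2), indent + match.group(1)))
        else:
            result.append(indent + line)
    return result


# Hypothetical chunks, in the order the prose would introduce them.
chunks = {}
define(chunks, "main", [
    "results = []",
    "for item in inputs:",
    "    {filter inputs}",
    "    {apply processing}",
])
define(chunks, "filter inputs", ["if item == 'blue':", "    continue"])
define(chunks, "filter inputs", ["if item == 'green':", "    continue"], append=True)
define(chunks, "apply processing", ["results.append(item.upper())"])

# The generated ("tangled") program, as plain Python source text.
program = "\n".join(expand(chunks, "main"))

if __name__ == "__main__":
    print(program)
```

The document is free to introduce "main" first and fill in the filtering later; the expansion puts the pieces where the compiler (here, the Python interpreter) needs them.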
In a way, Doxygen and JavaDoc have similar goals: put documentation and code together and generate documentation. But they approach the problem from the opposite direction to literate programming. With Doxygen/JavaDoc, you create your program, add some little snippets of documentation, and automatically generate documentation for your code. With LP, you write documentation describing your program and generate the program from it.
Those two approaches produce radically different results. The "documentation" created by Doxygen/JavaDoc is more of a "reference" kind of documentation, and does little to explain the purpose of the program, the choices that led to the different functions or classes, or even something as important as the relationships between classes. With some effort it's possible to make such a doc system the basis of nice documentation (Apple's Cocoa documentation is great in that respect, for example), but (1) this requires more work (Cocoa has descriptive documents in addition to the javadoc-like reference) and (2) it really only works well for stuff like libraries and frameworks.
LP is great because the documentation is really meant for humans, not for computers. It's also great because by nature it will produce better documentation and better code. It's not so great in that it integrates poorly with the way we write code nowadays, and it integrates poorly with OOP.
But somehow I've always thought that there is a fundamentally good idea to explore there, just waiting for better tools/IDEs to exploit it.
(Also, Knuth's eponymous book is a great read.)
Re:Out of favor (Score:3, Informative)
That's not literate programming! (Score:3, Informative)
That's a mischaracterization of literate programming.
The whole idea of literate programming is to basically write good technical documentation -- think (readable) academic CS papers -- that you can in effect execute. What many people do with Mathematica and Maple worksheets is effectively literate programming.
It has nothing to do with what language you use, and is certainly not about making your code more COBOL-esque.
Maybe think of it this way: Good documentation should accurately describe what your code does. In literate programming, the computer code is just the "comments" you add to your documentation so that the computer can execute it.
See this post [slashdot.org], for instance.
Re:You misunderstand (Score:5, Informative)
Re:MMIX is poor design (Score:3, Informative)
Re:Literate programming... (Score:3, Informative)
Re:Spaghetti-O Code (Score:5, Informative)
Re:Shocked (Score:3, Informative)
Literate programming (Score:3, Informative)
I think most people who post here don't know what literate programming is. It's more like writing a textbook explaining how your code works, but you can strip away the text and actually have runnable code. This code can be in any language of your choice. It makes sense from Knuth's point of view, but for many of us, we don't write textbooks for a living.
Knuth also doesn't need unit testing because he probably runs (or type checks) the program in his head. Again, for most of us, seeing the program run provides additional assurance that it works. Unit tests also provide a specification of your program. It doesn't have to be just b = f(a). For example, if your code implements a balanced binary search tree, a unit test could check the depth of all subtrees to make sure the tree is balanced. Another unit test would check that the tree is ordered. You can prove from the structure of your program that these properties hold, but a layman doesn't want to write proofs for the code he writes, so the next-best alternative is to use unit tests.
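As a rough sketch of what such property-style tests might look like, here is a Python version. The Node class and build_balanced are stand-ins for whatever tree implementation is really under test, "balanced" is taken to mean the AVL-style rule that sibling subtree heights differ by at most one, and the test functions are plain asserts rather than any particular framework.

```python
class Node:
    """Minimal binary tree node; a stand-in for the real implementation."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def build_balanced(sorted_keys):
    """Stand-in for the code under test: builds a tree from sorted keys."""
    if not sorted_keys:
        return None
    mid = len(sorted_keys) // 2
    return Node(sorted_keys[mid],
                build_balanced(sorted_keys[:mid]),
                build_balanced(sorted_keys[mid + 1:]))


def height(node):
    """Number of nodes on the longest path from this node down to a leaf."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def is_balanced(node):
    """True if, at every node, the two subtree heights differ by at most one."""
    if node is None:
        return True
    return (abs(height(node.left) - height(node.right)) <= 1
            and is_balanced(node.left)
            and is_balanced(node.right))


def in_order(node):
    """Keys in left-root-right order."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


def is_ordered(node):
    """True if an in-order walk yields strictly increasing keys."""
    keys = in_order(node)
    return all(a < b for a, b in zip(keys, keys[1:]))


def test_tree_is_balanced():
    assert is_balanced(build_balanced(list(range(100))))


def test_tree_is_ordered():
    assert is_ordered(build_balanced(list(range(100))))


def test_checks_reject_bad_trees():
    chain = Node(1, None, Node(2, None, Node(3)))  # degenerate "linked list"
    assert not is_balanced(chain)
    assert not is_ordered(Node(1, Node(5)))  # larger key in the left subtree


if __name__ == "__main__":
    test_tree_is_balanced()
    test_tree_is_ordered()
    test_checks_reject_bad_trees()
```

The last test matters as much as the first two: a property check that can never fail is not telling you anything.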
About parallel programming, Knuth is actually right. Many high-performance parallel programs are tightly coupled to the underlying architecture. But we can write a high-level, essentially sequential program that uses libraries to compute things like FFT and matrix multiplication in parallel. This tends to be the trend anyway.
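A small Python sketch of that shape, with the concurrency hidden inside a library-style routine. parallel_matmul is a hypothetical function written for this example, not a real library call, and it only shows the structure: in CPython, threads will not speed up pure-Python arithmetic because of the global interpreter lock, whereas real numerical libraries do the parallel work in native code.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_matmul(a, b, max_workers=4):
    """Hypothetical library routine: multiply matrices, one task per row of a."""
    columns = list(zip(*b))  # columns of b, computed once

    def row_times_matrix(row):
        return [sum(x * y for x, y in zip(row, col)) for col in columns]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, so rows stay in place.
        return list(pool.map(row_times_matrix, a))


# The caller's code reads as an ordinary sequential program.
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
product = parallel_matmul(a, b)

if __name__ == "__main__":
    print(product)
```

Nothing in the calling code mentions threads, cores, or scheduling; all of that is the library's problem, which is the trade-off the comment describes.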
Re:You misunderstand (Score:4, Informative)
It is also the use of accurate and descriptive symbol names.
Database database("data.txt");
if (database.empty())
is a lot more readable (i.e. literate) than
DB d("data.txt");
if (d.e())
Re:He's right (Score:3, Informative)