Don't Overlook Efficient C/C++ Cmd Line Processing 219
An anonymous reader writes "Command-line processing is historically one of the most ignored areas in software development. Just about any relatively complicated piece of software has dozens of available command-line options. The GNU tool gperf is a "perfect" hash function generator that, for a given set of user-provided strings, generates C/C++ code for a hash table, a hash function, and a lookup function. This article provides a good discussion of how to use gperf for effective command-line processing in your C/C++ code."
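For anyone who hasn't seen it, a gperf input file is tiny. A minimal sketch (the option names here are made up for illustration, not from the article):

```
%{
/* this block is copied verbatim into the generated C file */
#include <string.h>
%}
%%
--help
--version
--verbose
```

Running `gperf keywords.gperf > lookup.c` then emits a `hash()` function and an `in_word_set()` lookup that returns non-NULL only for exactly those strings.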
Re:C++ I get (Score:1, Interesting)
1) Every linux kernel developer
2) Every *BSD kernel developer
3) John Carmack, for the core of every ID engine up to and possibly beyond Doom3
4) You, whenever you compile C++ code, as it is compiled to C before machine code (unless you are using an exotic compiler such as the Compaq AXP C++ compiler for TRU64).
Re:C++ I get (Score:5, Interesting)
You, whenever you compile C++ code, as it is compiled to C before machine code (unless you are using an exotic compiler such as the Compaq AXP C++ compiler for TRU64).
Excuse me???? That was not even true anymore when I started using C++, back in 1992. There are features in the C++ standard that are so extremely difficult to correctly implement in standard compliant C that it's a complete waste of effort trying to pass via C while compiling. Exception handling comes to mind as the prime example. A failed attempt to support exceptions was the reason why Cfront 4.0 was abandoned. Note that 3.0 was released as early as 1991. The last Cfront based compiler I had the horor of using was HP's CC. It was superseeded by the new native aCC by 1994 at the latest.
By the way, I used to write C/C++ compilation/optimisation stuff for a living, so I guess I know something about the topic.... :-)
Re:Correction... (Score:1, Interesting)
Re:It is if the linker complains about not finding (Score:3, Interesting)
Are you seriously trying to argue that gperf is more portable than getopt?
Re:All the world is not a PC (Score:1, Interesting)
I fail to see how this is a strong argument in this discussion. How many of the embedded tools you write actually _do_ command-line processing? And if they do, why not invest in more efficient ways (in both memory and time) to do IPC than the command line?
Another approach - parseargs (Score:3, Interesting)
The following two directories should bring it up to the latest version I know of.
This is not efficient, mind you. Command line parsing doesn't generally need to be efficient, even by my miserly standards, honed when a PDP-11 was something you hoped to upgrade to... some day...
ftp://ftp.uu.net/usenet/comp.sources.misc/volume2
ftp://ftp.uu.net/usenet/comp.sources.misc/volume3
http://www.cmcrossroads.com/bradapp/ftp/src/libs/
This tool is much easier (Score:3, Interesting)
http://www.ibiblio.org/pub/Linux/devel/sugerget-1
With this code, you simply specify command-line strings and variables in a printf()-style format. E.g.

    supergetopt( argc, argv,
                 "string1", "%d %d", function1,
                 "string2", "%s", function2 );

will call function1( int a, int b ) when string1 is on the command line, and will call function2( char *s ) when string2 is used on the command line.
A whole lot easier than gperf, IMHO.
Re:Speed in options parsing? (Score:3, Interesting)
The problem is that people set their tab stops at all sorts of places (e.g. every 4 characters), and then use tabs to space things in the middle of lines, or they'll mix tabs and spaces at the beginnings of lines. When somebody with different settings opens the same file, the indentation looks really screwed up. That happens even after you've gotten everybody to agree on a common number of columns for indentation.
I only know of two solutions:
I didn't have the energy to do the first, so I use the second solution.
If you're developing on your own it's not an issue, but I don't like to have one coding style here and another there - it's not just confusing, but it takes a while to change my editor settings every time I open code for somebody else. I use spaces and that's that. At least my editors are clever enough to know that Makefiles still need tabs!
-- Steve
Re:Wrong in so many ways (Score:3, Interesting)
I challenge: cite as an example any fixed set of strings (such as would be applicable for perfect hashing) for which a realistic perfect hashing scheme of any sort outperforms a statically-sized conventional chaining table using a trivial 33/37-style [google.com] string hash. I don't think you can. Gperf languishes in obscurity for a reason.
Re:Silly (Score:2, Interesting)
Um... why do you think hashes are inefficient? In a lot of languages (Perl, Python, JavaScript, etc.) the standard collection is the hash. In JavaScript, even a simple array is a hash! Why do you think it is inefficient?
My thinking is that a hash is both CPU- and cache-efficient. It is CPU efficient because it usually needs just one round of computation to get you to the correct result (compared to a tree, which needs one round per tree level). It is cache efficient because you are usually not led somewhere irrelevant to your search; in contrast, every intermediate node visited while searching a binary tree pollutes your cache. Yes, a hash has the table entries themselves, which also pollute the cache, but not as much, exactly because of what you talk about: (spatial) locality of reference.

In a hash all entries are in nearby memory, so it is likely that many searches in the same hash table will end up using very few cache lines. In a search tree or a list, by contrast, different nodes are allocated at different times and are much more likely to use completely different cache lines. At least this should be true up until the point you overload the table, but then you have extensible hashes.