Using named arguments
The other day I wrote a proc that took three integers as arguments. The arguments represented a date. The easy way to write the proc is to just define it with three arguments. Consider the following example:
set_date 3 12 2006
Am I setting the date to March 12 or December 3? It's impossible to know without looking at the definition of the proc. Of course, most of the time when this proc is called it will be called with variables so it would look like this:
set_date $month $day $year
Problem solved. That is, until you are tasked with calling this procedure from a new file and you never can remember if it's "month day year" or "day month year" or maybe "year month day". If you are familiar with the code it might be easy to remember but if you are a new team member trying to come up to speed on the application it may not be so obvious. Or imagine the scenario where standards changed over time and part of the code base uses "month day year" and part of it uses "day month year". It happens.
A simple solution to the problem is to use named arguments. Named arguments are argument pairs that are made up of a name (beginning with "-") followed by a value. Tk makes use of this technique for most of the tk commands (e.g. label .l -borderwidth 1 -relief raised ...) and it has proven to be a very effective way to work.
Using named arguments, the above code might look like the following. Notice how the intent of the code is now crystal clear and the programmer doesn't have to remember the order of the arguments:
set_date -month $month -day $day -year $year
No built-in support?
The problem is, neither tcl nor tk has built-in routines for handling named arguments. The proc command has support for variable arguments and default arguments, but no support for named arguments.
Fortunately, over time library routines have been developed for parsing argument lists so we don't have to continually reinvent the wheel. For example, tcllib [1] has the cmdline package [2], the Simple Development Library [3] has SimpleOption [4], and the tcl'ers wiki page titled "Named Arguments" [5] discusses other techniques. If you don't care about error checking you can even get by with something as simple as "array set option $args".
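To make the "array set" shortcut concrete, here is a minimal sketch. The proc name set_date_simple and its default values are made up for illustration; the only point is that $args overrides whatever was placed in the array first.

```tcl
# Hypothetical proc, used only to illustrate the "array set" shortcut.
proc set_date_simple {args} {
    # Defaults go in first; anything in $args overrides them.
    array set option {-month 1 -day 1 -year 1970}
    # No validation happens here: an odd number of words raises a
    # generic "list must have an even number of elements" error, and a
    # misspelled option name is stored in the array and then ignored.
    array set option $args
    return "$option(-year)-$option(-month)-$option(-day)"
}

puts [set_date_simple -month 3 -day 12 -year 2006]
```

The lack of error checking is the price of the brevity: a call such as "set_date_simple -mnth 3" runs without complaint and quietly uses the default month.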
Given a choice, I prefer to keep program dependencies to a minimum. Maybe I'm afflicted with "Not Invented Here" syndrome. While some of the option parsing libraries support shortest unique abbreviations, type enforcement, aliases (e.g. treating -bd and -borderwidth as the same thing), etc., more often than not I simply don't need those features.
My requirements are usually quite simple. I want to support optional, named arguments, and I need the ability to control when option processing should stop in case a procedure needs to pass in data that looks like an option but isn't. To that end, I use a solution that stores options and values in an array and parses the arguments with while and switch.
Example: setting a date
Let's consider an example where we need to write a proc named 'set_date' for setting the date of an object. The user should be able to specify the month, day and year. In addition, we want a shorthand that says the date is in GMT but we also want to be able to explicitly set the timezone. The synopsis might look something like this:
set_date ?-month month? ?-day day? ?-year year? ?-gmt? ?-timezone|-tz tz?
Step 1: getting started
The rest of the code in this article assumes that a variable named 'args' is a list that contains the arguments to be parsed. This means we need to define args as the last (and in this case, only) argument in the proc definition:
proc set_date {args} {
    ...
}
Step 2: establish defaults
I see this step as much as a documentation step as anything else. It establishes defaults for the options, but equally importantly, it enumerates the accepted options in a concise manner. For this reason I define the defaults before parsing even though, strictly speaking, I don't need to do this until later.
The defaults will be stored in an array named 'default'. I'll use 'array set' if the default values are static or I'll set each value individually if the defaults are dynamic. For example, in this case the values for -gmt and -timezone are static, but the default day, month and year must be computed from today's date:
array set default {-gmt 0 -timezone ""}
set now [clock seconds]
set default(-month) [get_month $now]
set default(-day) [get_day $now]
set default(-year) [get_year $now]
If you want to copy and paste the above code you'll need to provide your own definitions for get_month, get_day and get_year, or replace the proc calls with static data.
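If you just want something to paste in, the three helpers could be sketched with the standard 'clock format' command as shown below. This is only one possible definition, not the one from my application, and it assumes the local timezone is the one you care about.

```tcl
# Possible stand-ins for the helpers used above (an assumption, not the
# original definitions). Each takes a time in seconds, as returned by
# [clock seconds], and returns a plain integer in the local timezone.
proc get_month {seconds} {
    # %m yields "01".."12"; scanning with %d strips the leading zero so
    # the value can never be mistaken for an octal number.
    scan [clock format $seconds -format %m] %d month
    return $month
}
proc get_day {seconds} {
    scan [clock format $seconds -format %d] %d day
    return $day
}
proc get_year {seconds} {
    scan [clock format $seconds -format %Y] %d year
    return $year
}

set now [clock seconds]
puts "month=[get_month $now] day=[get_day $now] year=[get_year $now]"
```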
Step 3: parse the arguments
To parse the arguments we'll use a while loop that looks for values that begin with a "-". When an argument is encountered that doesn't begin with "-" or there are no more arguments the loop will terminate. Each time an argument is found the corresponding element of the option array should be set. This may mean pulling another argument from the list, doing some computations, doing validation, etc.
Even though this example doesn't support additional arguments other than the named arguments we'll go ahead and support the argument "--" to mean "stop processing arguments".
while {[string match "-*" [lindex $args 0]]} {
    set name [lindex $args 0]
    set args [lrange $args 1 end]
    switch -exact -- $name {
        -- {
            # stop processing any more arguments
            break
        }
        -month -
        -day -
        -year {
            # these options take an integer value
            set value [lindex $args 0]
            set args [lrange $args 1 end]
            if {![string is integer -strict $value]} {
                return -code error "$name: invalid integer \"$value\""
            }
            set option($name) $value
        }
        -gmt {
            if {[info exists option(-timezone)]} {
                return -code error "-gmt cannot be specified if you set the timezone"
            }
            set option(-gmt) 1
        }
        -timezone -
        -tz {
            if {[info exists option(-gmt)] && $option(-gmt)} {
                return -code error "timezone cannot be specified with -gmt"
            }
            set value [lindex $args 0]
            set args [lrange $args 1 end]
            set option(-timezone) $value
        }
        default {
            # reject unknown options instead of silently dropping them
            return -code error "unknown option \"$name\""
        }
    }
}
Checking for required arguments and applying the defaults
If you want to require the caller to supply certain arguments this is the place to do it. After parsing, our array now holds all of the options specified by the user so we can quickly determine if any required arguments were omitted. For this example we have no required arguments so there's nothing to do but to copy the default values for any missing arguments:
foreach name [array names default] {
    if {![info exists option($name)]} {set option($name) $default($name)}
}
We could choose to not apply the defaults here, but that means any place we need to reference an argument we have to first check whether it was supplied or not. I prefer to fill in the option array with all user values and all defaults so all argument values are defined.
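The set_date example has no required arguments, but if it did, the check could be sketched along these lines. The helper name check_required is hypothetical, and the check has to run before the defaults are copied in, since afterwards every option exists.

```tcl
# Hypothetical helper: raise an error if any name in $required is
# missing from the caller's option array. Call it after parsing and
# before the defaults are applied.
proc check_required {optionVar required} {
    upvar 1 $optionVar option
    foreach name $required {
        if {![info exists option($name)]} {
            return -code error "missing required argument: $name"
        }
    }
}

# Pretend the caller supplied -month and -day but forgot -year.
array set option {-month 3 -day 12}
if {[catch {check_required option {-month -day -year}} msg]} {
    puts $msg
}
```

Here the message printed is "missing required argument: -year".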
Finishing up
Once we've done all the parsing and applied all the defaults, there's nothing left to do. However, there may be data left in $args. This might be good if you're trying to implement an interface that accepts additional arguments after the optional arguments (e.g. 'puts -nonewline "string"'). It might be bad if your procedure does not accept any other arguments. In this case we'll throw an error if there are any additional arguments that haven't been processed:
if {[llength $args] > 0} {
    return -code error "unknown arguments: $args"
}
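For the opposite case, where data is expected after the options, the same loop works and "--" earns its keep. The proc below is a made-up example (log_msg and its -level option are not part of the set_date code) that takes one trailing message which may itself begin with "-".

```tcl
# Hypothetical proc: log_msg ?-level level? ?--? message
proc log_msg {args} {
    array set option {-level info}
    while {[string match "-*" [lindex $args 0]]} {
        set name [lindex $args 0]
        set args [lrange $args 1 end]
        switch -exact -- $name {
            -- {
                # everything after "--" is data, even if it starts with "-"
                break
            }
            -level {
                set option(-level) [lindex $args 0]
                set args [lrange $args 1 end]
            }
            default {
                return -code error "unknown option \"$name\""
            }
        }
    }
    # Exactly one word must remain: the message itself.
    if {[llength $args] != 1} {
        return -code error "usage: log_msg ?-level level? ?--? message"
    }
    return "[string toupper $option(-level)]: [lindex $args 0]"
}

puts [log_msg -level warn -- "-disk almost full"]
```

Without the "--", the message "-disk almost full" would be taken for an option and rejected as unknown.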
A note about performance
There is some extra work being done in the above code that doesn't need to be done. For example, we do [lindex $args 0] twice, once in the while condition, and once at the top of the loop to assign the variable 'name'. We can set the variable name inside the loop condition but I think that makes the code harder to read. Since these commands take only microseconds to execute I think the tradeoff for readability is worth the overhead. If this were a time-critical routine I probably wouldn't use named arguments to begin with, but the vast majority of the procs I write are not time-critical.
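For comparison, this is roughly what the loop looks like with the assignment folded into the condition. The proc leading_options is a throwaway name invented for this sketch; it only collects the leading "-" words so the loop shape is easy to see.

```tcl
# Hypothetical proc showing the more compact loop form: 'set' returns
# the value it assigns, so one lindex call feeds both the test and the
# variable 'name'.
proc leading_options {args} {
    set seen {}
    while {[string match "-*" [set name [lindex $args 0]]]} {
        set args [lrange $args 1 end]
        lappend seen $name
    }
    return $seen
}

puts [leading_options -a -b c -d]
```

This prints "-a -b" because parsing stops at "c". It saves one lindex per option, at the cost of hiding an assignment inside the loop test.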
When to use named arguments
I don't recommend using named arguments for every proc. Single-use procs or procs that take just one or two fixed arguments don't need the overhead of named arguments. Named arguments are best for procedures that take a variable number of arguments, or that have arguments that may be ambiguous when reading code that calls the procedure. Named arguments should not be used for time-critical functions. As a general rule, then, I recommend named arguments for 'public' procedures -- procedures that may be used outside their immediate context, such as in public APIs.
Conclusion
Named arguments can be used to make your code more readable. This can translate into "more maintainable", and "more maintainable" is almost always desirable.
There are several canned solutions for argument parsing, and if you don't mind having external dependencies or already depend on a library that includes argument processing functions you should use an existing solution. However, if you don't want to depend on an external library it's easy to do simple named argument parsing that likely covers 90+% of the normal cases.
References
[1] Tcllib, http://tcllib.sourceforge.net/
[2] cmdline, http://tcllib.sourceforge.net/doc/cmdline.html
[3] SimpleDevLib, http://simpledevlib.sourceforge.net/
[4] SimpleOption, http://simpledevlib.sourceforge.net/SimpleOption.html
[5] Wiki Page "Named Arguments", http://wiki.tcl.tk/10702