Command line interface (CLI) tools have a reputation for being difficult to use.
Whilst powerful once mastered, a CLI inherently lacks a litany of user interface “nice-to-haves”, simply by virtue of its medium:
- Simplicity: Common tasks should be easy to perform. This is often not the case, but worse is when tasks can be tweaked in myriad ways, leading to a combinatorial explosion of flags, options and modifiers.
- Memorability: How does one remember all those different options? It’s not realistic, which is why we so often resort to double-checking the usage text, man pages, or the search engine/generative AI du jour.
- Discoverability: Reminding ourselves how to perform what we know (or assume) can be done isn’t the same as guiding us towards the full gamut of what’s possible. Even when reading the entire reference manual (or source code), this isn’t a sure thing: useful emergent behaviour often goes undocumented.
- Familiarity: So we effectively resort to guessing. --help must give us the usage text, right? There are common idioms that are pervasive, but even this is not a guarantee.
- Clarity: Open, meaningful communication is generally a good idea in all walks of life. How often have you run a CLI command to be presented with zero output, relying on echo $? to see if it even succeeded? Or, at the other extreme, so much logging noise flooding your terminal that you can’t tell what’s going on?
tar, lampooned in the above XKCD, is a product of its age at nearly 45 years old. The current GNU (and, to a lesser extent, BSD) tar implementations go some way to improve things, but old habits die hard and it’s still very common to see tar chzPf and friends in the wild.
Regardless, more modern tools also suffer from similar problems; anyone who has used Git or (shudder) GnuPG will attest to this.
As Topiary — Tweag’s code formatting engine — matures, I took the opportunity to modernise its CLI before suboptimal patterns become entrenched.
Sowing the seeds of change
In its early development, Topiary’s CLI was largely motivated by need. When features were added, they were often exposed as command line arguments, with little thought about the overall experience.
This is what it had organically grown into:
CLI app for Topiary, the universal code formatter.

Usage: topiary [OPTIONS] <--language <LANGUAGE>|--input-files [<INPUT_FILES>...]>

Options:
  -l, --language <LANGUAGE>
          Which language to parse and format [possible values: json,
          nickel, ocaml, ocaml-interface, ocamllex, toml]
  -f, --input-files [<INPUT_FILES>...]
          Path to an input file or multiple input files. If omitted, or
          equal to "-", read from standard input. If multiple files are
          provided, `in_place` is assumed [default: -]
  -q, --query <QUERY>
          Which query file to use
  -o, --output-file <OUTPUT_FILE>
          Path to an output file. If omitted, or equal to "-", write to
          standard output
  -i, --in-place
          Format the input files in place
  -v, --visualise[=<OUTPUT_FORMAT>]
          Visualise the syntax tree, rather than format [possible
          values: json, dot]
  -s, --skip-idempotence
          Do not check that formatting twice gives the same output
      --output-configuration
          Output the full configuration to stderr before continuing
  -t, --tolerate-parsing-errors
          Format as much as possible even if some of the input causes
          parsing errors
      --configuration-override <CONFIGURATION_OVERRIDE>
          Override all configuration with the provided file [env:
          TOPIARY_CONFIGURATION_OVERRIDE=]
  -c, --configuration-file <CONFIGURATION_FILE>
          Add the specified configuration file with the highest priority
          [env: TOPIARY_CONFIGURATION_FILE=]
  -h, --help
          Print help
  -V, --version
          Print version
Syntax tree visualisation is a good example: this was toggled with the --visualise argument, which took an optional parameter to switch the output format between Graphviz DOT and JSON. Visualisation was designed as a different mode of operation, to aid the development of formatting queries, but being expressed as an argument could imply to users that formatting should still happen. This implication is reinforced by the presence of formatting-specific arguments, such as --skip-idempotence and --tolerate-parsing-errors; running these together with --visualise is meaningless, but it worked regardless.
At a more fundamental level, there’s the question of I/O: From where is
input read, and to where is it written? The input to a formatting tool
is kind of important, to say the least, and if Topiary is to do anything
besides cooking silicon wafers, it’s probably a good idea to do
something with its result. Yet an awkward dance of --input-file,
--output-file and --in-place was imposed. Those may seem
self-explanatory (and they were, in the beginning) but things
change:
- When support for formatting multiple inputs was added, --input-file became --input-files. However, what happens if I try to visualise multiple inputs?
- How do multiple inputs map to outputs? The usage text says that --in-place is assumed in this case, but what if I also specify an --output-file?
- What if I want to work with standard input or standard output? What about read-only files? Is --in-place permitted in these cases?
- Why do I even have to specify --input-files (or its short form, -f)? Of course I’m going to be providing an input of some kind, so it’s a bit redundant.
While some of these questions were answered by the usage text, to really know the behaviour of edge cases, you’d have to experiment or start reading the source code.
Then there are the little things. Like when providing custom configuration, how do --configuration-file and --configuration-override interact? (I actually did have to read the source code to figure that one out!) What about logging output, presuming there is any; how does one access that?
Death by a thousand cuts.
Don’t get me wrong: Topiary’s CLI was certainly functional in this state. However, it leant too heavily on assumptions and the Topiary team’s collective (but inscrutable) knowledge. This made it clumsy, sometimes surprising and, overall, imposed too much friction on new users.
Mighty oaks from little acorns grow
So I decided to fully rework the CLI. My mandate was clear:
Make illegal states unrepresentable
This is a classic idiom from strongly typed functional programming, where the type system is leveraged to forbid invalid input. The same logic applies to command line arguments and, as Topiary is written in Rust, I was able to achieve this using the same mechanism.
Topiary uses the excellent clap command line argument parser library, which has a feature to derive the parser directly from your types and their annotations (such as doc comments). This makes the definition of the CLI arguments purely declarative, leaving you with only the work of setting up your types correctly. All my other CLI modernisation work stems from here.
To give some examples of what’s now unrepresentable (i.e., will fail):
- Attempting to visualise multiple files;
- Specifying a formatting language when formatting files (which instead use inference on the filename, driven by the configuration) or, vice versa, specifying an input file when formatting standard input;
- Specifying a formatting query file without a formatting language when formatting standard input.
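To make the idea concrete, here is a minimal sketch in plain Rust. It deliberately avoids clap so that it stands alone, and the names (FormatInput, RawArgs, validate) are hypothetical rather than Topiary’s actual types. The point is that, once validation succeeds, the resulting value simply cannot describe a language paired with files, or a query file without a language. Rejecting a query file whenever no language is given is one reading of the rules above, not a statement of Topiary’s exact behaviour.

```rust
use std::path::PathBuf;

/// Where the formatter reads from. Hypothetical type for illustration,
/// not Topiary's actual definition.
#[derive(Debug, PartialEq)]
enum FormatInput {
    /// Standard input: a language is required, a query file is optional.
    Stdin {
        language: String,
        query: Option<PathBuf>,
    },
    /// One or more files: the language is inferred, so none is stored.
    Files(Vec<PathBuf>),
}

/// Raw, untyped flags as they might arrive from the command line.
#[derive(Debug, Default)]
struct RawArgs {
    language: Option<String>,
    query: Option<PathBuf>,
    files: Vec<PathBuf>,
}

/// Convert raw flags into a `FormatInput`, rejecting invalid combinations.
fn validate(raw: RawArgs) -> Result<FormatInput, String> {
    match (raw.language, raw.query, raw.files.is_empty()) {
        // Standard input with a language (and, optionally, a query file).
        (Some(language), query, true) => Ok(FormatInput::Stdin { language, query }),
        // A language alongside files contradicts filename inference.
        (Some(_), _, false) => Err("a language cannot be combined with input files".into()),
        // Assumption: a query file is only meaningful with a language.
        (None, Some(_), _) => Err("a query file requires a language".into()),
        // Files only: the language comes from each filename.
        (None, None, false) => Ok(FormatInput::Files(raw.files)),
        // Nothing at all: standard input is implied, but it needs a language.
        (None, None, true) => Err("standard input requires a language".into()),
    }
}

fn main() {
    let piped = RawArgs { language: Some("json".into()), ..Default::default() };
    println!("{:?}", validate(piped));

    let clash = RawArgs {
        language: Some("json".into()),
        files: vec![PathBuf::from("a.json")],
        ..Default::default()
    };
    println!("{:?}", validate(clash));
}
```

With clap’s derive feature, the same shape would be expressed through annotations on the types instead of a hand-written match, so the rejection happens during argument parsing.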
Separate modes of operation
Formatting and visualisation are two distinct modes of operation; they do different things and take different options. Visualisation with --in-place wouldn’t just be wrong, it would be catastrophic! The former CLI also had an additional “pseudo-mode” of outputting the raw configuration to standard error, for debugging purposes.
The old-fashioned way of separating modes in non-interactive CLI tools was to have multiple binaries. This tends to pollute the global command namespace (ImageMagick’s convert, anyone?) and can be hindered by a lack of consistency amongst related tools.
The new hotness — and by “new”, I mean “for the last decade, or two” — is to use subcommands to mark this separation. So that’s what I did:
- topiary format formats your code.
- topiary visualise visualises your code.
- topiary config outputs the computed configuration, as TOML. (Formatted, of course!)
- topiary completion is a new feature, which generates shell completion scripts to aid discoverability.
All have common options and each has specific options, facilitated by the types I defined. If a new mode is developed in the future, it can just be added without breaking backwards compatibility.
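As a rough illustration of why subcommands map so well onto types, the sketch below models each mode as an enum variant that carries only what that mode accepts. It is a hand-rolled stand-in for what clap would derive, and the Command type, its fields and the parse function are hypothetical names; options and standard input handling are left out for brevity.

```rust
use std::path::PathBuf;

/// One variant per mode of operation. Hypothetical types for illustration;
/// options and standard input handling are omitted to keep the sketch short.
#[derive(Debug, PartialEq)]
enum Command {
    /// Format one or more files.
    Format { files: Vec<PathBuf> },
    /// Visualise exactly one file: a list of files cannot be stored here.
    Visualise { file: PathBuf },
    /// Output the computed configuration.
    Config,
    /// Generate a shell completion script.
    Completion,
}

/// Parse the arguments that follow the program name.
fn parse(args: &[&str]) -> Result<Command, String> {
    let (name, rest) = args.split_first().ok_or("expected a subcommand")?;
    match (*name, rest) {
        ("format", []) => Err("this sketch only formats files".to_string()),
        ("format", files) => Ok(Command::Format {
            files: files.iter().map(|f| PathBuf::from(*f)).collect(),
        }),
        ("visualise", [file]) => Ok(Command::Visualise { file: PathBuf::from(*file) }),
        ("visualise", _) => Err("visualise takes exactly one file".to_string()),
        ("config", []) => Ok(Command::Config),
        ("completion", []) => Ok(Command::Completion),
        (other, _) => Err(format!("cannot parse `{other}` with the given arguments")),
    }
}

fn main() {
    println!("{:?}", parse(&["visualise", "a.json"]));
    println!("{:?}", parse(&["visualise", "a.json", "b.json"]));
}
```

Adding a new mode means adding a variant and a match arm; existing invocations keep their meaning.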
Make use of familiar idioms
Interaction with a CLI tool, which misses out on the visual cues and metaphors that a GUI can provide, should be like a dialogue. (Preferably a productive and friendly one, rather than one that ends up with you questioning your life choices.) To that end, a de facto lexicon exists amongst CLI tools with common behaviours that should be adhered to. It would be weird, for example, if --assist was the option to show the usage text, rather than --help.
There are a few things I’ve done in the new Topiary CLI to improve its conversation skills:
- While --input-file and the like are not unheard of, it’s far more common, when input is required, to use positional FILE arguments. This also plays nicely with, say, scripts that one may wish to write.
- The point of a formatter is to format its input; it therefore stands to reason that --in-place is implied when dealing with files. (A notable exception to this is sed, but as a “stream editor” first, it has a little more right to an --in-place option, which is not its default.)
- The ability to format standard input is still important, though. In which case, rather than using the "-" file convention, files are simply omitted. This allows Topiary to be put into a script pipeline and enables this common interaction:

    $ topiary format --language json    ⟵ Invocation
    {"type":"your code here"}           ⟵ Standard input
    <Ctrl+D>                            ⟵ EOF
    { "type": "your code here" }        ⟵ Formatted standard output

- Logging information has always existed in Topiary, but you wouldn’t know unless you read the README fairly thoroughly. It used to be exposed through the RUST_LOG environment variable, which is an artefact of the env_logger library. This has been changed to a --verbose flag, following the common idiom of “more occurrences means more output” (e.g., -vvv maps to debug logging).
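The “more occurrences means more output” idiom can be sketched as a count followed by a lookup. Only the -vvv to debug mapping comes from the text above; the other steps are an assumption that follows the usual error, warn, info, debug, trace ordering, and the function names are hypothetical.

```rust
/// Log levels in increasing order of chattiness.
#[derive(Debug, PartialEq)]
enum LogLevel {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Count verbosity flags: each `--verbose` is one, `-vvv` is three.
fn verbosity(args: &[&str]) -> usize {
    args.iter()
        .map(|arg| match arg.strip_prefix('-') {
            _ if *arg == "--verbose" => 1,
            // A short cluster made only of `v`s, such as `-v` or `-vvv`.
            Some(vs) if !vs.is_empty() && vs.chars().all(|c| c == 'v') => vs.len(),
            _ => 0,
        })
        .sum()
}

/// Map the count to a level. Assumption: no flag means errors only.
fn level(occurrences: usize) -> LogLevel {
    match occurrences {
        0 => LogLevel::Error,
        1 => LogLevel::Warn,
        2 => LogLevel::Info,
        3 => LogLevel::Debug,
        _ => LogLevel::Trace,
    }
}

fn main() {
    let args = ["format", "-vvv", "my.json"];
    println!("{:?}", level(verbosity(&args)));
}
```

In clap itself, this kind of flag is usually declared with a counting action rather than counted by hand.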
Make common tasks easy and unsurprising
Topiary exposes a number of options. These knobs allow you to change its behaviour in various ways and, while they shouldn’t be hidden away, they shouldn’t obstruct the “happy path”. This echoes the design principles that were laid out for Topiary’s formatting styles: uniform and “good enough”. The same can be said for the CLI: Simplicity, over steampunk.
Some examples I’ve implemented:
- Formatting all your project’s files is just a matter of running topiary format PROJECT_DIR. Topiary will recursively walk the directory (no longer having to rely on shell expansion) and format every file it understands.
- Visualisation defaults to Graphviz DOT output, rather than JSON. A PDF of the syntax tree can now be created with the very natural:

    topiary visualise /path/to/my.file \
      | dot -Tpdf \
      > syntax-tree.pdf

- Topiary’s configuration is sourced from a priority list and then collated in one of three ways. This affords a high degree of customisation, but can make it difficult to introspect the runtime configuration. However, topiary config will not only output the computed configuration (which can then be reused, for reproducibility’s sake) but Topiary will also annotate it in a way that lets you understand where it came from:

    # Configuration collated from the following sources,
    # in priority order (lowest to highest):
    #
    # 1. Built-in configuration
    # 2. /home/user/.config/topiary/languages.toml
    # 3. /home/user/my-project/.topiary/languages.toml
    #
    # Collation mode: Merge

    [[language]]
    name = "json"
    extensions = [
      "json",
    ]

    # etc.
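The first example above, walking a directory and picking out “every file it understands”, can be approximated with the standard library alone. This is a sketch under assumptions: the KNOWN table is a hard-coded stand-in for the extension mapping that Topiary reads from its configuration, the function names are hypothetical, and symbolic links are not given any special treatment.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical extension table; in Topiary this mapping is driven by
/// the configuration rather than hard-coded.
const KNOWN: &[(&str, &str)] = &[("json", "json"), ("toml", "toml"), ("ml", "ocaml")];

/// Infer a language from the file extension, if it is a known one.
fn language_for(path: &Path) -> Option<&'static str> {
    let ext = path.extension()?.to_str()?;
    KNOWN
        .iter()
        .find(|(known, _)| *known == ext)
        .map(|(_, language)| *language)
}

/// Recursively collect every file under `dir` whose language can be inferred.
/// Note: directories reached through symbolic links are followed as-is.
fn collect(dir: &Path, found: &mut Vec<(PathBuf, &'static str)>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            collect(&path, found)?;
        } else if let Some(language) = language_for(&path) {
            found.push((path, language));
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let mut found = Vec::new();
    collect(Path::new("."), &mut found)?;
    for (path, language) in &found {
        println!("{} ({language})", path.display());
    }
    Ok(())
}
```

Files with unknown extensions are skipped silently here; whether to skip, warn or fail is a design decision that this sketch does not try to settle.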
Don’t paint yourself into a corner
You may be thinking:
If Topiary is primarily a formatter, why can’t I just skip the format subcommand and have the CLI assume that as the default?
It’s a good question and one which I asked myself. The most common task for the Topiary CLI will be formatting, so by my earlier admission of “making common tasks easy”, surely this optimisation would be beneficial?
The answer to this has two parts:
- Firstly, clap doesn’t make this easy to do when using its derivation feature. Subcommands can certainly be optional, but forcing the parser into a “default path or subcommand path” state machine, while maintaining coherent usage text, goes significantly against its grain.

  Shortly after publishing this article, Ed Page (the principal contributor of clap) reached out to let me know that clap can actually do this, and quite straightforwardly. As to my second point, below, while generally true, he also suggested an interesting compromise, which the Topiary team will work on implementing.

- More importantly, however, if Topiary commits to providing a stable interface (which it certainly does) then other subcommands (current and future) are effectively blocked by virtue of the positional and arbitrary file inputs. What happens, for example, if you want to format a file called config?
This second point is crucial. Not only does it have the effect of prohibiting legitimate subcommands, but users will come to rely on this shorthand, which fossilises the interface with suboptimal behaviour.
For the same reason, I don’t allow subcommand or long option inference (that is, expanding partial CLI arguments when there are no ambiguities). There are, however, a handful of subcommand aliases which are helpful to make common tasks quicker to type (e.g., fmt for format), or when alternative spellings are common.
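The difference between explicit aliases and prefix inference is easy to show with a lookup table. In this sketch, only the fmt alias is taken from the text; the visualize spelling is an assumed example of an “alternative spelling”, and the Subcommand and resolve names are hypothetical.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Subcommand {
    Format,
    Visualise,
    Config,
    Completion,
}

/// Canonical names and explicit aliases. Apart from `fmt`, the alias
/// shown here is an assumption made for illustration.
const NAMES: &[(&str, Subcommand)] = &[
    ("format", Subcommand::Format),
    ("fmt", Subcommand::Format),
    ("visualise", Subcommand::Visualise),
    ("visualize", Subcommand::Visualise),
    ("config", Subcommand::Config),
    ("completion", Subcommand::Completion),
];

/// Resolve a word by exact match only: partial names are never expanded.
fn resolve(word: &str) -> Option<Subcommand> {
    NAMES
        .iter()
        .find(|(name, _)| *name == word)
        .map(|(_, subcommand)| *subcommand)
}

fn main() {
    for word in ["fmt", "form", "co"] {
        println!("{word}: {:?}", resolve(word));
    }
}
```

With inference enabled, a prefix such as co would be valid only until a second subcommand starting with those letters existed, so adding a subcommand could silently break scripts. A fixed alias table does not have that failure mode.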
See the forest for the trees
The full usage text for what I ended up with is too long to paste into this article, but it can of course be found in Topiary’s documentation. Besides, maybe a little demo would be more illustrative. Here’s how it looks with completion in zsh (i.e., don’t blame Topiary for the clipping):
The path trod to arrive at this destination was not random. It was informed by my own experience with CLI tools, guided by the direction in which clap steers you. Further polish was added on the advice of the Command Line Interface Guidelines book. That said, UX research is a legitimate discipline — involving testing and analytics on, you know, users — which is usually the purview of GUIs. I’m not aware of any UX research done in the realm of CLIs, but this may be an interesting lead to follow.
In the meantime, the changes discussed in this article, as well as numerous “quality of life” improvements to the CLI codebase — and other enhancements to Topiary, as a whole — landed in Topiary v0.3. Making such sweeping changes was a big job that required careful planning and review. However, I think the results speak for themselves and that this upfront investment will pay off as Topiary blooms.
Thanks to Nicolas Jeannerod and Erin van der Veen for their reviews of this article.
About the author
Chris is a recovering mathematician and software engineer at Tweag. He has spent his career working for academia, from both the business and the research sides of the industry. He particularly enjoys writing well-tested, maintainable code that serves a pragmatic end, with a side-helping of DevOps to keep services ticking with minimal fuss.