I wrote a Scala-based SQL REPL called SQLShell. To make it more
portable, I chose to write a Scala readline wrapper API that talks to
multiple underlying readline implementations; that way, SQLShell could use
whatever was available, without the need for special-case code. (The
readline wrapper API, which supports editline, GNU Readline and
JLine, is available in my Grizzled Scala library. See the
grizzled.cmd and grizzled.readline packages.)
Naturally, I wanted to support tab completion. But, as it happens, most completion APIs are a little clunky. They give a bare minimum of information, leaving a fair amount of work to the caller.
For example, the Python readline
module provides for tab completion;
the completion function, according to the module’s documentation, “is
called as function(text, state), for state in 0, 1, 2, …, until it
returns a non-string value. It should return the next possible completion
starting with text.”
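To make that calling convention concrete, here is a rough rendering of it in Scala (Scala rather than Python, to match the rest of this post). The object name, the command list and both function names are invented for illustration; this is only a sketch of the protocol's shape, not the Python module's API.

```scala
object StateCompleterSketch {
  // Hypothetical list of commands to complete against.
  val commands = List(".desc", ".echo", ".set", ".show")

  // Mimics the readline-style contract: the caller passes state = 0, 1, 2, ...
  // and gets back the state-th match, or None once the matches run out.
  def complete(text: String, state: Int): Option[String] =
    commands.filter(_.startsWith(text)).lift(state)

  // What every caller ends up writing: keep bumping the state
  // until the completer stops returning a value.
  def allCompletions(text: String): List[String] =
    Iterator.from(0)
      .map(state => complete(text, state))
      .takeWhile(_.isDefined)
      .collect { case Some(s) => s }
      .toList
}
```

With this sketch, StateCompleterSketch.allCompletions(".s") yields List(".set", ".show"). Note that the completer redoes the filtering on every call; a real one would typically cache the matches when state is 0.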
Well, that’s ugly. But, in all fairness, it merely mimics the ugly approach used by the underlying GNU Readline API. GNU Readline itself is considerably more complicated.
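Another example is editline. You can install a completion callback,
which receives the Editline descriptor and a character. You can then query
the API for information about the current line; you get back a
LineInfo structure that looks like this:

    typedef struct lineinfo {
        const char *buffer;    /* address of buffer */
        const char *cursor;    /* address of cursor */
        const char *lastchar;  /* address of last character */
    } LineInfo;
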
From that structure, you know three things:
- The contents of the current buffer (which can contain more than the current line).
- The last character in the buffer (i.e., the end of the current line).
- The location of the cursor in the buffer.
You have to write your own code to find the token being completed.
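As a rough idea of what that code looks like, here is a minimal Scala sketch that pulls out the token being completed, given the buffer contents and a cursor offset. The object and function names are hypothetical, and it assumes tokens are separated by white space only (no quoting or escaping).

```scala
object TokenAtCursorSketch {
  /** Returns the run of non-white-space characters that ends at the
    * cursor. The result is empty if the cursor follows white space
    * or sits at the start of the buffer.
    */
  def tokenBeforeCursor(buffer: String, cursor: Int): String = {
    // Ignore everything at or after the cursor.
    val upToCursor = buffer.take(cursor)
    // lastIndexWhere returns -1 when there is no white space,
    // which makes the token start at index 0.
    val start = upToCursor.lastIndexWhere(_.isWhitespace) + 1
    upToCursor.substring(start)
  }
}
```

For example, TokenAtCursorSketch.tokenBeforeCursor(".desc fo", 8) returns "fo", and with the buffer ".desc " and the cursor at offset 6 it returns the empty string.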
These approaches have a couple of problems.
First, every client program tends to do the same thing. Every Editline program, for instance, contains similar code to find the token being completed.
Second, the typical completion handler’s code isn’t exactly straightforward and easy to read. By necessity, it mixes lexical parsing (e.g., to find the token) with semantic interpretation (e.g., what does this token mean if it’s here in a line, as opposed to there?).
Using Scala pattern matching, I was able to craft a solution that allows my client code completion handlers to focus primarily on the semantics. You might find this completion approach interesting, or you may find it appalling. I wasn’t sure myself whether I was happy with it, until recently, when I had to fix a completion bug. I found that this approach made it very clear what was going on in the completion handler, and the bug was trivial to fix.
The easiest way to describe the approach is to show how it handles
the .desc command in SQLShell. .desc is used to describe several things:
.desc database describes the currently connected database. For example:
.desc table is used to describe a table. For example:
With the addition of the string “full”, it also gets index information:
Thus, the basic command forms are:

    .desc database
    .desc table_name [full]

Here are some of the completion challenges. In the examples below, the location of the cursor is indicated with [].
A tab pressed at the location of the cursor, below, should complete “.desc”, since there are no other commands starting with “.d”:

    .d[]

By contrast, a tab pressed here does nothing, because the command is already completed:

    .desc[]

In this next case, a tab pressed here should show the choices “database” and the list of tables that are available for completion:

    .desc []

Here, a tab should complete “foo”, since there’s a table named “foo”, and no other candidate starting with “f”:

    .desc f[]

In both of the following cases, pressing a tab should complete the word “full”:

    .desc foo []
    .desc foo f[]

To make this kind of parsing easier to model, my Scala readline adapter API converts the line into a list of tokens.
- A text token is stored in a Some object.
- White space (the delimiter) is represented by Delim objects. All adjacent white space is collapsed into a single delimiter.
- The cursor is represented by a special Cursor token.
- The end of the token stream is denoted by Nil.
Given this input:
the API produces this token list:
Similarly, this input:
produces this token list:
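To make the token representation concrete, here is a minimal sketch of a tokenizer that follows the rules listed above. It is not the Grizzled implementation. The object name, the tokenize function and the Text case class are assumptions made for this example; Text stands in for the Some wrapper so that the list can have a single element type.

```scala
object TokenizerSketch {
  // Hypothetical token types; Text plays the role of the Some wrapper.
  sealed trait Token
  final case class Text(s: String) extends Token
  case object Delim extends Token
  case object Cursor extends Token

  /** Splits `line` into tokens and places a Cursor token at offset
    * `cursor`. A cursor in the middle of a word splits that word in two.
    */
  def tokenize(line: String, cursor: Int): List[Token] = {
    def lex(s: String): List[Token] =
      if (s.isEmpty) Nil
      else if (s.head.isWhitespace)
        // Collapse a whole run of white space into one Delim.
        Delim :: lex(s.dropWhile(_.isWhitespace))
      else {
        val (word, rest) = s.span(c => !c.isWhitespace)
        Text(word) :: lex(rest)
      }

    val (before, after) = line.splitAt(cursor)
    lex(before) ::: (Cursor :: lex(after))
  }
}
```

With this sketch, tokenize(".desc f", 7) gives List(Text(".desc"), Delim, Text("f"), Cursor), and tokenize(".desc   foo ", 12) gives List(Text(".desc"), Delim, Text("foo"), Delim, Cursor).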
With that approach, writing a completion handler is pretty straightforward:
(The subCommandCompleter object is an instance of a stock “List”
completer that is instantiated with a list of choices (“database” and the
list of tables, in this case) and returns zero, one or many completions
from that list. Completing from a list of choices is common, so the API
provides an easy way to do that.)
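As a rough illustration of the pattern-matching style, here is a sketch of a .desc completer. It is not SQLShell's actual handler: the token types (with Text standing in for Some), the ListCompleter class, the command list and the table names are all invented for this example.

```scala
object DescCompleterSketch {
  // Hypothetical token types; Text plays the role of the Some wrapper.
  sealed trait Token
  final case class Text(s: String) extends Token
  case object Delim extends Token
  case object Cursor extends Token

  // Hypothetical stand-in for a stock list completer.
  final class ListCompleter(choices: List[String]) {
    def complete(prefix: String): List[String] =
      choices.filter(_.startsWith(prefix))
  }

  val commands = List(".desc", ".set", ".show") // made-up command set
  val tables = List("foo", "bar")               // made-up table names
  val subCommandCompleter = new ListCompleter("database" :: tables)

  def complete(tokens: List[Token]): List[String] = tokens match {
    // ".d[]": still typing the command; offer nothing if it is already complete.
    case Text(cmd) :: Cursor :: _ =>
      commands.filter(c => c.startsWith(cmd) && c != cmd)

    // ".desc []": show "database" plus every table.
    case Text(".desc") :: Delim :: Cursor :: _ =>
      subCommandCompleter.complete("")

    // ".desc f[]": complete the subcommand or table name.
    case Text(".desc") :: Delim :: Text(prefix) :: Cursor :: _ =>
      subCommandCompleter.complete(prefix)

    // ".desc foo []": only a table can be followed by "full".
    case Text(".desc") :: Delim :: Text(arg) :: Delim :: Cursor :: _
        if arg != "database" =>
      List("full")

    // ".desc foo f[]": complete the word "full".
    case Text(".desc") :: Delim :: Text(arg) :: Delim :: Text(prefix) :: Cursor :: _
        if arg != "database" && "full".startsWith(prefix) =>
      List("full")

    case _ => Nil
  }
}
```

Each case reads like one of the cursor scenarios described earlier, which is the point of the approach: the lexical work happens once, in the tokenizer, and the handler only decides what each token shape means.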
This matching-based approach hides the nitty-gritty parsing details, allowing the completer to focus on the “business logic” of figuring out context and returning the appropriate token. For me, it also more closely mimics how I mentally model the command line being completed.