---
title: "Implementing DICT protocol: Part 1"
date: 2022-01-16
lang: en
categories: [ blog ]
tags: [dict, dictionary, go, golang, rfc2229, tcp ]
draft: true
translationKey: "2022-01-16-Dict-1"
---
## DICT Protocol
What is DICT protocol?
Notable implementations for this include [dict(d)][dict]
and [GNU dico(d)][dico]; the former is the reference implementation that
supports multiple database formats, as listed in [dictfmt (1)][man-dictfmt].
[dict]: https://github.com/cheusov/dictd
[dico]: https://www.gnu.org.ua/software/dico/
[man-dictfmt]: https://linux.die.net/man/1/dictfmt
I intend to implement a server and multiple clients (CLI, GUI, ~~web~~) for this
protocol, as well as some tools to easily create a dictd-readable database.
## Why?
No practical reason, but [dict] is one of the first command line tools
introduced to me and easily one of my favorites, along with curl and [jq][jq].
It's basically just a dictionary app, but it's cool:
- works perfectly in terminal
- easily self-hostable
- fast
- has cool dictionaries (though only Debian, Arch and derivatives distribute
those)
[jq]: /posts/2021-06-13-jq/
Also, I'm writing dictionaries for my [conlangs][conlang] and I want to
distribute them via this protocol. Clearly, implementing a server that is
already implemented doesn't help, but I tend to go down rabbit holes.
[conlang]: /misc/#conlangs
I also like to explore non-web protocols, and starting with something simple
like DICT might be a good idea.
## Reading the spec
The spec (linked at the top of this post) is shorter and easier to read than I
thought. Ignoring the introduction, examples and citations, it's less than 20
pages. There are four classes of commands:
- Querying the database: `DEFINE`, `MATCH`
- `SHOW` metadata about the servers and the databases
- Utilities: announce the `CLIENT` name, check `STATUS`, show `HELP`, set an
`OPTION` and `QUIT`
- Authentication: `AUTH` and `SASLAUTH`
The authentication commands are optional and I don't find them useful, so I
won't implement them. That leaves the first three categories.
## Handling TCP
DICT is based on TCP,
and there is a neat interactive TCP tool called [`telnet`][telnet],
which I used for testing the commands.
[telnet]: https://en.wikipedia.org/wiki/Telnet
### telnet
DICT runs on port 2628:
```sh
$ telnet dict.org 2628
Trying 199.48.130.6...
Connected to dict.org.
Escape character is '^]'.
220 dict.dict.org dictd 1.12.1/rf on Linux 4.19.0-10-amd64 <89168346.27665.1642303045@dict.dict.org>
```
Let's try out some commands to understand how this works. Note that I prefix
each command with `~> ` here so that it stands out from the response, and truncate
long results with `[...]`.
Let's first show what databases there are:
```
~> SHOW DB
110 166 databases present
[...]
.
250 ok
```
There are a lot of dictionaries here, including [GCIDE][gcide], [WordNet][wn],
[The Jargon File][jargon], [V.E.R.A.][vera], [FOLDOC][foldoc], but most of them
are [FreeDict][fd] dictionaries.
To match a word, the syntax is:
```
~> MATCH database strategy word
```
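Building that command line in code could look like the sketch below. The helper names `quoteParam` and `matchCommand` are made up for illustration, and both the double-quote/backslash quoting and the CRLF line ending reflect my reading of RFC 2229 rather than anything tested against every server.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteParam (hypothetical helper) wraps a parameter in double quotes and
// escapes backslashes and double quotes, so that a word containing spaces
// stays a single parameter.
func quoteParam(s string) string {
	r := strings.NewReplacer(`\`, `\\`, `"`, `\"`)
	return `"` + r.Replace(s) + `"`
}

// matchCommand (hypothetical helper) builds one MATCH command line.
// The CRLF terminator is an assumption based on the spec.
func matchCommand(database, strategy, word string) string {
	return fmt.Sprintf("MATCH %s %s %s\r\n", database, strategy, quoteParam(word))
}

func main() {
	// %q makes the quotes and the line ending visible.
	fmt.Printf("%q\n", matchCommand("jargon", "substring", "cargo cult"))
}
```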
Strategy is how the server will match the word you're looking up. To list all
strategies available, send the command:
```
~> SHOW STRATEGIES
```
dictd supports various strategies, for example `substring`, which matches if
the entry contains the queried word as a substring:
```
~> MATCH jargon substring program
152 13 matches found
jargon "c programmer's disease"
jargon "cargo cult programming"
jargon "mickey mouse program"
jargon "perfect programmer syndrome"
jargon "program"
[...]
.
250 ok [d/m/c = 0/13/5775; 0.000r 0.000u 0.000s]
```
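Each line between the `152` status line and the lone period has the shape `database "word"`. Here is a minimal sketch of pulling those two parts out in Go (1.18 or later, for `strings.Cut`). `Match` and `parseMatchLine` are names I made up, and the sketch assumes the word never contains an escaped quote.

```go
package main

import (
	"fmt"
	"strings"
)

// Match (hypothetical type) is one body line of a MATCH response.
type Match struct {
	Database string
	Word     string
}

// parseMatchLine splits a line such as `jargon "cargo cult programming"`.
// It reports false when the line does not have that shape, for example
// for the terminating "." line. Escaped quotes are not handled.
func parseMatchLine(line string) (Match, bool) {
	line = strings.TrimRight(line, "\r\n")
	db, rest, ok := strings.Cut(line, " ")
	if !ok || db == "" {
		return Match{}, false
	}
	rest = strings.TrimSpace(rest)
	if len(rest) < 2 || rest[0] != '"' || rest[len(rest)-1] != '"' {
		return Match{}, false
	}
	return Match{Database: db, Word: rest[1 : len(rest)-1]}, true
}

func main() {
	lines := []string{
		`jargon "c programmer's disease"`,
		`jargon "cargo cult programming"`,
		`.`,
	}
	for _, l := range lines {
		if m, ok := parseMatchLine(l); ok {
			fmt.Printf("%s: %s\n", m.Database, m.Word)
		}
	}
}
```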
This command only shows which words in the database, if any, satisfy the match,
without showing their definitions. To actually view a definition, supply the
dictionary name and the word to the `DEFINE` command. Note that you can also
use `*` as the database for both the `DEFINE` and `MATCH` commands, which
defines/matches across all dictionaries.
```
~> DEFINE * programming
150 3 definitions retrieved
151 "programming" wn "WordNet (r) 3.0 (2006)"
programming
[...]
.
151 "programming" jargon "The Jargon File (version 4.4.7, 29 Dec 2003)"
programming
n.
[...]
.
151 "programming" foldoc "The Free On-line Dictionary of Computing (30 December 2018)"
programming
.
250 ok [d/m/c = 3/0/145; 0.000r 0.000u 0.000s]
```
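Each definition starts with a `151` line carrying the word, the database name and the database description. The sketch below splits such a line in Go. `splitParams`, `DefinitionHeader` and `parseDefinitionHeader` are hypothetical names, and the tokenizer only understands plain double quotes (no backslash escapes), which is enough for the lines shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// splitParams splits a line on spaces and tabs, keeping double-quoted
// text together. Backslash escapes are deliberately not handled.
func splitParams(line string) []string {
	var params []string
	var cur strings.Builder
	inQuotes, hasToken := false, false
	for _, r := range strings.TrimRight(line, "\r\n") {
		switch {
		case r == '"':
			inQuotes = !inQuotes
			hasToken = true // so that "" still yields an empty parameter
		case (r == ' ' || r == '\t') && !inQuotes:
			if hasToken {
				params = append(params, cur.String())
				cur.Reset()
				hasToken = false
			}
		default:
			cur.WriteRune(r)
			hasToken = true
		}
	}
	if hasToken {
		params = append(params, cur.String())
	}
	return params
}

// DefinitionHeader (hypothetical type) holds the fields of a 151 line.
type DefinitionHeader struct {
	Word, Database, Description string
}

func parseDefinitionHeader(line string) (DefinitionHeader, error) {
	p := splitParams(line)
	if len(p) != 4 || p[0] != "151" {
		return DefinitionHeader{}, fmt.Errorf("not a 151 header: %q", line)
	}
	return DefinitionHeader{Word: p[1], Database: p[2], Description: p[3]}, nil
}

func main() {
	h, err := parseDefinitionHeader(`151 "programming" wn "WordNet (r) 3.0 (2006)"`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", h)
}
```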
That's the gist of how to look up words with the DICT protocol. You can find
more commands with:
```
~> HELP
[...]
.
250 ok
```
Finally, to end the session, the command is:
```
~> QUIT
221 bye [d/m/c = 0/0/0; 123.000r 0.000u 0.000s]
```
Note that each response ends with a line containing a single period followed
by a `250 ok` status line (roughly equivalent to HTTP's 200 OK), except for
`QUIT`. These response codes are defined in [the protocol specification][rfc2229].
Commands other than `HELP` have some additional statistics, though this is
optional. I figured out that `d` means definitions, `m` means matches, and `s`
is probably the time it took to query (why are they always zero, though?), but
I have no clue what `c`, `r`, and `u` mean. I might check the [source code][dict]
to figure that out, but let's leave it for another time.
[gcide]: https://gcide.gnu.org.ua/
[wn]: https://wordnet.princeton.edu/
[jargon]: http://www.catb.org/~esr/jargon/
[foldoc]: https://foldoc.org/
[vera]: https://savannah.gnu.org/projects/vera
[fd]: https://freedict.org/
[rfc2229]: https://datatracker.ietf.org/doc/html/rfc2229#page-23
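The status lines seen in this session (`220`, `152`, `250`, `221`) all start with a three-digit code followed by free text. A minimal Go sketch of splitting one apart follows. `Status` and `parseStatus` are hypothetical names, and treating 2xx as success follows the reply-code convention as I understand it from the spec.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Status (hypothetical type) holds one parsed status line.
type Status struct {
	Code    int    // three-digit code, e.g. 250
	Message string // everything after the code, e.g. "ok"
}

// OK reports whether the code is in the 2xx range.
func (s Status) OK() bool { return s.Code/100 == 2 }

// parseStatus expects three digits, then either end of line or a space.
func parseStatus(line string) (Status, error) {
	line = strings.TrimRight(line, "\r\n")
	if len(line) < 3 {
		return Status{}, fmt.Errorf("too short for a status line: %q", line)
	}
	code, err := strconv.Atoi(line[:3])
	if err != nil || code < 100 {
		return Status{}, fmt.Errorf("no status code in %q", line)
	}
	if len(line) > 3 && line[3] != ' ' {
		return Status{}, fmt.Errorf("no space after the code in %q", line)
	}
	return Status{Code: code, Message: strings.TrimSpace(line[3:])}, nil
}

func main() {
	for _, line := range []string{
		"250 ok [d/m/c = 0/13/5775; 0.000r 0.000u 0.000s]",
		"221 bye [d/m/c = 0/0/0; 123.000r 0.000u 0.000s]",
		".",
	} {
		s, err := parseStatus(line)
		if err != nil {
			fmt.Println("not a status line:", err)
			continue
		}
		fmt.Printf("%d ok=%t %q\n", s.Code, s.OK(), s.Message)
	}
}
```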
### Go
Of course we are not going to make the users type these commands (though it's
not too unintuitive and can be easily remembered). I chose Go to build the CLI
client, though without any conscious consideration of fitness. I'm trying out
new things[^0] after all.
From the [doc][go-net], we can figure out how to make a TCP connection.
```go
conn, err := net.Dial("tcp", "golang.org:80")
if err != nil {
	// handle error
}
fmt.Fprintf(conn, "GET / HTTP/1.0\r\n\r\n")
status, err := bufio.NewReader(conn).ReadString('\n')
// ...
```
Let's copy that and replace the HTTP request with DICT commands:
```go
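package main

import (
	"bufio"
	"fmt"
	"net"
)

func main() {
	conn, err := net.Dial("tcp", "dict.org:2628")
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	buf := bufio.NewReader(conn)
	// Send both commands up front; the server answers them in order.
	fmt.Fprintf(conn, "MATCH jargon word programming\n")
	fmt.Fprintf(conn, "QUIT\n")
	for {
		response, err := buf.ReadString('\n')
		if err != nil {
			// oftentimes this is EOF error
			fmt.Println(err)
			break
		}
		// Print, not Printf: the response is data, not a format string.
		fmt.Print(response)
	}
}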
```
Running this code, we get the response:
```
220 dict.dict.org dictd 1.12.1/rf on Linux 4.19.0-10-amd64 <89266600.1914.1642341395@dict.dict.org>
152 4 matches found
jargon "cargo cult programming"
jargon "programming"
jargon "programming fluid"
jargon "voodoo programming"
.
250 ok [d/m/c = 0/4/3814; 0.000r 0.000u 0.000s]
221 bye [d/m/c = 0/0/0; 0.000r 0.000u 0.000s]
EOF
```
which is a good start.
There is a problem with this code: we are currently reading line by line
rather than reading the whole response to each command. This way, we can't tell
whether line 3 of the output belongs to the response to the first command or
the second. One solution is to check whether the line is prefixed with a status
code, but is there a better one?
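
For reference, the status-code check just mentioned could be sketched roughly like this. All three function names are made up, and the sketch assumes that a line starting with a 2xx, 4xx or 5xx code closes a response. That assumption breaks as soon as a line of definition text happens to start with three digits, which is exactly why it feels unsatisfying.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// isFinalStatus reports whether line starts with a 2xx, 4xx or 5xx code,
// which this sketch treats as the end of one command's response.
func isFinalStatus(line string) bool {
	if len(line) < 3 {
		return false
	}
	for i := 0; i < 3; i++ {
		if line[i] < '0' || line[i] > '9' {
			return false
		}
	}
	if len(line) > 3 && line[3] != ' ' {
		return false
	}
	return line[0] == '2' || line[0] == '4' || line[0] == '5'
}

// readResponse collects lines until one carries a final status code.
func readResponse(r *bufio.Reader) ([]string, error) {
	var lines []string
	for {
		line, err := r.ReadString('\n')
		if err != nil {
			return lines, err
		}
		line = strings.TrimRight(line, "\r\n")
		lines = append(lines, line)
		if isFinalStatus(line) {
			return lines, nil
		}
	}
}

// splitSession groups the lines of a recorded session into responses.
// Any read error, typically io.EOF, ends the loop; an unfinished
// trailing response is dropped.
func splitSession(session string) [][]string {
	r := bufio.NewReader(strings.NewReader(session))
	var responses [][]string
	for {
		lines, err := readResponse(r)
		if err != nil {
			return responses
		}
		responses = append(responses, lines)
	}
}

func main() {
	// Canned input modelled on the output above, so no network is needed.
	session := "220 dict.dict.org dictd 1.12.1/rf\r\n" +
		"152 4 matches found\r\n" +
		"jargon \"programming\"\r\n" +
		".\r\n" +
		"250 ok [d/m/c = 0/4/3814; 0.000r 0.000u 0.000s]\r\n" +
		"221 bye [d/m/c = 0/0/0; 0.000r 0.000u 0.000s]\r\n"
	for i, resp := range splitSession(session) {
		fmt.Printf("response %d: %d line(s), last: %s\n", i+1, len(resp), resp[len(resp)-1])
	}
}
```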
Let's wait till next week!
[go-net]: https://pkg.go.dev/net
[^0]: Not really, I've written a CLI client for the Wiktionary API in Go before.