an endless dictionary generated by a recurrent neural net

UPDATE: GLOSSATORY has had its neural net retrained on the same data set. The new net is bigger (512 x three layers) and has a better sense of grammar: it also produces more interesting content at lower temperatures.

@GLOSSATORY is a bot which tweets absurd definitions generated by a recurrent neural network (RNN). The RNN was trained on a set of 82,115 definitions extracted from WordNet, a lexical database.

Even though it only samples and generates text one character at a time, the net has learned quite a lot about how dictionaries are written: parenthetical comments which qualify a definition by restricting its context, plausible date ranges, very common words like the names of countries, and stylistic features which are specific to a particular subdomain.
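To make "one character at a time" and "temperature" a bit more concrete, here is a minimal sketch of temperature sampling in Python. It is not the torch-rnn code: the `toy_model` function and its scores are made up for illustration and stand in for the trained net, which would return a different set of scores for every context.

```python
import math
import random


def sample_char(scores, temperature=1.0, rng=random):
    """Pick one character from a {char: unnormalised log-score} dict."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    chars = list(scores)
    # Dividing by the temperature sharpens (<1) or flattens (>1) the distribution.
    scaled = [scores[c] / temperature for c in chars]
    # Subtract the maximum before exponentiating, for numerical stability.
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(chars, weights=weights, k=1)[0]


def generate(next_scores, start_text, length, temperature=1.0, rng=random):
    """Extend start_text by `length` characters, one sampled character at a time.

    next_scores is a stand-in for the trained net: any callable that maps
    the text so far to a {char: score} dict.
    """
    text = start_text
    for _ in range(length):
        text += sample_char(next_scores(text), temperature, rng)
    return text


def toy_model(text):
    # Hypothetical fixed scores; a real net conditions these on `text`.
    return {"a": 2.0, "b": 1.0, " ": 0.5}
```

At a very low temperature the highest-scoring character wins nearly every time, so the output is safe but repetitive; at higher temperatures the less likely characters get a chance, which is where the stranger definitions come from.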

I love the combination of superficial plausibility and phantom semantic depth in its output: it reminds me of Joyce or Ben Marcus' The Age of Wire and String.

With other bots where the actual generation was computationally expensive or had fiddly software requirements, I've generated the content in advance and then set up some Python on a host to tweet it, but I wanted @GLOSSATORY to be live. I wasn't able to install the Torch environment on my web hosting provider, so instead I used my Raspberry Pi: on this hardware, it takes about twenty seconds to generate each tweet. I like the idea of a toy brain sitting in my loungeroom and generating little pellets of nonsense every three hours. Perhaps the reason I wanted it to be live was that it has the rudiments of a personality.

Also available in 500 characters on Mastodon:

If you like it or have any comments or suggestions, drop me a line at @bombinans.

Mike Lynch 2016-2017

Technical notes and credits

I built @GLOSSATORY with Justin Johnson's Torch-RNN, an efficient recurrent neural net implemented in the Lua scientific framework Torch, after reading Andrej Karpathy's excellent blog post on the subject. The source code for the Python side of it is on my GitHub, although most of the action is in the torch-rnn code.

You can also download the new RNN itself (65M t7 file) - you'll need Torch-RNN to use it.
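For a rough idea of what a Python wrapper around torch-rnn can look like, here is a sketch. It is not the code from my GitHub. The `-checkpoint`, `-length`, `-temperature` and `-gpu` options are torch-rnn's own `sample.lua` flags; everything else (the function names, the default values, the 140-character limit, and the assumption that the net emits one definition per line) is illustrative only.

```python
import subprocess

TWEET_LIMIT = 140  # assumption: the tweet length limit to fit within


def build_sample_command(checkpoint, length=500, temperature=0.5):
    """Return the argv list for torch-rnn's sample.lua, running on the CPU."""
    return [
        "th", "sample.lua",
        "-checkpoint", checkpoint,
        "-length", str(length),
        "-temperature", str(temperature),
        "-gpu", "-1",  # -1 selects CPU mode, e.g. on a Raspberry Pi
    ]


def pick_definition(raw, limit=TWEET_LIMIT):
    """Return the first complete line of output that fits the limit, or None."""
    lines = raw.split("\n")
    # Sampling stops after a fixed number of characters, so the final
    # piece is probably cut off mid-definition: skip it.
    for line in lines[:-1]:
        line = line.strip()
        if line and len(line) <= limit:
            return line
    return None


def generate_tweet(checkpoint, torch_rnn_dir, temperature=0.5):
    """Run sample.lua inside the torch-rnn checkout and pick a tweetable line."""
    result = subprocess.run(
        build_sample_command(checkpoint, temperature=temperature),
        cwd=torch_rnn_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return pick_definition(result.stdout)
```

Calling `generate_tweet` with the path to the downloaded t7 file and the path to a torch-rnn checkout (both of which you supply) returns a string ready to post, or `None` if nothing in that sample fits, in which case the simplest thing is to sample again.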