Projects

This page describes most of my recent projects and a selection of my favorite projects from years past. All of these are side projects, which means they were developed in my spare time.

Rust projects

Lately, I’ve become quite enamored with Rust. I very much enjoy the particular trade-offs made in the language and like the direction in which the language is headed.

Regular expressions

A port of RE2 to Rust. It also features a regex! macro that will compile a regex to native code when your Rust program compiles.

ripgrep

A search tool that combines the usability of The Silver Searcher with the raw speed of GNU grep. I wrote a blog post detailing a benchmark and how I achieved such high performance: ripgrep is faster than {grep, ag, git grep, ucg, pt, sift}.

QuickCheck

This was my first Rust project. If you’ve never heard of QuickCheck before, it’s an awesome tool for random testing of program properties. My Rust library is a port of the Haskell QuickCheck package.

While randomized testing can often produce hard-to-debug witnesses, QuickCheck has a “shrinking” component that tries to discover small witnesses of failure. (In practice, good shrinkers often report minimal witnesses consistently.)
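
To make the idea of shrinking concrete, here is a minimal sketch that uses only the standard library. It is not the API of my QuickCheck library: the names `shrink_vec`, `find_minimal` and `no_negatives` are hypothetical, and the only shrinking strategy assumed is "drop one element at a time."

```rust
/// Candidate "smaller" versions of an input: each one drops a single element.
/// (Hypothetical helper; real shrinkers try many more strategies.)
fn shrink_vec(xs: &[i32]) -> Vec<Vec<i32>> {
    (0..xs.len())
        .map(|i| {
            let mut smaller = xs.to_vec();
            smaller.remove(i);
            smaller
        })
        .collect()
}

/// Greedily shrink a failing input: keep replacing it with any smaller
/// candidate that still fails the property, until no candidate fails.
fn find_minimal<P: Fn(&[i32]) -> bool>(prop: P, failing: Vec<i32>) -> Vec<i32> {
    let mut current = failing;
    'outer: loop {
        for candidate in shrink_vec(&current) {
            if !prop(candidate.as_slice()) {
                current = candidate;
                continue 'outer;
            }
        }
        // No smaller candidate fails, so `current` is a local minimum.
        return current;
    }
}

/// A deliberately false property: "no vector contains a negative number".
fn no_negatives(xs: &[i32]) -> bool {
    xs.iter().all(|&x| x >= 0)
}
```

Calling `find_minimal(no_negatives, vec![3, -7, 12, 0, 5])` discards every element that is irrelevant to the failure and leaves a one-element witness.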

CSV parser

This provides convenient type-based encoding and decoding of CSV data. It is also extremely fast: its performance should be on par with libcsv.

There are some rough benchmarks demonstrating its performance.

xsv

A command line Swiss Army knife for working with CSV data. It supports basic indexing of CSV data, which makes it able to perform operations like slicing very quickly.
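
Why an index makes slicing fast can be shown with a rough sketch. This is not how xsv is implemented: `build_index` and `slice_rows` are hypothetical names, and the sketch assumes every record ends in a newline and that no field contains a quoted newline, which real CSV data does not guarantee.

```rust
/// Record the byte offset at which each row starts.
/// Assumption: rows are separated by '\n' and fields never contain newlines.
fn build_index(data: &str) -> Vec<usize> {
    let mut offsets = Vec::new();
    let mut start = 0;
    for line in data.split_inclusive('\n') {
        offsets.push(start);
        start += line.len();
    }
    offsets
}

/// Return rows `from..to` (end exclusive) by jumping straight to their byte
/// offsets, without scanning the rows that come before them.
fn slice_rows<'a>(data: &'a str, index: &[usize], from: usize, to: usize) -> &'a str {
    let to = to.min(index.len());
    if from >= to {
        return "";
    }
    let begin = index[from];
    let end = if to < index.len() { index[to] } else { data.len() };
    &data[begin..end]
}
```

Once the index exists, the cost of a slice depends on the size of the slice rather than on where it sits in the file.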

docopt.rs

Docopt for Rust. This is a command line argument parser, where the parser is generated by writing the “usage” description for your program.

CBOR

An implementation of RFC 7049 (Concise Binary Object Representation). CBOR is effectively binary JSON.
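
As a small illustration of what "binary JSON" means in practice, here is a sketch of how RFC 7049 encodes unsigned integers (major type 0). It uses only the standard library and is not the API of my crate; `encode_uint` is a hypothetical name.

```rust
/// Encode an unsigned integer as a CBOR data item (major type 0).
/// Values below 24 fit in the initial byte; larger values use the shortest
/// of the 1, 2, 4 or 8 byte big-endian forms.
fn encode_uint(n: u64) -> Vec<u8> {
    let mut out = Vec::new();
    if n < 24 {
        out.push(n as u8);
    } else if n <= u8::MAX as u64 {
        out.push(0x18);
        out.push(n as u8);
    } else if n <= u16::MAX as u64 {
        out.push(0x19);
        out.extend_from_slice(&(n as u16).to_be_bytes());
    } else if n <= u32::MAX as u64 {
        out.push(0x1a);
        out.extend_from_slice(&(n as u32).to_be_bytes());
    } else {
        out.push(0x1b);
        out.extend_from_slice(&n.to_be_bytes());
    }
    out
}
```

Small numbers cost a single byte, whereas JSON spends one byte per decimal digit.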

suffix

An implementation of a linear-time suffix array construction algorithm (SACA). To my knowledge, this is the only fast standalone SACA implementation that operates on Unicode codepoints and is convenient to use.
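
For readers who have not met suffix arrays, the following sketch shows what one contains and how it supports substring search. It is deliberately naive: it sorts all suffixes directly, which is far from linear time, and it is not the algorithm or API of the suffix crate. `naive_suffix_array` and `contains` are hypothetical names.

```rust
/// Build a suffix array over Unicode codepoints by sorting every suffix.
/// Entry `i` is the codepoint index at which the `i`-th smallest suffix starts.
/// This is a naive construction for illustration, not a linear-time SACA.
fn naive_suffix_array(text: &str) -> Vec<usize> {
    let chars: Vec<char> = text.chars().collect();
    let mut sa: Vec<usize> = (0..chars.len()).collect();
    sa.sort_by(|&a, &b| chars[a..].cmp(&chars[b..]));
    sa
}

/// Binary search the suffix array for a suffix that starts with `pattern`.
fn contains(text: &str, sa: &[usize], pattern: &str) -> bool {
    let chars: Vec<char> = text.chars().collect();
    let pat: Vec<char> = pattern.chars().collect();
    sa.binary_search_by(|&start| {
        // Compare only the first `pat.len()` codepoints of this suffix.
        let end = (start + pat.len()).min(chars.len());
        chars[start..end].cmp(&pat[..])
    })
    .is_ok()
}
```

Because the positions are codepoint indices rather than byte offsets, text such as "héé" is handled without splitting a multi-byte character.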

Elastic tabstops

An easy-to-use writer for aligning columns of data on the command line. This is a simple port of Go’s text/tabwriter package.
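
The core idea of tab-based column alignment can be sketched with the standard library alone. This is a simplification and not the crate's API: `align_columns` is a hypothetical name, and the sketch uses one width per column for the whole input, whereas elastic tabstops proper size each column per block of adjacent lines.

```rust
/// Align tab-separated cells into columns padded with spaces.
/// `padding` is the minimum number of spaces between columns.
fn align_columns(input: &str, padding: usize) -> String {
    let rows: Vec<Vec<&str>> = input
        .lines()
        .map(|line| line.split('\t').collect())
        .collect();

    // Find the widest cell (in codepoints) in each column.
    let mut widths: Vec<usize> = Vec::new();
    for row in &rows {
        for (i, cell) in row.iter().enumerate() {
            let w = cell.chars().count();
            if i == widths.len() {
                widths.push(w);
            } else if w > widths[i] {
                widths[i] = w;
            }
        }
    }

    let mut out = String::new();
    for row in &rows {
        for (i, cell) in row.iter().enumerate() {
            out.push_str(cell);
            // Pad every cell except the last one in its row.
            if i + 1 < row.len() {
                let pad = widths[i] - cell.chars().count() + padding;
                out.extend(std::iter::repeat(' ').take(pad));
            }
        }
        out.push('\n');
    }
    out
}
```

Widths are counted in codepoints, which is only an approximation of display width for wide or combining characters.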

byteorder

A simple crate for reading and writing integers in a binary format. Its purpose is to handle endianness conversions automatically.
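
To show what an endianness conversion involves, here is a sketch built on the standard library's integer methods (`from_be_bytes`, `from_le_bytes`, `to_be_bytes`) rather than on byteorder itself. The function names `read_u32_be`, `read_u32_le` and `write_u32_be` are hypothetical and are not the crate's API.

```rust
use std::convert::TryInto;

/// Read a u32 from the first four bytes of `buf`, interpreted as big-endian.
/// Returns None if the buffer holds fewer than four bytes.
fn read_u32_be(buf: &[u8]) -> Option<u32> {
    let bytes: [u8; 4] = buf.get(..4)?.try_into().ok()?;
    Some(u32::from_be_bytes(bytes))
}

/// Read the same four bytes, interpreted as little-endian.
fn read_u32_le(buf: &[u8]) -> Option<u32> {
    let bytes: [u8; 4] = buf.get(..4)?.try_into().ok()?;
    Some(u32::from_le_bytes(bytes))
}

/// Append `n` to `out` in big-endian byte order.
fn write_u32_be(out: &mut Vec<u8>, n: u32) {
    out.extend_from_slice(&n.to_be_bytes());
}
```

The same four bytes yield different integers depending on the byte order chosen, which is why a binary format has to fix one and stick to it.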

walkdir

A crate for recursively iterating over a directory tree efficiently. It supports following symbolic links and limiting the number of simultaneously open file descriptors.

chan

Go-style multi-producer/multi-receiver channels. Includes a chan_select! macro with semantics similar to Go’s select statement.

fst

Provides ordered sets and maps represented by finite state machines. This representation compresses large numbers of keys while retaining the ability to search them. I wrote a blog post describing them in detail: Index 1,600,000,000 Keys with Automata and Rust.

Go projects

While Rust and Go are two very different languages, I enjoy using both of them. I love Go for its simplicity, orthogonal design and concurrency primitives.

Wingo (an X window manager)

I have a long history with X that culminated in the development of my ideal window manager. My ideal window manager has support for dynamic workspaces, independent workspace switching on each monitor, automatic tiling and good traditional stacking support.

Wingo is written in pure Go from bottom to top.

X Go Binding

This is a pure Go port of XCB with a concurrent implementation of the X wire protocol.

TOML

An encoder and decoder for TOML, an ini-like config file format. Incidentally, TOML is used to configure Hugo, which is used to generate this web site.

goim

A command line tool for downloading all of IMDb and loading it into a SQLite or PostgreSQL database. It comes with an elaborate search API complete with fuzzy searching. It also has a rename tool that can be used to automatically rename your media files quickly.

For example, this renames all of the Simpsons episodes that I ripped from the DVDs I own:

goim rename -tv 'the simpsons' S01E*.mkv

And here’s an example of searching, e.g., finding all episodes of the Simpsons with “maggie” in the episode title:

goim search '%maggie%' {show:the simpsons}

migration

A simple library for automatically applying migrations to relational databases that support schema changes in transactions. It’s used in goim.

Type parametric functions

Uses runtime reflection to make writing type parametric functions easier.

Fantasy football projects

I love American football. Part of the way I enjoy it is by playing fantasy football. I think it’s a lot of fun to watch how your fantasy teams perform as the games are played, but I’m usually in several leagues on different web sites, which makes it hard to follow all of my teams in one place.

I solved this problem over a number of years by querying NFL.com for data and using it to compute scores for my fantasy teams all in one convenient web interface. (Some day I will make a demo video.)

nflgame

A library for retrieving an undocumented JSON feed from NFL.com. It is very accessible to those with little programming experience, but over the years I’ve come to dislike the API I designed. I very rarely use nflgame directly anymore.

nfldb

Replaces (but builds on) nflgame by storing statistics in a relational schema with better documentation. It requires a PostgreSQL database, so it tends to be less widely used.

nflvid

A tool for downloading broadcast footage from the NFL’s content delivery network and slicing it into play-by-play components. This comes with an nflvid-watch command line utility for searching and loading plays into a VLC playlist.

nflfan

A single-user web application for viewing your fantasy teams’ scores while games are being played. It also allows you to query for any statistic back to 2009 and watch the corresponding play footage.

Other projects

These projects just don’t fit in the above categories.

pdoc

An automatic documentation tool for Python. I think Sphinx is far too complicated and too painful to use. epydoc was my go-to tool for a long time, but the code has been abandoned and is not compatible with Python 3. pdoc is meant to be its replacement.

erd

An Entity-Relationship diagram generator written in Haskell. It takes a plain text description of your relational schema and converts it to a pretty diagram using GraphViz.