Huffman coding on NMEA sentences

I’m impressed with how well Huffman encoding works on the very verbose, very repetitive, ASCII-based NMEA GPS sentences. I hacked up a Python script that bakes a fixed dictionary from example data, and a device-side C++ encoder that encodes against that dictionary. The encoder is 46 statements, uses ~10 bytes of RAM, and still gets almost 3:1 compression.
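
As a rough illustration of the dictionary-baking step, here is a minimal sketch of building a per-character Huffman table from sample data. It is not the actual script: the function names (`build_huffman_table`, `encode`, `decode`) and the two sample sentences are assumptions made for the example, and the codes are kept as strings of "0" and "1" for readability rather than packed into bytes.

```python
import heapq
from collections import Counter

# Illustrative sample sentences (assumed; not the test file from the post).
SAMPLE = (
    "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47\n"
    "$GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W*6A\n"
)

def build_huffman_table(symbols):
    """Return {symbol: bit string} built from the symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tiebreak, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing one bit to each side.
        w0, _, t0 = heapq.heappop(heap)
        w1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t0.items()}
        merged.update({s: "1" + c for s, c in t1.items()})
        heapq.heappush(heap, (w0 + w1, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def encode(symbols, table):
    """Concatenate the code for each symbol."""
    return "".join(table[s] for s in symbols)

def decode(bits, table):
    """Walk the bit string, emitting a symbol whenever a full code matches."""
    inverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

if __name__ == "__main__":
    table = build_huffman_table(SAMPLE)
    bits = encode(SAMPLE, table)
    print(f"{len(SAMPLE) * 8} bits raw, {len(bits)} bits encoded")
```

Because the table is fixed ahead of time, a device-side encoder only needs a lookup from symbol to code plus a small bit buffer, which fits the tiny RAM footprint described above.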

For comparison, on my 135,548-byte test file:

  • Treating each character as a symbol gives 58,749 B (2.31x)
  • Treating the talker and sentence ID (‘GPGGA’) and each non-numeric field as a symbol gives 46,104 B (2.94x)
  • lzop gives 22,161 B (6.12x)
  • gzip gives 12,167 B (11.1x)
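
The second scheme in the list can be sketched as a tokenizer that runs before the Huffman step. This is a guess at the splitting rules, not the actual script: the `tokenize` name is hypothetical, and it assumes that the leading address field and any non-empty field without digits become single symbols, while commas and everything else are emitted character by character.

```python
def tokenize(sentence):
    """Split one NMEA sentence into Huffman symbols (assumed rules)."""
    tokens = []
    for i, field in enumerate(sentence.split(",")):
        if i > 0:
            # Keep the separators so the sentence can be rebuilt exactly.
            tokens.append(",")
        is_address = i == 0
        is_non_numeric = bool(field) and not any(ch.isdigit() for ch in field)
        if is_address or is_non_numeric:
            tokens.append(field)   # whole field is one symbol, e.g. "$GPGGA"
        else:
            tokens.extend(field)   # numeric or mixed field: one symbol per character
    return tokens

if __name__ == "__main__":
    line = "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47"
    print(tokenize(line)[:10])
```

Joining the tokens back together reproduces the input, so the tokenizer is lossless. The saving comes from multi-character symbols such as the address field costing one code instead of six.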

The end goal is to strap a GPS, LPC810, and 32 KiB data flash to my plane and record the track while flying over the Allmend. Off the shelf is too easy.

Michael Hope
Software Engineer