GeoJSON has become the de facto standard for transferring, manipulating and visualizing geospatial data on the web. Major mapping libraries like Leaflet and OpenLayers have great support for it, making it pretty straightforward to go from raw data to an interactive web map. GitHub has gone so far as to show geodata in a map view if it’s committed to a repository with the .geojson extension. No doubt GeoJSON has become one of the most widespread geospatial formats.
I found a blog post written by Bjørn 2 years ago where he goes through simple tricks that help make GeoJSON files smaller, such as removing whitespace, newlines and trailing zeros. But there’s more we can do to achieve even better results.
Delta and zigzag encoding
Since lines and polygons are a series of consecutive coordinates, it takes less space to store the difference between consecutive coordinates than the actual coordinates. Let’s take an example and look at the geometry for a simple triangle:
As we can see, each coordinate is represented separately as an array of two numbers. Because coordinates in a polygon always follow the same order, we can get rid of all the inner square brackets and store the coordinates in a plain array:
By storing the difference between coordinates we can save even more space. We need to keep the first coordinate in its original form so that we can restore the coordinates later on: each following coordinate is recovered by adding its difference to the one before it.
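The flattening and delta steps can be sketched in a few lines of JavaScript. This is a minimal illustration, not the geojson-minifier source: the function name `deltaEncode`, the `precision` parameter and the triangle coordinates are all made up for the example, and scaling coordinates to integers before differencing is an assumption made here to keep floating-point noise out of the deltas.

```javascript
// Hypothetical helper, not part of any library: flattens a ring of
// [x, y] pairs, scales the values to integers and delta-encodes them.
function deltaEncode(ring, precision) {
  const factor = Math.pow(10, precision);
  const out = [];
  let prevX = 0;
  let prevY = 0;
  for (const [x, y] of ring) {
    // Assumption: round to a fixed number of decimals so deltas are exact integers.
    const ix = Math.round(x * factor);
    const iy = Math.round(y * factor);
    // The first pair is measured from 0, so it is stored unchanged (in scaled form).
    out.push(ix - prevX, iy - prevY);
    prevX = ix;
    prevY = iy;
  }
  return out;
}

// Made-up triangle (closed ring, so the first point is repeated at the end).
const triangle = [
  [10.7522, 59.9139],
  [10.7601, 59.9139],
  [10.7560, 59.9201],
  [10.7522, 59.9139]
];

console.log(JSON.stringify(deltaEncode(triangle, 4)));
// [107522,599139,79,0,-41,62,-38,-62]
```

Only the first pair keeps its full length here; every later entry is a small offset from the point before it.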
Now the 6 and 7 digit coordinates have become 4 and in some cases even 2 digits. We can go even further and apply zigzag encoding, which maps signed integers to unsigned ones (0, -1, 1, -2 become 0, 1, 2, 3), so that we get rid of negative signs, ending up with this result:
With this approach we trimmed a 71-character geometry to 46 characters, and that was with only 4 coordinates. Imagine a GeoJSON file with the borders of the world’s countries, where geometries are much more complex, with thousands of coordinates in them.
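Zigzag encoding and the trip back to plain coordinates can be sketched in JavaScript as well. This is an illustration under stated assumptions rather than how geojson-minifier is implemented: `zigzag`, `unzigzag` and `decodeRing` are hypothetical names, the input is assumed to be a flat array of integer deltas at a known precision, the sample values are made up, and for simplicity zigzag is applied to every value, including the first pair.

```javascript
// Hypothetical helpers for illustration only.

// Map signed integers to non-negative ones: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4.
function zigzag(n) {
  return n < 0 ? -2 * n - 1 : 2 * n;
}

// Inverse of zigzag: even values were non-negative, odd values were negative.
function unzigzag(z) {
  return z % 2 === 0 ? z / 2 : -(z + 1) / 2;
}

// Rebuild [x, y] pairs from a flat, zigzag-encoded array of deltas.
// Assumption: values were scaled by 10^precision before encoding.
function decodeRing(encoded, precision) {
  const factor = Math.pow(10, precision);
  const ring = [];
  let x = 0;
  let y = 0;
  for (let i = 0; i < encoded.length; i += 2) {
    // Running sums undo the delta step.
    x += unzigzag(encoded[i]);
    y += unzigzag(encoded[i + 1]);
    ring.push([x / factor, y / factor]);
  }
  return ring;
}

// Made-up deltas for a small closed triangle, 4 decimals of precision.
const deltas = [107522, 599139, 79, 0, -41, 62, -38, -62];
const packed = deltas.map(zigzag);

console.log(JSON.stringify(packed));
// [215044,1198278,158,0,81,124,75,123]

console.log(JSON.stringify(decodeRing(packed, 4)));
// [[10.7522,59.9139],[10.7601,59.9139],[10.756,59.9201],[10.7522,59.9139]]
```

Dividing the decoded integers by the same factor restores the coordinates at the chosen precision; anything beyond that precision is lost in the rounding step.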
Dane Springmeyer mentioned in his talk at FOSS4G’14 that geometries in vector tiles are delta and zigzag encoded, which results in much more efficient storage: OpenStreetMap data for the whole planet can fit on a USB stick. I was curious to find out how this could be applied to GeoJSON and wrote a simple utility for that.
geojson-minifier can be integrated into an existing Node.js application or used as a standalone command-line tool. Going back to Bjørn’s blog post, let’s run the files he ended up with through the minifier:
Unpacking minified GeoJSON is just as easy:
As we can see, the minified version is almost 3 times smaller. Here’s what fragments of the uncompressed (left) and compressed (right) files look like:
I can see 2 scenarios where this utility can be applied. If the GeoJSON files are static and don’t change often, running them through the minifier can be a one-time job, after which the compressed version is served to the client. If the GeoJSON is generated dynamically, for instance from PostGIS, one could use geojson-minifier’s pack/unpack methods to integrate with an existing Node.js application. Once the data has been transferred to the browser, one could use the unpack method to convert the minified GeoJSON back to its original format. Even better would be to write plugins for Leaflet or OpenLayers that build geometries on the fly from minified GeoJSON.
Give it a try and let me know what you think!