geochunk

version: 1.0.0
created_at: 2017-10-20 13:15:30.491344
updated_at: 2022-05-25 15:03:10.873435
description: Split a CSV semi-evenly based on ZIP population stats
homepage: http://blog.faraday.io/geochunk-fast-intelligent-splitting-for-piles-of-address-data/
repository: https://github.com/faradayio/csv-tools
id: 36353
size: 665,716
author: Eric Kidd (emk)

geochunk: Break data up into chunks of similar population

geochunk is intended for use in a distributed system. It provides a deterministic mapping from zip codes to "geochunks" that you can count on remaining stable. Geochunks will try to approximate the population size that you specify.

See the blog post on geochunk (http://blog.faraday.io/geochunk-fast-intelligent-splitting-for-piles-of-address-data/) for an introduction and some pretty pictures.

Usage

Run geochunk --help for usage instructions.

geochunk - Partition data sets by estimated population.

Usage:
  geochunk export <type> <population>
  geochunk csv <type> <population> <input-column>
  geochunk (--help | --version)

Options:
  --help        Show this screen.
  --version     Show version.

Commands:
  export        Export the geochunk mapping for use by another program.
  csv           Add a geochunk column to a CSV file (used in a pipeline).

Types:
  zip2010       Use 2010 Census zip code population data.
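
As an illustration of how another program might consume an exported mapping, here is a minimal sketch in Rust. It assumes the mapping has already been loaded into an in-memory table keyed by zip code prefix; the table layout, the chunk IDs, and the function name `chunk_for` are hypothetical and do not describe the actual output format of `geochunk export`.

```rust
use std::collections::HashMap;

/// Hypothetical lookup: find the chunk for a zip code by trying its
/// longest prefix first, then progressively shorter ones.
/// `mapping` is an assumed prefix -> chunk ID table, not geochunk's real format.
fn chunk_for<'a>(mapping: &'a HashMap<String, String>, zip: &str) -> Option<&'a str> {
    (1..=zip.len())
        .rev()
        // `get` returns None instead of panicking on a bad char boundary.
        .filter_map(|n| zip.get(..n))
        .find_map(|prefix| mapping.get(prefix).map(String::as_str))
}
```

Because the lookup depends only on the table and the zip code, every node in a distributed system that loads the same table assigns a given record to the same chunk.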

How it works

See the Jupyter notebook, which explains the algorithm. We use census data to build variable-length zip code prefixes, and then try to group those prefixes together in a way that balances population size as much as possible.
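
The two steps described above can be sketched roughly as follows. This is not the crate's actual code: the function names `build_prefixes` and `group_prefixes`, the assumption that every zip code is exactly five digits, and the use of first-fit-decreasing packing for the grouping step are all illustrative choices. The notebook remains the authoritative description.

```rust
use std::collections::BTreeMap;

/// Step 1 (hypothetical): build variable-length prefixes. A prefix is kept
/// once it covers at most `target` people or is a full 5-digit zip code;
/// otherwise it is split into its ten one-digit extensions.
fn build_prefixes(pops: &BTreeMap<String, u64>, target: u64) -> BTreeMap<String, u64> {
    let mut out = BTreeMap::new();
    split("", pops, target, &mut out);
    out
}

fn split(prefix: &str, pops: &BTreeMap<String, u64>, target: u64, out: &mut BTreeMap<String, u64>) {
    let total: u64 = pops
        .iter()
        .filter(|(zip, _)| zip.starts_with(prefix))
        .map(|(_, pop)| *pop)
        .sum();
    if total == 0 {
        return; // no populated zip codes under this prefix
    }
    if total <= target || prefix.len() == 5 {
        out.insert(prefix.to_string(), total);
        return;
    }
    for digit in '0'..='9' {
        split(&format!("{}{}", prefix, digit), pops, target, out);
    }
}

/// Step 2 (hypothetical): group prefixes into chunks with first-fit-decreasing
/// packing. Sorting by population, then by prefix, keeps the result deterministic.
fn group_prefixes(prefixes: &BTreeMap<String, u64>, target: u64) -> BTreeMap<String, usize> {
    let mut items: Vec<(&String, u64)> = prefixes.iter().map(|(k, v)| (k, *v)).collect();
    items.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));
    let mut loads: Vec<u64> = Vec::new();
    let mut assignment = BTreeMap::new();
    for (prefix, pop) in items {
        // Reuse the first chunk with room; otherwise open a new one.
        let idx = match loads.iter().position(|&load| load + pop <= target) {
            Some(i) => i,
            None => {
                loads.push(0);
                loads.len() - 1
            }
        };
        loads[idx] += pop;
        assignment.insert(prefix.clone(), idx);
    }
    assignment
}
```

In this sketch a single zip code whose population exceeds the target simply gets a chunk to itself, which is one reason chunk sizes can only approximate the requested population.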

Installing

Binary releases are available for OS X and Linux. To install one, unzip the file you downloaded (the file name below is an example) and copy geochunk to /usr/local/bin or another directory in your PATH:

unzip geochunk-v0.1.4-osx.zip
sudo cp geochunk /usr/local/bin/

You can also install from source:

# Mac and Linux.
curl https://sh.rustup.rs -sSf | sh
cargo install geochunk

# On Windows, see https://www.rustup.rs/ for instructions on installing
# Rust, then run:
cargo install geochunk

Windows hasn't been tested, but it should work, perhaps after some tweaking. If it doesn't, please feel free to submit issues, PRs or even an AppVeyor build configuration. In general, Rust command-line tools should work fine on Windows.
