Crates.io | datasaurust |
lib.rs | datasaurust |
version | 0.1.0 |
source | src |
created_at | 2023-04-10 18:21:07.689781 |
updated_at | 2023-04-10 18:21:07.689781 |
description | Blazingly fast implementation of the Datasaurus paper. |
homepage | https://github.com/araffin/datasaurust |
repository | https://github.com/araffin/datasaurust |
id | 835342 |
size | 48,122 |
Blazingly fast implementation of the Datasaurus paper (500x faster than the original): "Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing" by Justin Matejka and George Fitzmaurice.
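The "same stats" constraint from the paper can be illustrated with a small sketch. This is not the crate's actual code: the `Stats` struct and the `summary_stats`, `truncate`, and `same_stats` names are hypothetical, and both the use of the sample standard deviation and the comparison by truncation to a fixed number of decimals are assumptions about how such a check might be written.

```rust
/// The five summary statistics that should stay (almost) unchanged.
/// Hypothetical struct for illustration, not taken from the crate.
#[derive(Debug, Clone, Copy)]
pub struct Stats {
    pub mean_x: f64,
    pub mean_y: f64,
    pub std_x: f64,
    pub std_y: f64,
    pub corr: f64,
}

/// Compute means, sample standard deviations and the Pearson correlation.
/// Expects at least two points.
pub fn summary_stats(points: &[(f64, f64)]) -> Stats {
    let n = points.len() as f64;
    let mean_x = points.iter().map(|p| p.0).sum::<f64>() / n;
    let mean_y = points.iter().map(|p| p.1).sum::<f64>() / n;
    let (mut sxx, mut syy, mut sxy) = (0.0, 0.0, 0.0);
    for &(x, y) in points {
        let (dx, dy) = (x - mean_x, y - mean_y);
        sxx += dx * dx;
        syy += dy * dy;
        sxy += dx * dy;
    }
    Stats {
        mean_x,
        mean_y,
        std_x: (sxx / (n - 1.0)).sqrt(),
        std_y: (syy / (n - 1.0)).sqrt(),
        corr: sxy / (sxx * syy).sqrt(),
    }
}

/// Drop everything after `decimals` decimal places (assumed comparison rule).
fn truncate(value: f64, decimals: i32) -> f64 {
    let scale = 10f64.powi(decimals);
    (value * scale).floor() / scale
}

/// True when all five statistics agree after truncation.
pub fn same_stats(a: &Stats, b: &Stats, decimals: i32) -> bool {
    let pairs = [
        (a.mean_x, b.mean_x),
        (a.mean_y, b.mean_y),
        (a.std_x, b.std_x),
        (a.std_y, b.std_y),
        (a.corr, b.corr),
    ];
    pairs
        .iter()
        .all(|&(u, v)| truncate(u, decimals) == truncate(v, decimals))
}
```

A candidate dataset would be kept only if `same_stats(&summary_stats(&initial), &summary_stats(&candidate), 2)` holds, which is presumably the kind of tolerance a flag like --decimals controls.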
To run with plot (-p, using gnuplot):
cargo run --release -- -d data/seed_datasets/Datasaurus_data.csv -p
With pre-defined shape:
cargo run --release -- -p -n 3000000 --decimals 2 --shape cat --allowed-distance 0.1
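To give an idea of what a target shape and an allowed distance mean in a simulated annealing step, here is a minimal sketch. It assumes that a shape is a list of line segments and that --allowed-distance plays the same role as in the original Python code, where a point already within that distance of the shape counts as close enough. The function names are hypothetical and the crate's own implementation may differ.

```rust
type Point = (f64, f64);

/// Euclidean distance from point `p` to the line segment `a`-`b`.
pub fn distance_to_segment(p: Point, a: Point, b: Point) -> f64 {
    let (abx, aby) = (b.0 - a.0, b.1 - a.1);
    let len_sq = abx * abx + aby * aby;
    // Projection of `p` onto the segment, clamped to its end points.
    let t = if len_sq == 0.0 {
        0.0
    } else {
        (((p.0 - a.0) * abx + (p.1 - a.1) * aby) / len_sq).clamp(0.0, 1.0)
    };
    let (cx, cy) = (a.0 + t * abx, a.1 + t * aby);
    ((p.0 - cx).powi(2) + (p.1 - cy).powi(2)).sqrt()
}

/// Distance from `p` to the closest segment of the shape.
pub fn distance_to_shape(p: Point, segments: &[(Point, Point)]) -> f64 {
    segments
        .iter()
        .map(|&(a, b)| distance_to_segment(p, a, b))
        .fold(f64::INFINITY, f64::min)
}

/// Assumed acceptance rule for moving one point:
/// accept if the point gets closer to the shape, if it is already within
/// `allowed_distance`, or if the temperature beats a uniform sample in [0, 1).
pub fn accept_move(
    old_dist: f64,
    new_dist: f64,
    allowed_distance: f64,
    temperature: f64,
    uniform_sample: f64,
) -> bool {
    new_dist < old_dist || new_dist < allowed_distance || temperature > uniform_sample
}
```

Passing the uniform sample in as an argument keeps the rule deterministic and easy to test. In a full loop, an accepted move would still be discarded if it changed the summary statistics beyond the chosen number of decimals.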
Starting from Gaussian noise:
cargo run --release -- -p -n 3000000 --decimals 2 --shape cat --allowed-distance 0.1 --gaussian
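As an illustration of what a Gaussian starting point could look like, the sketch below draws 2D points with the Box-Muller transform, using only the standard library. The generator (SplitMix64), the function names, and the mean and standard deviation arguments are all assumptions made for this example; the values and the random number generator that the crate uses with --gaussian are not stated here.

```rust
/// Small deterministic PRNG (SplitMix64), used here only to avoid
/// external dependencies. Not taken from the crate.
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform sample in [0, 1) built from the top 53 bits.
    pub fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Draw `n` points whose x and y coordinates are independent Gaussians.
pub fn gaussian_points(n: usize, mean: (f64, f64), std: (f64, f64), seed: u64) -> Vec<(f64, f64)> {
    let mut rng = SplitMix64(seed);
    (0..n)
        .map(|_| {
            // Box-Muller: two uniforms give two independent standard normals.
            let u1 = 1.0 - rng.next_f64(); // in (0, 1], avoids ln(0)
            let u2 = rng.next_f64();
            let r = (-2.0 * u1.ln()).sqrt();
            let theta = 2.0 * std::f64::consts::PI * u2;
            (
                mean.0 + std.0 * r * theta.cos(),
                mean.1 + std.1 * r * theta.sin(),
            )
        })
        .collect()
}
```

For example, `gaussian_points(142, (50.0, 50.0), (15.0, 15.0), 42)` would give a seed dataset of 142 points; these numbers are placeholders, not the crate's defaults.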
Create video and gif (use --save-plots):
pip install moviepy ffmpeg-python
python scripts/create_video.py logs/cat/ logs/cat.mp4
From one shape to another:
cargo run --release -- -p -n 2000000 --decimals 1 --shape dog --allowed-distance 0.1 --log-interval 10000 -d logs/gaussian_cat/output.csv --save-plots
Note: The original datasets and Python code come from http://www.autodeskresearch.com/papers/samestats