Crates.io | tiny-data |
lib.rs | tiny-data |
version | 0.1.2 |
source | src |
created_at | 2024-05-27 01:20:45.842632 |
updated_at | 2024-06-02 16:09:24.825595 |
description | A cli tool for building computer vision datasets. |
homepage | |
repository | https://github.com/nnethercott/tiny-data/tree/main |
max_upload_size | |
id | 1252929 |
size | 288,826 |
A Rust-based CLI tool, built with reqwest and tokio, for building computer vision datasets.
You can get a list of the available options by running the command below:
>> tiny-data -h
Usage: tiny-data [OPTIONS]
Options:
-t, --topics <TOPICS>... Space-delimited list of image classes
-n, --nsamples <NSAMPLES> number of images to download per-class [default: 20]
-d, --dir <DIR> name of directory to save to [default: images]
-h, --help Print help
Example:
>> tiny-data --topics bats wombats -n 10 --dir images
>> tree images
images
├── bats
│ ├── 0.jpeg
│ ├── 1.jpeg
│ ├── 2.jpeg
│ ├── 3.jpeg
│ ├── 4.jpeg
│ ├── 5.jpeg
│ ├── 6.jpeg
│ ├── 7.jpeg
│ ├── 8.jpeg
│ └── 9.jpeg
└── wombats
├── 0.jpeg
├── 1.jpeg
├── 2.jpeg
├── 3.jpeg
├── 4.jpeg
├── 5.jpeg
├── 6.jpeg
├── 7.jpeg
├── 8.jpeg
└── 9.jpeg
To get started with tiny-data, you need to enable the Custom Search API from Google and export the variables SEARCH_ENGINE_ID and CUSTOM_SEARCH_API_KEY to your environment.
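As a minimal sketch, the two variables can be exported in a POSIX-style shell as shown here. The values are hypothetical placeholders and must be replaced with the search engine ID and API key from your own Google account. The final check is only a convenience and is not part of tiny-data.

```shell
# Placeholder values (assumptions): substitute your own credentials.
export SEARCH_ENGINE_ID="your-search-engine-id"
export CUSTOM_SEARCH_API_KEY="your-api-key"

# Optional sanity check: warn if either variable is empty or unset.
for name in SEARCH_ENGINE_ID CUSTOM_SEARCH_API_KEY; do
  eval "value=\${$name}"
  if [ -z "$value" ]; then
    echo "warning: $name is not set" >&2
  fi
done
```

Adding the two export lines to your shell profile (for example ~/.bashrc) keeps them available across sessions.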
Note: Google limits the number of requests to 100/day, which inherently caps the number of images you can download.
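To see how the quota relates to a planned run, here is a rough back-of-the-envelope estimate. It assumes each request returns at most 10 images and that each topic is fetched separately. Neither assumption is confirmed for tiny-data's internals, so treat the result as a lower bound on request usage rather than an exact count.

```shell
# Rough quota estimate; per_request=10 is an assumption, not a tiny-data setting.
topics=2          # e.g. bats and wombats
nsamples=10       # images per topic (-n)
per_request=10    # assumed maximum images returned per request
daily_quota=100   # requests per day, from the note above

# Ceiling division: requests needed to collect nsamples images for one topic.
requests_per_topic=$(( (nsamples + per_request - 1) / per_request ))
total_requests=$(( topics * requests_per_topic ))

echo "estimated requests: $total_requests of $daily_quota per day"
```

Under these assumptions the example run above needs about 2 requests, while downloading 200 images for each of 10 topics would use 200 requests and exceed the daily limit.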
The package itself can be installed from crates.io by running:
cargo install tiny-data
The Python bindings for the package, which add post-download filtering using CLIP, can be installed from PyPI by running:
pip install tinydata[ml]