tiny-data

Crates.io: tiny-data
lib.rs: tiny-data
version: 0.1.2
source: src
created_at: 2024-05-27 01:20:45.842632
updated_at: 2024-06-02 16:09:24.825595
description: A cli tool for building computer vision datasets.
repository: https://github.com/nnethercott/tiny-data/tree/main
id: 1252929
size: 288,826
Nate Nethercott (nnethercott)

tiny-data

A Rust-based CLI tool for building computer vision datasets, built with reqwest and tokio.

You can get a list of the available options by running the command below:

>> tiny-data -h
Usage: tiny-data [OPTIONS]

Options:
  -t, --topics <TOPICS>...   Space-delimited list of image classes
  -n, --nsamples <NSAMPLES>  number of images to download per-class [default: 20]
  -d, --dir <DIR>            name of directory to save to [default: images]
  -h, --help                 Print help

Example:

>> tiny-data --topics bats wombats -n 10 --dir images
>> tree images
images
├── bats
│   ├── 0.jpeg
│   ├── 1.jpeg
│   ├── 2.jpeg
│   ├── 3.jpeg
│   ├── 4.jpeg
│   ├── 5.jpeg
│   ├── 6.jpeg
│   ├── 7.jpeg
│   ├── 8.jpeg
│   └── 9.jpeg
└── wombats
    ├── 0.jpeg
    ├── 1.jpeg
    ├── 2.jpeg
    ├── 3.jpeg
    ├── 4.jpeg
    ├── 5.jpeg
    ├── 6.jpeg
    ├── 7.jpeg
    ├── 8.jpeg
    └── 9.jpeg
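
Because each class gets its own subdirectory, the directory name can serve as the label when the images are loaded later. The following is a minimal sketch of that idea using only the Rust standard library. The function name `labeled_samples` is hypothetical and is not part of tiny-data.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Walks `root` and returns (image path, class name) pairs, taking the class
/// name from each immediate subdirectory (e.g. `images/bats/0.jpeg` -> "bats").
/// Hypothetical helper for illustration; not part of tiny-data.
pub fn labeled_samples(root: &Path) -> io::Result<Vec<(PathBuf, String)>> {
    let mut samples = Vec::new();
    for class_entry in fs::read_dir(root)? {
        let class_dir = class_entry?.path();
        if !class_dir.is_dir() {
            continue; // ignore stray files at the top level
        }
        let label = match class_dir.file_name().and_then(|name| name.to_str()) {
            Some(name) => name.to_string(),
            None => continue, // skip directory names that are not valid UTF-8
        };
        for file_entry in fs::read_dir(&class_dir)? {
            let path = file_entry?.path();
            if path.is_file() {
                samples.push((path, label.clone()));
            }
        }
    }
    // read_dir gives no ordering guarantee, so sort for reproducible output.
    samples.sort();
    Ok(samples)
}
```

Calling `labeled_samples(Path::new("images"))` on the tree above would yield twenty pairs, ten labeled `bats` and ten labeled `wombats`.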

Installation

To get started with tiny-data, enable Google's Custom Search API and export the variables SEARCH_ENGINE_ID and CUSTOM_SEARCH_API_KEY in your environment.
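
To show where these two variables end up, here is a rough sketch of reading them and building an image-search request URL for the Custom Search JSON API. The endpoint and the `key`, `cx`, `q`, `searchType`, `num` and `start` parameters follow Google's public API; the function names are hypothetical, and whether tiny-data builds its requests exactly this way is an assumption.

```rust
use std::env;

/// Percent-encodes a query value, leaving only RFC 3986 unreserved characters as-is.
fn percent_encode(input: &str) -> String {
    let mut out = String::new();
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Reads the API key and search engine id from the environment.
/// Returns an error message naming the first variable that is missing.
pub fn credentials_from_env() -> Result<(String, String), String> {
    let api_key = env::var("CUSTOM_SEARCH_API_KEY")
        .map_err(|_| "CUSTOM_SEARCH_API_KEY is not set".to_string())?;
    let engine_id = env::var("SEARCH_ENGINE_ID")
        .map_err(|_| "SEARCH_ENGINE_ID is not set".to_string())?;
    Ok((api_key, engine_id))
}

/// Builds an image-search URL for one page of up to 10 results.
/// `start` is the 1-based index of the first result on the page.
/// Hypothetical helper; tiny-data's real request code may differ.
pub fn image_search_url(api_key: &str, engine_id: &str, topic: &str, start: u32) -> String {
    format!(
        "https://www.googleapis.com/customsearch/v1?key={}&cx={}&q={}&searchType=image&num=10&start={}",
        percent_encode(api_key),
        percent_encode(engine_id),
        percent_encode(topic),
        start
    )
}
```

Keeping the credentials in environment variables rather than command-line flags keeps the API key out of shell history and process listings.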

Note: Google limits the free tier of the Custom Search API to 100 requests per day, which caps the number of images you can download.
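
To estimate whether a run fits within that limit, the arithmetic can be sketched as below. It assumes each request returns at most 10 results (the maximum the Custom Search JSON API allows per request) and that every result yields a usable image; in practice failed downloads may cost extra requests. The function names are hypothetical.

```rust
/// Results returned per request; the Custom Search JSON API caps `num` at 10.
const RESULTS_PER_REQUEST: u32 = 10;
/// Daily request limit quoted in the note above.
const DAILY_REQUEST_LIMIT: u32 = 100;

/// Minimum number of requests for `n_topics` classes with `nsamples` images each,
/// assuming every returned result is usable (an optimistic assumption).
pub fn requests_needed(n_topics: u32, nsamples: u32) -> u32 {
    // Ceiling division: 25 images need 3 requests, not 2.
    let per_topic = (nsamples + RESULTS_PER_REQUEST - 1) / RESULTS_PER_REQUEST;
    n_topics * per_topic
}

/// True when the estimated request count stays within the daily limit.
pub fn fits_daily_quota(n_topics: u32, nsamples: u32) -> bool {
    requests_needed(n_topics, nsamples) <= DAILY_REQUEST_LIMIT
}
```

Under these assumptions, the example run with two topics and `-n 10` needs 2 requests, and the default of 20 images per class needs 2 requests per topic.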

Install the package from crates.io by running:

cargo install tiny-data

Python bindings are available on PyPI. The ml extra adds post-download filtering using CLIP; install it by running:

pip install tinydata[ml]