amazon-captcha-rs

Crates.ioamazon-captcha-rs
lib.rsamazon-captcha-rs
version0.2.2
sourcesrc
created_at2023-09-01 09:46:02.754535
updated_at2023-11-16 12:10:53.7689
descriptionA Rust library to solve Amazon CAPTCHAs
homepage
repositoryhttps://github.com/vlourme/amazoncaptcha-rs
max_upload_size
id960780
size16,783,212
Victor Lourme (vlourme)

documentation

README

Rust Amazon Captcha Solver

crates.io

Welcome to the Rust Amazon Captcha Solver, an tool designed to resolve Amazon.com captchas without relying on Tesseract OCR or any external OCR software. Inspired by the exceptional work of gopkg-dev/amazoncaptcha and a-maliarov/amazoncaptcha, this project builds upon their dataset, presenting it in an uncompressed bincode format.

Key Features

  • Lightning-Fast: Our solver boasts an impressive speed, resolving captchas in just around 7 milliseconds per image.
  • Pinpoint Accuracy: Achieve precise captcha resolution with an accuracy rate of up to 98.5%.
  • Simplicity and Maintainability: With a minimal codebase of only 200 lines of code, this project is easy to maintain and extend.

How It Works

Functional Schema

Amazon captchas exhibit a repetitive nature, with characters consistently less than 32 pixels in size, resulting in a finite number of possible combinations. We leverage the dataset.bin containing most of these possibilities. When a character doesn't match exactly, we employ a similarity comparison function for accurate resolution.

Usage Example

use amazon_captcha_rs::Solver; 

let image = image::open("captcha.jpg").unwrap();

let solver = Solver::new().unwrap();
let response = solver.resolve_image(&image);

assert_eq!(response, "caggpa");

Benchmarking

Starting from version 0.2.1, we've introduced a benchmarking feature to assess resolved captchas directly on Amazon. We conducted over 2000 tests with the following results:

Resolved: 1990/2020
Precision: 98.51%

You can run the benchmark using the following command:

cargo run -r --example benchmark_infinite

Additionally, we have a second benchmark performed on our proprietary labeled dataset located in example/dataset. This dataset contains only 100 images and yielded the following results:

Solved: 97/99
Precision: 97.98%
Average resolve time: 6.76ms
Total time: 0.75s

Execute this benchmark with the following command:

cargo run -r --example benchmark

Changelog

Version 0.2.1

  • Improved documentation
  • Added more benchmarking options

Version 0.2.0

  • Enhanced letter extraction method
  • Introduced letter merging (used when the first letter is cropped with the last letter)
  • Precision bumped up to 97%

Version 0.1.0

  • Initial release with a precision rate of 64%
Commit count: 15

cargo fmt