| Crates.io | distance-wasm |
| lib.rs | distance-wasm |
| version | 0.0.2 |
| created_at | 2025-11-16 12:24:30.890348+00 |
| updated_at | 2025-11-16 13:23:24.823837+00 |
| description | WebAssembly bindings for high-performance string distance and similarity algorithms |
| homepage | https://github.com/DemoMacro/nlptools#readme |
| repository | https://github.com/DemoMacro/nlptools |
| id | 1935469 |
| size | 37,722 |
High-performance string distance and similarity algorithms powered by WebAssembly
# Install with npm
$ npm install @nlptools/distance-wasm
# Install with yarn
$ yarn add @nlptools/distance-wasm
# Install with pnpm
$ pnpm add @nlptools/distance-wasm
import * as distanceWasm from "@nlptools/distance-wasm";
// All algorithms are available as named functions
console.log(distanceWasm.levenshtein("kitten", "sitting")); // 3
console.log(distanceWasm.jaro("hello", "hallo")); // 0.8666666666666667
console.log(distanceWasm.cosine("abc", "bcd")); // 0.6666666666666666
Most distance algorithms also have a normalized version that returns a similarity score:
// Distance algorithms (lower is more similar)
const distance = distanceWasm.levenshtein("cat", "bat"); // 1
// Similarity algorithms (higher is more similar, 0-1 range)
const similarity = distanceWasm.levenshtein_normalized("cat", "bat"); // 0.6666666666666666
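To make the relationship between the two forms concrete, here is a minimal plain-JavaScript sketch. It is not the package's implementation: `levenshteinSketch` and `levenshteinNormalizedSketch` are hypothetical names, and the normalization (1 minus the distance divided by the length of the longer string) is an assumption that happens to agree with the `"cat"` / `"bat"` result shown above.

```javascript
// Reference sketch only: Levenshtein distance with two-row dynamic programming.
// These functions are hypothetical and are not exports of @nlptools/distance-wasm.
function levenshteinSketch(a, b) {
  // Compare by Unicode code point rather than by UTF-16 code unit.
  const s = Array.from(a);
  const t = Array.from(b);
  // prev[j] = distance between the first i-1 characters of s and the first j of t.
  let prev = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // substitution or match
      );
    }
    prev = curr;
  }
  return prev[t.length];
}

// Assumed normalization: 1 - distance / length of the longer string.
function levenshteinNormalizedSketch(a, b) {
  const maxLen = Math.max(Array.from(a).length, Array.from(b).length);
  return maxLen === 0 ? 1 : 1 - levenshteinSketch(a, b) / maxLen;
}

console.log(levenshteinSketch("kitten", "sitting")); // 3
console.log(levenshteinNormalizedSketch("cat", "bat")); // about 0.667
```

A sketch like this can serve as a slow reference when checking results from the WebAssembly build, provided the assumed normalization matches the library's.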
// Classic string edit distance
distanceWasm.levenshtein("saturday", "sunday"); // 3
distanceWasm.damerau_levenshtein("ca", "abc"); // 2
// Phonetic and similarity algorithms
distanceWasm.jaro("martha", "marhta"); // ~0.9444
distanceWasm.jarowinkler("martha", "marhta"); // 0.9611111111111111
distanceWasm.hamming("karolin", "kathrin"); // 3
distanceWasm.sift4_simple("abc", "axc"); // 1
// Longest common subsequence/substring
distanceWasm.lcs_seq("ABCD", "ACBAD"); // 3
distanceWasm.lcs_str("ABCD", "ACBAD"); // 1
// Gestalt pattern matching
distanceWasm.ratcliff_obershelp("hello", "hallo"); // 0.8
// Local sequence alignment
distanceWasm.smith_waterman("ACGT", "ACGT"); // 4
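The difference between `lcs_seq` and `lcs_str` is easy to miss, so here is a small plain-JavaScript sketch of both ideas. The function names are hypothetical and this is not the package's code. It assumes both measures return a length, which is what the example outputs above suggest.

```javascript
// Sketch only: hypothetical reference functions, not exports of @nlptools/distance-wasm.

// Longest common subsequence length: characters keep their order but may have gaps.
function lcsSeqSketch(a, b) {
  const s = Array.from(a);
  const t = Array.from(b);
  let prev = new Array(t.length + 1).fill(0);
  for (let i = 1; i <= s.length; i++) {
    const curr = [0];
    for (let j = 1; j <= t.length; j++) {
      curr[j] = s[i - 1] === t[j - 1]
        ? prev[j - 1] + 1
        : Math.max(prev[j], curr[j - 1]);
    }
    prev = curr;
  }
  return prev[t.length];
}

// Longest common substring length: matching characters must be contiguous.
function lcsStrSketch(a, b) {
  const s = Array.from(a);
  const t = Array.from(b);
  let prev = new Array(t.length + 1).fill(0);
  let best = 0;
  for (let i = 1; i <= s.length; i++) {
    const curr = new Array(t.length + 1).fill(0);
    for (let j = 1; j <= t.length; j++) {
      if (s[i - 1] === t[j - 1]) {
        // Extend the run of matches that ended at the previous characters.
        curr[j] = prev[j - 1] + 1;
        best = Math.max(best, curr[j]);
      }
    }
    prev = curr;
  }
  return best;
}

console.log(lcsSeqSketch("ABCD", "ACBAD")); // 3 (for example "ABD")
console.log(lcsStrSketch("ABCD", "ACBAD")); // 1 (no shared run longer than one character)
```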
// Set-based similarity measures
distanceWasm.jaccard("hello world", "world hello"); // 1
distanceWasm.cosine("hello world", "world hello"); // 1
distanceWasm.sorensen("hello world", "world hello"); // 1
distanceWasm.overlap("hello", "hello world"); // 1
distanceWasm.tversky("abc", "bcd"); // 0.5
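The results above are consistent with treating each string as a set of distinct characters, although the package may count repeated characters or use a different tokenization. Under that assumption, a minimal sketch of four of the measures looks like this. All names are hypothetical.

```javascript
// Sketch only: set-based measures over the distinct characters of each string.
// All names here are hypothetical, not exports of @nlptools/distance-wasm.
function charSetSizes(s1, s2) {
  const a = new Set(Array.from(s1));
  const b = new Set(Array.from(s2));
  let shared = 0;
  for (const ch of a) {
    if (b.has(ch)) shared++;
  }
  return { a: a.size, b: b.size, shared };
}

// Returns 0 when the denominator is 0; this edge-case convention is an arbitrary choice.
const ratio = (num, den) => (den === 0 ? 0 : num / den);

// |A ∩ B| / |A ∪ B|
function jaccardSketch(s1, s2) {
  const { a, b, shared } = charSetSizes(s1, s2);
  return ratio(shared, a + b - shared);
}

// 2 |A ∩ B| / (|A| + |B|)
function sorensenSketch(s1, s2) {
  const { a, b, shared } = charSetSizes(s1, s2);
  return ratio(2 * shared, a + b);
}

// |A ∩ B| / min(|A|, |B|)
function overlapSketch(s1, s2) {
  const { a, b, shared } = charSetSizes(s1, s2);
  return ratio(shared, Math.min(a, b));
}

// |A ∩ B| / sqrt(|A| * |B|)
function cosineSketch(s1, s2) {
  const { a, b, shared } = charSetSizes(s1, s2);
  return ratio(shared, Math.sqrt(a * b));
}

console.log(jaccardSketch("hello world", "world hello")); // 1
console.log(overlapSketch("hello", "hello world")); // 1
```

Because word order does not change the character set, `"hello world"` and `"world hello"` score 1 on all four measures in this sketch.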
// Character pair based similarity
distanceWasm.jaccard_bigram("night", "nacht"); // 0.14285714285714285
distanceWasm.cosine_bigram("night", "nacht"); // 0.25
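The bigram variants apply the same set formulas to adjacent character pairs instead of single characters. The sketch below reproduces the `"night"` / `"nacht"` figures above under the assumption that bigrams are collected into a set with no padding; the helper names are hypothetical and the package may build its bigrams differently.

```javascript
// Sketch only: hypothetical helpers, not exports of @nlptools/distance-wasm.

// Collect the distinct adjacent character pairs of a string (no padding).
function bigramSet(s) {
  const chars = Array.from(s);
  const grams = new Set();
  for (let i = 0; i + 1 < chars.length; i++) {
    grams.add(chars[i] + chars[i + 1]);
  }
  return grams;
}

function sharedCount(a, b) {
  let n = 0;
  for (const g of a) {
    if (b.has(g)) n++;
  }
  return n;
}

// Shared bigrams divided by the size of the union; 0 if neither string has a bigram.
function jaccardBigramSketch(s1, s2) {
  const a = bigramSet(s1);
  const b = bigramSet(s2);
  const shared = sharedCount(a, b);
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Shared bigrams divided by the geometric mean of the two set sizes.
function cosineBigramSketch(s1, s2) {
  const a = bigramSet(s1);
  const b = bigramSet(s2);
  const denom = Math.sqrt(a.size * b.size);
  return denom === 0 ? 0 : sharedCount(a, b) / denom;
}

// "night" -> ni, ig, gh, ht; "nacht" -> na, ac, ch, ht; only "ht" is shared.
console.log(jaccardBigramSketch("night", "nacht")); // 1/7, about 0.1429
console.log(cosineBigramSketch("night", "nacht")); // 0.25
```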
// Simple comparison methods
distanceWasm.prefix("hello", "help"); // 0.6
distanceWasm.suffix("hello", "ello"); // 0.8
distanceWasm.length("hello", "hallo"); // 0
Use the universal compare function to call any algorithm by name:
const result = distanceWasm.compare("hello", "hallo", "jaro");
console.log(result); // 0.8666666666666667
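Name-based dispatch of this kind is commonly built on a lookup table. The sketch below shows the pattern with two simple measures. Everything in it is hypothetical: the registry, the function names, the normalization by the longer string's length (chosen because it matches the `prefix` and `suffix` examples above), and the decision to throw on an unknown name. How the real `compare` reacts to an unrecognized algorithm name is not documented here.

```javascript
// Sketch only: a hypothetical name-to-function registry, not the package's internals.
const registrySketch = new Map([
  // Length of the shared prefix divided by the longer string's length (assumed).
  ["prefix", (s1, s2) => {
    const a = Array.from(s1);
    const b = Array.from(s2);
    let n = 0;
    while (n < a.length && n < b.length && a[n] === b[n]) n++;
    const maxLen = Math.max(a.length, b.length);
    return maxLen === 0 ? 1 : n / maxLen;
  }],
  // Length of the shared suffix divided by the longer string's length (assumed).
  ["suffix", (s1, s2) => {
    const a = Array.from(s1);
    const b = Array.from(s2);
    let n = 0;
    while (
      n < a.length &&
      n < b.length &&
      a[a.length - 1 - n] === b[b.length - 1 - n]
    ) n++;
    const maxLen = Math.max(a.length, b.length);
    return maxLen === 0 ? 1 : n / maxLen;
  }],
]);

function compareSketch(s1, s2, algorithm) {
  const fn = registrySketch.get(algorithm);
  if (fn === undefined) {
    // Assumption: fail loudly rather than return a sentinel value.
    throw new Error(`Unknown algorithm: ${algorithm}`);
  }
  return fn(s1, s2);
}

console.log(compareSketch("hello", "help", "prefix")); // 0.6
console.log(compareSketch("hello", "ello", "suffix")); // 0.8
```

Using a `Map` rather than a plain object avoids accidentally resolving inherited property names such as `toString`.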
levenshtein(s1: string, s2: string): number
Calculate the Levenshtein distance between two strings.
Parameters:
s1 (string) - First string
s2 (string) - Second string
Returns: number - Edit distance (minimum number of single-character edits)
damerau_levenshtein(s1: string, s2: string): number
Calculate the Damerau-Levenshtein distance, which includes transposition operations.
Parameters:
s1 (string) - First string
s2 (string) - Second string
Returns: number - Distance with transposition support
hamming(s1: string, s2: string): number
Calculate the Hamming distance for equal-length strings.
Parameters:
s1 (string) - First string
s2 (string) - Second string
Returns: number - Number of positions where characters differ
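Since Hamming distance is only defined for strings of equal length, callers may want to check lengths before calling. The following plain-JavaScript sketch illustrates the definition; `hammingSketch` is a hypothetical name, and throwing on a length mismatch is an assumption of the sketch, not documented behavior of the package.

```javascript
// Sketch only: hypothetical reference function, not an export of @nlptools/distance-wasm.
function hammingSketch(s1, s2) {
  const a = Array.from(s1);
  const b = Array.from(s2);
  if (a.length !== b.length) {
    // Assumption: reject unequal lengths instead of padding or truncating.
    throw new RangeError("Hamming distance requires strings of equal length");
  }
  let differing = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) differing++;
  }
  return differing;
}

// Positions 2, 3 and 4 differ: r/t, o/h, l/r.
console.log(hammingSketch("karolin", "kathrin")); // 3
```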
jaro(s1: string, s2: string): number
Calculate Jaro similarity between two strings.
Parameters:
s1 (string) - First string
s2 (string) - Second string
Returns: number - Jaro similarity score (0-1)
jarowinkler(s1: string, s2: string): number
Calculate Jaro-Winkler similarity, a modified version of Jaro.
Parameters:
s1 (string) - First string
s2 (string) - Second string
Returns: number - Jaro-Winkler similarity score (0-1)
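The modification Jaro-Winkler makes is a bonus for a shared prefix. The sketch below follows the textbook definitions, with a prefix scale of 0.1 and at most four prefix characters; whether the package uses exactly these constants is an assumption, and both function names are hypothetical. For `"martha"` / `"marhta"` the textbook values are about 0.9444 for Jaro and about 0.9611 for Jaro-Winkler.

```javascript
// Sketch only: textbook Jaro and Jaro-Winkler, not the package's implementation.
function jaroSketch(s1, s2) {
  const a = Array.from(s1);
  const b = Array.from(s2);
  if (a.length === 0 && b.length === 0) return 1;
  if (a.length === 0 || b.length === 0) return 0;

  // Characters match only if they are equal and at most `range` positions apart.
  const range = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1);
  const aMatched = new Array(a.length).fill(false);
  const bMatched = new Array(b.length).fill(false);
  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const lo = Math.max(0, i - range);
    const hi = Math.min(b.length - 1, i + range);
    for (let j = lo; j <= hi; j++) {
      if (!bMatched[j] && a[i] === b[j]) {
        aMatched[i] = true;
        bMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) return 0;

  // Walk both matched sequences in order; mismatched pairs are half-transpositions.
  let k = 0;
  let outOfOrder = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aMatched[i]) continue;
    while (!bMatched[k]) k++;
    if (a[i] !== b[k]) outOfOrder++;
    k++;
  }
  const transpositions = outOfOrder / 2;
  return (
    matches / a.length +
    matches / b.length +
    (matches - transpositions) / matches
  ) / 3;
}

// Assumed constants: prefix scale 0.1, prefix capped at 4 characters.
function jaroWinklerSketch(s1, s2, prefixScale = 0.1) {
  const jaro = jaroSketch(s1, s2);
  const a = Array.from(s1);
  const b = Array.from(s2);
  let prefix = 0;
  while (prefix < 4 && prefix < a.length && prefix < b.length && a[prefix] === b[prefix]) {
    prefix++;
  }
  return jaro + prefix * prefixScale * (1 - jaro);
}

console.log(jaroSketch("martha", "marhta")); // about 0.9444
console.log(jaroWinklerSketch("martha", "marhta")); // about 0.9611
```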
jaccard(s1: string, s2: string): number
Calculate Jaccard similarity based on character n-grams.
Parameters:
s1 (string) - First string
s2 (string) - Second string
Returns: number - Jaccard similarity score (0-1)
cosine(s1: string, s2: string): number
Calculate cosine similarity between character n-gram vectors.
Parameters:
s1 (string) - First string
s2 (string) - Second string
Returns: number - Cosine similarity score (0-1)
compare(s1: string, s2: string, algorithm: string): number
Compare two strings using any available algorithm by name.
Parameters:
s1 (string) - First string
s2 (string) - Second string
algorithm (string) - Algorithm name (e.g., 'levenshtein', 'jaro', 'jaccard')
Returns: number - Similarity score (0-1) or distance value
Available Algorithm Names:
'levenshtein', 'damerau_levenshtein', 'jaro', 'jarowinkler', 'hamming', 'sift4_simple'
'lcs_seq', 'lcs_str', 'ratcliff_obershelp', 'smith_waterman'
'jaccard', 'cosine', 'sorensen', 'tversky', 'overlap'
'prefix', 'suffix', 'length'
'jaccard_bigram', 'cosine_bigram'

Most distance algorithms have normalized versions that return similarity scores:
levenshtein_normalized, damerau_levenshtein_normalized, hamming_normalized, sift4_simple_normalized
lcs_seq_normalized, lcs_str_normalized, smith_waterman_normalized

The WebAssembly implementation provides significant performance improvements:
This project incorporates and builds upon the following excellent open source projects:
myers.rs