ngrams

Crates.io	ngrams
lib.rs	ngrams
version	1.0.1
created_at	2015-11-18 05:17:34.411621+00
updated_at	2016-08-30 14:39:21.746178+00
description	Generate n-grams from sequences
homepage	https://pwoolcoc.gitlab.io/ngrams/ngrams/
repository	https://gitlab.com/pwoolcoc/ngrams.git
id	3447
size	29,904

Paul Woolcock (pwoolcoc)

documentation

https://pwoolcoc.gitlab.io/ngrams

README

N-grams

Documentation

This crate takes a sequence of tokens and generates the n-grams for it. For more information about n-grams, see Wikipedia: https://en.wikipedia.org/wiki/N-gram

Note: The canonical version of this crate is hosted on GitLab.

Usage

The easiest way to use the crate is probably through its iterator adaptor. If your tokens are strings (&str, String, char, or Vec), you only need to produce the token stream and call ngrams on it:

use ngrams::Ngram;
let grams: Vec<_> = "one two three".split(' ').ngrams(2).collect();
// => vec![
//        vec!["\u{2060}", "one"],
//        vec!["one", "two"],
//        vec!["two", "three"],
//        vec!["three", "\u{2060}"],
//    ]

(About the "\u{2060}": the crate uses the Unicode WORD JOINER character, U+2060, as padding at the beginning and end of the token stream.)
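To make the padding behaviour concrete, here is a minimal standalone sketch that reproduces the output shown above using only the standard library. The function name `padded_ngrams` and the constant `PAD` are hypothetical and not part of the crate. The sketch also assumes that n - 1 padding symbols are added on each side, which matches the n = 2 example but is not confirmed here for other values of n.

```rust
// Hypothetical constant: U+2060 WORD JOINER, used as the padding symbol.
const PAD: &str = "\u{2060}";

/// Hypothetical helper (not the crate's API): pads the token slice with
/// `n - 1` copies of `PAD` on each side, then takes every window of size `n`.
fn padded_ngrams<'a>(tokens: &[&'a str], n: usize) -> Vec<Vec<&'a str>> {
    assert!(n > 0, "n must be at least 1");

    // Assumption: n - 1 padding symbols on each side.
    let mut padded: Vec<&'a str> = Vec::with_capacity(tokens.len() + 2 * (n - 1));
    padded.extend(std::iter::repeat(PAD).take(n - 1));
    padded.extend_from_slice(tokens);
    padded.extend(std::iter::repeat(PAD).take(n - 1));

    // Each sliding window of length n is one n-gram.
    padded.windows(n).map(|w| w.to_vec()).collect()
}
```

Calling `padded_ngrams` on the tokens of "one two three" with n = 2 yields the same four bigrams as the example above, with the padding symbol in the first and last positions.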

If your token type isn't one of the listed types, you can still use the iterator adaptor by implementing the ngrams::Pad trait for your type, which tells the crate what padding value to use.
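The idea behind a padding trait can be sketched without the crate. In the example below, `PadSymbol`, `Token`, and `padded_ngrams_generic` are all hypothetical names chosen for illustration. The crate's real Pad trait may have different method names and signatures, so check its documentation before implementing it.

```rust
/// Hypothetical stand-in for a padding trait: a type says which value
/// should be used to pad the start and end of a token stream.
trait PadSymbol {
    fn pad_symbol() -> Self;
}

/// Hypothetical custom token type that is not a string.
#[derive(Clone, Debug, PartialEq)]
enum Token {
    Word(String),
    Boundary,
}

impl PadSymbol for Token {
    // Use a dedicated variant as the padding value.
    fn pad_symbol() -> Self {
        Token::Boundary
    }
}

/// Generic n-gram generation for any clonable token type that can
/// supply its own padding symbol (assumes n - 1 pads on each side).
fn padded_ngrams_generic<T: PadSymbol + Clone>(tokens: &[T], n: usize) -> Vec<Vec<T>> {
    assert!(n > 0, "n must be at least 1");

    let mut padded: Vec<T> = Vec::with_capacity(tokens.len() + 2 * (n - 1));
    padded.extend((0..n - 1).map(|_| T::pad_symbol()));
    padded.extend_from_slice(tokens);
    padded.extend((0..n - 1).map(|_| T::pad_symbol()));

    padded.windows(n).map(|w| w.to_vec()).collect()
}
```

The design point is that the n-gram logic never needs to know what a token is. It only needs a way to obtain a padding value, which is the role the text above assigns to the Pad trait.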

