Crates.io | egg-mode-text |
lib.rs | egg-mode-text |
version | 1.15.1 |
source | src |
created_at | 2017-11-27 23:30:03.660992 |
updated_at | 2023-01-31 19:35:16.67081 |
description | Text parsing for Twitter: character counting, hashtag/mention extraction |
homepage | |
repository | https://github.com/egg-mode-rs/egg-mode-text |
max_upload_size | |
id | 40791 |
size | 345,213 |
like twitter-text, but in rust
This library is an attempt to port twitter-text to Rust. It was originally part of egg-mode, but
it's totally distinct from egg-mode type-wise, so i pulled it out into its own library. (Also it's
chock full of macros so it was also a bid to bring egg-mode's compile times down. >_>
)
This library can be used to count characters for tweets, and extract URLs and "entities" from arbitrary text, such as @-mentions and hashtags.
For example, to see how many characters a given tweet takes:
use egg_mode_text::character_count;
let count = character_count("This is a test.", 23, 23);
assert_eq!(count, 15);
// URLs get replaced by a t.co URL of the given length
//
// This length is available from the Twitter API in `help/configuration`
let count = character_count("test.com", 23, 23);
assert_eq!(count, 23);
// Multiple URLs get shortened individually
let count =
character_count("Test https://test.com test https://test.com test.com test", 23, 23);
assert_eq!(count, 86);
To extract substrings of various "entities" used by Twitter:
use egg_mode_text::{EntityKind, entities};
let text = "sample #text with a link to twitter.com";
let mut results = entities(text).into_iter();
let entity = results.next().unwrap();
assert_eq!(entity.kind, EntityKind::Url);
assert_eq!(entity.substr(text), "twitter.com");
let entity = results.next().unwrap();
assert_eq!(entity.kind, EntityKind::Hashtag);
assert_eq!(entity.substr(text), "#text");
assert_eq!(results.next(), None);
For more information, check out the documentation.
To use this crate in your own project, add the following to your Cargo.toml:
[dependencies]
egg-mode-text = "1.15.1"
...and add the following to your crate root:
extern crate egg_mode_text;
egg-mode-text is licensed under the Mozilla Public License, version 2.0. See the LICENSE file for details.