| Crates.io | keyword-tools |
| lib.rs | keyword-tools |
| version | 0.1.0 |
| created_at | 2024-12-15 08:12:14.37488+00 |
| updated_at | 2024-12-15 08:12:14.37488+00 |
| description | Rust tools for keyword extraction and similarity search |
| homepage | |
| repository | https://github.com/akitenkrad/keywords.git |
| max_upload_size | |
| id | 1484003 |
| size | 303,074 |
cargo add keyword-tools
use keyword_tools::{extract_keywords, load_keywords, load_keywords_from_rsc, Language};
let keywords = load_keywords();
// keywords = load_keywords_from_rsc(PATH_TO_JSON_FILE);
let text = "After the introduction of Large Language Models (LLMs), there have been substantial improvements in the performance of Natural Language Generation (NLG) tasks, including Text Summarization and Machine Translation.";
let extracted_kwds = extract_keywords(text, Language::English);
pip install git+ssh://git@github.com/akitenkrad/keywords.git
get keywords
from keywords import Keyword
keywords: list[Keyword] = Keyword.load_keywords()
extract keywords from texts
from keywords import Keyword, extract_keywords
keywords: list[Keyword] = Keyword.load_keywords()
# `text` is the input string to analyze, e.g. the sentence from the Rust example above
extracted_kws = extract_keywords(text, keywords, remove_stopwords=True)
calculate metrics
specialization factor
from keywords.metrics import SpecializationFactor
factors = SpecializationFactor.calculate(texts)
If you want to use your own keywords, create a JSON file in the following format:
[
{
"category": "Topic",
"language": "English",
"word": "Edge AI",
"alias": "Edge AI",
"score": 10
}
]
Then, load the keywords.
let kws = load_keywords_from_rsc("PATH_TO_YOUR_JSON_FILE");
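A custom keyword file can also be generated programmatically instead of being written by hand. The following is a minimal sketch in Python using only the standard library. The helper name `write_keywords_file` and the output file name `my_keywords.json` are hypothetical and not part of the package; the entry is the one from the example above.

```python
import json
import tempfile
from pathlib import Path


def write_keywords_file(entries, path):
    """Write keyword entries as a JSON array (hypothetical helper, not part of the package)."""
    path = Path(path)
    # ensure_ascii=False keeps non-English words readable in the file
    path.write_text(json.dumps(entries, ensure_ascii=False, indent=2), encoding="utf-8")
    return path


entries = [
    {
        "category": "Topic",
        "language": "English",
        "word": "Edge AI",
        "alias": "Edge AI",
        "score": 10,
    },
]

# Written to the system temp directory here; any writable path works.
out = write_keywords_file(entries, Path(tempfile.gettempdir()) / "my_keywords.json")
```

Because `json.dumps` never emits a trailing comma, the resulting file is valid JSON, and its path can be passed to `load_keywords_from_rsc`.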
The JSON file defines a collection of topic entries used by the crate. Each entry is an object with the following fields:
category: Specifies the category of the entry. Available options: not listed here; the example above uses "Topic".
language: Indicates the language of the entry. Available options: not listed here; the example above uses "English".
word: The main term or keyword for the topic.
alias: An alternative name or synonym for the word.
score: A numerical value representing the importance or relevance of the topic. Available options: Any integer or floating-point number (e.g., 1, 10). Higher values may indicate greater importance.
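Before loading a custom file, it can help to check that every entry has the fields described above. The sketch below is one possible check, not something the crate provides. It assumes that all five fields are required, that `category`, `language`, `word`, and `alias` are strings, and that `score` is an integer or float. The names `REQUIRED_FIELDS` and `validate_entry` are hypothetical.

```python
# Assumed field types, based on the field descriptions and the example entry.
REQUIRED_FIELDS = {
    "category": str,
    "language": str,
    "word": str,
    "alias": str,
    "score": (int, float),
}


def validate_entry(entry):
    """Return a list of problems found in one keyword entry (empty if none)."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in entry:
            errors.append(f"missing field: {name}")
        # bool is a subclass of int in Python, so reject it explicitly
        elif isinstance(entry[name], bool) or not isinstance(entry[name], expected):
            errors.append(f"wrong type for field: {name}")
    return errors


example = {
    "category": "Topic",
    "language": "English",
    "word": "Edge AI",
    "alias": "Edge AI",
    "score": 10,
}
problems = validate_entry(example)  # no problems for the example entry
```

Running `validate_entry` over each object in the JSON array before calling `load_keywords_from_rsc` surfaces missing or mistyped fields early.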