`rkllm-rs` (https://github.com/darkautism/rkllm-rs) is a Rust FFI wrapper for the `librkllmrt` library.
Before using `rkllm-rs`, you need to install `librkllmrt`. Please download and install it from the airockchip/rknn-llm repository (https://github.com/airockchip/rknn-llm).
Please install `librkllmrt.so` in one of the common Linux library paths:

- `/usr/lib`
- `/lib`
- `/usr/local/lib`
- `/opt/lib`
Alternatively, you can use the `LD_LIBRARY_PATH` environment variable to specify the library path. For example:

```sh
export LD_LIBRARY_PATH=/path/to/your/library:$LD_LIBRARY_PATH
```
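To quickly verify the dynamic linker can find the library (note that `ldconfig` only scans the standard paths above, not `LD_LIBRARY_PATH`):

```sh
# Lists the library if it is registered in the linker cache
ldconfig -p | grep librkllmrt
```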
The model used in this example can be found here (it is the Hugging Face repository cloned in the quick-start commands below). For devices with less memory, you can use this model instead.
First, install Rust, or refer to this guide:

```sh
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# If you have already installed git-lfs, skip this step
sudo apt install git-lfs

# Download librkllmrt.so into a standard library path
sudo curl -L https://github.com/airockchip/rknn-llm/raw/refs/heads/main/rkllm-runtime/Linux/librkllm_api/aarch64/librkllmrt.so -o /usr/lib/librkllmrt.so

# Install the rkllm binary
cargo install rkllm-rs --features bin

# Fetch the model and run it
git clone https://huggingface.co/VRxiaojie/DeepSeek-R1-Distill-Qwen-1.5B-RK3588S-RKLLM1.1.4
rkllm ./DeepSeek-R1-Distill-Qwen-1.5B-RK3588S-RKLLM1.1.4/deepseek-r1-1.5B-rkllm1.1.4.rkllm --model_type=deepseek
```
You should now see the LLM start up:

```
I rkllm: rkllm-runtime version: 1.1.4, rknpu driver version: 0.9.7, platform: RK3588
rkllm init success
Say something: Hello
Robot:
<think>
</think>
Hello! How can I assist you today? 😊
Say something:
```
Add the following to your `Cargo.toml`:

```toml
[dependencies]
rkllm-rs = "0.1.0"
```
`rkllm-rs` also supports running as a binary, suitable for users who do not plan to do further development or prefer an out-of-the-box experience.

```sh
cargo install rkllm-rs --features bin
rkllm ~/DeepSeek-R1-Distill-Qwen-1.5B-RK3588S-RKLLM1.1.4/deepseek-r1-1.5B-rkllm1.1.4.rkllm --model_type=deepseek
```
Here is the help output for the tool; set the various parameters as documented below:
```
Usage: rkllm [OPTIONS] [model]

Arguments:
  [model]  Rkllm model

Options:
      --model_type <model_type>
          Some module have special prefix in prompt, use this to fix [possible values: normal, deepseek]
  -c, --context_len <max_context_len>
          Maximum number of tokens in the context window
  -n, --new_tokens <max_new_tokens>
          Maximum number of new tokens to generate.
  -K, --top_k <top_k>
          Top-K sampling parameter for token generation.
  -P, --top_p <top_p>
          Top-P (nucleus) sampling parameter.
  -t, --temperature <temperature>
          Sampling temperature, affecting the randomness of token selection.
  -r, --repeat_penalty <repeat_penalty>
          Penalty for repeating tokens in generation.
  -f, --frequency_penalty <frequency_penalty>
          Penalizes frequent tokens during generation.
  -p, --presence_penalty <presence_penalty>
          Penalizes tokens based on their presence in the input.
      --prompt_cache <prompt_cache_path>
          Path to the prompt cache file.
      --skip_special_token
          Whether to skip special tokens during generation.
  -h, --help
          Print help (see more with '--help')
  -V, --version
          Print version
```
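For example, combining several of these options (the sampling values below are illustrative, not tuned recommendations):

```sh
rkllm ./deepseek-r1-1.5B-rkllm1.1.4.rkllm --model_type=deepseek \
  -c 2048 -n 512 -t 0.7 -K 40 -P 0.9
```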
Currently, the model types are hardcoded in the program, and unsupported models will not correctly generate the `bos_token` and assistant prompts. Most models will produce incorrect responses without the correct prompts, such as irrelevant answers or self-dialogue (though, to be fair, they might still engage in self-dialogue even with the prompts).
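To illustrate what such a hardcoded mapping involves (the template strings below are simplified placeholders in the general shape of DeepSeek-R1 chat templates, not the exact strings the binary uses):

```rust
// Illustrative only: simplified prompt wrapping per model type.
// Real templates come from each model's tokenizer_config.json.
fn wrap_prompt(model_type: &str, user_input: &str) -> String {
    match model_type {
        // Placeholder template for the deepseek model type
        "deepseek" => format!("<|User|>{user_input}<|Assistant|>"),
        // "normal" and anything else: pass the input through unchanged
        _ => user_input.to_string(),
    }
}

fn main() {
    println!("{}", wrap_prompt("deepseek", "Hello"));
}
```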
Most models have a `tokenizer_config.json` available online. Reading this configuration file makes it possible to generate the correct prompts. You can manually create the prompt using `tokenizer_config.json`, or use Python's AutoTokenizer to generate it.
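A manual approach might read the special tokens from the file and prepend them to the input. Here is a sketch using the `serde_json` crate; the `bos_token` field name follows the usual Hugging Face layout, but some repositories store it as an object rather than a plain string, and a full solution would also render the `chat_template`:

```rust
use serde_json::Value;
use std::fs;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse tokenizer_config.json (path is illustrative)
    let raw = fs::read_to_string("tokenizer_config.json")?;
    let config: Value = serde_json::from_str(&raw)?;

    // bos_token may be a plain string or an object with a "content" field
    let bos = config["bos_token"]
        .as_str()
        .or_else(|| config["bos_token"]["content"].as_str())
        .unwrap_or("");

    // Prepend the BOS token to the user input, a minimal stand-in
    // for rendering the full chat template
    let prompt = format!("{bos}Hello");
    println!("{prompt}");
    Ok(())
}
```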
This library provides a method to automatically fetch the corresponding model's `tokenizer_config.json` from online sources.

```sh
cargo install rkllm-rs --features "bin, online_config"
rkllm ~/Tinnyllama-1.1B-rk3588-rkllm-1.1.4/TinyLlama-1.1B-Chat-v1.0-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm --model_type=TinyLlama/TinyLlama-1.1B-Chat-v1.0
```
This tool will fetch the `tokenizer_config.json` for TinyLlama online and attempt to correct the prompts.
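Under the hood, this amounts to downloading the file from the model's Hugging Face repository, where `--model_type` doubles as the repo id. A minimal sketch of such a fetch using the `ureq` and `serde_json` crates; the URL pattern is Hugging Face's standard `resolve/main` layout, and the crate's actual implementation may differ:

```rust
use serde_json::Value;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // The model_type argument is treated as a Hugging Face repo id
    let repo = "TinyLlama/TinyLlama-1.1B-Chat-v1.0";
    let url = format!("https://huggingface.co/{repo}/resolve/main/tokenizer_config.json");

    // Fetch and parse the tokenizer configuration
    let body = ureq::get(&url).call()?.into_string()?;
    let config: Value = serde_json::from_str(&body)?;

    println!("bos_token: {}", config["bos_token"]);
    Ok(())
}
```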