| Crates.io | orp |
| lib.rs | orp |
| version | 0.9.2 |
| created_at | 2025-01-30 18:35:47.251998+00 |
| updated_at | 2025-03-23 16:42:16.820186+00 |
| description | Lightweight framework for building ONNX runtime pipelines with ort |
| homepage | https://github.com/fbilhaut/orp |
| repository | https://github.com/fbilhaut/orp |
| max_upload_size | |
| id | 1536713 |
| size | 54,393 |
orp is a lightweight framework designed to simplify the creation and execution of ONNX Runtime pipelines in Rust. Built on top of the 🦀 ort runtime and the 🔗 composable crate, it provides a simple way to handle data pre- and post-processing and to chain multiple ONNX models together, while encouraging code reuse and clarity.
Projects built on orp:

- 🌿 gline-rs: inference engine for GLiNER models
- 🧲 gte-rs: text embedding and re-ranking

The execution providers available in ort can be leveraged to perform considerably faster inference on GPU/NPU hardware.
The first step is to pass the appropriate execution providers in RuntimeParameters. For example:
```rust
// Pass the desired execution providers (in order of preference) to the runtime:
let rtp = RuntimeParameters::default().with_execution_providers([
    CUDAExecutionProvider::default().build()
]);
```
The second step is to activate the appropriate features (see related section below), otherwise it may silently fall back to CPU. For example:
```shell
$ cargo run --features=cuda ...
```
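Alternatively, the feature can be enabled in the dependent project's manifest so it is always compiled in. A minimal sketch of what this could look like in `Cargo.toml` (the version and exact feature names should be checked against the crate's documentation):

```toml
[dependencies]
# Enable the `cuda` feature so the CUDA execution provider is compiled in;
# without it, inference may silently fall back to CPU.
orp = { version = "0.9", features = ["cuda"] }
```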
Please refer to doc/ORT.md for details about execution providers.
This crate mirrors the following ort features:

- load-dynamic
- cuda, tensorrt, directml, coreml, rocm, openvino, onednn, xnnpack, qnn, cann, nnapi, tvm, acl, armnn, migraphx, vitis, and rknpu

Dependencies:

- ort: the ONNX runtime wrapper
- composable: this crate is used to define the pre- and post-processing pipelines by composition of elementary steps, and can in turn be used to combine multiple pipelines.