| Field | Value |
|---|---|
| Crates.io | candle-lstm |
| lib.rs | candle-lstm |
| version | 0.2.3 |
| source | src |
| created_at | 2024-08-24 15:00:48.56997 |
| updated_at | 2024-09-25 07:19:05.912539 |
| description | optimize HuggingFace candle LSTM in some cases |
| homepage | |
| repository | https://github.com/kigichang/candle-lstm |
| max_upload_size | |
| id | 1350356 |
| size | 25,130 |
A re-implementation of Candle's LSTM inference, including bidirectional LSTM, aimed at speeding it up.
This implementation is ONLY FOR CPU INFERENCE. DO NOT USE IT ON METAL OR CUDA.
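The crate's own types are not documented here, so the following is only a minimal sketch of CPU-only LSTM inference written against candle_nn's public LSTM interface (`candle_nn::lstm`, `LSTMConfig`, and the `RNN` trait); the layer sizes and sequence shape are made-up values, and candle-lstm's actual API may differ.

```rust
// Minimal CPU-only LSTM inference sketch using candle_nn's LSTM API.
// NOTE: this follows candle_nn, not necessarily candle-lstm's own types.
use candle_core::{DType, Device, Result, Tensor};
use candle_nn::{lstm, LSTMConfig, VarBuilder, VarMap, RNN};

fn main() -> Result<()> {
    // CPU only: this crate is not intended for Metal or CUDA.
    let device = Device::Cpu;
    let varmap = VarMap::new();
    let vb = VarBuilder::from_varmap(&varmap, DType::F32, &device);

    // Hypothetical sizes: 16 input features, 32 hidden units.
    let model = lstm(16, 32, LSTMConfig::default(), vb)?;

    // One batch of a 10-step sequence: (batch, seq_len, input_size).
    let input = Tensor::zeros((1, 10, 16), DType::F32, &device)?;
    let states = model.seq(&input)?;
    let output = model.states_to_tensor(&states)?;
    println!("output shape: {:?}", output.shape());
    Ok(())
}
```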
I tested inference on my MacBook Pro with an M2 chip: it is ~5 ms slower than Candle on Metal.
I tested on an RTX 4090 with CUDA 12.5: it is ~6x slower than Candle on CUDA.
Install PyTorch and run simple.py to generate test data.
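What simple.py writes is not shown here; purely as an illustration, the sketch below assumes it exports a safetensors file (the file name and tensor names are hypothetical) and loads it back on the Rust side with candle_core for inspection.

```rust
use candle_core::{Device, Result};

fn main() -> Result<()> {
    // Hypothetical file name; adjust to whatever simple.py actually writes.
    let tensors = candle_core::safetensors::load("test.safetensors", &Device::Cpu)?;
    // Print each tensor's name and shape to sanity-check the generated data.
    for (name, tensor) in &tensors {
        println!("{name}: {:?}", tensor.shape());
    }
    Ok(())
}
```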