| Crates.io | candle-lstm |
| lib.rs | candle-lstm |
| version | 0.2.3 |
| created_at | 2024-08-24 15:00:48.56997+00 |
| updated_at | 2024-09-25 07:19:05.912539+00 |
| description | optimize HuggingFace candle LSTM in some cases |
| homepage | |
| repository | https://github.com/kigichang/candle-lstm |
| max_upload_size | |
| id | 1350356 |
| size | 25,130 |
A re-implementation of Candle's LSTM inference, including bidirectional LSTM, aimed at speeding it up.
This implementation is ONLY FOR CPU INFERENCE. DO NOT USE IT ON METAL OR CUDA.
I tested inference on my MacBook Pro with an M2 chip: this implementation is ~5 ms slower than Candle running on Metal.
I also tested on an RTX 4090 with CUDA 12.5: it is ~6x slower than Candle running on CUDA.
Install PyTorch and run simple.py to generate test data.
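
The exact contents of simple.py are in the repository; the following is only a minimal, hypothetical sketch of what a PyTorch test-data generator for a small bidirectional LSTM could look like. The layer sizes and output file names here are assumptions for illustration, not the crate's actual fixtures.

```python
# Hypothetical sketch of a test-data generator; the real simple.py in the
# repository may use different shapes, seeds, and file names.
import torch

torch.manual_seed(0)

# Small bidirectional LSTM of the kind this crate re-implements for CPU.
lstm = torch.nn.LSTM(
    input_size=8,
    hidden_size=16,
    num_layers=1,
    batch_first=True,
    bidirectional=True,
)
lstm.eval()

# Random input: batch of 2 sequences, 5 time steps, 8 features each.
x = torch.randn(2, 5, 8)
with torch.no_grad():
    output, (h_n, c_n) = lstm(x)

# Save the weights plus an input/output pair so the Rust side can
# compare its CPU inference against PyTorch's reference results.
torch.save(lstm.state_dict(), "lstm_weights.pt")
torch.save({"input": x, "output": output}, "lstm_test_data.pt")
```

The saved weights and the input/output pair can then be loaded on the Rust side to check that the CPU implementation reproduces PyTorch's results.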