candle-flash-attn-v1

Crates.io: candle-flash-attn-v1
lib.rs: candle-flash-attn-v1
version: 0.0.1
created_at: 2025-04-15 11:23:22.869699+00
updated_at: 2025-04-15 11:23:22.869699+00
description: Flash attention V1 layer for the candle ML framework.
homepage: https://github.com/huggingface/candle-extensions/candle-flash-attn-v1/
repository: https://github.com/huggingface/candle-extensions/
id: 1634308
size: 25,601,223
owner: Nicolas Patry (Narsil)
documentation: https://docs.rs/candle-flash-attn-v1

README

Candle Flash Attention v1 Layer

Flash Attention v2 does not support Turing GPUs (T4, RTX 2080). In the meantime, this layer can be used as a replacement for the official Candle flash attention layer.
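The README does not show usage, so here is a minimal sketch of swapping the layer in. It assumes the crate exposes a `flash_attn` function mirroring the official `candle-flash-attn` crate's `flash_attn(q, k, v, softmax_scale, causal)` signature; the function name, argument order, and the `(batch, seq_len, num_heads, head_dim)` input shape are assumptions carried over from that crate, so check https://docs.rs/candle-flash-attn-v1 for the exact API.

```rust
// Sketch only: assumes candle-flash-attn-v1 mirrors candle-flash-attn's
// `flash_attn(q, k, v, softmax_scale, causal)` signature.
use candle_core::{DType, Device, Result, Tensor};

fn main() -> Result<()> {
    // Flash attention kernels only run on CUDA devices.
    let device = Device::new_cuda(0)?;

    // (batch, seq_len, num_heads, head_dim), converted to f16 since flash
    // attention kernels generally require half precision.
    let shape = (1, 128, 8, 64);
    let q = Tensor::randn(0f32, 1.0, shape, &device)?.to_dtype(DType::F16)?;
    let k = Tensor::randn(0f32, 1.0, shape, &device)?.to_dtype(DType::F16)?;
    let v = Tensor::randn(0f32, 1.0, shape, &device)?.to_dtype(DType::F16)?;
    let softmax_scale = 1.0 / (64f32).sqrt(); // 1 / sqrt(head_dim)

    // On a Turing GPU, this call stands in for candle_flash_attn::flash_attn.
    let out = candle_flash_attn_v1::flash_attn(&q, &k, &v, softmax_scale, true)?;
    println!("{:?}", out.dims());
    Ok(())
}
```

If the signatures do match as assumed, switching between this crate on Turing GPUs and `candle_flash_attn::flash_attn` on Ampere or newer GPUs is a one-line change, plus the `candle-flash-attn-v1 = "0.0.1"` dependency in `Cargo.toml`.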
