Crates.io | candle-flash-attn-v1 |
lib.rs | candle-flash-attn-v1 |
version | 0.0.1 |
created_at | 2025-04-15 11:23:22.869699+00 |
updated_at | 2025-04-15 11:23:22.869699+00 |
description | Flash attention V1 layer for the candle ML framework. |
homepage | https://github.com/huggingface/candle-extensions/candle-flash-attn-v1/ |
repository | https://github.com/huggingface/candle-extensions/ |
max_upload_size | |
id | 1634308 |
size | 25,601,223 |
Flash Attention v2 does not support Turing GPUs (e.g. T4, RTX 2080). This layer can be used as a drop-in replacement for the official Candle flash-attention layer in the meantime.