has_fast_pdep

Crates.io: has_fast_pdep
lib.rs: has_fast_pdep
version: 0.1.3
created_at: 2025-05-23 03:57:47.28148+00
updated_at: 2025-05-23 05:24:47.492047+00
description: Detect fast hardware support for PDEP/PEXT.
repository: https://github.com/seancroach/has_fast_pdep
id: 1685869
size: 27,140
Sean C. Roach (seancroach)


README

Detect fast hardware support for PDEP/PEXT.


A single-function, no_std library: has_fast_pdep returns true if the current CPU implements PDEP and PEXT with fast, non-microcoded hardware.

[dependencies]
has_fast_pdep = "0.1"

Compiler support: requires rustc 1.85+

Rationale

Zen, Zen+, Zen 2, and Hygon Dhyana CPUs implement PDEP and PEXT in microcode, which makes them slower than well-optimized, non-intrinsic fallbacks. In performance-critical code, checking for BMI2 support isn't enough: on those CPUs the instructions exist but are slow, so using them can hurt performance. This crate helps you avoid that by detecting speed, not just support.

Examples

Basic usage:

use has_fast_pdep::has_fast_pdep;

#[must_use]
pub fn exposed_fn(value: u64) -> u64 {
    if has_fast_pdep() {
        // SAFETY: The CPU has BMI2 and fast PDEP/PEXT instructions.
        unsafe { uses_pdep(value) }
    } else {
        fallback(value)
    }
}

#[must_use]
#[target_feature(enable = "bmi2")]
fn uses_pdep(value: u64) -> u64 {
    // TODO: implement PDEP/PEXT algorithm
    value
}

#[must_use]
fn fallback(value: u64) -> u64 {
    // TODO: implement fallback algorithm
    value
}
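
The fallback stub above is left as a TODO. As a minimal sketch of what a portable, non-intrinsic path could look like, the following bit-by-bit routines emulate PDEP and PEXT in plain Rust. The names pdep_soft and pext_soft are hypothetical and not part of this crate, and a tuned fallback for a specific mask would typically be faster than this generic loop.

```rust
/// Software equivalent of PDEP: deposits the low bits of `value` into the
/// positions of the set bits of `mask`, lowest mask bit first.
pub fn pdep_soft(value: u64, mut mask: u64) -> u64 {
    let mut result = 0u64;
    let mut source_bit = 1u64; // next bit of `value` to consume
    while mask != 0 {
        let lowest = mask & mask.wrapping_neg(); // isolate lowest set mask bit
        if value & source_bit != 0 {
            result |= lowest;
        }
        mask &= mask - 1; // clear the mask bit just handled
        source_bit <<= 1;
    }
    result
}

/// Software equivalent of PEXT: gathers the bits of `value` selected by
/// `mask` and packs them into the low bits of the result.
pub fn pext_soft(value: u64, mut mask: u64) -> u64 {
    let mut result = 0u64;
    let mut dest_bit = 1u64; // next bit of `result` to fill
    while mask != 0 {
        let lowest = mask & mask.wrapping_neg(); // isolate lowest set mask bit
        if value & lowest != 0 {
            result |= dest_bit;
        }
        mask &= mask - 1; // clear the mask bit just handled
        dest_bit <<= 1;
    }
    result
}
```

Both loops run once per set bit in the mask, so the cost scales with the mask's population count rather than with the word size.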

The full documentation is available on docs.rs.

Implementation Details

The result of the hardware check is determined once at runtime. After the initial check, all future calls to has_fast_pdep become a simple true or false with zero branching or logic.
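
To illustrate the general idea of a one-time check, here is a simplified sketch that caches the result in an atomic. It is not the crate's actual mechanism: unlike the behavior described above, this version still performs a load and a comparison on every call. All names here (cached_has_fast_pdep, probe_hardware, PROBE_CALLS) are hypothetical, and probe_hardware is a stand-in that always reports false.

```rust
use core::sync::atomic::{AtomicU8, AtomicUsize, Ordering};

const UNKNOWN: u8 = 0;
const SLOW: u8 = 1;
const FAST: u8 = 2;

/// Cached answer; starts out as UNKNOWN until the first call.
static CACHE: AtomicU8 = AtomicU8::new(UNKNOWN);

/// Counts how often the probe runs, only so the caching is observable.
static PROBE_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the real hardware probe.
fn probe_hardware() -> bool {
    PROBE_CALLS.fetch_add(1, Ordering::Relaxed);
    false
}

/// Returns the cached answer, probing the hardware on the first call.
pub fn cached_has_fast_pdep() -> bool {
    let state = CACHE.load(Ordering::Relaxed);
    if state == FAST {
        return true;
    }
    if state == SLOW {
        return false;
    }
    // First call. Racing threads may each probe once, which is harmless
    // because every probe yields the same answer.
    let fast = probe_hardware();
    CACHE.store(if fast { FAST } else { SLOW }, Ordering::Relaxed);
    fast
}
```

Because it only depends on core, a cache like this also works in no_std builds.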

On x86 targets, CPUID is used directly without probing for its existence. This is intentional: for every tier 1 x86 Rust target, CPUID is guaranteed to be present. If you're targeting old hardware, such as an i486, this crate might not be for you. If you happen to be that individual, open an issue, and I'll reimplement the probing logic via inline assembly.
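
As a rough sketch of the kind of decision such a check has to make, the function below classifies a CPU from values that CPUID reports: the 12-byte vendor string from leaf 0, EAX from leaf 1, and EBX from leaf 7 (subleaf 0). It takes those values as plain arguments, so it does not execute CPUID itself. The function names are hypothetical, and the family mapping is an assumption for illustration (family 17h for Zen, Zen+, and Zen 2; family 18h for Hygon Dhyana); the crate's real rules may differ.

```rust
/// Computes the display family from CPUID leaf 1 EAX. The extended family
/// field is only added when the base family field is 0xF.
pub fn display_family(leaf1_eax: u32) -> u32 {
    let base = (leaf1_eax >> 8) & 0xF;
    let extended = (leaf1_eax >> 20) & 0xFF;
    if base == 0xF { base + extended } else { base }
}

/// Returns true if BMI2 is reported and the CPU is not one of the families
/// assumed here to implement PDEP/PEXT in microcode.
pub fn is_fast_pdep(vendor: &[u8; 12], leaf1_eax: u32, leaf7_ebx: u32) -> bool {
    const BMI2_BIT: u32 = 1 << 8; // CPUID.(EAX=7,ECX=0):EBX bit 8

    if leaf7_ebx & BMI2_BIT == 0 {
        return false; // no BMI2 at all, so no PDEP/PEXT
    }

    let family = display_family(leaf1_eax);
    // Assumed mapping: AMD family 17h and Hygon family 18h are the slow ones.
    let slow = (vendor == b"AuthenticAMD" && family == 0x17)
        || (vendor == b"HygonGenuine" && family == 0x18);
    !slow
}
```

Keeping the classification separate from the CPUID call makes the rules easy to test on any machine, including non-x86 hosts.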

License

Licensed under either of:

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
