# rust-lapper
[Documentation](https://docs.rs/rust-lapper) [Crates.io](https://crates.io/crates/rust-lapper) This is a rust port of Brent Pendersen's [nim-lapper](https://github.com/brentp/nim-lapper). It has a few notable differences, mostly that the find and seek methods both return iterators, so all adaptor methods may be used normally. This crate works well for most interval data that does not include very long intervals that engulf a majority of other intervals. It is still fairly comparable to other methods. If you absolutely need time guarantees in the worst case, see [COItres](https://github.com/dcjones/coitrees) and [IITree](https://docs.rs/bio/0.32.0/bio/data_structures/interval_tree/struct.ArrayBackedIntervalTree.html). However, on more typical datasets, this crate is between 4-10x faster than other interval overlap methods. It should also be noted that the `count` method is agnostic to data type, and should be about as fast as it is possible to be on any dataset. It is an implementation of the [BITS algorithm](https://academic.oup.com/bioinformatics/article/29/1/1/273289) ## Serde Support `rust-lapper` supports serialization with serde for `Lapper` and `Interval` objects: ```toml [dependencies] rust-lapper = { version = "*", features = ["with_serde"] } ``` See `examples/serde.rs` for a brief example. ## Benchmarks Benchmarking interval tree-ish datastructures is hard Please see the [interval_bakeoff](https://github.com/sstadick/interval_bakeoff) project for details on how the benchmarks were run... It's not fully baked yet though, and is finiky to run. Command to run: ``` ./target/release/interval_bakeoff fake -a -l RustLapper -l RustBio -l NestedInterval -n50000 -u100000 # This equates to the following params: # num_intervals 50000 # universe_size 100000 # min_interval_size 500 # max_interval_size 80000 # add_large_span true (universe spanning) ``` Set A / b Creation Times | crate/method | A time | B time | | ---------------- | -------- | -------- | | rust_lapper | 15.625ms | 31.25ms | | nested_intervals | 15.625ms | 15.625ms | | bio | 15.625ms | 31.25ms | 100% hit rate (A vs A) | crate/method | mean time | intersection | | ---------------------------------- | ---------- | ------------ | | rust_lapper/find | 4.78125s | 1469068763 | | rust_lapper/count | 15.625ms | 1469068763 | | nested_intervals/query_overlapping | 157.4375s | 1469068763 | | bio/find | 33.296875s | 1469068763 | Sub 100% hit rate (A vs B) | crate/method | mean time | intersection | | ---------------------------------- | ---------- | ------------ | | rust_lapper/find | 531.25ms | 176488436 | | rust_lapper/count | 15.625ms | 176488436 | | nested_intervals/query_overlapping | 11.109375s | 196090092 | | bio/find | 4.3125s | 176488436 | [nested_intervals](https://docs.rs/nested_intervals/0.2.0/nested_intervals/) [rust-bio](https://docs.rs/bio/0.28.2/bio/) *Note that rust-bio has a new interval tree structure which should be faster than what is shown here* ## Example ```rust use rust_lapper::{Interval, Lapper}; type Iv = Interval