Crates.io | buffed |
lib.rs | buffed |
version | 0.1.0 |
source | src |
created_at | 2023-02-20 15:22:34.231347 |
updated_at | 2023-02-20 15:22:34.231347 |
description | Traits & implementation of parsing buffered IO |
homepage | |
repository | https://gitlab.com/austreelis/buffed |
max_upload_size | |
id | 789880 |
size | 4,435,428 |
buffed, a buffed buffered reader for Rust.

buffed provides traits and implementations akin to std::io::BufRead. buffed's traits and structs allow reading serialized data directly from a byte stream. It does so without requiring any structure in the input while still taking advantage of buffering. It also makes it easy to avoid copies and allocations.
NOTE: This crate was primarily made as a personal challenge. It has not been properly tested or audited, and you shouldn't use it for anything more serious than a hobby. Consider using sequoia-pgp's buffered-reader instead. We will still happily accept issues and merge pull requests! We still want this crate to be a nice abstraction.
This crate was first designed as a solution to the following problem: read a user-provided, UTF-8-encoded text file, and read some data from it. Do so blazingly fast™, without ever hogging memory, while staying robust against malicious input.
The requirement for speed implied the use of buffering, which already constrains what we can use in the standard library:
- std::io::Read::read_to_string loads the whole file into memory, which is obviously not an option to meet the second and last requirements.
- std::io::BufRead::read_line actually has the same issue if we are to defend against malicious input: a giant file without newlines will be loaded whole in memory.

The standard library also doesn't allow controlling allocation with a reasonably high-level API (and we don't expect it to), which could help make things faster.
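To make the read_line problem concrete, here is a minimal sketch using only the standard library. It caps how much of a line can be pulled into memory by wrapping the reader in std::io::Read::take. This only illustrates the constraint described above and is not how buffed works; read_line_bounded and its limit parameter are hypothetical names.

```rust
use std::io::{self, BufRead, Read};

/// Hypothetical helper (not part of buffed or std): reads at most `limit`
/// bytes of the next line into `line` and returns the number of bytes read.
///
/// If the line is longer than `limit`, only its first `limit` bytes are
/// appended and the rest stays unread in `reader`. `read_line` fails with
/// `InvalidData` if the cut lands inside a multi-byte UTF-8 character.
fn read_line_bounded<R: BufRead>(
    reader: &mut R,
    limit: u64,
    line: &mut String,
) -> io::Result<usize> {
    // `take` stops the adapter after `limit` bytes, so a giant input
    // without newlines can no longer grow `line` without bound.
    reader.by_ref().take(limit).read_line(line)
}
```

This keeps memory bounded, but the caller now has to deal with truncated lines and possibly split characters by hand, which is part of what motivated a dedicated abstraction.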
buffed's API

buffed provides several traits:
- BuffedRead, akin to std::io::BufRead, but gives more control over buffering and allows reading types implementing FromBytes without copying.
- FromBytes (and the associated FromBytesError), encapsulating the parsing (and error reporting) logic for read data. The plan is for this trait to be implemented for most types you would ever need it for, like nom's parser outputs, serde-deserializable types, etc. If you'd like it implemented for something, please file an issue/PR! As long as it is behind a non-default optional feature, it should be easy to get it merged.
- Buffer, a trait for BuffedReader's buffer. It can be used by other BuffedRead implementations to take advantage of other buffering techniques than the one used by BuffedReader.

And some types:
- BuffedReader, what BufReader is to BufRead, for BuffedRead. This is a default implementation wrapping another type implementing std::io::Read. Notably, it uses a BoxBuffer, which can be swapped out for another implementation of Buffer as needed.
- BoxBuffer, a "default" implementation of Buffer based on a simple Boxed byte slice (Box<[u8]>).
- Error, a general-purpose error type used by BuffedRead.

Reading the whole content of a file using the BuffedRead API might be done like:
    use std::fs::File;
    use buffed::{BuffedRead, BuffedReader, FromBytes};

    fn main() {
        let file = File::open("tests/assets/capital-ru.txt").unwrap();
        let mut r = BuffedReader::new(file);
        loop {
            match r.fill_buf() {
                // EOF: nothing left to read
                Ok("") => { return; }
                Ok(data) => {
                    // Remember how much we were given before releasing the borrow
                    let size = data.len();
                    // Do something with data
                    // ...
                    // Mark the processed bytes as read
                    r.consume(size).unwrap();
                },
                // Oopsies
                Err(err) => panic!("{err}"),
            }
        }
    }
Alternatively, the require_fill_buf(amount: usize) method of BuffedRead accepts a minimum amount of data to return (unless EOF was reached), and require_fill_buf_no_alloc(amount: usize) does the same, but errors out if the buffer needs to be reallocated for the amount of data to fit.
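The difference between the two methods can be sketched with a toy reader built on std::io::Read alone. This is an illustration under stated assumptions, not buffed's implementation: MinFillReader, its fields, the return types and the error kind used are all hypothetical; only the two method names and the amount parameter come from the description above.

```rust
use std::io::{self, Read};

/// Hypothetical toy reader, NOT buffed's BuffedReader.
struct MinFillReader<R> {
    inner: R,
    buf: Vec<u8>, // backing storage; its length is the current capacity
    start: usize, // index of the first unread byte
    end: usize,   // one past the last filled byte
}

impl<R: Read> MinFillReader<R> {
    fn with_capacity(capacity: usize, inner: R) -> Self {
        Self { inner, buf: vec![0; capacity], start: 0, end: 0 }
    }

    /// Returns at least `amount` buffered bytes (fewer only at EOF),
    /// growing the buffer when `amount` does not fit.
    fn require_fill_buf(&mut self, amount: usize) -> io::Result<&[u8]> {
        if amount > self.buf.len() {
            self.buf.resize(amount, 0); // may reallocate
        }
        self.fill_at_least(amount)
    }

    /// Same as above, but fails instead of reallocating.
    fn require_fill_buf_no_alloc(&mut self, amount: usize) -> io::Result<&[u8]> {
        if amount > self.buf.len() {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "amount exceeds buffer capacity",
            ));
        }
        self.fill_at_least(amount)
    }

    fn fill_at_least(&mut self, amount: usize) -> io::Result<&[u8]> {
        // Move unread bytes to the front so the free space is contiguous.
        if self.start > 0 {
            self.buf.copy_within(self.start..self.end, 0);
            self.end -= self.start;
            self.start = 0;
        }
        while self.end < amount {
            match self.inner.read(&mut self.buf[self.end..]) {
                Ok(0) => break, // EOF: return what we have
                Ok(n) => self.end += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }
        Ok(&self.buf[self.start..self.end])
    }

    /// Marks `amount` bytes as read (clamped to what is buffered).
    fn consume(&mut self, amount: usize) {
        self.start = (self.start + amount).min(self.end);
    }
}
```

With a 4-byte buffer, asking the no-alloc variant for 8 bytes fails, while the allocating variant grows the buffer and succeeds. The no-alloc variant is the one to reach for when the buffer size is a hard memory budget.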
If you're looking for something similar to buffed, you may be interested in:
- sequoia-pgp's buffered-reader. If you were going to use buffed for anything more serious than a hobby, you should consider using this instead.
- std::io types with a crate like serde or nom. This is not the easiest path but it's probably the best if none other fits your case.

The following could also be of value, even though we wouldn't consider using them for the reasons noted:
- buffed, without the extension of std's traits, but only for enums.
- std::io's buffered types, but still only reads bytes, which makes handling UTF8 without copying hard and clumsy. No release since August 2019.
- Read & Write traits for #![no_std]. Largely incomplete, and hasn't seen activity since December 2021.
- std's Read & Write traits in the same idea as buffed, but doesn't add a lot.
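
To illustrate why a bytes-only buffered API makes UTF-8 handling clumsy, here is a sketch that counts characters using nothing but std::io::BufRead::fill_buf and consume. A character may be split across two buffer refills, so the code has to carry its first bytes over by hand. count_chars and invalid_utf8 are hypothetical helpers written for this example; they are not part of buffed or of any crate listed above.

```rust
use std::io::{self, BufRead};
use std::str;

fn invalid_utf8() -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, "stream is not valid UTF-8")
}

/// Counts the characters of a UTF-8 stream, copying at most the few bytes
/// of a character that straddles two buffer refills.
fn count_chars<R: BufRead>(mut reader: R) -> io::Result<usize> {
    let mut count = 0usize;
    // First bytes of a character whose remainder is in the next refill.
    let mut carry = [0u8; 4];
    let mut carry_len = 0usize;

    loop {
        let buf = reader.fill_buf()?;
        if buf.is_empty() {
            // EOF: pending bytes mean the last character was truncated.
            return if carry_len == 0 { Ok(count) } else { Err(invalid_utf8()) };
        }

        // Finish the pending character, one byte at a time.
        let mut used = 0;
        while carry_len > 0 && used < buf.len() {
            carry[carry_len] = buf[used];
            carry_len += 1;
            used += 1;
            match str::from_utf8(&carry[..carry_len]) {
                Ok(_) => {
                    count += 1;
                    carry_len = 0;
                }
                Err(e) if e.error_len().is_some() => return Err(invalid_utf8()),
                Err(_) => {} // still incomplete, keep feeding bytes
            }
        }

        // Decode the rest of the buffer in place, without copying it.
        let rest = &buf[used..];
        let valid = match str::from_utf8(rest) {
            Ok(text) => text,
            Err(e) => {
                if e.error_len().is_some() {
                    return Err(invalid_utf8());
                }
                // The buffer ends in the middle of a character: stash its
                // first bytes and decode only the valid prefix.
                let (valid, tail) = rest.split_at(e.valid_up_to());
                carry[..tail.len()].copy_from_slice(tail);
                carry_len = tail.len();
                str::from_utf8(valid).map_err(|_| invalid_utf8())?
            }
        };
        count += valid.chars().count();

        let consumed = buf.len();
        reader.consume(consumed);
    }
}
```

Most of this function is bookkeeping around chunk boundaries rather than actual parsing, which is the kind of boilerplate a text-aware buffered reader is meant to absorb.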