pem-iterator

Crates.io	pem-iterator
lib.rs	pem-iterator
version	0.2.0
source	src
created_at	2017-11-13 01:36:36.292041
updated_at	2017-11-26 09:08:58.256397
description	Iterate over PEM-encoded data
repository	https://github.com/Popog/pem-iterator
id	39185
size	78,541

(Popog)

documentation

https://docs.rs/pem-iterator/

README

PEM Iterator

Iterate over PEM-encoded data.

Features

Enables decoding PEM formatted data via iterators.
Fast. Current benchmarks put it at about 2x-4x faster than the pem crate.
No dependencies, no unsafe, no dynamic allocation, only requires core.
Highly customizable encapsulation boundary parsing.
Resilient parsing. Errors generated by the underlying stream don't cause parsing state to be lost.

Usage

Cargo.toml:

[dependencies]
pem-iterator = "0.2"

Crate root:

extern crate pem_iterator;

Example

extern crate pem_iterator;

use pem_iterator::boundary::{BoundaryType, BoundaryParser, LabelMatcher};
use pem_iterator::body::Single;


const SAMPLE: &'static str = "-----BEGIN RSA PRIVATE KEY-----
MIIBPQIBAAJBAOsfi5AGYhdRs/x6q5H7kScxA0Kzzqe6WI6gf6+tc6IvKQJo5rQc
dWWSQ0nRGt2hOPDO+35NKhQEjBQxPh/v7n0CAwEAAQJBAOGaBAyuw0ICyENy5NsO
2gkT00AWTSzM9Zns0HedY31yEabkuFvrMCHjscEF7u3Y6PB7An3IzooBHchsFDei
AAECIQD/JahddzR5K3A6rzTidmAf1PBtqi7296EnWv8WvpfAAQIhAOvowIXZI4Un
DXjgZ9ekuUjZN+GUQRAVlkEEohGLVy59AiEA90VtqDdQuWWpvJX0cM08V10tLXrT
TTGsEtITid1ogAECIQDAaFl90ZgS5cMrL3wCeatVKzVUmuJmB/VAmlLFFGzK0QIh
ANJGc7AFk4fyFD/OezhwGHbWmo/S+bfeAiIh2Ss2FxKJ
-----END RSA PRIVATE KEY-----";

fn main() {
    let mut input = SAMPLE.chars().enumerate();

    // Parse the begin boundary, accumulating the label into a buffer
    let mut label_buf = String::new();
    {
        let mut parser = BoundaryParser::from_chars(BoundaryType::Begin, &mut input, &mut label_buf);
        assert_eq!(parser.next(), None);
        assert_eq!(parser.complete(), Ok(()));
    }
    println!("PEM label: {}", label_buf);

    // Parse the body
    let data: Result<Vec<u8>, _> = Single::from_chars(&mut input).collect();
    let data = data.unwrap();

    // Verify the end boundary has the same label as the begin boundary
    {
        let mut parser = BoundaryParser::from_chars(BoundaryType::End, &mut input, LabelMatcher(label_buf.chars()));
        assert_eq!(parser.next(), None);
        assert_eq!(parser.complete(), Ok(()));
    }

    println!("data: {:?}", data);
}

`BoundaryParser` and `Label`

The first task in parsing PEM-formatted data is parsing the BEGIN boundary. Enter BoundaryParser. This iterator type takes three parameters to construct:

An enum value for BEGIN vs END.
The stream to get characters from.
An object to deal with the label.

That third parameter holds a lot of power. As the parser encounters the label, it passes each character to this parameter via the Label trait. This enables a bunch of different behaviors, such as:

Accumulating the characters into a buffer (e.g. &mut String)
Matching against known characters. (e.g. LabelMatcher("CERTIFICATE".chars()))
Discarding the characters completely (e.g. DiscardLabel)

In addition to a simple Mismatch error, this label processing also has the option to return custom, complex errors, which makes it versatile and extensible.

Since parsing the BEGIN label is totally separate from parsing END, one can mix and match strategies to customize the level of strictness (e.g. BEGIN and END can have different labels).
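
The three label strategies above can be illustrated with a small, self-contained sketch. It does not use this crate: LabelSink, Matcher, Discard, BoundaryError, and parse_begin are hypothetical names invented for the illustration, and the real Label trait may be shaped differently. The sketch also works on a whole line at once rather than on a character stream, which keeps it short.

```rust
// Hypothetical sketch of pluggable label handling; not the pem-iterator API.

#[derive(Debug, PartialEq)]
enum BoundaryError {
    Malformed,
    LabelMismatch,
}

/// Receives the label one character at a time.
trait LabelSink {
    /// Called for every character of the label.
    fn accept(&mut self, c: char) -> Result<(), BoundaryError>;
    /// Called once after the last label character.
    fn finish(&mut self) -> Result<(), BoundaryError> {
        Ok(())
    }
}

/// Strategy 1: accumulate the label into a buffer.
impl LabelSink for String {
    fn accept(&mut self, c: char) -> Result<(), BoundaryError> {
        self.push(c);
        Ok(())
    }
}

/// Strategy 2: match the label against an expected sequence of characters.
struct Matcher<I>(I);

impl<I: Iterator<Item = char>> LabelSink for Matcher<I> {
    fn accept(&mut self, c: char) -> Result<(), BoundaryError> {
        if self.0.next() == Some(c) {
            Ok(())
        } else {
            Err(BoundaryError::LabelMismatch)
        }
    }
    fn finish(&mut self) -> Result<(), BoundaryError> {
        // The expected label must be fully consumed as well.
        if self.0.next().is_none() {
            Ok(())
        } else {
            Err(BoundaryError::LabelMismatch)
        }
    }
}

/// Strategy 3: ignore the label entirely.
struct Discard;

impl LabelSink for Discard {
    fn accept(&mut self, _c: char) -> Result<(), BoundaryError> {
        Ok(())
    }
}

/// Parse a line of the form "-----BEGIN <label>-----", feeding the label to `sink`.
fn parse_begin<L: LabelSink>(line: &str, sink: &mut L) -> Result<(), BoundaryError> {
    let label = line
        .strip_prefix("-----BEGIN ")
        .and_then(|rest| rest.strip_suffix("-----"))
        .ok_or(BoundaryError::Malformed)?;
    for c in label.chars() {
        sink.accept(c)?;
    }
    sink.finish()
}
```

Passing a String collects the label, passing Matcher("CERTIFICATE".chars()) rejects any other label, and passing Discard accepts anything. Because the sink is chosen per call, a BEGIN boundary and an END boundary can be parsed with different levels of strictness.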

Chunked vs Single

For parsing the body this crate provides 2 iterators, Chunked and Single. The basic difference is Chunked emits 3 bytes of output at a time (corresponding to 4 characters of input), while Single emits only 1 byte at a time.

There may be some performance differences between the two, but presently they seem nearly identical. Originally there was more of a distinction between the two and a trade-off in performance vs functionality, but at this point, the difference is largely an ergonomic one.
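
To make the 4-characters-to-3-bytes relationship concrete, here is a standalone base64 sketch that does not depend on this crate. The function names (sextet, decode_group, decode_chunked, decode_single) are made up for the example, padding is not handled, and the crate's own Chunked and Single types are lazy iterators rather than functions returning a Vec.

```rust
// Hypothetical sketch of chunked vs. byte-at-a-time base64 output.

/// Map one base64 alphabet character to its 6-bit value.
fn sextet(c: char) -> Option<u32> {
    match c {
        'A'..='Z' => Some(c as u32 - 'A' as u32),
        'a'..='z' => Some(c as u32 - 'a' as u32 + 26),
        '0'..='9' => Some(c as u32 - '0' as u32 + 52),
        '+' => Some(62),
        '/' => Some(63),
        _ => None,
    }
}

/// Decode one full group of 4 characters (24 bits) into 3 bytes.
fn decode_group(group: [char; 4]) -> Option<[u8; 3]> {
    let mut bits = 0u32;
    for &c in group.iter() {
        bits = (bits << 6) | sextet(c)?;
    }
    Some([(bits >> 16) as u8, (bits >> 8) as u8, bits as u8])
}

/// "Chunked" style: one 3-byte array per 4 input characters.
/// Whitespace is skipped; a trailing partial group is an error here.
fn decode_chunked(input: &str) -> Option<Vec<[u8; 3]>> {
    let chars: Vec<char> = input.chars().filter(|c| !c.is_whitespace()).collect();
    chars
        .chunks(4)
        .map(|g| {
            if g.len() == 4 {
                decode_group([g[0], g[1], g[2], g[3]])
            } else {
                None
            }
        })
        .collect()
}

/// "Single" style: the same bytes, flattened to one byte at a time.
fn decode_single(input: &str) -> Option<Vec<u8>> {
    let chunks = decode_chunked(input)?;
    Some(chunks.into_iter().flat_map(|chunk| chunk.to_vec()).collect())
}
```

For instance, the first four characters of the sample key, MIIB, decode to the three bytes 0x30 0x82 0x01. Both styles produce the same bytes; they differ only in how the output is grouped.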

Resilient parsing

The major types of this crate (BoundaryParser, Chunked, and Single) are all iterators. It's obvious why the body parsers are iterators: they need to iterate over the bytes of output. But why is BoundaryParser one?

Basically, it makes parsing more resilient. If the underlying stream emits an error, it can be forwarded to the caller and dealt with without losing parsing state. Is this useful? Probably not. Most of the time you'd just want to fail if the stream errors. But it is kind of neat.
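
The idea can be sketched without this crate. PrefixParser below is a hypothetical type written for this illustration: it matches a fixed ASCII prefix from a fallible character stream, yields each stream error as an iterator item, and keeps its position so the caller can simply call next() again. It is only meant to show why an iterator is a convenient shape for this, not how BoundaryParser is implemented.

```rust
// Hypothetical sketch of a parser that survives stream errors; not the pem-iterator API.

struct PrefixParser<'a, I> {
    stream: I,
    expected: &'a [u8], // assumed to be ASCII
    matched: usize,     // how many expected characters have been seen so far
    failed: bool,       // set on a wrong character or premature end of input
}

impl<'a, I> PrefixParser<'a, I> {
    fn new(stream: I, expected: &'a str) -> Self {
        PrefixParser { stream, expected: expected.as_bytes(), matched: 0, failed: false }
    }

    /// Number of prefix characters matched so far.
    fn matched(&self) -> usize {
        self.matched
    }

    /// True once the whole prefix has been matched.
    fn complete(&self) -> bool {
        !self.failed && self.matched == self.expected.len()
    }
}

impl<'a, I, E> Iterator for PrefixParser<'a, I>
where
    I: Iterator<Item = Result<char, E>>,
{
    /// Each item is an error forwarded from the underlying stream.
    type Item = E;

    fn next(&mut self) -> Option<E> {
        while !self.failed && self.matched < self.expected.len() {
            match self.stream.next() {
                Some(Ok(c)) if c == self.expected[self.matched] as char => self.matched += 1,
                Some(Ok(_)) | None => self.failed = true,
                // Hand the error to the caller; `matched` is left untouched,
                // so the next call resumes where parsing stopped.
                Some(Err(e)) => return Some(e),
            }
        }
        None
    }
}
```

With a stream that yields two dashes, then a transient error, then three more dashes, the first next() returns the error with two characters already matched, and the second next() returns None with the prefix complete.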

Commit count: 3