Crates.io | unstructured |
lib.rs | unstructured |
version | 0.5.1 |
source | src |
created_at | 2019-05-01 21:05:40.70921 |
updated_at | 2021-01-09 05:58:07.699838 |
description | Generic types for unstructured data |
repository | https://github.com/proctorlabs/unstructured-rs |
id | 131450 |
size | 101,227 |
This library provides types for usage with unstructured data. This is based on functionality from both serde_json and serde_value. Depending on your use case, it may make sense to use one of those instead.
These structures allow serialization and deserialization into an intermediate container with serde, and manipulation of the data while it is in this intermediate state.
So why not use one of the above libraries?
In addition to many of the features provided by the above libraries, unstructured also provides:
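Comparison with primitive types: Document::U64(100) == 100 as u64
Merging of documents: doc1.merge(doc2) or doc = doc1 + doc2
Selectors for retrieving nested values: doc.select(".path.to.key")
Filters for creating new documents from an array of input documents: docs.filter("[0].path.to.key | [1].path.to.array[0:5]")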
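To make the first two features concrete, here is a minimal std-only sketch of how comparison with a primitive and merging through `+` can be wired up with `PartialEq<u64>` and `Add`. The `Doc` enum and the `map_of` helper are hypothetical stand-ins invented for this illustration, not the crate's `Document`, and the shallow "right-hand side wins" merge rule is an assumption.

```rust
use std::collections::BTreeMap;
use std::ops::Add;

// Hypothetical stand-in for a document type; not the crate's `Document`.
#[derive(Debug, Clone, PartialEq)]
enum Doc {
    U64(u64),
    Text(String),
    Map(BTreeMap<String, Doc>),
}

// Lets `Doc::U64(100) == 100u64` compile: equal only for a U64 with the same value.
impl PartialEq<u64> for Doc {
    fn eq(&self, other: &u64) -> bool {
        matches!(self, Doc::U64(n) if n == other)
    }
}

// `a + b` merges two maps (assumed rule: on a key clash the right-hand value wins).
// For any other pairing the right-hand side replaces the left.
impl Add for Doc {
    type Output = Doc;

    fn add(self, rhs: Doc) -> Doc {
        match (self, rhs) {
            (Doc::Map(mut left), Doc::Map(right)) => {
                left.extend(right);
                Doc::Map(left)
            }
            (_, rhs) => rhs,
        }
    }
}

// Hypothetical helper that builds a map document from key/value pairs.
fn map_of(pairs: &[(&str, Doc)]) -> Doc {
    Doc::Map(pairs.iter().map(|(k, v)| (k.to_string(), v.clone())).collect())
}

fn main() {
    assert!(Doc::U64(100) == 100u64);

    let merged = map_of(&[("a", Doc::U64(1))]) + map_of(&[("b", Doc::Text("x".into()))]);
    assert_eq!(merged, map_of(&[("a", Doc::U64(1)), ("b", Doc::Text("x".into()))]));
}
```

The crate's own merge may treat nested maps and arrays differently; this sketch only shows the operator plumbing.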
The primary struct used in this repo is Document. Document provides methods for easy type conversion and manipulation.
use unstructured::Document;
use std::collections::BTreeMap;
let mut map = BTreeMap::new(); // Will be inferred as BTreeMap<Document, Document> though root element can be any supported type
map.insert("test".into(), 100u64.into()); // From<> is implemented for most basic data types
let doc: Document = map.into(); // Create a new Document where the root element is the map defined above
assert_eq!(doc["test"], Document::U64(100));
Document implements Serialize and Deserialize, so it can be used where the data format is unknown and manipulated after the data has been received.
#[macro_use]
extern crate serde;
use unstructured::Document;
#[derive(Deserialize, Serialize)]
struct SomeStruct {
    key: String,
}
fn main() {
    let from_service = "{\"key\": \"value\"}";
    let doc: Document = serde_json::from_str(from_service).unwrap();
    let expected: Document = "value".into();
    assert_eq!(doc["key"], expected);
    let some_struct: SomeStruct = doc.try_into().unwrap();
    assert_eq!(some_struct.key, "value");
    let another_doc = Document::new(some_struct).unwrap();
    assert_eq!(another_doc["key"], expected);
}
Selectors can be used to retrieve a reference to nested values, regardless of the incoming format.
JSON Pointer syntax: doc.select("/path/to/key")
A jq-inspired syntax: doc.select(".path.to.[\"key\"]")
use unstructured::Document;
let doc: Document =
    serde_json::from_str("{\"some\": {\"nested\": {\"value\": \"is this value\"}}}").unwrap();
let doc_element = doc.select("/some/nested/value").unwrap(); // Returns an Option<Document>, None if not found
let expected: Document = "is this value".into();
assert_eq!(*doc_element, expected);
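For intuition about what a path-style selector does, the sketch below walks a "/a/b/0" path one segment at a time and hands back a reference to the nested value without cloning. It uses only the standard library; the `Doc` enum, `select` function, and `sample` helper are hypothetical names for this illustration and are not the crate's implementation. RFC 6901 escapes ("~0", "~1") are deliberately left out.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in type; not the crate's `Document`.
#[derive(Debug, PartialEq)]
enum Doc {
    Text(String),
    Seq(Vec<Doc>),
    Map(BTreeMap<String, Doc>),
}

// Follows a "/key/index/..." path and returns a reference to the nested
// value, or None when the path is malformed or a segment does not match.
fn select<'a>(root: &'a Doc, path: &str) -> Option<&'a Doc> {
    if path.is_empty() {
        return Some(root); // the empty path refers to the whole document
    }
    let rest = path.strip_prefix('/')?; // non-empty paths must start with '/'
    rest.split('/').try_fold(root, |node, segment| match node {
        Doc::Map(map) => map.get(segment),
        Doc::Seq(items) => segment.parse::<usize>().ok().and_then(|i| items.get(i)),
        Doc::Text(_) => None, // scalars have no children
    })
}

// Builds {"some": {"nested": {"value": "is this value"}, "list": ["first"]}}.
fn sample() -> Doc {
    let mut nested = BTreeMap::new();
    nested.insert("value".to_string(), Doc::Text("is this value".to_string()));
    let mut some = BTreeMap::new();
    some.insert("nested".to_string(), Doc::Map(nested));
    some.insert("list".to_string(), Doc::Seq(vec![Doc::Text("first".to_string())]));
    let mut root = BTreeMap::new();
    root.insert("some".to_string(), Doc::Map(some));
    Doc::Map(root)
}

fn main() {
    let doc = sample();
    let found = select(&doc, "/some/nested/value");
    assert_eq!(found, Some(&Doc::Text("is this value".to_string())));
    assert_eq!(select(&doc, "/some/missing"), None);
}
```

Returning `Option<&Doc>` keeps the lookup allocation-free; the caller decides whether to clone the result.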
In addition to selectors, filters can be used to create new documents from an array of input documents.
use unstructured::Document;
let docs: Vec<Document> = vec![
    serde_json::from_str(r#"{"some": {"nested": {"vals": [1,2,3]}}}"#).unwrap(),
    serde_json::from_str(r#"{"some": {"nested": {"vals": [4,5,6]}}}"#).unwrap(),
];
let result = Document::filter(&docs, "[0].some.nested.vals | [1].some.nested.vals").unwrap();
assert_eq!(result["some"]["nested"]["vals"][4], Document::U64(5));
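The assertion above suggests that the two sides of `|` are combined by merging maps key by key and concatenating arrays, so that index 4 of the combined `vals` is 5. The std-only sketch below reproduces that outcome under that assumption. The `Doc` enum and the `merge` and `nested` functions are hypothetical and written for this illustration; they are not the crate's code, and the real filter semantics may differ.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

// Hypothetical stand-in type; not the crate's `Document`.
#[derive(Debug, Clone, PartialEq)]
enum Doc {
    U64(u64),
    Seq(Vec<Doc>),
    Map(BTreeMap<String, Doc>),
}

// Merges `other` into `base` (assumed rules): maps merge key by key,
// sequences are concatenated, any other pairing is overwritten by `other`.
fn merge(base: &mut Doc, other: Doc) {
    match (base, other) {
        (Doc::Map(left), Doc::Map(right)) => {
            for (key, value) in right {
                match left.entry(key) {
                    Entry::Occupied(mut slot) => merge(slot.get_mut(), value),
                    Entry::Vacant(slot) => {
                        slot.insert(value);
                    }
                }
            }
        }
        (Doc::Seq(left), Doc::Seq(right)) => left.extend(right),
        (slot, other) => *slot = other,
    }
}

// Builds {"some": {"nested": {"vals": [...]}}} from a slice of numbers.
fn nested(vals: &[u64]) -> Doc {
    let seq = Doc::Seq(vals.iter().copied().map(Doc::U64).collect());
    ["vals", "nested", "some"].iter().fold(seq, |inner, key| {
        let mut map = BTreeMap::new();
        map.insert(key.to_string(), inner);
        Doc::Map(map)
    })
}

fn main() {
    let mut result = nested(&[1, 2, 3]);
    merge(&mut result, nested(&[4, 5, 6]));
    // The combined array holds all six values, so index 4 is 5.
    assert_eq!(result, nested(&[1, 2, 3, 4, 5, 6]));
}
```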