Crates.io | ason |
lib.rs | ason |
source | src |
created_at | 2024-03-19 06:16:51.967953 |
updated_at | 2024-12-26 00:39:42.568403 |
description | ASON is a data format that evolved from JSON, introducing strong data typing and support for variant types. |
homepage | https://hemashushu.github.io/works/ason |
repository | https://github.com/hemashushu/ason |
id | 1178879 |
ASON is a data format evolved from JSON, featuring strong data types and support for variant types. It offers excellent readability and maintainability. ASON is well-suited for configuration files, data transfer, and data storage.
JSON Compatibility: ASON integrates with base JSON and JSON5 syntax, making it easy for JSON users to transition to ASON.
Simple and Consistent Syntax: ASON's syntax closely resembles Rust, featuring support for comments, omitting double quotes for structure (object) field names, and allowing trailing commas in the last array element. These features enhance familiarity and writing fluency.
Strong Data Typing: ASON numbers can be explicitly typed (e.g., u8, i32, f32, f64), and integers can be represented in hexadecimal and binary formats. Additionally, new data types such as DateTime, Tuple, HexByteData, and Char are introduced, enabling more precise and rigorous data representation.
Native Variant Data Type Support, Eliminating the Null Value: ASON natively supports variant data types (also known as algebraic types, similar to enums in Rust). This enables seamless serialization of complex data structures from high-level programming languages. Importantly, it eliminates the error-prone null value.
An example of ASON text:
{
    string: "Hello World 🍀"
    raw_string: r"[a-z]+\d+"
    integer_number: 123
    floating_point_number: 3.14
    number_with_explicit_type: 255_u8
    boolean: true
    datetime: d"2023-03-24 12:30:00+08:00"
    bytedata: h"68 65 6c 6c 6f"
    list: [1, 2, 3]
    map: [
        123: "Alice"
        456: "Bob"
    ]
    tuple: (1, "foo", true)
    object: {
        id: 123
        name: "Alice"
    }
    variant: Option::None
    variant_with_value: Option::Some(123)
    tuple_style_variant: Color::RGB(255, 127, 63)
    object_style_variant: Shape::Rect{
        width: 200
        height: 100
    }
}
ASON is a strongly typed evolution of JSON that is simpler, more consistent, and more expressive, with the following improvements:
The Variant data type is added, and the null value is removed.
Char, DateTime, Tuple, HexByteData are added.
List requires all elements to be of the same data type.
List, Tuple, Object and Map.
ASON, YAML, and TOML are all simple enough to express data well when the dataset is small. However, when dealing with larger datasets, the results can vary.
YAML uses indentation to represent hierarchy, so the number of space characters in the prefix needs to be carefully controlled, and it is easy to make mistakes when editing multiple layers even with editor assistance. In addition, its specification is quite complex.
TOML is not good at expressing hierarchy: the text often contains redundant key names, and lists of objects are not as clear as in other formats.
ASON, on the other hand, has good consistency regardless of the size of the data. Of course, you still need to be careful that the braces are paired, but I don't think this is a problem with the help of modern text editors.
The file extension for ASON files is *.ason, for example: sample.ason, package.ason
The Rust ASON library provides two sets of APIs for accessing ASON data: one based on serde for serialization and deserialization, and the other based on AST (Abstract Syntax Tree) for low-level access.
In general, it is recommended to use the serde API since it is simple enough to meet most needs.
Consider the following ASON text:
{
    name: "foo"
    type: Type::Application
    version: "0.1.0"
    dependencies: [
        "random": Option::None
        "regex": Option::Some("1.0.1")
    ]
}
This text consists of an object and a map: the object has the fields name, type, version and dependencies, and the map has strings as keys and optional strings as values. We need to create a Rust struct corresponding to this data:
use serde::{Deserialize, Serialize};
use std::collections::HashMap;

#[derive(Debug, PartialEq, Serialize, Deserialize)]
struct Package {
    name: String,
    #[serde(rename = "type")]
    type_: Type,
    version: String,
    dependencies: HashMap<String, Option<String>>,
}
Note that this struct has a derive attribute, in which Serialize and Deserialize are traits provided by the serde serialization framework. Apply them to any struct that needs to be serialized and deserialized.
Since "type" is a keyword in Rust, we use "type_" as the field name and add the #[serde(rename = "type")] attribute to tell serde to serialize the field as "type". Then we create an enum named "Type":
#[derive(Debug, PartialEq, Serialize, Deserialize)]
enum Type {
    Application,
    Library,
}
Now that the preparation is done, use the function ason::from_str to deserialize the ASON text into a Rust struct instance:
let text = "..."; // The above ASON text
let package = from_str::<Package>(text).unwrap();
The function ason::to_string serializes a Rust struct instance to a string:
let package = Package{...}; // Feel free to build the `Package` instance
let s = to_string(&package);
The library also provides a set of low-level APIs for building and manipulating ASON data.
Consider the following ASON text:
{
id: 123
name: "John"
orders: [11, 13]
}
Use the parse_from_str function to parse the above ASON text into an AST:
let text = "..."; // The above ASON text
let node = parse_from_str(text).unwrap();
assert_eq!(
    node,
    AsonNode::Object(vec![
        KeyValuePair {
            key: String::from("id"),
            value: Box::new(AsonNode::Number(Number::I32(123)))
        },
        KeyValuePair {
            key: String::from("name"),
            value: Box::new(AsonNode::String(String::from("John")))
        },
        KeyValuePair {
            key: String::from("orders"),
            value: Box::new(AsonNode::List(vec![
                AsonNode::Number(Number::I32(11)),
                AsonNode::Number(Number::I32(13))
            ]))
        }
    ])
);
Conversely, the function ason::print_to_string formats the AST into text:
let s = print_to_string(&node);
ASON is composed of values and comments.
There are two types of values: primitive and compound. Primitive values are basic data types like integers, strings, booleans and datetimes. Compound values are structures made up of multiple values (both primitive and compound), such as lists and objects.
Here are examples of primitive values:
Integers: 123, +456, -789
Floating-point numbers: 3.142, +1.414, -1.732
Floating-point with exponent: 2.998e10, 6.674e-11
Special floating-point numbers: NaN, Inf, +Inf, -Inf
Underscores can be inserted between any digits of a number, e.g. 123_456_789, 6.626_070_e-34
The data type of a number can be explicitly specified by appending the type name after the number, e.g. 65u8, 3.14f32
Underscores can also be inserted between the number and the type name, e.g. 933_199_u32, 6.626e-34_f32
Each number in ASON has a specific data type. If not explicitly specified, the default data type is i32 for integers and f64 for floating-point numbers. ASON supports these numeric data types: i8, u8, i16, u16, i32, u32, i64, u64, f32, f64
Hexadecimal integers: 0x41, +0x51, -0x61, 0x71_u8
Binary integers: 0b1100, +0b1010, -0b0101, 0b0110_1001_u8
Floating-point numbers in C/C++ language hexadecimal floating-point literal format: 0x1.4p3, 0x1.921f_b6p1_f32
Note that you cannot represent a floating-point number by simply appending the "f32" or "f64" suffix to a normal hexadecimal integer, for example 0x21_f32. This is because the character "f" is one of the hexadecimal digit characters (i.e., [0-9a-f]), so 0x21_f32 will only be parsed as a normal hexadecimal integer 0x21f32.
Booleans: true, false
Characters: 'a', '文', '😊'
Escape characters: '\r', '\n', '\t', '\\'
Unicode escape characters: '\u{2d}', '\u{6587}'
Strings: "abc文字😊", "foo\nbar"
Raw strings: r"[a-z]+\d+", r#"<\w+\s(\w+="[^"]+")*>"#
Date and time: d"2024-03-16", d"2024-03-16 16:30:50", d"2024-03-16T16:30:50Z", d"2024-03-16T16:30:50+08:00"
Byte data: h"11 13 17 19"
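The rule about hexadecimal literals and the "f32" suffix can be illustrated with a small Rust sketch. This is not the ason crate's lexer: split_hex_literal is a hypothetical helper that assumes an unsigned literal with a lowercase 0x prefix, and it only shows why the characters "f32" are absorbed into the digits while "u8" is recognised as a type suffix.

```rust
/// Splits a hexadecimal integer literal such as "0x71_u8" into its numeric
/// value and its type suffix. Hypothetical helper, for illustration only.
fn split_hex_literal(literal: &str) -> Option<(u64, String)> {
    // Assumption: only unsigned literals with a lowercase "0x" prefix.
    let body = literal.strip_prefix("0x")?;
    // Underscores are purely visual separators, so drop them first.
    let compact: String = body.chars().filter(|c| *c != '_').collect();
    // Consume the longest run of hexadecimal digits; the rest is the suffix.
    let digit_count = compact
        .chars()
        .take_while(|c| c.is_ascii_hexdigit())
        .count();
    let (digits, suffix) = compact.split_at(digit_count);
    let value = u64::from_str_radix(digits, 16).ok()?;
    Some((value, suffix.to_string()))
}
```

With this sketch, "0x71_u8" yields the value 0x71 with the suffix "u8", whereas "0x21_f32" yields the value 0x21f32 with an empty suffix, because "f", "3" and "2" are all valid hexadecimal digits.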
To improve readability, ASON supports writing strings across multiple lines. Simply add a \ symbol at the end of the line and start a new line. The subsequent text will be automatically appended to the current string. For example:
{
    long_string: "My very educated \
        mother just served \
        us nine pizzas"
}
This string is equivalent to "My very educated mother just served us nine pizzas". Note that all leading whitespace in the continued lines is automatically removed.
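The line-continuation behaviour described above can be approximated in Rust as follows. This is a minimal sketch, not the library's implementation: join_continued_lines is a hypothetical function, it assumes Unix line breaks (\n), treats only spaces and tabs as indentation, and does not handle an escaped backslash (\\) at the end of a line.

```rust
/// Joins lines that end with a backslash, dropping the line break and the
/// indentation of the following line. Hypothetical helper for illustration.
fn join_continued_lines(raw: &str) -> String {
    let mut out = String::new();
    let mut chars = raw.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\\' && chars.peek() == Some(&'\n') {
            // Consume the line break that follows the backslash.
            chars.next();
            // Skip the leading whitespace of the continued line.
            while matches!(chars.peek(), Some(&' ') | Some(&'\t')) {
                chars.next();
            }
        } else {
            out.push(c);
        }
    }
    out
}
```

Applied to the body of long_string above, the function returns the single-line sentence, because each backslash, line break, and the indentation after it are removed together.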
ASON also supports multi-line strings, for example:
{
multiline_string: "Planets
1. Mercury Venus Earth
2. Mars Jupiter Saturn
3. Uranus Neptune"
}
This represents a string with 4 lines. It is equivalent to:
Planets
1. Mercury Venus Earth
2. Mars Jupiter Saturn
3. Uranus Neptune
Note that all leading whitespace is preserved.
ASON supports "auto-trimmed strings", which automatically trim the same number of leading whitespace characters from each line, for example:
{
    auto_trimmed_string: """
        Planets and Satellites
        1. The Earth
           - The Moon
        2. Saturn
           - Titan
             Titan is the largest moon of Saturn.
           - Enceladus
        3. Jupiter
           - Io
             Io is one of the four Galilean moons of the planet Jupiter.
           - Europa
        """
}
In the example above, not every line has the same number of leading whitespace characters: some lines have 8, others 11 or 13. Every line is trimmed by the same number (i.e. 8) of leading whitespace characters, so the string is equivalent to:
Planets and Satellites
1. The Earth
   - The Moon
2. Saturn
   - Titan
     Titan is the largest moon of Saturn.
   - Enceladus
3. Jupiter
   - Io
     Io is one of the four Galilean moons of the planet Jupiter.
   - Europa
It is worth noting that when writing auto-trimmed strings:
The opening """ must be followed by a new line.
The closing """ must start on a new line, and its leading spaces are not counted.
The line break (\n) before the closing """ is not part of the text.
For example:
[
    """
        hello
    """, """
        foo

        bar
    """
]
This list is equivalent to ["hello","foo\n\nbar"].
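The trimming step can be sketched in Rust as shown below. This is an illustration under stated assumptions rather than the library's algorithm: auto_trim is a hypothetical function that receives only the text between the opening and closing """ lines, assumes indentation made of ASCII spaces, and takes the common amount to be the smallest indentation found among the non-blank lines.

```rust
/// Removes the common leading indentation from every line of `body`.
/// Hypothetical helper; blank lines are ignored when measuring indentation.
fn auto_trim(body: &str) -> String {
    let lines: Vec<&str> = body.lines().collect();
    // Smallest indentation (in bytes) among the non-blank lines.
    let indent = lines
        .iter()
        .filter(|line| !line.trim().is_empty())
        .map(|line| line.len() - line.trim_start().len())
        .min()
        .unwrap_or(0);
    lines
        .iter()
        // Lines shorter than the indentation (e.g. blank lines) become empty.
        .map(|line| line.get(indent..).unwrap_or(""))
        .collect::<Vec<&str>>()
        .join("\n")
}
```

Deeper indentation is kept relative to the common amount, so a line with 11 leading spaces in a block indented by 8 keeps 3 of them, and blank lines between paragraphs survive as empty lines.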
An Object can contain multiple values, each with a name called a key. The keys are identifiers which are similar to strings but without quotation marks. A combination of a key and a value is called a key-value pair. An Object is a collection of key-value pairs. For example:
{
    name: "ason",
    version: "1.0.1",
    edition: "2021", // Note that this comma is allowed.
}
Note that ASON Objects allow a comma at the end of the last key-value pair, which is not allowed in JSON. This feature is primarily intended to make it easy to reorder key-value pairs when editing ASON text.
The comma at the end of each key-value pair is optional, so the text above could be written as:
{
name: "ason" // Note that commas can be omitted.
version: "1.0.1"
edition: "2021"
}
Of course, multiple key-value pairs can also be written on a single line. In this case, commas are required between key-value pairs. For example:
{name: "ason", version: "1.0.1", edition: "2021",}
The values within an Object can be any type, including primitive values (such as numbers, strings, dates) and compound values (such as Lists, Objects, Tuples). In the real world, an Object usually contains other Objects, for example:
{
name: "ason"
version: "1.0.1"
edition: "2021"
dependencies: {
serde: "1.0"
chrono: "0.4"
}
dev_dependencies: {
pretty_assertions: "1.4"
}
}
A List is a collection of values of the same data type, for example:
[11, 13, 17, 19]
Similar to objects, the elements in a List can also be written on separate lines, with optional commas at the end of each line, and a comma is allowed at the end of the last element. For example:
[
    "Alice",
    "Bob",
    "Carol",
    "Dan", // Note that this comma is allowed.
]
and
[
"Alice" // Note that commas can be omitted.
"Bob"
"Carol"
"Dan"
]
The elements in a List can be of any data type, but all the elements in a List must be of the same type. For instance, the following List is invalid:
// invalid list due to inconsistent data types of elements
[11, 13, "Alice", "Bob"]
If the elements in a List are Objects, then the keys in each object, as well as the data type of the corresponding values, must be consistent. In other words, the type of object is determined by the type of all key-value pairs, and the type of key-value pair is determined by the key name and data type of the value. For example, the following List is valid:
[
{
id: 123
name: "Alice"
}
{
id: 456
name: "Bob"
}
]
While the following List is invalid:
[
{
id: 123
name: "Alice"
}
{
id: 456
name: 'A' // The data type of the value is not consistent.
}
{
id: 789
addr: "Green St." // The key name is not consistent.
}
]
If the elements in a List are Lists, then the data type of the elements in each sub-list must be the same. In other words, the type of a List is determined by the data type of its elements. The number of elements, however, is irrelevant. For instance, the following list is valid:
[
    [11, 13, 17] // The length of this list is 3.
    [101, 103, 107, 109] // A list of length 4 is OK.
    [211, 223] // A list of length 2 is also OK.
]
In the example above, although the length of each sub-list is different, since the type of a List is determined ONLY by the type of its elements, the types of these sub-lists are asserted to be the same, and therefore it is a valid List.
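The List typing rules above can be modelled with a small Rust sketch. Everything here is hypothetical and independent of the ason crate's AsonNode type: Value is a cut-down value tree, and type_signature returns a textual type for a value, or None when a List mixes element types. As a simplification, Object fields are compared in the order they are written.

```rust
/// A cut-down, hypothetical value tree (not the crate's AsonNode).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    I32(i32),
    Str(String),
    Char(char),
    List(Vec<Value>),
    Object(Vec<(String, Value)>),
}

/// Returns a textual type signature, or None if any List is inconsistent.
fn type_signature(value: &Value) -> Option<String> {
    match value {
        Value::I32(_) => Some("i32".to_string()),
        Value::Str(_) => Some("String".to_string()),
        Value::Char(_) => Some("Char".to_string()),
        Value::List(items) => {
            // Every element must itself be well typed.
            let signatures: Vec<String> = items
                .iter()
                .map(type_signature)
                .collect::<Option<Vec<String>>>()?;
            // All elements must share one type; the length is irrelevant.
            if signatures.windows(2).any(|pair| pair[0] != pair[1]) {
                return None;
            }
            let element = signatures
                .first()
                .cloned()
                .unwrap_or_else(|| "?".to_string());
            Some(format!("List<{}>", element))
        }
        Value::Object(fields) => {
            // An Object's type is made of its key names and value types.
            let mut parts = Vec::new();
            for (key, field_value) in fields {
                parts.push(format!("{}: {}", key, type_signature(field_value)?));
            }
            Some(format!("{{{}}}", parts.join(", ")))
        }
    }
}
```

Under this model, a List of sub-lists with different lengths still has the single type List<List<i32>>, while a List whose Objects differ in a key name or in a value type has no valid type.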
A Map is a list composed of one or more name-value pairs. In appearance, a Map is similar to an Object, but the names of items in a Map are typically strings or numbers (primitive data types), rather than identifiers. Additionally, a Map is a special kind of list, so it is enclosed in square brackets ([...]) instead of curly braces ({...}). For example:
[
"serde": "1.0"
"serde_bytes": "0.11"
"chrono": "0.4.38"
]
A Tuple can be considered as an Object that omits the keys, for example:
(11, "Alice", true)
Tuples are similar in appearance to Lists, but Tuples do not require the data types of each element to be consistent. In addition, both the data types and the number of elements are part of a Tuple's type. For example, ("Alice", "Bob") and ("Alice", "Bob", "Carol") are different types of Tuples because they don't have the same number of elements.
Similar to Objects and Lists, the elements of a Tuple can also be written on separate lines, with optional commas at the end of each line, and there can be a comma at the end of the last element. For example:
(
    "Alice",
    11,
    true, // Note that this comma is allowed.
)
and
(
"Alice" // Note that commas can be omitted.
11
true
)
A Variant consists of three parts: the Variant type name, the Variant member name, and the optional member value. For example:
// Variant without value.
Option::None
and
// Variant with a value.
Option::Some(11)
In the two Variants in the above example, "Option" is the Variant type name, "None" and "Some" are the Variant member names, and "11" is the Variant member value.
The types are the same as long as the Variant type names are the same. For example, Color::Red and Color::Green are of the same type, while Option::None and Color::Red are of different types.
If a Variant member carries a value, then the type of the value is also part of the type of the Variant member. For example, Option::Some(11) and Option::Some(13) are of the same type, but Option::Some(11) and Option::Some("John") are of different types.
Therefore, the following List is valid because all elements have the same Variant type name and the member Some has the same type:
[
Option::None
Option::Some(11)
Option::None
Option::Some(13)
]
However, the following List is invalid: although the variant type names of all the elements are consistent, the type of the member Some is inconsistent:
[
Option::None
Option::Some(11)
Option::Some("John") // The type of this member is not consistent.
]
A Variant member can carry a value of any type, such as an Object:
Option::Some({
id: 123
name: "Alice"
})
Or a Tuple:
Option::Some((211, 223))
In fact, a Variant member can also directly carry multiple values, in either Object style or Tuple style, for example:
// Object-style variant member
Shape::Rectangle{
    width: 307
    height: 311
}
and
// Tuple-style variant member
Color::RGB(255, 127, 63)
Like JavaScript and C/C++, ASON also supports two types of comments: line comments and block comments. Comments are for human reading and are completely ignored by the machine.
Line comments start with the // symbol and continue until the end of the line. For example:
// This is a line comment.
{
id: 123 // This is also a line comment.
name: "Bob"
}
Block comments start with the /* symbol and end with the */ symbol. For example:
/* This is a block comment. */
{
/*
This is also a block comment.
*/
id: 123
name: /* Definitely a block comment. */ "Bob"
}
Unlike JavaScript and C/C++, ASON block comments support nesting. For example:
/*
This is the first level.
/*
This is the second level.
*/
This is the first level again.
*/
The nesting feature of block comments makes it more convenient to comment out a piece of code that already contains a block comment. If block comments did not support nesting, as in JavaScript and C/C++, we would need to remove the inner block comment before wrapping the outer layer in a comment, because the inner */ symbol would end the outer block comment prematurely.
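One common way to support nested block comments is to keep a depth counter instead of a single "inside a comment" flag. The Rust sketch below illustrates that idea; it is not taken from the ason source code. The function strip_block_comments is hypothetical and deliberately ignores string literals and line comments.

```rust
/// Removes (possibly nested) block comments from `text`.
/// Hypothetical helper; string literals and line comments are not handled.
fn strip_block_comments(text: &str) -> String {
    let chars: Vec<char> = text.chars().collect();
    let mut out = String::new();
    let mut depth = 0usize; // Number of block comments currently open.
    let mut i = 0;
    while i < chars.len() {
        let next = chars.get(i + 1).copied();
        match (chars[i], next) {
            // "/*" opens one more nesting level.
            ('/', Some('*')) => {
                depth += 1;
                i += 2;
            }
            // "*/" closes the innermost open level only.
            ('*', Some('/')) if depth > 0 => {
                depth -= 1;
                i += 2;
            }
            // Any other character is kept only outside of comments.
            (c, _) => {
                if depth == 0 {
                    out.push(c);
                }
                i += 1;
            }
        }
    }
    out
}
```

Because the inner */ only decrements the counter, the text after it is still treated as part of the outer comment, which is exactly the behaviour a non-nesting scanner cannot provide.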
An ASON document can only contain one value (one primitive value or one compound value). Like JSON, a typical ASON document is usually an Object or a List, but all types of values are allowed: a Tuple, a Variant, or even a number or a string. Just make sure that a document has exactly one value. For example, the following are both valid ASON documents:
// Valid ASON document.
(11, "Alice", true)
and
// Valid ASON document.
"Hello World!"
While the following two are invalid:
// Invalid ASON document because there are 2 values.
(11, "Alice", true)
"Hello World!"
and
// Invalid ASON document because there are 3 values.
11, "Alice", true
ASON natively supports most Rust data types, including tuples, enums and vectors. Because ASON is also strongly typed, both serialization and deserialization preserve the type of each value. In fact, ASON maps onto Rust's data types more closely than other data formats such as JSON, YAML and TOML.
The following is a list of supported Rust data types:
Integers, from i8/u8 to i64/u64
Floating-point numbers, f32 and f64
Fixed-length arrays, such as [i32; 4]
In general, we use structs in Rust to store a group of related data. Rust structs correspond to ASON Object. The following is an example of a struct named "User" and its instance s1:
#[derive(Serialize, Deserialize)]
struct User {
id: i32,
name: String
}
let s1 = User {
id: 123,
name: String::from("John")
};
The corresponding ASON text for instance s1 is:
{
id: 123
name: "John"
}
Real-world data is often complex, for example, a struct containing another struct to form a hierarchical relationship. The following code demonstrates a struct User that contains a child struct named Address:
#[derive(Serialize, Deserialize)]
struct User {
    id: i32,
    name: String,
    address: Box<Address>
}

#[derive(Serialize, Deserialize)]
struct Address {
    city: String,
    street: String
}

let s2 = User {
    id: 123,
    name: String::from("John"),
    address: Box::new(Address{
        city: String::from("Shenzhen"),
        street: String::from("Xinan")
    })
};
The corresponding ASON text for instance s2 is:
{
id: 123
name: "John"
address: {
city: "Shenzhen"
street: "Xinan"
}
}
Rust's HashMap corresponds to ASON's Map. For example, the following creates a HashMap instance m1 of type HashMap<String, Option<String>>:
let mut m1 = HashMap::<String, Option<String>>::new();
m1.insert("foo".to_owned(), Some("hello".to_owned()));
m1.insert("bar".to_owned(), None);
m1.insert("baz".to_owned(), Some("world".to_owned()));
The corresponding ASON text for instance m1 is:
[
    "foo": Option::Some("hello")
    "bar": Option::None
    "baz": Option::Some("world")
]
Vec (vector) is another common data structure in Rust, used for storing a series of similar data. Vec corresponds to ASON List. The following code demonstrates adding a field named orders to the struct User to store order numbers:
#[derive(Serialize, Deserialize)]
struct User {
id: i32,
name: String,
orders: Vec<i32>
}
let v1 = User {
id: 123,
name: String::from("John"),
orders: vec![11, 13, 17, 19]
};
The corresponding ASON text for instance v1 is:
{
id: 123
name: "John"
orders: [11, 13, 17, 19]
}
The elements in a vector can be either simple data (such as i32 in the above example) or complex data, such as structs. The following code demonstrates adding a field named addresses to the struct User to store shipping addresses:
#[derive(Serialize, Deserialize)]
struct User {
    id: i32,
    name: String,
    addresses: Vec<Address>
}

#[derive(Serialize, Deserialize)]
struct Address {
    city: String,
    street: String
}

let v2 = User {
    id: 123,
    name: String::from("John"),
    addresses: vec![
        Address {
            city: String::from("Guangzhou"),
            street: String::from("Tianhe")
        },
        Address {
            city: String::from("Shenzhen"),
            street: String::from("Xinan")
        },
    ]
};
The corresponding ASON text for instance v2 is:
{
id: 123
name: "John"
addresses: [
{
city: "Guangzhou"
street: "Tianhe"
}
{
city: "Shenzhen"
street: "Xinan"
}
]
}
Another common Rust data type is the tuple, which can be thought of as a struct with the field names omitted. Rust tuples correspond to ASON Tuple.
For instance, if you want the order list in the above example to include not only the order number but also the order status, you can use the tuple (i32, String) in place of i32. The modified code is:
#[derive(Serialize, Deserialize)]
struct User {
    id: i32,
    name: String,
    orders: Vec<(i32, String)>
}

let t1 = User {
    id: 123,
    name: String::from("John"),
    orders: vec![
        (11, String::from("ordered")),
        (13, String::from("shipped")),
        (17, String::from("delivered")),
        (19, String::from("cancelled"))
    ]
};
The corresponding ASON text for instance t1 is:
{
id: 123
name: "John"
orders: [
(11, "ordered")
(13, "shipped")
(17, "delivered")
(19, "cancelled")
]
}
It should be noted that in some programming languages, tuples and vectors are not clearly distinguished, but in Rust they are completely different data types. Vectors require that all elements have the same data type (Rust arrays are similar to vectors, but vectors have a variable number of elements, while arrays have a fixed size that cannot be changed after creation), while tuples do not require that their member data types be the same, but do require a fixed number of members. ASON's definition of Tuple is consistent with Rust's.
In the above example, the order status is represented by a string. Experience has shown that a better solution is to use an enum. Rust enums correspond to ASON Variant. The following code uses the enum Status to replace the String in Vec<(i32, String)>.
#[derive(Serialize, Deserialize)]
enum Status {
Ordered,
Shipped,
Delivered,
Cancelled
}
#[derive(Serialize, Deserialize)]
struct User {
id: i32,
name: String,
orders: Vec<(i32, Status)>
}
let e1 = User {
id: 123,
name: String::from("John"),
orders: vec![
(11, Status::Ordered),
(13, Status::Shipped),
(17, Status::Delivered),
(19, Status::Cancelled)
]
};
The corresponding ASON text for instance e1 is:
{
id: 123
name: "John"
orders: [
(11, Status::Ordered)
(13, Status::Shipped)
(17, Status::Delivered)
(19, Status::Cancelled)
]
}
Rust's enum type is quite powerful: it can not only represent different categories of something but also carry data. For example, consider the following enum Color:
#[derive(Serialize, Deserialize)]
enum Color {
Transparent,
Grayscale(u8),
Rgb(u8, u8, u8),
Hsl{
hue: i32,
saturation: u8,
lightness: u8
}
}
There are four types of values in Rust enums:
Color::Transparent
Color::Grayscale(u8)
Color::Rgb(u8, u8, u8)
Color::Hsl{...}
ASON Variant fully supports all flavours of Rust enums. Consider the following instance:
let e2 = vec![
Color::Transparent,
Color::Grayscale(127),
Color::Rgb(255, 127, 63),
Color::Hsl{
hue: 300,
saturation: 100,
lightness: 50
}
];
The corresponding ASON text for instance e2 is:
[
Color::Transparent
Color::Grayscale(127_u8)
Color::Rgb(255_u8, 127_u8, 63_u8)
Color::Hsl{
hue: 300
saturation: 100_u8
lightness: 50_u8
}
]
The ASON text closely resembles the Rust data literals, which is intentional. The design aims to reduce the learning curve for users by making ASON similar to existing data formats (JSON) and programming languages (Rust).
Some Rust data types are not supported, including:
The unit type ()
Unit structs, such as struct Foo;
Newtype structs, such as struct Width(u32);
Tuple structs, such as struct RGB(u8, u8, u8);
It is worth noting that the serde framework's data model does not include a DateTime type, so ASON DateTime cannot be directly serialized or deserialized to Rust's chrono::DateTime. If you serialize a chrono::DateTime value, you will get a regular string. A workaround is to wrap the chrono::DateTime value as an ason::Date type. For more details, please refer to the 'test_serialize' unit test in ason::serde::serde_date::tests in the library source code.
In addition, serde treats fixed-length arrays such as [i32; 4] as tuples rather than vectors, so the Rust array [11, 13, 17, 19] will be serialized as ASON Tuple (11, 13, 17, 19).
Check out LICENSE and LICENSE.additional.