indentation_flattener

Crates.ioindentation_flattener
lib.rsindentation_flattener
version0.1.0
sourcesrc
created_at2017-04-22 17:56:30.917885
updated_at2017-04-22 17:56:30.917885
descriptionFrom indented input, generate plain output with indentation PUSH and POP codes.
homepage
repositoryhttps://github.com/jleahred/indentation_flattener
max_upload_size
id11568
size55,665
José Luis (jleahred)

documentation

README

= Indentation flattener

A small an simple library to transform text with indentations to a text with extra symbols "PushIndent" and "PopIndent"

This project is based on https://github.com/jleahred/indent_tokenizer[indent_tokenizer]

This format is more convenient to use with peg grammars (as I will do)

== Usage

Add to cargo.toml [source, toml]

[dependencies] pending...

See example below

== Modifs

None

== Indentation format

Tabs are no valid on indentation grouping.

Indentation "Push" and "Pop" will be inserted on each indentation level modif.

Let's see by example.

.Simple valid input

..... ... .... new ident level .... new ident level .... .... one back indent level .... .... new indent level .... two back indent level .... .... new indent level

Indentation groups can have any number of spaces

.Valid indentation different spaces

..... .... level1 <-- .... level2 .... level1 <-- .... .... .... .... level1 <-- .... level2 <--

It's not a good idea to have same level with different spaces, but it's allowed when you are creating a new level.

In this example, last level1 and last level2 are idented with more spaces than previus ones

.Invalid indentation

..... ... .... .... .... <-- incorrect indentation .... <-- correct previous ident level .... .... .... ....

In order to go back a level, the indentation has to match with the previous on this level.

As we saw in previous example, increasing level is free indentation.

.Start line indicator

|..... |..... |...... |......

You can start lines with |, but it's optional.

.Start line indicator is optional

..... |1234 1234 ...... ......

But if you combine lines with | and without on same level, columns will not match

It is usefull when you need to start with spaces.

.I want to start a line with spaces

..... | ..... <- This line starts with an space | ...... <- Starting with 2 spaces |..... <- starts with no spaces ..... <- starting with no spaces ... <- starting with no spaces

.My line starts with a |

..... ||..... This line starts with a | |...... This one starts with .

..... ..... ||..... This line starts with a | ...... This one starts with .


A line is empty when there are no content or it only has spaces (no matter how many).

Empty lines will produce new line char and no change of indentation level

.Empty lines

..... ..... ..... ..... ..... next line is empty

.....     next line is empty

..... ..... next line is empty


What if I want represent empty lines?

.Representing empty lines

..... ..... ..... There is a new line after (same indent level)

.....
.....     There is a new line after (explicitly marked)
|
.....     three new lines after
|
|
|

..... Two new lines at end of document | |

| is quite usefull if you need to represent empty lines at end of document.

What if I want to represent spaces at end of line?

Spaces at end of line will not be erased, therefore, you don't need to do anything about it.

But it could be intesting to represent it because some editors can run trailing or just because you can visualize it.

.Representing spaces at end line

..... ..... ..... ..... This line keeps 2 spaces and end | and you know it

Next line is properly indented and only has spaces
|   |

In fact, you can write | at end of all lines. It will be removed.

Next strings, are equivalent.

.| it's optional at end of line

.....| .....| .....| .....|

..... ..... ..... .....


But I could need a pipe | at end of line

.pipe at end of line

..... ..... ..... ..... This line ends with a pipe||


== Output format

The output will be a string with codes PUSH_INDENT and POP_INDENT

.From lib.rs [source, rust]

const PUSH_INDENT: char = 0x02 as char;
const POP_INDENT: char = 0x03 as char;

Spaces to mark indentation, will be removed from output.

See examples below.

As the system works with lines, every existing line with content, will finish with end of line

== API

It works with concrete types vs general types (as String, u32 or usize)

Constants:: [source, rust]

const EOL: char = '\n'; const PUSH_INDENT: char = 0x02 as char; const POP_INDENT: char = 0x03 as char;

Concrete types:: [source, rust]

#[derive(Debug, PartialEq, Copy, Clone)] pub struct LineNum(u32);

#[derive(Debug, PartialEq, Clone, Eq)] pub struct SLine(String);

#[derive(Debug, PartialEq, Clone, Eq, Default)] pub struct SFlattedText(String);

Function to call:: [source, rust]

pub fn flatter(input: &str) -> Result<SFlattedText, Error>

Error type:: [source, rust]

#[derive(Debug, PartialEq)] pub struct Error { pub line: LineNum, pub desc: String, }

Thats all

Look into lib.rs

== Examples

You can look into tests.rs, there are several tests.

.Simple example [source, rust]

this input...

0 01 02 020 021 023 0230 0231

produces...

0 \u{2}01 02 \u{2}020 021 023 \u{2}0230 0231 \u{3}\u{3}\u{3}"

As you can see, indentations has been removed by codes to mark PUSH_INDENT and POP_INDENT

[NOTE] All lines are finished with new line. If last line has not a new line, the system will insert one

.Complex example [source, rust]

let flat = flatter("

0 || 01a 01b 01c

 02a
 02b

    |020a
    ||020b

    |  021a
    |021b

1a 1b 11a ||11b 11c

12a  ||
|12b  ||

2a 21a 21b | |

") .unwrap();

assert!(flat ==
        SFlattedText::from("

0 \u{2}| 01a 01b 01c

02a 02b

\u{2}020a |020b

021a 021b \u{3}\u{3}1a 1b \u{2}11a |11b 11c

12a | 12b | \u{3}2a \u{2}21a 21b

\u{3}"));

More examples on tests.rs

Commit count: 6

cargo fmt