Crates.io | mdbook-fix-cjk-spacing |
lib.rs | mdbook-fix-cjk-spacing |
version | 0.1.1 |
source | src |
created_at | 2020-07-20 07:49:54.316427 |
updated_at | 2020-07-20 09:38:31.821385 |
description | mdbook preprocess that fixes CJK line breaks |
homepage | https://github.com/lotabout/mdbook-fix-cjk-spacing |
repository | https://github.com/lotabout/mdbook-fix-cjk-spacing |
id | 267137 |
size | 101,488 |
mdbook renders an extra space between consecutive lines that contain CJK characters.
.....中文结尾
中文顶格...
will result in
.....中文结尾 中文顶格...
`- note the space here
This preprocessor will fix that.
First install the binary and make sure it is in your PATH.
cargo install mdbook-fix-cjk-spacing
Then configure the preprocessor in book.toml:
[preprocessor.fix-cjk-spacing]
command = "mdbook-fix-cjk-spacing"
This preprocessor works on the AST of the markdown file:
On each SoftBreak token, it searches before and after for a Text token.
The SoftBreak is omitted when the previous text ends with a CJK character and the next text starts with a CJK character.
The binary has a "raw" mode for showing the processed output:
cat markdown.md | mdbook-fix-cjk-spacing raw
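The SoftBreak rule described above can be sketched in a few lines of Rust. This is a minimal illustration, not the crate's actual source: `Token` is a hypothetical stand-in for the markdown parser's event type, `is_cjk` uses a set of Unicode ranges chosen for this example, and "search before and after" is read here as "take the nearest Text token in each direction".

```rust
/// Hypothetical stand-in for the parser's event type; only the variants
/// that matter for this sketch are modelled.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Text(String),
    SoftBreak,
    Other, // any other event (emphasis, links, ...)
}

/// Rough CJK test. The ranges are an assumption of this sketch; the real
/// preprocessor may use a different set.
fn is_cjk(c: char) -> bool {
    matches!(c,
        '\u{3000}'..='\u{303F}'   // CJK symbols and punctuation
        | '\u{3040}'..='\u{30FF}' // Hiragana and Katakana
        | '\u{3400}'..='\u{4DBF}' // CJK Unified Ideographs Extension A
        | '\u{4E00}'..='\u{9FFF}' // CJK Unified Ideographs
        | '\u{FF00}'..='\u{FFEF}' // Halfwidth and fullwidth forms
    )
}

/// Returns a copy of `tokens` without the SoftBreaks that sit between
/// text ending in a CJK character and text starting with one.
fn drop_cjk_soft_breaks(tokens: &[Token]) -> Vec<Token> {
    let mut out = Vec::with_capacity(tokens.len());
    for (i, tok) in tokens.iter().enumerate() {
        if *tok == Token::SoftBreak {
            // Last character of the nearest non-empty Text before the break.
            let prev = tokens[..i].iter().rev().find_map(|t| match t {
                Token::Text(s) => s.chars().last(),
                _ => None,
            });
            // First character of the nearest non-empty Text after the break.
            let next = tokens[i + 1..].iter().find_map(|t| match t {
                Token::Text(s) => s.chars().next(),
                _ => None,
            });
            if let (Some(p), Some(n)) = (prev, next) {
                if is_cjk(p) && is_cjk(n) {
                    continue; // omit this SoftBreak
                }
            }
        }
        out.push(tok.clone());
    }
    out
}
```

With this sketch, a SoftBreak between "中文结尾" and "中文顶格" is dropped, while a SoftBreak between "line 1" and "line 2" is kept, so Latin text still gets its separating space.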
In markdown, several consecutive lines are parsed as a single block:
line 1
line 2
line 3
// will be parsed as
<p>line 1
line 2
line 3</p>
That means the line breaks are kept and all three lines are treated as a single paragraph.
However, the browser converts each line break inside a <p> into a single space, so in a browser the content above looks like:
line 1 line 2 line 3
That is fine except when we write Chinese. Chinese does not put spaces between words, so when we write:
中文第一行
中文接上行
// will show as
中文第一行 中文接上行
// `- note the space here
It is really frustrating! So there are two major solutions:
The first option is not very practical: this 'bug' has existed for a long time and is still not fixed. The second is tedious and unfriendly.
So here comes our solution with mdbook: write a preprocessor that merges Chinese lines automatically before parsing!
Only the following situation is handled:
...<chinese character>[should contain no spaces]
[zero or more spaces|tabs]<chinese character>
.....中文结尾
中文顶格...
// are modified to
.....中文结尾中文顶格...
// `- note no space here
Note that content inside code blocks is not changed.
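The line-merging rule above can also be illustrated directly on raw text. The following Rust sketch is only an approximation under stated assumptions: `merge_cjk_lines` and `is_cjk` are hypothetical names, the CJK ranges are chosen for this example, and only fenced code blocks opened with three backticks are recognised as code.

```rust
/// Rough CJK test. The ranges are an assumption of this sketch.
fn is_cjk(c: char) -> bool {
    matches!(c,
        '\u{3000}'..='\u{303F}'   // CJK symbols and punctuation
        | '\u{3040}'..='\u{30FF}' // Hiragana and Katakana
        | '\u{3400}'..='\u{4DBF}' // CJK Unified Ideographs Extension A
        | '\u{4E00}'..='\u{9FFF}' // CJK Unified Ideographs
        | '\u{FF00}'..='\u{FFEF}' // Halfwidth and fullwidth forms
    )
}

/// Joins a line onto the previous one when the previous line ends with a
/// CJK character (no trailing whitespace) and this line starts with
/// optional spaces/tabs followed by a CJK character. Lines inside fenced
/// code blocks are left untouched. A trailing newline in `input` is not
/// preserved, because `str::lines` drops it.
fn merge_cjk_lines(input: &str) -> String {
    let mut out = String::new();
    let mut in_fence = false;      // currently inside a ``` block
    let mut prev_joinable = false; // previous line ended with a CJK char
    for (i, line) in input.lines().enumerate() {
        let is_fence = line.trim_start().starts_with("```");
        let stripped = line.trim_start_matches(|c: char| c == ' ' || c == '\t');
        let starts_cjk = stripped.chars().next().map_or(false, is_cjk);

        if prev_joinable && !in_fence && !is_fence && starts_cjk {
            // Merge: no newline and no leading indentation.
            out.push_str(stripped);
        } else {
            if i > 0 {
                out.push('\n');
            }
            out.push_str(line);
        }

        if is_fence {
            in_fence = !in_fence;
        }
        prev_joinable =
            !in_fence && !is_fence && line.chars().last().map_or(false, is_cjk);
    }
    out
}
```

For example, `merge_cjk_lines("中文结尾\n中文顶格")` yields `"中文结尾中文顶格"`, while the same two lines wrapped in a fenced code block come back unchanged.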