Crates.io | codump |
lib.rs | codump |
version | 0.1.1 |
source | src |
created_at | 2023-07-11 05:33:06.209122 |
updated_at | 2023-07-12 05:44:41.239883 |
description | A straightforward and flexible code/comment dump tool |
homepage | |
repository | https://github.com/Pistonite/codump |
max_upload_size | |
id | 913508 |
size | 111,453 |
A straightforward tool for dumping code/comments from source files.
The tool uses regex for parsing comments and does not parse the language at all. Therefore there's no requirement on directives or pragmas and it works on all source code, given your codebase has a system for doc comments.
There are many existing tools that take comments in the codebase and turn them into documentation. However, the quality of the generated documentation depends fully on the quality of the comments. Most of the time you will end up with a generated documentation where 90% of the pages are useless because the symbol is either not documented, or the comment is empty/doesn't say anything, like:
/// <summary>
///
/// <summary>
/// <param name="input">the input</param>
public void DoSomething(string input) {
...
}
I found it to be more useful to reverse this process, where I bring the code to the documentation. For example, my documentation can be a wiki page where I document not only the APIs and symbols, but also the design principles, examples, etc... And the documentation tool can generate code snippets to go along with the wiki page.
So that's the back story. This tool takes a file, a preset or custom comment syntax, and a search path to find the symbol, then it prints out the code in a nice format ready to be embedded in a documentation page.
Obviously, just printing the code into the console doesn't make it documentation. I use txtpp for generating files with embedded commands.
Contribution and issues are welcome. This is my personal project and I have limited bandwidth working on it, but I will take any suggestion/comment seriously.
As executable:
cargo install codump
As library:
cargo add codump
Add the cli
feature if you need to parse config from CLI args
cargo add codump --features cli
A straightforward and flexible code/comment dump tool
Usage: codump [OPTIONS] <FILE> <SEARCH_PATH>...
Arguments:
<FILE>
The input file to parse
<SEARCH_PATH>...
The component search path
Each path is a case-sensitive substring used to search for components at that level. The first line of the code after the doc comments is searched for the substring.
Options:
--outer <OUTER>
Outer single line comment regex
--outer-start <OUTER_START>
Outer multi line comment start regex
--outer-end <OUTER_END>
Outer multi line comment end regex
--inner <INNER>
Inner single line comment regex
--inner-start <INNER_START>
Inner multi line comment start regex
--inner-end <INNER_END>
Inner multi line comment end regex
-i, --ignore <IGNORE>
Pattern for lines that should be ignored
-f, --format <FORMAT>
Format for the output
[default: summary]
Possible values:
- summary: Comments + abbreviated code
- comment: Comment only format
- detail: Comment + all code
-p, --preset <PRESET>
Use a preset configuration
If both presets and individual options are set, the individual options will override the corresponding part of the preset. The rest of the preset will still be used.
Possible values:
- rust: Rust style
- rust-java: Rust style for single line and Java/JS/TS style for multiline
- python: Python style
-c, --context
Print context
Context is the parent components of the found component
-C, --context-comments
Print context comments
Print the comments of the parents along with the context (implies --context)
-h, --help
Print help (see a summary with '-h')
-V, --version
Print version
Outer comments is the comments placed before the thing it is for. Below is an example of an outer comment in rust style
/// Do something and returns the result
fn do_something() -> u32 {
todo!();
}
Same outer comment in JS/TS style
/**
Do something and returns the result
*/
export function doSomething(): number {
...
}
In this commenting style, all of the code the comment is for is after the comment.
Inner comments are comments placed inside the thing it is for. For example, module comments in rust.
//! This is a module to do things
pub fn do_thing_1() {
...
}
pub fn do_thing_2() {
...
}
Another example are doc comments in python
def do_something():
"""Do something and returns the result"""
...
The distinction is that the "signature" of the code the comment is for is outside of the comment, and the details are after the comment
A component conceptually contain its comments and code. Its code may contain child components.
//! imagine this file is example.rs
//! this is the inner comment for that file
// code inside this component, but not a child component
use std;
/// a function is a child component
fn foo() -> u32 {
...
}
/// a nested module is a child component too
mod bar {
//! inner comment (not sure if this is valid rust doc, but valid for this tool)
/// child components can have children too
fn biz() {
...
}
}
This tool searches for component given a path. Once a component is found, the tool can print out its summary, comment, or implementation details.
The tool uses a simple parsing style based on lines and regex.
The tool supports both single and multi-line comments for inner and outer comments. However, single and multi-line style cannot be mixes for the same component they are documenting
For example, you can do this:
//! Example typescript file
/// Example 1
///
/// This function is documented with rust style single line comments
function foo() { ... }
/**
* Example 2
*
* C/Java/JS style
*/
function bar() { ... }
And this is invalid:
/// This is a single line doc comment
/**
which cannot be mixed with multi line
*/
function foo() { ... }
This is to keep the parser simple. In reality you wouldn't want to mix them for the same component you are documenting. Note that you can still mix the comments for the child components inside a component, which is what example 1 does.
The parsing step turns list of lines into a component for searching. The component lines will be parsed assuming they are organized as following
Nested children are parsed from the body lines (lines between outer comments) like so:
Note that this is not friendly to C++ namespace style where the things inside the namespace are commonly not indented
/// I am a namespace
namespace foo {
/// Why is this not indented
void do_something();
} // end namespace foo
Two (undesired) things will happen:
do_something
will not be treated as a child component of foo
, but in the same level as the namespace.} // end namespace foo
will be treated as part of do_something
For 2, we can workaround this by adding ^} // end namespace
to the pattern of lines to ignore.
For 1, we can still find the symbol by treating it as a component in the file. However, if there are multiple functions with the exact same signature (except for the namespace) in the same file, the tool will not be able to uniquely find it.
The tool searches for a component by specifying a file and one of more search arguments.
Each search argument is used for searching the next nested component. If a nested component cannot be uniquely identified with the search term, the tool will error.
Since the tool uses comments to find the components, a component won't be found if it's not documented.
The tool supports 3 output formats for the component: summary
, comment
and detail
.
Addtionally, you can use the -k/-K
flags to print the parent(s) and the parent comments.
In summary mode, the outer and inner comments will be printed as-is.
Only lines in the body that do not have leading spaces will be printed. Indented blocks will be replaced by ...
with the same indentation
as the first line of that block. This also applies to lines between the outer and inner comments.
In comment mode, only the outer and inner comments are printed.
In detail mode, all content of the component will be printed as-is.