| Crates.io | datamars |
| lib.rs | datamars |
| version | 0.4.0 |
| created_at | 2022-06-06 21:02:10.259662+00 |
| updated_at | 2022-08-20 22:57:28.448224+00 |
| description | A Rust rewrite of https://www.gnu.org/software/datamash/ |
| homepage | |
| repository | https://github.com/InCogNiTo124/datamars.git |
| max_upload_size | |
| id | 601004 |
| size | 37,141 |
My own copy of GNU datamash.
This software is pre-alpha and it's missing quite a lot of features.
If you want the latest version, use `cargo install datamars`.
Note: the executable is called `ms`.
The algorithm basically goes like this:

    args = parse_args()
    # parse ops
    ops = build_operation_objects(args.ops)
    groups = dict(str->list[ops])
    for line in file:
        parts = line.split(args.delimiter)
        group = groupby(parts) if grouping is defined else default
        if group not in groups:
            # build new objects for every group
            groups.add_group(group, ops)
        for op in groups[group]:
            op.process(parts)
    for group in groups:
        print(group, [op.result() for op in groups[group]])

Every group has its own set of operations, and each operation keeps the running state for its group. For every new entry in the group, an operation updates only the state it actually needs, e.g. squared deviations are not calculated unless an operation requires them. Should be pretty fast.
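
The per-group running state could be sketched in Rust roughly as follows. This is only an illustration of the idea, not the actual datamars source: the names `Op`, `Sum`, `Mean`, `build_ops` and `run` are hypothetical, and the sketch assumes a single numeric value column and a single grouping column.

```rust
use std::collections::BTreeMap;

/// Hypothetical trait: a streaming operation that keeps only the state it needs.
trait Op {
    fn process(&mut self, value: f64);
    fn result(&self) -> f64;
}

/// A sum only needs a running total.
#[derive(Default)]
struct Sum {
    total: f64,
}

impl Op for Sum {
    fn process(&mut self, value: f64) {
        self.total += value;
    }
    fn result(&self) -> f64 {
        self.total
    }
}

/// A mean needs a total and a count, but no squared deviations.
#[derive(Default)]
struct Mean {
    total: f64,
    count: u64,
}

impl Op for Mean {
    fn process(&mut self, value: f64) {
        self.total += value;
        self.count += 1;
    }
    fn result(&self) -> f64 {
        if self.count == 0 {
            f64::NAN
        } else {
            self.total / self.count as f64
        }
    }
}

/// Builds a fresh set of operation objects for a newly seen group.
fn build_ops() -> Vec<Box<dyn Op>> {
    vec![Box::new(Sum::default()), Box::new(Mean::default())]
}

/// Groups lines by `key_col` and feeds `value_col` to every operation of that group.
fn run(input: &str, delimiter: char, key_col: usize, value_col: usize) -> BTreeMap<String, Vec<f64>> {
    let mut groups: BTreeMap<String, Vec<Box<dyn Op>>> = BTreeMap::new();
    for line in input.lines() {
        let parts: Vec<&str> = line.split(delimiter).collect();
        // Skip lines that do not have both columns.
        let (key, raw) = match (parts.get(key_col), parts.get(value_col)) {
            (Some(k), Some(v)) => (*k, *v),
            _ => continue,
        };
        // Skip lines whose value is not numeric.
        let value: f64 = match raw.trim().parse() {
            Ok(v) => v,
            Err(_) => continue,
        };
        // Create the operation objects the first time a group is seen.
        let ops = groups.entry(key.to_string()).or_insert_with(build_ops);
        for op in ops.iter_mut() {
            op.process(value);
        }
    }
    // Collect the final result of every operation, per group.
    groups
        .into_iter()
        .map(|(key, ops)| {
            let results: Vec<f64> = ops.iter().map(|op| op.result()).collect();
            (key, results)
        })
        .collect()
}
```

With this sketch, `run("a 1\nb 10\na 3\n", ' ', 0, 1)` would give the sum and mean per group: `[4.0, 2.0]` for `a` and `[10.0, 10.0]` for `b`. Each input line is visited once and nothing but the per-operation state is stored, so memory grows with the number of groups and not with the number of lines.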