datamars

Crates.io: datamars
lib.rs: datamars
version: 0.4.0
source: src
created_at: 2022-06-06 21:02:10.259662
updated_at: 2022-08-20 22:57:28.448224
description: A Rust rewrite of https://www.gnu.org/software/datamash/
repository: https://github.com/InCogNiTo124/datamars.git
id: 601004
size: 37,141
Marijan Smetko (InCogNiTo124)

datamars

My own copy of GNU datamash.

This software is pre-alpha and is still missing many features.

Installation

If you want the latest version, use cargo install datamars.

Note: the executable is called ms.

Implementation details

The algorithm works roughly like this:

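args = parse_args()
# parse the requested operations into operation objects
ops = build_operation_objects(args.ops)

# maps each group key (str) to its own list of operation objects
groups = {}

for line in file:
    parts = line.split(args.delimiter)
    # use the grouping column(s) as the key, or one default group if no grouping is defined
    group = groupby(parts) if grouping is defined else default

    if group not in groups:
        # build new objects for every group
        groups.add_group(group, ops)

    for op in groups[group]:
        op.process(parts)

for group in groups:
    # report the results of this group's operations, not of the key itself
    print(group, [op.result() for op in groups[group]])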
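
As a rough illustration, the outline above can be turned into a small runnable Python sketch. This is not the actual Rust code of datamars: the Sum and Count classes, the run function, and its parameters (delimiter, group_col, and the (operation, column) pairs) are hypothetical names chosen only for this example.

```python
import io


class Sum:
    """Running sum of one column (hypothetical operation for this sketch)."""

    def __init__(self, column):
        self.column = column
        self.total = 0.0

    def process(self, parts):
        self.total += float(parts[self.column])

    def result(self):
        return self.total


class Count:
    """Running count of entries (hypothetical operation for this sketch)."""

    def __init__(self, column):
        self.column = column
        self.n = 0

    def process(self, parts):
        self.n += 1

    def result(self):
        return self.n


def run(lines, op_specs, delimiter="\t", group_col=None):
    """Group lines by one column and apply every operation per group.

    op_specs is a list of (operation class, column index) pairs.
    """
    groups = {}  # group key -> list of operation objects
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip empty lines
        parts = line.split(delimiter)
        # Without a grouping column, everything falls into one default group.
        group = parts[group_col] if group_col is not None else ""
        if group not in groups:
            # Build new operation objects for every group.
            groups[group] = [cls(col) for cls, col in op_specs]
        for op in groups[group]:
            op.process(parts)
    return {group: [op.result() for op in ops] for group, ops in groups.items()}


# Three tab-separated rows, grouped by the first column.
data = io.StringIO("a\t1\nb\t10\na\t2\n")
results = run(data, [(Sum, 1), (Count, 1)], group_col=0)
for group, values in results.items():
    print(group, *values, sep="\t")
```

Each line is read once and only the operation objects of its own group are touched, so memory use grows with the number of groups, not with the number of lines.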

Each group has its own set of operation objects, and each operation keeps the running state for its group. Every new entry in the group updates that state, but only the state that is actually needed (e.g. squared deviations are not computed when no requested operation needs them), so processing should be pretty fast.
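
To make the point about minimal state concrete, here is a small Python sketch. It is not taken from the datamars source, and the class names Mean and Variance are hypothetical. A mean only needs a count and a running sum, while a variance additionally tracks the sum of squared deviations. The variance here uses Welford's online update, a technique picked for this example and not necessarily what datamars does.

```python
class Mean:
    """Keeps only a count and a sum; no squared deviations are tracked."""

    def __init__(self):
        self.n = 0
        self.total = 0.0

    def process(self, value):
        self.n += 1
        self.total += value

    def result(self):
        return self.total / self.n if self.n else float("nan")


class Variance:
    """Sample variance via Welford's online update (assumed technique)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def process(self, value):
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        # Uses the deviation from both the old and the updated mean.
        self.m2 += delta * (value - self.mean)

    def result(self):
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


mean, var = Mean(), Variance()
for x in [2.0, 4.0, 4.0, 6.0]:
    mean.process(x)
    var.process(x)
print(mean.result(), var.result())
```

If only the mean is requested, only the Mean object exists for a group and the extra bookkeeping in Variance is never performed.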

Commit count: 90
