dfsql

Crates.io: dfsql
lib.rs: dfsql
Description: SQL REPL for Data Frames
Repository: https://github.com/Banyc/dfsql
Created: 2023-12-05 00:43:01.377803
Updated: 2024-11-02 19:53:06.768322
IchHabeKeineNamen (Banyc)

  • Revision: the standalone count command has been replaced with len. Update existing queries by replacing (count) with len and col "count" with col "len".
    • The unary count <col> command is unaffected.

Install

cargo install dfsql

How to run

dfsql --input your.csv --output a-new.csv
# ...or
dfsql -i your.csv -o a-new.csv

REPL

  • exit/quit: exit the REPL.
    exit
    
  • undo: undo the previous successful operation.
    undo
    
  • reset: discard all changes and return to the original data frame.
    reset
    
  • schema: show column names and types of the data frame.
    schema
    
  • save: save the current data frame to a file.
    save a-new.csv
    

Statements

  • select
    select <expr>*
    
    select last_name first_name
    
    • Select columns "last_name" and "first_name" and collect them into a data frame.
  • group by
    group (<col> | <var>)* agg <expr>*
    
    group first_name agg (len)
    
    • Group the data frame by column "first_name" and then aggregate each group into the number of its members.
  • filter
    filter <expr>
    
    filter first_name = "John"
    
  • limit
    limit <int>
    
    limit 5
    
  • reverse
    reverse
    
  • sort
    sort ((asc | desc | ()) <col>)*
    
    sort icpsr_id
    
  • use
    use <var>
    
    use other
    
    • Switch to the data frame called other.
  • join
    (left | right | inner | full) join <var> on <col> <col>?
    
    left join other on id ID
    
    • Left join the current data frame with the data frame called other, matching the current frame's column id against the other frame's column ID.
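
To make the row-level effect of a few of these statements concrete, here is a minimal sketch in plain Python. It is an illustration only, not how dfsql is implemented: the sample rows and the helper names filter_eq, group_len, and left_join are hypothetical, the output column name "len" follows the revision note at the top of this README, and the way dfsql itself names or keeps the join key columns is not modelled.

```python
from collections import Counter

# Hypothetical sample rows standing in for the current data frame.
people = [
    {"id": 1, "first_name": "John", "last_name": "Smith"},
    {"id": 2, "first_name": "John", "last_name": "Doe"},
    {"id": 3, "first_name": "Mary", "last_name": "Major"},
]
# Hypothetical second data frame, playing the role of `other`.
other = [
    {"ID": 1, "city": "Oslo"},
    {"ID": 3, "city": "Lima"},
]


def filter_eq(rows, col, value):
    """Rough analogue of `filter <col> = <value>`: keep the matching rows."""
    return [row for row in rows if row[col] == value]


def group_len(rows, col):
    """Rough analogue of grouping by <col> and aggregating with len."""
    counts = Counter(row[col] for row in rows)
    return [{col: key, "len": n} for key, n in counts.items()]


def left_join(left, right, left_col, right_col):
    """Rough analogue of `left join other on <left_col> <right_col>`.

    Every left row is kept. Columns of a matching right row are appended;
    left rows without a match get None in the right-hand columns.
    """
    right_cols = [c for c in right[0] if c != right_col] if right else []
    index = {}
    for row in right:
        index.setdefault(row[right_col], []).append(row)
    joined = []
    for row in left:
        matches = index.get(row[left_col])
        if not matches:
            joined.append({**row, **{c: None for c in right_cols}})
            continue
        for match in matches:
            joined.append({**row, **{c: match[c] for c in right_cols}})
    return joined
```

Calling left_join(people, other, "id", "ID") on this sample keeps all three people and leaves the unmatched row with a null city, which is the behaviour that distinguishes a left join from an inner join.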

Expressions

  • col: reference to a column.
    col : (<str> | <var>) -> <expr>
    
    select col first_name
    
  • exclude: remove columns from the data frame.
    exclude : <expr>* -> <expr>
    
    select exclude last_name first_name
    
  • literal: literal values like 42, "John", 1.0, and null.
  • binary operations
    select a * b
    
    • Calculate the product of columns "a" and "b" and collect the result.
  • unary operations
    select -a
    
    select sum a
    
    • Sum all values in column "a" and collect the scalar result.
  • alias: assign a name to a column.
    alias : (<col> | <var>) <expr> -> <expr>
    
    select alias product a * b
    
    • Assign the name "product" to the product and collect the new column.
  • conditional
    <conditional> : if <expr> then <expr> (if <expr> then <expr>)* otherwise <expr> -> <expr>
    
    select if class = 0 then "A" if class = 1 then "B" else null
    
  • cast: cast a column to one of the types str, int, or float.
    cast : <type> <expr> -> <expr>
    
    select cast str id
    
    • Cast the column "id" to type str and collect the result.
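
The expressions above can be read as per-row or per-column computations. The following plain-Python sketch models four of them on hypothetical sample rows. It is an assumption-laden illustration rather than dfsql's implementation: the helper names alias_product, grade, cast_str, and sum_col are invented for this example, and Python's None stands in for null.

```python
# Hypothetical sample rows; the column names follow the examples above.
rows = [
    {"id": 7, "a": 2, "b": 3.0, "class": 0},
    {"id": 8, "a": 4, "b": 0.5, "class": 1},
    {"id": 9, "a": 1, "b": 1.0, "class": 2},
]


def alias_product(rows):
    """Rough analogue of `select alias product a * b`."""
    return [{"product": row["a"] * row["b"]} for row in rows]


def grade(value):
    """Rough analogue of the conditional example: the first matching
    branch wins, and the fall-through branch yields null (None here)."""
    if value == 0:
        return "A"
    if value == 1:
        return "B"
    return None


def cast_str(rows, col):
    """Rough analogue of `select cast str <col>`."""
    return [{col: str(row[col])} for row in rows]


def sum_col(rows, col):
    """Rough analogue of `select sum <col>`: a single scalar result."""
    return sum(row[col] for row in rows)
```

The sketch separates element-wise operations (product, conditional, cast), which produce one value per row, from the sum, which reduces a whole column to one value.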