owoof
=====

[github](https://github.com/sqwishy/owoof)
[crates.io](https://crates.io/crates/owoof)
[docs.rs](https://docs.rs/owoof)

A glorified query-builder inspired by [Datomic](https://docs.datomic.com/cloud/index.html)
that uses a datalog-like format for querying and modifying information around a SQLite
database.

This is a pet project and probably shouldn't be used for anything serious.

This is implemented as a Rust library. It is documented; you can read the source or
find the [documentation published on docs.rs](https://docs.rs/owoof/*/owoof/).

There are two Rust executable targets. One provides a command-line interface (as shown
below) and the other can be used for importing data from a CSV file.

## CLI

Compile this with `cargo build` using `--features cli --bin cli`.

The CLI can be used to initialize new database files, assert/create, retract/remove, or
query information.

Here are some examples:

```shell
$ echo '[{":db/attribute": ":pet/name"},
         {":pet/name": "Garfield"},
         {":pet/name": "Odie"},
         {":pet/name": "Spot"},
         {":db/attribute": ":person/name"},
         {":db/attribute": ":person/starship"},
         {":person/name": "Jon Arbuckle"},
         {":person/name": "Lieutenant Commander Data", ":person/starship": "USS Enterprise (NCC-1701-D)"}]' \
      | owoof assert
[
  "#45e9d8e9-51ea-47e6-8172-fc8179f8fbb7",
  "#4aa95e29-8d45-470b-98a7-ee39aae1b9c9",
  "#2450b9e6-71a4-4311-b93e-3920eebb2c06",
  "#c544251c-a279-4809-b9b6-7d3cd68d2f2c",
  "#19a4cba1-6fc7-4904-ad36-e8502445412f",
  "#f1bf032d-b036-4633-b6f1-78664e44603c",
  "#e7ecd66e-222f-44bc-9932-c778aa26d6ea",
  "#af32cfdb-b0f1-4bbc-830f-1eb83e4380a3"
]

$ echo '[{":db/attribute": ":pet/owner"},
         {":db/id": "#4aa95e29-8d45-470b-98a7-ee39aae1b9c9", ":pet/owner": "#e7ecd66e-222f-44bc-9932-c778aa26d6ea"},
         {":db/id": "#2450b9e6-71a4-4311-b93e-3920eebb2c06", ":pet/owner": "#e7ecd66e-222f-44bc-9932-c778aa26d6ea"},
         {":db/id": "#c544251c-a279-4809-b9b6-7d3cd68d2f2c", ":pet/owner": "#af32cfdb-b0f1-4bbc-830f-1eb83e4380a3"}]' \
      | owoof assert
[
  "#ffc46ae2-1bde-4c08-bfea-09db8241aa2b",
  "#4aa95e29-8d45-470b-98a7-ee39aae1b9c9",
  "#2450b9e6-71a4-4311-b93e-3920eebb2c06",
  "#c544251c-a279-4809-b9b6-7d3cd68d2f2c"
]

$ owoof '?pet :pet/owner ?owner' \
        --show '?pet :pet/name' \
        --show '?owner :person/name'
[
  [
    { ":pet/name": "Garfield" },
    { ":person/name": "Jon Arbuckle" }
  ],
  [
    { ":pet/name": "Odie" },
    { ":person/name": "Jon Arbuckle" }
  ],
  [
    { ":pet/name": "Spot" },
    { ":person/name": "Lieutenant Commander Data" }
  ]
]

$ owoof '?person :person/starship "USS Enterprise (NCC-1701-D)"' \
        '?pet :pet/owner ?person' \
        '?pet :pet/name ?n'
[
  "Spot"
]

# Or, suppose you know someone's name and their pet's name but don't know the attribute
# that relates them... (But also this doesn't use indexes well so don't do it.)
$ owoof '?person :person/name "Lieutenant Commander Data"' \
        '?pet ?owner ?person' \
        '?pet :pet/name "Spot"' \
        --show '?owner :db/attribute'
[
  { ":db/attribute": ":pet/owner" }
]
```

The next example queries data imported from the
[goodbooks-10k](https://github.com/zygmuntz/goodbooks-10k) dataset.

```shell
$ owoof '?r :rating/score 1' \
        '?r :rating/book ?b' \
        '?b :book/authors "Dan Brown"' \
        --show '?r :rating/user' \
        --show '?b :book/title' \
        --limit 5
[
  [
    { ":rating/user": 9 },
    { ":book/title": "Angels & Demons (Robert Langdon, #1)" }
  ],
  [
    { ":rating/user": 58 },
    { ":book/title": "The Da Vinci Code (Robert Langdon, #2)" }
  ],
  [
    { ":rating/user": 65 },
    { ":book/title": "The Da Vinci Code (Robert Langdon, #2)" }
  ],
  [
    { ":rating/user": 80 },
    { ":book/title": "The Da Vinci Code (Robert Langdon, #2)" }
  ],
  [
    { ":rating/user": 89 },
    { ":book/title": "The Da Vinci Code (Robert Langdon, #2)" }
  ]
]
```

## Importing goodbooks-10k

1. Initialize an empty database.

   ```shell
   $ owoof init
   ```

2. Import books and `--output` a copy of the data with the `:db/id` column for each
   imported row.

   ```shell
   $ owoof-csv --output -- \
         :book/title \
         :book/authors \
         :book/isbn \
         :book/avg-rating\ average_rating \
         < goodbooks-10k/books.csv \
         > /tmp/imported-books
   ```

3. Import ratings, using `mlr` to join the ratings with the imported books.

   ```shell
   $ mlr --csv join \
         -f /tmp/imported-books \
         -j book_id \
         < goodbooks-10k/ratings.csv \
         | owoof-csv -- \
         ':rating/book :db/id' \
         ':rating/score rating' \
         ':rating/user user_id'
   ```

4. That takes some time (probably minutes), but then you can do something like this:

   ```shell
   $ owoof '?calvin :book/title "The Complete Calvin and Hobbes"' \
           '?rating :rating/book ?calvin' \
           '?rating :rating/score 1' \
           '?rating :rating/user ?u' \
           '?more-great-takes :rating/user ?u' \
           '?more-great-takes :rating/book ?b' \
           '?more-great-takes :rating/score 5' \
           --show '?b :book/title :book/avg-rating' \
           --asc '?b :book/avg-rating'
   ```

   And it should spit out some answers.

## TODO/Caveats

- Testing is not extensive at this point.

  The schema _should_ be enforced, so no deleting attributes that are in use, but I
  haven't done the work to verify this so there might be some surprises.

- Performance is not super reliable.

  Version 0.2 adds partial indexes over specific attributes, which has helped a lot
  with search performance. However, there is no index on values, and some queries are
  impacted by this more than others.

  The difficulty currently with a values index is that SQLite's query planner will
  prefer it in cases where it shouldn't. It isn't a good index and should be a
  last resort; it's also huge.

- This is not feature-rich yet. Constraints only test equality; there is no support
  yet for constraints over ranges or involving logical operations, and honestly I
  haven't tested how well that would perform with the schema changes made in 0.2.

## Internal TODOs

- Create `DontWoof` off the `Connection`.

- The `Select` borrowing `Network` is a bit weird. I tried to split it off but it was
  still weird. Not sure what to do about that. One consideration is that pushing a
  `Select` on to a `Query` only borrows from the network. Maybe this could be relaxed?

- Test reference counting?
  Add a clean-up that removes soups with zero rc and runs pragma optimize.

- Maybe add some sort of update thing to shorthand retract & assert?

- The `:db/id` attribute is kind of silly since the entity and value are the same for
  triplets of that attribute.

  It's useful for object forms / mappings, like `{":db/id": ...}`. But maybe there is
  a more clever way to group by something? (Like some sort of primary key associated
  with every form that the database stores ... 🤔)

## See Also

My blog post associated with version 0.1 of this software: https://froghat.ca/blag/dont-woof

#### License

This is licensed under [Apache License, Version 2.0](LICENSE).