# Running TPC-H Queries Currently, RisingLight supports importing TPC-H data and run a subset of TPC-H queries. You may import the TPC-H data and run some simple queries in RisingLight. ## Generate TPC-H Data ### Use Make Recipe You may use the make recipe to download and generate TPC-H data (about 1GB in tbl format). ``` make tpch ``` If you want to run a larger query: ``` make tpch-10gb ``` The generated data will be placed under `tpch-dbgen/tbl` folder. ### Manual Generation First, you should use git to clone the tpch-dbgen repo: ``` git clone https://github.com/electrum/tpch-dbgen.git ``` This repo contains the program for generating TPC-H data. Then, enter the tpch-dbgen directory and type `make all`, and it will generate some executable binaries such as `dbgen` and `qgen`. We will show you how to generate TPC-H data by using one line of command in the following sections. Meanwhile, you can read this [README](https://github.com/electrum/tpch-dbgen/blob/master/README) for more details. Finally, type the following command and wait for several seconds: ``` ./dbgen ``` This command will generate the data we want, which contains a table called `LINEITEM` with a size of 700MB. ## Create Table and Import Data You will need to build RisingLight in release mode, so as to import data faster. ```shell cargo build --release ``` Then, use our test scripts to create tables. ```shell cargo run --release -- -f tests/sql/tpch/create.sql ``` We can use `\dt` to ensure that all tables have been imported. ``` cargo run --release # Inside SQL shell \dt +---+----------+---+----------+---+----------+ | 0 | postgres | 0 | postgres | 3 | supplier | | 0 | postgres | 0 | postgres | 1 | region | | 0 | postgres | 0 | postgres | 0 | nation | | 0 | postgres | 0 | postgres | 4 | partsupp | | 0 | postgres | 0 | postgres | 7 | lineitem | | 0 | postgres | 0 | postgres | 6 | orders | | 0 | postgres | 0 | postgres | 2 | part | | 0 | postgres | 0 | postgres | 5 | customer | +---+----------+---+----------+---+----------+ ``` Then, we may use the `import.sql` to import data, which calls `COPY FROM` SQL statements internally: ```shell cargo run --release -- -f tests/sql/tpch/import.sql ``` Generally, you can finish this process within several seconds. ## Run TPC-H Now, we can run simple queries on this table. ```shell cargo run --release ``` ```sql select sum(L_LINENUMBER) from LINEITEM; select count(L_ORDERKEY), sum(L_LINENUMBER) from LINEITEM where L_ORDERKEY > 2135527; ``` Or run real TPC-H queries: ```shell cargo run --release -- -f tests/sql/tpch/q1.sql cargo run --release -- -f tests/sql/tpch/q3.sql cargo run --release -- -f tests/sql/tpch/q5.sql cargo run --release -- -f tests/sql/tpch/q6.sql cargo run --release -- -f tests/sql/tpch/q10.sql ``` ## Clean Data All data of RisingLight is stored in `risinglight.secondary.db` folder. Simply remove it if you want to clean all data. ## Developers: Add new TPC-H tests In `tests`, we have two kinds of TPC-H tests: * Run TPC-H query on small dataset (used for unit tests). `tpch` folder. * Run TPC-H query on ~1GB dataset generated by tpch-gen. `tpch-full` folder. Everytime we add a new TPC-H query, we should add both kinds of tests, in sqllogictest format. For example, developers should: * create `tests/sql/tpch/_qXX.slt` * create `tests/sql/tpch-full/_qXX.slt` * create `tests/sql/tpch/qXX.sql` * add `include _qXX.slt` in `tests/sql/tpch/tpch.slt` * add `include _qXX.slt` in `tests/sql/tpch-full/tpch.slt` By using `output-format` parameter, we can easily get the output format required by sqllogictest: ``` cargo run --release -- -f tests/sql/tpch/q5.sql --output-format text psql -d tpch -f tests/sql/tpch/q5.sql -A -t -F " " ``` ... which yields ``` ALGERIA 55756674.2813 MOZAMBIQUE 54883960.1257 MOROCCO 50463646.0237 ETHIOPIA 49934541.2268 KENYA 48858086.8222 ``` Developers should check the output against Postgres in order to ensure the output is correct. The full 1GB TPC-H test suite can be verified by running: ``` cargo run --release -- -f tests/sql/tpch-full/_tpch_full.slt --output-format text ``` And if you want to start fresh from empty database, you may run: ``` ./tests/tpch-full.sh ```