Crates.io | datafusion-bigtable |
lib.rs | datafusion-bigtable |
version | 0.1.0 |
source | src |
created_at | 2022-03-12 04:32:00.70872 |
updated_at | 2022-03-12 04:32:00.70872 |
description | Bigtable data source for Apache Arrow Datafusion |
homepage | https://github.com/datafusion-contrib/datafusion-bigtable |
repository | https://github.com/datafusion-contrib/datafusion-bigtable |
id | 548632 |
size | 39,275 |
Bigtable data source for Apache Arrow Datafusion
This crate implements a Bigtable data source and executor for Datafusion. It is built on top of the gRPC client tonic.
let bigtable_datasource = BigtableDataSource::new(
    "emulator".to_owned(), // project
    "dev".to_owned(), // instance
    "weather_balloons".to_owned(), // table
    "measurements".to_owned(), // column family
    vec!["_row_key".to_owned()], // table_partition_cols
    "#".to_owned(), // table_partition_separator
    vec![Field::new("pressure", DataType::Utf8, false)], // qualifiers
    true, // only_read_latest
).await.unwrap();
let mut ctx = ExecutionContext::new();
ctx.register_table("weather_balloons", Arc::new(bigtable_datasource)).unwrap();
ctx.sql("SELECT \"_row_key\", pressure, \"_timestamp\" FROM weather_balloons where \"_row_key\" = 'us-west2#3698#2021-03-05-1200'").await?.collect().await?;
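The row key in the query above, 'us-west2#3698#2021-03-05-1200', is made of segments joined by the "#" separator. As a rough illustration of how table_partition_cols and table_partition_separator relate to such a key, the sketch below splits a key into named segments. It uses only the standard library and is not the crate's own code; the function name `split_row_key` and the column names `region`, `balloon_id`, and `event_minute` are hypothetical.

```rust
/// Splits a composite row key into (column, value) pairs using the separator.
/// Returns None when the number of key segments does not match the number
/// of partition columns. Illustrative helper, not part of datafusion-bigtable.
fn split_row_key<'a>(
    row_key: &'a str,
    separator: &str,
    partition_cols: &[&'a str],
) -> Option<Vec<(&'a str, &'a str)>> {
    // Break the key on every occurrence of the separator.
    let parts: Vec<&str> = row_key.split(separator).collect();
    if parts.len() != partition_cols.len() {
        return None;
    }
    // Pair each column name with the segment at the same position.
    Some(partition_cols.iter().copied().zip(parts).collect())
}
```

Called with the key above, "#" as the separator, and the three hypothetical column names, the function pairs `region` with `us-west2`, `balloon_id` with `3698`, and `event_minute` with `2021-03-05-1200`.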
"_row_key" =
"_row_key" IN
"_row_key" BETWEEN
=
IN
BETWEEN
(only supported by the last of table_partition_cols)

Note: datafusion-bigtable provides the physical executor for Datafusion. Aggregations, GROUP BY, and joins are implemented and handled by Datafusion itself.
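To make the row-key predicate forms listed above (=, IN, BETWEEN) more concrete, here is a minimal sketch of how each could be translated into a Bigtable read: an equality or IN predicate becomes a set of exact row keys, and BETWEEN becomes a key range closed at both ends, since SQL BETWEEN is inclusive. The types `KeyPredicate` and `ReadPlan` and the function `plan_read` are hypothetical names invented for this illustration; the crate's actual internals may differ.

```rust
/// A filter on the row key, as it might appear in a WHERE clause.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
enum KeyPredicate {
    Eq(String),
    In(Vec<String>),
    Between(String, String),
}

/// What to request from Bigtable: exact keys or one inclusive key range.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
enum ReadPlan {
    RowKeys(Vec<String>),
    RowRange { start_closed: String, end_closed: String },
}

/// Maps a row-key predicate to a read plan.
fn plan_read(predicate: KeyPredicate) -> ReadPlan {
    match predicate {
        // A single equality reads exactly one row key.
        KeyPredicate::Eq(key) => ReadPlan::RowKeys(vec![key]),
        // IN reads each listed row key.
        KeyPredicate::In(keys) => ReadPlan::RowKeys(keys),
        // BETWEEN is inclusive on both ends, so both bounds are closed.
        KeyPredicate::Between(low, high) => ReadPlan::RowRange {
            start_closed: low,
            end_closed: high,
        },
    }
}
```

In this sketch only the key filter reaches the storage layer. Everything else in the query (aggregation, GROUP BY, joins) is left to Datafusion, as the note above describes.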