Datafusion-Bigtable

Bigtable data source for Apache Arrow DataFusion

Run SQL on Bigtable

This crate implements a Bigtable data source and executor for DataFusion. It is built on top of tonic, a gRPC client library.

Quick Start

```
let bigtable_datasource = BigtableDataSource::new(
    "emulator".to_owned(),                               // project
    "dev".to_owned(),                                    // instance
    "weather_balloons".to_owned(),                       // table
    "measurements".to_owned(),                           // column family
    vec!["_row_key".to_owned()],                         // table_partition_cols
    "#".to_owned(),                                      // table_partition_separator
    vec![Field::new("pressure", DataType::Utf8, false)], // qualifiers
    true,                                                // only_read_latest
)
.await
.unwrap();

// Register the Bigtable table with DataFusion under a SQL-visible name.
let mut ctx = ExecutionContext::new();
ctx.register_table("weather_balloons", Arc::new(bigtable_datasource)).unwrap();

// Look up a single row by its row key.
ctx.sql("SELECT \"_row_key\", pressure, \"_timestamp\" FROM weather_balloons WHERE \"_row_key\" = 'us-west2#3698#2021-03-05-1200'")
    .await?
    .collect()
    .await?;
```
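The row key in the query above is made of several parts joined by the `#` separator passed to `BigtableDataSource::new`. As a rough illustration of that idea (not the crate's actual implementation), the sketch below splits such a key into named parts and joins them back. The helpers `split_row_key` and `join_row_key` and the column names `region`, `balloon_id`, and `event_minute` are hypothetical, chosen only for this example.

```rust
/// Hypothetical helper: split a composite row key into (column, value) pairs.
/// Returns `None` when the number of parts does not match the number of columns.
fn split_row_key<'a>(
    row_key: &'a str,
    separator: &str,
    columns: &[&'a str],
) -> Option<Vec<(&'a str, &'a str)>> {
    // Break the key on every occurrence of the separator.
    let parts: Vec<&str> = row_key.split(separator).collect();
    if parts.len() != columns.len() {
        return None;
    }
    // Pair each assumed column name with the part at the same position.
    Some(columns.iter().copied().zip(parts).collect())
}

/// Hypothetical helper: build a row key from its parts.
fn join_row_key(parts: &[&str], separator: &str) -> String {
    parts.join(separator)
}
```

Calling `split_row_key("us-west2#3698#2021-03-05-1200", "#", &["region", "balloon_id", "event_minute"])` pairs each part with its assumed column name, and `join_row_key` reverses the operation. The separator must not occur inside any part; otherwise the part count no longer matches and the split returns `None`.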

Roadmap

Bigtable

SQL

General

Note: datafusion-bigtable provides the physical executor for DataFusion. Aggregations, GROUP BY, and joins are implemented and handled by DataFusion itself.