PowerSQL

PowerSQL, the data transformation tool for data {engineers, scientists, analysts}.

PowerSQL automatically find the relations between your SQL statements and runs your transformations as a job on your database / data warehouse / data engine.

Features:

Getting started

Install the latest version using cargo (curl https://sh.rustup.rs -sSf | sh).

```bash

For PostgreSQL

cargo install powersql --features postgres

For BigQuery

cargo install powersql --features bigquery ```

PostgreSQL

To get started with PostgreSQL, simply create a new project in a file called powersql.toml:

[project] name = "my_project" models = ["models"] tests = ["tests]

Now create one or more models in the models directory:

sql CREATE VIEW my_model AS SELECT id, category from my_source; CREATE TABLE category_stats AS SELECT COUNT(*) category_count FROM my_model GROUP BY category;

PowerSQL automatically will create a DAG based on the relations in your database.

To run against the database, provide the following environment variables:

BigQuery

To run against the database, provide the following environment variables:

GOOGLE_APPLICATION_CREDENTIALS should refer to an service account key file (this can be set by an appliation rather than locally).

PROJECT_ID is the id (not number) of the project and DATASET_ID is the name of the dataset that is used by default.

LOCATION is an (optional) datacenter location id where the query is being executed.

Commands

Data tests

Data tests are ASSERT statements that you can run on your database tables and views and perform checks on data quality, recency, etc. Assert statements checks the result of a condition - a boolean expression. Assert-based testing are enabled for every backend, they are translated by PowerSQL to queries return a boolean.

Some examples: sql -- Column should be NOT NULL ASSERT NOT EXISTS( SELECT X FROM t WHERE column IS NULL ) ASSERT NOT EXISTS ( SELECT quantity FROM rev_per_product WHERE quantity <= 0 ) AS 'quantity should be positive'; ASSERT NOT EXISTS ( SELECT product_id FROM rev_per_product WHERE product_id IS NULL ) AS 'product_id should be not null'; ASSERT ( SELECT COUNT (*) FROM rev_per_product WHERE quantity < 10 ) >= 0.7 * ( SELECT COUNT(*) FROM rev_per_product ) AS 'At least 70% should have a quantity lower than 10'