PowerSQL, the data transformation tool.
Features:
- Supports `CREATE [MATERIALIZED] VIEW` and `CREATE TABLE AS` statements.

Install the latest version using cargo (if Rust is not installed yet, install it first with `curl https://sh.rustup.rs -sSf | sh`).
```bash
# PostgreSQL backend
cargo install powersql --features postgres
# or, for the BigQuery backend
cargo install powersql --features bigquery
```
To get started with PostgreSQL, create a new project in a file called `powersql.toml`:
```toml
[project]
name = "my_project"
models = ["models"]
tests = ["tests"]
```
Now create one or more models in the `models` directory:
```sql
CREATE VIEW my_model AS SELECT id, category FROM my_source;
CREATE TABLE category_stats AS SELECT COUNT(*) category_count FROM my_model GROUP BY category;
```
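PowerSQL will automatically create a DAG (directed acyclic graph) based on the relations in your database. In the example above, `category_stats` depends on `my_model`, which in turn reads from `my_source`.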
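
To make the dependency resolution more concrete, here is a minimal Python sketch of how an execution order could be derived from the two statements above. It is not PowerSQL's implementation: the `build_dag` and `execution_order` helpers are hypothetical, and the regex-based extraction is a deliberate simplification of real SQL parsing.

```python
import re
from graphlib import TopologicalSorter, CycleError

# Hypothetical, simplified patterns: real SQL needs a proper parser.
CREATE_RE = re.compile(
    r"CREATE\s+(?:MATERIALIZED\s+)?(?:VIEW|TABLE)\s+(\w+)\s+AS", re.IGNORECASE
)
SOURCE_RE = re.compile(r"\b(?:FROM|JOIN)\s+(\w+)", re.IGNORECASE)


def build_dag(statements):
    """Map each created relation to the set of relations it reads from."""
    dag = {}
    for sql in statements:
        match = CREATE_RE.search(sql)
        if match is None:
            continue  # not a CREATE ... AS statement
        dag[match.group(1)] = set(SOURCE_RE.findall(sql))
    return dag


def execution_order(statements):
    """Return model names in an order that respects their dependencies.

    Raises graphlib.CycleError when the statements depend on each other
    in a circle.
    """
    dag = build_dag(statements)
    order = list(TopologicalSorter(dag).static_order())
    # Drop external sources such as my_source: only models are executed.
    return [name for name in order if name in dag]


models = [
    "CREATE TABLE category_stats AS SELECT COUNT(*) category_count "
    "FROM my_model GROUP BY category;",
    "CREATE VIEW my_model AS SELECT id, category FROM my_source;",
]
print(execution_order(models))  # ['my_model', 'category_stats']
```

Even though `category_stats` is listed first, it is ordered after `my_model` because it selects from it. Two statements that select from each other would raise `CycleError` instead of producing an order.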
To run against the database, provide the following environment variables (for BigQuery):
- `GOOGLE_APPLICATION_CREDENTIALS` should refer to a service account key file (this can be set by an application rather than locally).
- `PROJECT_ID` is the ID (not the number) of the project, and `DATASET_ID` is the name of the dataset that is used by default.
- `LOCATION` is an optional datacenter location ID where the query is executed.
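
As a small convenience, the variables above can be checked before invoking PowerSQL. The sketch below is a hypothetical helper, not part of PowerSQL; it assumes that the first three variables are required and that `LOCATION` may be left unset.

```python
import os

# Names are taken from the list above. Treating these three as required
# is an assumption; LOCATION is described as optional.
REQUIRED_VARS = ["GOOGLE_APPLICATION_CREDENTIALS", "PROJECT_ID", "DATASET_ID"]


def missing_variables(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```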
- `powersql check`: Loads all `.sql` files in the directories listed in `models` and checks the syntax of the SQL statements. It then checks the DAG and reports any circular dependency. Finally, it runs a type checker and reports any type errors.
- `powersql run`: Loads and runs the entire DAG of SQL statements.
- `powersql test`: Loads and runs the data tests.

Data tests are SQL queries that you run against your database tables and views to check data quality, recency, etc. A test fails if its query returns one or more rows.
Some examples:

```sql
-- NULL check
SELECT 1 FROM t WHERE column IS NULL;
-- Check values
SELECT 1 FROM t WHERE amount < 0;
-- Check relations
SELECT 1 FROM t LEFT JOIN u ON t.id = u.id WHERE u.id IS NULL;
-- Prefix check
SELECT 1 FROM t WHERE NOT STARTS_WITH(strcolumn, 'http');
```
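
The pass/fail rule for data tests can be illustrated with a short Python sketch. It uses an in-memory SQLite database purely as a stand-in (PowerSQL itself targets PostgreSQL and BigQuery), and `run_data_test` is a hypothetical helper, not PowerSQL's test runner.

```python
import sqlite3


def run_data_test(conn, query):
    """Return True when the test passes, i.e. the query returns no rows."""
    rows = conn.execute(query).fetchall()
    return len(rows) == 0


# Stand-in table with one valid and one invalid amount.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 10.0), (2, -5.0)])

# No NULL ids, so the query returns no rows and the test passes.
print(run_data_test(conn, "SELECT 1 FROM t WHERE id IS NULL"))  # True
# One negative amount, so the query returns a row and the test fails.
print(run_data_test(conn, "SELECT 1 FROM t WHERE amount < 0"))  # False
```

Writing tests as queries that select the offending rows means a failing test also shows which rows violate the check.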