This application is a light ETL written in Rust that can be used as a connector between systems.
| Feature                                  | Values                                           | Description                                                        |
| ---------------------------------------- | ------------------------------------------------ | ------------------------------------------------------------------ |
| Generate data                            | -                                                | Generate data for testing                                          |
| Supported formats                        | Json, Jsonl, CSV, Toml, XML, Yaml, Text, Parquet | Read and write in these formats                                    |
| Multi connectors                         | Mongodb, S3, Minio, APIs                         | Read / Write / Clean data                                          |
| Multi HTTP auths                         | Basic, Bearer, Jwt                               | Read / Write / Clean data                                          |
| Transform data                           | Tera templates                                   | Transform the data on the fly                                      |
| Multi configuration formats              | Json, Yaml                                       | The project needs a jobs configuration as input                    |
| Read data in parallel or sequential mode | Cursor, Offset                                   | With this type of paginator, the data can be read in different ways |
More useful information:

Requirements:

Commands to execute:
```bash
git clone https://github.com/jmfiaschi/chewdata.git chewdata
cd chewdata
cp .env.dev .env
vim .env # Edit the .env file
make build
make unit-tests
make integration-tests
```
If all the tests pass, the project is ready. Read the Makefile to see which shortcuts you can use.
If you want some examples to discover this project, see the ./examples section.
If you run the program without parameters, the application waits for you to enter JSON data. By default, the program writes JSON data to the output and stops when you enter an empty value.
```bash
$ cargo run
$ [{"key":"value"},{"name":"test"}]
$ # press Enter on an empty line to end the input
[{"key":"value"},{"name":"test"}]
```
Another example, without an ETL configuration and with a file as input:
```bash
$ cat ./data/multi_lines.json | cargo run
[{...}]
```
or
```bash
$ cat ./data/multi_lines.json | make run
[{...}]
```
Another example, with a JSON ETL configuration as an argument:
```bash
$ cat ./data/multi_lines.csv | cargo run '[{"type":"reader","document":{"type":"csv"}},{"type":"writer"}]'
[{...}] # transforms the CSV data into JSON format
```
or
```bash
$ cat ./data/multi_lines.csv | make run json='[{\"type\":\"reader\",\"document\":{\"type\":\"csv\"}},{\"type\":\"writer\"}]'
[{...}] # transforms the CSV data into JSON format
```
Another example, with an ETL configuration file as an argument:
```bash
$ echo '[{"type":"reader","connector":{"type":"io"},"document":{"type":"csv"}},{"type":"writer"}]' > my_etl.conf.json
$ cat ./data/multi_lines.csv | cargo run -- --file my_etl.conf.json
[{...}]
```
or
```bash
$ echo '[{"type":"reader","connector":{"type":"io"},"document":{"type":"csv"}},{"type":"writer"}]' > my_etl.conf.json
$ cat ./data/multi_lines.csv | make run file=my_etl.conf.json
[{...}]
```
It is possible to use aliases and default values to shorten the configuration:
```bash
$ echo '[{"type":"r","doc":{"type":"csv"}},{"type":"w"}]' > my_etl.conf.json
$ cat ./data/multi_lines.csv | make run file=my_etl.conf.json
[{...}]
```
In progress...
After code modifications, please run all tests.
```bash
make test
```