This application is a simple ETL written in Rust that can be used as a connector between systems.
* It handles multiple formats: JSON, JSONL, CSV, TOML, XML, YAML, Text
* It can read/write data from:
  * MongoDB database
  * S3/Minio with versioning & select
  * HTTP(S) APIs with some authenticators: Basic, Bearer, JWT
  * Local
  * Relational DB like PSQL (not yet)
  * Message broker (not yet)
* It needs only rustup
* No garbage collector
* Parallel work
* Multi-platform
The goal of this project is to simplify the work of developers and the connection between systems. The work is not finished, but I hope it will be useful for you.
Requirements:
* Rust
* Docker and Docker-compose, for testing the code locally
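Before running the commands, it can help to check that the required tools are on your PATH. The sketch below is only an illustration: `missing_tools` is a hypothetical helper that is not part of this project, and the tool names passed to it (`cargo`, `docker`, `docker-compose`, `make`) are assumptions based on the requirements and commands in this README.

```shell
# Hypothetical helper (not part of chewdata): print every tool
# from the argument list that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed tool names; adjust to your setup.
missing_tools cargo docker docker-compose make
```

If nothing is printed, all the listed tools were found.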
Commands to execute:
Bash
$ git clone https://github.com/jmfiaschi/chewdata.git chewdata
$ cd chewdata
$ cp .env.dev .env
$ vim .env # Edit the .env file
$ make build
$ make unit-tests
$ make integration-tests
If all the tests pass, the project is ready. Read the Makefile to see which shortcuts you can use.
If you want some examples to explore this project, see ./examples
If you run the program without parameters, the application waits until you write JSON data and end the input with quit, exit, or \q. By default, the program writes JSON data to the output.
Bash
$ cargo run
$ [{"key":"value"},{"name":"test"}]
$ exit
[{"key":"value"},{"name":"test"}]
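To picture the stop-word behaviour, here is a small shell sketch. It is not how the application is implemented; `read_until_quit` is a hypothetical helper that simply echoes its input until it meets one of the stop words mentioned in this README (quit, exit, \q).

```shell
# Hypothetical helper (not chewdata code): echo stdin line by line
# and stop at the first line that is exactly quit, exit, or \q.
read_until_quit() {
  while IFS= read -r line; do
    case "$line" in
      quit|exit|'\q') break ;;
    esac
    printf '%s\n' "$line"
  done
}

# Lines after the stop word are never echoed.
printf '%s\n' '[{"key":"value"},{"name":"test"}]' 'exit' 'ignored' | read_until_quit
```

Only the lines entered before the stop word are passed on, which mirrors the session shown in this README.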
Another example, without ETL configuration and with a file as input:
Bash
$ cat ./data/multi_lines.json | cargo run
[{...}]
or
Bash
$ cat ./data/multi_lines.json | make run
[{...}]
Another example, with a JSON ETL configuration passed as an argument:
Bash
$ cat ./data/multi_lines.csv | cargo run '[{"type":"reader","document":{"type":"csv"}},{"type":"writer"}]'
[{...}] // Will transform the csv data into json format
or
Bash
$ cat ./data/multi_lines.csv | make run json='[{\"type\":\"reader\",\"document\":{\"type\":\"csv\"}},{\"type\":\"writer\"}]'
[{...}] // Will transform the csv data into json format
Another example, with an ETL configuration file passed as an argument:
Bash
$ echo '[{"type":"reader","connector":{"type":"io"},"document":{"type":"csv"}},{"type":"writer"}]' > my_etl.conf.json
$ cat ./data/multi_lines.csv | cargo run -- --file my_etl.conf.json
[{...}]
or
Bash
$ echo '[{"type":"reader","connector":{"type":"io"},"document":{"type":"csv"}},{"type":"writer"}]' > my_etl.conf.json
$ cat ./data/multi_lines.csv | make run file=my_etl.conf.json
[{...}]
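For longer configurations, a one-line `echo` gets hard to read. The sketch below writes the same reader/writer configuration with a here-document. It assumes the application accepts standard JSON whitespace, which is not stated in this README.

```shell
# Write the same configuration as the one-line echo,
# spread over several lines for readability.
cat > my_etl.conf.json <<'EOF'
[
  {
    "type": "reader",
    "connector": { "type": "io" },
    "document": { "type": "csv" }
  },
  { "type": "writer" }
]
EOF

# Show what was written.
cat my_etl.conf.json
```

The file can then be passed with `--file my_etl.conf.json` or `file=my_etl.conf.json`, as in the commands of this README.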
It is possible to use aliases and default values to shorten the configuration:
Bash
$ echo '[{"type":"r","doc":{"type":"csv"}},{"type":"w"}]' > my_etl.conf.json
$ cat ./data/multi_lines.csv | make run file=my_etl.conf.json
[{...}]
In progress...
After code modifications, please run all tests.
Bash
$ make test